This message is for discussion, the patches attached are not intended
to be committed quite yet. I'd like to spend at least one more night
of heavy break-in testing before finally signing off on my modified
patch or the original.
Florian sent me a patch (attached: hashtable.diff) last week or so.
Applying it changes the data structure backing store for users and
channels from using linked lists to using a hash table.
I wrote a test harness and ran 1000 clients against a local ngircd
using the hashtable patch to test for memory leaks and logical
errors. The test clients ran in a tight loop, and as often as
possible they would change their nicks, join and/or part channels,
send messages to channels (existent and not) for 18 hours. While
using my own local client to watch the carnage, the ircd stayed very
responsive and the test harness did not experience any errors with
data constistency when comparing each client's local channel list with
the results of WHOIS. Other than memory issues already present
(getpwd() in libc leaks a small amount of memory, but it is run once
at startup), there were no memory leaks that I could find. I'm going
to retest the ircd on a different system tonight (one where I have
better tools for profiling and memory) to ensure that there are no
memory issues.
I've attached my own modified version of the patch
(hashtable.mine.diff) for subsequent review. There is only one
notable change, in Client_Destroy(), where I've restructured nearly
the entire function.
Comments on the patch are inline below (please note that the quoted
text is *not* the complete patch. the quoted text is still in valid
diff syntax, but I removed large portions for brevity). If you just
want the final thoughts, jump to the end of this message.
> Index: ngircd/src/ngircd/channel.c
> ===================================================================
> --- ngircd.orig/src/ngircd/channel.c 2008-05-25 10:52:59.000000000
> -0700
> +++ ngircd/src/ngircd/channel.c 2008-05-25 13:05:55.000000000 -0700
this is a huge win. every single channel search goes from being
linear in time to almost constant time.
> @@ -226,26 +217,14 @@ Channel_Part(CLIENT * Client, CLIENT * O
> assert(Name != NULL);
> assert(Reason != NULL);
>
> - /* Check that specified channel exists */
> - chan = Channel_Search(Name);
> - if (!chan) {
> - IRC_WriteStrClient(Client, ERR_NOSUCHCHANNEL_MSG,
> - Client_ID(Client), Name);
> - return false;
> - }
> -
> - /* Check that the client is in the channel */
> - if (!Get_Cl2Chan(chan, Client)) {
> - IRC_WriteStrClient(Client, ERR_NOTONCHANNEL_MSG,
> - Client_ID(Client), Name);
> + chan = Channel_Search( Name );
> + if(( ! chan ) || ( ! Cl2Chan_Get( chan, Client )))
> + {
> + IRC_WriteStrClient( Client, ERR_NOSUCHCHANNEL_MSG,
> Client_ID( Client ), Name );
> return false;
> }
>
> - /* Part client from channel */
> - if (!Remove_Client(REMOVE_PART, chan, Client, Origin, Reason, true))
> - return false;
> - else
> - return true;
> + return Remove_Client( REMOVE_PART, chan, Client, Origin, Reason,
> true);
> } /* Channel_Part */
>
>
> @@ -896,46 +711,43 @@ Remove_Client( int Type, CHANNEL *Chan,
retaining the variable c to represent the channel structure instead of
using the variable Chan in this change would make this portion of the
diff a lot less intimidating and invasive. However, I did not remove
this portion of the diff in my version because it cleans up the method
considerably, making variable naming throughout the function much more
consistent.
> * so only inform other clients in same channel. */
> assert( InformServer == false );
> LogDebug("User \"%s\" left channel \"%s\" (%s).",
> - Client_Mask( Client ), c->name, Reason );
> + Client_Mask( Client ), Chan->name, Reason );
> break;
> case REMOVE_KICK:
> /* User was KICKed: inform other servers and all users in channel
> */
> if( InformServer )
> IRC_WriteStrServersPrefix( Client_NextHop( Origin ),
> - Origin, "KICK %s %s :%s", c->name, Client_ID( Client ), Reason);
> - IRC_WriteStrChannelPrefix(Client, c, Origin, false, "KICK %s %s :
> %s",
> - c->name, Client_ID( Client ), Reason );
> + Origin, "KICK %s %s :%s", Chan->name, Client_ID( Client ),
> Reason);
> + IRC_WriteStrChannelPrefix(Client, Chan, Origin, false, "KICK %s
> %s :%s",
> + Chan->name, Client_ID( Client ), Reason );
> if ((Client_Conn(Client) > NONE) &&
> (Client_Type(Client) == CLIENT_USER))
> {
> IRC_WriteStrClientPrefix(Client, Origin, "KICK %s %s :%s",
> - c->name, Client_ID( Client ), Reason);
> + Chan->name, Client_ID( Client ), Reason);
> }
> LogDebug("User \"%s\" has been kicked off \"%s\" by \"%s\": %s.",
> - Client_Mask( Client ), c->name, Client_ID(Origin), Reason);
> + Client_Mask( Client ), Chan->name, Client_ID(Origin), Reason);
> break;
> default: /* PART */
> if (InformServer)
> - IRC_WriteStrServersPrefix(Origin, Client, "PART %s :%s", c-
> >name, Reason);
> + IRC_WriteStrServersPrefix(Origin, Client, "PART %s :%s", Chan-
> >name, Reason);
>
> - IRC_WriteStrChannelPrefix(Origin, c, Client, false, "PART %s :%s",
> - c->name, Reason);
> + IRC_WriteStrChannelPrefix(Origin, Chan, Client, false, "PART %s :
> %s",
> + Chan->name, Reason);
>
> if ((Client_Conn(Origin) > NONE) &&
> (Client_Type(Origin) == CLIENT_USER))
> {
> - IRC_WriteStrClientPrefix( Origin, Client, "PART %s :%s", c-
> >name, Reason);
> + IRC_WriteStrClientPrefix( Origin, Client, "PART %s :%s", Chan-
> >name, Reason);
> LogDebug("User \"%s\" left channel \"%s\" (%s).",
> - Client_Mask(Client), c->name, Reason);
> + Client_Mask(Client), Chan->name, Reason);
> }
> }
>
> - /* Wenn Channel nun leer und nicht pre-defined: loeschen */
> - if( ! strchr( Channel_Modes( Chan ), 'P' ))
> - {
> - if( ! Get_First_Cl2Chan( NULL, Chan )) Delete_Channel( Chan );
> - }
> -
> + /* if channel empty and not predefined -> remove */
> + if ((hashtable_getfillcount(&Chan->members) == 0) && !
> strchr(Channel_Modes( Chan ), 'P'))
> + hashtable_delete(&My_Channels, Chan->name);
> return true;
> } /* Remove_Client */
>
> Index: ngircd/src/ngircd/cl2chan.h
> ===================================================================
> --- /dev/null 1970-01-01 00:00:00.000000000 +0000
> +++ ngircd/src/ngircd/cl2chan.h 2008-05-25 13:05:55.000000000 -0700
separating out the cl2chan functions is a really good refactoring move
as far as code cleanup goes.
> Index: ngircd/src/ngircd/client.c
> ===================================================================
> --- ngircd.orig/src/ngircd/client.c 2008-05-25 10:52:59.000000000
> -0700
> +++ ngircd/src/ngircd/client.c 2008-05-25 13:05:55.000000000 -0700
again this looks more like it could have been it's own patch instead
of being included here. that said, it works properly; since it's
already here we may as well merge it in.
> @@ -141,8 +146,17 @@ Client_ThisServer( void )
> GLOBAL CLIENT *
> Client_NewLocal(CONN_ID Idx, char *Hostname, int Type, bool Idented)
> {
> - return Init_New_Client(Idx, This_Server, NULL, Type, NULL, NULL,
> - Hostname, NULL, 0, 0, NULL, Idented);
> + /* # is illegal in a nick name, so #+CONN_ID is unique.
> + * We need this as a placeholder until the clients sets
> + the real nick. */
> + char fake_nick[CLIENT_NICK_LEN];
> +
> + snprintf(fake_nick, sizeof fake_nick, "#%x#", Idx);
> +#ifdef DEBUG
> + assert(Client_Search(fake_nick) == NULL);
> +#endif
> + return Init_New_Client(Idx, This_Server, NULL, Type, fake_nick,
> NULL,
> + Hostname, NULL, 0, 0, NULL, Idented);
> } /* Client_NewLocal */
>
>
> @@ -241,93 +269,110 @@ Client_Destroy( CLIENT *Client, char *Lo
> strlcat(msg, Client->id, sizeof (msg));
> }
>
> - last = NULL;
> - c = My_Clients;
> - while( c )
> +#ifdef DEBUG
> + strlcpy(nickbuf, Client->id, sizeof nickbuf);
> + if (hashtable_get(&My_Clients, nickbuf) == NULL)
> + LogDebug("Client_Destroy() called for unknown nickname %s!",
> nickbuf);
> +#endif
> + hashtable_iterator_reset(&My_Clients);
> + while(( c = hashtable_iterator_getnext(&My_Clients)))
> {
> - if(( Client->type == CLIENT_SERVER ) && ( c->introducer ==
> Client ) && ( c != Client ))
> - {
> - /* der Client, der geloescht wird ist ein Server. Der Client,
> den wir gerade
> - * pruefen, ist ein Child von diesem und muss daher auch
> entfernt werden */
> - Client_Destroy( c, NULL, msg, false );
> - last = NULL;
> - c = My_Clients;
> + LogDebug("Testing client nick: %s, want %s", c->id, Client->id);
> +
I think there's a serious optimization to be made here:
If Client is not a server, then no iteration is necessary, so maybe
separating this function into two functional portions, one for normal
clients and one for servers, would be a good idea, since this function
is needlessly inefficient if the Client is a server. This is an
especially large bottleneck with a multiserver network, since
destroying a server would have taken O(n^2) as written in this patch.
This portion of the code was changed in my patch.
> + if (c != Client) {
> + if ((Client->type == CLIENT_SERVER) && (c->introducer ==
> Client)) {
> + /* The client that is about to be removed is a server,
> + * the client we are checking right now is a child of that
> + * server and thus has to be removed, too. */
> + Client_Destroy( c, NULL, msg, false );
> + hashtable_iterator_reset(&My_Clients);
> + }
> continue;
> }
> - if( c == Client )
> - {
> - /* Wir haben den Client gefunden: entfernen */
> - if( last ) last->next = c->next;
> - else My_Clients = (CLIENT *)c->next;
> -
> - if( c->type == CLIENT_USER )
> - {
> - if( c->conn_id != NONE )
> - {
> - /* Ein lokaler User */
> - Log( LOG_NOTICE, "User \"%s\" unregistered (connection %d):
> %s", Client_Mask( c ), c->conn_id, txt );
> -
> - if( SendQuit )
> - {
> - /* Alle andere Server informieren! */
> - if( FwdMsg ) IRC_WriteStrServersPrefix( NULL, c, "QUIT :%s",
> FwdMsg );
> - else IRC_WriteStrServersPrefix( NULL, c, "QUIT :" );
> + /* c == Client */
…<snip>…
> + LogDebug("Remove Nick \"%s\" from hashtable (Destroy)", c->id);
> + hashtable_delete(&My_Clients, c->id);
> + break;
> }
> + hashtable_iterator_reset(&My_Clients);
> +
> +#ifdef DEBUG
> + assert(hashtable_get(&My_Clients, nickbuf) == NULL);
> +#endif
> } /* Client_Destroy */
>
>
> @@ -575,13 +635,14 @@ Client_GetFromToken( CLIENT *Client, int
this looks a lot less efficient than it could be. luckily, it's
called only once during a connection's lifetime, but if a connection
has not been fully registered and logged in, than why not use the
token (or a variation of it) as the hashtable's key?
> assert( Client != NULL );
> assert( Token > 0 );
>
> - c = My_Clients;
> - while( c )
> - {
> - if(( c->type == CLIENT_SERVER ) && ( c->introducer == Client ) &&
> ( c->token == Token )) return c;
> - c = (CLIENT *)c->next;
> + hashtable_iterator_reset(&My_Clients);
> + while((c = hashtable_iterator_getnext(&My_Clients))) {
> + if(( c->type == CLIENT_SERVER ) &&
> + ( c->introducer == Client ) &&
> + ( c->token == Token )) break;
> }
> - return NULL;
> + hashtable_iterator_reset(&My_Clients);
> + return c;
> } /* Client_GetFromToken */
>
>
> @@ -884,12 +939,11 @@ Client_MyServerCount( void )
this comment applies to all the Client_*Count() functions: performance
in these functions is very unfortunate, changing from a hash table to
a linked list does not make this any faster, but if performance in
these sections of the code is important, then the counts should
probably be cached. for extra credit, the cached values could be
updated in some way that does not include a O(n) search of the hash
table.
> CLIENT *c;
> unsigned long cnt = 0;
>
> - c = My_Clients;
> - while( c )
> - {
> + hashtable_iterator_reset(&My_Clients);
> + while ((c = hashtable_iterator_getnext(&My_Clients)))
> if(( c->type == CLIENT_SERVER ) && ( c->hops == 1 )) cnt++;
> - c = (CLIENT *)c->next;
> - }
> +
> + hashtable_iterator_reset(&My_Clients);
> return cnt;
> } /* Client_MyServerCount */
>
> @@ -1062,18 +1118,14 @@ Generate_MyToken( CLIENT *Client )
this function could also stand to use some help from caching. it
seems raucously inefficient.
> CLIENT *c;
> int token;
>
> - c = My_Clients;
> token = 2;
> - while( c )
> - {
> - if( c->mytoken == token )
> - {
> - /* Das Token wurde bereits vergeben */
> + hashtable_iterator_reset(&My_Clients);
> + while ((c = hashtable_iterator_getnext(&My_Clients))) {
> + if( c->mytoken == token ) {
> + /* Token already in use */
> token++;
> - c = My_Clients;
> - continue;
> + hashtable_iterator_reset(&My_Clients);
> }
> - else c = (CLIENT *)c->next;
> }
> Client->mytoken = token;
> Log( LOG_DEBUG, "Assigned token %d to server \"%s\".", token,
> Client->id );
> Index: ngircd/src/ngircd/hash.c
> ===================================================================
> --- ngircd.orig/src/ngircd/hash.c 2008-05-25 10:52:59.000000000 -0700
> +++ ngircd/src/ngircd/hash.c 2008-05-25 13:05:55.000000000 -0700
> @@ -19,16 +20,33 @@ static char UNUSED id[] = "$Id: hash.c,v
> #include "imp.h"
> #include <assert.h>
> #include <string.h>
> +#include <errno.h>
> +#include <stdlib.h>
>
> #include "defines.h"
> +#include "log.h"
> #include "tool.h"
>
> #include "exp.h"
> #include "hash.h"
>
allowing some way for the builder to set this at configure-time
(instead of adding -DNITIAL_HASHSIZE in their CFLAGS) would probably
be much appreciated, since network size varies enormously. if nothing
else, the initial size should err on the side of being too large, as
hash table performance does not suffer considerably from being too
empty.
> +#define INITIAL_HASHSIZE 23
> +
> +static struct bucket *hashtable_bucket_get(hashtable *h, const void
> *key);
> +static struct bucket *hashtable_bucket_steal(hashtable *h, const
> void *key);
>
> static UINT32 jenkins_hash PARAMS(( register UINT8 *k, register
> UINT32 length, register UINT32 initval ));
>
> +/* Enable Extra Debug Output in Hash Insertion/Removal/Iteration
> code.
> + * This creates a _LOT_ of debug messages. */
> +/* #define DEBUG_HASH */
> +
> +#ifdef DEBUG
> +static void hashtable_stats(const hashtable * const h);
> +#else
> +static inline void hashtable_stats(const UNUSED hashtable * const
> h) { }
> +#endif
> +
>
> GLOBAL UINT32
> Hash( const char *String )
> @@ -122,4 +140,493 @@ jenkins_hash( register UINT8 *k, registe
> } /* jenkins_hash */
>
>
> -/* -eof- */
> +static size_t get_larger_prime(size_t old)
> +{
> + size_t i;
since I've never seen an IRC network with more than 6664223 users,
this is probably fine. having artificial limitations allows for a
much faster get_larger_primes() than would otherwise be possible. I'd
suggest using consecutive primes that are approximately twice the
value of the previous in the list (extra credit: as close to a power
of 2 without exceeding).
> + static const size_t primes[] = { 223, 523, 823, 1123,
> + 1523, 1823, 2423, 4523, 8123,
> + 20323, 28723, 47123, 76423,
> + 96223, 133723, 220123,
> + 386723, 647723, 999623,
> + 1923323, 4735823, 6664223 };
> +
> + for (i=0; i < sizeof primes/sizeof primes[0];i++) {
> + if (old < primes[i])
> + return primes[i];
> + }
> +#ifdef DEBUG
> + Log(LOG_DEBUG, "Could not find larger prime than %u", old);
> +#endif
> + return old*2 + 1;
> +}
…<snip>…
> +static struct bucket *hashtable_iterator_getnextlist(hashtable *h,
> size_t startpos)
> +{
> + struct bucket *b = NULL;
> + size_t i = startpos;
> +
> + assert(h->hash_size > 0);
> + while (i < h->hash_size) {
> + b = h->buckets[i];
> + if (b) break;
> + i++;
> + }
> +
> + h->iterator.nextres = b;
> + h->iterator.curhash = i;
> +#ifdef DEBUG_HASH
> + LogDebug("hashtable_iterator_getnextlist: found %sresult, start at
> pos %u, now at %u",
> + b?"":"no ", startpos,i);
> +#endif
> + return b;
> +}
> +
> +
while we're here, it's a good time to mention: iterating over a hash
table should be done as little as possible. cscope reports that there
are 16 places in the code that call hashtable_iterator_getnext(),
which is probably at least 15 too many.
> +void *hashtable_iterator_getnext(hashtable *h)
> +{
> + void *retval;
> + struct bucket *nextres;
> +
> + assert(h != NULL);
> + nextres = h->iterator.nextres;
> +#ifdef DEBUG_HASH
> + if (!nextres) LogDebug("hashtable_iterator_getnext(): initial h-
> >iterator.nextres == NULL");
> +#endif
> + if (h->iterator.curhash == 0 && !nextres)
> + nextres = hashtable_iterator_getnextlist(h, 0);
> +
> + if (!nextres || h->iterator.curhash >= h->hash_size)
> + return NULL;
> +
> + retval = nextres->elemptr;
> + h->iterator.nextres = nextres->next;
> +
> + if (!h->iterator.nextres) /* no more buckets in this list, find
> next list */
> + (void)hashtable_iterator_getnextlist(h, h->iterator.curhash+1);
> +#ifdef DEBUG_HASH
> + LogDebug("hashtable_iterator_getnext(): %s, next result: %s",
> + retval ? "ok": "(NULL)", h->iterator.nextres ? "yes" :"(NULL)");
> + if (h->iterator.nextres)
> + assert(hashtable_checkref(h, h->iterator.nextres->keyptr) == 1);
> +#endif
> + return retval;
> +}
> +
> +
as per my recent change using strtok_r() in IRC_JOIN(), it's probably
worth looking into allowing a caller to provide/use their own iterator
context variables. as much as iterating a hash table is awful,
sometimes it's necessary. this is more a note for the future, as
adding support for iterator contexts provided by the caller would be
an unagreeably large change to add to an already large patch.
> +void hashtable_iterator_reset(hashtable *h)
> +{
> + assert(h != NULL);
> + h->iterator.curhash = 0;
> + h->iterator.nextres = NULL;
> +}
> +
…<snip>…
it's good that this function is provided, but it should be called as
little as possible via allowing the size of the table to be set easily
at compile-time or (even more flexible) in the config file.
> +static bool rehash(hashtable *h, size_t newsize)
> +{
> + hashtable newhash;
> + size_t i;
> +#ifdef DEBUG
> + unsigned int collisions = 0;
> +#endif
> +
> + hashtable_stats(h);
> +
> +bool hashtable_add(hashtable *h, const void *key, void *item)
> +{
> + size_t pos;
> + struct bucket *b;
> +
> + assert(h->hash_size > 0);
> + assert(key != NULL);
> + assert(item != NULL);
> +
> + if ((h->hash_fillcount * 100 / h->hash_size) > 90)
> + rehash(h, get_larger_prime(h->hash_size));
> +
> + errno = 0;
> + if (hashtable_bucket_get(h, key)) {
> + errno = EEXIST; /* dupe */
> + return false;
> + }
> +
> + b = bucket_alloc();
> + if (!b) return false;
> +
There's nothing in the hash table's implementation that seems to
require a prime table size (usually only required by doublehashing,
which does not use a list of buckets on collisions). A minor
optimization could include enforcing a power-of-2 table size, since
compilers for some platforms compile modulus to a multiply and divide,
whereas modulus by a power of 2 is strength-reduced by all at-least-
semi intelligent compilers as a simple and instruction. Using powers
of 2 for the table size would also make the get_larger_prime function
faster and easier, if not misleadingly named.
> + pos = h->hashfunc(key) % h->hash_size;
> + b->keyptr = key;
> + b->elemptr = item;
> + b->next = h->buckets[pos];
> + h->buckets[pos] = b;
> +
> + h->hash_fillcount++;
> +#ifdef DEBUG_HASH
> + LogDebug("hashtable_insert(): pos %u", pos);
> + assert(hashtable_checkref(h, key) == 1);
> +#endif
> + return true;
> +}
> +
> +
…<snip>…
> Index: ngircd/src/ngircd/irc-info.c
> ===================================================================
> --- ngircd.orig/src/ngircd/irc-info.c 2008-05-25 10:52:59.000000000
> -0700
> +++ ngircd/src/ngircd/irc-info.c 2008-05-25 13:05:55.000000000 -0700
there should probably be a faster way to iterate through servers than
this. iterating through all open connections feels a lot like using a
shotgun in a crowd to kill a pickpocket. for a future optimization,
maybe servers and users could warrant separate tables?
> @@ -225,7 +226,7 @@ IRC_LINKS( CLIENT *Client, REQUEST *Req
> {
> if( ! IRC_WriteStrClient( target, RPL_LINKS_MSG,
> Client_ID( target ), Client_ID( c ),
> Client_ID( Client_TopServer( c ) ? Client_TopServer( c ) :
> Client_ThisServer( )), Client_Hops( c ), Client_Info( c ))) return
> DISCONNECTED;
> }
> - c = Client_Next( c );
> + c = Client_Next();
> }
>
> IRC_SetPenalty( target, 1 );
this is slow, but so is calling /names on any network. for future
scaling, it might be worth looking into supporting the ability to
return a "server too busy" reply if the server's input socket has
waiting data being blocked by running this command?
> @@ -382,7 +383,7 @@ IRC_NAMES( CLIENT *Client, REQUEST *Req
> if( ! IRC_Send_NAMES( from, chan )) return DISCONNECTED;
>
> /* naechster Channel */
> - chan = Channel_Next( chan );
> + chan = Channel_Next();
> }
>
> /* Nun noch alle Clients ausgeben, die in keinem Channel sind */
> Index: ngircd/src/ngircd/irc-login.c
> ===================================================================
> --- ngircd.orig/src/ngircd/irc-login.c 2008-05-25 10:54:27.000000000
> -0700
> +++ ngircd/src/ngircd/irc-login.c 2008-05-25 13:05:55.000000000 -0700
this change is related to the change in Client_SetID() (more
precisely, Client_NewLocal() and Init_New_Client() ) (where a unique
identifier is used for clients that have not logged in, but are
connected). It probably should have been part of a different change,
but it makes a lot of use of the new hash table backend, so extracting
it out is not worth the trouble.
> @@ -244,7 +244,9 @@ IRC_NICK( CLIENT *Client, REQUEST *Req )
> Client_Conn( Client ));
>
> /* Register new nickname of this client */
> - Client_SetID( target, Req->argv[0] );
> + if (!Client_SetID( target, Req->argv[0] ))
> + return IRC_WriteStrClient( Client, ERR_NICKNAMEINUSE_MSG,
> + Client_ID( Client ), Req->argv[0] );
>
> /* If we received a valid USER command already then
> * register the new client! */
> Index: ngircd/src/ngircd/irc-oper.c
> ===================================================================
> --- ngircd.orig/src/ngircd/irc-oper.c 2008-05-25 10:52:59.000000000
> -0700
> +++ ngircd/src/ngircd/irc-oper.c 2008-05-25 13:05:55.000000000 -0700
> @@ -292,7 +292,7 @@ IRC_WALLOPS( CLIENT *Client, REQUEST *Re
again with the iterating. this is really inefficient and something to
look at in the future.
> if (!from)
> return IRC_WriteStrClient(Client, ERR_NOSUCHNICK_MSG,
> Client_ID(Client), Req->prefix);
>
> - for (to=Client_First(); to != NULL; to=Client_Next(to)) {
> + for (to=Client_First(); to != NULL; to=Client_Next()) {
> if (Client_Conn(to) < 0) /* no local connection or WALLOPS origin */
> continue;
>
> Index: ngircd/src/ngircd/irc-write.c
> ===================================================================
> --- ngircd.orig/src/ngircd/irc-write.c 2008-05-25 10:52:59.000000000
> -0700
> +++ ngircd/src/ngircd/irc-write.c 2008-05-25 13:05:55.000000000 -0700
> @@ -310,7 +311,7 @@ va_dcl
again, sending messages to many other servers should not require
iterating through the entirety of the client list.
> /* Ziel-Server gefunden. Nun noch pruefen, ob Flags stimmen */
> if(( Flag == '\0' ) || ( strchr( Client_Flags( c ), Flag ) !=
> NULL )) IRC_WriteStrClientPrefix( c, Prefix, "%s", buffer );
> }
> - c = Client_Next( c );
> + c = Client_Next();
> }
> } /* IRC_WriteStrServersPrefixFlag */
>
> Index: ngircd/src/ngircd/irc.c
> ===================================================================
> --- ngircd.orig/src/ngircd/irc.c 2008-05-25 10:52:59.000000000 -0700
> +++ ngircd/src/ngircd/irc.c 2008-05-25 13:05:55.000000000 -0700
> @@ -279,7 +279,7 @@ IRC_TRACE( CLIENT *Client, REQUEST *Req
again, iterating for servers should not require iterating through the
entirety of the client list.
> if( ! IRC_WriteStrClient( from, RPL_TRACEOPERATOR_MSG,
> Client_ID( from ), Client_ID( c ))) return DISCONNECTED;
> }
> }
> - c = Client_Next( c );
> + c = Client_Next();
> }
>
> IRC_SetPenalty( Client, 3 );
Final thoughts:
Most of the lines in this diff are actually due to function renaming
or variable renaming. Some of this was necessary (the arguments to
Client_Next() and Channel_Next() changed, for instance), but for the
most part the function name changes can be seen as code cleanup. Some
portions of it are valid bugfixes that could have been split out, but
in a few cases the bugfixes interact closely with the new code, so
it's very justifiable to include them. As I've thoroughly reviewed
and tested this patch, I feel that despite it's size, it is not
unnecessarily large or risky.
Users and servers (and in the future, services) should probably be
stored in separate hash tables. Iterating over a hash table of the
servers will be notably faster than iterating over a list of
connections and filtering out all clients that are not servers.
Almost anywhere that Client_Next() or Channel_Next() is called is a
possible place to optimize. In a lot of cases Client_Next() was being
used, but most of the set was not being used (when trying to iterate
over all servers or opers, for instance). This would be a
considerable speedup, especially since iterating over the hash table
takes longer than iterating over a linked list by approximately the m/
n fill ratio.
As ngircd grows and gets used on larger networks, it might be worth
finding expensive operations that cannot be sped up more (like /names,
or /list) and return a "server too busy" message if processing the
command would result in too many waiting inputs going unprocessed for
too long. Multithreading would fix this, but most of the code
(especially the hash table in this patch and the linked list currently
in use) is not thread safe.
/* comments welcome */
./scott
This patch does significant cleanup on the join code by using strtok_r
instead of mangling strchr to parse channel names and keys in parallel
when a JOIN command contains a list of channels and keys.
since strtok_r is not universally portable, Florian was kind enough to
provide a working /src/portab/strtok_r.c which I've included into the
configure path.
I'd like to more widely move the codebase from using strtok to using
strtok_r where possible. A few reasons for this:
1) strtok_r and strtok are the same speed and functionality. there
are no compelling reasons to pick strtok() over strtok_r() outside of
portability. Since strtok_r is fairly universal (all modern operating
systems support it in their standard libraries), and a solid
implementation exists in /src/portab, portability is not a compelling
concern.
2) strtok_r will alleviate concerns about program state in the
future. This is important because functions may need to tokenize
strings while a function higher in the stack has not completed
tokenization on a different string. This would lead to incorrect
behaviour. By using strtok_r, this is no longer a concern when adding
features or fixing bugs.
There are other reasons, but most of them relate to thread-safety.
While ngircd could benefit from having worker threads processing
server input (since reading from a socket is a low cost operation
compared to actually processing the input), this isn't a very good
argument for ngircd at the moment. There are more useful ways to make
ngircd faster at the moment that don't include reorganizing vast
swaths of code. More on that later.
Anyway, back to the original point: this patch cleans up IRC_JOIN
considerably and provides strtok_r() for future use through /src/portab.
The attached patch applies with no warnings to the latest git head.
./scott
Hello,
I am using ngircd 0.11 with gnutls support enabled. For a certain
reason (guess why) I had to change the certificate but appearently
after a reload ngircd still presents the old certificate.
Did I miss something (or did I something wrong)? Else I'd like to add
a feature request to reload the SSL credentials upon SIGHUP.
Christoph
Hi!
ngIRCd 0.12.0 has been released and can be downloaded from the
following locations, as usual:
a) ftp://ngircd.barton.de/pub/ngircd/
ftp://ngircd.barton.de/pub/ngircd/ngircd-0.12.0.tar.gz
b) ftp://ftp.berlios.de/pub/ngircd/
ftp://ftp.berlios.de/pub/ngircd/ngircd-0.12.0.tar.gz
Patches from 0.11.1 (about 220K uncompressed) and 0.12.0-pre2 as well
as GnuPG signatures are available, too.
The most prominent enhancement is support for IPv6, and the complete
ChangeLog since 0.11.1 is:
- Fix Bug: 85: "WHO #SecretChannel" that user is not a member of now
returns proper RPL_ENDOFWHO_MSG instead of nothing. (Ali Shemiran)
- Fix complie on FreeBSD 5.4 and AIX.
- If bind() fails, also print ip address and not just the port number.
ngIRCd 0.12.0-pre2 (2008-04-29)
- IPv6: Add config options to disabe ipv4/ipv6 support.
- Don't include doc/CVS.txt in distribution archive, use
doc/GIT.txt now!
- Documentation: get rid of some more references to CVS, switch
to GIT.
- Get rid of cvs-version.* and CVSDATE definition.
- Report ERR_NOTONCHANNEL when trying to part a channel one is
not member of.
- Testsuite: remove erroneous ConfUID setting in config file.
ngIRCd 0.12.0-pre1 (2008-04-20)
- Include Mac OS X Xcode project in distribution archives.
- Do not exit on SIGHUP or /REHASH if the config file cannot opened.
- Add IPv6 support.
- Install a LaunchDaemon script to start/stop ngIRCd on Mac OS X.
- Implemented IRC commands INFO, SUMMON (dummy), and USERS (dummy)
and enhanced test suite to check these commands. (Dana Dahlstrom)
- RPL_WHOREPLY messages generated by IRC_WHO didn't include
flags (*,@,+). (Dana Dahlstrom)
- IRC_WHO now supports search patterns and will test this against
user nickname/servername/hostname, etc. as required by RFC 2812,
Section 3.6.1. (reported by Dana Dahlstrom)
- Add test cases for "WHO" command. (Dana Dahlstrom)
- Implement RFC 2812 handling of "0" argument to 'JOIN': must be
treated as if the user had sent PART commands for all channels
the user is a member of. (Dana Dahlstrom)
- Allow NOTICEs to be sent to a channel. (Fabian Schlager)
Nine people already committed to the post-0.12.0 GIT repository, so
there are good chances that the next release of ngIRCd won't be too
far in the future ;-)
You can use this mailing liste, the ngIRCd bug tracker (http://arthur.barton.de/cgi-bin/bugzilla/index.cgi
) or the IRC channel #ngircd on irc.barton.de:6667 (irc://irc.barton.de/ngircd
) to get in touch with us.
Regards
Alex
Hello all,
We are steadily moving towards ngIRCd 0.12.0 and I released ngIRCd
0.12.0-pre2 for testing a few days ago (0.12.0-pre1 has been released
a little bit too quickly, missed some "needed" features, and therefore
hasn't been announced loudly).
Most notably, ngIRCd will support IPv6, became more RFC-compliant, has
enhanced support for Mac OS X, and the source code repository has been
switched from CVS to GIT (see below).
The ChangeLog for the upcoming 0.12.0 release reads as following:
ngIRCd 0.12.0-pre2 (2008-04-29)
- IPv6: Add config options to disabe ipv4/ipv6 support.
- Don't include doc/CVS.txt in distribution archive, use
doc/GIT.txt now!
- Documentation: get rid of some more references to CVS, switch
to GIT.
- Get rid of cvs-version.* and CVSDATE definition.
- Report ERR_NOTONCHANNEL when trying to part a channel one is
not member of.
- Testsuite: remove erroneous ConfUID setting in config file.
ngIRCd 0.12.0-pre1 (2008-04-20)
- Include Mac OS X Xcode project in distribution archives.
- Do not exit on SIGHUP or /REHASH if the config file cannot opened.
- Add IPv6 support.
- Install a LaunchDaemon script to start/stop ngIRCd on Mac OS X.
- Implemented IRC commands INFO, SUMMON (dummy), and USERS (dummy)
and enhanced test suite to check these commands. (Dana Dahlstrom)
- RPL_WHOREPLY messages generated by IRC_WHO didn't include flags
(*,@,+). (Dana Dahlstrom)
- IRC_WHO now supports search patterns and will test this against
user nickname/servername/hostname, etc. as required by RFC 2812,
Section 3.6.1. (reported by Dana Dahlstrom)
- Add test cases for "WHO" command. (Dana Dahlstrom)
- Implement RFC 2812 handling of "0" argument to 'JOIN': must be
treated as if the user had sent PART commands for all channels
the user is a member of. (Dana Dahlstrom)
- Allow NOTICEs to be sent to a channel. (Fabian Schlager)
The pre-release (archives, patches, GnuPG signatures) can be
downloaded from the usual places:
- ftp://ngircd.barton.de/pub/ngircd/
ftp://ngircd.barton.de/pub/ngircd/ngircd-0.12.0-pre2.tar.gz
- ftp://ftp.berlios.de/pub/ngircd/
ftp://ftp.berlios.de/pub/ngircd/ngircd-0.12.0-pre2.tar.gz
Please download this release and throw it at as many machines as you
can reach ;-)
And please let us know about all problems you encounter -- thanks!
GIT Source Code Repository
During the last weeks we switched the source code repsoitory from CVS
to GIT, because Florian and I think that GIT eases distributed
development a lot -- and the first results do look very promising,
have a look at the great work done by Dana and his team. Thanks!
Actually the old CVS repository is still up and running, but no new
code will be committed to it. Eventually I'll shut it down, but are a
little bit unsure about when and how ...
The new official GIT repository of ngIRCd can be reached as following:
- git://ngircd.barton.de/ngircd.git
So you can use the command "git clone git://ngircd.barton.de/
ngircd.git" to download a local copy of this repository. Please have a
look at doc/GIT.txt and the excellent GIT documentation located at http://git.or.cz/
for more details.
A "GITweb" interface (as replacement for CVSweb) is available, too:
- http://ngircd.barton.de/ngircd.git [which will redirect to:]
- http://ngircd.barton.de/cgi-bin/gitweb.cgi?p=ngircd.git
So please test ngIRCd, report bugs, code patches, and let us know how
ngIRCd could be enhanced. Thanks!
You can use this mailing liste, the ngIRCd bug tracker (http://arthur.barton.de/cgi-bin/bugzilla/index.cgi
) or the IRC channel #ngircd on irc.barton.de:6667 (irc://irc.barton.de/ngircd
) to get in touch with us.
Regards
Alex