This message is for discussion, the patches attached are not intended
to be committed quite yet. I'd like to spend at least one more night
of heavy break-in testing before finally signing off on my modified
patch or the original.
Florian sent me a patch (attached: hashtable.diff) last week or so.
Applying it changes the data structure backing store for users and
channels from using linked lists to using a hash table.
I wrote a test harness and ran 1000 clients against a local ngircd
using the hashtable patch to test for memory leaks and logical
errors. The test clients ran in a tight loop, and as often as
possible they would change their nicks, join and/or part channels,
send messages to channels (existent and not) for 18 hours. While
using my own local client to watch the carnage, the ircd stayed very
responsive and the test harness did not experience any errors with
data constistency when comparing each client's local channel list with
the results of WHOIS. Other than memory issues already present
(getpwd() in libc leaks a small amount of memory, but it is run once
at startup), there were no memory leaks that I could find. I'm going
to retest the ircd on a different system tonight (one where I have
better tools for profiling and memory) to ensure that there are no
memory issues.
I've attached my own modified version of the patch
(hashtable.mine.diff) for subsequent review. There is only one
notable change, in Client_Destroy(), where I've restructured nearly
the entire function.
Comments on the patch are inline below (please note that the quoted
text is *not* the complete patch. the quoted text is still in valid
diff syntax, but I removed large portions for brevity). If you just
want the final thoughts, jump to the end of this message.
> Index: ngircd/src/ngircd/channel.c
> ===================================================================
> --- ngircd.orig/src/ngircd/channel.c 2008-05-25 10:52:59.000000000
> -0700
> +++ ngircd/src/ngircd/channel.c 2008-05-25 13:05:55.000000000 -0700
this is a huge win. every single channel search goes from being
linear in time to almost constant time.
> @@ -226,26 +217,14 @@ Channel_Part(CLIENT * Client, CLIENT * O
> assert(Name != NULL);
> assert(Reason != NULL);
>
> - /* Check that specified channel exists */
> - chan = Channel_Search(Name);
> - if (!chan) {
> - IRC_WriteStrClient(Client, ERR_NOSUCHCHANNEL_MSG,
> - Client_ID(Client), Name);
> - return false;
> - }
> -
> - /* Check that the client is in the channel */
> - if (!Get_Cl2Chan(chan, Client)) {
> - IRC_WriteStrClient(Client, ERR_NOTONCHANNEL_MSG,
> - Client_ID(Client), Name);
> + chan = Channel_Search( Name );
> + if(( ! chan ) || ( ! Cl2Chan_Get( chan, Client )))
> + {
> + IRC_WriteStrClient( Client, ERR_NOSUCHCHANNEL_MSG,
> Client_ID( Client ), Name );
> return false;
> }
>
> - /* Part client from channel */
> - if (!Remove_Client(REMOVE_PART, chan, Client, Origin, Reason, true))
> - return false;
> - else
> - return true;
> + return Remove_Client( REMOVE_PART, chan, Client, Origin, Reason,
> true);
> } /* Channel_Part */
>
>
> @@ -896,46 +711,43 @@ Remove_Client( int Type, CHANNEL *Chan,
retaining the variable c to represent the channel structure instead of
using the variable Chan in this change would make this portion of the
diff a lot less intimidating and invasive. However, I did not remove
this portion of the diff in my version because it cleans up the method
considerably, making variable naming throughout the function much more
consistent.
> * so only inform other clients in same channel. */
> assert( InformServer == false );
> LogDebug("User \"%s\" left channel \"%s\" (%s).",
> - Client_Mask( Client ), c->name, Reason );
> + Client_Mask( Client ), Chan->name, Reason );
> break;
> case REMOVE_KICK:
> /* User was KICKed: inform other servers and all users in channel
> */
> if( InformServer )
> IRC_WriteStrServersPrefix( Client_NextHop( Origin ),
> - Origin, "KICK %s %s :%s", c->name, Client_ID( Client ), Reason);
> - IRC_WriteStrChannelPrefix(Client, c, Origin, false, "KICK %s %s :
> %s",
> - c->name, Client_ID( Client ), Reason );
> + Origin, "KICK %s %s :%s", Chan->name, Client_ID( Client ),
> Reason);
> + IRC_WriteStrChannelPrefix(Client, Chan, Origin, false, "KICK %s
> %s :%s",
> + Chan->name, Client_ID( Client ), Reason );
> if ((Client_Conn(Client) > NONE) &&
> (Client_Type(Client) == CLIENT_USER))
> {
> IRC_WriteStrClientPrefix(Client, Origin, "KICK %s %s :%s",
> - c->name, Client_ID( Client ), Reason);
> + Chan->name, Client_ID( Client ), Reason);
> }
> LogDebug("User \"%s\" has been kicked off \"%s\" by \"%s\": %s.",
> - Client_Mask( Client ), c->name, Client_ID(Origin), Reason);
> + Client_Mask( Client ), Chan->name, Client_ID(Origin), Reason);
> break;
> default: /* PART */
> if (InformServer)
> - IRC_WriteStrServersPrefix(Origin, Client, "PART %s :%s", c-
> >name, Reason);
> + IRC_WriteStrServersPrefix(Origin, Client, "PART %s :%s", Chan-
> >name, Reason);
>
> - IRC_WriteStrChannelPrefix(Origin, c, Client, false, "PART %s :%s",
> - c->name, Reason);
> + IRC_WriteStrChannelPrefix(Origin, Chan, Client, false, "PART %s :
> %s",
> + Chan->name, Reason);
>
> if ((Client_Conn(Origin) > NONE) &&
> (Client_Type(Origin) == CLIENT_USER))
> {
> - IRC_WriteStrClientPrefix( Origin, Client, "PART %s :%s", c-
> >name, Reason);
> + IRC_WriteStrClientPrefix( Origin, Client, "PART %s :%s", Chan-
> >name, Reason);
> LogDebug("User \"%s\" left channel \"%s\" (%s).",
> - Client_Mask(Client), c->name, Reason);
> + Client_Mask(Client), Chan->name, Reason);
> }
> }
>
> - /* Wenn Channel nun leer und nicht pre-defined: loeschen */
> - if( ! strchr( Channel_Modes( Chan ), 'P' ))
> - {
> - if( ! Get_First_Cl2Chan( NULL, Chan )) Delete_Channel( Chan );
> - }
> -
> + /* if channel empty and not predefined -> remove */
> + if ((hashtable_getfillcount(&Chan->members) == 0) && !
> strchr(Channel_Modes( Chan ), 'P'))
> + hashtable_delete(&My_Channels, Chan->name);
> return true;
> } /* Remove_Client */
>
> Index: ngircd/src/ngircd/cl2chan.h
> ===================================================================
> --- /dev/null 1970-01-01 00:00:00.000000000 +0000
> +++ ngircd/src/ngircd/cl2chan.h 2008-05-25 13:05:55.000000000 -0700
separating out the cl2chan functions is a really good refactoring move
as far as code cleanup goes.
> Index: ngircd/src/ngircd/client.c
> ===================================================================
> --- ngircd.orig/src/ngircd/client.c 2008-05-25 10:52:59.000000000
> -0700
> +++ ngircd/src/ngircd/client.c 2008-05-25 13:05:55.000000000 -0700
again this looks more like it could have been it's own patch instead
of being included here. that said, it works properly; since it's
already here we may as well merge it in.
> @@ -141,8 +146,17 @@ Client_ThisServer( void )
> GLOBAL CLIENT *
> Client_NewLocal(CONN_ID Idx, char *Hostname, int Type, bool Idented)
> {
> - return Init_New_Client(Idx, This_Server, NULL, Type, NULL, NULL,
> - Hostname, NULL, 0, 0, NULL, Idented);
> + /* # is illegal in a nick name, so #+CONN_ID is unique.
> + * We need this as a placeholder until the clients sets
> + the real nick. */
> + char fake_nick[CLIENT_NICK_LEN];
> +
> + snprintf(fake_nick, sizeof fake_nick, "#%x#", Idx);
> +#ifdef DEBUG
> + assert(Client_Search(fake_nick) == NULL);
> +#endif
> + return Init_New_Client(Idx, This_Server, NULL, Type, fake_nick,
> NULL,
> + Hostname, NULL, 0, 0, NULL, Idented);
> } /* Client_NewLocal */
>
>
> @@ -241,93 +269,110 @@ Client_Destroy( CLIENT *Client, char *Lo
> strlcat(msg, Client->id, sizeof (msg));
> }
>
> - last = NULL;
> - c = My_Clients;
> - while( c )
> +#ifdef DEBUG
> + strlcpy(nickbuf, Client->id, sizeof nickbuf);
> + if (hashtable_get(&My_Clients, nickbuf) == NULL)
> + LogDebug("Client_Destroy() called for unknown nickname %s!",
> nickbuf);
> +#endif
> + hashtable_iterator_reset(&My_Clients);
> + while(( c = hashtable_iterator_getnext(&My_Clients)))
> {
> - if(( Client->type == CLIENT_SERVER ) && ( c->introducer ==
> Client ) && ( c != Client ))
> - {
> - /* der Client, der geloescht wird ist ein Server. Der Client,
> den wir gerade
> - * pruefen, ist ein Child von diesem und muss daher auch
> entfernt werden */
> - Client_Destroy( c, NULL, msg, false );
> - last = NULL;
> - c = My_Clients;
> + LogDebug("Testing client nick: %s, want %s", c->id, Client->id);
> +
I think there's a serious optimization to be made here:
If Client is not a server, then no iteration is necessary, so maybe
separating this function into two functional portions, one for normal
clients and one for servers, would be a good idea, since this function
is needlessly inefficient if the Client is a server. This is an
especially large bottleneck with a multiserver network, since
destroying a server would have taken O(n^2) as written in this patch.
This portion of the code was changed in my patch.
> + if (c != Client) {
> + if ((Client->type == CLIENT_SERVER) && (c->introducer ==
> Client)) {
> + /* The client that is about to be removed is a server,
> + * the client we are checking right now is a child of that
> + * server and thus has to be removed, too. */
> + Client_Destroy( c, NULL, msg, false );
> + hashtable_iterator_reset(&My_Clients);
> + }
> continue;
> }
> - if( c == Client )
> - {
> - /* Wir haben den Client gefunden: entfernen */
> - if( last ) last->next = c->next;
> - else My_Clients = (CLIENT *)c->next;
> -
> - if( c->type == CLIENT_USER )
> - {
> - if( c->conn_id != NONE )
> - {
> - /* Ein lokaler User */
> - Log( LOG_NOTICE, "User \"%s\" unregistered (connection %d):
> %s", Client_Mask( c ), c->conn_id, txt );
> -
> - if( SendQuit )
> - {
> - /* Alle andere Server informieren! */
> - if( FwdMsg ) IRC_WriteStrServersPrefix( NULL, c, "QUIT :%s",
> FwdMsg );
> - else IRC_WriteStrServersPrefix( NULL, c, "QUIT :" );
> + /* c == Client */
…<snip>…
> + LogDebug("Remove Nick \"%s\" from hashtable (Destroy)", c->id);
> + hashtable_delete(&My_Clients, c->id);
> + break;
> }
> + hashtable_iterator_reset(&My_Clients);
> +
> +#ifdef DEBUG
> + assert(hashtable_get(&My_Clients, nickbuf) == NULL);
> +#endif
> } /* Client_Destroy */
>
>
> @@ -575,13 +635,14 @@ Client_GetFromToken( CLIENT *Client, int
this looks a lot less efficient than it could be. luckily, it's
called only once during a connection's lifetime, but if a connection
has not been fully registered and logged in, than why not use the
token (or a variation of it) as the hashtable's key?
> assert( Client != NULL );
> assert( Token > 0 );
>
> - c = My_Clients;
> - while( c )
> - {
> - if(( c->type == CLIENT_SERVER ) && ( c->introducer == Client ) &&
> ( c->token == Token )) return c;
> - c = (CLIENT *)c->next;
> + hashtable_iterator_reset(&My_Clients);
> + while((c = hashtable_iterator_getnext(&My_Clients))) {
> + if(( c->type == CLIENT_SERVER ) &&
> + ( c->introducer == Client ) &&
> + ( c->token == Token )) break;
> }
> - return NULL;
> + hashtable_iterator_reset(&My_Clients);
> + return c;
> } /* Client_GetFromToken */
>
>
> @@ -884,12 +939,11 @@ Client_MyServerCount( void )
this comment applies to all the Client_*Count() functions: performance
in these functions is very unfortunate, changing from a hash table to
a linked list does not make this any faster, but if performance in
these sections of the code is important, then the counts should
probably be cached. for extra credit, the cached values could be
updated in some way that does not include a O(n) search of the hash
table.
> CLIENT *c;
> unsigned long cnt = 0;
>
> - c = My_Clients;
> - while( c )
> - {
> + hashtable_iterator_reset(&My_Clients);
> + while ((c = hashtable_iterator_getnext(&My_Clients)))
> if(( c->type == CLIENT_SERVER ) && ( c->hops == 1 )) cnt++;
> - c = (CLIENT *)c->next;
> - }
> +
> + hashtable_iterator_reset(&My_Clients);
> return cnt;
> } /* Client_MyServerCount */
>
> @@ -1062,18 +1118,14 @@ Generate_MyToken( CLIENT *Client )
this function could also stand to use some help from caching. it
seems raucously inefficient.
> CLIENT *c;
> int token;
>
> - c = My_Clients;
> token = 2;
> - while( c )
> - {
> - if( c->mytoken == token )
> - {
> - /* Das Token wurde bereits vergeben */
> + hashtable_iterator_reset(&My_Clients);
> + while ((c = hashtable_iterator_getnext(&My_Clients))) {
> + if( c->mytoken == token ) {
> + /* Token already in use */
> token++;
> - c = My_Clients;
> - continue;
> + hashtable_iterator_reset(&My_Clients);
> }
> - else c = (CLIENT *)c->next;
> }
> Client->mytoken = token;
> Log( LOG_DEBUG, "Assigned token %d to server \"%s\".", token,
> Client->id );
> Index: ngircd/src/ngircd/hash.c
> ===================================================================
> --- ngircd.orig/src/ngircd/hash.c 2008-05-25 10:52:59.000000000 -0700
> +++ ngircd/src/ngircd/hash.c 2008-05-25 13:05:55.000000000 -0700
> @@ -19,16 +20,33 @@ static char UNUSED id[] = "$Id: hash.c,v
> #include "imp.h"
> #include <assert.h>
> #include <string.h>
> +#include <errno.h>
> +#include <stdlib.h>
>
> #include "defines.h"
> +#include "log.h"
> #include "tool.h"
>
> #include "exp.h"
> #include "hash.h"
>
allowing some way for the builder to set this at configure-time
(instead of adding -DNITIAL_HASHSIZE in their CFLAGS) would probably
be much appreciated, since network size varies enormously. if nothing
else, the initial size should err on the side of being too large, as
hash table performance does not suffer considerably from being too
empty.
> +#define INITIAL_HASHSIZE 23
> +
> +static struct bucket *hashtable_bucket_get(hashtable *h, const void
> *key);
> +static struct bucket *hashtable_bucket_steal(hashtable *h, const
> void *key);
>
> static UINT32 jenkins_hash PARAMS(( register UINT8 *k, register
> UINT32 length, register UINT32 initval ));
>
> +/* Enable Extra Debug Output in Hash Insertion/Removal/Iteration
> code.
> + * This creates a _LOT_ of debug messages. */
> +/* #define DEBUG_HASH */
> +
> +#ifdef DEBUG
> +static void hashtable_stats(const hashtable * const h);
> +#else
> +static inline void hashtable_stats(const UNUSED hashtable * const
> h) { }
> +#endif
> +
>
> GLOBAL UINT32
> Hash( const char *String )
> @@ -122,4 +140,493 @@ jenkins_hash( register UINT8 *k, registe
> } /* jenkins_hash */
>
>
> -/* -eof- */
> +static size_t get_larger_prime(size_t old)
> +{
> + size_t i;
since I've never seen an IRC network with more than 6664223 users,
this is probably fine. having artificial limitations allows for a
much faster get_larger_primes() than would otherwise be possible. I'd
suggest using consecutive primes that are approximately twice the
value of the previous in the list (extra credit: as close to a power
of 2 without exceeding).
> + static const size_t primes[] = { 223, 523, 823, 1123,
> + 1523, 1823, 2423, 4523, 8123,
> + 20323, 28723, 47123, 76423,
> + 96223, 133723, 220123,
> + 386723, 647723, 999623,
> + 1923323, 4735823, 6664223 };
> +
> + for (i=0; i < sizeof primes/sizeof primes[0];i++) {
> + if (old < primes[i])
> + return primes[i];
> + }
> +#ifdef DEBUG
> + Log(LOG_DEBUG, "Could not find larger prime than %u", old);
> +#endif
> + return old*2 + 1;
> +}
…<snip>…
> +static struct bucket *hashtable_iterator_getnextlist(hashtable *h,
> size_t startpos)
> +{
> + struct bucket *b = NULL;
> + size_t i = startpos;
> +
> + assert(h->hash_size > 0);
> + while (i < h->hash_size) {
> + b = h->buckets[i];
> + if (b) break;
> + i++;
> + }
> +
> + h->iterator.nextres = b;
> + h->iterator.curhash = i;
> +#ifdef DEBUG_HASH
> + LogDebug("hashtable_iterator_getnextlist: found %sresult, start at
> pos %u, now at %u",
> + b?"":"no ", startpos,i);
> +#endif
> + return b;
> +}
> +
> +
while we're here, it's a good time to mention: iterating over a hash
table should be done as little as possible. cscope reports that there
are 16 places in the code that call hashtable_iterator_getnext(),
which is probably at least 15 too many.
> +void *hashtable_iterator_getnext(hashtable *h)
> +{
> + void *retval;
> + struct bucket *nextres;
> +
> + assert(h != NULL);
> + nextres = h->iterator.nextres;
> +#ifdef DEBUG_HASH
> + if (!nextres) LogDebug("hashtable_iterator_getnext(): initial h-
> >iterator.nextres == NULL");
> +#endif
> + if (h->iterator.curhash == 0 && !nextres)
> + nextres = hashtable_iterator_getnextlist(h, 0);
> +
> + if (!nextres || h->iterator.curhash >= h->hash_size)
> + return NULL;
> +
> + retval = nextres->elemptr;
> + h->iterator.nextres = nextres->next;
> +
> + if (!h->iterator.nextres) /* no more buckets in this list, find
> next list */
> + (void)hashtable_iterator_getnextlist(h, h->iterator.curhash+1);
> +#ifdef DEBUG_HASH
> + LogDebug("hashtable_iterator_getnext(): %s, next result: %s",
> + retval ? "ok": "(NULL)", h->iterator.nextres ? "yes" :"(NULL)");
> + if (h->iterator.nextres)
> + assert(hashtable_checkref(h, h->iterator.nextres->keyptr) == 1);
> +#endif
> + return retval;
> +}
> +
> +
as per my recent change using strtok_r() in IRC_JOIN(), it's probably
worth looking into allowing a caller to provide/use their own iterator
context variables. as much as iterating a hash table is awful,
sometimes it's necessary. this is more a note for the future, as
adding support for iterator contexts provided by the caller would be
an unagreeably large change to add to an already large patch.
> +void hashtable_iterator_reset(hashtable *h)
> +{
> + assert(h != NULL);
> + h->iterator.curhash = 0;
> + h->iterator.nextres = NULL;
> +}
> +
…<snip>…
it's good that this function is provided, but it should be called as
little as possible via allowing the size of the table to be set easily
at compile-time or (even more flexible) in the config file.
> +static bool rehash(hashtable *h, size_t newsize)
> +{
> + hashtable newhash;
> + size_t i;
> +#ifdef DEBUG
> + unsigned int collisions = 0;
> +#endif
> +
> + hashtable_stats(h);
> +
> +bool hashtable_add(hashtable *h, const void *key, void *item)
> +{
> + size_t pos;
> + struct bucket *b;
> +
> + assert(h->hash_size > 0);
> + assert(key != NULL);
> + assert(item != NULL);
> +
> + if ((h->hash_fillcount * 100 / h->hash_size) > 90)
> + rehash(h, get_larger_prime(h->hash_size));
> +
> + errno = 0;
> + if (hashtable_bucket_get(h, key)) {
> + errno = EEXIST; /* dupe */
> + return false;
> + }
> +
> + b = bucket_alloc();
> + if (!b) return false;
> +
There's nothing in the hash table's implementation that seems to
require a prime table size (usually only required by doublehashing,
which does not use a list of buckets on collisions). A minor
optimization could include enforcing a power-of-2 table size, since
compilers for some platforms compile modulus to a multiply and divide,
whereas modulus by a power of 2 is strength-reduced by all at-least-
semi intelligent compilers as a simple and instruction. Using powers
of 2 for the table size would also make the get_larger_prime function
faster and easier, if not misleadingly named.
> + pos = h->hashfunc(key) % h->hash_size;
> + b->keyptr = key;
> + b->elemptr = item;
> + b->next = h->buckets[pos];
> + h->buckets[pos] = b;
> +
> + h->hash_fillcount++;
> +#ifdef DEBUG_HASH
> + LogDebug("hashtable_insert(): pos %u", pos);
> + assert(hashtable_checkref(h, key) == 1);
> +#endif
> + return true;
> +}
> +
> +
…<snip>…
> Index: ngircd/src/ngircd/irc-info.c
> ===================================================================
> --- ngircd.orig/src/ngircd/irc-info.c 2008-05-25 10:52:59.000000000
> -0700
> +++ ngircd/src/ngircd/irc-info.c 2008-05-25 13:05:55.000000000 -0700
there should probably be a faster way to iterate through servers than
this. iterating through all open connections feels a lot like using a
shotgun in a crowd to kill a pickpocket. for a future optimization,
maybe servers and users could warrant separate tables?
> @@ -225,7 +226,7 @@ IRC_LINKS( CLIENT *Client, REQUEST *Req
> {
> if( ! IRC_WriteStrClient( target, RPL_LINKS_MSG,
> Client_ID( target ), Client_ID( c ),
> Client_ID( Client_TopServer( c ) ? Client_TopServer( c ) :
> Client_ThisServer( )), Client_Hops( c ), Client_Info( c ))) return
> DISCONNECTED;
> }
> - c = Client_Next( c );
> + c = Client_Next();
> }
>
> IRC_SetPenalty( target, 1 );
this is slow, but so is calling /names on any network. for future
scaling, it might be worth looking into supporting the ability to
return a "server too busy" reply if the server's input socket has
waiting data being blocked by running this command?
> @@ -382,7 +383,7 @@ IRC_NAMES( CLIENT *Client, REQUEST *Req
> if( ! IRC_Send_NAMES( from, chan )) return DISCONNECTED;
>
> /* naechster Channel */
> - chan = Channel_Next( chan );
> + chan = Channel_Next();
> }
>
> /* Nun noch alle Clients ausgeben, die in keinem Channel sind */
> Index: ngircd/src/ngircd/irc-login.c
> ===================================================================
> --- ngircd.orig/src/ngircd/irc-login.c 2008-05-25 10:54:27.000000000
> -0700
> +++ ngircd/src/ngircd/irc-login.c 2008-05-25 13:05:55.000000000 -0700
this change is related to the change in Client_SetID() (more
precisely, Client_NewLocal() and Init_New_Client() ) (where a unique
identifier is used for clients that have not logged in, but are
connected). It probably should have been part of a different change,
but it makes a lot of use of the new hash table backend, so extracting
it out is not worth the trouble.
> @@ -244,7 +244,9 @@ IRC_NICK( CLIENT *Client, REQUEST *Req )
> Client_Conn( Client ));
>
> /* Register new nickname of this client */
> - Client_SetID( target, Req->argv[0] );
> + if (!Client_SetID( target, Req->argv[0] ))
> + return IRC_WriteStrClient( Client, ERR_NICKNAMEINUSE_MSG,
> + Client_ID( Client ), Req->argv[0] );
>
> /* If we received a valid USER command already then
> * register the new client! */
> Index: ngircd/src/ngircd/irc-oper.c
> ===================================================================
> --- ngircd.orig/src/ngircd/irc-oper.c 2008-05-25 10:52:59.000000000
> -0700
> +++ ngircd/src/ngircd/irc-oper.c 2008-05-25 13:05:55.000000000 -0700
> @@ -292,7 +292,7 @@ IRC_WALLOPS( CLIENT *Client, REQUEST *Re
again with the iterating. this is really inefficient and something to
look at in the future.
> if (!from)
> return IRC_WriteStrClient(Client, ERR_NOSUCHNICK_MSG,
> Client_ID(Client), Req->prefix);
>
> - for (to=Client_First(); to != NULL; to=Client_Next(to)) {
> + for (to=Client_First(); to != NULL; to=Client_Next()) {
> if (Client_Conn(to) < 0) /* no local connection or WALLOPS origin */
> continue;
>
> Index: ngircd/src/ngircd/irc-write.c
> ===================================================================
> --- ngircd.orig/src/ngircd/irc-write.c 2008-05-25 10:52:59.000000000
> -0700
> +++ ngircd/src/ngircd/irc-write.c 2008-05-25 13:05:55.000000000 -0700
> @@ -310,7 +311,7 @@ va_dcl
again, sending messages to many other servers should not require
iterating through the entirety of the client list.
> /* Ziel-Server gefunden. Nun noch pruefen, ob Flags stimmen */
> if(( Flag == '\0' ) || ( strchr( Client_Flags( c ), Flag ) !=
> NULL )) IRC_WriteStrClientPrefix( c, Prefix, "%s", buffer );
> }
> - c = Client_Next( c );
> + c = Client_Next();
> }
> } /* IRC_WriteStrServersPrefixFlag */
>
> Index: ngircd/src/ngircd/irc.c
> ===================================================================
> --- ngircd.orig/src/ngircd/irc.c 2008-05-25 10:52:59.000000000 -0700
> +++ ngircd/src/ngircd/irc.c 2008-05-25 13:05:55.000000000 -0700
> @@ -279,7 +279,7 @@ IRC_TRACE( CLIENT *Client, REQUEST *Req
again, iterating for servers should not require iterating through the
entirety of the client list.
> if( ! IRC_WriteStrClient( from, RPL_TRACEOPERATOR_MSG,
> Client_ID( from ), Client_ID( c ))) return DISCONNECTED;
> }
> }
> - c = Client_Next( c );
> + c = Client_Next();
> }
>
> IRC_SetPenalty( Client, 3 );
Final thoughts:
Most of the lines in this diff are actually due to function renaming
or variable renaming. Some of this was necessary (the arguments to
Client_Next() and Channel_Next() changed, for instance), but for the
most part the function name changes can be seen as code cleanup. Some
portions of it are valid bugfixes that could have been split out, but
in a few cases the bugfixes interact closely with the new code, so
it's very justifiable to include them. As I've thoroughly reviewed
and tested this patch, I feel that despite it's size, it is not
unnecessarily large or risky.
Users and servers (and in the future, services) should probably be
stored in separate hash tables. Iterating over a hash table of the
servers will be notably faster than iterating over a list of
connections and filtering out all clients that are not servers.
Almost anywhere that Client_Next() or Channel_Next() is called is a
possible place to optimize. In a lot of cases Client_Next() was being
used, but most of the set was not being used (when trying to iterate
over all servers or opers, for instance). This would be a
considerable speedup, especially since iterating over the hash table
takes longer than iterating over a linked list by approximately the m/
n fill ratio.
As ngircd grows and gets used on larger networks, it might be worth
finding expensive operations that cannot be sped up more (like /names,
or /list) and return a "server too busy" message if processing the
command would result in too many waiting inputs going unprocessed for
too long. Multithreading would fix this, but most of the code
(especially the hash table in this patch and the linked list currently
in use) is not thread safe.
/* comments welcome */
./scott