What arrays are there?

Sydius · Post by **Sydius** » Thu Mar 10, 2005 11:44 pm

So what arrays are there sitting in RAM in UOX3 now days?

Like is there one for items, one for connections, one for characters, one for accounts, one for scripts?...

Just curious.

Is STL being used extensively (or at all)?

I know hash-tables were used somewhat in the older version I worked with, but are hash-tables more prevalent now or less or about the same as the version UOXClassic was derived from (the same I derived mine from)?

Maarc · Post by **Maarc** » Fri Mar 11, 2005 3:22 am

That's a very loaded question, and the easiest way to find out... is probably to look through the source

I'll try and describe them mostly off the top of my head. We don't use flat arrays for almost anything any more, instead we use STL to a very significant degree.

Objects (items/characters/multis) are stored in their own hash indexed multi-maps based on their serial in the ObjectFactory. Pointers to them also exist in the fixed collection of SubRegions. This allows us to look up all items/characters in a smaller area (32 x 128 block).

Accounts are stored in STL maps (think a tree, to some degree, it's O(log n) lookup compared to O(n) previously).

Guilds, races, jails, temp effects, weather and GM/CNS queues are still largely vector based, changing to another structure didn't really benefit these areas, I don't think. Temp Effects might want to be changed to a list, given that they're heavily added/removed to, and the traversal is very linear and constant, not random.

Scripts are somewhat different, they're stored in ScriptSections and Script. Each class of script (items, npcs, and so on, look at the directories) get stored in their own bucket. We do a map based lookup on the section name, as well as supporting priority loading of the files. The map based lookup (again, O( log n )) is the fastest mechanism. Essentially, if it works as it's supposed to, it should be a very fast lookup (maximum of 14 probes, for instance, for 10000 records).

Packets are handled in a class factory sort of mechanism, and makes it easier for us to support new packets, fairly easily (incoming, at least, outgoing is much easier to do, just a new subclass of a class and use as needed).

STL is used extensively, we no longer roll our own container classes, and use STL classes everywhere.

Hash tables are far less prevalent now than they used to be, because, to my mind, they were used inappopriately in the past. Or at least, inappropriately once people went to a class based system. Things like what items a person is holding and so forth should not have been stored in a hash table, and they aren't any longer (it is extremely trivial, code wise, to find the items a person is holding now, we store our own list on each char/item).

There are other things I am sure I have missed, and you have two options if you want more details or knowledge.

1) Look at the code yourself

2) Ask a few more specific questions, your questions are quite broad.

The advantages of these changes have been greatly reduced search times in many places, better memory allocation / reductions, less bugs due to rolling our own container implementations, more organised and readable code, and less bugs due to the finicky nature of hash tables (I don't think we use any hash tables any more, except for storing the serial->item/char lookups).

Sydius · Post by **Sydius** » Fri Mar 11, 2005 4:13 am

Exactly the response I was hoping for.

Thanks!

giwo · Post by **giwo** » Fri Mar 11, 2005 4:39 am

Maarc pretty much said it all, the only thing I have to add are some thoughts on the current situation.

The largest overlap (IE wasted memory) is in the smaller not-so-vital sections. Things such as goplaces, we load all that data into many scriptsections and then wind up just moving it from that vector-based-system into a vector specially built to allow ease of use of the goplaces.

It is in these areas that I feel we could possibly cache the files into memory, loading them into their specific containers, then dump them only looking at the file again on a reload, saving us some RAM.

All-in-all, we are very STLish *laughs at his pun*.... Though it's only recently we've really started handling the STL properly (iterators and the like). CDataList was a relatively recent implementation to help better control some of our lists (in subregions and containers, mostly, where we do alot of regular addition and deletions).

Maarc · Post by **Maarc** » Fri Mar 11, 2005 5:01 am

Hmmm, that reminds me, I still haven't looked at chucking in the Dispose() method I was thinking of. If I add it, it might be possible to selectively dispose as we need to. Not to mention it helps with other parts.

While overlap still exists, I'd be guessing that's it's all fairly good, by and large.

Sydius · Post by **Sydius** » Fri Mar 11, 2005 5:35 pm

Why are the sub-regions 32x128? Kind of an odd shape, I would have thought. What information do they generally keep track of, and how do you handle situations where something happens on the edge of one?

Oh, and did you know SGI has a hash map STL container extension? It does the same thing, but much faster and without the need for a multi-map.

Maarc · Post by **Maarc** » Sat Mar 12, 2005 2:16 am

Yes, I know about hash_map. Yes, I know that SGI implements. But yes, I know not every compiler does, I think the default hash_map gets thrown into stdext namespace, not std

No point using containers that aren't widely implemented, unless you're prepared to isolate some compilers and not support them. While a neat extension, it's not part of the C++ standard.

As for 32x128, I'm not really sure why, to be honest. I remember there was some logic at the time. I think it had to do with the tendency for items/characters to be stacked together much more closely on a width basis than on a vertical basis. The numbers could be somewhat changed, I think, without too much trouble, though the files that you would save/load in the shared directory would be somewhat reorganised (those files are dependent on the structured size of SubRegion).

Sydius · Post by **Sydius** » Sat Mar 12, 2005 2:57 am

I believe hash_map is an extension to STL, not C++, so the only reason why a compiler wouldn’t support it is because it does not have that part of the library, which, in turn, would only require downloading it from the SGI web site. You would not be isolating any compiler by using it… it does not change the language, just the standard library that comes with the language, which is independent of any compiler (or should be, in theory).

punt · Post by **punt** » Sat Mar 12, 2005 3:03 am

Sydius wrote:I believe hash_map is an extension to STL, not C++, so the only reason why a compiler wouldn’t support it is because it does not have that part of the library, which, in turn, would only require downloading it from the SGI web site. You would not be isolating any compiler by using it… it does not change the language, just the standard library that comes with the language, which is independent of any compiler (or should be, in theory).

At one time STL was seperate from the C++ standard library (for that matter, the string class was as well). anyway, most of STL got added to the C++ standard library and thus part of the C++ standard (so a complient C++ compilier would be expected to inlcude it). That is what Maarc is referring to.

SGI continued their STL library (Also from AT&T) and one can find one that mimics it in STLPORT. However, those "extensions" are not included in many of the C++ compiliers, as they are not part of the specification for the C++ standard library.

Clearly one could just say, use STLPORT period for all compiliers for the project, but I believe at the present, they are trying to keep the library to a minimum.

Sydius · Post by **Sydius** » Sat Mar 12, 2005 4:04 am

Wouldn't the trade-off of including another lib be worthwhile, though, for the less complex code and faster execution?

punt · Post by **punt** » Sat Mar 12, 2005 4:22 am

Sydius wrote:Wouldn't the trade-off of including another lib be worthwhile, though, for the less complex code and faster execution?

If one considers the larger picture, versus doing as one goes. And even then, there is always a trade off between complexity and usablity. Speed increase hasn't been shown to actually affect the speed of the server. It may be swamped via user action, or other server actions. So that doesn't guarantee an observable speed increase.

But assuming it does, then at what point does on decide there are "two" many libraries? There is all ready hooks into boost in the code, as well as zlib. If one is not careful, one can quickly get overlapping libraries and a complete mess.

Compound that to then version control, cross dependencies, and then complexity to the user. And what dependencies they have for the compilers and OS's one hopes to support.

Any point decision by itself normally makes perfect sense. But that same point decision MAY not make sense in the bigger picture. Just has to be considered.

giwo · Post by **giwo** » Sat Mar 12, 2005 4:35 am

While I certainly don't mind the addition of important libraries, I perferr to keep things as clean as I can, myself.

Boost was shortly tested for a while, but I'm pretty sure very little of the remnants of that tests are still around. As for zlib, I wasn't aware that it was ever even tested in the source, I was actually asked to do that years back and haven't gotten around to it yet.

(The upside of that is we likely won't implement zlib now, so saved me some work, I guess).

Maarc · Post by **Maarc** » Sat Mar 12, 2005 4:49 am

A quick check shows no reference to boost any more, whether it be includes or anything else, it was a short term test to see if it would solve some of the portabilty issues (for things like directories), as well as dealing with string tokens (but UString handles that now).

I don't see references to ZLib, though I could be wrong. So the only real dependency we have, as far as I can see, is SpiderMonkey. I've been looking into NSPR when I've had some spare moments, and wondering if we can't include that as well, as that solves some portability issues (such as common thread model, network infrastructure, and directory IO, the major categories where the platforms are different). NSPR isn't necessarily a bad thing to add, as it's from the same people as SpiderMonkey, and SpiderMonkey can be built with it for thread safety (something we should also consider). So it's not a big stretch, dependency wise.

Sydius · Post by **Sydius** » Sat Mar 12, 2005 4:57 am

But it would simplify code considerably, wouldn’t it? I don’t see how it wouldn’t make a significant speed increase, either, and it would decrease RAM usage considerably, wouldn’t it?

I don’t much care if you guys use it… but just don’t understand the logic behind choosing not to.

punt · Post by **punt** » Sat Mar 12, 2005 5:03 am

Maarc wrote:A quick check shows no reference to boost any more, whether it be includes or anything else, it was a short term test to see if it would solve some of the portabilty issues (for things like directories), as well as dealing with string tokens (but UString handles that now).

I should have been clearer. I did not to indicate the code still used boost (although the version I checked out still had a few checks for stlport and boost (but not used, nor did more then defines).
I meant to show it indicated where one could go, if one started taking it as a parallel effort to make point decisions on library inclusion. Obviously I wasn't clear.

Nor did I stated it would not increase speed. I mearly cautioned to assume that speeding up any section of code would result in a true speed up in terms of usable operations.

giwo · Post by **giwo** » Sat Mar 12, 2005 7:19 am

I am all for the inclusion of npsr, had I the knowhow and motivation, I would have already included it myself.

As for the Hash map, we actually don't use a hash map at all, currently, largely because VC6 doesn't include it in its default STL. Basically what it comes down to is making it as simple as possible to get started with UOX3.

There are some needful inclusions (spidermonkey, possibly npsr in the future), but anywhere we can avoid doing anything that requires extra work to setup or os/compiler specific code, we do.

punt · Post by **punt** » Sat Mar 12, 2005 2:47 pm

giwo wrote:I am all for the inclusion of npsr, had I the knowhow and motivation, I would have already included it myself.

As for the Hash map, we actually don't use a hash map at all, currently, largely because VC6 doesn't include it in its default STL. Basically what it comes down to is making it as simple as possible to get started with UOX3.

There are some needful inclusions (spidermonkey, possibly npsr in the future), but anywhere we can avoid doing anything that requires extra work to setup or os/compiler specific code, we do.

I offer that perhaps before one continues to grab libraries to include, one lists all the things one is looking to gain from libraries, where all it would be applicable, and then see what libraries make sense. Are there implications that including two libraries would allow people do things multiple ways ? If so, is both acceptable, or is the goal to standardize?

Clearly NSPR can simplify and address many portablity issues. So can many libraries. How far does one to go? NSPR was also written when c++ compiliers where very immature, and takes a very consertative aproach. Is that consistent with what UOX is doing? Does it want or desire that far reaching approach?

A suggesttion may be to consider the spectrum, identify the areas and then identify the options for libraries to address that. Consider where one wants to go in totality, and what it means to "commit" to a library or set of library.

Then , once one gets the selection, implement in totatlity (something UOX has not demonstrated an ability to do well in the past, so I bring to the attention to consider).

As always only a suggestion. And perhaps it has been done, and a result is NSPR.

xir · Post by **xir** » Sun Mar 13, 2005 3:15 pm

At the same time, one is trying to achieve improving UOX and also compiler and platform portability. For such a small developer team, time is wasted on fiddling with portability tidbits rather than the goal of improving UOX. In the long term moving over to NSPR is a very viable option and will improve maintainance considerably. The NSPR will need to be included anyway to compile spidermonkey with thread safety.

There are a few points to consider though if one wants to move over to NSPR. NSPR is a C library which just wraps the system calls on different OSes into a common interface. There is no OO associated with it at all which is unfortunate. There is also how far one wants to go with portability. NSPR relies on its own datatypes (which are typedef'ed I think) - this could be a problem with type conflicts throughout the code base if one wasn't to port fully to NSPR's datatypes. Wrapping the system calls in UOX with NSPR calls would be easy enough to do in the short term, but it doesn't improve the design of UOX. UOX really does need a redesign. A few nicely wrapped NSPR classes would create the foundation for a more Object Orientated design. However I've looked at the code before and it really is going to be difficult to do this. Its probably not going to be even feasible as the code in some places is just all mingled together.
Take threading for example - the common way to do this would be to derive from a Thread class and then run it. In UOX threading is done by just spawning threads in the main procedure if I remember correctly. Another example would be the Network I/O. Normally one would create a Socket type object from which to derive all other Network type activity and a static container would hold all instances of that class (or derivations) so asynchronous event dispatching could be done with select() from a static function in that class. Since select() or WSAWaitForMultipleObjects() or whatever demultiplexing call you are using to scan for socket activity blocks it could possibly be given its own thread. In UOX however network activity is not centered at a certain point - it is spread over multiple calls and even over multiple paths of executions (the xFTP server?) and as such I notice one is forced to sleep in order to keep CPU activity to a limit. In my opinion network i/o is one of the more essential things that needs to be looked at. UOX after all is a server!

Perhaps I'm being a bit too harsh and critical of what things should be like. I've given up on UO for the time being anyway but just was browsing and wanted to add my two cents. It is quite possible to just replace the calls at the moment with the NSPR ones (if thats the wish). I've played with it in a project or two and it really isn't that difficult - the documentation is there and fairly complete.

Maarc · Post by **Maarc** » Mon Mar 14, 2005 3:06 am

No, nothing wrong with criticism, it's how people react to it that is important

I'm curious as to when you last looked at the code. Obviously, you looked at something recent (xFTP? I didn't even know there was something like that in there). But I didn't think the main game network IO was too bad, mostly. We have a socket class which wraps the socket ID, which has send/receive methods, and does it's own logging as needed, as well as attaching to the character. We have a structure for the packets we send/receive, they inherit from a base input/output class, which get passed to a character's socket, the receive based on a class factory on packet ID. And we have the network global container, which periodically polls for new connections and data updates. How would you change that? Just the main game interaction loop, how would you do that? I willingly admit my knowledge of network programming isn't huge, and certainly could be improved. And I know less about practical threading as well

The network stuff should be asynchronous, except we enforce that data gets received and acked in 2 seconds (configurable) or the socket gets killed, so it can be slowed down by slow people. I'm guessing you'd suggeset a handful of threads to handle the network IO.

As far as some of the other stuff goes, a lot of it would be purely implementation specific. For instance, the need for PRDir, for instance, would be used entirely as an implementation feature, it wouldn't get passed around. There aren't a huge number of spots where platform-specific stuff occurs, it's just a pain when it does occur.

As far as I know, there's only two main threads (with optional compile support for a third). Console, game logic and network (optional). I can think of a few other threads that would come in handy (split network into two, login vs main game, data loader for things as needed, caching sorts of stuff)

xir · Post by **xir** » Tue Mar 15, 2005 11:15 pm

Just like to say before I begin I don't mean to demean anyone's work (as it may appear so.) I've had a interest in Network I/O models recently and I've studied several examples. I myself am not a great programmer but I do love design. I may be wrong in my conclusions or results, and if so you can correct me

I just would like to discuss and perhaps show which model would be best for UOX3.

The Network I/O isn't really <i>that</i> bad. I'm just saying it could be possibly improved upon and if one is doing it again (in NSPR) a possible reactor model could be adapted (since it runs in a single thread) so you could maybe wrap the sockets nicely with the javascript without too much hassle. Maarc you are Abaddon in code right?

Let me discuss my opinions on UOX3's model (as you were interested.)

cSocket :
This class is a lot more than a "socket." It handles some network i/o, buffering, accounts, compression, targetting, coords, banking, timers, client version control and other stuff which mostly is protocol based.

cNetworkStuff:
Its hard to tell from the name what this actually does. Looking at code it handles server sockets, buffering, polling for socket activity, packet slicing and parsing of all clients and firewalls.

cPUOXBuffer:
This is the base packet class I presume which all other packets inherit.

UOX3 is mainly just a single threaded server right? Looking at the main loop the only two "events" which should trigger a cpu burst would be in the event of network activity or a timer trigger (lets say worldsaves come into the timer event.) In the cNetworkStuff class two functions are being called - the CheckConnections() (this polls for server activity) and CheckMessages() (this polls for all client activity.) Each of these functions calls select() seperately with a timeout value. I would guess from the main loop that you are struggling to fight a CPU battle. You are balancing the CPU by trying to "guess" according to the number of players on how much of the CPU can be dedicated to the program in the next turn of the loop.

UOXSleep( (cwmWorldState->GetPlayersOnline() ? 10 : 90 ) );

if( uiNextCheckConn <= cwmWorldState->GetUICurrentTime() || cwmWorldState->GetOverflow() ) // Cut lag on CheckConn by not doing it EVERY loop.
{
Network->CheckConnections();
uiNextCheckConn = BuildTimeValue( 1.0f );
}
Network->CheckMessages();

Your __LOGIN_THREAD__ define also gives it away that you were planning to (or have already done ?) splitting the polling into another thread. (By the way I'd like to know how that is going

) It seems to me there is a big problem with CPU balancing.

There are possibly two disadvantages to this model (In my opinion).

UOXSleep() <-- 1 or more network events happens during this and we have to wait

CheckMessages() <-- nothing happened
CheckConnections() <-- nothing happened
(waste of CPU - traversal of all client list (to set), numerous bit manipulations + 2 expensive system calls)

First model could possibly have a negative impact as the client list grows. Second problem is a waste of CPU.

My proposal to solve this problem would be to only have one select() system call. You can poll everything at the same time. For example.

(Remove the UOXSleep)
CheckNetworkActivity();
--> polls everything with select (you can supply a generous timeout here)
--> checks each file descriptor in turn and see if any activity
--> If activity on server socket
--> do what CheckConnections() does without the polling
--> If activity on client connections
--> do whatever CheckMessages() does without the polling
CheckAllTheTimerJunk();

Since select() can be asked to block until some network activity happened on all sockets you don't need to waste CPU doing it twice. It potentially acts as a "sleeper" too. (On UNIX you can potentially add the console into the select() poll too to check for its activity

) Of course you would need a timeout for the select() because you need to look at the timers, but I think if you do as I say it will cut down on cpu activity dramatically and you can remove the ugly sleeps.

Lets look at how wolfpack did their network layer since you were interested in a threading model.

Ok I'm not very familiar with wolfpack and not sure if my comments on it are fairly complete or exact but I get a general idea looking at the code in what is being done. I'm actually disappointed in what Wolfpack have done. The QTNetworkOperation class provides a very sweet reactor type network i/o model (which I commented on in my last post) that encapsulates the whole asynchronous operations without getting to the nitty gritty of low level socket interfaces. Wolfpack instead (looking at it) have decided to handle low level socket operations themselves possibly because they wished to implement a threading based Network Layer. Its actually quite interesting how wolfpack tackled the cpu problem by splitting up into threads. You can see remants of the old network i/o in UOX restructured (or perhaps it just happened that way that they are similiar but I doubt it

)

Wolfpacks network layer is split up into at least 4 paths of execution. (I believe 3 threads and the main thread included in some Network code?)

cListener :
Wolfpack seems to have developed their server socket using QT's raw socket class (although I don't know why they didn't just use QServerSocket.) This class is threaded and polls for incoming connections. (I also don't know why they set the device to non-blocking in a single thread - people must love wasting cpu these days.)

cUoSocket :
This is the client connection socket. I love how wolfpack do their packets and packethandlers. This is quite similiar to the cSocket class in UOX3 but is cleaned up a lot using packet events and handlers. This class doesn't really do much network i/o if any at all. It just holds the socket device. Instead you got to look at the cAsyncNetIO class where all the I/O is done.

cAsyncNetIO :
This class handles most if not all of the client connections I/O. The object itself is run in a seperate thread. cUoSockets "register" with this class for it to perform operations such as buffer management, packet slicing and encryption/(de)compression. The cAsyncNetIO is run constantly reading and writing the sockets. Note, that there is no demultiplexing call like select() (I don't even think QT supplies one) so what wolfpack seem to be doing is just attempting a read and write of each socket device in an infinite loop and checking the return value (yuck.) Of course, they need a sleep in this thread also to stop too much cpu consumption.

cUoPacket and other packets:
I love how they do they do the packets and packethandlers in Wolfpack. One has to look at it to appreciate.

cNetwork:
This executes within the main Wolfpack thread I believe. It is the focal point for all network activity. The class contains a list of all cUoSockets, a cAsyncNetIO object and 2 cListeners (one for login and gameserver in seperate threads.) The class is responsible for taking incoming connections and adding them to cAsyncNetIO for network handling.

Things I like what Wolfpack have done:
The listening class is good (which UOX does not have) as it distinguishes the two server sockets nicely.
The packets and packethandlers are done really really well!!!
I love how they split up the load between the different threads. It is done quite elegantly.

I'd love to see a cpu activity model for wolfpack, because I believe it can be still quite expensive if run on a single processor machine. I believe they still haven't solved the CPU problem. They are still using sleeps within the threads. Without a demultiplexing call cAsyncNetIO is constantly scanning all the sockets each loop and attempting to read from them even if none are ready. Then it goes into a 10 ms sleep I believe. (same problems as UOX) I'd believe wolfpack might run well on a SMP though.

If I was doing a single threaded network model. This is what I would do.

I'd have a singleton called nsSocketEventDispatcher, a nspr wrapper class nsSocket, an interface (abstract base class) named nsSocketEventHandler.

Let me draw up a quick diagram of what I think would be good.

Just to explain it.

Example cLoginServer would inherit from nsSocket and nsSocketEventHandler. It would override the event_close() and event_accept() of the interface so it and then register with the nsSocketEventDispatcher so it can notify the class of activities.
Likewise the cUoClient would inherit from both classes and override the event_read/write/connect/close and whatever else it needs. It would also register with the dispatcher. In the main loop you would call the singleton's method dispatchEvents() (you can supply a timeout of course) and this would use a demultiplexing call like select() to notify events to of the registered classes. The nice thing about this model is, that it doesn't waste any cpu in threads nor does it poll several times. It is event triggered. I beleive it is called the Reactive model.

There also is another model called the proactive which deals which threading I believe. I'm sure google would find you some nice material if you want to read up on it.

The 3 classes could be easily incorporated into 1 easily. I just showed the 3 to clarify it.

Anyways, as you asked, thats how I would do it.

Ultima Offline eXperiment 3 forums