Certainly this must be in our handling of STL containers and such, however it is quite disheartening.
Problems I have noticed to-date with the most current release:
Random invalid-pointers in vectors before any object deletion has occurred. (Most commonly in the Subregion::Charlist and Itemlists, but I have now also experienced this in the cItemHandle::Items vector)
De-allocation issues causing crashes upon destruction of a class containing a populated vector. (This has been occurring recently in the CSpeechQueue destructor causing an assertion)
Failure of an object to be properly removed from a vector upon its deletion
These are only a few that stick at the front of my brain, but by-and-large I am stumped by them when they have occurred recently, as I have been unable to find a cause.
Is our handling of STL truly that bad, or am I missing something?
I haven't looked at it that closely, so I don't know how much may or may not have changed in the last 9 months. But if it is at all similar to how we used to.... then I'm not sure why you're seeing what you're seeing, because the use of STL has actually been quite reasonable, and very similar to the way everyone else uses it, by and large.
Did you change the 32 to MAX_NAME? Because if not, I could see how that could bork you up elsewhere (it was borking up STL stuff in VC 7/7.1 in the same sort of fashion). But I'm sure I'll read you did in another thread, so....
I found and fixed the issue with itemcount loops, it could also have possibly affected charcount loops, and it was originating from storing a CBO rather than the actual index in the deletionQueue, then doing a calcFromSer on it.
I modified this to simply storing the index right off and removing the object from the hash table and have not since been able to reproduce an itemcount crash.
As for issues with SubRegions, I checked thouroughly for places where items were possibly not removed from the subregion before their container or location changed, unable as of yet to find any flaws in that respect. However I did implement some self-healing code to our wrapper functions which should at least trap the issue before it becomes a crash or a nuisance.
giwo wrote:However I did implement some self-healing code to our wrapper functions which should at least trap the issue before it becomes a crash or a nuisance.
For what it is worth, as a general rule, this is bad practice. Other then for debug, it masks the problem, which lets it continue, and worse, over time, adds additional code to get around a problem that later developers may not appreciate the rational.
Why not instead, raise an exception (since the packet code uses exceptions), and dump all pertinate information to a log file that can be used for debuggin (and sent to the developer for examination).
I'd have to agree with Punt, hiding the problem might make it disappear for a while, but ultimately, you really do want to know what the problem is, where it is, and how you can go about fixing it. Annoying short term, perhaps, but longer will generate a cleaner code base.
As do I, however I can easily enough reproduce the crash on my own without it causing prolonged problems for users, I am aware of it, and how to reproduce, they can provide me with no further information, I just have to figure out exactly what is causing it, and for that I have had not much time, nor is there anyone else who has tackled it yet to this point.
giwo wrote:As do I, however I can easily enough reproduce the crash on my own without it causing prolonged problems for users, I am aware of it, and how to reproduce, they can provide me with no further information, I just have to figure out exactly what is causing it, and for that I have had not much time, nor is there anyone else who has tackled it yet to this point.
Giwo,
The concern is all a mater of time. Yes you may be able to recreate at will. Suppose you are unable to solve the problem within your time constraints for over a year. Within that year, say others have taken the code and added to it. Will they understand the rational? Will their changes also affect it? If a short time, everything is understandable. But it all goes to time, and code management. Something UOX hasn't been steller at, but a serious of single point decisions based on what is now (which makes it becom a self fulfilling prophecy, for no one else can pick it up due to these type of things).
I only offered it as a consideration. Clearly you have, which is great. Have you also considered raising an exception when the error occurs (before you "correct" it), to capture as much information as possible (to get different scenarios, there may be more then one). If you have considered it (not if you have implemented it), then please don't worry about this post. It was only food for thought.
I have no intention of leaving it as it is, however UOX is still too overgrown to be anything close to simple to maintain, especially for one or two people.
Once I have split things out a bit, and hopefully moved a bulk of our non-essential code into JavaScripts, it should vastly improve UOX from a maintenance perspective allowing for easier following and trapping of problems.
Were this the only bug that UOX faces (SubRegion data corruption) I would gladly focus my attention on trapping it until I can root it out, however we currently have several related issues (other vectors being corrupted mid-iteration) which leads me to believe that theese issues are being caused by:
1) bad technique
2) patchwork code which is simply far too large to inspect all of.
Thus my conclusion has been to focus on cleaning, re-organizing, and ultimately kicking out functions to JS until we can bring UOX to a state where one can follow problems without spending hours simply trying to trace its path through UOX.
Never fear, I have taken out my self-healing (and much hated ) code, as I have (quite hopefully) fixed the problem in SubRegions once and for all.
There are, of course, other areas that will need the same TLC, but for now this will be quite a step forward (given that it actually works properly).
As for STL issues with Hash Tables, this is being solved quite simply by getting rid of them alltogether. Upon having more of a look at them, I decided they were entirely unsuited to our wants or needs, and were really just bad adaptations from the former hash tables to maps and vectors disguised as hash tables. That being the case, I'd rather do the legwork elsewhere than worry about the additional hassles of hash.h.
My implementation of hash wasn't *that* bad. But yes, it involved reinventing a lot of wheels. Did you get a chance to look at the ObjectFactory stuff I sent you? It gets rid of chars[]/items[]/charsp[]/itemsp[] out of global scope, and makes a single class responsible for the creation and destruction of all game world objects.
*grins* somehow I knew as I was posting that, that you had re-written hash for pre-1.0
And I agree, it wasn't all that bad, but it obviously had some issues (as serials were vanishing into thin air).
Anyhow, not having hash at all is the way of the future.
I got a chance to glance at your object factory stuff yesterday, but only in passing, as I was at work. Downloaded all of the zips you sent me today, so will get to merging all that in tonight.
I sent you my latest (note before merging) today just so you can see what I've been working on (lest we work on the same issue moreso).