Prior Discussions: Relational to Document DB #182

DavidRieman · 2023-10-23T07:32:19Z

DavidRieman
Oct 23, 2023
Maintainer

This thread contains restored content from prior forums; see #134 for details.

This covers some interesting snippets around how our data storage preferences evolved from many relational tables, through to our document DB preferences of today.

[2007]

[We had diagrams about the Data Access Layer (DAL), which SQL server tech among many configuration options was working best and being maintained at the time, the facades, stored procedures, update processes, where a couple devs tried and failed to make certain data updates, and so on. I'm not going to recreate these since they're all no longer relevant. It was several layers but highly modular, and our primary WM maintainer was primarily a Database Guy so most DAL maintenance was on him.]

[Feb 2010]

I've been curious lately about alternatives to RDBMS in software. This thinking came about mainly after I added save/load functionality to a tool at work; the whole process, including basic testing, took roughly half an hour. Granted, this was a small tool with a well-designed object-oriented approach. But XML serialization in .NET is so easy it's ridiculous; slap some [XmlIgnore] on properties you don't want to persist, etc, add some simple save/load methods using the serializer and you're basically done. I'm not really suggesting we save the World state as one giant XML file, but the experience got me thinking about how much overhead RDBMS adds to development of persistence solutions.
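For readers who haven't used it, here is a minimal sketch of the kind of save/load described above. The ToolSettings class and the XmlPersistence helper are hypothetical names made up for illustration (they are not from the tool in question); XmlSerializer and [XmlIgnore] are the standard .NET pieces.

```csharp
using System;
using System.IO;
using System.Xml.Serialization;

// Save a small settings object to XML text, then load it back.
var original = new ToolSettings { Name = "Mapper", GridSize = 16, UnsavedChanges = true };
string xml = XmlPersistence.Save(original);
ToolSettings loaded = XmlPersistence.Load<ToolSettings>(xml);

Console.WriteLine(xml);
Console.WriteLine($"{loaded.Name} {loaded.GridSize} {loaded.UnsavedChanges}");

// Hypothetical class for illustration. XmlSerializer needs a public type
// with a public parameterless constructor and public read/write properties.
public class ToolSettings
{
    public string Name { get; set; }
    public int GridSize { get; set; }

    // Runtime-only state: [XmlIgnore] keeps it out of the saved XML.
    [XmlIgnore]
    public bool UnsavedChanges { get; set; }
}

// Hypothetical helper wrapping the serializer in simple save/load methods.
public static class XmlPersistence
{
    public static string Save<T>(T value)
    {
        var serializer = new XmlSerializer(typeof(T));
        using var writer = new StringWriter();
        serializer.Serialize(writer, value);
        return writer.ToString();
    }

    public static T Load<T>(string xml)
    {
        var serializer = new XmlSerializer(typeof(T));
        using var reader = new StringReader(xml);
        return (T)serializer.Deserialize(reader);
    }
}
```

The ignored property comes back with its default value after loading, which is usually what you want for transient state.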

I am thinking about experimenting with object database software. Glancing through the db4o tutorial (for example) gets me excited about how these technologies can simplify things. Further, I'm considering experimenting with WheelMUD persistence. If it were to go well, would there be opposition to trying it out, maybe in a new branch until/if it turns out to make sense as a new direction? Does anyone have experience with ODBMS? If we're totally set on relational databases though (and I know a lot of work has gone into this, so it seems a reasonable response), I can certainly stave off my curiosity about object databases until some other work project prods it as a potential solution.

As for implementation... Query limitations aren't really a concern in my mind; maybe 'ranking' systems would need to query for traits among players, for instance, but LINQ and the like provide a simple means to do this stuff against db4o (for example) anyway. My main concern is how these techs handle changing classes over time; say we add a property to the Player class... for something like a MUD there really ought not be a bunch of hoops to jump through just to get the existing Players to be loadable again. Probably the same should hold for removing properties from a class.

Creating a branch and experimenting there is the right thing to do in this instance. So knock yourself out!
The main reasons for having a regular relational database are twofold:

The vast majority of corporate jobs involve coding against a relational database. This is part of the charter for WheelMUD, which is to help people gain real world work experience.

A lot of scenarios work well with a defined ORM. The two that come to mind are Web/WCF Services and Remote Admin. Having each object serialized would make this a nightmare to manage.

I have worked with db4o in the past. It was about 4 years ago. Developing inside the db4o environment was pretty easy, but it was quite hacky when working with .NET. The project I was working on was integrating a client's internal db4o databases to Microsoft Great Plains (now Microsoft Dynamics GP). It was... interesting.

I am doing some work with what is essentially an object database. We are serializing objects and dumping them in the database. But it isn't fast. It is really more of an object cache on a desktop. And that works great since we never know what we will want to throw in it next. But on the server we have a true relational database.

Since the WheelMUD server database doesn't change fast I don't see much benefit to it. If we were to ever have a thick client app though, that would be a good place for an object store of some type. The downside though to that is that when you upgrade an object you risk not being able to deserialize it again.

For whoever runs into this, let me preface this by saying that I am not a lawyer, and the statements here are personal thoughts.
I won't be experimenting with object databases after all, neither for this project nor for work purposes. There are a number of issues in addition to the points you guys have made, but mainly around licensing related issues.

There seem to be no sufficient free object databases which are non-GPL.
The best "free" option I found would probably have been db4o, but it is GPL (or commercial).
Commercial options tend not to disclose any hint of their rates, and you have to contact them for quotes; this is significantly prohibitive for me to, say, reason with a manager that it's worth trying it out, since licensing costs need to be a known factor with such discussions, especially when cheaper/free (non-object-database) alternatives exist.
I have read in several places now that Ms-PL is incompatible with GPL. Our code is Ms-PL. I believe we cannot build WheelMUD to be compatible with GPL software because we cannot link (many would argue neither dynamically nor statically) against such libraries, whether or not we intend to distribute them ourselves.
On a related note, we don't actually link any MySQL libraries directly, do we? I've never actually used MySQL yet so I don't know if any non-OS libraries are required to communicate with the server or whatnot. MySQL code itself is GPL, but merely communicating with a server that was written with GPL code does not constitute "derivative works" from what I understand. I think we'd only have a problem if we're linking some code that is written under GPL.

[We discussed further and decided not to pursue existing Object DBs.]

[Apr 2010]

[It was proposed to go to flat XML for storage. There was pushback but the option to support it as a separate means was on the table. We talked a bit about issues from XML storage of prior versions of the code.]

Previously I'd brought up alternatives to RDBMS and decided against trying an object database, but it brought up another point, which was in short that DB = useful experience for real world = part of our charter.

Sure, XML persistence is easy, but it has some pretty notable drawbacks like difficulty of non-loaded querying. Unless we always load everything into memory, say even all players who have ever been created (probably a bad idea), etc, then we'd have to get pretty "creative" to get around certain issues. Another example using the object hierarchy tree: unless an object keeps track of its derivatives in addition to knowing what it derives from, or every object is in memory, then there's not an easy way to know if a given object has any children in the hierarchy since we can't query it without processing tons of XML. That said, I can see really useful features using XML serialization. It may be a convenient way for importing/exporting zones for sharing rather than making a separate SQLite file for each zone or whatnot.

I can't really speak much on the previous re-writes as I've only been here during some of the C# incarnation myself. I think I recall someone saying things would get to a point where there was a realization of major architectural mistakes that would just be easier to rewrite and pull in certain key parts that worked.

I like "sealed" if we move forward with the object hierarchy concept. I'd like to have a technical idea of how the object hierarchy might play out before I'm totally sold on it though (i.e. do we somehow define a value for every property that means 'inherit'? Or is a derived 'virtual' object really a different construct that's an ID of the base object plus a key-value list of propertyNames-overrideValues?)

Yeah I get the "real world experience" thing. It's just a pain as I find myself wanting to overhaul the object/Item stuff and every time I think about it, I can't get around needing to dramatically change the DB to flatten the table hierarchy.

As for Items tracking parents and children, I actually said that above. If each object type has a unique name, an Item can keep a list of these names as strings for what it's based on, and what others use it as a base. Think of it as "Every Item knows its Parent and Children names". At that point resolving dependencies is a two-pass sweep: the first pass to populate a temporary name cache, the second to import.
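A rough sketch of that two-pass sweep, assuming each saved record carries only its own unique name and its parent's name. ItemRecord, Item, and ItemImporter are hypothetical names for illustration, not existing WheelMUD types.

```csharp
using System;
using System.Collections.Generic;

// The child is deliberately listed before its parent to show that record
// order does not matter once the name cache exists.
var records = new List<ItemRecord>
{
    new ItemRecord("Longsword", "Sword"),
    new ItemRecord("Sword", null),
};
Dictionary<string, Item> items = ItemImporter.Import(records);
Console.WriteLine(items["Longsword"].Parent.Name);
Console.WriteLine(items["Sword"].Children.Count);

// Hypothetical saved form: a unique name plus the name of what it is based on.
public class ItemRecord
{
    public ItemRecord(string name, string parentName) { Name = name; ParentName = parentName; }
    public string Name { get; }
    public string ParentName { get; }
}

// Hypothetical in-memory form: knows its parent and its children.
public class Item
{
    public Item(string name) { Name = name; }
    public string Name { get; }
    public Item Parent { get; set; }
    public List<Item> Children { get; } = new List<Item>();
}

public static class ItemImporter
{
    public static Dictionary<string, Item> Import(IEnumerable<ItemRecord> records)
    {
        var all = new List<ItemRecord>(records);

        // Pass 1: populate the temporary name cache with unlinked items.
        var cache = new Dictionary<string, Item>();
        foreach (ItemRecord record in all)
        {
            cache.Add(record.Name, new Item(record.Name));
        }

        // Pass 2: every name now resolves, so link parents and children.
        foreach (ItemRecord record in all)
        {
            if (record.ParentName == null) continue;
            if (!cache.TryGetValue(record.ParentName, out Item parent))
            {
                throw new KeyNotFoundException("Unknown parent: " + record.ParentName);
            }
            Item child = cache[record.Name];
            child.Parent = parent;
            parent.Children.Add(child);
        }
        return cache;
    }
}
```

Duplicate names fail in pass 1 (Dictionary.Add throws) and dangling parent names fail in pass 2, so bad import data is caught before anything is used.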

To answer your last question, that depends on our perf goals. We do a lot with it, heck we could even throw out a hierarchy altogether and just make child objects start as duplicates that have a memory of where they came from for tracking, or not. But that in and of itself would require that Items be much more flexible than they are now, and the class hierarchy of the codebase itself flattened a bit.

[May 2010]

I got to a point with my refactoring/simplifying where I wanted to rapidly prove that the automatically-filled properties idea I had was going to work. So I built a new quick throw-away project to test out the idea (I've recently heard this is called a "spike" so I'm using that term now). I'm happy to say that this stuff seems to work quite nicely and should serve to simplify the heck out of the codebase when applied to (at least) Behaviors. As an added bonus, Behaviors will not need to have any concept of the DAL, nor even a reference to the Data DLL. Which is good because I want the basic abstract Behavior to be something easily built on top of from other DLLs and exposed through an MEF catalog of Behaviors; when creating a Behavior instance as stored in the DB, the loader code would search the MEF catalog for a behavior of that name (like "MyNewBehavior" as a string would be stored in the DB for a MyNewBehavior, and so long as any DLL has been loaded which exposes such a type-named Behavior, then that Behavior can load). The loader code would then instantiate that MyNewBehavior with the constructor that takes a Dictionary of saved-property-names-to-saved-property-values as read from the DB. Most of the work to construct the behavior from that information is then performed in the base constructor, making derived behaviors much simpler to code. Anyway, here's the spike I wrote to demonstrate the construction logic:

using System;
using System.Collections.Generic;

Dictionary<string, string> map = new Dictionary<string, string>();
map.Add("damage", "2d6");
map.Add("speed", "1.2");
map.Add("damagetype", DamageType.Slashing.ToString());
var w = new WeaponBehavior(map);

Console.WriteLine(string.Format("{0} {1} at speed {2}", w.Damage, w.DamageType, w.Speed));
Console.ReadKey();

public enum DamageType
{
    Slashing,
    Piercing,
    Bludgeoning,
}

public abstract class Behavior
{
    protected abstract void SetDefaultProperties();

    public Behavior(Dictionary<string, string> inheritProperties)
    {
        SetDefaultProperties();
        if (inheritProperties != null)
        {
            InheritProperties(inheritProperties);
        }
    }
    private void InheritProperties(Dictionary<string, string> inheritProperties)
    {
        Type t = GetType();
        foreach (var property in t.GetProperties())
        {
            string key = property.Name.ToLower();
            if (inheritProperties.ContainsKey(key))
            {
                string value = inheritProperties[key];
                if (property.PropertyType == typeof(string))
                {
                    property.SetValue(this, value, null);
                }
                else if (property.PropertyType == typeof(int))
                {
                    int i = int.Parse(value);
                    property.SetValue(this, i, null);
                }
                else if (property.PropertyType == typeof(float))
                {
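                    // Parse with the invariant culture so a saved "1.2" loads the same on every machine.
                    float f = float.Parse(value, System.Globalization.CultureInfo.InvariantCulture);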
                    property.SetValue(this, f, null);
                }
                else if (property.PropertyType.IsEnum)
                {
                    object o = Enum.Parse(property.PropertyType, value, true);
                    property.SetValue(this, o, null);
                }
                else
                {
                    throw new TypeLoadException("Behavior constructor cannot populate this type: " + property.PropertyType.ToString());
                }
            }
        }
    }
}

public class WeaponBehavior : Behavior
{
    protected override void SetDefaultProperties()
    {
        Damage = "1";
        DamageType = DamageType.Bludgeoning;
        Speed = 1.0f;
    }

    public WeaponBehavior() : base(null) { }
    public WeaponBehavior(Dictionary<string, string> properties) : base(properties) { }
    
    public string Damage { get; set; }
    public DamageType DamageType { get; set; }
    public float Speed { get; set; }
}

What I'm pondering right now is how complicated properties would work; a behavior that houses other behaviors, for instance. I think that'd have to end up like "if property.PropertyType == typeof(Behavior) then the value given to us will be an instanceID for said behavior, which we should cause to be loaded by ID now" but so much for keeping the DAL reference out of the Behavior at that point? Maybe an interface for the loader? Thoughts?
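One way the loader-interface idea could look, purely as a sketch: the Behavior side depends only on a small interface, and whatever implements it can live next to the data layer. IBehaviorLoader, ContainerBehavior, LockBehavior, and InMemoryBehaviorLoader are hypothetical names, and the Behavior class here is a bare stand-in for the one in the spike.

```csharp
using System;
using System.Collections.Generic;

// Usage: the container receives the saved ID of its inner behavior plus a loader.
var loader = new InMemoryBehaviorLoader();
loader.Register(42, new LockBehavior());
var container = new ContainerBehavior("42", loader);
Console.WriteLine(container.Inner.GetType().Name);

// Hypothetical abstraction: behaviors see only this interface, never the DAL.
public interface IBehaviorLoader
{
    Behavior LoadById(long instanceId);
}

// Bare stand-in for the spike's Behavior base class.
public abstract class Behavior { }

public class LockBehavior : Behavior { }

public class ContainerBehavior : Behavior
{
    public ContainerBehavior(string savedInnerId, IBehaviorLoader loader)
    {
        // The saved value for a Behavior-typed property is an instance ID,
        // which is resolved through the interface rather than a DAL reference.
        Inner = loader.LoadById(long.Parse(savedInnerId));
    }

    public Behavior Inner { get; }
}

// In-memory implementation for the demo; a real one would query storage.
public class InMemoryBehaviorLoader : IBehaviorLoader
{
    private readonly Dictionary<long, Behavior> store = new Dictionary<long, Behavior>();

    public void Register(long instanceId, Behavior behavior) => store[instanceId] = behavior;

    public Behavior LoadById(long instanceId) => store[instanceId];
}
```

With this shape, the DLL that defines behaviors needs no reference to the Data DLL; only the code that constructs behaviors has to supply a loader.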

I think ItemBehaviorPropertyRecord will be going away entirely under this model, but I'm not sure as I'm still not very experienced with the DAL end of things.

[We talked through how we were currently storing things, how the spike should have used something other than WeaponBehavior for the demo (since people thought I was talking about Rules Engine adjacent stuff), and alternate ways to do the type recognition stuff with switches or type converters. But the main points were a demo of Behavior inheritance, and the value of finding a way around the hard limitation that specific relational columns are more difficult to maintain and expand over time, compared to the ease we're looking for in building up Behavior capabilities.]
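As an illustration of the type-converter alternative mentioned in that discussion, here is a small sketch that replaces the spike's if/else chain with .NET's TypeDescriptor lookup. SavedValueConverter and FromSaved are hypothetical names; enums are still handled explicitly so that case-insensitive parsing matches the spike.

```csharp
using System;
using System.ComponentModel;

Console.WriteLine(SavedValueConverter.FromSaved("2d6", typeof(string)));
Console.WriteLine(SavedValueConverter.FromSaved("7", typeof(int)));
Console.WriteLine(SavedValueConverter.FromSaved("1.2", typeof(float)));
Console.WriteLine(SavedValueConverter.FromSaved("slashing", typeof(DamageType)));

public enum DamageType
{
    Slashing,
    Piercing,
    Bludgeoning,
}

// Hypothetical helper: turns a saved string into a value of the property's type.
public static class SavedValueConverter
{
    public static object FromSaved(string saved, Type targetType)
    {
        // Keep the spike's case-insensitive enum parsing.
        if (targetType.IsEnum)
        {
            return Enum.Parse(targetType, saved, true);
        }

        // Built-in converters cover string, int, float, bool, and so on.
        TypeConverter converter = TypeDescriptor.GetConverter(targetType);
        if (!converter.CanConvertFrom(typeof(string)))
        {
            throw new NotSupportedException("Cannot populate this type: " + targetType);
        }

        // Invariant parsing keeps a saved "1.2" culture-independent.
        return converter.ConvertFromInvariantString(saved);
    }
}
```

The upside over a hand-written chain is that new simple property types work without touching the loader; the unsupported-type error is still raised for anything without a string converter.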

[Oct 2010]

It looks nice, though I'm wondering, should channel roles be stored on behaviors or in a separate table? HelpManager will need to be upgraded at some point to derive from ManagerSystem, though apart from that I can't see any other conflicts.
Thanks for getting it onto mainline!

In general I would prefer things to get stored into the relevant behaviors; the less a typical developer has to think about and work with the DB tables directly, the better IMO. Let's say a dev adds another property to the behavior to track some other type of role or whatnot. Ideally, the persistence just magically 'upgrades' to handle the new property behind the scenes, rather than having that dev (who is unlikely to be a DBA) have to go alter tables, table recreation scripts, regenerate the DAL, etc... That said, we haven't yet built the attributes/reflection-based automatic storage of behaviors derived from Behavior. So it's all mostly theory, although I did write and post about a spike some time ago where I showed that the concept should work.

[April-May 2011]

[Paraphrasing... We threw around ideas for alternative DB tech and RavenDB was coming up the most.]

What are the benefits of using RavenDB for parts? Is it perhaps more resilient at load time to the mapped object type having since added/removed properties?

RavenDb is a native .NET implementation of a document store.

As implemented, RavenDb is infinitely more flexible than a database table. The shape of the document can change on the fly, without having to change anything. The custom code that we write will ignore anything that it is not currently handling.

I haven't seen a degradation of load/write times with RavenDb.

RavenDb was the easiest one to get ramped up with, after testing several other document stores, i.e. CouchDB, MongoDB, Cassandra and a few others.

[We tried to figure out what level of objects we want in relational vs documents...]

Anyway, back on segregation of Things... I'm not sure if you're suggesting to do this, but I don't think it would make sense to store different thing types apart. Things get really hard to reason about when you have crossovers / complicated Things. Imagine for instance a MUD about space combat, where there are no 'player' avatars but instead you are basically a vehicle; some players act as Rooms, having large hangar bays which can house other players, etc... Basically a typical Thing that a player controls has PlayerBehavior, UserControlledBehavior, MovableBehavior, RoomBehavior, etc... Does that entity get stored as a room, as a player, or as both? How about an example using two thing types you suggested as being one tabular and one document: let's say an admin decides to have any container of reasonable size act as a Room with some sort of Size property; if you are small enough (i.e. a Pixie or shrunken by magic) then you can enter that container. If you are big, you can pick up said container (let's call it a backpack) and put it in your inventory (and then wear it as equipment). (Let's assume we have some reasonable game mechanic for what happens when a player logs in from within the hangar of, or container of, another player... such as booting them to the surrounding space.) What mechanisms are used to determine which table these things would go into? In general, being able to generically store any Thing and all its Behaviors in fewer but flexible tables/documents seems both the simplest AND most flexible approach to me.

Anyway I just want to be able to tell an arbitrary Thing to save/load and have it magically work. Then add some code properties I realized I need to a behavior, or even the base Thing, and have loads still work (with null/zero as default values where the values were "missing"), without having to touch anything on the DB side. As you suggested, whether the data is in tabular or document format is hopefully an implementation detail that most won't have to worry about. :)
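To make the "loads still work after adding properties" wish concrete, here is a small sketch using System.Text.Json from modern .NET as a stand-in. This is an assumption for illustration only: it is not the RavenDB client and not what was available in 2011, and ThingDocument is a hypothetical class. It just shows the general document-style behavior where unknown saved fields are skipped and missing ones keep their defaults.

```csharp
using System;
using System.Text.Json;

// Saved by "older" code: there is no Size or Owner yet, and it still has a
// Weight field that the current class no longer declares.
string savedJson = "{\"Name\":\"backpack\",\"Weight\":3}";

ThingDocument thing = JsonSerializer.Deserialize<ThingDocument>(savedJson);
Console.WriteLine($"{thing.Name} size={thing.Size} owner={thing.Owner ?? "(none)"}");

// Hypothetical document shape for illustration.
public class ThingDocument
{
    public string Name { get; set; }

    // Added after the document was saved: stays 0 when missing.
    public int Size { get; set; }

    // Added after the document was saved: stays null when missing.
    public string Owner { get; set; }
}
```

Nothing had to be migrated on the storage side for either change; whether zero/null are acceptable defaults is then a per-property decision in code.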

[Talked more and generally agreed RavenDB was our path forward, but had a brief freak-out that we thought we couldn't use RavenDB either. Then to our relief, we found and linked the evidence on RavenDB.net showing we are fine after all, that usage of RavenDB doesn't complicate nor encroach on our own software license of Ms-PL.]

I'm really close to starting work on the document storage (RavenDb). While reading up on the documentation for the storage framework, I came upon an interesting tidbit that I didn't notice before. The storage engine will encode the whole object graph into JSON. With this in mind, I went looking for how to exclude certain parts of the object graph.

The way to do this is to decorate the part with the JsonIgnore attribute. The specific case I'm thinking of is the PlayerBehavior. This behavior, and probably some others, are getting data from the database. So it doesn't need to be duplicated onto the document store.
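A tiny sketch of the exclusion idea. It uses the [JsonIgnore] attribute from System.Text.Json in modern .NET as a stand-in, which is an assumption for illustration; the attribute RavenDB honored at the time came from its own JSON serializer, and PlayerDocument is a hypothetical class.

```csharp
using System;
using System.Text.Json;
using System.Text.Json.Serialization;

var player = new PlayerDocument { Name = "Ayla", PasswordHash = "kept-in-the-relational-db" };
string json = JsonSerializer.Serialize(player);
Console.WriteLine(json);

// Hypothetical shape for illustration.
public class PlayerDocument
{
    public string Name { get; set; }

    // Already stored elsewhere, so keep it out of the serialized document.
    [JsonIgnore]
    public string PasswordHash { get; set; }
}
```

The serialized text contains only the Name property, so the ignored data is not duplicated into the document.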

I went and removed everything related to Items and MudObjects in the database. Those need to be done from scratch. I always hated how those were working. I thought they were clunky.

Whoa... I just had a scan through the RavenDB hello world tutorial - and yes, by the examples where the query is generating the Order instances, and seeing the fully-qualified class names in the @metadata sections, yeah this might "just work" and involve less hand-holding than I thought we'd have to do.

[Late May 2011]

Seems strange that PlacesManager has methods like LoadAreas, LoadRoomsForArea, etc. Shouldn't the act of loading a world Thing automatically load its SubThings (typically "areas"), and loading those area Things will load their SubThings (typically "rooms"), and so on? What about Things an admin wants to store directly in the World that aren't technically "areas"? If there were any custom behaviors on any of the areas/rooms/exits/etc, code of this sort seems like it would fail to load them too.

Yes, that's why I put the comments to the effect of "Put this in specific behavior." I put them there just to have basic code to load the world.

Alright, I got these load methods into the proper behaviors. The exits are still not loading correctly. I'm totally missing what is needed. This is now in the RoomBehavior.Load() method.

Note 2023

RavenDB has been working out as a document DB that now loads everything. The relational DB is gone: even account authentication occurs through the document DB.
