On Data…

Previously, I talked in very general terms about what I wanted to accomplish with Interrogative 3. Specifically, I wanted to make the dialog more procedurally generated, and to be able to access data that could be updated or “live”. With the previous versions of Interrogative, the problem was that most of what an NPC had to say was written by a writer. A sentence like “Old Farmer Johnson is a jerk” was written manually, and was basically static data. If you wanted more opinions on Old Farmer Johnson, you had to type them in.

You might say: Fine, we can do that, and account for opinions, and then that data is there! But not all of the data is there. If you want to extend that to other characters, then you would either have to write all that dialog by hand, or generate it on the fly for each opinion. Procedurally generating the data isn’t that difficult, it turns out, but you have to set up the data correctly in order to do it. Interrogative 2 had a complex table-based system where each context (subject) had a database table with dialog, and each line of dialog was set up for the dialog action being asked, the Knowledge Level the dialog represented, conditions we might want to evaluate, etc. Each line of dialog was accompanied by all of that baggage, so creating the amount of data needed to represent opinions was prohibitive. If we generated it procedurally, we could have a handful of dialog lines in the system that reflected the opinion, and the markup we used for other tasks could be extended to fill in names and opinions. But the table-based structure of the system made that a little clunky. And what if we decided to ask other questions? What was needed was a data structure that we could use to store facts!

Enter the Triple

But weren’t we doing that already? Nope: we were storing lines of dialog with facts in them. Like the chocolate chips in the cookie, facts are sprinkled in the dialog to make communication useful and meaningful (who the hell wants a plain cookie?). Here are a few examples of facts:

  • Water is wet.
  • Blaster Rifle BR-33 does 82 minimum damage.
  • Master Blimblam’s max health is 3248.
  • Leaves are green.

So, facts are basically attributes. Some of these change (the color of the sky changes throughout the day), and some stay the same (if Master Blimblam is an NPC, that attribute is probably static). What we’ve done is take the information and chop it up into single entries, so that we can use them in much the same way as data is used in “triplestores”. A triplestore is a database that stores data in an Entity-Attribute-Value format, usually referred to as Subject-Predicate-Object (a “triple”), much like the facts above. This kind of data is used mainly for semantic queries; if you’re into the Semantic Web or Big Data, you’ll already know about this (and probably more than I do). Without going into a class on those subjects, triples are a great starting point for storing base data.
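To make the idea concrete, here is a minimal sketch of the example facts stored as triples, with a tiny lookup function. This is not Interrogative's actual code: the `Triple` type, its field names, the predicate strings, and `query` are all assumptions made up for illustration.

```python
from collections import namedtuple

# Hypothetical layout: subject (entity), predicate (attribute), obj (value).
Triple = namedtuple("Triple", ["subject", "predicate", "obj"])

# The example facts from the list above, chopped into single entries.
facts = [
    Triple("Water", "is", "wet"),
    Triple("BR-33", "minimum damage", 82),
    Triple("Master Blimblam", "max health", 3248),
    Triple("Leaves", "color", "green"),
]

def query(facts, subject=None, predicate=None):
    """Return every triple that matches the given subject and/or predicate."""
    return [
        t for t in facts
        if (subject is None or t.subject == subject)
        and (predicate is None or t.predicate == predicate)
    ]

# Everything we know about the BR-33:
for t in query(facts, subject="BR-33"):
    print(t.subject, t.predicate, t.obj)
```

A real triplestore would index these fields rather than scan a list, but the shape of the data is the same.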

However, we have to move beyond just a triple in order to store our data, because there are a few shortcomings to using that format in its vanilla form. We still need to represent Knowledge Levels, which represent depth of knowledge of a subject and which, especially at lower levels, may give NPCs access to incomplete or incorrect information. Also, I wanted to chain some of the facts together, and a field holding ambiguous data seemed hard to deal with, so I added a field that types the value in cases where it just points to another fact. This is especially relevant since subject names are kept in a separate list for looking up all data that pertains to them.
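One way to picture the extended record is sketched below. The field names, the `ValueKind` enum, the integer Knowledge Levels, and the "60" figure for the low-level fact are all hypothetical; the real column layout isn't spelled out here.

```python
from dataclasses import dataclass
from enum import Enum

class ValueKind(Enum):
    LITERAL = "literal"          # the value is plain data (number, text)
    SUBJECT_REF = "subject_ref"  # the value names another subject

@dataclass(frozen=True)
class Fact:
    subject: str
    predicate: str
    value: object
    value_kind: ValueKind = ValueKind.LITERAL
    knowledge_level: int = 1     # assumed: higher means deeper knowledge

facts = [
    Fact("BR-33", "is a", "Rifle", ValueKind.SUBJECT_REF, knowledge_level=1),
    # Hypothetical low-level fact: vague and wrong, but it is what a novice "knows".
    Fact("BR-33", "minimum damage", 60, knowledge_level=1),
    Fact("BR-33", "minimum damage", 82, knowledge_level=3),
]

def best_known(facts, subject, predicate, npc_level):
    """Return the deepest fact the NPC has access to, or None."""
    candidates = [
        f for f in facts
        if f.subject == subject
        and f.predicate == predicate
        and f.knowledge_level <= npc_level
    ]
    return max(candidates, key=lambda f: f.knowledge_level, default=None)
```

With this layout, an NPC at level 1 would report the incorrect value, while an NPC at level 3 would report 82, which is one simple way to model incomplete or incorrect knowledge at lower levels.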

That last part is pretty important for compound facts. Consider the subjects listed:

  • Rifle
  • BR-33
  • Weapon

And the following data stored as triples (the hyphens are separating the data fields):

  • Rifle -is a- Weapon.
  • BR-33 -is a- Rifle.

You could simply point the “Rifle” value of the BR-33 fact at the “Rifle” entity above it, and then be able to say that the BR-33 is a weapon. And since the BR-33 is a weapon, you could query the subject “Weapon” and get facts about that. If the NPC has the Knowledge Levels available to them, then they could tell you a good deal about these subjects beyond what would be simply written as dialog. It may still seem a bit confusing, since the facts above are still all just words, even with the list of subjects above them, but let’s introduce a new list of Predicates. What is a Predicate? According to Wikipedia: “the purpose of the predicate is to complete an idea about the subject, such as what it does or what it is like”. That’s a broader way of saying “attribute”. And that’s what we’re using it for in this list:

  • Is a

That’s it. A list of entities (subjects), a list of attributes (predicates), and a list of triples that describe them with values, along with tagging and some markup, and we have flexible data for the foundation of Interrogative 3.0. There are, of course, other sets of data involved in the system that are far more conventional, but we’ll get to those later when we talk about tools and other features. This is just a primer on data formats that are different from the usual tables of data used for games.
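The chaining described above (BR-33 is a Rifle, Rifle is a Weapon, therefore BR-33 is a Weapon) can be sketched as a walk over the “is a” triples. The tuple layout and the function names `ancestors` and `is_a` are assumptions for illustration, not the system's real API.

```python
# (subject, predicate, value) tuples, taken from the example above.
TRIPLES = [
    ("Rifle", "is a", "Weapon"),
    ("BR-33", "is a", "Rifle"),
]

def ancestors(triples, subject, predicate="is a"):
    """Follow `predicate` links from `subject`, returning every subject reached."""
    found = []
    seen = {subject}
    stack = [subject]
    while stack:
        current = stack.pop()
        for s, p, value in triples:
            # Only follow matching links, and skip anything already visited
            # so that a cycle in the data cannot loop forever.
            if s == current and p == predicate and value not in seen:
                seen.add(value)
                found.append(value)
                stack.append(value)
    return found

def is_a(triples, subject, target):
    """True if `target` can be reached from `subject` through 'is a' links."""
    return target in ancestors(triples, subject)

print(ancestors(TRIPLES, "BR-33"))
print(is_a(TRIPLES, "BR-33", "Weapon"))
```

Once you have the ancestor list, gathering everything an NPC could say about the BR-33 is a matter of querying each of those subjects in turn.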

So, one more thing: Predicate and attribute, entity and subject, and value and object are interchangeable here as naming conventions go. That minimum damage for the BR-33 is obviously an attribute. So is the color of the sky. But with dialog, it’s more fitting to say predicate, and the distinction between the two in usage is very gray. And where dialog is concerned, you can still store dialog as facts:

  • Bird Flu -what- Virus.
  • Bird Flu -how- “through being sneezed on by a bird”.
  • Bird Flu -when- <collection of dates of outbreak>.

Above is a list of facts where the predicates are our dear old interrogative actions (it’s not accurate to say that the interrogative actions are attributes). Ask a question, and the NPC can easily look up the data. And aside from the second entry, which is canned dialog in the value of the attribute, the values are very easy to modify on the fly. And now we have a flexible data format to build upon for Interrogative 3.0. Keep in mind that this is a general view of the data, and before I’m done, the format may change, losing or gaining columns in the database, but integrating data lessons from another industry and combining them with my needs resulted in a more compact set of data than Interrogative 2.0 had, while making the data much more accessible and opening up possibilities for more communication features.
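As a rough sketch of that lookup, the interrogative action can serve as part of the key, with a small sentence template per action. The templates, the fallback line, and the `answer` function are invented for illustration; the “when” entry is left out because its value is only a placeholder above.

```python
# Facts keyed by (subject, interrogative action).
FACTS = {
    ("Bird Flu", "what"): "Virus",
    ("Bird Flu", "how"): "through being sneezed on by a bird",
}

# Hypothetical templates: one sentence pattern per interrogative action.
TEMPLATES = {
    "what": "{subject} is a {value}.",
    "how": "{subject} spreads {value}.",
}

def answer(subject, action):
    """Look up the fact for this question and wrap it in a sentence."""
    value = FACTS.get((subject, action))
    template = TEMPLATES.get(action)
    if value is None or template is None:
        return "I don't know."
    return template.format(subject=subject, value=value)

print(answer("Bird Flu", "what"))
print(answer("Bird Flu", "how"))
```

Because the value is separate from the sentence, changing “Virus” to something else at runtime changes what the NPC says without touching any written dialog.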

However, this format has also brought up some interesting challenges. If you’re sitting there thinking “but what if I don’t want to store my game data like that?”, then you’ve already figured out the biggest of those challenges. Addressing that is a blog for another time though.

Additional Reading…

If you want to read some more about triples, triplestores, and how you might bastardize the concepts to your own ends, Wikipedia has an article or two, and the triplestore article is a good place to start (http://en.wikipedia.org/wiki/Triplestore).

Next blog: That other foundation of Interrogative 3.0: Personality Traits, and how to misuse them…