Like a drunk who wakes up in the alleyway after a bender, I’m sitting here typing about some recent months of progress in natural language processing, especially where it concerns using a dictionary to help parse the language. It’s all still related to the Interrogative AI I’ve been working on, of course, and for conversational AI in general.
Basically, the problem I was facing was how to interpret what a player says to a character. At first, I resorted to simple actions that the NPC would filter through their personality traits to respond to, and that was okay. Then, I went a bit further, since I was implementing a text interface for a demo, and included a list of words with simple sentiment markup (-1 for negative, 0 for neutral, 1 for positive) that I’d look for in statements.
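That early pass can be sketched in a few lines. This is a hypothetical minimal version, assuming a hand-maintained word list; the words and scores here are made up for illustration, not my actual data:

```python
# Hypothetical word list with -1/0/1 sentiment markup, as described above.
SENTIMENT = {
    "great": 1, "nice": 1,      # positive
    "fine": 0, "okay": 0,       # neutral
    "idiot": -1, "awful": -1,   # negative
}

def statement_sentiment(text):
    """Sum the sentiment markers (-1/0/1) found in a statement."""
    words = text.lower().strip("!.?").split()
    return sum(SENTIMENT.get(w, 0) for w in words)
```

The obvious weakness, which I ran into immediately, is that any word not in the list silently counts as neutral.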
That was less okay, as I quickly found that unrestricted text input makes for some statements that are open to interpretation. Then, I decided to look back to how I’m using attributes in the AI for inspiration. Maybe expanding the attributes to be a lot more flexible was an idea that would pay off better than I originally thought. Read on, my friends…
What’s a word?
One of the guiding principles of Interrogative was that words were actions: very complex actions, but actions nonetheless. I call the AI system “Interrogative” because I started out using interrogative words as actions against information as a game mechanic, and as a method to get NPCs to answer questions and address the player and the world.
But when information is posed instead of queried, it’s a bit different. The NPC has no database of attributes and values to look up and respond with. The system now needs to parse the text input of the player, figure out what’s being said, and then feed that to the NPC in a way it can respond to. Because the system is based on dynamic information and a designer-specified domain, I’m avoiding canned statements. They can be done, but we’re not here to talk about easy things. So now we need to parse: enter Natural Language Processing and Understanding.
Without going into specifics, the majority of NLP algorithms deal in statistics, extracting high-level information such as the subject of a sentence or its sentiment. What Interrogative needs is something a bit more robust: models. A model is a set of data that describes an object in the world. The object can be a physical object or a concept, but the data operates the same way.
The player types something like “NPC is an idiot” and that builds out a simple model of data stating that the NPC has very low intelligence. The NPC can then compare that to its model of self. And if the presented model is negative, positive, or some other measurement different from what it sees of itself, it can act accordingly. If it’s an Orc, it will probably whip out its axe and show you how smart it is at hacking you to death, you condescending jerk!
So that’s the process, but now the issue at hand is: How does NLP tell you what “idiot” means? Simple answer: It doesn’t. You can maintain a list of words for sentiment analysis, which will solve that problem, but then you’ll run into another problem: What if the player says “NPC is a fool”? That’s not the same as “idiot”, though it is also usually negative in context. Beyond sentiment, which changes in context, what the hell do these words mean?
One way to answer this, and the way I’m using for my techniques, is the same method I used when I put together the personality traits for my NPCs: spectrums of measurement along which the words are placed, describing a relative amount. “Smart”, “brilliant”, “stupid”, “unintelligent”: these all belong along a vector between two words that describe the extremes of relativeIntelligence. “Bold”, “adventurous”, “timid”: these can all be placed along another vector with words describing the extremes of relativeConfidence. You can evenly space those words (implicitly, if they’re assigned no values), or you can assign values to them. Either way, now I had vectors to fill with words.
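The spectrum idea can be sketched like so. The vector names relativeIntelligence and relativeConfidence are the ones above, but the word placements are illustrative, and evenly_spaced just shows the implicit spacing you get when no values are assigned:

```python
def evenly_spaced(words):
    """Implicitly space words from -1.0 to +1.0 when no values are given.

    Assumes the list is ordered from one extreme to the other and has at
    least two words.
    """
    n = len(words)
    return {w: -1.0 + 2.0 * i / (n - 1) for i, w in enumerate(words)}

# Illustrative placements, not the real dictionary data.
relativeIntelligence = evenly_spaced(
    ["stupid", "unintelligent", "smart", "brilliant"])
relativeConfidence = evenly_spaced(["timid", "bold", "adventurous"])
```

Assigning explicit values instead is just a matter of writing the dict by hand; the implicit spacing is the lazy default.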
Creating the data
So, with the idea that I could parse the text and “know” what a word was, I set out to do something that was…fucking tedious. I created a dictionary of words mapped to arbitrary vectors. As of this writing, there are 4,581 entries in this dictionary and growing. I divided the words into several lists of the obvious categories, not all of which are mapped to vectors: adjectives, adverbs, verbs, nouns, pronouns, prepositions, etc. Words that describe (adjectives) were the first to get mapped, and the easiest!
For many of those, especially the concrete mappings from words to data such as colors and number words, I gave exact values. In doing that, a program now definitively knows that “red” is #FF0000 or RGB(255,0,0) (whatever format you choose). If a player types “<car> is red”, the NPC can understand what the player is saying and know (if it actually knows the color of the <car>) whether that statement is correct. Helper functions will also allow the NPC to change that data, for a variety of reasons.
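A hedged sketch of the concrete-mapping case. The COLORS table and the world_knowledge store are hypothetical helpers, just to show the claim-checking logic:

```python
# Concrete mappings: a color word resolves to an exact RGB value.
COLORS = {"red": (255, 0, 0), "green": (0, 255, 0), "blue": (0, 0, 255)}

# Hypothetical store of what the NPC actually knows about the world.
world_knowledge = {("car", "color"): (255, 0, 0)}  # the NPC knows the car is red

def claim_is_correct(obj, attribute, word):
    """Check a player's claim ("<car> is red") against known data."""
    known = world_knowledge.get((obj, attribute))
    return known is not None and COLORS.get(word) == known
```

If the NPC doesn’t know the car’s color at all, the lookup fails and the claim can neither be confirmed nor denied, which is its own interesting branch to handle.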
For the adjectives that are more subjective, placing them on a vector with a value allows the NPC to “know” that “stupid” and “moron” are not quite the same. If the NPC views itself as average intelligence and is told it’s less than that, the logic required to react becomes that much easier to implement.
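A rough sketch of that comparison logic. The word values and the 0.25 threshold are made up for illustration; the real dictionary assigns its own:

```python
# Illustrative positions on the relativeIntelligence spectrum.
WORD_VALUES = {"stupid": -1.0, "moron": -0.8, "smart": 0.5, "brilliant": 1.0}

def react(self_value, word):
    """Compare a presented word's value against the NPC's self-image."""
    presented = WORD_VALUES[word]
    if presented < self_value - 0.25:   # told they're less than they think
        return "insulted"
    if presented > self_value + 0.25:   # told they're more than they think
        return "flattered"
    return "neutral"
```

An Orc with self_value 0.0 hearing “stupid” lands in the insulted branch, and the axe comes out.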
I am now trapped in dictionary hell…
Of course, this all sounds great! But it took days and days of going to dictionary.com and looking up words, trying to figure out which vector was best for an adjective, and then typing all of that in. I eventually wrote a tool to help with it, but it was still maddening. At this point, I’m not aware of any resources that provide this kind of information. I asked on Twitter, but no one responded, which means either no one knew, or everyone thought I was stupid for even asking.
Semantic Web ontologies such as RDF and OWL do provide some information, especially where it relates to relationships or categories of certain concepts, but I wasn’t able to find quite the dictionary I was looking for. There’s also Gellish, a formalized English, but that costs money and comes with licensing, so I have no idea whether it provides this kind of information (unless I’m mistaken, I think it provides something similar to, but more detailed than, the Semantic Web ontologies).
Action words act on vectors
I ran into some hiccups trying to figure out what to do with verbs: they’re not descriptive words, except in that they describe an action, which isn’t itself something atomic. But verbs don’t have to stay a high-level concept. They are actions that operate on the present value of those vectors whose values are represented by adjectives and other words. “Amuse” adjusts the calmExcited vector in the positive direction. “Alert” adjusts the relativeVigilance vector. And so on.
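A minimal sketch of verbs as operators on vector state. The vector names calmExcited and relativeVigilance are from above; the step sizes and the “soothe” entry are invented for illustration:

```python
# Each verb maps to (vector name, delta). Deltas are illustrative.
VERB_EFFECTS = {
    "amuse": ("calmExcited", +0.2),
    "alert": ("relativeVigilance", +0.3),
    "soothe": ("calmExcited", -0.2),  # hypothetical opposite of "amuse"
}

def apply_verb(state, verb):
    """Nudge the NPC's vector state along the verb's spectrum."""
    vector, delta = VERB_EFFECTS[verb]
    # Clamp to the [-1, 1] range of the spectrum.
    state[vector] = max(-1.0, min(1.0, state.get(vector, 0.0) + delta))
    return state
```

Applying “amuse” twice and “soothe” once leaves calmExcited slightly positive, which is the kind of running state an NPC can consult when deciding how to respond.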
This is a limited representation of a verb, of course; verbs such as “walk” are physical actions that don’t fit neatly into adjusting a value along a vector, and instead adjust a more complex state machine or graph representation of the data involved.
It’s all a graph
And that last part about graphs is where things started to come back around to the model-building portion of the text parsing: As humans learn, they build out extremely complex graphs using neurons and synapses inside the brain, and then iterate over that graph via the spiking behaviors of neurons, and the limited calculations further provided by the branching synapses.
At a high level, neurons, or groups of neurons, map to symbols such as words or concepts, which are closely related to each other on the physical surface of the brain. We don’t need to simulate all of that, but we can build a graph of knowledge that can be queried. It can also be iterated over, if you’re using some of those words to describe states representing values for the NPC. The brain even creates graph-based maps of its own body as well as external areas (something AI researchers are beginning to experiment with). Graphs are useful!
I’ve begun representing the text as a graph, though in its current form, it does present some serious challenges where knowledge representation is concerned. Some of that is really just in how I’m parsing or presenting the data. Some of it will entail some new ideas on how to relate the information (such as how to represent episodic or sequential information). But the early progress, especially since I’m doing this with an hour here and there in my free time, is promising.
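A toy version of the queryable knowledge graph, storing parsed statements as (subject, relation, object) triples. This is a sketch of the idea, not my actual implementation:

```python
class KnowledgeGraph:
    """Store parsed statements as triples and answer pattern queries."""

    def __init__(self):
        self.triples = set()

    def add(self, subj, rel, obj):
        self.triples.add((subj, rel, obj))

    def query(self, subj=None, rel=None, obj=None):
        """Return all triples matching the pattern; None matches anything."""
        return [t for t in self.triples
                if (subj is None or t[0] == subj)
                and (rel is None or t[1] == rel)
                and (obj is None or t[2] == obj)]
```

The hard parts called out above, like episodic or sequential information, don’t fit this flat triple shape, which is exactly where the open challenges are.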
This is just the beginning…
There’s a lot to be done with this dictionary/parsing technique that I intend to tackle, not just for games, but in general:
- Rewrite the dictionary to allow for multiple “definitions” (vectors, values, etc).
- Account for words that are verbs, nouns, or adjectives, depending on the context (partially addressed by the dictionary rewrite).
- Use parsing to figure out which vector is best for the context of the word.
- Persist the graphs from parsed text and then query and compare it through the Interrogative chat interface.
- Figure out how to parse text to define new words.
- Techniques for building out models and iterating over the states like an FSM or Behavior Tree.
- Predictive iterations based on available information in the model (that tiger has claws, and claws hurt; it can hurt me).
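The predictive-iteration bullet can be sketched against the tiger example. The facts and the single chaining rule here are illustrative assumptions, not a general inference engine:

```python
# Facts as (subject, relation, object) triples, per the tiger example.
facts = {("tiger", "has", "claws"), ("claws", "cause", "pain")}

def can_hurt(entity):
    """Predict danger: if entity has a part that causes pain, it can hurt me."""
    for (s, r, o) in facts:
        if s == entity and r == "has" and (o, "cause", "pain") in facts:
            return True
    return False
```

Chaining two facts into a prediction is trivial here; doing it over a large model, without exploding combinatorially, is the actual work on the list.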
From there, it’s a pretty robust AI that seems way too heavy for use in games- especially multiplayer. However, the parsing of input text can be done client-side, as what matters is that the information that the player is typing is accurately conveyed to the NPC for processing.
It’s also a pretty robust AI for building a “chatbot” on, more on the level of Viv than a lot of the shitty app chatbots that are really just text fields and menus blurted out in chat format to give the appearance of a conversation for the sake of a trend.
Next Blog: More about the dictionary, rewrite progress (I’ve hardly started 🙁 ), and maybe some commentary on other AI stuff. I have opinions!