
EverWorld is a procedurally generated adventure life sim with deeply simulated artificial intelligence (so says the Steam page!).
EverWorld is actually version 3 of a game that’s taken several years (elapsed) to make. The first version was called ‘One Little World’, which was then renamed ‘EverWorld’, and the name stuck for the current version, which was a complete rebuild from scratch.
EverWorld is a game that is very ‘deep’ but not very ‘wide’. That is to say, there are a few systems with extreme depth in terms of world generation and simulation, but the actual world is pretty small at the moment – it only takes a couple of minutes to sprint from one end to the other.
Lots of EverWorld’s systems are not very visible. I want to dive into some of the systems at work in this post, in particular one of the first things players will do when they play the game for the first time – saying hello to an NPC.
For a simple example, here’s what the player sees:
1. Player says: ‘Hi’
2. NPC says: ‘Hi’
Pretty standard stuff. Here’s what happens in the background:
1. Player says ‘Hi’
1a. Player’s dialogue creates a GameEvent and is interpreted (see below)
1b. Check to see if NPC is busy or can talk. NPC could be doing anything at this point – they could be asleep, could be talking to someone else, could be in the middle of something etc.
1c. Assuming they can (and want to) talk to the player, the NPC clears their current action queue, finishing any final tasks they need to do
1d. NPC finishes their current animation in a way that looks smooth / natural
1e. Two new actions queued for NPC – turn to player, then socialise with player
1f. NPC does an immediate ‘socialise’ action with the player
1g. NPC chooses from a broad list of ‘barks’ based on several things – their needs, their mood, their personality, their opinion of the player, the current conversation topic, the last thing said etc. All possible barks are examined and scored. In this case it’s pretty simple – the player has said hello, so the most highly scoring bark is nearly always a greet response.
1h. Once the ‘bark’ is chosen, the NPC has to decide what actual words to say. This is again a scoring pass across a variety of different ways of saying whatever the ‘bark’ is, using similar inputs (personality, opinion, emotions etc.) but also the NPC’s ‘voice’ (what words they tend to use more or less), plus any context-specific ways of saying things. This provides the NPC with an ‘utterance’ – a simple example would be ‘[greet]’ or ‘[greet] there’.
1i. Now that the NPC has an utterance, there are usually some placeholders to fill. [greet] is an example – there are lots of possibilities, similarly scored using the personality, opinion, emotions and voice persona of the NPC. Examples for [greet] would be ‘hi’, ‘hello’, ‘hey’ etc. Placeholders are also used for more complex things like [subject_food] or [subject_activity], but we’ll leave that for another post.
1j. Some final checks are done on the actual dialogue the NPC has chosen, and it’s formatted properly
1k. The NPC’s dialogue act creates a ‘GameEvent’. These occur whenever the player or NPCs do anything, are recorded in the game, and can be witnessed and remembered by other NPCs. Dialogue events are also stored in a ‘Conversation’, which helps set the context and topic for a social exchange.
1l. A lip sync based on the phonetics of what the NPC has said is queued for animating the NPC’s face
1m. Any effects from whatever the NPC has said occur (this also happens whenever the player says something, including just the word ‘hi’ above). This could be changing the topic, imparting some knowledge, setting an internal reminder that this NPC has now said hello to the player at least once today etc. Any opinion changes caused by saying a certain thing are also applied – e.g. saying something hurtful to an NPC lowers their opinion of the player.
1n. The event and dialogue are displayed to the player, finally leading to…
2. NPC says: ‘Hi’
That’s a lot (and I’ve probably missed some steps)! Free text chat is a core feature of EverWorld, so some complexity is expected here. The synergies and interactions with other systems (opinions, personalities, events etc.) also add to the sequence of steps. This process is repeated every time an NPC says something to the player (plus some extras like waiting their turn to speak, not repeating themselves, moving with the topic etc.).
Interpreting what the player says is also a pretty complex topic that I’ll save for another post.
EverWorld does not use LLMs or any similar generative AI for generating responses. Experimentation with LLMs and SLMs hasn’t proved particularly fruitful – there is just no way to fully control or reduce the risk of hallucinations. We might try generative AI for other things (e.g. interpreting / simplifying player meaning) in the future. A core tenet of EverWorld is that the player and NPCs can talk about actual things in the game world – if the NPC says they like a particular kind of food, then that should be true, and the player should be able to make that food and give it to the NPC.
Instead, EverWorld uses a complex set of subsystems to complete the process above, with a lot (A LOT) of high performance string matching and searching, scoring and weighting models, and a little bit of regex dark magic. Strings aren’t computationally cheap, but they’re really the only option when it comes to a game dialogue system like EverWorld’s.
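To give a flavour of the ‘regex dark magic’, here’s a tiny sketch of the sort of precompiled pattern matching a string-heavy dialogue system leans on – compile once, reuse everywhere. These patterns are my own illustration, not EverWorld’s actual rules:

```python
import re

# Precompiling patterns and reusing them is one of the easy performance
# wins when a dialogue system does heavy string matching constantly.
GREETING = re.compile(r"^\s*(hi|hello|hey|howdy)\b", re.IGNORECASE)
FAREWELL = re.compile(r"^\s*(bye|goodbye|see you|later)\b", re.IGNORECASE)

def classify(line: str) -> str:
    """Crude dialogue-act classification via precompiled patterns."""
    if GREETING.search(line):
        return "greet"
    if FAREWELL.search(line):
        return "farewell"
    return "unknown"

print(classify("Hi!"))      # greet
print(classify("see you"))  # farewell
```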
Potential topics for next post:
– Player dialogue interpretation
– Game event propagation
– NPC knowledge
– EverWorld vision and design principles
– Previous iterations of EverWorld + lessons learned
– Early access feedback