Learning Systems

A Tabletop Network Exploration

Dec 10, 2023

The fourth Tabletop Network conference was held in November, and it was once again an energizing and thought-provoking experience for me.

The centerpiece of the conference is the Big Idea sessions. Prior to the conference people propose deep topics, and attendees choose which they want to participate in. Each group goes off and has five to six hours to dive into their topic, and then they present their ideas to the entire conference that evening.

The topic I participated in grew out of a talk I gave last summer about the ways that a ‘cardboard AI’ has been implemented in board games. Specifically, I gave a survey of the non-computer systems used either to emulate an opponent or to create an environment that is challenging for the player to navigate.

Here’s the talk if you would like to check it out:

A question has been nagging at me since then. Is it possible to design a system that learns from what the players do, and reacts to it? The main attempt to do that which I am aware of is MENACE, which learned to play tic-tac-toe by placing beads in matchboxes. That’s what is shown in the thumbnail for the video above.

As MENACE wins and loses, you add beads to, or take beads away from, the boxes that led to that result. Over time it will stop making losing moves, and just make ones that lead to wins (or draws, since it’s tic-tac-toe).
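
To make the bead bookkeeping concrete, here is a rough Python sketch. It is not a reproduction of the original machine: the `Matchbox` class, the starting bead count, the reward sizes, and the "empty box means resign" rule are all assumptions chosen for illustration.

```python
import random
from collections import Counter


class Matchbox:
    """One matchbox: the beads available for a single board position."""

    def __init__(self, moves, beads_per_move=4):
        # Assumed starting stock: the same number of beads for every legal move.
        self.beads = Counter({move: beads_per_move for move in moves})

    def pick(self, rng=random):
        """Draw a bead at random; moves with more beads are picked more often."""
        moves = [m for m, n in self.beads.items() if n > 0]
        if not moves:
            return None  # Assumption: an empty box means the machine resigns.
        weights = [self.beads[m] for m in moves]
        return rng.choices(moves, weights=weights, k=1)[0]

    def reinforce(self, move, delta):
        """Add (delta > 0) or take away (delta < 0) beads for a move."""
        self.beads[move] = max(0, self.beads[move] + delta)


# Assumed reward sizes, chosen for illustration only.
REWARD = {"win": 3, "draw": 1, "loss": -1}


def learn(history, result):
    """Update every (box, move) pair that was used during the finished game."""
    for box, move in history:
        box.reinforce(move, REWARD[result])
```

A full player would need one box per board position. The point here is only that "learning" is nothing more than changing how many beads of each kind sit in a box.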

Could a system similar to this be implemented in a commercial game? Or perhaps a different system completely that leads to similar results?

That was the question our group at Tabletop Network set out to tackle. Our eight-person team was Sofía Alatorre, Brian Beal, Mike Carambat, Todd Fast, Bernardo González, Matt Leacock, Manuel Hilario Lopez, and myself.

First, what is the definition of what we envision? Here is what we came up with:

A system that reacts to player actions and game state through a feedback loop in order to change future behavior.

For example, let’s say you’re playing a dungeon crawler, and your party is focused on using magic. Wouldn’t it be neat if the monsters and traps shifted to becoming more anti-magic? This would force the player to change their tactics and be more flexible, and may thwart a ‘perfect plan’.

The game would act in a classic decision loop, ‘observing’ what the players do, and changing behavior based on that.

Benefits and Challenges

Why would we want a system like this? We identified several benefits:

Adjust skill level to match player skill
- Flow State
- Balance
Create a variable challenge
Encourage players to try alternate strategies
Believable consequences / “living world”
Market differentiation

Of course, since this is a board game, there are key challenges that any system needs to overcome:

How does the system evaluate the game state?
How much mechanical and rules overhead is added?
What additional components are required, and how does that impact product cost?

In essence, how do you accomplish this goal without adding complexity? (As a caveat, we ruled out using any app assistance.)

A Toy System

When approaching a problem like this, it can be helpful to construct a simple model that demonstrates what you would like to achieve. This can help clarify features and reveal problems. Here’s one that we came up with:

Borg Adaptation

You are battling against the Borg from Star Trek. One of their defining characteristics is their ability to adapt, particularly to become immune to phasers by changing their shield frequencies. Here’s a simple mechanism to emulate that in a board game:

The Borg shields are represented by a deck of 40 cards with four copies of digits 0-9. The Federation player has their phasers set to a two-digit frequency - 72 for example.

When you shoot at the Borg, you flip up two cards from their deck. If you turn up a ‘7’ and a ‘2’ your shot does no damage. If you turn up one digit, but not the other, the shot does 50% damage. Maybe it does 25% damage if you turn up two 2’s or two 7’s. If no 7 or 2 is turned up, your shot gets through at full power.

After the shot, any cards drawn that failed to block (a 6, for example) are removed from the deck. If a card blocks, you add another copy of it and shuffle both back in.

Over time, the Borg deck will become better and better at blocking your shots.
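
The shield deck is simple enough to simulate, which is a handy way to check how quickly it tunes itself. The Python sketch below follows the rules above, including the tentative 25% rule. The function names are made up for this example, and it assumes the phaser frequency uses two different digits.

```python
import random


def new_shield_deck(copies=4):
    """Four copies of each digit 0-9, shuffled."""
    deck = [digit for digit in range(10) for _ in range(copies)]
    random.shuffle(deck)
    return deck


def damage_fraction(flipped, frequency):
    """Fraction of the shot's damage that gets through the shields."""
    a, b = frequency  # assumed to be two different digits
    hits = [card for card in flipped if card in frequency]
    if a in flipped and b in flipped:
        return 0.0   # both digits matched: fully blocked
    if len(hits) == 2:
        return 0.25  # two copies of the same digit (the "maybe" rule)
    if len(hits) == 1:
        return 0.5   # only one digit matched
    return 1.0       # nothing matched: full power


def fire(deck, frequency):
    """Resolve one shot and let the deck adapt. Mutates `deck` in place."""
    flipped = [deck.pop(), deck.pop()]
    fraction = damage_fraction(flipped, frequency)
    for card in flipped:
        if card in frequency:
            deck.extend([card, card])  # blocker returns with an extra copy
        # Cards that failed to block are left out of the deck.
    random.shuffle(deck)
    return fraction
```

Calling `fire(new_deck, (7, 2))` in a loop and averaging the results over a batch of shots gives a feel for how many turns the deck needs before the players notice it tightening.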

To make this more interesting we can give players the ability to change the frequency they are using. But perhaps this is not easy - it takes a lot of time or resources. So players need to balance trying to squeeze a shot through a decently tuned ‘shield deck’ versus the effort required to switch. And maybe they can only change one number at a time.

This toy model shows us a few things:

We need to decide how responsive we want the system to be. With a forty-card deck it may take too long for the deck to self-tune. We may want to start with a smaller deck - perhaps even just 10 cards. Do you have enough turns for a meaningful feedback effect within a game?

The feedback mechanism can also be tuned. If a card is wrong, we could remove it (fast tuning), or we could replace it with a random 0-9 card. This would slow down tuning, but also make things more uncertain for the players.

We need a way for the system to respond to player changes and not get stuck. Let’s say our deck gets down to just having 7’s and 2’s, and the players switch the frequency to 46. If we just take out wrong cards we can never recover. We need a way to either bring 4’s or 6’s back into the deck - perhaps using the ‘random replacement’ rather than ‘removal’ idea from above. Or maybe if the deck fails to block any damage three or four shots in a row you completely reset.

Point being there always needs to be a way for the system to react to changes, either from the player or environment. You need to avoid your system being ‘over-tuned’.
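
Both escape hatches can be sketched in a few lines of Python. This is one possible reading of the ideas above, not a tested design: `adapt` swaps each wrong card for a random digit instead of removing it, and the hypothetical `ResetTracker` signals a full reset after a run of unblocked shots (three by default, an arbitrary choice).

```python
import random


def adapt(deck, flipped, frequency, rng=random):
    """Return cards to the deck using 'random replacement' instead of removal."""
    for card in flipped:
        if card in frequency:
            deck.extend([card, card])       # blocker comes back with a copy
        else:
            deck.append(rng.randrange(10))  # wrong card becomes a random 0-9
    rng.shuffle(deck)


class ResetTracker:
    """Counts consecutive unblocked shots and signals when to reset the deck."""

    def __init__(self, limit=3):
        self.limit = limit
        self.misses = 0

    def record(self, damage_fraction):
        """Return True when the deck should be rebuilt from scratch."""
        if damage_fraction == 1.0:
            self.misses += 1
        else:
            self.misses = 0
        if self.misses >= self.limit:
            self.misses = 0
            return True
        return False
```

With random replacement, a deck that has drifted to all 7’s and 2’s can still pick up 4’s and 6’s again after the players retune, just more slowly than a hard reset would allow.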

Is the system responding gradually, or in a big jump? The system we described gradually improves. It is also possible to make the deck jump in ‘quality’ in bigger chunks. One way to do this is to put the cards into a discard pile rather than back into the deck. So if you match a ‘7’ and add an extra one to the deck, you put it into a discard pile first. Then when you exhaust the main draw pile you shuffle and start again with your discards. So any tuning won’t kick in until the reshuffle. This creates a bigger ‘leap’ event for the players to deal with rather than a gradual tightening.
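
Here is a small Python sketch of the discard-pile variant, under the assumption that cards which fail to block still leave the game entirely. The `ShieldDeck` class and its method names are invented for this example.

```python
import random


class ShieldDeck:
    """Draw pile plus discard pile; tuning only takes effect on a reshuffle."""

    def __init__(self, cards, rng=random):
        self.rng = rng
        self.draw_pile = list(cards)
        self.discard = []
        self.rng.shuffle(self.draw_pile)

    def flip(self):
        """Flip the top card, reshuffling the discards if the pile is empty."""
        if not self.draw_pile:
            self.draw_pile, self.discard = self.discard, []
            self.rng.shuffle(self.draw_pile)
        # Raises IndexError if both piles are empty; a real game needs a rule here.
        return self.draw_pile.pop()

    def resolve(self, card, blocked):
        """Blockers go to the discards with an extra copy; misses leave the game."""
        if blocked:
            self.discard.extend([card, card])
```

The only difference from the gradual version is where the improved cards wait. Nothing changes for the players until the draw pile runs out, and then the whole deck gets better at once.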

Other Implementation Ideas

During our conversation we came up with some other possible implementation ideas - not very fleshed out of course, but perhaps they will give you some ideas of your own.

Adaptive Automa Decks

The Automa system developed by Morten Monrad Pedersen is a popular way of implementing a solo game, and focuses on how an opponent’s actions impact your decisions, rather than the details of what the opponent is doing.

It is often implemented with a deck of cards that define the ‘actions’ and effects the Automa has. Wingspan is an example.

What if the cards in an Automa deck were modified based on what the player was doing? If the player was earning a lot of points by doing Action X, then certain cards would be added to the deck to help counter that strategy.
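
One way to picture this, as a Python sketch only: keep a running tally of points per player action, and add a counter card each time the tally crosses a threshold. Everything named here (`AdaptiveAutoma`, the threshold of 10, the card names) is hypothetical and is not taken from any published Automa.

```python
import random
from collections import Counter


class AdaptiveAutoma:
    """Automa deck that gains counter cards when the player leans on one action."""

    def __init__(self, base_cards, counter_cards, threshold=10):
        self.deck = list(base_cards)
        self.counter_cards = dict(counter_cards)  # action -> card that counters it
        self.threshold = threshold
        self.points = Counter()

    def record_player_score(self, action, points):
        """Tally points per action; add a counter card per threshold crossed."""
        self.points[action] += points
        while self.points[action] >= self.threshold and action in self.counter_cards:
            self.points[action] -= self.threshold
            self.deck.append(self.counter_cards[action])

    def draw(self, rng=random):
        """Pick the Automa's next card (with replacement, to keep the sketch short)."""
        return rng.choice(self.deck)
```

On the table, the tally could be a marker on a short track per action, which keeps the bookkeeping to one small move per scoring event.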

Restricting / Scaling Victory Conditions

This mainly applies between games, but a system could modify the victory conditions for future games based on what the player does. If they win in a certain manner, that option for victory can be eliminated for the next game, or it could be made more difficult to achieve.

Oath does that to a certain extent, but that is more to create new conditions for the players rather than adapt the difficulty.

Path-Dependent NPC Behavior

Say you are playing a detective game where you need to interview witnesses. It might be interesting if the game kept track of actions you took, and that impacts how those interviews go.

Systems similar to this have been implemented, but typically with an ‘event-based’ method. In these you track whether event X happened, and if so take a token or mark a log. Then future events may branch depending on whether you have a specific token or log item marked.

Examples of this are T.I.M.E. Stories and Cantaloop.

However, this idea is a different take. Rather than hit specific triggers, you would accumulate certain characteristics based on your actions. Perhaps you earn ‘aggression’ or ‘friendship’ points, which would impact later actions. The items you collect in Cantaloop act as gates, restricting access to later parts of the story until you accomplish certain goals. The characteristic-based idea would still give options to complete the game, but you’d need to alter your approach.
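
The contrast with event flags can be shown in a short Python sketch. The trait names come from the example above. The `Reputation` class, the tone labels, and the two-point margin are assumptions made up for illustration.

```python
from collections import Counter


class Reputation:
    """Accumulated characteristics rather than one-off event flags."""

    def __init__(self):
        self.traits = Counter()

    def record(self, trait, amount=1):
        """Earn points in a trait as a side effect of an action."""
        self.traits[trait] += amount

    def interview_tone(self, margin=2):
        """Choose how a witness responds, based on the dominant trait."""
        aggression = self.traits["aggression"]
        friendship = self.traits["friendship"]
        if aggression >= friendship + margin:
            return "guarded"
        if friendship >= aggression + margin:
            return "open"
        return "neutral"
```

No tone locks the player out. Each one just changes which approach to the interview is likely to work, which is the difference from a gating token.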

Bag Building Orders

In this idea, a bag contains cubes of various colors. The ‘AI’ actions are decided on by pulling one or more cubes, and having the colors trigger certain actions. Perhaps red are attack, green are defensive, and blue are movement. Player actions and AI success/failure can change the composition of the bag, and organically change the behavior of the AI.
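
A rough Python sketch of the bag, using the color-to-action mapping from the example above. The `OrderBag` class and the feedback rule (add a cube after a success, remove one after a failure, never remove the last cube of a color) are assumptions, one of many ways the bag could be tuned.

```python
import random
from collections import Counter

ACTIONS = {"red": "attack", "green": "defend", "blue": "move"}


class OrderBag:
    """Bag of colored cubes whose mix drives the AI's behavior."""

    def __init__(self, cubes, rng=random):
        self.cubes = Counter(cubes)
        self.rng = rng

    def pull(self):
        """Pull one cube (then return it to the bag); gives (color, action)."""
        colors = list(self.cubes)
        weights = [self.cubes[c] for c in colors]
        color = self.rng.choices(colors, weights=weights, k=1)[0]
        return color, ACTIONS[color]

    def feedback(self, color, succeeded):
        """Add a cube after a success; remove one (never the last) after a failure."""
        if succeeded:
            self.cubes[color] += 1
        elif self.cubes[color] > 1:
            self.cubes[color] -= 1
```

Keeping at least one cube of every color is the same ‘don’t get stuck’ safeguard discussed for the shield deck: the AI can always rediscover an action it had abandoned.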

Questions and Parameters

Developing a system like this requires the designer to think about where they want to be on several scales. Here are some we came up with:

Predictability: How much insight do the players have into the range of possible actions by the AI?

Oppositional: Is the AI strictly against the player? Or is the learning system sometimes making things easier for the player?

Iteration Speed: Are changes happening rapidly (e.g. within a game or even single turn)? Or do they happen only a few times during a game, or perhaps only in between games?

Path Dependency: Are the changes to what the player is facing only dependent on the current state of the game? Or will it vary based on the path the players took to get to that state?

Agent or Environment: Does the system impact an agent opposing the players? Or does it impact the overall game environment?

Wrapping Up

Hopefully this gave you some food for thought. We don’t think there’s any universal approach that will produce a reactive system. And perhaps in the end any system powerful enough to present meaningful change to the player will prove too complex or component-intensive to be justified.

But I think there’s something here. There are many examples of games poking around the edges of this idea, many of which I feature in my lecture that opened this post. And perhaps just a few more steps can make this idea, and our game worlds, really come alive.

If this type of deep thinking appeals to you, please consider attending Tabletop Network next year!

Do you think creating adaptive game systems is a worthwhile goal? What ideas do you have to achieve them?

The Universe Explained with a Cookie

A reminder that my new book The Universe Explained with a Cookie is releasing at the end of April, but is available for pre-order now! Ask for it in your bookstore or web site of choice!

Matt Montgomery

Especially thinking about path-dependent interactions, I’m reminded of Ultima IV’s virtue system. The eight virtues and shrines seem like just the sort of thing that would fit right into that mechanical structure. As usual, a very thought-provoking piece!


Frank Lantz

Love this! Reminds me a bit of the embodied cognition discussed in this conversation: https://x.com/Meaningness/status/1731699163979514254?s=20

He uses the example of a stack of bowls in the kitchen as a kind of sorting algorithm that is instantiated in the environment.

Joe Henrich talks about this idea as being the key thing about humans in his fantastic book The Secret of Our Success: https://press.princeton.edu/books/paperback/9780691178431/the-secret-of-our-success

Computation is physical, and this is a profound and counter-intuitive truth that board games perfectly express.

Tabletop Network seems cool, will check it out!


GameTek
