I led this project, with help from Yuchen Li, Dr. Steven James and Prof. @togelius.bsky.social.
You can find the paper here: arxiv.org/abs/2601.00105, and the code here: github.com/umair-nasir1...
For example, we ran experiments where we seeded the initial population with the Sokoban push mechanic and replaced the initial level with a Sokoban level, and found interesting results. This demonstrates that Mortar can provide ideas to expand your game.
The results suggest that Mortar can open-endedly generate playable and learnable games with mechanics that contribute positively towards the game.
The LLM selects the game assets from a predefined set. After some postprocessing steps to ensure playability, we send the game to our MCTS agents. Kendall's tau rank correlation determines the learnability of the game, and CITS determines the importance of the mechanic.
At each node expansion, we either add a mechanic from the QD archive or generate a new one through an LLM. The LLM generates each individual method for the game class, including the game level, step by step, with the context focused on the mechanics in the node.
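To make the expansion step concrete, here is a toy sketch of an Evaluation Tree node in Python. The `Node`/`expand` names and structure are my illustration, not the paper's code; the real choice between archive and LLM is part of the search, and `new_from_llm` stands in for the actual LLM call.

```python
from dataclasses import dataclass, field

@dataclass
class Node:
    """A game variant: the set of mechanics it contains, plus child variants."""
    mechanics: list[str]
    children: list["Node"] = field(default_factory=list)

def expand(node: Node, archive: list[str], new_from_llm) -> Node:
    """Expand a node by adding one mechanic: from the archive if one is
    available, otherwise a freshly generated one (simplified; in the real
    system this choice is part of the search, not a fixed rule)."""
    mech = archive[0] if archive else new_from_llm()
    child = Node(node.mechanics + [mech])
    node.children.append(child)
    return child

root = Node(["push"])
child = expand(root, ["lock"], lambda: "teleport")   # reuses archived "lock"
child2 = expand(root, [], lambda: "teleport")        # falls back to the LLM
```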
The actual writing of the mechanics is done by an LLM, acting as an evolutionary operator in a Quality-Diversity algorithm. The selection operator selects a game mechanic, the LLM mutates it, and the mutated mechanic becomes the root node of the Evaluation Tree.
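A minimal skeleton of that loop, with hypothetical names: `llm_mutate` is a placeholder for the LLM call that rewrites a mechanic's code, and the archive is shown as a plain dict rather than a real Quality-Diversity archive.

```python
import random

def llm_mutate(mechanic_code: str) -> str:
    # Placeholder for the LLM call that rewrites the mechanic's source code.
    return mechanic_code + "  # mutated"

def qd_step(archive: dict[str, str]) -> str:
    """One evolutionary step: select a mechanic, mutate it via the LLM.
    The returned mechanic would seed the root of a new Evaluation Tree."""
    parent_key = random.choice(list(archive))      # selection operator
    return llm_mutate(archive[parent_key])         # LLM as mutation operator

archive = {"push": "def push(self, obj): ..."}
new_mechanic = qd_step(archive)
```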
At the end of this "Evaluation MCTS" we get the actual ranks of the MCTS agents, compute the rank correlation across the games in the tree, and derive the Constraint Importance Through Search (CITS) score, which is inspired by Shapley values.
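The Shapley-inspired idea can be sketched as a marginal-contribution average: compare learnability scores of evaluated game variants that contain the mechanic against those that don't. This is an illustration of the flavor of the score, not the paper's exact CITS formula; the data is made up.

```python
def importance(evals: list[tuple[frozenset, float]], mechanic: str) -> float:
    """Shapley-style marginal contribution: mean learnability of variants
    containing the mechanic minus mean learnability of variants without it."""
    with_m = [score for mechs, score in evals if mechanic in mechs]
    without_m = [score for mechs, score in evals if mechanic not in mechs]
    return sum(with_m) / len(with_m) - sum(without_m) / len(without_m)

# Hypothetical (mechanic set, learnability) pairs from an Evaluation Tree.
evals = [
    (frozenset({"push"}), 0.2),
    (frozenset({"push", "lock"}), 0.8),
    (frozenset({"lock"}), 0.5),
]
print(importance(evals, "lock"))  # ≈ 0.45: "lock" helps learnability
```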
Interestingly, this makes the construction process a form of tree search itself, where the node expansions take the form of testing different mechanics. This is a form of systematic trial and error, or an evolutionary algorithm with a strange population structure.
However, that only tells us about the game, not the mechanic. We need to know if a new mechanic makes the game better. We do this by adding new mechanics sequentially and measuring the importance of each mechanic to the game.
But how can you tell whether a game mechanic is good? By playing the game. In Mortar, we use different Monte Carlo tree search agents. The good ones should play it better than the bad ones. This is a measure of game depth, or learnability.
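A minimal sketch of that learnability check, assuming MCTS iteration budget as the skill proxy: run agents of increasing strength and test, via Kendall's tau, whether their scores recover the intended skill ordering. The budgets and scores below are illustrative, not results from the paper.

```python
from itertools import combinations

def kendall_tau(xs: list[float], ys: list[float]) -> float:
    """Kendall rank correlation (tau-a) between two equal-length sequences."""
    concordant = discordant = 0
    for i, j in combinations(range(len(xs)), 2):
        sign = (xs[i] - xs[j]) * (ys[i] - ys[j])
        if sign > 0:
            concordant += 1
        elif sign < 0:
            discordant += 1
    n_pairs = len(xs) * (len(xs) - 1) / 2
    return (concordant - discordant) / n_pairs

budgets = [10, 50, 100, 500]      # MCTS iteration budgets, weak -> strong
scores  = [0.1, 0.4, 0.35, 0.9]   # hypothetical playthrough scores
tau = kendall_tau(budgets, scores)  # near 1 => stronger agents do better
```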
A game is more than the sum of its mechanics. But the game mechanics are really important! Can you make AI generate novel games by creating new mechanics, one at a time? Our new system, Mortar, generates complete games this way.