SimVi: A Behavior Tree Prototype for Natural Language Game AI Control
While exploring game AI development, I experimented with an interesting idea: Can we let players describe AI behaviors in natural language, and have the system automatically generate and execute the corresponding logic? SimVi is a preliminary implementation of this concept.
SimVi is a game AI prototype that combines natural language processing with a behavior tree system. Using a custom DSL (domain-specific language) and the OpenAI API, players describe villager behaviors in Chinese, and the system converts these descriptions into behavior tree DSL, which is then executed in a Phaser game scene.
The project's source code is open source on GitHub. Feel free to check it out and contribute.
Core Concept
The core goal of SimVi is to lower the barrier for defining game AI behaviors, which traditionally requires writing code or using complex visual editors. SimVi attempts to use natural language as an intermediary so that players can describe the behaviors they want more intuitively.
From Natural Language to Behavior Tree
The entire process can be divided into several steps:
- Natural Language Input: Players input Chinese instructions, such as "讓這個村民持續採木頭,背包滿了就回倉庫存放。" (Make this villager continuously gather wood, and return to storage when the backpack is full.)
- DSL Generation: The system uses the OpenAI API (or a mock version) to convert natural language into structured behavior tree DSL
- DSL Parsing and Validation: Parse the DSL syntax, calculate complexity, and check whether it meets the restrictions
- Behavior Tree Execution: Execute the behavior tree logic in the game scene to control villager actions
Technical Architecture
Frontend Technology Stack
SimVi uses a modern frontend technology stack:
- React + TypeScript + Vite: Provides a fast development experience and type safety
- Phaser 3: Game engine responsible for rendering game scenes and entities
- React Flow: Visualizes the behavior tree structure, allowing players to intuitively see the generated logic
Behavior Tree DSL Design
SimVi defines a concise behavior tree DSL that supports the following node types:
- BEHAVIOR: Defines the root node of a behavior
- LOOP: Repeatedly executes child nodes (currently supports mode=forever)
- SEQUENCE: Executes child nodes sequentially, fails if any child fails
- SELECTOR: Tries child nodes sequentially, succeeds if any child succeeds
- CONDITION: Conditional checks, such as checking if the backpack is full
- ACTION: Specific actions, such as moving, gathering, or depositing resources
Here's a simple DSL example:
BEHAVIOR Lumberjack:
  LOOP mode=forever:
    SEQUENCE:
      ACTION move_to target=nearest(tree)
      ACTION gather resource=wood
      ACTION move_to target=storage
      ACTION deposit resource=wood
This DSL defines a continuous wood-gathering behavior: the villager will move to the nearest tree, gather wood, then move to storage, deposit the wood, and repeat this cycle.
Complexity Calculation Mechanism
To prevent players from defining overly complex behaviors, SimVi implements a complexity calculation mechanism. Each node has a corresponding complexity cost, and the system calculates the total complexity of the entire behavior tree and compares it with a set limit. If it exceeds the limit, the behavior will not execute.
This design allows the game to limit the complexity of behaviors available to players based on mechanisms like "village level," adding strategic depth.
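To make the idea concrete, here is a minimal sketch of how such a cost check could work. The AST shape, the per-node costs, and the names `NODE_COST`, `totalComplexity`, and `withinLimit` are assumptions made for illustration; the actual costs and limits in SimVi may differ.

```typescript
// Hypothetical AST shape; the real SimVi types may differ.
type NodeKind = "BEHAVIOR" | "LOOP" | "SEQUENCE" | "SELECTOR" | "CONDITION" | "ACTION";

interface TreeNode {
  kind: NodeKind;
  children: TreeNode[];
}

// Assumed per-node costs, chosen only for illustration.
const NODE_COST: Record<NodeKind, number> = {
  BEHAVIOR: 0,
  LOOP: 2,
  SEQUENCE: 1,
  SELECTOR: 1,
  CONDITION: 1,
  ACTION: 1,
};

// Total complexity is the node's own cost plus the cost of all descendants.
function totalComplexity(node: TreeNode): number {
  return node.children.reduce(
    (sum, child) => sum + totalComplexity(child),
    NODE_COST[node.kind],
  );
}

// A behavior may run only if its total cost fits within the limit.
function withinLimit(root: TreeNode, limit: number): boolean {
  return totalComplexity(root) <= limit;
}

// The Lumberjack example: BEHAVIOR > LOOP > SEQUENCE > four ACTIONs.
const leaf = (kind: NodeKind): TreeNode => ({ kind, children: [] });
const lumberjack: TreeNode = {
  kind: "BEHAVIOR",
  children: [
    {
      kind: "LOOP",
      children: [
        {
          kind: "SEQUENCE",
          children: [leaf("ACTION"), leaf("ACTION"), leaf("ACTION"), leaf("ACTION")],
        },
      ],
    },
  ],
};

console.log(totalComplexity(lumberjack), withinLimit(lumberjack, 10));
```

With these assumed costs, a limit tied to something like village level would simply be the second argument to `withinLimit`.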
Implementation Details
DSL Parser
The DSL parser uses a recursive descent approach, using indentation to represent node hierarchy. The parser will:
- Check if indentation is correct (must be multiples of 2)
- Validate syntax against specifications
- Build an Abstract Syntax Tree (AST)
- Provide detailed error messages, including line numbers and specific error locations
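The steps above can be sketched roughly as follows. This is a simplified illustration of indentation-based recursive descent, not SimVi's actual parser: the `AstNode` shape, the function names, and the exact error wording are all assumptions, and parameters such as `target=nearest(tree)` are kept as raw text rather than parsed.

```typescript
// Hypothetical AST node; the real parser's types may differ.
interface AstNode {
  type: string; // e.g. "SEQUENCE", "ACTION"
  text: string; // remainder of the line after the keyword
  line: number; // 1-based source line, used in error messages
  children: AstNode[];
}

interface Line {
  depth: number;
  type: string;
  text: string;
  line: number;
}

const KEYWORDS = ["BEHAVIOR", "LOOP", "SEQUENCE", "SELECTOR", "CONDITION", "ACTION"];

// Split the source into lines and validate indentation and keywords.
function tokenize(source: string): Line[] {
  const lines: Line[] = [];
  source.split("\n").forEach((raw, index) => {
    if (raw.trim() === "") return; // skip blank lines
    const indent = raw.length - raw.replace(/^ +/, "").length;
    if (indent % 2 !== 0) {
      throw new Error(`Line ${index + 1}: indentation must be a multiple of 2`);
    }
    const body = raw.trim().replace(/:$/, "");
    const [type, ...rest] = body.split(/\s+/);
    if (KEYWORDS.indexOf(type) === -1) {
      throw new Error(`Line ${index + 1}: unknown node type "${type}"`);
    }
    lines.push({ depth: indent / 2, type, text: rest.join(" "), line: index + 1 });
  });
  return lines;
}

// Recursive descent: a node owns the following lines that are one level deeper.
function parseNode(lines: Line[], pos: { i: number }): AstNode {
  const head = lines[pos.i++];
  const node: AstNode = { type: head.type, text: head.text, line: head.line, children: [] };
  while (pos.i < lines.length && lines[pos.i].depth > head.depth) {
    const next = lines[pos.i];
    if (next.depth !== head.depth + 1) {
      throw new Error(`Line ${next.line}: unexpected indentation`);
    }
    node.children.push(parseNode(lines, pos));
  }
  return node;
}

function parseDsl(source: string): AstNode {
  const lines = tokenize(source);
  if (lines.length === 0 || lines[0].type !== "BEHAVIOR" || lines[0].depth !== 0) {
    throw new Error("Expected a top-level BEHAVIOR node");
  }
  const pos = { i: 0 };
  const root = parseNode(lines, pos);
  if (pos.i < lines.length) {
    throw new Error(`Line ${lines[pos.i].line}: only one top-level BEHAVIOR is allowed`);
  }
  return root;
}

const example = [
  "BEHAVIOR Lumberjack:",
  "  LOOP mode=forever:",
  "    SEQUENCE:",
  "      ACTION move_to target=nearest(tree)",
  "      ACTION gather resource=wood",
].join("\n");
const ast = parseDsl(example);
console.log(ast.type, ast.text, ast.children.length);
```

Keeping the line number on every node is what makes it cheap to report errors with a location later, both during parsing and during validation.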
Behavior Tree Executor
The behavior tree executor (BehaviorRuntime) is responsible for executing the parsed AST. It implements complete behavior tree logic:
- State Memory: Uses memory to store node execution states, supporting long-running actions (such as movement)
- Node Execution: Executes corresponding logic based on node type (SEQUENCE, SELECTOR, LOOP, etc.)
- Action Handling: Integrates with the game world's action system, such as movement, gathering, and depositing
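As an illustration of how state memory lets a long-running action span several frames, here is a minimal tick function. It is a sketch under assumptions, not the BehaviorRuntime code itself: the `BtNode` shape, the `run` callback standing in for the action system, and the choice to treat LOOP as having a single child are all hypothetical.

```typescript
type Status = "success" | "failure" | "running";

interface BtNode {
  id: string;
  type: "LOOP" | "SEQUENCE" | "SELECTOR" | "CONDITION" | "ACTION";
  children: BtNode[];
  // Hypothetical callback standing in for the game world's action/condition system.
  run?: () => Status;
}

// Per-node memory: index of the child to resume from on the next tick.
type Memory = Record<string, number>;

function tick(node: BtNode, memory: Memory): Status {
  if (node.type === "ACTION" || node.type === "CONDITION") {
    return node.run ? node.run() : "failure";
  }
  if (node.type === "LOOP") {
    // mode=forever: tick the (assumed single) child and never finish.
    if (node.children.length > 0) tick(node.children[0], memory);
    return "running";
  }
  // SEQUENCE stops on the first failure, SELECTOR on the first success.
  const isSequence = node.type === "SEQUENCE";
  const stopOn: Status = isSequence ? "failure" : "success";
  const start = node.id in memory ? memory[node.id] : 0;
  for (let i = start; i < node.children.length; i++) {
    const status = tick(node.children[i], memory);
    if (status === "running") {
      memory[node.id] = i; // resume from this child next tick
      return "running";
    }
    if (status === stopOn) {
      delete memory[node.id];
      return stopOn;
    }
  }
  delete memory[node.id];
  return isSequence ? "success" : "failure";
}

// Usage: "move" needs two ticks, so the sequence pauses and resumes at it.
const log: string[] = [];
let moveTicks = 0;
const action = (id: string, run: () => Status): BtNode => ({
  id,
  type: "ACTION",
  children: [],
  run,
});
const tree: BtNode = {
  id: "loop",
  type: "LOOP",
  children: [
    {
      id: "seq",
      type: "SEQUENCE",
      children: [
        action("find", () => { log.push("find"); return "success"; }),
        action("move", () => (++moveTicks < 2 ? "running" : "success")),
        action("gather", () => { log.push("gather"); return "success"; }),
      ],
    },
  ],
};

const memory: Memory = {};
const first = tick(tree, memory);
const resumeIndex = memory["seq"];
const second = tick(tree, memory);
console.log(first, resumeIndex, second, log);
```

Because the sequence remembers which child was still running, the earlier "find" step is not repeated while the villager is moving.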
Natural Language to DSL Conversion
The system uses OpenAI's GPT-4o-mini model to convert natural language to DSL. The prompt design includes:
- List of available actions and conditions
- DSL syntax specifications
- Multiple examples showing behavior descriptions of different complexities
- Clear rules to ensure generated DSL conforms to specifications
If no OpenAI API Key is provided, the system uses a mock version that generates simple DSL examples based on keyword matching.
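A keyword-matching fallback of this kind might look like the sketch below. The keywords, the function name `mockGenerateDsl`, and the fallback behavior are assumptions for illustration only; the real mock may match different words and return different templates.

```typescript
// DSL template taken from the Lumberjack example in this article.
const LUMBERJACK_DSL = [
  "BEHAVIOR Lumberjack:",
  "  LOOP mode=forever:",
  "    SEQUENCE:",
  "      ACTION move_to target=nearest(tree)",
  "      ACTION gather resource=wood",
  "      ACTION move_to target=storage",
  "      ACTION deposit resource=wood",
].join("\n");

// Assumed fallback when no keyword matches: just walk to storage.
const FALLBACK_DSL = [
  "BEHAVIOR GoToStorage:",
  "  ACTION move_to target=storage",
].join("\n");

// Hypothetical mock generator: picks a canned DSL snippet by keyword.
function mockGenerateDsl(description: string): string {
  // "木" is the character for wood; the English keyword is an extra assumption.
  const mentionsWood = description.indexOf("木") !== -1 || /wood/i.test(description);
  return mentionsWood ? LUMBERJACK_DSL : FALLBACK_DSL;
}

console.log(mockGenerateDsl("讓這個村民持續採木頭"));
```

A mock like this cannot understand arbitrary instructions, but it keeps the rest of the pipeline (parsing, complexity check, execution) testable without network access.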
Game Scene Design
The current game scene includes:
- Villager: Blue circle that can execute various behaviors
- Tree: Green circle that can be harvested for wood
- Storage: Yellow rectangle where resources can be deposited
- Guard: Red circle that patrols along a fixed path
The game world uses a simple entity-component system, where each entity has properties like position and inventory. The action system modifies these properties based on behavior tree instructions and updates visual representations.
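To show what "modifying these properties" can mean in practice, here is a small sketch of two actions operating on plain entity data. The `Entity` shape and the `stepToward` and `deposit` helpers are hypothetical, and rendering through Phaser is deliberately left out.

```typescript
interface Vec2 {
  x: number;
  y: number;
}

// Hypothetical entity data; the real SimVi entities may carry more fields.
interface Entity {
  position: Vec2;
  inventory: Record<string, number>;
}

// Moves the entity at most `speed` units toward the target.
// Returns true once the entity has arrived, so a move action can report "running" until then.
function stepToward(entity: Entity, target: Vec2, speed: number): boolean {
  const dx = target.x - entity.position.x;
  const dy = target.y - entity.position.y;
  const distance = Math.sqrt(dx * dx + dy * dy);
  if (distance <= speed) {
    entity.position = { x: target.x, y: target.y };
    return true;
  }
  entity.position = {
    x: entity.position.x + (dx / distance) * speed,
    y: entity.position.y + (dy / distance) * speed,
  };
  return false;
}

// Moves every unit of one resource from the carrier into the storage inventory.
function deposit(carrier: Entity, storage: Entity, resource: string): number {
  const amount = carrier.inventory[resource] || 0;
  storage.inventory[resource] = (storage.inventory[resource] || 0) + amount;
  carrier.inventory[resource] = 0;
  return amount;
}

const villager: Entity = { position: { x: 0, y: 0 }, inventory: { wood: 3 } };
const storage: Entity = { position: { x: 3, y: 4 }, inventory: {} };

let steps = 0;
let arrived = false;
while (!arrived) {
  arrived = stepToward(villager, storage.position, 2);
  steps++;
}
const moved = deposit(villager, storage, "wood");
console.log(steps, moved, storage.inventory);
```

In a real scene, the visual representation would be updated from `entity.position` after each step.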
Visualization Features
SimVi uses React Flow to visualize generated behavior trees. When DSL parsing succeeds, the system automatically generates corresponding nodes and connections, allowing players to intuitively see:
- The overall structure of the behavior tree
- Node types and parameters
- Hierarchical relationships between nodes
This visualization helps players understand the AI's behavior logic and makes debugging easier.
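Converting a parsed tree into something React Flow can render mostly means flattening it into a list of nodes and a list of edges. The sketch below assumes a simple `AstNode` shape and an indented, outline-style layout; the local `FlowNode` and `FlowEdge` interfaces mirror only the minimal fields React Flow expects (`id`, `position`, `data` for nodes; `id`, `source`, `target` for edges), and the spacing values are arbitrary.

```typescript
// Hypothetical AST shape; the real SimVi types may differ.
interface AstNode {
  type: string;
  text: string;
  children: AstNode[];
}

// Minimal node and edge shapes compatible with React Flow's required fields.
interface FlowNode {
  id: string;
  position: { x: number; y: number };
  data: { label: string };
}

interface FlowEdge {
  id: string;
  source: string;
  target: string;
}

// Pre-order traversal: one row per node, indented by depth.
function toFlow(root: AstNode): { nodes: FlowNode[]; edges: FlowEdge[] } {
  const nodes: FlowNode[] = [];
  const edges: FlowEdge[] = [];

  function visit(node: AstNode, depth: number, parentId: string | null): void {
    const index = nodes.length;
    const id = `n${index}`;
    nodes.push({
      id,
      position: { x: depth * 160, y: index * 70 }, // arbitrary spacing
      data: { label: node.text ? `${node.type} ${node.text}` : node.type },
    });
    if (parentId !== null) {
      edges.push({ id: `${parentId}-${id}`, source: parentId, target: id });
    }
    for (const child of node.children) {
      visit(child, depth + 1, id);
    }
  }

  visit(root, 0, null);
  return { nodes, edges };
}

const sample: AstNode = {
  type: "BEHAVIOR",
  text: "Lumberjack",
  children: [
    {
      type: "LOOP",
      text: "mode=forever",
      children: [
        {
          type: "SEQUENCE",
          text: "",
          children: [
            { type: "ACTION", text: "move_to target=nearest(tree)", children: [] },
            { type: "ACTION", text: "gather resource=wood", children: [] },
          ],
        },
      ],
    },
  ],
};

const flow = toFlow(sample);
console.log(flow.nodes.length, flow.edges.length);
```

The resulting arrays could then be passed to the `nodes` and `edges` props of the React Flow component.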
Application Scenarios and Ideas
Although SimVi is currently just a simple prototype, this technical direction has many interesting application possibilities in game development. Here are some application scenarios I thought of during development:
Squad Formation System
In action games or RPGs, players can control the main character while commanding other companions. Through natural language, players can define unique combat strategies for each companion:
- Tank Character: "When enemies approach, prioritize protecting the player. If the player's health drops below 50%, use taunt skills to attract enemy attention."
- Healer Character: "Continuously follow the player. When the player or any teammate's health drops below 30%, immediately use healing skills."
- DPS Character: "Prioritize attacking the target the player is attacking. If the target's health drops below 20%, use finishing skills."
This design allows players to adjust companion tactics in real-time based on different combat situations, without needing to enter complex settings menus. Each companion can have their own unique behavior pattern, making combat more strategic and personalized.
Resource Gathering Optimization
In simulation or strategy games, players need to manage multiple villagers for resource gathering. By defining each villager's behavior through natural language, more refined resource management can be achieved:
- Division of Labor: "Have villager A specialize in gathering wood, villager B specialize in gathering stone, and villager C transport resources between them to storage."
- Priority Management: "When wood inventory drops below 100, all villagers prioritize gathering wood. When wood is sufficient, switch to gathering other resources."
- Efficiency Optimization: "Have villagers closest to the forest handle gathering, and villagers closest to storage handle transportation, maximizing overall efficiency."
Players can describe their ideal resource gathering strategy in natural language, and the system generates the corresponding behavior trees so that multiple villagers work together efficiently. This approach is more flexible than traditional "click-to-assign" methods and allows players to experiment with different strategy combinations.
These application scenarios demonstrate the potential of natural language-controlled AI: it not only lowers the technical barrier but also allows players to express strategic intentions in a more intuitive way. Although the current implementation is still simple, this direction is worth further exploration.
Project Limitations and Future Directions
As a prototype project, SimVi currently has some limitations:
- Only supports a limited set of action and condition types
- LOOP nodes only support forever mode
- Game scene is relatively simple, with only basic entities
- Natural language conversion accuracy has room for improvement
Possible future improvements include:
- Expanding action and condition systems to support more game mechanics
- Improving natural language processing to increase conversion accuracy
- Adding more game elements, such as combat, building, etc.
- Implementing more complex behavior tree nodes, such as parallel execution, interrupt mechanisms, etc.
- Adding behavior tree editing functionality, allowing players to manually adjust generated logic
Technical Learning and Insights
During the development of SimVi, I learned about:
- Behavior Tree Implementation: How to design and implement a complete behavior tree system
- DSL Design: How to design an easy-to-read and easy-to-write domain-specific language
- Natural Language Processing: How to use LLMs to handle structured tasks
- Game Development: How to use Phaser 3 to create simple game scenes
- Type Safety: How to use TypeScript to ensure type correctness of DSL and behavior trees
Live Demo
The following video demonstrates SimVi in action, showing how to describe behaviors in natural language, how the system generates DSL, and how villagers execute these behaviors in the game scene:
Conclusion
SimVi is an exploratory prototype project that attempts to use natural language to lower the barrier for defining game AI behaviors. Although there's still much room for improvement, this project demonstrates the potential of combining natural language processing with game development.
This project has also given me a deeper understanding of behavior tree systems and the application potential of natural language processing in game development. If you're interested in this project, feel free to check out the source code on GitHub, or share any ideas and suggestions.
If I have the opportunity in the future, I'll continue to improve this project with more features and enhancements.