A PettingZoo environment for the multiplayer board game Azul, intended for training AI agents.
from azul_marl_env import azul_v1_2players, azul_v1_3players, azul_v1_4players
env_2players = azul_v1_2players()
env_3players = azul_v1_3players()
env_4players = azul_v1_4players()
env_2players_custom_max_moves = azul_v1_2players(max_moves=100)

from azul_marl_env import AzulEnv
env = AzulEnv(player_count=2)
env = AzulEnv(player_count=3)
env = AzulEnv(player_count=4)
env = AzulEnv(player_count=2, max_moves=100)

from azul_marl_env import azul_v1_2players
import random
# Create and reset the environment
env = azul_v1_2players()
observation, info = env.reset()
# Iterate through agents
for agent in env.agent_iter():
    # Get current agent's observation and info
    observation, reward, termination, truncation, info = env.last()
    if termination or truncation:
        break
    # Get valid moves for current agent
    valid_moves = info["valid_moves"]
    # Select a random valid move
    action = random.choice(valid_moves)
    # Execute the move
    env.step(action)
    # Render the environment (optional)
    env.render()
# Close the environment
env.close()

from azul_marl_env import azul_v1_2players
import random
def play_random_game():
    env = azul_v1_2players()
    observation, info = env.reset()
    for agent in env.agent_iter():
        observation, reward, termination, truncation, info = env.last()
        if termination or truncation:
            print(f"Game finished! Final scores: {[player['score'] for player in observation['players']]}")
            break
        # Get valid moves and make a random move
        valid_moves = info["valid_moves"]
        if valid_moves:
            action = random.choice(valid_moves)
            env.step(action)
    env.close()

play_random_game()

Factory count (num_factories):
2 player game -> 5
3 player game -> 7
4 player game -> 9

Action Space: MultiDiscrete([num_factories + 1, 5, 20, 5])
- First value: Source index. Index 0 is the center, so factory k (0-based) is selected with index k + 1.
- Second value: Tile color (0-4, one index per color)
- Third value: Number of tiles to place on the floor (0-19)
- Fourth value: Pattern line index (0-4)
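
The index convention above can be sketched with a few plain-Python helpers. This is only an illustration: `FACTORIES_BY_PLAYERS`, `action_nvec`, `encode_action` and `is_in_bounds` are hypothetical names that are not part of `azul_marl_env`, and an in-bounds action is not necessarily a legal move under the game rules.

```python
# Illustrative helpers only; none of these names come from azul_marl_env.
FACTORIES_BY_PLAYERS = {2: 5, 3: 7, 4: 9}  # factory counts listed above


def action_nvec(player_count):
    """Sizes of the four MultiDiscrete components for a given player count."""
    num_factories = FACTORIES_BY_PLAYERS[player_count]
    return [num_factories + 1, 5, 20, 5]


def encode_action(factory_index, color, floor_tiles, pattern_line):
    """Build an action list; factory_index is 0-based, or None for the center."""
    source = 0 if factory_index is None else factory_index + 1
    return [source, color, floor_tiles, pattern_line]


def is_in_bounds(action, player_count):
    """Check only the component ranges, not the game rules."""
    nvec = action_nvec(player_count)
    return len(action) == 4 and all(0 <= v < n for v, n in zip(action, nvec))


print(action_nvec(2))                                            # [6, 5, 20, 5]
print(encode_action(2, color=3, floor_tiles=0, pattern_line=1))  # [3, 3, 0, 1]
print(encode_action(None, color=0, floor_tiles=1, pattern_line=4))  # [0, 0, 1, 4]
```

Whether a given in-bounds action is actually playable should still be checked against the `valid_moves` list that the environment reports.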

Observation Space: Dictionary containing:
- factories: Box(0, 4, (num_factories, 5), int32) - Tile counts in each factory
- center: Box(0, 3 * num_factories, (5,), int32) - Tile counts in center
- players: Tuple of player states, each containing:
  - pattern_lines: Box(0, 5, (5, 5), int32) - Current pattern lines
  - wall: Box(0, 5, (5, 5), int32) - Wall state
  - floor: Box(0, 5, (7,), int32) - Floor tiles
  - is_starting: Discrete(2) - First player marker
  - score: Discrete(241) - Player's score
- bag: Box(0, 100, (5,), int32) - Remaining tiles in bag
- lid: Box(0, 100, (5,), int32) - Discarded tiles
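
To make the layout of the observation dictionary concrete, the sketch below builds a hand-written stand-in with the documented keys and shapes and reads a few values from it. It uses plain lists rather than real `Box` arrays, the zero-filled contents are placeholders, and `make_mock_observation` and `summarize` are hypothetical helpers, not part of the package.

```python
import copy


def make_mock_observation(num_factories=5, player_count=2):
    """Stand-in observation with the documented keys; values are placeholders."""
    player = {
        "pattern_lines": [[0] * 5 for _ in range(5)],
        "wall": [[0] * 5 for _ in range(5)],
        "floor": [0] * 7,
        "is_starting": 0,
        "score": 0,
    }
    return {
        "factories": [[0] * 5 for _ in range(num_factories)],
        "center": [0] * 5,
        "players": tuple(copy.deepcopy(player) for _ in range(player_count)),
        "bag": [20] * 5,
        "lid": [0] * 5,
    }


def summarize(observation):
    """Collect scores and tile totals from an observation-shaped dict."""
    scores = [int(p["score"]) for p in observation["players"]]
    on_offer = sum(sum(row) for row in observation["factories"])
    on_offer += sum(observation["center"])
    return {
        "scores": scores,
        "tiles_on_offer": int(on_offer),
        "bag_total": int(sum(observation["bag"])),
    }


obs = make_mock_observation()
obs["factories"][0] = [2, 1, 1, 0, 0]   # four tiles on the first factory
obs["center"] = [0, 0, 1, 0, 0]         # one tile in the center
obs["bag"] = [18, 19, 18, 20, 20]       # the five tiles above left the bag
obs["players"][1]["score"] = 7
print(summarize(obs))
# {'scores': [0, 7], 'tiles_on_offer': 5, 'bag_total': 95}
```

The same key access (`observation["players"][i]["score"]`) is what the random-game example uses to print final scores.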

Reward:
- -1 for each step until game end
- -2 for invalid moves
- Final Azul score is added to cumulative reward at game end
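
As a rough worked example of how these terms add up over an episode, the sketch below computes a cumulative reward from step counts. It rests on an assumption the text does not settle: that an invalid move yields -2 in place of the -1 step penalty rather than on top of it. `expected_return` is a hypothetical helper, not part of the package.

```python
def expected_return(valid_steps, invalid_steps, final_score):
    """Cumulative reward for one agent under the assumed reward scheme.

    Assumption: an invalid move costs -2 instead of (not in addition to)
    the regular -1 step penalty.
    """
    return -1 * valid_steps - 2 * invalid_steps + final_score


# 40 valid moves, 3 invalid attempts, final Azul score of 55
print(expected_return(valid_steps=40, invalid_steps=3, final_score=55))  # 9
```

Under this reading, a shorter game with the same final score gives a higher return, so the step penalty nudges agents toward finishing quickly.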

Done:
- True when the game is completed (at least one player has filled at least one horizontal wall row)
- False otherwise

Truncated:
- True when the maximum number of moves is reached (player_count * 150 by default)
- False otherwise

Info: Contains the valid_moves list for the current player
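
The default truncation limit and a defensive way of sampling from `valid_moves` could be sketched as follows. Both helper names are hypothetical, and the example moves are assumed to be 4-element sequences matching the action space; the actual element type returned by the environment may differ.

```python
import random


def default_max_moves(player_count):
    """Default truncation limit stated above: player_count * 150."""
    return player_count * 150


def pick_random_valid_move(info, rng=random):
    """Return a random entry of info["valid_moves"], or None if there is none."""
    valid_moves = info.get("valid_moves")
    if valid_moves is None or len(valid_moves) == 0:
        return None
    return valid_moves[rng.randrange(len(valid_moves))]


print(default_max_moves(2))  # 300
print(default_max_moves(4))  # 600

# Assumed move format: (source, color, floor_tiles, pattern_line)
info = {"valid_moves": [(1, 0, 0, 2), (0, 3, 1, 4)]}
move = pick_random_valid_move(info, random.Random(0))
print(move in info["valid_moves"])             # True
print(pick_random_valid_move({"valid_moves": []}))  # None
```

Passing a seeded `random.Random` instance keeps rollouts reproducible, which helps when comparing agents against a random baseline.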
