An Open Model Built a Mario-Style Platformer — Under Contract, on watsonx


Every other game in this arcade — Pong, Tetris, Match-3 — was written by Claude Opus 4.8. This one wasn’t. I wanted to prove the part of the pitch that’s easy to claim and hard to show: that the governance is provider-agnostic. So I swapped the brain entirely and had an open-weight model — openai/gpt-oss-120b, running on IBM watsonx.ai — build the most advanced game of the set: a Mario-style neon platformer, across 6 governed batches.

Same mb contracts. Same gitpilot driver. Same mb check gate. Different model. Here’s the whole thing — and a one-command script to reproduce it.

🎮 Play it first

▶ Play Neon Climber — one self-contained HTML file, mobile + desktop (tilt, touch, or arrows). Bounce up neon platforms, stomp enemies, grab coins + stars, ride springs, use power-ups, climb as high as you can.

Figure: Neon Climber, the real game running (headless capture). Visible: moving and spring platforms, neon enemies, a star, coins, a high score (BEST), and a mute toggle.

Source, the contract bundle, and the reproducible build script: github.com/ruslanmv/doodle-jump-climber-under-contract.

The setup: GitPilot, pointed at watsonx

GitPilot is provider-agnostic. To use IBM watsonx with the open gpt-oss-120b model, you set a handful of env vars — nothing else in the loop changes:

pip install agent-generator gitcopilot crewai

export GITPILOT_PROVIDER=watsonx
export WATSONX_API_KEY="<your IBM Cloud API key>"
export WATSONX_PROJECT_ID="<your watsonx project id>"
export WATSONX_URL=https://us-south.ml.cloud.ibm.com
export GITPILOT_WATSONX_MODEL=openai/gpt-oss-120b
export GITPILOT_MAX_TOKENS=18000
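Before starting a run, it can help to confirm that the variables above are actually set. The following is a minimal Python sketch, not part of GitPilot: `missing_watsonx_vars` is a hypothetical helper, and treating these five names as required (with `GITPILOT_MAX_TOKENS` as optional) is an assumption, since the post does not say which ones are mandatory.

```python
import os

# Variable names are copied from the setup above. Treating all five as
# required is an assumption; GITPILOT_MAX_TOKENS is treated as optional.
REQUIRED_VARS = (
    "GITPILOT_PROVIDER",
    "WATSONX_API_KEY",
    "WATSONX_PROJECT_ID",
    "WATSONX_URL",
    "GITPILOT_WATSONX_MODEL",
)


def missing_watsonx_vars(env=None):
    """Return the required variable names that are unset or blank, in order."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED_VARS if not env.get(name, "").strip()]
```

Calling `missing_watsonx_vars()` with no argument inspects the current process environment; an empty list means every name in the tuple has a non-blank value.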

The 6 batches

Each batch ran the same four commands — plan a scoped batch, render the contract-bound prompt, let the model extend the single allowed file, validate fail-closed:

mb next "<the batch goal>"
mb prompt --coder gitpilot
gitpilot generate -m "$(cat coder-prompts/gitpilot.md)" -o .
mb check frontend/index.html
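The four commands above can also be driven from a script, so that a failure at any step stops the batch. This is a rough Python sketch under stated assumptions: `run_batch` is a hypothetical wrapper, and it assumes each CLI exits non-zero on failure (in particular, that `mb check` does so for a verdict other than approved), which the post does not confirm.

```python
import subprocess
from pathlib import Path

PROMPT_FILE = Path("coder-prompts/gitpilot.md")
TARGET_FILE = "frontend/index.html"


def run_batch(goal, prompt_file=PROMPT_FILE, run=subprocess.run):
    """Run the four per-batch commands in order, stopping at the first failure.

    With check=True, subprocess.run raises CalledProcessError on a non-zero
    exit code. Whether mb and gitpilot signal failure that way is an assumption.
    """
    run(["mb", "next", goal], check=True)
    run(["mb", "prompt", "--coder", "gitpilot"], check=True)
    # The prompt file is read only after `mb prompt` has rendered it.
    # rstrip mirrors how $(cat ...) drops trailing newlines in the shell.
    prompt_text = prompt_file.read_text(encoding="utf-8").rstrip("\n")
    run(["gitpilot", "generate", "-m", prompt_text, "-o", "."], check=True)
    run(["mb", "check", TARGET_FILE], check=True)
```

The `run` parameter exists only so the sequence can be exercised with a stand-in function, without the real CLIs installed.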

#	Batch	What gpt-oss-120b added	Size	Matrix Commit
1	Foundation	neon hero, jump physics, vertical camera, procedural platforms	15 KB	`mc-f0101634a883`
2	Controls	left/right, screen-wrap, tilt + touch + keyboard	22 KB	`mc-2cde9611b371`
3	Platforms + items	moving/breaking/spring platforms, coins + stars	36 KB	`mc-511e55820933`
4	Enemies + power-ups	stompable neon enemies, shield / jetpack / magnet	35 KB	`mc-cf84ff722c6e`
5	Juice	particles, trail, parallax, WebAudio, screen-shake	41 KB	`mc-b133e1b1d466`
6	Meta	start / game-over, high score, difficulty ramp, boss milestone	47 KB	`mc-8c08092f78c3`

Every batch returned MATRIX_STATUS: approved score=100, the model wrote only to frontend/index.html, and a headless smoke test found zero runtime errors. The full transcript is in EVIDENCE.md.
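To make the two checks in that paragraph concrete, here is a small Python sketch of a fail-closed gate: a batch passes only on an explicit approved verdict and only if nothing outside the allow-list changed. The helper names are hypothetical, and the status pattern is inferred from the single line quoted in this post, so the real `mb check` output may look different.

```python
import re

ALLOWED_FILES = frozenset({"frontend/index.html"})

# Pattern inferred from the one status line quoted in the post (an assumption).
_STATUS_RE = re.compile(
    r"MATRIX_STATUS:\s*(?P<verdict>[\w-]+)\s+score=(?P<score>\d+)"
)


def parse_status(output):
    """Return (verdict, score); unparseable output becomes ('rejected', 0)."""
    match = _STATUS_RE.search(output)
    if match is None:
        return ("rejected", 0)
    return (match.group("verdict"), int(match.group("score")))


def out_of_scope(changed_paths, allowed=ALLOWED_FILES):
    """Return the changed paths that are not on the allow-list, sorted."""
    return sorted(set(changed_paths) - set(allowed))


def batch_passes(output, changed_paths):
    """Pass only on an explicit 'approved' verdict with no out-of-scope writes."""
    verdict, _score = parse_status(output)
    return verdict == "approved" and not out_of_scope(changed_paths)
```

The list of changed paths could come from something like `git status --porcelain`; how Matrix Builder itself detects out-of-scope writes is not described in the post.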

An honest note on open models. gpt-oss-120b is strong, but it trails Opus 4.8 on huge single-file rewrites. It wrote more compact code (batch 4 even shrank the file, from 36 KB to 35 KB), so after every batch I verified that earlier features survived. They did, with no regressions. What makes that safe is the contract itself: an allow-list the model can’t exceed and a validator that runs before anything lands.
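A shrinking file is not a regression in itself, only a cue to re-check earlier features. As a tiny Python sketch, the sizes from the table can be scanned for batches that came out smaller than the one before; `shrinking_batches` is a hypothetical helper, not something the repo is known to ship.

```python
# File sizes in KB after each batch, copied from the table in this post.
SIZES_KB = [15, 22, 36, 35, 41, 47]


def shrinking_batches(sizes):
    """Return 1-based batch numbers whose file is smaller than the previous one."""
    return [
        index + 1
        for index in range(1, len(sizes))
        if sizes[index] < sizes[index - 1]
    ]


print(shrinking_batches(SIZES_KB))  # prints [4]
```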

Reproduce it with one command

The repo ships a build.sh that rebuilds the entire game from scratch — all 6 batches — with watsonx:

git clone https://github.com/ruslanmv/doodle-jump-climber-under-contract
cd doodle-jump-climber-under-contract
pip install agent-generator gitcopilot crewai
export WATSONX_API_KEY=...  WATSONX_PROJECT_ID=...
./build.sh          # → frontend/index.html, validated batch by batch

Want Claude or local Ollama instead? Change GITPILOT_PROVIDER and the model env var at the top of build.sh. The contracts don’t change.

How it works — and why Matrix Builder

Figure: how it works (Matrix Builder × GitPilot × watsonx) and the advantages of mb.

A contract, not a prompt — a locked blueprint + pinned standards.
Allow-list scope — the model edits only the files you permit.
Fail-closed validation — mb check returns approved / needs-repair / rejected.
Immutable Matrix Commits — every change pins the prompt, diff, and verdict.
Provider-agnostic — Claude, OpenAI, watsonx, or local Ollama; the governance never changes. This game is the proof.
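One way to picture a commit that pins the prompt, diff, and verdict is a content-addressed id: hash the three together, and any change to any of them yields a different id. The Python sketch below is purely illustrative. The `mc-` prefix and 12 hex characters mirror the ids in the batch table, but the hashing scheme and the `matrix_commit_id` name are assumptions, not how Matrix Builder is known to compute them.

```python
import hashlib
import json


def matrix_commit_id(prompt, diff, verdict):
    """Derive a short id from the exact prompt, diff, and verdict.

    Illustrative scheme only (an assumption): SHA-256 over a canonical JSON
    record, truncated to 12 hex characters behind an 'mc-' prefix.
    """
    record = json.dumps(
        {"prompt": prompt, "diff": diff, "verdict": verdict},
        sort_keys=True,
        ensure_ascii=False,
    )
    digest = hashlib.sha256(record.encode("utf-8")).hexdigest()
    return "mc-" + digest[:12]
```

Because the id depends on all three inputs, the same id cannot describe two different diffs or verdicts (short of a hash collision), which is the property an audit trail needs.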

Take it for a spin

Play / fork: github.com/ruslanmv/doodle-jump-climber-under-contract
Matrix Builder: agent-matrix/matrix-builder
GitPilot: gitpilot.ruslanmv.com
The rest of the arcade: Pong · Tetris · Match-3

An open model built this game, and it could prove it stayed within scope at every step. That ability to verify exactly what an AI was allowed to change is the part worth taking away.


Ruslan Magana Vsevolodovna
