I Built a Website on My Laptop — CPU Only, No Cloud, No GPU, No API Keys

A while back I wrote about turning one sentence into a governed game with GitPilot + Matrix Builder — and then about building the watsonx-only Contract Quest. Both of those leaned on a cloud model. This time I wanted to answer a harder, more honest question:

Can a multi-agent system design and build a real website with no cloud, no GPU, no API keys — on a plain laptop CPU?

The answer is yes. Below is exactly how I did it: a small local model, a bridge, a multi-agent designer, an AI coder, and a governance layer that refuses to let anything broken ship. One sentence goes in; a working, validated website comes out — and the second pass made it look genuinely professional.

The local architecture — CPU only, no GPU, no cloud

The cast (and who does what)

Five small, sharp tools, each with one job:

Tool Role Where it runs
Ollama runs a small coding LLM locally :11434, CPU
OllaBridge wraps Ollama in an OpenAI-compatible gateway :11435
Matrix Designer the multi-agent brain that plans the batches local
GitPilot the AI coder that writes the files local
Matrix Builder the governance — validates every batch, fail-closed local

Nothing leaves the machine. The only “API key” in the whole system is a throwaway string I made up for the local gateway.

My box for this run: 4 CPU cores, 15 GB RAM, no GPU. That’s it.


Step 1 — A local model with Ollama

Install Ollama and start the server (it needs zstd for extraction on a fresh box):

sudo apt-get install -y zstd
curl -fsSL https://ollama.com/install.sh | sh
ollama serve &           # API on 127.0.0.1:11434

Then pull a light coding model. The whole point is “small enough to run on a CPU,” so I used qwen2.5-coder:1.5b — under 1 GB:

ollama pull qwen2.5-coder:1.5b

A quick smoke test wrote a Hello-World page in about 11 seconds on CPU. That’s our engine.

Step 2 — Bridge it with OllaBridge

GitPilot (and most tools) speak the OpenAI chat API. Ollama has its own dialect, so I put OllaBridge in front of it: it exposes a clean, OpenAI-compatible /v1/chat/completions, adds an API key and rate limiting, and forwards to Ollama.

pip install ollabridge

# .env — point the bridge at our local Ollama
cat > .env <<'EOF'
HOST=127.0.0.1
PORT=11435
API_KEYS=sk-ollabridge-local-dev
OLLAMA_BASE_URL=http://127.0.0.1:11434
DEFAULT_MODEL=qwen2.5-coder:1.5b
EOF

ollabridge start --no-setup --host 127.0.0.1 --port 11435 --auth-mode local-trust

A health check confirms the bridge is live and pointed at our model:

{ "status": "ok", "mode": "gateway", "default_model": "qwen2.5-coder:1.5b", "auth_mode": "local-trust" }

Now any OpenAI client can talk to a model that’s running entirely on this laptop.

Step 3 — Point GitPilot at the local model

GitPilot supports an openai provider with a custom base URL — so I just aim it at the bridge:

export GITPILOT_PROVIDER=openai
export OPENAI_BASE_URL=http://127.0.0.1:11435/v1
export OPENAI_API_KEY=sk-ollabridge-local-dev
export GITPILOT_OPENAI_MODEL=qwen2.5-coder:1.5b

That single indirection is the whole trick: GitPilot → OllaBridge → Ollama → your CPU. The coder never knows (or cares) that the “OpenAI” endpoint is a 1 GB model humming on your own machine.

Step 4 — Let the multi-agent brain plan the batches

Here’s where it stops being “ask a model for an HTML file” and starts being a system.

Matrix Designer is a small LangGraph crew. Each agent owns one slice of the design, and they hand off in sequence — the Planner sets the goal, Requirements extracts the features, the Architect picks the components, UI/UX sketches the flows, the Batch Planner turns it all into an ordered roadmap, Quality checks it against the rules, and the Synthesizer emits the final plan.

Matrix Designer — the multi-agent crew

pip install matrix-designer
mdesign blueprints --idea "A simple Hello World static website with HTML and CSS"
# → minimal / standard / production blueprints, each with an ordered batch plan

The crucial idea: the design is decided before any code is written. The model isn’t improvising a whole site in one shot — it’s filling in small, pre-planned, pre-scoped batches. That’s the difference between “an AI made something” and “a system built something.”

Step 5 — Build it, one governed batch at a time

Now Matrix Builder turns the plan into a contract and runs the loop. For every batch it renders a contract-bound prompt, lets GitPilot generate the code with our local model, and then runs mb check — which is fail-closed: if the change steps outside the allow-list or breaks an acceptance rule, it returns needs-repair and nothing ships.

The governed loop — one validated batch at a time

mb init "Hello World — a clean single-file static website ..." --quality starter
# then, per batch:
mb next "<the next goal>"          # plan a scoped batch
mb prompt --coder gitpilot          # render the contract-bound prompt
gitpilot generate -m "<prompt>" -o frontend   # local model writes the file
mb check --changed frontend/index.html        # validate, fail-closed → Matrix Commit

Every batch that passed was sealed as an immutable Matrix Commit:

Batch 01  Hello World page foundation     ✓ mc-02bbd1f093de
Batch 02  Greeting button                 ✓ mc-0de0cc051dad
Batch 03  Polish (gradient + greeting)     ✓ mc-657331039ba9
MATRIX_STATUS: approved  score=100

The result

The first pass worked but looked rough — the 1.5B model put a gradient on the wrong CSS property and targeted a missing element. So I did what you’d do with any junior coder: I gave it a precise spec and ran one more governed batch. The second pass is genuinely presentable:

The generated Hello World website Generated 100% locally — qwen2.5-coder:1.5b → OllaBridge → GitPilot → validated by Matrix Builder. Shown after clicking Greet Me.

A purple gradient, a clean white card, a working Greet Me button whose JavaScript fires a friendly message — and every line was written by a model running on a CPU, under governance, with the build history to prove it.

Why this matters

  • It’s private and free. No tokens leave your laptop; no per-call billing; no GPU. The same wiring runs on a Raspberry-class box or an air-gapped network.
  • Small models are fine — with a system around them. A 1.5B model can’t one-shot a polished app. But planned into small, scoped, validated batches, it ships real, correct files. The leverage is the multi-agent design + governance, not the size of the model.
  • The quality is a dial, not a ceiling. Want it sharper? Swap one string — ollama pull qwen2.5-coder:7b and GITPILOT_OPENAI_MODEL=qwen2.5-coder:7b. The architecture doesn’t change at all.
  • Nothing broken ships. mb check is fail-closed, so the worst case is “this batch needs another try,” never “a broken thing got committed.”

This is the same governed pipeline from the game tutorial — only now the model is on your CPU, and a designer brain plans the batches before the coder touches a file.

Reproduce it

# 1) local model
sudo apt-get install -y zstd && curl -fsSL https://ollama.com/install.sh | sh
ollama serve & ; ollama pull qwen2.5-coder:1.5b

# 2) bridge
pip install ollabridge
ollabridge start --no-setup --port 11435 --auth-mode local-trust

# 3) wire the coder to the local model
export GITPILOT_PROVIDER=openai OPENAI_BASE_URL=http://127.0.0.1:11435/v1 \
       OPENAI_API_KEY=sk-ollabridge-local-dev GITPILOT_OPENAI_MODEL=qwen2.5-coder:1.5b

# 4) plan + build, governed
pip install matrix-designer agent-generator gitcopilot
mdesign blueprints --idea "<your idea>"
mb init "<your idea>" --quality starter
# loop: mb next → mb prompt --coder gitpilot → gitpilot generate -o <dir> → mb check

Take it further

This experiment showed that useful AI software does not always require a large cloud model or expensive infrastructure. With the right system around it — planning, scoped batches, validation, and a local model — even a CPU-only setup can produce something real, private, and reproducible.

Leave a comment