LLM integration

The library ships with multiple LLM client wrappers behind a common abstraction so V4 and V5 engines can swap providers without code changes.

Available clients

| Client | Auth | Notes |
| --- | --- | --- |
| aup.llm.CodexCliClient | OAuth (handled by codex CLI) | Recommended for V5 reproduction. No API key. Sweet spot at concurrency=15 (~6 calls/sec). |
| aup.llm.ClaudeCodeClient | OAuth (handled by claude CLI) | No API key. Sweet spot at concurrency=20. Default model: haiku. |
| aup.llm.AnthropicClient | ANTHROPIC_API_KEY env var | Direct Anthropic SDK. |
| aup.llm.OpenAIClient | OPENAI_API_KEY env var | Direct OpenAI SDK. |
| aup.llm.ZaiCodingClient | ZAI_API_KEY env var | Anthropic-compatible Z.ai proxy. Concurrency=5. |

Selecting a provider

import os

import agent_urban_planning as aup

# codex-cli (preferred for V5 paper reproduction)
client = aup.llm.CodexCliClient()

# claude-code
client = aup.llm.ClaudeCodeClient(model="haiku")

# Anthropic SDK
client = aup.llm.AnthropicClient(api_key=os.environ["ANTHROPIC_API_KEY"])

# Pass to any engine that uses an LLM:
engine = aup.LLMDecisionEngine(params, llm_client=client)
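
Because every client is constructed and passed to the engine the same way, the provider can be picked from configuration instead of being hard-coded. The sketch below shows one way to do that. The `make_client` helper and the `AUP_LLM_PROVIDER` environment variable are hypothetical and not part of the library; the stand-in factories only keep the example runnable without credentials.

```python
import os


def make_client(registry, env_var="AUP_LLM_PROVIDER", default="codex-cli"):
    """Build a client from the provider name stored in an environment variable.

    `registry` maps provider names to zero-argument factories. Both this
    helper and the AUP_LLM_PROVIDER variable are hypothetical.
    """
    name = os.environ.get(env_var, default)
    try:
        factory = registry[name]
    except KeyError:
        known = ", ".join(sorted(registry))
        raise ValueError(f"unknown provider {name!r}; expected one of: {known}") from None
    return factory()


# Stand-in factories so the sketch runs without the library installed.
# In real use the values would be client classes such as aup.llm.CodexCliClient.
demo_registry = {
    "codex-cli": lambda: "codex client",
    "claude-code": lambda: "claude client",
}

os.environ["AUP_LLM_PROVIDER"] = "claude-code"
selected = make_client(demo_registry)
```

With a registry of real client classes, the returned object would be passed to the engine as `llm_client`, exactly as in the snippet above.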

Async + caching

V5 issues 50 clusters × 50 iterations × 11 calls per iteration = 27,500 LLM calls per baseline run, so concurrency and caching are essential:

from agent_urban_planning.llm import AsyncLLMClient, LLMCallCache

# Async wrapper around any sync client (bounded concurrency)
async_client = AsyncLLMClient(client, concurrency=15)

# Per-clearing, price-bucketed cache
cache = LLMCallCache(bucket_size=0.20)

The LLMDecisionEngine automatically wraps its client with AsyncLLMClient and uses LLMCallCache internally; users typically don’t need to construct these directly.
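
To make the two ideas concrete, here is a minimal sketch of bounded concurrency combined with a price-bucketed cache. It is not the library's implementation: `BucketedCache`, `complete_all`, and the `(agent_id, prices, prompt)` request shape are assumptions made for illustration, and the sketch assumes a fresh cache would be created for each market clearing.

```python
import asyncio
import math


class BucketedCache:
    """Cache keyed on prices rounded down to a fixed bucket width (illustrative)."""

    def __init__(self, bucket_size):
        self.bucket_size = bucket_size
        self._store = {}

    def key(self, agent_id, prices):
        # Prices that fall in the same bucket share one cache entry.
        buckets = tuple(math.floor(p / self.bucket_size) for p in prices)
        return (agent_id, buckets)

    def get(self, key):
        return self._store.get(key)

    def put(self, key, value):
        self._store[key] = value


async def complete_all(client, cache, requests, concurrency):
    """Answer (agent_id, prices, prompt) requests with at most `concurrency` calls in flight."""
    gate = asyncio.Semaphore(concurrency)

    async def one(agent_id, prices, prompt):
        key = cache.key(agent_id, prices)
        hit = cache.get(key)
        if hit is not None:
            return hit
        async with gate:
            # Run the blocking sync client in a worker thread.
            reply = await asyncio.to_thread(client.complete, prompt)
        cache.put(key, reply)
        return reply

    return await asyncio.gather(*(one(*r) for r in requests))


class CountingClient:
    """Offline stand-in that counts how many calls actually reach the provider."""

    def __init__(self):
        self.calls = 0

    def complete(self, user, system=""):
        self.calls += 1
        return f"reply to {user}"


demo_client = CountingClient()
demo_cache = BucketedCache(bucket_size=0.20)
first = asyncio.run(complete_all(demo_client, demo_cache, [("a", (1.05,), "p1")], concurrency=2))
second = asyncio.run(
    complete_all(demo_client, demo_cache, [("a", (1.15,), "p2"), ("a", (1.30,), "p3")], concurrency=2)
)
```

In the demo, prices 1.05 and 1.15 fall into the same 0.20-wide bucket, so the second request for agent `a` is served from the cache and only two calls reach the client. Identical requests submitted in the same batch are not deduplicated in this sketch; they would each miss the cache.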

Custom client

To plug in a custom LLM provider, implement the agent_urban_planning.llm.LLMClient protocol:

class MyClient:
    total_concurrency = 10  # max concurrent calls

    def complete(self, user: str, system: str = "") -> str:
        """Return raw string response."""
        # ... your logic ...
        return raw_response

# Use it like any built-in:
engine = aup.LLMDecisionEngine(params, llm_client=MyClient())

Anything with a .complete(user, system="") method returning a string works.
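
Since the requirement is structural, a scripted offline client is enough to exercise an engine in unit tests without network access. The sketch below spells out the interface as a `typing.Protocol` and provides such a client. `LLMClientLike` and `CannedClient` are hypothetical names; the library's own `LLMClient` protocol may declare more than is shown here.

```python
from typing import Protocol, runtime_checkable


@runtime_checkable
class LLMClientLike(Protocol):
    """Structural type mirroring the interface described above (hypothetical name)."""

    def complete(self, user: str, system: str = "") -> str: ...


class CannedClient:
    """Offline client that cycles through scripted replies."""

    total_concurrency = 1  # scripted replies are ordered, so keep calls serial

    def __init__(self, replies):
        self._replies = list(replies)
        self.prompts = []  # record of (system, user) pairs, useful for assertions

    def complete(self, user: str, system: str = "") -> str:
        self.prompts.append((system, user))
        # Pick the reply by call index, wrapping around at the end of the script.
        return self._replies[(len(self.prompts) - 1) % len(self._replies)]


stub = CannedClient(["first reply", "second reply"])
```

An instance like `stub` can be passed wherever a built-in client is accepted, and `stub.prompts` shows afterwards which prompts were sent.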

Multi-provider failover

For long V5 runs, the MultiProviderClient rotates calls across the configured providers and fails over to another provider when one is rate-limited:

from agent_urban_planning.llm import MultiProviderClient

client = MultiProviderClient([
    aup.llm.CodexCliClient(),
    aup.llm.ClaudeCodeClient(),
    aup.llm.ZaiCodingClient(),
])
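
The rotate-then-fail-over behaviour can be pictured with the following sketch. It is an illustration of the general technique, not the actual `MultiProviderClient` code: the `RateLimited` exception, the `RotatingClient` class, and the round-robin policy are all assumptions.

```python
class RateLimited(Exception):
    """Hypothetical signal that a provider refused a call because of rate limits."""


class RotatingClient:
    """Round-robin over providers; on RateLimited, try the next one (not thread-safe)."""

    def __init__(self, providers):
        if not providers:
            raise ValueError("at least one provider is required")
        self._providers = list(providers)
        self._next = 0

    def complete(self, user, system=""):
        n = len(self._providers)
        start = self._next
        # Advance the rotation for the next call, regardless of the outcome.
        self._next = (start + 1) % n
        last_error = None
        for offset in range(n):
            provider = self._providers[(start + offset) % n]
            try:
                return provider.complete(user, system)
            except RateLimited as err:
                last_error = err
        raise RuntimeError("all providers are rate-limited") from last_error


class FixedProvider:
    """Offline stand-in that either answers or always reports a rate limit."""

    def __init__(self, name, limited=False):
        self.name = name
        self.limited = limited

    def complete(self, user, system=""):
        if self.limited:
            raise RateLimited(self.name)
        return f"{self.name}: {user}"


rotating = RotatingClient(
    [FixedProvider("a"), FixedProvider("b", limited=True), FixedProvider("c")]
)
answers = [rotating.complete("hi") for _ in range(3)]
```

In the demo the second call is routed to provider `b`, which is rate-limited, so the call falls through to `c`. A production version would also need locking around the rotation index if it is called from several threads.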

Cost guidance for V5 baseline + shock

Approximate cost on each provider (Berlin 96-zone scenario, seed 42, 50 iters, ~27,500 calls per run, 2 runs for baseline + shock):

| Provider | Cost (full V5 baseline + shock) |
| --- | --- |
| codex-cli | $0 (OAuth, free tier sufficient) |
| claude-code | $0 (OAuth, free tier sufficient) |
| Anthropic SDK | ~$30-50 |
| OpenAI SDK | ~$40-70 |
| Z.ai | ~$20-30 |

For paper reproduction, codex-cli is recommended: it is free under OAuth, and its outputs are deterministic enough for the bundled cache to replicate them.
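
The figures above can be turned into rough per-call and wall-clock estimates. The arithmetic below uses only numbers quoted in this page (50 clusters, 50 iterations, 11 calls per iteration, 2 runs, ~6 calls/sec for codex-cli, $30-50 for the Anthropic SDK). It ignores cache hits and retries, so treat the results as back-of-the-envelope estimates; the helper names are illustrative.

```python
CLUSTERS = 50
ITERATIONS = 50
CALLS_PER_ITERATION = 11
RUNS = 2  # baseline + shock

calls_per_run = CLUSTERS * ITERATIONS * CALLS_PER_ITERATION  # 27,500
total_calls = calls_per_run * RUNS  # 55,000


def per_call_cost(low_usd, high_usd, calls=total_calls):
    """Spread a quoted cost range evenly over all calls."""
    return (low_usd / calls, high_usd / calls)


def run_minutes(calls, calls_per_second):
    """Wall-clock minutes for `calls` at a steady throughput."""
    return calls / calls_per_second / 60


anthropic_low, anthropic_high = per_call_cost(30, 50)
codex_minutes = run_minutes(calls_per_run, 6)

print(f"calls per run: {calls_per_run}, total: {total_calls}")
print(f"Anthropic SDK: ${anthropic_low:.5f} to ${anthropic_high:.5f} per call")
print(f"codex-cli at ~6 calls/sec: about {codex_minutes:.0f} minutes per run")
```

Under these assumptions a single run at ~6 calls/sec takes a little over an hour, and the paid providers work out to well under a tenth of a cent per call.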

See also