Two trading scripts. One IB Gateway. Constant timeouts.
Picture two people trying to call the same phone number at the same time. One gets through. The other gets a busy signal — or worse, connects but hears nothing. Neither knows what the other is doing. Neither knows there is another person.
Running multiple trading strategies through a single IBKR TWS or IB Gateway is almost exactly this. The platform wasn’t designed for concurrent connections — it’s a trading terminal, not a message router. When two strategies fire at the same time, one gets data and one gets silence, and the logs rarely tell you which is which.
I’d been sitting on the fix for a while: a message queue in the middle, one process owning the IB connection, everything else talking to it. The architecture was clear in my head. I just didn’t have the time to build it solo. Now that AI coding capabilities have improved drastically, I finally opened Claude Code and said: “I have an idea. Help me build it.”
This is what building with AI actually feels like.
My Early Architecture Draft
1. The Architecture: Before and After
1.1 My Trading Setup
Before getting into the architecture problem, a bit of context on how I have things set up — because it’s actually the reason the problem exists in the first place.
I have multiple sub-accounts underneath my IBKR master account. Each sub-account is dedicated to exactly one trading strategy: trade_option_wheel.py runs against one sub-account, trade_cppi.py runs against another, and so on.
This arrangement has some real advantages that I’ve grown to appreciate:
- Monitoring is clean. Each strategy’s P&L, positions, and cash are fully isolated. I can see at a glance how each one is doing without untangling shared positions.
- Performance evaluation is honest. Since costs, commissions, and returns don’t bleed across sub-accounts, comparing strategies is straightforward — no guessing which fees belong to which trade.
- Risk is contained. A bad trade in one strategy doesn’t pollute the capital base of another. Each sub-account lives and dies by its own decisions.
The downside is that each script needs to connect to IB Gateway individually — and IB Gateway wasn’t designed to handle multiple simultaneous API connections gracefully. That’s where the phone-call collision problem starts.
1.2 The Problem With Two Cooks in One Kitchen
The sub-account setup is clean in theory. In practice, there’s an ugly catch.
Each trading script connects to IB Gateway using its own dedicated sub-account number — so far, fine. The problem is what happens when the second script connects. IB Gateway doesn’t handle two simultaneous sessions gracefully. It picks one and kicks the other out. Silently. The first strategy’s connection just… gets dropped. No warning, no error thrown back to the script — it’s still sitting there waiting for a response that’s never coming, until eventually the socket gives up and times out.
My first instinct was the obvious one: schedule the cron jobs so the scripts would never run at the same time. It worked, but it was fragile. Any glitch in a request-response cycle could push one script past its window and cause unpredictable failures across strategies. The more I thought about adding a third or fourth strategy down the road, the worse the whole thing felt. Not exactly a solution I felt good about.
Before: two strategies, two connections, unpredictable collisions
The whole point of the sub-account setup is to keep adding more — each new strategy gets its own isolated sub-account, its own script, its own P&L. Three strategies means three simultaneous connections. Five means five. At some point, the scheduling hacks stop working entirely, and I’m back to constant timeouts — except now there are more of them, and they’re harder to debug.
I needed a real fix, not a workaround.
1.3 The Simple But Stable Solution
The fix is a classic pattern: introduce a message queue broker. One process owns the connection. Everyone else sends it requests. The broker processes them one at a time, in order, and sends the responses back. No more collisions, no more guessing whose request made it through.
The only question was which message queue to use. Kafka and RabbitMQ were the obvious names — but honestly, they felt like overkill. Both require running a dedicated broker process with real infrastructure overhead. That’s like hiring a full logistics team to manage two people passing notes across a desk.
ZeroMQ, on the other hand, was the much lighter option. It’s a lightweight socket library — not a full message broker, no separate process to run, no infrastructure to babysit. It speaks request-reply natively, works in-process or across machines. Fast, simple, and exactly as much as I needed.
With the new architecture, a single ib_broker_service.py process becomes the sole owner of the IB Gateway connection. The strategies that run against the sub-accounts become ZMQ clients — they send requests over a socket and wait for responses, completely unaware of the underlying IB connection.
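The client side of that picture is tiny. Here’s a minimal sketch of the request-reply shape using pyzmq — the endpoint name, the JSON message fields, and the `inproc` transport (used so the example is self-contained in one process) are illustrative assumptions, not the actual service’s protocol:

```python
import zmq  # pyzmq: a socket library, not a broker — no separate process to run

ctx = zmq.Context.instance()

# Broker side: a single REP socket stands in for the process
# that owns the IB Gateway connection.
rep = ctx.socket(zmq.REP)
rep.bind("inproc://ib_broker")  # a real deployment would bind tcp://

# Strategy side: a REQ socket sends one request, then blocks for the reply.
req = ctx.socket(zmq.REQ)
req.connect("inproc://ib_broker")

# Hypothetical message shape for illustration only.
req.send_json({"action": "positions", "account": "U1234567"})

request = rep.recv_json()                          # broker sees one request...
rep.send_json({"status": "ok", "positions": []})   # ...and answers it
reply = req.recv_json()                            # strategy gets the response
```

A nice property of REQ/REP is that both sockets enforce strict send-then-receive alternation, so a strategy can’t fire a second request before the first is answered — the one-at-a-time discipline is baked into the socket type.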
After: one broker, one connection, all requests serialized through the queue
Inside the broker service, an asyncio.Queue serializes all incoming requests. Even if both strategies fire at the exact same moment, the queue ensures the gateway sees only one request at a time. The race condition is gone by design. Clean. Predictable. Easy to reason about.
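The serialization itself is only a few lines. Below is a runnable toy version of that queue discipline — `BrokerService`, the strategy names, and the simulated gateway call are stand-ins I made up for illustration; the real service would await actual ib_insync calls inside the worker:

```python
import asyncio

class BrokerService:
    """Toy broker loop: one worker drains an asyncio.Queue, so the
    (simulated) gateway only ever sees one request at a time."""

    def __init__(self):
        self.queue = asyncio.Queue()
        self.in_flight = 0
        self.max_in_flight = 0  # tracks overlap, to prove serialization

    async def _call_gateway(self, name):
        # Stand-in for a real IB API call.
        self.in_flight += 1
        self.max_in_flight = max(self.max_in_flight, self.in_flight)
        await asyncio.sleep(0.01)
        self.in_flight -= 1
        return f"{name}: ok"

    async def worker(self):
        # The single consumer: requests leave the queue strictly in order.
        while True:
            name, fut = await self.queue.get()
            fut.set_result(await self._call_gateway(name))
            self.queue.task_done()

    async def request(self, name):
        # Clients enqueue a request and await its future.
        fut = asyncio.get_running_loop().create_future()
        await self.queue.put((name, fut))
        return await fut

async def main():
    svc = BrokerService()
    worker = asyncio.create_task(svc.worker())
    # Two "strategies" fire at the exact same moment.
    results = await asyncio.gather(
        svc.request("trade_option_wheel"),
        svc.request("trade_cppi"),
    )
    worker.cancel()
    return results, svc.max_in_flight

results, max_in_flight = asyncio.run(main())
```

Even with both requests submitted concurrently, `max_in_flight` never exceeds 1 — the race condition is gone by construction, not by luck.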
The design was clean. Implementing it with Claude Code, though, would be a new kind of challenge for me.
2. My First Real Session With Claude Code
2.1 The Firehose
I’d used AI assistants before for quick questions, but this was my first time sitting down with Claude Code for a real engineering task. I wasn’t sure what to expect.
What I got was a firehose.
I described the architecture — the sub-account setup, the IB Gateway collision problem, the ZeroMQ broker idea — and Claude just… started producing code. Function after function, file after file, in a single response. Then the next response. Then the next. It felt almost endless. I’d ask a follow-up, get another wall of coherent, structured Python back. I remember thinking: this is genuinely different from anything I’ve used before.
To get more out of it, I installed a few skills on top of the base Claude Code setup: the superpowers plugin for structured workflows like Plan mode and subagent dispatch, a GitHub integration, and a handful of others from the official skill library. Each one felt like adding a new tool to the workbench. The whole thing started feeling less like a chatbot and more like a junior engineer who never gets tired and never complains about scope creep.
Within a few sessions, I had a working POC — the ZMQ broker service, the client wrappers, a basic test suite covering the main request flows. Things that would’ve taken me a week of solo work were done in an afternoon. I was genuinely impressed. Maybe a little too impressed.
Docker network overview — how the broker service sits between the strategies and IB Gateway
2.2 Then Reality Showed Up
The POC looked great. The tests passed. I felt good. Maybe too good.
Then I started fine-tuning the actual interaction with IB Gateway, and the beautiful lie started cracking.
A lot of what we’d built together was based on how ib_insync should behave — what the docs said, what the method names implied, what seemed reasonable. Claude would suggest an approach; I’d ask if it was right; Claude would confirm it. But neither of us had actually watched the API respond against a live gateway. The code was coherent. The assumptions underneath it were untested.
The hard truth is: unvalidated code is just speculation with good formatting. Claude can produce something that compiles, passes tests, and reads like it was written by someone who knew what they were doing — and still be completely wrong about how the actual system behaves. Without someone going in and verifying the real API responses against a live gateway, all that confidence is just noise. I found that out the hard way later on.
For the next few days, I was deep in it — printing logs, staring at outputs, debugging side by side with Claude, and rewriting acceptance criteria so Claude could actually understand what “correct behavior” meant for each edge case. The broker service got torn down and rebuilt more than once. Old assumptions got replaced with new ones, which then got replaced again.
By the end, things worked. But if I’m being honest about the time spent — the POC took an afternoon, and cleaning up after the POC took the rest of the week. Claude didn’t save me a week of work. It saved me maybe a day, and handed me a different kind of problem to solve instead.
3. What Claude Code Actually Did Well
After a debugging war story like that, it’s easy to walk away feeling like AI was mostly the problem. But that wouldn’t be honest. Looking back, there were three things Claude Code did that genuinely changed how fast I moved — and I think they’re worth naming specifically.
1. Plan mode for design-before-code discipline. Before touching the broker service’s account subscription logic, I asked Claude to write out the invariants first — what exactly does reqAccountUpdates(False) clear? Which cache is affected, and which isn’t? Writing those down in plain language, before changing a single line of code, prevented at least two wrong turns I can clearly trace back to that decision. The written invariants became the thing I’d check when a fix didn’t behave the way I expected.
2. Background tasks for parallel work. My test suite took about 300 seconds to run. Rather than sitting and waiting, I’d kick it off as a background task and spend that time reading Docker logs in the foreground. This sounds mundane — and it is — but those 5-minute chunks add up. What would’ve been dead time became useful time.
3. Log-driven iteration. Every time a bug was invisible from the outside, Claude’s suggestion was always the same: add a _log.info at the exact point of suspicion and rebuild. Not vague advice like “add some logging” — but “log raw_positions here, before the exception block, so we can see whether the data ever existed.” That kind of specificity made every debugging loop shorter.
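That kind of targeted logging is worth showing concretely. Here’s a hypothetical helper in the spirit of that advice — `build_positions`, the `raw_positions` payload, and the field names are invented for illustration, not taken from the actual service:

```python
import logging

logging.basicConfig(level=logging.INFO,
                    format="%(asctime)s %(name)s %(message)s")
_log = logging.getLogger("ib_broker_service")

def build_positions(raw_positions):
    # Log the raw payload *before* the exception-prone block, so a later
    # failure can't hide whether the data ever existed.
    _log.info("raw_positions=%r", raw_positions)
    try:
        return [p["symbol"] for p in raw_positions]
    except (KeyError, TypeError):
        # logging.exception records the traceback alongside the message.
        _log.exception("failed to parse raw_positions")
        return []
```

The point is placement, not volume: one log line at the exact point of suspicion tells you whether the bug is in the data or in the transformation, which is the distinction a generic “add some logging” never makes.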
None of these are magic. They’re good engineering habits. The true value isn’t that AI replaces judgment — it’s that AI shrinks the gaps between moments of judgment.
4. Final Words
Looking back at the whole experience — the architecture design, the build, the debugging war — I keep coming back to three things that I think are actually true about building with AI, at least for now.
1. You still have to be the domain expert.
I came into this project with the architecture already in my head. The message queue idea and the single-connection invariant — all of that came from me, before I opened Claude Code. That was domain knowledge doing its job. Claude helped me build the thing efficiently. It didn’t design it.
And when the debugging got hard, the breakthroughs came from knowing things that weren’t in the code — like IB API’s single-account subscription limit. Claude could have read every line of ib_broker_service.py and never surfaced that. It’s the kind of knowledge you earn from the documentation and from past painful bugs. AI accelerates execution. It doesn’t fill that gap.
2. The danger isn’t AI being wrong. It’s AI being wrong confidently.
The bugs that hurt the most in this project weren’t the ones that threw errors and screamed in my face. They were the ones that succeeded quietly — startup logs that looked correct, a method that seemed to exist, a status=ok response that hid a silent internal failure.
AI fails the same way — not by saying “I’m not sure,” but by saying something specific and plausible that just happens to be wrong. The answer looks right. The formatting looks right. And if you’re moving fast, you accept it and move on. The bug you didn’t know you introduced won’t show up until it’s the worst possible time. The only real fix is to know what you’re asking for well enough to recognize when the answer is off.
3. The right mental model: an extremely fast junior who needs supervision.
Think of AI as a developer who reads code fast, implements what you tell it to implement, and rarely pushes back — but needs you in the loop to catch when it’s heading somewhere wrong. Left unsupervised, it will confidently walk down the wrong path and hand you something that looks finished.
That’s what I found Claude Code to be. Fast, capable, occasionally wrong in ways that were my job to catch. I don’t mean that as a complaint — it’s just an accurate description. Once I stopped expecting it to be more than that, I got a lot more out of it.
51/51 tests passing. No more “account updates request timed out.” Two strategies, one gateway, zero conflicts.
I won’t pretend that number didn’t feel good after three days of logs lying to my face. But I also won’t pretend Claude did that alone — or that I could’ve done it without knowing what I was looking for. Both of those things are true at the same time. That tension, the “it helped but it wasn’t enough on its own,” is probably the most honest thing I can say about working with AI right now. And I suspect it’ll stay true for a while.
If you’re running multiple IBKR strategies and hitting the same connection headaches, I’d love to know — drop a comment or reach out.