Two Bugs, One Symptom
Source: Dev.to
Background
A debugging war story from implementing an SSE client transport in the Raku MCP SDK.
The task seemed straightforward: add legacy SSE transport to the SDK. The server side went smoothly—Cro makes it easy to push text/event-stream responses. The client side, however, destroyed an afternoon.
Symptom
is-connected stays False forever. No error, no exception, no timeout message—just nothing happens.
Initial Attempts
We tried several approaches, in roughly this order:
- start { await $client.get(...) } – GET takes 5–10 seconds to resolve
- $client.get(...).then(-> $p { ... }) – .then callback also delayed
- react { whenever $resp.body-byte-stream } – whenever doesn’t fire
- Supply.tap(...) – tap callback delayed
- RAKUDO_MAX_THREADS=128 – no help
Each approach worked fine in isolation but failed when a Cro HTTP server was running in the same process.
Root Cause 1: Thread‑Pool Starvation
Raku’s start blocks, .then callbacks, and react/whenever all share a single ThreadPoolScheduler. Cro uses the same primitives. When a Cro server holds open long‑lived SSE streams (Supply pipelines inside whenever blocks) and a Cro client in the same process needs scheduler slots to resolve its HTTP response pipeline, they compete for the same pool. Neither side is doing anything wrong; the starvation is emergent.
Debug output
SSE-CLIENT: before get
connected=False
connected=False
connected=False
SSE-CLIENT: after get, status=200
The GET resolves, but 10 seconds too late—after the test’s polling loop has already given up.
Fix: Escape the Shared Pool
Thread.start creates a real OS thread outside Raku’s scheduler. However, await doesn’t work inside Thread.start; it silently returns Nil. The solution is to use .result, which synchronously waits on a Promise outside the scheduler.
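method !connect-sse() {
    # Capture what the thread needs; the block below runs on its own OS thread.
    my $self := self;
    my $url  := $!url;
    Thread.start: {
        my $client = (require ::('Cro::HTTP::Client')).new;
        # .result blocks this dedicated thread until the response arrives.
        my $resp = $client.get($url,
            headers => [Accept => 'text/event-stream']).result;
        react whenever $resp.body-byte-stream -> $chunk {
            $self.handle-sse-chunk($chunk);
        }
        # Swallows every exception; log here instead while debugging.
        CATCH { default {} }
    }
}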
After this change, the connection is established, data flows, and chunks arrive.
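The same pattern can be tried without Cro. The following is a minimal sketch, not the SDK's code: the Promise named $pending is a hypothetical stand-in for the HTTP request, and the Channel is just one convenient way to hand the value back to the main thread.

```raku
# Channel used to pass the result from the dedicated thread to the caller.
my $results = Channel.new;

# Hypothetical stand-in for $client.get(...): a Promise kept a little later.
my $pending = Promise.in(0.2).then({ 'status=200' });

my $thread = Thread.start: {
    # .result blocks this OS thread (not a pool worker) until the Promise is kept.
    $results.send($pending.result);
    $results.close;
};

# Join the thread, then read what it produced.
$thread.finish;
my $status = $results.receive;
say $status;
```

Note that the Promise itself is still kept by the scheduler; only the waiting moves onto the dedicated thread.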
Root Cause 2: Regex Space Handling in the SSE Parser
The SSE parser receives a line like:
event: endpoint
data: http://...
It splits each line on :, getting field "event" and value " endpoint". According to the SSE spec, a single leading space after the colon should be stripped. The code attempted this:
$value = $value.subst(/^ /, '') if $value.defined;
It looks correct, but in Raku regexes whitespace is insignificant by default. The pattern /^ / actually means “anchor to start of string” (^) followed by insignificant whitespace, not a literal space. Thus subst matches a zero‑width position at index 0, replaces nothing, and returns the original string unchanged. The event type remains " endpoint" (with a leading space), causing the check $!sse-event-type eq 'endpoint' to fail and the POST endpoint never to be set.
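The zero-width match is easy to reproduce in isolation. This short sketch uses an illustrative $value and also shows one regex-based alternative, quoting the space so that it becomes a literal character in the pattern:

```raku
my $value = ' endpoint';

# Unquoted whitespace in the pattern is ignored, so this is only the anchor ^.
my $m = $value ~~ /^ /;
say $m.chars;                           # number of characters the match consumed

# The attempted strip: the empty match is replaced, the string is unchanged.
say $value.subst(/^ /, '').raku;

# Quoting the space makes it a literal; this variant removes the leading space.
say $value.subst(/^ ' '/, '').raku;
```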
Debug output after fixing placement
HANDLE-CHUNK: empty line, event-type=[ endpoint] data=[ http://127.0.0.1:39652/message]
The leading space in [ endpoint] is the entire bug.
Fix: Strip the Space Explicitly
Avoid the regex entirely:
$value = $value.substr(1) if $value.defined && $value.starts-with(' ');
Now the event type is correctly recognized as "endpoint".
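Put together, the field handling might look like the sketch below. The helper name parse-sse-line and its Pair return value are assumptions made for illustration, not the SDK's actual parser. It splits on the first colon only, so a value such as a URL keeps its own colons, and it strips at most one leading space from the value.

```raku
# Hypothetical helper: turns one SSE line into a field => value Pair.
sub parse-sse-line(Str $line) {
    # Blank lines and comment lines (starting with ':') carry no field.
    return Nil if $line eq '' || $line.starts-with(':');

    my $idx = $line.index(':');
    # No colon at all: the whole line is the field name, the value is empty.
    return ($line => '') without $idx;

    my $field = $line.substr(0, $idx);
    my $value = $line.substr($idx + 1);
    # Strip exactly one leading space, using plain string methods.
    $value = $value.substr(1) if $value.starts-with(' ');
    return ($field => $value);
}

say parse-sse-line('event: endpoint').raku;
say parse-sse-line('data: http://127.0.0.1:39652/message').raku;
```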
Interaction Between the Two Bugs
Bug 1 prevented data from arriving in time, so Bug 2 was invisible. Once Bug 1 was fixed, the symptom (is-connected staying False) persisted due to Bug 2. The system never entered a partially working state; it moved directly from “broken for reason A” to “broken for reason B” with no observable change in behavior.
Takeaways
- Whitespace in Raku regexes is insignificant by default. /^ / does not match a literal space; it matches only the start of the string. This can silently introduce bugs when stripping leading spaces.
- Thread‑pool starvation can arise when a server and client share the same ThreadPoolScheduler in a single process. Using Thread.start with .result sidesteps the issue.
- Production deployments, where server and client run in separate processes, are unaffected; running both in the same process, as tests do, can expose emergent scheduling problems.
- Defensive coding (e.g., explicit string manipulation instead of regexes for simple tasks) can avoid subtle pitfalls.
Thanks to @lizmat for motivating this post.