
GLOBAL SCYTHE gRPC Stream & Latent Swarm Decomposition

There is a point in an intelligence platform's evolution where the bottleneck stops being the data pipeline and becomes the language the system uses to think about data. With this release, SCYTHE hit that point.

Previous versions of SCYTHE moved data over HTTP REST and WebSocket frames. It worked. But it was always a translation problem: produce JSON, ship JSON, parse JSON, render JSON. Every hop was another opportunity to lose structure, lose timing, lose trust. And in a system designed to reason about adversarial network behavior, structural loss is a threat surface.

This release changes the data plane from the ground up.


The Problem We Were Actually Solving

By Stage 8, SCYTHE could detect clusters, narrate behavior, simulate activation cascades, and render live RF volumetric fields on a Cesium globe. That’s not a small surface area.

But three things were grinding against each other:

  1. JSON overhead on high-frequency voxel streams. The RF field processor was packing 32³ float32 arrays into JSON lists and shipping them over WebSocket. At 64³ resolution, that’s 524,288 bytes of numeric text that had to be parsed on the other end before a single shader could touch it. We were burning cycles on syntax, not signal.
  2. Token authentication was brittle at WebSocket reconnect. URL-embedded tokens (?token=abc123) broke on orchestrator restart, failed on Private Network Access preflight checks, and provided no isolation between instances sharing the same relay. The auth model was bolted on at the protocol boundary — the worst place for security.
  3. “Quiet” clusters were invisible. A cluster of 694 nodes sitting dormant showed up as a static counter. The system had no vocabulary for what coordinated inactivity means — no density decomposition, no temporal ghosting, no probabilistic intent layer. A staging infrastructure in standby looked identical to an abandoned one.

The Latent Swarm Decomposition Engine

Before we changed the wire protocol, we changed how the system thinks about clusters.

The insight: a node count is a compressed event, not a measurement. 694 nodes sitting quiet isn’t the absence of signal. It’s a coiled spring with a signature.

We extended the cluster intelligence layer with what we’re calling the Latent Swarm Decomposition Engine — a multi-layer autopsy that fires when a cluster is examined rather than just enumerated.

Dimensional Density (not a counter)

Instead of reporting:

Nodes: 694

The system now computes:

Spatial Density:     0.82   (highly co-located)
Temporal Activity:   0.03   (dormant)
ASN Entropy:         1.74   (multi-operator blend)
Signal Coherence:    0.61   (shared infra patterns)

Low activity + high density = staging posture. Mixed ASN + coherence = coordination layer. These aren’t labels applied after the fact — they’re derived from the same cluster graph the system already maintains.
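
A minimal sketch of how those four numbers can fall out of the cluster graph, assuming each node record carries a position, a last-seen timestamp, an ASN, and a per-node coherence score; the field names and formulas are illustrative, not the exact cluster_swarm_engine.py implementation:

import math
import time
from collections import Counter

def density_profile(nodes: list[dict]) -> dict:
    """Derive the four density metrics from a cluster's node records."""
    n = len(nodes)
    # Spatial density: tighter bounding boxes score closer to 1.0
    lat_span = max(nd["lat"] for nd in nodes) - min(nd["lat"] for nd in nodes)
    lon_span = max(nd["lon"] for nd in nodes) - min(nd["lon"] for nd in nodes)
    spatial = 1.0 / (1.0 + lat_span * lon_span)
    # Temporal activity: fraction of nodes seen within the last hour
    now = time.time()
    temporal = sum(1 for nd in nodes if now - nd["last_seen"] < 3600) / n
    # ASN entropy: Shannon entropy (nats) over ASN membership
    counts = Counter(nd["asn"] for nd in nodes)
    asn_entropy = -sum((c / n) * math.log(c / n) for c in counts.values())
    # Signal coherence: mean of per-node coherence scores
    coherence = sum(nd.get("coherence", 0.0) for nd in nodes) / n
    return {"spatial_density": spatial, "temporal_activity": temporal,
            "asn_entropy": asn_entropy, "signal_coherence": coherence}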

ASN Stratification

Each cluster decomposes into its ASN membership:

AS6849  → 312 nodes  (core infrastructure)
AS16509 → 148 nodes  (AWS bleed-in)
AS13335 →  97 nodes  (Cloudflare masking layer)
+ 3 minor ASNs

The interpretation engine generates narrative automatically: “Core infrastructure anchored in AS6849 with cloud spillover. Pattern suggests hybrid hosting + traffic obfuscation.” That text isn’t written by a human — it’s derived from the ASN stratification logic in cluster_swarm_engine.py, which runs a 3-tier IP→ASN resolution pipeline (pyasn radix, MaxMind ASN, MaxMind City) with a 10K-entry LRU cache.
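
For reference, a minimal sketch of what a 3-tier resolution chain with a 10K-entry LRU cache can look like; the database paths and fallback details are illustrative rather than lifted from cluster_swarm_engine.py:

from functools import lru_cache
import pyasn
import geoip2.database
import geoip2.errors

_radix   = pyasn.pyasn("ipasn.dat")                      # tier 1: pyasn radix lookup
_mm_asn  = geoip2.database.Reader("GeoLite2-ASN.mmdb")   # tier 2: MaxMind ASN
_mm_city = geoip2.database.Reader("GeoLite2-City.mmdb")  # tier 3: MaxMind City (geo enrichment)

@lru_cache(maxsize=10_000)
def resolve_asn(ip: str) -> int | None:
    asn, _prefix = _radix.lookup(ip)                     # fast path: local radix tree
    if asn is not None:
        return asn
    try:
        return _mm_asn.asn(ip).autonomous_system_number  # fallback: MaxMind ASN database
    except geoip2.errors.AddressNotFoundError:
        return None                                      # city tier adds geolocation, not ASN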

Temporal Ghosting

Even a “quiet” cluster has temporal structure. The system now surfaces:

02:14 UTC → microburst (11 nodes)
07:52 UTC → sync jitter across 200 nodes
13:08 UTC → ASN route shift

This reframes silence as below-threshold coordination rather than absence. The cluster isn’t gone — it’s between moves.

Silence Pressure Metric

Silence Pressure = Node Count × Inactivity Duration × Coherence

A cluster with high node count, sustained inactivity, and high phase coherence has elevated silence pressure — meaning coordinated activation is more probable, not less. The counter became an intelligence surface.
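
In code this is a one-liner; the example below plugs in the 694-node cluster with an assumed 36-hour quiet window and the 0.61 coherence figure from above (the time unit and any normalisation are illustrative):

def silence_pressure(node_count: int, inactive_hours: float, coherence: float) -> float:
    # Node Count × Inactivity Duration × Coherence, exactly as defined above
    return node_count * inactive_hours * coherence

silence_pressure(694, 36.0, 0.61)   # -> 15240.24, a relative, unitless pressure score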

Probabilistic Intent Engine

Latent Intent Probabilities:
  Staging Infrastructure:  61%
  Traffic Relay Mesh:      22%
  Abandoned / Decaying:     9%
  Unknown:                  8%

These scores are computed from six independent basis vectors in narrate_cluster(): node concentration, ASN diversity, phase coherence, behavioral type match, temporal activity, and RF fraction. They aren’t hard-coded — they’re a weighted inference from the live cluster state.
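
A sketch of the shape of that inference: score each intent as a weighted combination of the six basis vectors, clamp, and normalise into probabilities. The weights and biases below are invented for illustration; the real values live in narrate_cluster():

# Basis vectors are assumed to be pre-scaled to [0, 1].
WEIGHTS = {
    # intent: (bias, {basis vector: weight, ...}) -- illustrative values only
    "Staging Infrastructure": (0.0, {"node_concentration": 0.35, "phase_coherence": 0.30,
                                     "temporal_activity": -0.25}),
    "Traffic Relay Mesh":     (0.0, {"asn_diversity": 0.40, "rf_fraction": 0.25,
                                     "phase_coherence": 0.20}),
    "Abandoned / Decaying":   (0.4, {"temporal_activity": -0.35, "phase_coherence": -0.20}),
}

def intent_probabilities(features: dict) -> dict:
    raw = {label: max(0.0, bias + sum(w * features.get(k, 0.0) for k, w in row.items()))
           for label, (bias, row) in WEIGHTS.items()}
    raw["Unknown"] = 0.1                      # reserve probability mass for unexplained behaviour
    total = sum(raw.values()) or 1.0
    return {label: round(score / total, 3) for label, score in raw.items()}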


The Architecture Shift: gRPC + Protobuf Data Plane

With the cluster intelligence layer upgraded, we turned to the wire protocol problem.

What changed

The SCYTHE stack now has two clearly separated planes:

Control Plane   → port 8765  (WebSocket, orchestrator commands, operator auth)
Data Plane      → port 50051 (gRPC + Protobuf, high-throughput binary streams)
Legacy Fallback → port 8766  (rf_voxel_processor HTTP/WS, unchanged)

We extended scythe.proto with a new ScytheStreamService alongside the existing OrchestratorService, HypergraphService, and ClusterIntelService:

service ScytheStreamService {
  rpc StreamClusters(StreamRequest) returns (stream StreamCluster);
  rpc StreamRFField(LodHint)        returns (stream RFField);
  rpc StreamDeltas(StreamRequest)   returns (stream StreamDelta);
}

Each message is Protobuf-encoded. RFField.voxels is raw bytes — the float32 grid lands directly in the GPU texture upload path with no parsing step.
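
A minimal Python client sketch for the cluster stream, assuming the stubs generated from scythe.proto; the StreamRequest carries the instance_id discussed under instance-bound auth below, and the address and id values are placeholders:

import grpc
import scythe_pb2, scythe_pb2_grpc

channel = grpc.insecure_channel("localhost:50051")         # data plane port
stub = scythe_pb2_grpc.ScytheStreamServiceStub(channel)

request = scythe_pb2.StreamRequest(instance_id="inst-01")  # placeholder id
for cluster in stub.StreamClusters(request):               # server-streaming RPC
    print(cluster)                                         # one StreamCluster per update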

GPU-Native RF Field Synthesis

The voxel field computation moved from a CPU numpy loop to a CUDA-accelerated inverse-distance weighting kernel:

import torch

# Vectorised on a CUDA meshgrid — no Python loop over voxels
# Assumed inputs: S is the grid resolution per axis (16/32/64 per LOD),
# pos_t is an [N, 3] tensor of emitter positions in normalised grid space,
# int_t is an [N] tensor of emitter intensities.
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
grid = torch.stack(torch.meshgrid(
    torch.linspace(-1, 1, S, device=device),
    torch.linspace(-1, 1, S, device=device),
    torch.linspace(-1, 1, S, device=device),
    indexing='ij'
), dim=-1)  # [S, S, S, 3]

field = torch.zeros((S, S, S), device=device)
for pos, intensity in zip(pos_t, int_t):
    diff = grid - pos.view(1, 1, 1, 3)     # per-voxel offset from this emitter
    dist2 = (diff * diff).sum(-1)          # squared distance, shape [S, S, S]
    field += intensity / (dist2 + 0.01)    # inverse-distance weighting with epsilon

The result is normalised, packed as [sx:u16 LE][sy:u16 LE][sz:u16 LE][float32 LE…], and cached with an ISO-8601 timestamp header (X-Field-Timestamp). The gRPC StreamRFField servicer deduplicates by that header — frames are only forwarded when the field actually changed.
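
Unpacking that layout on the receiving side takes a few lines, assuming the dimension header precedes the float payload exactly as described:

import struct
import numpy as np

def unpack_field(buf: bytes) -> np.ndarray:
    # Header: three little-endian uint16 dimensions, then float32 LE voxel data
    sx, sy, sz = struct.unpack_from("<HHH", buf, 0)
    voxels = np.frombuffer(buf, dtype="<f4", offset=6, count=sx * sy * sz)
    return voxels.reshape(sx, sy, sz)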

LOD Demand Signaling

The client sends camera altitude in the LodHint message; the server selects:

>50 km  → LOD 0  (16³,  200ms poll)
>10 km  → LOD 1  (32³,  500ms poll)
 ≤10 km → LOD 2  (64³, 1000ms poll)

LOD downsampling uses scipy.ndimage.zoom with order=1. The client gets resolution appropriate to its viewing distance — no wasted bandwidth computing 64³ fields for a globe-altitude view.
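
The selection and resample step is small; a sketch of what it might look like, with thresholds taken from the table above and helper names assumed:

import numpy as np
from scipy.ndimage import zoom

def lod_for_altitude(altitude_km: float) -> tuple[int, int]:
    # Returns (grid size per axis, poll interval in ms) per the table above
    if altitude_km > 50:
        return 16, 200
    if altitude_km > 10:
        return 32, 500
    return 64, 1000

def downsample(field_native: np.ndarray, target: int) -> np.ndarray:
    # Order-1 (trilinear) resample from the native grid down to the target LOD
    factor = target / field_native.shape[0]
    return zoom(field_native, factor, order=1)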

Instance-Bound Auth on Streaming RPCs

A subtle but important hardening: every streaming RPC that touches instance-specific data now carries instance_id in its request message and enforces it via _check_instance_auth(). The old design used Empty request types for StreamClusters and StreamDeltas — meaning a session bound to one instance could see data from all instances. That’s now closed.
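
A sketch of what that guard can look like on a streaming RPC, assuming the session token arrives in the call metadata and a session table maps tokens to their bound instance; the store and field names are illustrative:

import grpc

SESSIONS: dict = {}   # token -> {"instance_id": ...}; populated by the control plane (illustrative)

def _check_instance_auth(context: grpc.ServicerContext, instance_id: str) -> None:
    meta = dict(context.invocation_metadata())
    session = SESSIONS.get(meta.get("authorization"))
    if session is None or session["instance_id"] != instance_id:
        context.abort(grpc.StatusCode.PERMISSION_DENIED,
                      "session not bound to requested instance")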


The deck.gl Arbitration Layer

The rf-arbitration-layer.js brings volumetric RF rendering to the browser without a WebGPU dependency. It uses a WebGL2 TEXTURE_3D path with a custom 64-step raymarching shader:

// Heat-map colour ramp: blue → cyan → green → yellow → red
vec3 heatmap(float t) {
    if (t < 0.25) return mix(vec3(0.0,0.0,1.0), vec3(0.0,1.0,1.0), t*4.0);
    if (t < 0.50) return mix(vec3(0.0,1.0,1.0), vec3(0.0,1.0,0.0), (t-0.25)*4.0);
    if (t < 0.75) return mix(vec3(0.0,1.0,0.0), vec3(1.0,1.0,0.0), (t-0.50)*4.0);
    return            mix(vec3(1.0,1.0,0.0), vec3(1.0,0.0,0.0), (t-0.75)*4.0);
}

The computeArbitration() kernel runs CPU-side before the first render:

signal × 0.6 + coherence × 0.3 + (1 − diversity) × 0.1

This gives each voxel an arbitration weight — not just “how strong is the signal here” but “how contested is this space.” The computeDominance() function computes per-voxel difference between two competing RF fields, enabling multi-cluster interference visualization.
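
For clarity, the same weighting restated in NumPy (the shipped kernels are JavaScript in rf-arbitration-layer.js; array names here are illustrative):

import numpy as np

def arbitration(signal: np.ndarray, coherence: np.ndarray, diversity: np.ndarray) -> np.ndarray:
    # Per-voxel arbitration weight: strong, coherent, low-diversity space is "owned"
    return signal * 0.6 + coherence * 0.3 + (1.0 - diversity) * 0.1

def dominance(field_a: np.ndarray, field_b: np.ndarray) -> np.ndarray:
    # Per-voxel difference between two competing RF fields (cf. computeDominance)
    return field_a - field_b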

Cesium camera sync keeps the deck.gl overlay locked to the globe view via camera.changed and camera.moveEnd events.


What We Caught Before It Shipped

Four design reviews ran in parallel during implementation and caught the following issues before the code landed:

Intent scores iteration bug. cluster_swarm_engine.py returns intent_scores as a list of {label, basis, score} dicts. The gRPC DecomposeCluster handler was calling .items() on it — a silent AttributeError at runtime. Fixed to proper list iteration.

Empty request types bypass instance auth. StreamClusters(Empty) and StreamDeltas(Empty) had no instance_id field, so _check_instance_auth() couldn’t be called. Both changed to StreamClusters(StreamRequest) / StreamDeltas(StreamRequest). Sessions are now properly scoped.

/api/gpu-field unauthenticated compute endpoint. The docstring said “protected by X-Internal-Token” but the implementation had no check. Fixed to validate against SCYTHE_INTERNAL_TOKEN env var when set.

Wrong data source for cluster streams. StreamClusters was fetching raw nodes from /api/gravity/nodes and grouping them manually — bypassing the intelligence layer entirely. Changed to /api/clusters/intel, which returns pre-narrated cluster summaries with threat scores, ASN diversity, and phase coherence already computed.


VoxelStreamHub Cluster Publishers

voxel_stream_engine.py already defined CH_CLUSTER_NODES (0x02) and CH_CLUSTER_DELTA (0x03) constants and a pack_nodes() serialiser — but the VoxelStreamHub had no publish methods for them. Only publish_rf_field() existed.

Two new methods complete the channel matrix:

async def publish_cluster_nodes(self, nodes: list) -> None:
    payload = pack_nodes(nodes)
    await self.publish(CH_CLUSTER_NODES, payload)

async def publish_cluster_delta(self, node: dict) -> None:
    payload = pack_nodes([node])
    await self.publish(CH_CLUSTER_DELTA, payload)

These are the hooks that will let cluster_swarm_engine.py push live cluster updates into the streaming pipeline — replacing the current REST-poll architecture with a true publish/subscribe model.
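
A hedged example of how cluster_swarm_engine.py could drive those hooks once wired up; the hub handle and the engine accessors are illustrative:

async def stream_cluster_updates(hub, engine) -> None:
    nodes = engine.current_cluster_nodes()       # hypothetical accessor
    await hub.publish_cluster_nodes(nodes)       # full snapshot on CH_CLUSTER_NODES
    async for node in engine.node_updates():     # hypothetical async generator of deltas
        await hub.publish_cluster_delta(node)    # incremental CH_CLUSTER_DELTA frames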


Starting the System

The orchestrator startup command is unchanged:

python3 scythe_orchestrator.py \
  --host 0.0.0.0 \
  --port 5001 \
  --ollama-url http://192.168.1.185:11434 \
  --stream-relay-url ws://192.168.1.185:8765/ws \
  --mcp-ws-url ws://192.168.1.185:8766/ws

The gRPC server now launches automatically as a managed subprocess on port 50051. No additional flag required. To customise the gRPC port:

  --grpc-port 50051   # optional — 50051 is the default

To disable the gRPC server entirely (e.g., for a lightweight dev session):

  --no-grpc

The --stream-relay-url and --mcp-ws-url flags continue to configure the WebSocket control plane and the legacy voxel API — both still present and fully operational. The gRPC data plane adds capability; it doesn’t replace the existing WS layer.


What This Platform Is Becoming

Most network intelligence systems are built around the same primitive: poll an API, parse JSON, render a number. SCYTHE’s architecture is increasingly built around a different primitive: a cluster is a live field, and fields have physics.

A cluster doesn’t sit in a database cell. It has density, entropy, coherence, and silence pressure. It occupies space in a 3D RF field that can be raymarched and rendered. It has a temporal ghost — a record of micro-events that reframes what “quiet” means. It has an activation cascade that can be simulated before it happens.

The gRPC data plane, the GPU-native field synthesis, and the deck.gl arbitration layer are all in service of that one idea: intelligence isn’t a number you report. It’s a surface you explore.

SCYTHE is developed as an open-source RF intelligence and network visualization platform. The stack runs on Python (FastAPI, Flask, gRPC), PyTorch CUDA, CesiumJS, Three.js, and deck.gl. All intelligence narration is generated by local LLM inference — no external API calls.
