'target sensor throughput (how many sensors, typical sample rates, and whether uplink is LAN vs WAN)' > Local development is where we are now, with the intended production environment being a hyperscaler/datacenter plus WAN edge devices seeking optimized battery utilization (available bandwidth is unknown and depends on network conditions; I'm expecting/hoping for many MB/s). There may also be times when an ESP32 on a remote Pacific island can only muster kbps of ingress (e.g., tsunami early warning).

Perfect — that’s *exactly* the environment spectrum where your “events + artifacts” architecture shines: you can run the same logical LPI pipeline everywhere, but **change the emission profile** based on bandwidth, power, and mission urgency.

Below is a practical, “operator-grade” design that handles:

* local dev (fat pipes, iterate fast)
* hyperscaler/datacenter (heavy compute + long retention)
* WAN edge devices with variable link quality (MB/s sometimes, kbps sometimes)
* ultra-constrained edge (think remote microcontroller / Espressif Systems class) where you may only manage kbps.

---

## The core principle: split everything into **Events** and **Artifacts**

To avoid spaghetti and scale cleanly:

### Events (always small, always replayable)

These are your `/api/sensors/<id>/activity` posts:

* stage notifications (`iq_window_received`, `lpi_candidate_detected`, `waveform_classified`, `association_updated`)
* **compact summaries** (metrics, confidences, estimated params)
* **pointers + hashes** to artifacts

### Artifacts (heavy, optional, store-and-forward)

* raw IQ windows (or compressed IQ)
* TF tiles (STFT/PWVD heatmaps)
* SCF/CAF, HOS matrices
* optional model embeddings

You store artifacts locally (edge) and/or centrally (datacenter). Events just point to them.

This lets you be “collect them all” without trying to push gigabytes through a straw.
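
For concreteness, a candidate event in this shape might look like the sketch below (field names are illustrative, not a fixed schema):

```python
# Hypothetical event body: small enough to always send, with the heavy
# artifact referenced by content hash rather than embedded.
event = {
    "kind": "lpi_candidate_detected",
    "sensor_id": "edge-07",
    "ts": "2025-01-01T00:00:00Z",
    "summary": {"peak_db": 12.4, "occupied_bw_hz": 25_000, "confidence": 0.81},
    "artifact_ptrs": [{
        "type": "iq_window",
        "iq_hash": "sha256:3f9a...",                 # integrity check + dedup key
        "uri": "file:///var/sensors/iq/3f9a.cs16",   # local until pulled/uploaded
        "uploaded": False,
    }],
}
```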

---

## Why hybrid is the default (battery math)

On battery devices, transmitting bytes usually costs more energy than computing a few FFTs—especially when the uplink is flaky. So you **compute just enough locally to reduce transmissions**, then upload raw data only when it’s worth it.
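
A back-of-envelope version of that trade, purely as a sketch; both energy constants below are made-up placeholders you would replace with measurements from your actual radio and CPU:

```python
# All constants are placeholders; measure them on your hardware.
RADIO_J_PER_BYTE = 2e-6    # energy to transmit one byte (made up)
CPU_J_PER_FFT = 5e-5       # energy for one 1024-point FFT (made up)

def summarize_locally(raw_bytes: int, summary_bytes: int, n_ffts: int) -> bool:
    """True when computing a local summary costs less energy than shipping raw IQ."""
    tx_raw = raw_bytes * RADIO_J_PER_BYTE
    compute_and_tx = n_ffts * CPU_J_PER_FFT + summary_bytes * RADIO_J_PER_BYTE
    return compute_and_tx < tx_raw

# 0.2 s of 2.4 Msps cs16 (~1.92 MB) vs a ~300-byte summary from ~470 FFTs:
print(summarize_locally(1_920_000, 300, 470))   # True, by ~2 orders of magnitude
```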

---

## Define 4 runtime emission profiles

Don’t hardcode “server-side” vs “sensor-side.” Pick one of these profiles dynamically per sensor based on:

* measured uplink throughput / loss
* battery %
* temperature / CPU budget
* mission posture (routine vs urgent)

### Profile 0 — Local dev “everything on”

**Use when:** localhost, LAN, iteration speed

Emit:

* all events for all stages

Upload artifacts:

* raw IQ frequently
* TF tiles at high cadence

Good for validating correctness, UI, and replay.

---

### Profile 1 — Datacenter / hyperscaler “central brain”

**Use when:** sensors have decent uplink or you’re ingesting from a nearby gateway

Edge:

* ships IQ windows and/or coarse channelized IQ
* light gating (avoid total flood)

Server:

* computes heavy transforms (SCF/HOS/WVD), classification, association/tracking
* stores artifacts in durable object storage
* emits final “high value” events

This is where you run the expensive “Pace-style receiver stack” at scale, but only after gating.

---

### Profile 2 — WAN edge “smart sensor”

**Use when:** bandwidth is usually MB/s but can degrade

Edge:

* always computes **STFT tiles + noise floor + candidate gating**
* emits candidates/classifications quickly
* uploads raw IQ **only on triggers** or periodic calibration windows

Server:

* does deeper classification/association when raw IQ arrives
* otherwise trusts edge summaries

This profile gives you good UX and good battery characteristics.

---

### Profile 3 — Ultra-constrained “kbps mode”

**Use when:** remote island, disaster monitoring, intermittent link, maybe only kbps

Edge:

* does **very cheap detection**

  * coarse FFT / Goertzel tones / band energy
  * simple chirp/hop heuristics (if feasible)
* stores raw IQ locally (ring buffer) but **does not upload it**
* emits only:

  * heartbeats
  * compact detections (a few hundred bytes)
  * *optional* 1–2 small “thumbnail” artifacts (tiny TF slice) if link allows

Server:

* updates mission state + alerts + tasks
* requests “artifact pull” only when a gateway link improves

This preserves “awareness” without trying to brute-force the uplink.
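
For the "very cheap detection" bullet above, a Goertzel filter is about as light as tone detection gets; here is a minimal sketch (Python for readability; you'd port the inner loop to C for an actual ESP32):

```python
import math

def goertzel_power(samples, sample_rate, target_hz):
    """Power of a single DFT bin near target_hz via the Goertzel recurrence.

    O(N), buffer-free, a couple of multiplies per sample, which is why it
    fits kbps-profile microcontrollers. Assumes real-valued input samples.
    """
    n = len(samples)
    k = round(n * target_hz / sample_rate)        # nearest bin index
    coeff = 2.0 * math.cos(2.0 * math.pi * k / n)
    s1 = s2 = 0.0
    for x in samples:
        s1, s2 = x + coeff * s1 - s2, s1
    return s1 * s1 + s2 * s2 - coeff * s1 * s2
```

Run it over a handful of critical channels, compare against a tracked noise floor, and emit a detection only on a sustained delta.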

---

## Concrete bandwidth sizing (so your defaults don’t lie)

Raw IQ size is brutal:

**Bytes/sec ≈ sample_rate × bytes_per_complex_sample**

* cs16 IQ (I int16 + Q int16) → 4 bytes/sample
* 2.4 Msps → ~9.6 MB/s
* even 0.2 seconds → ~1.92 MB

That’s why you only upload raw IQ:

* on triggers
* or in short bursts
* or via gateway aggregation

TF tiles compress way better:

* e.g., 256 time frames × 1024 bins × 2 bytes (float16) ≈ 0.5 MB (often compressible further)
* but you don’t need them continuously
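
The same arithmetic as a tiny helper, so profile defaults can be derived rather than hardcoded (a sketch; names are illustrative):

```python
def iq_bytes_per_sec(sample_rate_hz: float, bytes_per_complex: int = 4) -> float:
    """Raw IQ throughput; cs16 (int16 I + int16 Q) is 4 bytes per sample."""
    return sample_rate_hz * bytes_per_complex

def tf_tile_bytes(frames: int = 256, bins: int = 1024, bytes_per_value: int = 2) -> int:
    """Uncompressed TF tile size (float16 values)."""
    return frames * bins * bytes_per_value

print(iq_bytes_per_sec(2.4e6) / 1e6)          # 9.6 MB/s of raw cs16 IQ
print(iq_bytes_per_sec(2.4e6) * 0.2 / 1e6)    # 1.92 MB for a 0.2 s window
print(tf_tile_bytes() / 1e6)                  # ~0.52 MB per tile, pre-compression
```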

---

## Dynamic adaptation: choose profile per sensor automatically

Have each sensor (or gateway) compute a rolling link estimate:

* `uplink_kbps` (EWMA)
* `loss_rate` / retry rate
* `battery_pct`
* `cpu_load`

Then select profile:

* if `uplink_kbps < 50` or `battery_pct < 15` → Profile 3
* else if `uplink_kbps < 1000` → Profile 2 (send summaries + triggered IQ)
* else → Profile 1/0 depending on environment

Emit a periodic event:
`kind: "sensor_profile_updated"` with the chosen profile and budgets.

That makes mission parameters “defined in part by sensor activity” in a very literal way.
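
A minimal sketch of that selection logic, using the thresholds above (function and field names are illustrative):

```python
def select_profile(uplink_kbps: float, battery_pct: float,
                   environment: str = "edge") -> int:
    """Map rolling link/battery estimates to an emission profile (0-3)."""
    if uplink_kbps < 50 or battery_pct < 15:
        return 3                      # kbps mode: events only, artifacts stay local
    if uplink_kbps < 1000:
        return 2                      # smart sensor: summaries + triggered IQ
    return 0 if environment == "dev" else 1   # fat pipe: local dev or datacenter

# Re-evaluate on a timer; emit `sensor_profile_updated` whenever it changes.
assert select_profile(uplink_kbps=320.0, battery_pct=64.0) == 2
```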

---

## Event budgets: how many events/artifacts per second?

A clean way to prevent “graph spam” is to enforce budgets per sensor:

### Default budgets (good starting point)

**Profile 2 (smart sensor):**

* `heartbeat`: every 10–30s
* `tf_computed`: 1–2 Hz (summary only) + artifact pointer occasionally
* `lpi_candidate_detected`: as needed (gated)
* `waveform_classified`: only for candidates
* IQ upload: only for candidates + 1 calibration window per minute (small)

**Profile 3 (kbps):**

* `heartbeat`: every 60–180s
* `candidate_detected`: max 1 per 10s (burst-cap)
* `classification`: max 1 per candidate, but keep payload tiny
* `artifact`: never, or only tiny thumbnails (like 8×128 bins quantized)
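
One way to enforce these caps without scattering ad-hoc checks is a per-kind token bucket; a sketch, with Profile 3's budgets as the example rates:

```python
import time

class TokenBucket:
    """Simple rate limiter: at most `burst` events, refilled at `rate_per_sec`."""
    def __init__(self, rate_per_sec: float, burst: float):
        self.rate, self.capacity = rate_per_sec, burst
        self.tokens, self.last = burst, time.monotonic()

    def allow(self) -> bool:
        now = time.monotonic()
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False

# Profile 3 budgets: 1 heartbeat / 120 s, 1 candidate / 10 s (burst-capped).
budgets = {
    "heartbeat": TokenBucket(1 / 120, burst=1),
    "lpi_candidate_detected": TokenBucket(1 / 10, burst=2),
}

def try_emit(kind: str) -> bool:
    bucket = budgets.get(kind)
    return bucket is None or bucket.allow()
```

Call `try_emit("lpi_candidate_detected")` before posting; drop (or queue locally) anything it rejects.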

---

## What “Pace-compliant” payload looks like in each profile

Same schema, different population.

### Profile 1/2 payload (rich)

* `algo`, `feature_set_id`, `window`, `evidence` always
* `classes[]`, `estimated_params`, `belief`, `association` filled when available
* `artifact_ptrs` includes TF/SCF/HOS matrices when stored

### Profile 3 payload (tiny)

Still include:

* `algo.name`, `algo.version`, and a short `params_hash`
* `feature_set_id` (string)
* `window` (duration + center freq + sr)
* `evidence.iq_hash` (even if IQ not uploaded)

But reduce the rest to:

* top-1 class + confidence
* 2–4 scalar metrics (occupied_bw, peak_db, sweep_hint, hop_hint)
* a “requestable” artifact pointer (local path, not uploaded)
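
Put together, a Profile 3 payload might look like the sketch below (field names are illustrative, mirroring the schema elements above):

```python
# Hypothetical Profile 3 payload: same schema, minimally populated.
tiny_event = {
    "algo": {"name": "edge_gate", "version": "0.3", "params_hash": "a1b2c3"},
    "feature_set_id": "fs-lite-1",
    "window": {"duration_ms": 200, "center_hz": 433_920_000, "sr": 240_000},
    "evidence": {"iq_hash": "sha256:9c41..."},         # IQ itself stays on-device
    "classes": [{"label": "hop_hint", "conf": 0.72}],  # top-1 only
    "metrics": {"occupied_bw": 25_000, "peak_db": 11.2},
    "artifact_ptrs": [{"uri": "local:///ring/9c41", "uploaded": False}],  # requestable
}
# Compact JSON/CBOR keeps this in the low hundreds of bytes.
```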

---

## Where transforms should run: a practical “split”

Instead of “server vs sensor,” think “cheap vs expensive”:

### Cheap (edge-friendly)

* coarse FFT / STFT tiles (downsampled)
* noise floor tracking
* energy detection + occupancy
* candidate gating

### Expensive (datacenter-friendly)

* cyclostationary SCF/CAF (especially high resolution)
* higher-order cumulants across many lags/bands
* WVD/PWVD (computationally heavy)
* multi-sensor fusion / association / tracking

That split gives you most of the benefit of Pace’s receiver stack without melting batteries or links.
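
A sketch of the cheap, edge-friendly half: a coarse no-overlap STFT tile, an EWMA noise floor, and an energy gate (numpy-based; the 10 dB threshold is illustrative):

```python
import numpy as np

def stft_tile(iq: np.ndarray, nfft: int = 1024) -> np.ndarray:
    """Coarse power spectrogram in dB (frames x bins); no overlap, to stay cheap."""
    n_frames = len(iq) // nfft
    frames = iq[: n_frames * nfft].reshape(n_frames, nfft) * np.hanning(nfft)
    spec = np.fft.fftshift(np.fft.fft(frames, axis=1), axes=1)
    return 10.0 * np.log10(np.abs(spec) ** 2 + 1e-12)

def update_floor(floor_db: float, tile_db: np.ndarray, alpha: float = 0.05) -> float:
    """Slow EWMA of the tile median as the tracked noise floor."""
    return (1.0 - alpha) * floor_db + alpha * float(np.median(tile_db))

def gate_candidates(tile_db: np.ndarray, floor_db: float, thresh_db: float = 10.0):
    """(frame, bin) cells exceeding the tracked noise floor by thresh_db."""
    return np.argwhere(tile_db > floor_db + thresh_db)
```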

---

## You asked: “Shall we create an IQ simulator (Raw FFT Bins?)”

Yes — but make it dual-output so you can evaluate edge vs server compute:

1. **IQ simulator** → produces raw IQ windows (cs16)
2. **Derived simulator** → produces TF tiles + summary metrics
3. **Event emitter** → posts:

   * `iq_window_received`
   * `tf_computed`
   * `lpi_candidate_detected` (gated)
   * `waveform_classified` (for candidates)

Then you can benchmark:

* bandwidth costs
* CPU costs
* detection latency
* classification quality

…and decide which parts to move edge-side.
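
A minimal sketch of the IQ-simulator half: a linear chirp in complex white noise, serialized as cs16 (all parameters are arbitrary test values):

```python
import numpy as np

def simulate_iq_window(sr=2.4e6, dur_s=0.2, f0=-200e3, f1=200e3, snr_db=10.0):
    """One IQ window containing a linear chirp in complex white noise."""
    t = np.arange(int(sr * dur_s)) / sr
    phase = 2.0 * np.pi * (f0 * t + (f1 - f0) / (2.0 * dur_s) * t**2)
    noise = (np.random.randn(t.size) + 1j * np.random.randn(t.size)) / np.sqrt(2.0)
    return np.exp(1j * phase) + noise * 10.0 ** (-snr_db / 20.0)

def to_cs16(iq: np.ndarray) -> bytes:
    """Serialize as interleaved int16 I/Q at ~90% full scale."""
    scaled = iq / np.max(np.abs(iq)) * (0.9 * 32767)
    out = np.empty(iq.size * 2, dtype=np.int16)
    out[0::2] = scaled.real.astype(np.int16)
    out[1::2] = scaled.imag.astype(np.int16)
    return out.tobytes()

iq = simulate_iq_window()
print(len(to_cs16(iq)) / 1e6)   # ~1.92 MB: matches the sizing math above
```

Feed the same windows through your edge path (tiles + gating) and your server path (full transforms) to get the benchmark numbers side by side.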

---

## Recommended “next build” in your system

If you want a crisp implementation path that stays non-spaghetti:

1. Add **profile selection** (`sensor_profile_updated`) based on uplink + battery
2. Add **artifact store** (even just a folder tree) keyed by `iq_hash`
3. Implement **gated upload**:

   * in Profile 2, upload IQ only when `lpi_candidate_detected` fires; in Profile 3, keep IQ local until the server requests an artifact pull (see the sketch after this list)
4. Keep your LPI worker endpoint, but allow:

   * `mode="simulate"` (current behavior)
   * `mode="dsp"` (real compute)

---

## Ultra-constrained island scenario (kbps, tsunami warning vibe)

For that extreme case, don’t pretend it’s doing full LPI classification. Make it a **reliable beacon + anomaly trigger**:

* scan a few critical bands / channels
* compute coarse energy + change detection
* send a tiny alert payload:

  * `band_id`, `delta_db`, `duration_ms`, `confidence`
* store raw data locally for later retrieval if a gateway shows up

This is still “collect them all”—just locally until you have a path to ingest.
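
A sketch of that alert path (the threshold and confidence heuristic are illustrative):

```python
def band_alert(band_id: str, energy_db: float, baseline_db: float,
               duration_ms: int, thresh_db: float = 6.0):
    """Tiny alert payload when a band's energy jumps off its baseline, else None."""
    delta = energy_db - baseline_db
    if delta < thresh_db:
        return None
    confidence = min(1.0, delta / (2.0 * thresh_db))   # crude: more excess dB, more confident
    return {"band_id": band_id, "delta_db": round(delta, 1),
            "duration_ms": duration_ms, "confidence": round(confidence, 2)}
```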

---

If you tell me your “first production-ish” sensor target (e.g., SDR on a Raspberry Pi class device, Android phone, small Linux box, or actual microcontroller), I’ll give you a **default profile** with:

* window durations
* cadence
* gating thresholds
* what artifacts to store vs upload
* and a minimal compression/serialization recommendation (CBOR vs JSON, zstd vs gzip).
