Interpretation and Joins

Before joining anything, keep this mental model in mind:

  1. one orchestrator cycle means one rover pose
  2. that pose gets one Qualisys position sample
  3. that same pose then gets one acoustic capture
  4. that same pose then gets one RF measurement cycle

In practice:

  • one row in exp-<experiment_id>-positions.csv is one rover stop
  • one RF JSON-line record with the same experiment_id and cycle_id belongs to that stop
  • in a fully populated reciprocity cycle, that means up to 42 ceiling receiver rows with the same (experiment_id, cycle_id)
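
The join described above can be sketched with the standard library alone. The inline CSV and JSON-line contents, and the field names `x`, `y`, `z` and `hostname`, are made up for illustration; only `experiment_id` and `cycle_id` come from the description above, and the real files may use different columns.

```python
import csv
import io
import json
from collections import defaultdict

# Hypothetical miniature inputs; real column and field names may differ.
positions_csv = io.StringIO(
    "experiment_id,cycle_id,x,y,z\n"
    "EXP003,1,0.10,0.20,0.30\n"
    "EXP003,2,0.40,0.50,0.60\n"
)
rf_jsonl = io.StringIO(
    '{"experiment_id": "EXP003", "cycle_id": 1, "hostname": "A05"}\n'
    '{"experiment_id": "EXP003", "cycle_id": 1, "hostname": "A06"}\n'
    '{"experiment_id": "EXP003", "cycle_id": 3, "hostname": "A05"}\n'
)

# Index rover stops by the join key (experiment_id, cycle_id).
stops = {
    (row["experiment_id"], int(row["cycle_id"])): row
    for row in csv.DictReader(positions_csv)
}

# Group RF records under the same key; one stop can have many receiver rows.
rf_by_stop = defaultdict(list)
for line in rf_jsonl:
    if line.strip():
        record = json.loads(line)
        key = (record["experiment_id"], int(record["cycle_id"]))
        rf_by_stop[key].append(record)

# Inner join: keep only stops that have both a position row and RF records.
joined = {key: (stops[key], rf_by_stop[key]) for key in stops if key in rf_by_stop}

for key, (stop, records) in joined.items():
    print(key, stop["x"], stop["y"], stop["z"], [r["hostname"] for r in records])
```

In this toy input, cycle 2 has a position but no RF record and cycle 3 has an RF record but no position, so neither survives the inner join.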

In the processed RF NetCDF:

  • rover_x, rover_y, rover_z contain the joined Qualisys coordinates
  • position_available == 1 means the position row was considered usable
  • csi_real + 1j * csi_imag reconstructs the cable-corrected complex RF quantity
  • csi_available == 1 means a given host contributed a usable RF record for that experiment/cycle
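
As a minimal sketch of how these variables combine, the snippet below builds a tiny stand-in dataset (all values invented) and reconstructs the complex CSI, then hides entries that `csi_available` marks as unusable. The names `amplitude` and `phase` are illustrative, not variables from the file.

```python
import numpy as np
import xarray as xr

dims = ("experiment_id", "cycle_id", "hostname")

# Toy stand-in for the processed RF NetCDF; the numbers are made up.
ds = xr.Dataset(
    data_vars={
        "csi_real": (dims, np.array([[[1.0, 0.0], [0.5, 2.0]]])),
        "csi_imag": (dims, np.array([[[-1.0, 0.0], [0.5, 0.0]]])),
        "csi_available": (dims, np.array([[[1, 0], [1, 1]]], dtype=np.int8)),
    },
    coords={
        "experiment_id": ["EXP003"],
        "cycle_id": [1, 2],
        "hostname": ["A05", "A06"],
    },
)

# Rebuild the complex quantity from its real and imaginary parts.
csi = ds["csi_real"] + 1j * ds["csi_imag"]

# Replace entries without a usable RF record by NaN so that a stored 0
# is never mistaken for a measured 0.
csi = csi.where(ds["csi_available"] == 1)

amplitude = np.abs(csi)
phase = xr.apply_ufunc(np.angle, csi)  # radians
```

Masking before computing amplitude or phase keeps placeholder values out of later statistics.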

In the processed acoustic NetCDF:

  • values stores the microphone waveforms on (experiment_id, cycle_id, microphone_label, sample_index)
  • the same (experiment_id, cycle_id) pair identifies the acoustic capture recorded at that rover pose
  • microphone_label selects the microphone channel inside that one capture
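
A short sketch of that selection, using an invented stand-in dataset. The microphone labels `mic_a`, `mic_b`, `mic_c` and the eight-sample length are assumptions for illustration only.

```python
import numpy as np
import xarray as xr

# Toy stand-in for the processed acoustic NetCDF; labels and sizes are invented.
rng = np.random.default_rng(0)
acoustic = xr.Dataset(
    data_vars={
        "values": (
            ("experiment_id", "cycle_id", "microphone_label", "sample_index"),
            rng.standard_normal((1, 2, 3, 8)),
        ),
    },
    coords={
        "experiment_id": ["EXP003"],
        "cycle_id": [1, 2],
        "microphone_label": ["mic_a", "mic_b", "mic_c"],
        "sample_index": np.arange(8),
    },
)

# Use item access: acoustic.values is the Mapping method of Dataset,
# not the variable named "values".
capture = acoustic["values"].sel(experiment_id="EXP003", cycle_id=2)

# One waveform is one microphone channel inside that capture.
waveform = capture.sel(microphone_label="mic_b")

print(capture.dims)   # ('microphone_label', 'sample_index')
print(waveform.dims)  # ('sample_index',)
```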

| Axis | Meaning | Typical question |
| --- | --- | --- |
| experiment_id | One logical run such as EXP003 or EXP005. | Which measurement campaign am I looking at? |
| cycle_id | One orchestrator loop iteration. This is the physical rover stop axis. | Which stop in the run am I selecting? |
| hostname | One RF receiver host or tile. | Which antenna/tile produced this CSI value? |

This means the processed xarray dataset is easiest to picture as:

  • one outer stack of experiments
  • inside each experiment, one matrix over cycle_id x hostname
  • rover coordinates attached to each cycle
  • CSI values attached to each (cycle_id, hostname) pair

In other words:

  • rover variables live on (experiment_id, cycle_id)
  • CSI variables live on (experiment_id, cycle_id, hostname)

The named coordinates are also the xarray indexes:

| Coordinate | What it stores | Why it matters |
| --- | --- | --- |
| experiment_id | labels like EXP003 | use .sel(experiment_id="EXP003") instead of integer positions |
| cycle_id | orchestrator cycle labels | lets you map CSI and rover pose back to the same physical stop |
| hostname | receiver host labels such as A05 | lets you select or mask CSI by tile name |

Inspect them directly with:

ds.sizes
ds.coords
ds.indexes

cycle_id is a shared axis across the full dataset. That is convenient, but it is also the easiest place to make mistakes.

The important part is:

  • the dataset may list many cycle_id values
  • not every experiment uses every listed cycle
  • not every hostname is populated in every cycle

So a coordinate existing in the dataset does not guarantee that the data at that coordinate is valid. Always treat the masks as authoritative:

  • csi_available tells you whether CSI exists for one (experiment_id, cycle_id, hostname) location
  • position_available plus finite rover_x/y/z tells you whether the rover pose is usable for one (experiment_id, cycle_id) location
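
One way to turn the two masks into an explicit list of usable stops across all experiments is sketched below. The dataset is a toy stand-in with invented values, and `usable_pairs` is an illustrative name.

```python
import numpy as np
import xarray as xr

dims3 = ("experiment_id", "cycle_id", "hostname")
dims2 = ("experiment_id", "cycle_id")

# Toy dataset: both experiments share the cycle_id axis, but EXP005 has no
# data in cycle 2, and EXP003 has CSI but no finite rover_x in cycle 2.
ds = xr.Dataset(
    data_vars={
        "csi_available": (dims3, np.array([[[1, 1], [0, 1]], [[1, 0], [0, 0]]])),
        "position_available": (dims2, np.array([[1, 1], [1, 0]])),
        "rover_x": (dims2, np.array([[0.1, np.nan], [0.3, np.nan]])),
        "rover_y": (dims2, np.array([[0.2, 0.5], [0.4, np.nan]])),
        "rover_z": (dims2, np.array([[0.0, 0.0], [0.0, np.nan]])),
    },
    coords={
        "experiment_id": ["EXP003", "EXP005"],
        "cycle_id": [1, 2],
        "hostname": ["A05", "A06"],
    },
)

# At least one host delivered usable CSI for this (experiment_id, cycle_id).
has_csi = (ds["csi_available"] == 1).any(dim="hostname")

# The pose is flagged usable and every coordinate is finite.
pose_ok = (
    (ds["position_available"] == 1)
    & np.isfinite(ds["rover_x"])
    & np.isfinite(ds["rover_y"])
    & np.isfinite(ds["rover_z"])
)

usable = has_csi & pose_ok

# Flatten the 2-D mask into explicit (experiment_id, cycle_id) pairs.
exp_idx, cyc_idx = np.nonzero(usable.values)
usable_pairs = [
    (str(ds["experiment_id"].values[i]), int(ds["cycle_id"].values[j]))
    for i, j in zip(exp_idx, cyc_idx)
]
print(usable_pairs)  # [('EXP003', 1), ('EXP005', 1)]
```

Every listed `cycle_id` exists as a coordinate for both experiments here, yet only two of the four pairs carry usable data.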

One physical RF measurement point is the pair (experiment_id, cycle_id).

That pair gives you:

  • one rover position: rover_x, rover_y, rover_z
  • one availability flag: position_available
  • one CSI vector across the active hostnames: csi_real, csi_imag, csi_available

Selecting one experiment, then one cycle, then one host shrinks the dataset like this:

| Selection | Remaining structure |
| --- | --- |
| ds | full (experiment_id, cycle_id, hostname) dataset |
| ds.sel(experiment_id="EXP003") | one cycle_id x hostname slice |
| ds.sel(experiment_id="EXP003", cycle_id=123) | one rover pose plus one CSI vector over hostname |
| ...sel(hostname="A05") | one scalar CSI value for one tile |
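
The shrinking can be checked on a small stand-in dataset by looking at the dimensions that remain after each `.sel` call. The values and the cycle labels other than 123 are invented for illustration.

```python
import numpy as np
import xarray as xr

dims3 = ("experiment_id", "cycle_id", "hostname")
dims2 = ("experiment_id", "cycle_id")

# Toy dataset with 2 experiments, 3 cycles and 2 hosts.
ds = xr.Dataset(
    data_vars={
        "csi_real": (dims3, np.ones((2, 3, 2))),
        "csi_imag": (dims3, np.zeros((2, 3, 2))),
        "rover_x": (dims2, np.zeros((2, 3))),
    },
    coords={
        "experiment_id": ["EXP003", "EXP005"],
        "cycle_id": [121, 122, 123],
        "hostname": ["A05", "A06"],
    },
)

one_exp = ds.sel(experiment_id="EXP003")   # drops the experiment axis
one_stop = one_exp.sel(cycle_id=123)       # drops the cycle axis
one_tile = one_stop.sel(hostname="A05")    # drops the host axis

print(one_exp["csi_real"].dims)   # ('cycle_id', 'hostname')
print(one_stop["csi_real"].dims)  # ('hostname',)
print(one_stop["rover_x"].dims)   # ()
print(one_tile["csi_real"].dims)  # ()
```

Selecting by label with a scalar removes that dimension, which is why the rover coordinate is already a scalar after the second step while CSI still varies over `hostname`.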

Important structural details:

  • cycle_id is a shared axis across the dataset, not a guarantee that every experiment has data for every listed cycle
  • always use csi_available to discover which cycles and hostnames are populated
  • always check position_available and finite rover coordinates before trusting a rover pose
  • duplicate rover positions may already have been filtered out before the .nc file was written

For a worked walkthrough, open:

Safe starting pattern:

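import numpy as np

# ds is the processed RF dataset, already opened with xarray.
exp = ds.sel(experiment_id="EXP003")

# Keep only the cycles where at least one host contributed usable CSI.
cycle_mask = (exp["csi_available"] == 1).any(dim="hostname")
cycle_ids = exp["cycle_id"].values[cycle_mask.values]

# Complex CSI on (cycle_id, hostname), restricted to the populated cycles.
csi = (exp["csi_real"] + 1j * exp["csi_imag"]).sel(cycle_id=cycle_ids)

# A pose is usable only if it is flagged available and all coordinates are finite.
pos = exp.sel(cycle_id=cycle_ids)
position_ok = (
    (pos["position_available"] > 0)
    & np.isfinite(pos["rover_x"])
    & np.isfinite(pos["rover_y"])
    & np.isfinite(pos["rover_z"])
)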