Interpretation and Joins
Before joining anything, keep this mental model in mind:
- one orchestrator cycle means one rover pose
- that pose gets one Qualisys position sample
- that same pose then gets one acoustic capture
- that same pose then gets one RF measurement cycle
In practice:
- one row in `exp-<experiment_id>-positions.csv` is one rover stop
- one RF JSON-line record with the same `experiment_id` and `cycle_id` belongs to that stop
- in a fully populated reciprocity cycle, that means up to `42` ceiling receiver rows with the same `(experiment_id, cycle_id)`
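The join described above can be sketched with the standard library alone. This is a minimal illustration, not the project's loader: the column names `x`, `y`, `z`, the `hostname` field, and the inline sample data are assumptions, and the real CSV and JSON-line layouts may differ. Only the join key `(experiment_id, cycle_id)` comes from the text.

```python
import csv
import io
import json
from collections import defaultdict

# Hypothetical stand-ins for the positions CSV and the RF JSON-lines file.
positions_csv = io.StringIO(
    "experiment_id,cycle_id,x,y,z\n"
    "EXP003,1,0.10,0.20,0.30\n"
    "EXP003,2,0.40,0.50,0.60\n"
)
rf_jsonl = io.StringIO(
    '{"experiment_id": "EXP003", "cycle_id": 1, "hostname": "A05"}\n'
    '{"experiment_id": "EXP003", "cycle_id": 1, "hostname": "A06"}\n'
    '{"experiment_id": "EXP003", "cycle_id": 3, "hostname": "A05"}\n'
)


def join_key(record):
    # Normalise cycle_id to int so CSV strings and JSON numbers compare equal.
    return (record["experiment_id"], int(record["cycle_id"]))


# One rover stop per (experiment_id, cycle_id).
stops = {join_key(row): row for row in csv.DictReader(positions_csv)}

# Several RF records may share one stop.
rf_by_stop = defaultdict(list)
for line in rf_jsonl:
    if line.strip():
        record = json.loads(line)
        rf_by_stop[join_key(record)].append(record)

# Keep only stops that have both a position row and RF data.
joined = {key: (stops[key], rf_by_stop[key]) for key in stops if key in rf_by_stop}

# RF records whose stop has no position row deserve a second look.
unmatched_rf = sorted(key for key in rf_by_stop if key not in stops)

print(sorted(joined))
print(unmatched_rf)
```

Keying on the pair rather than on `cycle_id` alone matters because cycle labels can repeat across experiments.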
In the processed NetCDF:
- `rover_x`, `rover_y`, `rover_z` contain the joined Qualisys coordinates
- `position_available == 1` means the position row was considered usable
- `csi_real + 1j * csi_imag` reconstructs the cable-corrected complex RF quantity
- `csi_available == 1` means a given host contributed a usable RF record for that experiment/cycle
In the processed acoustic NetCDF:
- `values` stores the microphone waveforms on `(experiment_id, cycle_id, microphone_label, sample_index)`
- the same `(experiment_id, cycle_id)` pair identifies the acoustic capture recorded at that rover pose
- `microphone_label` selects the microphone channel inside that one capture
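As a rough sketch of that layout, the snippet below builds a toy acoustic dataset and selects one capture, then one microphone channel. The microphone labels `mic_a` and `mic_b`, the sizes, and the sample values are invented for illustration; only the variable name `values` and the four dimension names come from the description above.

```python
import numpy as np
import xarray as xr

# Toy stand-in for the processed acoustic NetCDF; labels and sizes are made up.
acoustic = xr.Dataset(
    data_vars={
        "values": (
            ("experiment_id", "cycle_id", "microphone_label", "sample_index"),
            np.arange(2 * 3 * 2 * 4, dtype=float).reshape(2, 3, 2, 4),
        )
    },
    coords={
        "experiment_id": ["EXP003", "EXP005"],
        "cycle_id": [1, 2, 3],
        "microphone_label": ["mic_a", "mic_b"],
        "sample_index": np.arange(4),
    },
)

# Bracket access is required here: `acoustic.values` is the dict-style
# method of the Dataset, not the data variable named "values".
capture = acoustic["values"].sel(experiment_id="EXP003", cycle_id=2)
waveform = capture.sel(microphone_label="mic_b")

print(capture.dims)   # dimensions left after picking one rover pose
print(waveform.dims)  # dimensions left after picking one microphone
```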
The Three Named Axes
| Axis | Meaning | Typical question |
|---|---|---|
| `experiment_id` | One logical run such as EXP003 or EXP005. | Which measurement campaign am I looking at? |
| `cycle_id` | One orchestrator loop iteration. This is the physical rover stop axis. | Which stop in the run am I selecting? |
| `hostname` | One RF receiver host or tile. | Which antenna/tile produced this CSI value? |
This means the processed xarray is easiest to picture as:
- one outer stack of experiments
- inside each experiment, one matrix over `cycle_id x hostname`
- rover coordinates attached to each cycle
- CSI values attached to each `(cycle_id, hostname)` pair
In other words:
- rover variables live on `(experiment_id, cycle_id)`
- CSI variables live on `(experiment_id, cycle_id, hostname)`
Coordinates, Indexes, and Why They Matter
The named coordinates are also the xarray indexes:
| Coordinate | What it stores | Why it matters |
|---|---|---|
| `experiment_id` | labels like EXP003 | use `.sel(experiment_id="EXP003")` instead of integer positions |
| `cycle_id` | orchestrator cycle labels | lets you map CSI and rover pose back to the same physical stop |
| `hostname` | receiver host labels such as A05 | lets you select or mask CSI by tile name |
Inspect them directly with:
```python
ds.sizes
ds.coords
ds.indexes
```

The Most Important Structural Detail
`cycle_id` is a shared axis across the full dataset. That is convenient, but it is also the easiest place to make mistakes.
The important part is:
- the dataset may list many `cycle_id` values
- not every experiment uses every listed cycle
- not every hostname is populated in every cycle
So a coordinate existing in the dataset does not guarantee that the data at that coordinate is valid. Always treat the masks as authoritative:
- `csi_available` tells you whether CSI exists for one `(experiment_id, cycle_id, hostname)` location
- `position_available` plus finite `rover_x/y/z` tells you whether the rover pose is usable for one `(experiment_id, cycle_id)` location
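One way to apply both masks together is sketched below. The toy dataset is invented: only the variable and dimension names come from this page, and the real file will have more hosts and cycles. The pose mask lives on `(experiment_id, cycle_id)` and is broadcast by xarray over `hostname` when combined with the CSI mask.

```python
import numpy as np
import xarray as xr

csi_dims = ("experiment_id", "cycle_id", "hostname")
pose_dims = ("experiment_id", "cycle_id")

# Toy RF dataset: 1 experiment, 3 cycles, 2 hosts. All numbers are made up.
ds = xr.Dataset(
    data_vars={
        "csi_real": (csi_dims, np.array([[[1.0, 2.0], [3.0, 0.0], [0.0, 0.0]]])),
        "csi_imag": (csi_dims, np.array([[[0.5, 0.5], [0.5, 0.0], [0.0, 0.0]]])),
        "csi_available": (csi_dims, np.array([[[1, 1], [1, 0], [0, 0]]])),
        "rover_x": (pose_dims, np.array([[0.1, np.nan, 0.3]])),
        "rover_y": (pose_dims, np.array([[0.2, 0.2, 0.2]])),
        "rover_z": (pose_dims, np.array([[0.0, 0.0, 0.0]])),
        "position_available": (pose_dims, np.array([[1, 1, 0]])),
    },
    coords={
        "experiment_id": ["EXP003"],
        "cycle_id": [1, 2, 3],
        "hostname": ["A05", "A06"],
    },
)

# Pose is usable only if the flag is set AND all three coordinates are finite.
pose_ok = (
    (ds["position_available"] == 1)
    & np.isfinite(ds["rover_x"])
    & np.isfinite(ds["rover_y"])
    & np.isfinite(ds["rover_z"])
)

# pose_ok broadcasts over hostname, giving one flag per CSI location.
usable = (ds["csi_available"] == 1) & pose_ok

# Keep CSI only where both masks agree; everything else becomes NaN.
csi = ds["csi_real"] + 1j * ds["csi_imag"]
csi_usable = csi.where(usable)

print(pose_ok.values)
print(int(usable.sum()))
```

In this toy data, cycle 2 has its flag set but a NaN `rover_x`, which is why the flag alone is not enough.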
Make One Measurement Point Tangible
One physical RF measurement point is the pair `(experiment_id, cycle_id)`.
That pair gives you:
- one rover position: `rover_x`, `rover_y`, `rover_z`
- one availability flag: `position_available`
- one CSI vector across the active hostnames: `csi_real`, `csi_imag`, `csi_available`
Selecting one experiment, then one cycle, then one host shrinks the dataset like this:
| Selection | Remaining structure |
|---|---|
| `ds` | full `(experiment_id, cycle_id, hostname)` dataset |
| `ds.sel(experiment_id="EXP003")` | one `cycle_id x hostname` slice |
| `ds.sel(experiment_id="EXP003", cycle_id=123)` | one rover pose plus one CSI vector over `hostname` |
| `...sel(hostname="A05")` | one scalar CSI value for one tile |
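The shrinking in the table can be checked by printing the remaining dimensions after each selection. The dataset here is a toy with invented sizes and constant values; the names `one_exp`, `one_stop`, and `one_tile` are illustrative only.

```python
import numpy as np
import xarray as xr

shape = (2, 3, 2)  # (experiment_id, cycle_id, hostname); toy sizes
csi_dims = ("experiment_id", "cycle_id", "hostname")

ds = xr.Dataset(
    data_vars={
        "csi_real": (csi_dims, np.ones(shape)),
        "csi_imag": (csi_dims, np.zeros(shape)),
        "rover_x": (("experiment_id", "cycle_id"), np.zeros(shape[:2])),
    },
    coords={
        "experiment_id": ["EXP003", "EXP005"],
        "cycle_id": [122, 123, 124],
        "hostname": ["A05", "A06"],
    },
)

# Each label-based selection removes exactly one dimension.
one_exp = ds.sel(experiment_id="EXP003")
one_stop = one_exp.sel(cycle_id=123)
one_tile = one_stop.sel(hostname="A05")

for name, sub in [
    ("ds", ds),
    ("one_exp", one_exp),
    ("one_stop", one_stop),
    ("one_tile", one_tile),
]:
    # Rover variables never carry a hostname dimension, CSI variables do.
    print(name, sub["csi_real"].dims, sub["rover_x"].dims)

# At the last step the CSI value is a zero-dimensional scalar.
scalar_csi = (one_tile["csi_real"] + 1j * one_tile["csi_imag"]).item()
```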
Important structural details:
- `cycle_id` is a shared axis across the dataset, not a guarantee that every experiment has data for every listed cycle
- always use `csi_available` to discover which cycles and hostnames are populated
- always check `position_available` and finite rover coordinates before trusting a rover pose
- duplicate rover positions may already have been filtered out before the `.nc` file was written
For a worked walkthrough, open:
- Notebook: RF Xarray Structure
- Notebook: Acoustic Xarray Structure
- Notebook: RF And Acoustic At One Position
Safe starting pattern:
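```python
import numpy as np

# `ds` is the processed RF dataset, already opened with xarray.
exp = ds.sel(experiment_id="EXP003")

# Keep only cycles where at least one hostname has usable CSI.
cycle_mask = exp["csi_available"].any(dim="hostname")
cycle_ids = exp["cycle_id"].values[cycle_mask.values]

# Complex CSI restricted to the populated cycles.
csi = (exp["csi_real"] + 1j * exp["csi_imag"]).sel(cycle_id=cycle_ids)

# A pose is usable only if it is flagged available and all coordinates are finite.
position_ok = (
    (exp["position_available"].sel(cycle_id=cycle_ids) > 0)
    & np.isfinite(exp["rover_x"].sel(cycle_id=cycle_ids))
    & np.isfinite(exp["rover_y"].sel(cycle_id=cycle_ids))
    & np.isfinite(exp["rover_z"].sel(cycle_id=cycle_ids))
)
```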