How to inspect data from past runs#
Bluesky runs are stored in a Tiled catalog. This page shows how to read them back at the IPython prompt.
The catalog object#
The session-level catalog client is bound to cat:
cat # <Container ...> -- top-level Tiled client
len(cat) # number of runs available
cat[-1] # most recent run
cat[-5:] # last five runs
cat["<uid>"] # a specific run by UID
You can also instantiate a fresh client without restarting your session:
from tiled.client import from_uri
fresh = from_uri("http://sn.xray.aps.anl.gov:8000")
fresh # /raw under that server
Structure of a run#
A Bluesky run is a small tree of streams. The two you will see most
often are primary (the main event stream the scan emits) and
baseline (the once-at-start, once-at-stop snapshot of every device
tagged baseline):
run = cat[-1]
list(run)
# ['primary', 'baseline']
primary = run.primary
list(primary)
# ['data', 'config', 'time', 'seq_num', ...]
The primary.data group holds the actual readings; everything else
is bookkeeping.
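To see the whole layout of a run at once, the nested keys can be walked recursively. The sketch below assumes that container nodes behave like Python mappings and that leaf nodes (arrays, tables) do not. `walk` is a hypothetical helper, not part of Tiled, and a plain nested dict stands in for a real run so the example works without a server.

```python
from collections.abc import Mapping


def walk(node, prefix=""):
    """Return slash-joined paths for every entry under a mapping-like node."""
    paths = []
    for key in node:
        path = f"{prefix}/{key}" if prefix else str(key)
        paths.append(path)
        child = node[key]
        # Only descend into containers; anything else is treated as a leaf.
        if isinstance(child, Mapping):
            paths.extend(walk(child, path))
    return paths


# Stand-in for a run (assumption: a real run exposes similar nesting).
fake_run = {
    "primary": {"data": {"scaler_chan01": None}, "config": {}},
    "baseline": {"data": {"sample_stage_xprime": None}},
}
print("\n".join(walk(fake_run)))
```

Against a live catalog, each lookup is likely to cost a request to the server, so a walk like this is better suited to a single run than to a whole catalog.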
Reading the data#
run.primary.read() returns an
xarray.Dataset – a labeled
multi-dimensional array container that plays well with pandas, numpy,
and matplotlib:
ds = run.primary.read()
ds
# <xarray.Dataset>
# Dimensions: (time: 11)
# Coordinates:
# * time (time) datetime64[ns] ...
# Data variables:
# sample_stage_xprime (time) float64 0.0 1.0 2.0 ...
# scaler_chan01 (time) float64 1234.0 1180.0 ...
ds.to_pandas()
# pandas DataFrame
ds["scaler_chan01"].plot()
# matplotlib plot
To pull out a single column, index by name: ds["sample_stage_xprime"] returns an xarray.DataArray.
Metadata#
Every run carries metadata accessible without reading the bulk data:
run.metadata["start"]
# {'uid': '...', 'time': 1719000000.0, 'scan_id': 1, 'plan_name': 'scan',
# 'plan_args': {...}, 'plan_type': 'generator',
# 'detectors': ['scaler'], 'motors': ['sample_stage_xprime'], 'num_points': 11,
# ...}
run.metadata["stop"]
# {'uid': '...', 'time': ..., 'exit_status': 'success', 'num_events': {...}}
run.metadata["start"]["scan_id"]
# 1
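Because the start and stop documents are plain dictionaries, a one-line summary per run is easy to assemble. The following is a minimal sketch: `summarize` is a hypothetical helper, it assumes only the fields shown above, and it treats a missing stop document as a run that did not finish cleanly. A `SimpleNamespace` stands in for a real run so the example works without a server.

```python
from datetime import datetime, timezone
from types import SimpleNamespace


def summarize(run):
    """Build a one-line summary from a run's start and stop documents."""
    start = run.metadata["start"]
    # Assumption: the stop document may be absent or None for an unfinished run.
    stop = run.metadata.get("stop") or {}
    when = datetime.fromtimestamp(start["time"], tz=timezone.utc)
    return (
        f"#{start.get('scan_id', '?')} {start.get('plan_name', '?')} "
        f"{when:%Y-%m-%d %H:%M} UTC {stop.get('exit_status', 'unknown')}"
    )


# Stand-in run; the values are illustrative, not from a real scan.
fake = SimpleNamespace(metadata={
    "start": {"time": 1719000000.0, "scan_id": 1, "plan_name": "scan"},
    "stop": {"exit_status": "success"},
})
print(summarize(fake))
```

With a live catalog, the same function could be applied to each of the last few runs, for example `[summarize(r) for r in cat[-5:]]`.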
Baseline stream#
If a device is labeled baseline in devices.yml, it appears in
the baseline stream automatically:
run.baseline.read()
# <xarray.Dataset>
# Dimensions: (time: 2)
# Data variables (one entry per baseline device, both at start and at end):
# sample_stage_xprime (time) float64 ...
# sample_stage_base_y (time) float64 ...
# ...
This is useful for answering “what was the rest of the instrument doing while I scanned this one motor?”
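One way to answer that question is to compare the two baseline readings and report only the devices that moved. The sketch below is an illustration under stated assumptions: `baseline_changes` is a hypothetical helper, it expects numeric readings, and the tolerance is arbitrary. Plain lists stand in for the arrays a real baseline dataset would hold.

```python
def baseline_changes(readings, tol=1e-6):
    """Return {name: (start, end)} for devices whose value moved by more than tol.

    `readings` maps each device name to its baseline values, first to last.
    """
    changed = {}
    for name, values in readings.items():
        start, end = float(values[0]), float(values[-1])
        if abs(end - start) > tol:
            changed[name] = (start, end)
    return changed


# Illustrative values. With a real run, the mapping might be built as:
#   ds = run.baseline.read()
#   readings = {name: ds[name].values for name in ds.data_vars}
readings = {
    "sample_stage_xprime": [0.0, 10.0],
    "sample_stage_base_y": [2.5, 2.5],
}
print(baseline_changes(readings))
```

Non-numeric baseline signals (strings, enums) would need to be filtered out or compared with `!=` instead of a tolerance.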
Filtering and searching#
Tiled supports basic filtering on metadata:
from tiled.queries import Key
# Runs by plan name:
cat.search(Key("plan_name") == "scan")
# Runs from the last hour (the server-side time field is a Unix epoch timestamp):
import time
cat.search(Key("time") > time.time() - 3600)
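For a bounded window, such as a particular shift, both ends of the range have to be expressed as epoch seconds. The sketch below shows one way to compute them with timezone-aware datetimes; `epoch_window` is a hypothetical helper, and the commented lines at the end assume that searches can be chained to narrow results cumulatively.

```python
from datetime import datetime, timedelta, timezone


def epoch_window(since, until=None):
    """Convert timezone-aware datetimes to a (start, stop) pair of Unix timestamps."""
    if until is None:
        until = datetime.now(timezone.utc)
    if since.tzinfo is None or until.tzinfo is None:
        raise ValueError("use timezone-aware datetimes to avoid local-time ambiguity")
    if since >= until:
        raise ValueError("since must be earlier than until")
    return since.timestamp(), until.timestamp()


# Illustrative eight-hour window ending at a fixed instant.
until = datetime(2024, 6, 21, 20, 0, tzinfo=timezone.utc)
start_ts, stop_ts = epoch_window(until - timedelta(hours=8), until)
print(start_ts, stop_ts)

# With a live catalog, the bounds could then be applied as:
#   from tiled.queries import Key
#   shift = cat.search(Key("time") >= start_ts).search(Key("time") < stop_ts)
```

Insisting on timezone-aware datetimes avoids a silent offset when the workstation clock and the server disagree about local time.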
Image (area-detector) data#
If a scan included an area detector like the Eiger2, the image data
is referenced from the run’s master HDF5 file via HDF5 external
links. Reading it from cat[-1] should return a dask-backed
array, but as of this writing the end-to-end path has not
yet been validated at 3-ID-C. See How to visualize
HDF5 for the current state.
Common pitfalls#
Wrapping reads in RE(...) is wrong; cat[-1].primary.read() is not a plan. See The RunEngine.
Reading a very large image array eagerly. read() will pull the entire dataset into memory. For Eiger-class data, slice first: run.primary["eiger2_image"][0] to get the first frame.
Stale cat after a long session. Tiled clients cache schemas; if a new run is not showing up, refresh with cat.refresh().
See also#
How to visualize HDF5 image files – area detector workflow.
How to run a scan – to produce the data.