API Reference

MojitoProcessor provides utilities for loading, processing, and writing LISA Mojito L1 data. Raw files are opened with the mojito package.

I/O

MojitoProcessor.io.read.load_file(paths, *, load_days=None)[source]

Load a raw Mojito L1 HDF5 file.

Uses the mojito package to open the file and extracts TDI observables, light travel times, spacecraft orbits, noise estimates, and metadata into a flat dictionary suitable for process_pipeline().

Parameters:
  • paths (str, Path, or list thereof) – Path(s) to the Mojito L1 .h5 file(s).

  • load_days (float, optional) – Number of days to load from the start of the file (lazy slicing). None loads the full dataset.

Return type:

dict

Returns:

data (dict) – Dictionary containing:

  • tdis — dict of TDI channel arrays (X, Y, Z, A, E, T)

  • fs, dt, t_tdi — TDI sampling parameters and timestamps

  • ltts, ltt_derivatives, ltt_times — light travel times

  • orbits, velocities, orbit_times — spacecraft kinematics

  • noise_estimates — frequency-domain noise covariance cubes

  • metadata — laser frequency and pipeline names

MojitoProcessor.io.read.load_processed(path, *, segment_ids=None)[source]

Load processed segments from an HDF5 file written by write().

Reconstructs each SignalProcessor from the stored channel arrays and metadata attributes. Per-segment orbit and LTT data (written under /raw/<segment_name>/) are returned alongside the segments in a raw data dict.

Parameters:
  • path (str or Path) – Path to a .h5 file previously written by write().

  • segment_ids (list of int, optional) – Indices of segments to load, e.g. [100, 101, 102] loads only segment100, segment101, and segment102. None (default) loads all segments.

Return type:

tuple[Dict[str, SignalProcessor], dict]

Returns:

  • segments (dict of SignalProcessor) – Dictionary mapping segment names to reconstructed SignalProcessor objects.

  • raw_data (dict) – Auxiliary data dict. Top-level keys:

    • 'orbits' — per-spacecraft position/velocity arrays (full span)

    • 'noise_estimates' — dict of noise covariance arrays (if present)

    • 'metadata' — dict with laser_frequency / pipeline_names (if present)

    • '<seg_name>_ltts' — one entry per segment containing LTT arrays: ltts, ltt_derivatives (if present), ltt_times

Raises:

ValueError – If the file does not contain a /processed group, indicating it was not written by write().

Examples

>>> from MojitoProcessor import write, load_processed
>>> write("processed.h5", segments, raw_data=data)
>>> segments, raw = load_processed("processed.h5")
>>> sp = segments["segment0"]
>>> orbits = raw["orbits"]                 # full-span position/velocity arrays
>>> ltts = raw["segment0_ltts"]["ltts"]    # LTTs sliced to segment0
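
The mapping from segment_ids to segment names described above can be sketched as follows. This is an illustrative helper, not part of MojitoProcessor: the name select_segment_names is hypothetical, the 'segment<N>' naming pattern is taken from the examples in this reference, and silently skipping unknown indices is an assumption (the real function may raise instead).

```python
from typing import Iterable, List, Optional


def select_segment_names(
    available: Iterable[str], segment_ids: Optional[List[int]] = None
) -> List[str]:
    """Return the segment names to load (hypothetical helper).

    Assumes segments are named 'segment<N>', as in the examples above.
    """
    available = list(available)
    if segment_ids is None:
        # None means "load everything", mirroring the documented default.
        return available
    wanted = [f"segment{i}" for i in segment_ids]
    # Assumption: indices with no matching segment are skipped, not an error.
    return [name for name in wanted if name in available]


names = [f"segment{i}" for i in range(5)]
print(select_segment_names(names, [0, 3]))
```
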
MojitoProcessor.io.write.write(output_path, segments, raw_data=None, *, segment_ids=None, filter_kwargs=None, downsample_kwargs=None, trim_kwargs=None, truncate_kwargs=None, window_kwargs=None)[source]

Write processed segments and raw auxiliary data to an HDF5 file.

Return type:

None

Parameters:
  • output_path (str or Path) – Destination file path. Created (or overwritten) by this function.

  • segments (dict of SignalProcessor) – Output of process_pipeline().

  • raw_data (dict, optional) – Raw data dict returned by load_file(). When provided, noise estimates and metadata are written under /raw/, and orbit/LTT data are written per-segment under /raw/<segment_name>/.

  • segment_ids (list of int, optional) – Indices of segments to write, e.g. [0, 17] writes segment0 and segment17. None (default) writes all segments.

  • filter_kwargs (dict, optional) – Pipeline parameter dict, stored verbatim under /pipeline_params/.

  • downsample_kwargs (dict, optional) – Pipeline parameter dict, stored verbatim under /pipeline_params/.

  • trim_kwargs (dict, optional) – Pipeline parameter dict, stored verbatim under /pipeline_params/.

  • truncate_kwargs (dict, optional) – Pipeline parameter dict, stored verbatim under /pipeline_params/.

  • window_kwargs (dict, optional) – Pipeline parameter dict, stored verbatim under /pipeline_params/.

File layout

/processed/
  <segment_name>/            one group per segment
    <channel>                processed time-domain array (e.g. X, Y, Z)
    t                        time array in seconds (TCB)
    attrs: fs, dt, N, T, t0, channels

/pipeline_params/
  filter/                    attrs: highpass_cutoff, lowpass_cutoff, order, …
  downsample/                attrs: target_fs, kaiser_window
  trim/                      attrs: fraction
  truncate/                  attrs: days
  window/                    attrs: window, alpha

/raw/                        only written when raw_data is provided
  orbits/                    full-span orbit arrays
    sc_position_1            spacecraft 1 positions (n_orbit, 3) [m]
    sc_position_2
    sc_position_3
    sc_velocity_1            spacecraft 1 velocities (n_orbit, 3) [m/s]
    sc_velocity_2
    sc_velocity_3
    times                    orbit sample timestamps (TCB) [s]
  noise_estimates/
    xyz                      noise covariance cube (freq, ch, ch)
    aet                      noise covariance cube (freq, ch, ch)
  metadata/
    attrs: laser_frequency, pipeline_names
  <segment_name>/            one group per segment
    ltts/
      <link>                 light travel times sliced to segment window
      derivatives/
        <link>               LTT time-derivatives
      times                  LTT sample timestamps for this segment

LTT data are sliced to each segment's time window using the segment's t0 and time array. Segments without a valid t0 (stored as NaN) are skipped for per-segment raw data.
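
The per-segment LTT slicing mentioned in the file layout notes for write() can be illustrated with a small sketch. It assumes the slice keeps every LTT sample whose timestamp falls inside the closed interval spanned by the segment's time array, and that a NaN t0 causes the segment to be skipped. The name slice_to_window is hypothetical, and the real boundary handling may differ.

```python
import math
from bisect import bisect_left, bisect_right
from typing import List, Optional, Sequence, Tuple


def slice_to_window(
    times: Sequence[float],
    values: Sequence[float],
    t0: float,
    t_end: float,
) -> Optional[Tuple[List[float], List[float]]]:
    """Slice sorted (times, values) to [t0, t_end] (hypothetical helper)."""
    if math.isnan(t0):
        # Segments without a valid t0 are skipped.
        return None
    lo = bisect_left(times, t0)      # first index with times[i] >= t0
    hi = bisect_right(times, t_end)  # one past the last index with times[i] <= t_end
    return list(times[lo:hi]), list(values[lo:hi])


# Illustrative numbers only, not real LISA data.
ltt_times = [0.0, 10.0, 20.0, 30.0, 40.0]
ltts = [8.30, 8.31, 8.32, 8.33, 8.34]
print(slice_to_window(ltt_times, ltts, t0=10.0, t_end=30.0))
```
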

Signal Processing

class MojitoProcessor.process.sigprocess.SignalProcessor(data, fs, t0=None)[source]

Bases: object

Signal processor for multi-channel time series data.

Handles filtering, downsampling, trimming, and windowing while automatically tracking sampling parameters (fs, N, T, dt).

Parameters:
  • data (dict) – Dictionary of channel data, e.g. {'X': array, 'Y': array, 'Z': array}

  • fs (float) – Sampling frequency in Hz

  • t0 (float, optional) – Start time of the first sample in seconds. When None, the time array starts from 0.

data

Current processed data (updated after each operation). Includes a 't' key giving the time array [t0, t0+dt, ..., t0+(N-1)*dt] in seconds (or [0, dt, ..., (N-1)*dt] when t0 is None).

Type:

dict

fs

Current sampling frequency in Hz

Type:

float

N

Current number of samples per channel

Type:

int

T

Current duration in seconds

Type:

float

dt

Current sampling period in seconds

Type:

float

t

Time array [t0, t0+dt, ..., t0+(N-1)*dt] in seconds.

Type:

ndarray

channels

List of channel names

Type:

list

Example

>>> sp = SignalProcessor({'X': x_data, 'Y': y_data}, fs=4.0)
>>> filtered = sp.filter(low=1e-4, high=1.0, order=6)
>>> trimmed = sp.trim(fraction=0.02)  # Trim 2% total (1% each end)
>>> windowed = sp.apply_window(window='tukey', alpha=0.05)
>>> t = sp.data['t']   # time array [t0, t0+dt, ..., t0+(N-1)*dt]
property data: dict

Channel data as a dict, including a 't' key for the time array.

The time array is t0 + np.arange(N) * dt (seconds). When t0 is None, the array starts from 0.

Returns:

dict – All channel arrays plus 't'.

property t: ndarray

Time array [t0, t0+dt, ..., t0+(N-1)*dt] in seconds.

filter(*, low=None, high=None, order=2, filter_type='butterworth', zero_phase=True)[source]

Apply filter to all channels (auto-detects highpass/lowpass/bandpass).

The filter type is determined by which cutoff frequencies are provided:

  • Only low set: highpass filter

  • Only high set: lowpass filter

  • Both low and high set: bandpass filter

Parameters:
  • low (float, optional) – Lower cutoff frequency in Hz (highpass)

  • high (float, optional) – Upper cutoff frequency in Hz (lowpass)

  • order (int, optional) – Filter order (default: 2)

  • filter_type (str, optional) – Filter type: 'butterworth', 'chebyshev1', 'chebyshev2', 'bessel' (default: 'butterworth')

  • zero_phase (bool, optional) – Use zero-phase filtering (filtfilt) if True, else single-pass (default: True)

Return type:

Dict[str, ndarray]

Returns:

filtered_data (dict) – Dictionary of filtered channel data

Raises:

ValueError – If neither low nor high is provided

Examples

>>> # Highpass only
>>> sp.filter(low=5e-6, order=2)
>>> # Lowpass only
>>> sp.filter(high=0.1, order=2)
>>> # Bandpass
>>> sp.filter(low=1e-4, high=0.1, order=6)
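
The cutoff-based selection of the filter type can be sketched as below. This is not the library's implementation: select_band is a hypothetical name, and the check that low is smaller than high is an added assumption that this reference does not document.

```python
from typing import Optional, Tuple, Union

Cutoff = Union[float, Tuple[float, float]]


def select_band(low: Optional[float], high: Optional[float]) -> Tuple[str, Cutoff]:
    """Map (low, high) cutoffs to a band type and cutoff spec (hypothetical helper)."""
    if low is None and high is None:
        raise ValueError("At least one of low or high must be provided")
    if low is not None and high is not None:
        # Assumption: a band-pass needs low < high; not stated in the reference.
        if low >= high:
            raise ValueError("low must be smaller than high")
        return "bandpass", (low, high)
    if low is not None:
        return "highpass", low
    return "lowpass", high


print(select_band(5e-6, None))
print(select_band(None, 0.1))
print(select_band(1e-4, 0.1))
```

The returned pair has the form expected by the btype and Wn arguments of scipy.signal.butter, which is one way such a selection could be turned into filter coefficients.
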
downsample(target_fs, window=('kaiser', 31.0), padtype='line')[source]

Resample all channels to a target sampling rate using polyphase filtering.

Uses scipy.signal.resample_poly, which applies a zero-phase FIR anti-aliasing filter via polyphase decomposition. Accepts arbitrary rational resampling ratios (e.g., 4 Hz -> 3 Hz), unlike decimate, which requires an integer factor.

Parameters:
  • target_fs (float) – Desired output sampling frequency in Hz. Must be positive and less than or equal to the current sampling frequency (this method is a downsampler).

  • window (tuple or array_like, optional) – Window specification passed to scipy.signal.resample_poly for FIR anti-aliasing filter design. Default ('kaiser', 31.0); the larger beta gives stronger stopband attenuation than scipy's own default of ('kaiser', 5.0).

  • padtype (str, optional) – Edge-padding strategy. Options: 'line' (default), 'constant', 'mean', 'median', 'maximum', 'minimum'. 'line' extends the signal linearly from each end, reducing edge transients for slowly-varying data such as LISA TDI channels.

Return type:

Tuple[Dict[str, ndarray], float]

Returns:

  • resampled_data (dict) – Dictionary mapping channel names to resampled 1D arrays.

  • new_fs (float) – Actual output sampling frequency in Hz (exact rational result self.fs * up / down).

Raises:
  • ValueError – If target_fs is not positive.

  • ValueError – If target_fs exceeds the current sampling frequency.

  • ValueError – If the rational approximation of target_fs / self.fs produces up == 0.

Notes

The up/down integers are computed via:

ratio = Fraction(target_fs / self.fs).limit_denominator(10000)
up, down = ratio.numerator, ratio.denominator

Common use cases from 4 Hz source data:

  • 4 Hz -> 1 Hz: up=1, down=4

  • 4 Hz -> 0.4 Hz: up=1, down=10

  • 4 Hz -> 2 Hz: up=1, down=2

  • 4 Hz -> 3 Hz: up=3, down=4
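
The up/down computation and the listed cases can be reproduced with the standard library alone. The sketch below wraps the two lines from the Notes in a hypothetical helper, resample_factors, and adds the three documented error conditions. The error messages are illustrative, not the library's.

```python
from fractions import Fraction
from typing import Tuple


def resample_factors(
    fs: float, target_fs: float, max_denominator: int = 10000
) -> Tuple[int, int, float]:
    """Return (up, down, new_fs) for a rational resampling step (hypothetical helper)."""
    if target_fs <= 0:
        raise ValueError("target_fs must be positive")
    if target_fs > fs:
        raise ValueError("target_fs exceeds the current sampling frequency")
    # Same rational approximation as in the Notes above.
    ratio = Fraction(target_fs / fs).limit_denominator(max_denominator)
    up, down = ratio.numerator, ratio.denominator
    if up == 0:
        raise ValueError("rational approximation produced up == 0")
    return up, down, fs * up / down


for target in (1.0, 0.4, 2.0, 3.0):
    up, down, new_fs = resample_factors(4.0, target)
    print(f"4 Hz -> {target} Hz: up={up}, down={down}, new_fs={new_fs}")
```
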

Examples

>>> sp = SignalProcessor({'X': x_data, 'Y': y_data}, fs=4.0)
>>> sp.filter(low=5e-6, order=2)
>>> sp.trim(fraction=0.022)  # Trim 2.2% total
>>> resampled, new_fs = sp.downsample(target_fs=1.0)
>>> print(new_fs)   # 1.0
trim(fraction)[source]

Trim data by removing a fraction of the samples, split equally between the start and end of each channel.

Parameters:

fraction (float) – Total fraction of data to remove (e.g., 0.01 = 1%).

Return type:

Dict[str, ndarray]

Returns:

trimmed_data (dict) – Dictionary of trimmed channel data

Raises:

ValueError – If fraction is not in range [0, 1] or would remove all data

Examples

>>> # Trim 2% total (1% from each end)
>>> sp.trim(fraction=0.02)
>>> # Trim 5% total (2.5% from each end)
>>> sp.trim(fraction=0.05)
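
A minimal sketch of symmetric trimming, assuming the total fraction is split equally between both ends and the per-end sample count is rounded down. The helper name trim_both_ends and the rounding rule are assumptions; the library may round differently.

```python
from typing import List, Sequence


def trim_both_ends(samples: Sequence[float], fraction: float) -> List[float]:
    """Remove `fraction` of the samples in total, half from each end (hypothetical helper)."""
    if not 0.0 <= fraction <= 1.0:
        raise ValueError("fraction must be in [0, 1]")
    n = len(samples)
    n_each = int(n * fraction / 2)  # assumption: round down per end
    if n - 2 * n_each <= 0:
        raise ValueError("fraction would remove all data")
    return list(samples[n_each : n - n_each])


data = list(range(100))
trimmed = trim_both_ends(data, 0.02)  # 2% total, 1 sample from each end
print(len(trimmed), trimmed[0], trimmed[-1])
```
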
apply_window(window='tukey', **window_params)[source]

Apply window function to all channels.

Parameters:
  • window (str, optional) – Window type: 'tukey', 'blackmanharris', 'hann', 'hamming', 'blackman', 'planck' (default: 'tukey')

  • **window_params – Additional parameters for the window function. alpha (float, default 0.05) is accepted by 'tukey' and 'planck'. Other windows ignore extra keyword arguments (a warning is emitted).

Return type:

Dict[str, ndarray]

Returns:

windowed_data (dict) – Dictionary of windowed channel data

Examples

>>> sp.apply_window('tukey', alpha=0.05)
>>> sp.apply_window('blackmanharris')
>>> sp.apply_window('hann')
periodogram()[source]

Compute the one-sided power spectral density for each channel.

The data is assumed to have already been windowed (e.g. by apply_window()), so no additional window is applied here.

Normalisation follows Parseval’s theorem: the integral of the one-sided PSD over positive frequencies equals the mean square of the signal.

Return type:

Tuple[ndarray, Dict[str, ndarray]]

Returns:

  • freqs (ndarray) – Frequency array in Hz, shape (N//2 + 1,).

  • psds (dict) – Dictionary mapping channel names to one-sided PSD arrays (units²/Hz), each with the same shape as freqs.

Examples

>>> freqs, psds = sp.periodogram()
>>> plt.loglog(freqs[1:], psds['X'][1:])
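
The Parseval normalisation stated above can be checked numerically. The sketch below assumes the conventional scaling |X_k|² / (fs · N) with all bins except DC (and Nyquist, for even N) doubled; the library's exact scaling is not shown in this reference, and one_sided_psd is a hypothetical stand-in for periodogram().

```python
import numpy as np


def one_sided_psd(x: np.ndarray, fs: float):
    """One-sided PSD without windowing (hypothetical stand-in)."""
    n = len(x)
    spectrum = np.fft.rfft(x)
    psd = np.abs(spectrum) ** 2 / (fs * n)
    # Fold negative frequencies in: double every bin except DC and,
    # for even n, the Nyquist bin, which have no mirrored partner.
    if n % 2 == 0:
        psd[1:-1] *= 2.0
    else:
        psd[1:] *= 2.0
    freqs = np.fft.rfftfreq(n, d=1.0 / fs)
    return freqs, psd


rng = np.random.default_rng(0)
fs = 4.0
x = rng.standard_normal(1024)
freqs, psd = one_sided_psd(x, fs)
df = fs / len(x)
# Rectangle-rule integral of the PSD versus the mean square of the signal.
print(np.sum(psd) * df, np.mean(x**2))
```
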
fft()[source]

Compute the one-sided complex FFT spectrum for each channel.

The data is assumed to have already been windowed (e.g. by apply_window()), so no additional window is applied here. Returns raw complex amplitudes from numpy.fft.rfft.

Return type:

Tuple[ndarray, Dict[str, ndarray]]

Returns:

  • freqs (ndarray) – Frequency array in Hz, shape (N//2 + 1,).

  • ffts (dict) – Dictionary mapping channel names to complex FFT arrays, each with shape (N//2 + 1,).

Examples

>>> freqs, ffts = sp.fft()
>>> plt.loglog(freqs[1:], np.abs(ffts['X'][1:]))
to_aet()[source]

Transform XYZ Michelson channels to noise-orthogonal AET channels.

Uses the standard equal-arm combination:

A = (Z - X) / sqrt(2)
E = (X - 2Y + Z) / sqrt(6)
T = (X + Y + Z) / sqrt(3)

Returns a new SignalProcessor with channels ['A', 'E', 'T'], inheriting fs, t0, and all derived parameters from the original.

Return type:

SignalProcessor

Returns:

SignalProcessor – New processor with AET channel data.

Raises:

ValueError – If any of the channels 'X', 'Y', 'Z' are missing.

Examples

>>> sp_xyz = processed_segments['segment0']
>>> sp_aet = sp_xyz.to_aet()
>>> freqs, psds = sp_aet.periodogram()
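
The equal-arm combination above can be written directly with NumPy. This standalone sketch uses a hypothetical function xyz_to_aet rather than the SignalProcessor method. It also checks two properties that follow from the formulas: a signal common to X, Y, and Z appears only in T, and the transform preserves the summed power because its coefficient rows are orthonormal.

```python
import numpy as np


def xyz_to_aet(x: np.ndarray, y: np.ndarray, z: np.ndarray):
    """Apply the equal-arm XYZ -> AET combination (hypothetical standalone version)."""
    a = (z - x) / np.sqrt(2.0)
    e = (x - 2.0 * y + z) / np.sqrt(6.0)
    t = (x + y + z) / np.sqrt(3.0)
    return a, e, t


rng = np.random.default_rng(1)
common = rng.standard_normal(256)

# A signal common to all three channels cancels in A and E.
a, e, t = xyz_to_aet(common, common, common)
print(np.allclose(a, 0.0), np.allclose(e, 0.0), np.allclose(t, np.sqrt(3.0) * common))

# Independent channels: total power is preserved by the transform.
x, y, z = rng.standard_normal((3, 256))
a2, e2, t2 = xyz_to_aet(x, y, z)
print(np.isclose(np.sum(x**2 + y**2 + z**2), np.sum(a2**2 + e2**2 + t2**2)))
```
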
get_params()[source]

Get current signal parameters.

Return type:

dict

Returns:

params (dict) – Dictionary containing fs, N, T, dt, and channels

MojitoProcessor.process.sigprocess.process_pipeline(data, channels=None, *, filter_kwargs=None, downsample_kwargs=None, trim_kwargs=None, truncate_kwargs=None, window_kwargs=None)[source]

Run the full TDI data processing pipeline on a loaded Mojito L1 data dict.

Applies the following steps in order:

  1. Filter — band-pass (if lowpass_cutoff given) or high-pass only

  2. Downsample — polyphase resampling to target_fs (optional)

  3. Trim — removes edge artefacts introduced by the filter from both ends

  4. Truncate — splits the processed data into non-overlapping segments of truncate_kwargs['days'] days

  5. Window — tapers edges to reduce spectral leakage

Pipeline progress is emitted at logging.INFO level via the MojitoUtils.SigProcessing logger.

Parameters:
  • data (dict) – Loaded LISA L1 data dict (from load_file()). Must contain data['tdis'] (channel arrays) and data['fs'] (sampling rate in Hz).

  • channels (list of str, optional) – TDI channels to process. Default ['X', 'Y', 'Z'].

  • filter_kwargs (dict, optional) – Filter parameters. Keys:

    • highpass_cutoff (float): High-pass cutoff in Hz (default: 5e-6)

    • lowpass_cutoff (float, optional): Low-pass cutoff for band-pass

    • order (int): Filter order (default: 2)

    • filter_type (str): Filter type (default: 'butterworth')

  • downsample_kwargs (dict, optional) – Downsampling parameters. Keys:

    • target_fs (float): Target sampling rate in Hz

    • kaiser_window (float): Kaiser window beta parameter (default: 31.0)

  • trim_kwargs (dict, optional) – Trimming parameters. Omit (or pass None) to skip trimming. Keys:

    • fraction (float): Fraction to trim from each end (default: 0.0)

  • truncate_kwargs (dict, optional) – Segmentation parameters. The dataset is split into non-overlapping segments of the given length, and each segment is independently windowed. Set to None to disable segmentation (returns a single segment with the full dataset). Note: remainder samples shorter than a full segment are discarded. Keys:

    • days (float): Segment length in days (default: 4.0)

  • window_kwargs (dict, optional) – Windowing parameters. Omit (or pass None) to skip windowing. Keys:

    • window (str): Window type, e.g. 'tukey', 'hann' (default: 'tukey')

    • alpha (float): Taper fraction for Tukey window (default: 0.025)

Return type:

dict

Returns:

segments (dict of SignalProcessor) – Dictionary mapping segment names ('segment0', 'segment1', …) to SignalProcessor objects. Each segment contains windowed data ready for FFT analysis. Access via segments['segment0'].data, segments['segment0'].fs, etc.
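
The segmentation behaviour described for truncate_kwargs (non-overlapping segments, remainder discarded, keys 'segment0', 'segment1', …) can be sketched in plain Python. The helper split_into_segments is hypothetical, and rounding the per-segment sample count to the nearest integer is an assumption.

```python
from typing import Dict, List, Sequence

SECONDS_PER_DAY = 86400.0


def split_into_segments(
    samples: Sequence[float], fs: float, days: float
) -> Dict[str, List[float]]:
    """Split samples into non-overlapping segments of `days` length (hypothetical helper)."""
    # Assumption: the per-segment sample count is rounded to the nearest integer.
    n_per_segment = int(round(days * SECONDS_PER_DAY * fs))
    if n_per_segment <= 0:
        raise ValueError("segment length must cover at least one sample")
    n_segments = len(samples) // n_per_segment  # remainder samples are discarded
    return {
        f"segment{i}": list(samples[i * n_per_segment : (i + 1) * n_per_segment])
        for i in range(n_segments)
    }


# 1000 samples at 0.01 Hz with half-day segments: 432 samples per segment.
segments = split_into_segments(list(range(1000)), fs=0.01, days=0.5)
print({name: len(seg) for name, seg in segments.items()})
```
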

Pipelines

MojitoProcessor.pipelines.read_and_process.read_and_process(path, channels=None, *, load_days=None, filter_kwargs=None, downsample_kwargs=None, trim_kwargs=None, truncate_kwargs=None, window_kwargs=None, output_path=None, segment_ids=None)[source]

Load a Mojito L1 file and run the full processing pipeline in one call.

This is a thin wrapper around load_file() and process_pipeline(). When output_path is provided the result is also written to an HDF5 file via write().

Parameters:
  • path (str or Path) – Path to the Mojito L1 .h5 input file.

  • channels (list of str, optional) – TDI channels to process. Default ['X', 'Y', 'Z'].

  • load_days (float, optional) – Number of days to load from the file (lazy slicing). None loads the full dataset.

  • filter_kwargs (dict, optional) – Passed to process_pipeline().

  • downsample_kwargs (dict, optional) – Passed to process_pipeline().

  • trim_kwargs (dict, optional) – Passed to process_pipeline().

  • truncate_kwargs (dict, optional) – Passed to process_pipeline().

  • window_kwargs (dict, optional) – Passed to process_pipeline().

  • output_path (str or Path, optional) – If given, write processed segments and raw auxiliary data to this HDF5 file.

  • segment_ids (list of int, optional) – Indices of segments to write to output_path, e.g. [0, 3]. None (default) writes all segments. Ignored when output_path is None.

Return type:

Dict[str, SignalProcessor]

Returns:

segments (dict of SignalProcessor) – Processed segments keyed by 'segment0', 'segment1', etc.
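
A hedged usage sketch for read_and_process(), built only from the keyword names documented in this reference. The helper names build_pipeline_kwargs and run, the parameter values, and any file paths passed in are illustrative assumptions. The import is deferred so the snippet can be read and the kwargs inspected without the package installed.

```python
from typing import Any, Dict


def build_pipeline_kwargs(
    target_fs: float = 0.4, days: float = 4.0
) -> Dict[str, Dict[str, Any]]:
    """Assemble the keyword dicts documented for process_pipeline() (hypothetical helper)."""
    return {
        "filter_kwargs": {
            "highpass_cutoff": 5e-6,
            "order": 2,
            "filter_type": "butterworth",
        },
        "downsample_kwargs": {"target_fs": target_fs, "kaiser_window": 31.0},
        "trim_kwargs": {"fraction": 0.02},
        "truncate_kwargs": {"days": days},
        "window_kwargs": {"window": "tukey", "alpha": 0.025},
    }


def run(path: str, output_path: str):
    """Load, process, and write in one call; requires MojitoProcessor to be installed."""
    # Deferred import: module path taken from the signature documented above.
    from MojitoProcessor.pipelines.read_and_process import read_and_process

    return read_and_process(path, output_path=output_path, **build_pipeline_kwargs())


kwargs = build_pipeline_kwargs()
print(sorted(kwargs))
```
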