API Reference

MojitoProcessor provides utilities for loading, processing, and writing LISA Mojito L1 data. Reading the raw files is delegated to the mojito package.

I/O

MojitoProcessor.io.read.load_file(paths, *, load_days=None)

Load a raw Mojito L1 HDF5 file.

Uses the mojito package to open the file and extracts TDI observables, light travel times, spacecraft orbits, noise estimates, and metadata into a flat dictionary suitable for process_pipeline().

Parameters:

paths (str, Path, or list thereof) – Path(s) to the Mojito L1 .h5 file(s).
load_days (float, optional) – Number of days to load from the start of the file (lazy slicing). None loads the full dataset.

Return type:

dict

Returns:

data (dict) – Dictionary containing:

tdis — dict of TDI channel arrays (X, Y, Z, A, E, T)
fs, dt, t_tdi — TDI sampling parameters and timestamps
ltts, ltt_derivatives, ltt_times — light travel times
orbits, velocities, orbit_times — spacecraft kinematics
noise_estimates — frequency-domain noise covariance cubes
metadata — laser frequency and pipeline names

MojitoProcessor.io.read.load_processed(path, *, segment_ids=None)

Load processed segments from an HDF5 file written by write().

Reconstructs each SignalProcessor from the stored channel arrays and metadata attributes. Auxiliary data written under /raw/ (full-span orbits, noise estimates, metadata, and the per-segment LTT data under /raw/<segment_name>/) are returned alongside the segments in a raw data dict.

Parameters:

path (str or Path) – Path to a .h5 file previously written by write().
segment_ids (list of int, optional) – Indices of segments to load, e.g. [100, 101, 102] loads only segment100, segment101, and segment102. None (default) loads all segments.
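
The index-to-name mapping described for segment_ids can be sketched as follows. This is a minimal illustration, not the library's code: select_segment_names is a hypothetical helper, and it assumes that segment names follow the 'segment<index>' pattern used in this reference and that unknown indices are silently skipped.

```python
from typing import List, Optional


def select_segment_names(
    available: List[str], segment_ids: Optional[List[int]] = None
) -> List[str]:
    """Return the names in ``available`` selected by ``segment_ids``.

    Hypothetical helper: assumes 'segment<index>' naming and that
    indices with no matching segment are skipped.
    """
    if segment_ids is None:
        return list(available)  # None selects every segment
    wanted = {f"segment{i}" for i in segment_ids}
    # Preserve the order in which segments appear in ``available``.
    return [name for name in available if name in wanted]


names = [f"segment{i}" for i in range(99, 104)]
selected = select_segment_names(names, [100, 101, 102])
print(selected)
```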

Return type:

tuple[Dict[str, SignalProcessor], dict]

Returns:

segments (dict of SignalProcessor) – Dictionary mapping segment names to reconstructed SignalProcessor objects.
raw_data (dict) – Auxiliary data dict. Top-level keys:
- 'orbits' — per-spacecraft position/velocity arrays (full span)
- 'noise_estimates' — dict of noise covariance arrays (if present)
- 'metadata' — dict with laser_frequency / pipeline_names (if present)
- '<seg_name>_ltts' — one entry per segment containing LTT arrays: ltts, ltt_derivatives (if present), ltt_times

Raises:

ValueError – If the file does not contain a /processed group, indicating it was not written by write().

Examples

>>> from MojitoProcessor import write, load_processed
>>> write("processed.h5", segments, raw_data=data)
>>> segments, raw = load_processed("processed.h5")
>>> sp = segments["segment0"]
>>> orbits = raw["orbits"]
>>> seg0_ltts = raw["segment0_ltts"]

MojitoProcessor.io.write.write(output_path, segments, raw_data=None, *, segment_ids=None, filter_kwargs=None, downsample_kwargs=None, trim_kwargs=None, truncate_kwargs=None, window_kwargs=None)

Write processed segments and raw auxiliary data to an HDF5 file.

Return type:

None

Parameters:

output_path (str | Path)
segments (Dict[str, SignalProcessor])
raw_data (dict | None)
segment_ids (list | None)
filter_kwargs (dict | None)
downsample_kwargs (dict | None)
trim_kwargs (dict | None)
truncate_kwargs (dict | None)
window_kwargs (dict | None)

File layout

/processed/
    <segment_name>/      one group per segment
        <channel>        processed time-domain array (e.g. X, Y, Z)
        t                time array in seconds (TCB)
        attrs: fs, dt, N, T, t0, channels

/pipeline_params/
    filter/              attrs: highpass_cutoff, lowpass_cutoff, order, …
    downsample/          attrs: target_fs, kaiser_window
    trim/                attrs: fraction
    truncate/            attrs: days
    window/              attrs: window, alpha

/raw/                    only written when raw_data is provided
    orbits/              full-span orbit arrays
        sc_position_1    spacecraft 1 positions (n_orbit, 3) [m]
        sc_position_2
        sc_position_3
        sc_velocity_1    spacecraft 1 velocities (n_orbit, 3) [m/s]
        sc_velocity_2
        sc_velocity_3
        times            orbit sample timestamps (TCB) [s]
    noise_estimates/
        xyz              noise covariance cube (freq, ch, ch)
        aet              noise covariance cube (freq, ch, ch)
    metadata/
        attrs: laser_frequency, pipeline_names
    <segment_name>/      one group per segment
        ltts/
            <link>       LTT time-derivatives
        times            LTT sample timestamps for this segment

LTT data are sliced to each segment’s time window using the segment’s t0 and time array. Segments without a valid t0 (stored as NaN) are skipped for per-segment raw data.
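
The per-segment slicing described above can be sketched with the standard library. This is an illustration under stated assumptions, not the library's implementation: ltt_slice_for_segment is a hypothetical helper, the LTT timestamps are assumed to be sorted, and the segment window is assumed to be the closed interval from the first to the last entry of the segment's time array.

```python
import math
from bisect import bisect_left, bisect_right
from typing import List, Optional, Tuple


def ltt_slice_for_segment(
    ltt_times: List[float], seg_t: List[float], t0: float
) -> Optional[Tuple[int, int]]:
    """Return (start, stop) indices into ``ltt_times`` covering one segment.

    Assumptions (not confirmed by the source): ``ltt_times`` is sorted and
    the window is the closed interval [seg_t[0], seg_t[-1]].
    """
    if math.isnan(t0) or not seg_t:
        return None  # no valid t0: skip per-segment raw data
    start = bisect_left(ltt_times, seg_t[0])
    stop = bisect_right(ltt_times, seg_t[-1])
    return start, stop


ltt_times = [0.0, 10.0, 20.0, 30.0, 40.0, 50.0]
seg_t = [20.0 + 0.25 * i for i in range(81)]  # 20 s to 40 s sampled at 4 Hz
window = ltt_slice_for_segment(ltt_times, seg_t, t0=20.0)
skipped = ltt_slice_for_segment(ltt_times, seg_t, t0=float("nan"))
print(window, skipped)
```

The returned pair can be used directly as a slice, e.g. ltt_times[start:stop], and the same indices would apply to each per-link array sampled on those timestamps.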

output_path (str or Path) – Destination file path. Created (or overwritten) by this function.
segments (dict of SignalProcessor) – Output of process_pipeline().
raw_data (dict, optional) – Raw data dict returned by load_file(). When provided, orbits, noise estimates, and metadata are written under /raw/, and LTT data are written per segment under /raw/<segment_name>/.
segment_ids (list of int, optional) – Indices of segments to write, e.g. [0, 17] writes segment0 and segment17. None (default) writes all segments.
filter_kwargs, downsample_kwargs, trim_kwargs, truncate_kwargs, window_kwargs (dict, optional) – Pipeline parameter dicts, stored verbatim under /pipeline_params/.

Signal Processing

class MojitoProcessor.process.sigprocess.SignalProcessor(data, fs, t0=None)

Bases: object

Signal processor for multi-channel time series data.

Handles filtering, downsampling, trimming, and windowing while automatically tracking sampling parameters (fs, N, T, dt).

Parameters:

data (dict) – Dictionary of channel data, e.g., {'X': array, 'Y': array, 'Z': array}
fs (float) – Sampling frequency in Hz
t0 (float | None) – Time of the first sample in seconds. When None, the time array starts from 0.

data

Current processed data (updated after each operation). Includes a 't' key giving the time array [t0, t0+dt, ..., t0+(N-1)*dt] in seconds (or [0, dt, ..., (N-1)*dt] when t0 is None).

Type: dict

fs

Current sampling frequency in Hz

Type: float

N

Current number of samples per channel

Type: int

T

Current duration in seconds

Type: float

dt

Current sampling period in seconds

Type: float

t

Time array [t0, t0+dt, ..., t0+(N-1)*dt] in seconds.

Type: ndarray

channels

List of channel names

Type: list

Example

>>> sp = SignalProcessor({'X': x_data, 'Y': y_data}, fs=4.0)
>>> filtered = sp.filter(low=1e-4, high=1.0, order=6)
>>> trimmed = sp.trim(fraction=0.02)  # Trim 2% total (1% each end)
>>> windowed = sp.apply_window(window='tukey', alpha=0.05)
>>> t = sp.data['t']   # time array [t0, t0+dt, ..., t0+(N-1)*dt]

property data: dict

Channel data as a dict, including a 't' key for the time array.

The time array is t0 + np.arange(N) * dt (seconds). When t0 is None, the array starts from 0.

Returns: dict – All channel arrays plus 't'.

property t: ndarray

Time array [t0, t0+dt, ..., t0+(N-1)*dt] in seconds.
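
The relations between the tracked sampling parameters can be sketched in plain Python. The helper sampling_params is hypothetical; dt = 1 / fs and the time-array formula follow this reference, while T = N * dt is an assumption about how the duration is defined.

```python
from typing import Dict, List, Optional, Union


def sampling_params(
    n_samples: int, fs: float, t0: Optional[float] = None
) -> Dict[str, Union[int, float, List[float]]]:
    """Derive dt, T and the time array from N and fs.

    Hypothetical helper: dt = 1 / fs, t[k] = t0 + k * dt (t0 = 0 when None).
    T = N * dt is an assumption, not confirmed by the source.
    """
    dt = 1.0 / fs
    start = 0.0 if t0 is None else t0
    t = [start + k * dt for k in range(n_samples)]
    return {"fs": fs, "N": n_samples, "dt": dt, "T": n_samples * dt, "t": t}


params = sampling_params(8, fs=4.0, t0=100.0)
print(params["dt"], params["T"], params["t"][0], params["t"][-1])
```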

filter(*, low=None, high=None, order=2, filter_type='butterworth', zero_phase=True)

Apply a filter to all channels. The band type is determined automatically from the cutoff frequencies provided:

- Only low set: highpass filter
- Only high set: lowpass filter
- Both low and high set: bandpass filter

Parameters:

low (float, optional) – Lower cutoff frequency in Hz (highpass)
high (float, optional) – Upper cutoff frequency in Hz (lowpass)
order (int, optional) – Filter order (default: 2)
filter_type (str, optional) – Filter type: 'butterworth', 'chebyshev1', 'chebyshev2', 'bessel' (default: 'butterworth')
zero_phase (bool, optional) – Use zero-phase filtering (filtfilt) if True, else single-pass (default: True)

Return type:

Dict[str, ndarray]

Returns:

filtered_data (dict) – Dictionary of filtered channel data

Raises:

ValueError – If neither low nor high is provided
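
The band selection described above can be sketched as a small pure function. resolve_band is a hypothetical name used only for illustration; the returned strings match the btype values accepted by SciPy's filter-design functions, but whether the library passes them through in this form is an assumption.

```python
from typing import Optional, Tuple, Union

Cutoff = Union[float, Tuple[float, float]]


def resolve_band(low: Optional[float], high: Optional[float]) -> Tuple[str, Cutoff]:
    """Map the (low, high) cutoffs to a band type and its cutoff value(s)."""
    if low is None and high is None:
        raise ValueError("At least one of 'low' or 'high' must be provided")
    if low is not None and high is not None:
        return "bandpass", (low, high)
    if low is not None:
        return "highpass", low  # only the lower cutoff was given
    return "lowpass", high  # only the upper cutoff was given


print(resolve_band(5e-6, None))
print(resolve_band(None, 0.1))
print(resolve_band(1e-4, 0.1))
```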

Examples

>>> # Highpass only
>>> sp.filter(low=5e-6, order=2)
>>> # Lowpass only
>>> sp.filter(high=0.1, order=2)
>>> # Bandpass
>>> sp.filter(low=1e-4, high=0.1, order=6)

downsample(target_fs, window=('kaiser', 31.0), padtype='line')

Resample all channels to a target sampling rate using polyphase filtering.

Uses scipy.signal.resample_poly which applies a zero-phase FIR anti-aliasing filter via polyphase decomposition. Accepts arbitrary rational target rates (e.g., 4 Hz -> 0.4 Hz), unlike decimate which requires an integer factor.

Parameters:

target_fs (float) – Desired output sampling frequency in Hz. Must be positive and less than or equal to the current sampling frequency (this method is a downsampler).
window (tuple or array_like, optional) – Window specification passed to scipy.signal.resample_poly for FIR anti-aliasing filter design. The default ('kaiser', 31.0) uses a larger Kaiser beta than scipy's own default of ('kaiser', 5.0), trading a wider transition band for stronger stopband attenuation.
padtype (str, optional) – Edge-padding strategy. Options: 'line' (default), 'constant', 'mean', 'median', 'maximum', 'minimum'. 'line' extends the signal linearly from each end, reducing edge transients for slowly-varying data such as LISA TDI channels.

Return type:

Tuple[Dict[str, ndarray], float]

Returns:

resampled_data (dict) – Dictionary mapping channel names to resampled 1D arrays.
new_fs (float) – Actual output sampling frequency in Hz (exact rational result self.fs * up / down).

Raises:

ValueError – If target_fs is not positive.
ValueError – If target_fs exceeds the current sampling frequency.
ValueError – If the rational approximation of target_fs / self.fs produces up == 0.

Notes

The up/down integers are computed via:

ratio = Fraction(target_fs / self.fs).limit_denominator(10000)
up, down = ratio.numerator, ratio.denominator

Common use cases from 4 Hz source data:

4 Hz -> 1 Hz: up=1, down=4
4 Hz -> 0.4 Hz: up=1, down=10
4 Hz -> 2 Hz: up=1, down=2
4 Hz -> 3 Hz: up=3, down=4
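
The factors listed above can be reproduced with the standard library alone. resample_factors is a hypothetical helper that wraps the two lines from the Notes and adds the checks listed under Raises; the error messages are illustrative.

```python
from fractions import Fraction
from typing import Tuple


def resample_factors(fs: float, target_fs: float, max_den: int = 10000) -> Tuple[int, int]:
    """Return (up, down) such that fs * up / down approximates target_fs."""
    if target_fs <= 0:
        raise ValueError("target_fs must be positive")
    if target_fs > fs:
        raise ValueError("target_fs must not exceed the current sampling frequency")
    ratio = Fraction(target_fs / fs).limit_denominator(max_den)
    if ratio.numerator == 0:
        raise ValueError("target_fs / fs is too small to approximate")
    return ratio.numerator, ratio.denominator


factors = {target: resample_factors(4.0, target) for target in (1.0, 0.4, 2.0, 3.0)}
for target, (up, down) in factors.items():
    print(f"4 Hz -> {target} Hz: up={up}, down={down}")
```

Limiting the denominator is what turns the inexact float 0.4 / 4.0 into the clean ratio 1/10 rather than a fraction with a very large denominator.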

Examples

>>> sp = SignalProcessor({'X': x_data, 'Y': y_data}, fs=4.0)
>>> sp.filter(low=5e-6, order=2)
>>> sp.trim(fraction=0.022)  # Trim 2.2% total
>>> resampled, new_fs = sp.downsample(target_fs=1.0)
>>> print(new_fs)   # 1.0

trim(fraction)

Trim data by removing a fraction of the samples, split between the start and the end of each channel.

Parameters: fraction (float) – Total fraction of data to remove (e.g., 0.01 = 1%).
Return type: Dict[str, ndarray]
Returns: trimmed_data (dict) – Dictionary of trimmed channel data
Raises: ValueError – If fraction is not in range [0, 1] or would remove all data
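
A minimal sketch of the index arithmetic behind a symmetric trim follows. trim_bounds is a hypothetical helper, and two details are assumptions rather than documented behaviour: the total fraction is split equally between the two ends, and the per-end sample count is rounded down.

```python
from typing import Tuple


def trim_bounds(n_samples: int, fraction: float) -> Tuple[int, int]:
    """Return (start, stop) slice bounds after trimming ``fraction`` in total.

    Assumes an equal split between both ends with floor rounding.
    """
    if not 0.0 <= fraction <= 1.0:
        raise ValueError("fraction must be in the range [0, 1]")
    n_each = int(n_samples * fraction / 2)  # samples removed from each end
    start, stop = n_each, n_samples - n_each
    if stop <= start:
        raise ValueError("fraction would remove all data")
    return start, stop


bounds = trim_bounds(1000, 0.02)  # 2% total: 10 samples from each end
print(bounds)
```

A channel array x would then be trimmed as x[start:stop].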

Examples

>>> # Trim 2% total (1% from each end)
>>> sp.trim(fraction=0.02)
>>> # Trim 5% total (2.5% from each end)
>>> sp.trim(fraction=0.05)

apply_window(window='tukey', **window_params)

Apply window function to all channels.

Parameters:

window (str, optional) – Window type: 'tukey', 'blackmanharris', 'hann', 'hamming', 'blackman', 'planck' (default: 'tukey')
**window_params – Additional parameters for the window function. alpha (float, default 0.05) is accepted by 'tukey' and 'planck'. Other windows ignore extra keyword arguments (a warning is emitted).

Return type:

Dict[str, ndarray]

Returns:

windowed_data (dict) – Dictionary of windowed channel data

Examples

>>> sp.apply_window('tukey', alpha=0.05)
>>> sp.apply_window('blackmanharris')
>>> sp.apply_window('hann')

periodogram()

Compute the one-sided power spectral density for each channel.

The data is assumed to have already been windowed (e.g. by apply_window()), so no additional window is applied here.

Normalisation follows Parseval’s theorem: the integral of the one-sided PSD over positive frequencies equals the mean square of the signal.

Return type:

Tuple[ndarray, Dict[str, ndarray]]

Returns:

freqs (ndarray) – Frequency array in Hz, shape (N//2 + 1,).
psds (dict) – Dictionary mapping channel names to one-sided PSD arrays (units²/Hz), each with the same shape as freqs.
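
The Parseval normalisation stated above can be checked numerically with a dependency-free sketch. one_sided_psd is a hypothetical helper that uses a naive O(N²) DFT instead of numpy.fft.rfft; the scaling shown is the conventional one that satisfies the stated property, and it is an assumption that the library uses exactly this form.

```python
import cmath
import math
from typing import List, Tuple


def one_sided_psd(x: List[float], fs: float) -> Tuple[List[float], List[float]]:
    """Naive DFT periodogram scaled so that sum(psd) * df equals mean(x**2)."""
    n = len(x)
    n_freqs = n // 2 + 1
    freqs = [k * fs / n for k in range(n_freqs)]
    psd = []
    for k in range(n_freqs):
        xk = sum(x[j] * cmath.exp(-2j * math.pi * k * j / n) for j in range(n))
        p = abs(xk) ** 2 / (fs * n)
        # Double every bin except DC and (for even N) Nyquist, which folds
        # the negative-frequency power into the one-sided spectrum.
        if k != 0 and not (n % 2 == 0 and k == n // 2):
            p *= 2.0
        psd.append(p)
    return freqs, psd


fs = 4.0
x = [math.sin(2 * math.pi * 0.5 * j / fs) + 0.3 for j in range(16)]
freqs, psd = one_sided_psd(x, fs)
df = fs / len(x)  # frequency resolution in Hz
integral = sum(psd) * df
mean_square = sum(v * v for v in x) / len(x)
print(integral, mean_square)
```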

Examples

>>> freqs, psds = sp.periodogram()
>>> plt.loglog(freqs[1:], psds['X'][1:])

fft()

Compute the one-sided complex FFT spectrum for each channel.

The data is assumed to have already been windowed (e.g. by apply_window()), so no additional window is applied here. Returns raw complex amplitudes from numpy.fft.rfft.

Return type:

Tuple[ndarray, Dict[str, ndarray]]

Returns:

freqs (ndarray) – Frequency array in Hz, shape (N//2 + 1,).
ffts (dict) – Dictionary mapping channel names to complex FFT arrays, each with shape (N//2 + 1,).

Examples

>>> freqs, ffts = sp.fft()
>>> plt.loglog(freqs[1:], np.abs(ffts['X'][1:]))

to_aet()

Transform XYZ Michelson channels to noise-orthogonal AET channels.

Uses the standard equal-arm combination:

A = (Z - X) / sqrt(2)
E = (X - 2Y + Z) / sqrt(6)
T = (X + Y + Z) / sqrt(3)

Returns a new SignalProcessor with channels ['A', 'E', 'T'], inheriting fs, t0, and all derived parameters from the original.

Return type: SignalProcessor
Returns: SignalProcessor – New processor with AET channel data.
Raises: ValueError – If any of the channels 'X', 'Y', 'Z' are missing.
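
The equal-arm combination above can be written out sample by sample. xyz_to_aet is a hypothetical stand-alone function operating on plain lists, not the method itself, which returns a SignalProcessor.

```python
import math
from typing import Dict, List


def xyz_to_aet(data: Dict[str, List[float]]) -> Dict[str, List[float]]:
    """Apply the A, E, T combinations to equal-length X, Y, Z channels."""
    missing = [ch for ch in ("X", "Y", "Z") if ch not in data]
    if missing:
        raise ValueError(f"Missing channels: {missing}")
    x, y, z = data["X"], data["Y"], data["Z"]
    a = [(zi - xi) / math.sqrt(2) for xi, zi in zip(x, z)]
    e = [(xi - 2 * yi + zi) / math.sqrt(6) for xi, yi, zi in zip(x, y, z)]
    t = [(xi + yi + zi) / math.sqrt(3) for xi, yi, zi in zip(x, y, z)]
    return {"A": a, "E": e, "T": t}


aet = xyz_to_aet({"X": [1.0, 0.0], "Y": [1.0, 0.0], "Z": [1.0, 2.0]})
print(aet)
```

For the first sample, where X = Y = Z, both A and E vanish and only T is non-zero, which is the expected behaviour for a signal common to all three Michelson channels.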

Examples

>>> sp_xyz = processed_segments['segment0']
>>> sp_aet = sp_xyz.to_aet()
>>> freqs, psds = sp_aet.periodogram()

get_params()

Get current signal parameters.

Return type: dict
Returns: params (dict) – Dictionary containing fs, N, T, dt, and channels

MojitoProcessor.process.sigprocess.process_pipeline(data, channels=None, *, filter_kwargs=None, downsample_kwargs=None, trim_kwargs=None, truncate_kwargs=None, window_kwargs=None)

Run the full TDI data processing pipeline on a loaded Mojito L1 data dict.

Applies the following steps in order:

1. Filter: band-pass (if lowpass_cutoff given) or high-pass only
2. Downsample: polyphase resampling to target_fs (optional)
3. Trim: removes edge artefacts introduced by the filter from both ends
4. Truncate: splits the processed data into non-overlapping segments of truncate_kwargs['days'] days
5. Window: tapers the edges of each segment to reduce spectral leakage

Pipeline progress is emitted at logging.INFO level via the MojitoUtils.SigProcessing logger.

Parameters:

data (dict) – Loaded LISA L1 data dict (from load_file()). Must contain data['tdis'] (channel arrays) and data['fs'] (sampling rate in Hz).
channels (list of str, optional) – TDI channels to process. Default ['X', 'Y', 'Z'].
filter_kwargs (dict, optional) – Filter parameters. Keys:
- highpass_cutoff (float): High-pass cutoff in Hz (default: 5e-6)
- lowpass_cutoff (float, optional): Low-pass cutoff for band-pass
- order (int): Filter order (default: 2)
- filter_type (str): Filter type (default: 'butterworth')
downsample_kwargs (dict, optional) – Downsampling parameters. Keys:
- target_fs (float): Target sampling rate in Hz
- kaiser_window (float): Kaiser window beta parameter (default: 31.0)
trim_kwargs (dict, optional) – Trimming parameters. Omit (or pass None) to skip trimming. Keys:
- fraction (float): Total fraction of samples to trim, split between both ends as in SignalProcessor.trim() (default: 0.0)
truncate_kwargs (dict, optional) – Segmentation parameters. The dataset is split into non-overlapping segments of the given length, and each segment is windowed independently. Remainder samples shorter than a full segment are discarded. Set to None to disable segmentation (returns a single segment with the full dataset). Keys:
- days (float): Segment length in days (default: 4.0)
window_kwargs (dict, optional) – Windowing parameters. Omit (or pass None) to skip windowing. Keys:
- window (str): Window type, e.g. 'tukey', 'hann' (default: 'tukey')
- alpha (float): Taper fraction for Tukey window (default: 0.025)
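
The segmentation controlled by truncate_kwargs can be sketched as index arithmetic. segment_bounds is a hypothetical helper; rounding the segment length to the nearest sample is an assumption, while the 'segment<index>' naming and the discarded remainder follow this reference.

```python
from typing import Dict, Tuple

SECONDS_PER_DAY = 86400.0


def segment_bounds(n_samples: int, fs: float, days: float) -> Dict[str, Tuple[int, int]]:
    """Split n_samples into non-overlapping segments of ``days`` days each.

    Returns a dict mapping segment names to (start, stop) sample indices.
    Samples left over after the last full segment are discarded.
    """
    seg_len = int(round(days * SECONDS_PER_DAY * fs))  # assumed rounding
    if seg_len <= 0:
        raise ValueError("segment length must be at least one sample")
    n_segments = n_samples // seg_len
    return {
        f"segment{i}": (i * seg_len, (i + 1) * seg_len) for i in range(n_segments)
    }


# 10 days of data at 0.4 Hz, cut into 4-day segments.
bounds = segment_bounds(n_samples=345600, fs=0.4, days=4.0)
print(bounds)
```

With these numbers two full segments fit, and the final two days of data fall into the discarded remainder.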

Return type:

dict

Returns:

segments (dict of SignalProcessor) – Dictionary mapping segment names ('segment0', 'segment1', …) to SignalProcessor objects. Each segment contains windowed data ready for FFT analysis. Access via segments['segment0'].data, segments['segment0'].fs, etc.

Pipelines

MojitoProcessor.pipelines.read_and_process.read_and_process(path, channels=None, *, load_days=None, filter_kwargs=None, downsample_kwargs=None, trim_kwargs=None, truncate_kwargs=None, window_kwargs=None, output_path=None, segment_ids=None)

Load a Mojito L1 file and run the full processing pipeline in one call.

This is a thin wrapper around load_file() and process_pipeline(). When output_path is provided the result is also written to an HDF5 file via write().

Parameters:

path (str or Path) – Path to the Mojito L1 .h5 input file.
channels (list of str, optional) – TDI channels to process. Default ['X', 'Y', 'Z'].
load_days (float, optional) – Number of days to load from the file (lazy slicing). None loads the full dataset.
filter_kwargs (dict, optional) – Passed to process_pipeline().
downsample_kwargs (dict, optional) – Passed to process_pipeline().
trim_kwargs (dict, optional) – Passed to process_pipeline().
truncate_kwargs (dict, optional) – Passed to process_pipeline().
window_kwargs (dict, optional) – Passed to process_pipeline().
output_path (str or Path, optional) – If given, write processed segments and raw auxiliary data to this HDF5 file.
segment_ids (list of int, optional) – Indices of segments to write to output_path, e.g. [0, 3]. None (default) writes all segments. Ignored when output_path is None.

Return type:

Dict[str, SignalProcessor]

Returns:

segments (dict of SignalProcessor) – Processed segments keyed by 'segment0', 'segment1', etc.