API Reference
Data loading is handled by the mojito package. MojitoProcessor builds on it to provide utilities for loading, processing, and writing LISA Mojito L1 data.
I/O
- MojitoProcessor.io.read.load_file(paths, *, load_days=None)
Load a raw Mojito L1 HDF5 file.
Uses the mojito package to open the file and extracts TDI observables, light travel times, spacecraft orbits, noise estimates, and metadata into a flat dictionary suitable for process_pipeline().
- Parameters:
paths (str, Path, or list thereof) – Path(s) to the Mojito L1 .h5 file(s).
load_days (float, optional) – Number of days to load from the start of the file (lazy slicing). None loads the full dataset.
- Return type:
dict
- Returns:
data (dict) – Dictionary containing:
  - tdis: dict of TDI channel arrays (X, Y, Z, A, E, T)
  - fs, dt, t_tdi: TDI sampling parameters and timestamps
  - ltts, ltt_derivatives, ltt_times: light travel times
  - orbits, velocities, orbit_times: spacecraft kinematics
  - noise_estimates: frequency-domain noise covariance cubes
  - metadata: laser frequency and pipeline names
- MojitoProcessor.io.read.load_processed(path, *, segment_ids=None)
Load processed segments from an HDF5 file written by write().
Reconstructs each SignalProcessor from the stored channel arrays and metadata attributes. Per-segment orbit and LTT data (written under /raw/<segment_name>/) are returned alongside the segments in a raw data dict.
- Parameters:
path (str or Path) – Path to a .h5 file previously written by write().
segment_ids (list of int, optional) – Indices of segments to load, e.g. [100, 101, 102] loads only segment100, segment101, and segment102. None (default) loads all segments.
- Return type:
tuple[Dict[str, SignalProcessor], dict]
- Returns:
segments (dict of SignalProcessor) – Dictionary mapping segment names to reconstructed SignalProcessor objects.
raw_data (dict) – Auxiliary data dict. Top-level keys:
  - 'orbits': per-spacecraft position/velocity arrays (full span)
  - 'noise_estimates': dict of noise covariance arrays (if present)
  - 'metadata': dict with laser_frequency / pipeline_names (if present)
  - '<seg_name>_ltts': one entry per segment containing LTT arrays: ltts, ltt_derivatives (if present), ltt_times
- Raises:
ValueError – If the file does not contain a /processed group, indicating it was not written by write().
Examples
>>> from MojitoProcessor import write, load_processed
>>> write("processed.h5", segments, raw_data=data)
>>> segments, raw = load_processed("processed.h5")
>>> sp = segments["segment0"]
>>> orbit_positions = raw["segment0"]["orbits"]
- MojitoProcessor.io.write.write(output_path, segments, raw_data=None, *, segment_ids=None, filter_kwargs=None, downsample_kwargs=None, trim_kwargs=None, truncate_kwargs=None, window_kwargs=None)
Write processed segments and raw auxiliary data to an HDF5 file.
- Return type:
None
- Parameters:
output_path (str | Path)
segments (Dict[str, SignalProcessor])
raw_data (dict | None)
segment_ids (list | None)
filter_kwargs (dict | None)
downsample_kwargs (dict | None)
trim_kwargs (dict | None)
truncate_kwargs (dict | None)
window_kwargs (dict | None)
File layout
/processed/
  <segment_name>/        one group per segment
    <channel>            processed time-domain array (e.g. X, Y, Z)
    t                    time array in seconds (TCB)
    attrs: fs, dt, N, T, t0, channels
/pipeline_params/
  filter/                attrs: highpass_cutoff, lowpass_cutoff, order, …
  downsample/            attrs: target_fs, kaiser_window
  trim/                  attrs: fraction
  truncate/              attrs: days
  window/                attrs: window, alpha
/raw/                    only written when raw_data is provided
  orbits/                full-span orbit arrays
    sc_position_1        spacecraft 1 positions (n_orbit, 3) [m]
    sc_position_2
    sc_position_3
    sc_velocity_1        spacecraft 1 velocities (n_orbit, 3) [m/s]
    sc_velocity_2
    sc_velocity_3
    times                orbit sample timestamps (TCB) [s]
  noise_estimates/
    xyz                  noise covariance cube (freq, ch, ch)
    aet                  noise covariance cube (freq, ch, ch)
  metadata/
    attrs: laser_frequency, pipeline_names
  <segment_name>/        one group per segment
    ltts/
      <link>             light travel times sliced to segment window
      derivatives/
        <link>           LTT time-derivatives
      times              LTT sample timestamps for this segment
LTT data are sliced to each segment's time window using the segment's t0 and time array. Segments without a valid t0 (stored as NaN) are skipped for per-segment raw data.
- Parameters:
output_path (str or Path) – Destination file path. Created (or overwritten) by this function.
segments (dict of SignalProcessor) – Output of process_pipeline().
raw_data (dict, optional) – Raw data dict returned by load_file(). When provided, noise estimates and metadata are written under /raw/, and orbit/LTT data are written per-segment under /raw/<segment_name>/.
segment_ids (list of int, optional) – Indices of segments to write, e.g. [0, 17] writes segment0 and segment17. None (default) writes all segments.
filter_kwargs, downsample_kwargs, trim_kwargs, truncate_kwargs, window_kwargs (dict, optional) – Pipeline parameter dicts, stored verbatim under /pipeline_params/.
Signal Processing
- class MojitoProcessor.process.sigprocess.SignalProcessor(data, fs, t0=None)
Bases: object
Signal processor for multi-channel time series data.
Handles filtering, downsampling, trimming, and windowing while automatically tracking sampling parameters (fs, N, T, dt).
- Parameters:
data (dict) – Dictionary of channel data, e.g., {'X': array, 'Y': array, 'Z': array}
fs (float) – Sampling frequency in Hz
t0 (float | None) – Start time of the time array in seconds. None starts the array at 0.
- data
Current processed data (updated after each operation). Includes a 't' key giving the time array [t0, t0+dt, ..., t0+(N-1)*dt] in seconds (or [0, dt, ..., (N-1)*dt] when t0 is None).
- Type:
dict
- fs
Current sampling frequency in Hz
- Type:
float
- N
Current number of samples per channel
- Type:
int
- T
Current duration in seconds
- Type:
float
- dt
Current sampling period in seconds
- Type:
float
- t
Time array [t0, t0+dt, ..., t0+(N-1)*dt] in seconds.
- Type:
ndarray
- channels
List of channel names
- Type:
list
Example
>>> sp = SignalProcessor({'X': x_data, 'Y': y_data}, fs=4.0)
>>> filtered = sp.filter(low=1e-4, high=1.0, order=6)
>>> trimmed = sp.trim(fraction=0.02)  # Trim 2% total (1% each end)
>>> windowed = sp.apply_window(window='tukey', alpha=0.05)
>>> t = sp.data['t']  # time array [t0, t0+dt, ..., t0+(N-1)*dt]
- property data: dict
Channel data as a dict, including a 't' key for the time array.
The time array is t0 + np.arange(N) * dt (seconds). When t0 is None, the array starts from 0.
- Returns:
dict – All channel arrays plus 't'.
- property t: ndarray
Time array [t0, t0+dt, ..., t0+(N-1)*dt] in seconds.
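The time-array convention described above can be reproduced outside the class. The following is a minimal sketch using only NumPy; the helper name time_array is hypothetical and is not part of MojitoProcessor.

```python
import numpy as np


def time_array(n_samples, fs, t0=None):
    """Return sample times in seconds (hypothetical helper, not library API)."""
    dt = 1.0 / fs                       # sampling period in seconds
    start = 0.0 if t0 is None else t0   # t0=None means the array starts at 0
    return start + np.arange(n_samples) * dt


t_rel = time_array(5, fs=4.0)             # starts at 0 s, step 0.25 s
t_abs = time_array(5, fs=4.0, t0=100.0)   # same spacing, offset by t0
```

Both arrays have N entries and spacing dt = 1/fs; only the starting value differs.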
- filter(*, low=None, high=None, order=2, filter_type='butterworth', zero_phase=True)
Apply filter to all channels (auto-detects highpass/lowpass/bandpass).
Automatically determines filter type based on provided cutoff frequencies:
- Only low set: highpass filter
- Only high set: lowpass filter
- Both low and high set: bandpass filter
- Parameters:
low (float, optional) – Lower cutoff frequency in Hz (highpass)
high (float, optional) – Upper cutoff frequency in Hz (lowpass)
order (int, optional) – Filter order (default: 2)
filter_type (str, optional) – Filter type: 'butterworth', 'chebyshev1', 'chebyshev2', 'bessel' (default: 'butterworth')
zero_phase (bool, optional) – Use zero-phase filtering (filtfilt) if True, else single-pass (default: True)
- Return type:
Dict[str, ndarray]
- Returns:
filtered_data (dict) – Dictionary of filtered channel data
- Raises:
ValueError – If neither low nor high is provided
Examples
>>> # Highpass only
>>> sp.filter(low=5e-6, order=2)
>>> # Lowpass only
>>> sp.filter(high=0.1, order=2)
>>> # Bandpass
>>> sp.filter(low=1e-4, high=0.1, order=6)
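The auto-detection rule for filter() can be written as a small selection function. This is a sketch of the documented behaviour, not the library's code; select_filter_mode is a hypothetical name, and the returned strings are chosen to match the btype values accepted by scipy.signal.butter.

```python
def select_filter_mode(low=None, high=None):
    """Map cutoff arguments to a (btype, cutoff) pair (hypothetical helper)."""
    if low is None and high is None:
        # Mirrors the documented ValueError when no cutoff is given
        raise ValueError("Provide at least one of 'low' or 'high'")
    if low is not None and high is not None:
        return "bandpass", (low, high)   # both cutoffs set
    if low is not None:
        return "highpass", low           # only the lower cutoff set
    return "lowpass", high               # only the upper cutoff set


mode, cutoff = select_filter_mode(low=1e-4, high=0.1)
```

Note that low is the highpass corner and high is the lowpass corner, so setting only low removes content below that frequency.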
- downsample(target_fs, window=('kaiser', 31.0), padtype='line')
Resample all channels to a target sampling rate using polyphase filtering.
Uses scipy.signal.resample_poly, which applies a zero-phase FIR anti-aliasing filter via polyphase decomposition. Accepts arbitrary rational target rates (e.g., 4 Hz -> 3 Hz), unlike decimate, which requires an integer factor.
- Parameters:
target_fs (float) – Desired output sampling frequency in Hz. Must be positive and less than or equal to the current sampling frequency (this method is a downsampler).
window (tuple or array_like, optional) – Window specification passed to scipy.signal.resample_poly for FIR anti-aliasing filter design. The signature default is ('kaiser', 31.0); scipy's own default is ('kaiser', 5.0). A larger Kaiser beta gives stronger stopband attenuation at the cost of a wider transition band.
padtype (str, optional) – Edge-padding strategy. Options: 'line' (default), 'constant', 'mean', 'median', 'maximum', 'minimum'. 'line' extends the signal linearly from each end, reducing edge transients for slowly-varying data such as LISA TDI channels.
- Return type:
Tuple[Dict[str, ndarray], float]
- Returns:
resampled_data (dict) – Dictionary mapping channel names to resampled 1D arrays.
new_fs (float) – Actual output sampling frequency in Hz (exact rational result self.fs * up / down).
- Raises:
ValueError – If target_fs is not positive.
ValueError – If target_fs exceeds the current sampling frequency.
ValueError – If the rational approximation of target_fs / self.fs produces up == 0.
Notes
The up/down integers are computed via:
ratio = Fraction(target_fs / self.fs).limit_denominator(10000)
up, down = ratio.numerator, ratio.denominator
Common use cases from 4 Hz source data:
4 Hz -> 1 Hz: up=1, down=4
4 Hz -> 0.4 Hz: up=1, down=10
4 Hz -> 2 Hz: up=1, down=2
4 Hz -> 3 Hz: up=3, down=4
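The up/down computation in these notes can be checked with the standard library alone. The sketch below wraps the two lines quoted above in a hypothetical helper, resample_factors, and evaluates it for the four listed cases; it does not call MojitoProcessor or scipy.

```python
from fractions import Fraction


def resample_factors(fs, target_fs, max_denominator=10000):
    """Return (up, down, new_fs) for a rational resampling ratio."""
    ratio = Fraction(target_fs / fs).limit_denominator(max_denominator)
    up, down = ratio.numerator, ratio.denominator
    new_fs = fs * up / down   # actual output rate implied by the ratio
    return up, down, new_fs


for target in (1.0, 0.4, 2.0, 3.0):
    up, down, new_fs = resample_factors(4.0, target)
    print(f"4 Hz -> {target} Hz: up={up}, down={down}, new_fs={new_fs}")
```

limit_denominator matters because target_fs / fs is a binary float: without it, a ratio such as 0.1 would turn into a fraction with a very large denominator instead of 1/10.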
Examples
>>> sp = SignalProcessor({'X': x_data, 'Y': y_data}, fs=4.0)
>>> sp.filter(low=5e-6, order=2)
>>> sp.trim(fraction=0.022)  # Trim 2.2% total
>>> resampled, new_fs = sp.downsample(target_fs=1.0)
>>> print(new_fs)  # 1.0
- trim(fraction)
Trim data by removing a total fraction of the dataset, split evenly between both ends.
- Parameters:
fraction (float) – Total fraction of data to remove (e.g., 0.01 = 1%).
- Return type:
Dict[str, ndarray]
- Returns:
trimmed_data (dict) – Dictionary of trimmed channel data
- Raises:
ValueError – If fraction is not in range [0, 1] or would remove all data
Examples
>>> # Trim 2% total (1% from each end)
>>> sp.trim(fraction=0.02)
>>> # Trim 5% total (2.5% from each end)
>>> sp.trim(fraction=0.05)
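To make the "total fraction" convention concrete, the sketch below computes the slice bounds a symmetric trim would use. It assumes the removed samples are split evenly between the two ends, as the example comments indicate; the rounding rule and the helper name trim_bounds are assumptions, not taken from the library.

```python
def trim_bounds(n_samples, fraction):
    """Return (start, stop) indices kept by a symmetric trim (hypothetical helper)."""
    if not 0.0 <= fraction <= 1.0:
        raise ValueError("fraction must lie in [0, 1]")
    n_each = int(n_samples * fraction / 2)   # samples dropped at each end (assumed rounding)
    start, stop = n_each, n_samples - n_each
    if stop <= start:
        raise ValueError("fraction would remove all data")
    return start, stop


start, stop = trim_bounds(1000, 0.02)   # 2% total: 10 samples from each end
```

A channel array x would then be trimmed as x[start:stop].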
- apply_window(window='tukey', **window_params)
Apply window function to all channels.
- Parameters:
window (str, optional) – Window type: 'tukey', 'blackmanharris', 'hann', 'hamming', 'blackman', 'planck' (default: 'tukey')
**window_params – Additional parameters for the window function. alpha (float, default 0.05) is accepted by 'tukey' and 'planck'. Other windows ignore extra keyword arguments (a warning is emitted).
- Return type:
Dict[str, ndarray]
- Returns:
windowed_data (dict) – Dictionary of windowed channel data
Examples
>>> sp.apply_window('tukey', alpha=0.05)
>>> sp.apply_window('blackmanharris')
>>> sp.apply_window('hann')
- periodogram()
Compute the one-sided power spectral density for each channel.
The data is assumed to have already been windowed (e.g. by apply_window()), so no additional window is applied here.
Normalisation follows Parseval's theorem: the integral of the one-sided PSD over positive frequencies equals the mean square of the signal.
- Return type:
Tuple[ndarray, Dict[str, ndarray]]
- Returns:
freqs (ndarray) – Frequency array in Hz, shape (N//2 + 1,).
psds (dict) – Dictionary mapping channel names to one-sided PSD arrays (units²/Hz), each with the same shape as freqs.
Examples
>>> freqs, psds = sp.periodogram()
>>> plt.loglog(freqs[1:], psds['X'][1:])
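The Parseval normalisation stated above can be verified numerically. The following sketch is one way to build a one-sided PSD with that property from numpy.fft.rfft; it is an assumed implementation consistent with the description, not necessarily the one used by periodogram(), and one_sided_psd is a hypothetical name.

```python
import numpy as np


def one_sided_psd(x, fs):
    """One-sided PSD whose integral equals the mean square of x."""
    n = x.size
    spectrum = np.fft.rfft(x)
    psd = np.abs(spectrum) ** 2 / (fs * n)   # two-sided density on the rfft bins
    # Fold negative frequencies in: double every bin except DC
    # (and except the Nyquist bin, which exists only for even n).
    if n % 2 == 0:
        psd[1:-1] *= 2.0
    else:
        psd[1:] *= 2.0
    freqs = np.fft.rfftfreq(n, d=1.0 / fs)
    return freqs, psd


rng = np.random.default_rng(0)
x = rng.standard_normal(1024)
freqs, psd = one_sided_psd(x, fs=4.0)
df = 4.0 / x.size                    # frequency resolution in Hz
integral = np.sum(psd) * df          # should match the mean square
mean_square = np.mean(x ** 2)
```

Here integral and mean_square agree to floating-point precision, which is the property the docstring describes.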
- fft()
Compute the one-sided complex FFT spectrum for each channel.
The data is assumed to have already been windowed (e.g. by apply_window()), so no additional window is applied here. Returns raw complex amplitudes from numpy.fft.rfft.
- Return type:
Tuple[ndarray, Dict[str, ndarray]]
- Returns:
freqs (ndarray) – Frequency array in Hz, shape (N//2 + 1,).
ffts (dict) – Dictionary mapping channel names to complex FFT arrays, each with shape (N//2 + 1,).
Examples
>>> freqs, ffts = sp.fft()
>>> plt.loglog(freqs[1:], np.abs(ffts['X'][1:]))
- to_aet()
Transform XYZ Michelson channels to noise-orthogonal AET channels.
Uses the standard equal-arm combination:
A = (Z - X) / sqrt(2)
E = (X - 2Y + Z) / sqrt(6)
T = (X + Y + Z) / sqrt(3)
Returns a new SignalProcessor with channels ['A', 'E', 'T'], inheriting fs, t0, and all derived parameters from the original.
- Return type:
SignalProcessor
- Returns:
SignalProcessor – New processor with AET channel data.
- Raises:
ValueError – If any of the channels 'X', 'Y', 'Z' are missing.
Examples
>>> sp_xyz = processed_segments['segment0']
>>> sp_aet = sp_xyz.to_aet()
>>> freqs, psds = sp_aet.periodogram()
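The equal-arm combination above is a fixed linear map, so it is easy to reproduce on plain arrays. The sketch below applies the three formulas with NumPy; xyz_to_aet is a hypothetical standalone function, not the to_aet() method itself.

```python
import numpy as np


def xyz_to_aet(x, y, z):
    """Apply the equal-arm XYZ -> AET combination to three arrays."""
    a = (z - x) / np.sqrt(2.0)
    e = (x - 2.0 * y + z) / np.sqrt(6.0)
    t = (x + y + z) / np.sqrt(3.0)
    return a, e, t


rng = np.random.default_rng(1)
x, y, z = rng.standard_normal((3, 256))
a, e, t = xyz_to_aet(x, y, z)

# A signal common to all three channels ends up entirely in T.
a_c, e_c, t_c = xyz_to_aet(x, x, x)
```

The three coefficient rows are orthonormal, so the sample-wise sum of squares is preserved: a**2 + e**2 + t**2 equals x**2 + y**2 + z**2.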
- MojitoProcessor.process.sigprocess.process_pipeline(data, channels=None, *, filter_kwargs=None, downsample_kwargs=None, trim_kwargs=None, truncate_kwargs=None, window_kwargs=None)
Run the full TDI data processing pipeline on a loaded Mojito L1 data dict.
Applies the following steps in order:
1. Filter: band-pass (if lowpass_cutoff given) or high-pass only
2. Downsample: polyphase resampling to target_fs (optional)
3. Trim: removes edge artefacts introduced by the filter from both ends
4. Truncate: splits the processed data into non-overlapping segments of truncate_kwargs['days'] days each
5. Window: tapers edges to reduce spectral leakage
Pipeline progress is emitted at logging.INFO level via the MojitoUtils.SigProcessing logger.
- Parameters:
data (dict) – Loaded LISA L1 data dict (from load_file()). Must contain data['tdis'] (channel arrays) and data['fs'] (sampling rate in Hz).
channels (list of str, optional) – TDI channels to process. Default ['X', 'Y', 'Z'].
filter_kwargs (dict, optional) – Filter parameters. Keys:
  - highpass_cutoff (float): High-pass cutoff in Hz (default: 5e-6)
  - lowpass_cutoff (float, optional): Low-pass cutoff for band-pass
  - order (int): Filter order (default: 2)
  - filter_type (str): Filter type (default: 'butterworth')
downsample_kwargs (dict, optional) – Downsampling parameters. Keys:
  - target_fs (float): Target sampling rate in Hz
  - kaiser_window (float): Kaiser window beta parameter (default: 31.0)
trim_kwargs (dict, optional) – Trimming parameters. Omit (or pass None) to skip trimming. Keys:
  - fraction (float): Fraction to trim from each end (default: 0.0)
truncate_kwargs (dict, optional) – Segmentation parameters. Set to None to disable segmentation (returns a single segment with the full dataset). Keys:
  - days (float): Segment length in days (default: 4.0). The dataset is split into non-overlapping segments of this length, and each segment is independently windowed. Remainder samples shorter than a full segment are discarded.
window_kwargs (dict, optional) – Windowing parameters. Omit (or pass None) to skip windowing. Keys:
  - window (str): Window type, e.g. 'tukey', 'hann' (default: 'tukey')
  - alpha (float): Taper fraction for Tukey window (default: 0.025)
- Return type:
dict
- Returns:
segments (dict of SignalProcessor) – Dictionary mapping segment names ('segment0', 'segment1', …) to SignalProcessor objects. Each segment contains windowed data ready for FFT analysis. Access via segments['segment0'].data, segments['segment0'].fs, etc.
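The segmentation rule described under truncate_kwargs (non-overlapping segments of a fixed number of days, remainder discarded) can be sketched with index arithmetic. The helper segment_bounds and its rounding of the segment length are assumptions for illustration; only the 'segmentN' naming is taken from the documentation above.

```python
def segment_bounds(n_samples, fs, days):
    """Return {'segmentN': (start, stop)} index ranges (hypothetical helper)."""
    seg_len = int(round(days * 86400.0 * fs))   # samples per segment (assumed rounding)
    if seg_len <= 0:
        raise ValueError("segment length must be at least one sample")
    n_segments = n_samples // seg_len           # incomplete trailing segment is dropped
    return {
        f"segment{i}": (i * seg_len, (i + 1) * seg_len)
        for i in range(n_segments)
    }


# 10 days of data at 0.1 Hz, split into 4-day segments
bounds = segment_bounds(n_samples=86400, fs=0.1, days=4.0)
```

In this case two full segments fit, and the trailing 2 days of samples are not assigned to any segment.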
Pipelines
- MojitoProcessor.pipelines.read_and_process.read_and_process(path, channels=None, *, load_days=None, filter_kwargs=None, downsample_kwargs=None, trim_kwargs=None, truncate_kwargs=None, window_kwargs=None, output_path=None, segment_ids=None)
Load a Mojito L1 file and run the full processing pipeline in one call.
This is a thin wrapper around load_file() and process_pipeline(). When output_path is provided, the result is also written to an HDF5 file via write().
- Parameters:
path (str or Path) – Path to the Mojito L1 .h5 input file.
channels (list of str, optional) – TDI channels to process. Default ['X', 'Y', 'Z'].
load_days (float, optional) – Number of days to load from the file (lazy slicing). None loads the full dataset.
filter_kwargs (dict, optional) – Passed to process_pipeline().
downsample_kwargs (dict, optional) – Passed to process_pipeline().
trim_kwargs (dict, optional) – Passed to process_pipeline().
truncate_kwargs (dict, optional) – Passed to process_pipeline().
window_kwargs (dict, optional) – Passed to process_pipeline().
output_path (str or Path, optional) – If given, write processed segments and raw auxiliary data to this HDF5 file.
segment_ids (list of int, optional) – Indices of segments to write to output_path, e.g. [0, 3]. None (default) writes all segments. Ignored when output_path is None.
- Return type:
Dict[str, SignalProcessor]
- Returns:
segments (dict of SignalProcessor) – Processed segments keyed by 'segment0', 'segment1', etc.