API Reference

This page contains the API reference for all public classes and functions in the lisa-gap package.

Core Classes

GapMaskGenerator

class lisagap.GapMaskGenerator(sim_t: ndarray[tuple[Any, ...], dtype[floating[Any]]], gap_definitions: dict[str, dict[str, dict[str, Any]]], treat_as_nan: bool = True, planseed: int = 11071993, unplanseed: int = 16121997)[source]

Bases: object

A class to generate and manage gap masks for time series data. Original code developed by Eleonora Castelli (NASA Goddard) and adapted by Ollie Burke (Glasgow).

__init__(sim_t: ndarray[tuple[Any, ...], dtype[floating[Any]]], gap_definitions: dict[str, dict[str, dict[str, Any]]], treat_as_nan: bool = True, planseed: int = 11071993, unplanseed: int = 16121997)[source]

Parameters:

sim_t (np.ndarray) – Array of simulation time values.
gap_definitions (dict) – Dictionary defining planned and unplanned gaps. Each gap entry must include:
- ‘rate_per_year’ (float)
- ‘duration_hr’ (float)
treat_as_nan (bool, optional) – If True, gaps are inserted as NaNs. If False, they are inserted as zeros.
planseed (int) – Seed for planned gap randomization.
unplanseed (int) – Seed for unplanned gap randomization.
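
As a rough illustration of the gap_definitions layout described above, the sketch below builds such a dictionary and checks that every entry carries the two required keys. The top-level keys "planned" and "unplanned" mirror the structure documented for taper definitions on this page. The gap names, the numeric values and the check_gap_definitions helper are hypothetical and are not part of lisa-gap.

```python
REQUIRED_KEYS = ("rate_per_year", "duration_hr")

# Hypothetical gap names and values, chosen for illustration only.
gap_definitions = {
    "planned": {
        "antenna_repointing": {"rate_per_year": 26, "duration_hr": 3.3},
    },
    "unplanned": {
        "platform_safe_mode": {"rate_per_year": 3, "duration_hr": 60.0},
    },
}


def check_gap_definitions(definitions):
    """Hypothetical helper: list (kind, name, missing_key) for incomplete entries."""
    problems = []
    for kind, gaps in definitions.items():
        for name, params in gaps.items():
            for key in REQUIRED_KEYS:
                if key not in params:
                    problems.append((kind, name, key))
    return problems


problems = check_gap_definitions(gap_definitions)  # empty list when all entries are complete
```

Running a check like this before constructing the generator makes a missing key visible early, rather than partway through mask generation.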

build_quality_flags(data_array: ndarray[tuple[Any, ...], dtype[float64]], save_to_file: str | None = None) → ndarray[tuple[Any, ...], dtype[int64]][source]

Build an array of integer quality flags based on the gap definitions and the provided data array.

Parameters:

data_array (np.ndarray) – The data array to be masked.
save_to_file (str, optional) – If provided, saves the quality flags to an HDF5 file at the specified path.

Returns:

An integer array of quality flags for the data array.

Return type:

np.ndarray

construct_planned_gap_mask(rate: float, gap_duration: float, seed: int | None = None) → ndarray[tuple[Any, ...], dtype[float64 | int64]][source]

Construct a planned gap mask with regular spacing and jitter.

Parameters:

rate (float) – Gap rate (in events/s).
gap_duration (float) – Gap duration (in samples).
seed (int or None) – Random seed.

Returns:

Array with gaps represented as NaNs or zeros, but valid data always as integer 1.

Return type:

np.ndarray
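
To make "regular spacing and jitter" concrete, here is a minimal NumPy sketch. It is not the library's implementation. The function name, the uniform jitter model and the jitter_fraction parameter are assumptions made for illustration. The units follow the parameter list above (rate in events/s, duration in samples).

```python
import numpy as np


def sketch_planned_mask(n_samples, dt, rate_per_s, gap_samples,
                        jitter_fraction=0.1, seed=None):
    """Illustrative only: evenly spaced gaps whose start times are jittered."""
    rng = np.random.default_rng(seed)
    mask = np.ones(n_samples)
    spacing_s = 1.0 / rate_per_s              # nominal time between gap starts
    duration_s = n_samples * dt
    nominal_starts = np.arange(spacing_s, duration_s, spacing_s)
    # Assumed jitter model: uniform offset of up to +/- jitter_fraction of the spacing.
    jitter = rng.uniform(-jitter_fraction, jitter_fraction,
                         size=nominal_starts.size) * spacing_s
    for start_s in nominal_starts + jitter:
        i0 = max(int(round(start_s / dt)), 0)
        mask[i0:i0 + gap_samples] = np.nan    # slicing clips at the end of the array
    return mask


mask = sketch_planned_mask(n_samples=10_000, dt=1.0, rate_per_s=1e-3,
                           gap_samples=50, seed=11071993)
```

With a fixed seed the same jittered gap positions are reproduced on every call, which is the role the seed parameter plays in the signature above.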

construct_unplanned_gap_mask(rate: float, gap_duration: float, seed: int | None = None) → ndarray[tuple[Any, ...], dtype[float64 | int64]][source]

Construct an unplanned gap mask using an exponential distribution.

Parameters:

rate (float) – Gap rate (in events/s).
gap_duration (float) – Gap duration (in seconds).
seed (int or None) – Random seed.

Returns:

Array with gaps represented as NaNs or zeros, but valid data always as integer 1.

Return type:

np.ndarray
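
The sketch below shows one way gaps could be placed with exponentially distributed waiting times, which corresponds to a Poisson process for the gap starts. It is a sketch under stated assumptions, not the library's code. The function name is hypothetical, and measuring the waiting time from the end of the previous gap is an assumption.

```python
import numpy as np


def sketch_unplanned_mask(n_samples, dt, rate_per_s, gap_duration_s, seed=None):
    """Illustrative only: gap starts separated by exponential waiting times."""
    rng = np.random.default_rng(seed)
    mask = np.ones(n_samples)
    gap_samples = int(round(gap_duration_s / dt))
    t_end = n_samples * dt
    t = rng.exponential(scale=1.0 / rate_per_s)   # waiting time to the first gap
    while t < t_end:
        i0 = int(t / dt)
        mask[i0:i0 + gap_samples] = np.nan        # slicing clips at the end of the array
        # Assumption: the next waiting time is measured from the end of this gap.
        t += gap_duration_s + rng.exponential(scale=1.0 / rate_per_s)
    return mask


mask = sketch_unplanned_mask(n_samples=100_000, dt=1.0, rate_per_s=1e-3,
                             gap_duration_s=20.0, seed=16121997)
```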

classmethod from_hdf5(filename: str) → GapMaskGenerator[source]

Reconstruct a GapMaskGenerator object from an HDF5 file. Because this is a class method, it can be called without instantiating the class first. It reads the gap mask, simulation time, and metadata from the file, and returns a new instance of GapMaskGenerator.

Parameters:

filename (str) – Path to the HDF5 file.

Returns:

A new instance reconstructed from the file.

Return type:

GapMaskGenerator

generate_mask(include_planned: bool = True, include_unplanned: bool = True) → ndarray[tuple[Any, ...], dtype[float64 | int64]][source]

Combine planned and unplanned masks into a final mask.

Parameters:

include_planned (bool) – Include planned gaps.
include_unplanned (bool) – Include unplanned gaps.

Returns:

Final gap mask with valid data as integer 1, gaps as 0 or NaN.

Return type:

np.ndarray
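
One simple way to combine two masks that use 1 for valid data is an element-wise product, sketched below with NumPy. Whether generate_mask() combines masks this way internally is an assumption. The arrays are made-up examples.

```python
import numpy as np

# Made-up example masks: 1 = valid, NaN = gap.
planned = np.array([1.0, np.nan, 1.0, 1.0, 1.0])
unplanned = np.array([1.0, 1.0, 1.0, np.nan, 1.0])

# Element-wise product: a NaN (or 0) in either input marks a gap in the result.
combined = planned * unplanned
```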

static load_mask_from_hdf5(filename: str, convert_to_mask: bool = True) → tuple[ndarray, dict[str, Any]][source]

Load just the mask data from an HDF5 file, with optional conversion.

Parameters:

filename (str) – Path to the HDF5 file.
convert_to_mask (bool, optional) – If True and the file contains quality flags, convert them back to a mask format. If False, return the data in its stored format.

Returns:

The mask/quality flag data and a metadata dictionary containing:
- ‘treat_as_nan’: boolean indicating original mask type
- ‘saved_as_quality_flags’: boolean indicating storage format
- ‘data_type’: string describing what was returned

Return type:

tuple[np.ndarray, dict]

Examples

>>> # Load quality flags as-is
>>> flags, meta = GapMaskGenerator.load_mask_from_hdf5("data.h5", convert_to_mask=False)
>>> # Load and convert quality flags to mask format
>>> mask, meta = GapMaskGenerator.load_mask_from_hdf5("data.h5", convert_to_mask=True)

static quality_flags_to_mask(quality_flags: ndarray[tuple[Any, ...], dtype[int64]]) → ndarray[tuple[Any, ...], dtype[float64]][source]

Convert integer quality flags to a float mask suitable for data multiplication.

This utility function converts from the compact storage format (integers) to the working format (floats with NaNs) used for masking data arrays.

Parameters:

quality_flags (NDArray[np.int_]) – Integer array where:
- 0 = valid data
- 1 = corrupt/gap data

Returns:

Float mask array where:
- 1.0 = valid data (multiply by 1.0 = unchanged)
- NaN = gap data (multiply by NaN = NaN)

Return type:

NDArray[np.float64]

Examples

>>> flags = np.array([0, 1, 0, 1, 0], dtype=int)
>>> mask = GapMaskGenerator.quality_flags_to_mask(flags)
>>> print(mask)
[ 1. nan  1. nan  1.]
>>> data = np.array([10.0, 20.0, 30.0, 40.0, 50.0])
>>> masked_data = data * mask
>>> print(masked_data)
[10. nan 30. nan 50.]

save_to_hdf5(mask: ndarray, filename: str = 'gap_mask_data.h5', save_as_quality_flags: bool = False) → None[source]

Save the gap mask and associated simulation metadata to an HDF5 file.

Parameters:

mask (np.ndarray) – The gap mask array, typically generated using generate_mask(). Should be of the same length as sim_t, and contain either 1s and 0s, or 1s and NaNs depending on the treat_as_nan setting.
filename (str, optional) – Path to the HDF5 file to create. Defaults to “gap_mask_data.h5”.
save_as_quality_flags (bool, optional) – If True, converts the mask to quality flags before saving:
- 1 = gap/corrupt data (where the original mask has NaN or 0)
- 0 = valid data (where the original mask has 1.0 or 1)
If False (default), saves the mask in its original format.

Notes

This function stores:

The mask data under “gap_mask” (original format) or “quality_flags” (if save_as_quality_flags=True)
Metadata attributes:
- “dt” (time step)
- “treat_as_nan” (boolean mask type flag)
- “saved_as_quality_flags” (boolean indicating storage format)
Gap configuration details in two groups:
- “planned_gaps”: each with rate_events_per_year and duration_hours
- “unplanned_gaps”: same structure as planned gaps

The resulting file can be reloaded using the from_hdf5() class method.
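
The quality-flag convention described for save_as_quality_flags (1 where the mask is NaN or 0, 0 where it is valid) can be sketched in a few lines of NumPy. The helper name mask_to_quality_flags is hypothetical. It only illustrates the mapping and is not taken from the library.

```python
import numpy as np


def mask_to_quality_flags(mask):
    """Hypothetical helper: 1 where the mask is NaN or 0, 0 where it is valid."""
    mask = np.asarray(mask, dtype=float)
    is_gap = np.isnan(mask) | (mask == 0)
    return is_gap.astype(np.int64)


nan_mask = np.array([1.0, np.nan, 1.0, np.nan, 1.0])   # treat_as_nan=True style
zero_mask = np.array([1, 0, 1, 0, 1])                  # treat_as_nan=False style

flags_from_nan = mask_to_quality_flags(nan_mask)
flags_from_zero = mask_to_quality_flags(zero_mask)
```

Both mask styles map to the same integer flags, which is why the treat_as_nan attribute has to be stored alongside them to recover the original format.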

summary(mask: ndarray[tuple[Any, ...], dtype[float64]] | ndarray[tuple[Any, ...], dtype[int64]] | None = None, export_json_path: str | Path | None = None) → dict[str, Any][source]

Return a structured dictionary summarising the gap configuration and optionally the content of a specific mask.

Parameters:

mask (np.ndarray, optional) – If provided, calculates duty cycle and number of gaps based on this mask.
export_json_path (str or Path, optional) – If provided, writes the summary dictionary to a JSON file at the given path.

Returns:

Summary of configuration and optionally mask content.

Return type:

dict
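
For orientation, the duty cycle and gap count mentioned above can be computed from a mask as follows. This is a minimal sketch. The helper name and the exact definitions (duty cycle as the fraction of valid samples, a gap as a contiguous run of NaN or 0) are assumptions, and the keys of the dictionary that summary() returns are not reproduced here.

```python
import numpy as np


def duty_cycle_and_gap_count(mask):
    """Hypothetical helper: fraction of valid samples and number of contiguous gaps."""
    valid = ~(np.isnan(mask) | (mask == 0))
    duty_cycle = float(valid.mean())
    # A gap starts wherever a valid sample is followed by an invalid one,
    # plus one more if the series begins inside a gap.
    n_gaps = int(np.count_nonzero(valid[:-1] & ~valid[1:])) + int(not valid[0])
    return duty_cycle, n_gaps


example = np.ones(10)
example[2:4] = np.nan   # first gap, two samples
example[7] = np.nan     # second gap, one sample
duty_cycle, n_gaps = duty_cycle_and_gap_count(example)
```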

GapWindowGenerator

class lisagap.GapWindowGenerator(gap_mask_generator: GapMaskGenerator)[source]

Bases: object

Advanced gap mask generator with smooth tapering capabilities.

This class wraps an existing lisaglitch.GapMaskGenerator instance and adds sophisticated tapering and windowing functionality for smooth transitions around gap edges.

Parameters:

gap_mask_generator (GapMaskGenerator) – An instantiated GapMaskGenerator object from lisaglitch.

Examples

>>> from lisaglitch import GapMaskGenerator
>>> from lisagap import GapWindowGenerator
>>>
>>> # Create the core gap generator
>>> gap_gen = GapMaskGenerator(sim_t=sim_t, gap_definitions=gap_defs)
>>>
>>> # Wrap it with windowing capabilities
>>> window = GapWindowGenerator(gap_gen)
>>>
>>> # Generate masks with optional tapering
>>> mask, _ = window.generate_window(apply_tapering=True, taper_definitions=taper_defs)

__init__(gap_mask_generator: GapMaskGenerator)[source]

Initialize a GapWindowGenerator from an existing GapMaskGenerator.

Parameters:

gap_mask_generator (GapMaskGenerator) – An instantiated GapMaskGenerator object from lisaglitch.

static apply_proportional_tapering(mask_data: ndarray[tuple[Any, ...], dtype[_ScalarT]], dt: float = 1.0, short_taper_fraction: float = 0.25, medium_taper_fraction: float = 0.05, long_taper_fraction: float = 0.05, min_gap_points: int = 5, short_gap_threshold_minutes: float = 10.0, long_gap_threshold_hours: float = 10.0) → ndarray[tuple[Any, ...], dtype[_ScalarT]][source]

Apply proportional tapering to gaps in a mask array, for example one loaded from a .npy file.

This method automatically detects gaps in the input mask and applies Tukey window tapering proportional to the gap duration. Different taper fractions are applied based on gap length categories.

Parameters:

mask_data (np.ndarray) – Input mask data from .npy file. Can contain NaN or 0 for gaps.
dt (float) – Time step in seconds between samples.
short_taper_fraction (float, optional) – Fraction of gap duration to taper on each side for short gaps. Default is 0.25 (25% each side = 50% total taper).
medium_taper_fraction (float, optional) – Fraction of gap duration to taper on each side for medium gaps. Default is 0.05 (5% each side = 10% total taper).
long_taper_fraction (float, optional) – Fraction of gap duration to taper on each side for long gaps. Default is 0.05 (5% each side = 10% total taper).
min_gap_points (int, optional) – Minimum number of consecutive gap points to apply tapering. Gaps shorter than this are left unchanged. Default is 5.
short_gap_threshold_minutes (float, optional) – Threshold in minutes to distinguish short from medium gaps. Default is 10.0 minutes.
long_gap_threshold_hours (float, optional) – Threshold in hours to distinguish medium from long gaps. Default is 10.0 hours.

Returns:

Tapered mask with smooth transitions around gap edges.

Return type:

np.ndarray

Examples

>>> # Load mask from .npy file
>>> mask = np.load('gap_mask.npy')
>>>
>>> # Apply proportional tapering
>>> tapered_mask = GapWindowGenerator.apply_proportional_tapering(
...     mask, dt=1.0
... )
>>>
>>> # Custom tapering for different gap categories
>>> tapered_mask = GapWindowGenerator.apply_proportional_tapering(
...     mask,
...     dt=0.25,
...     short_taper_fraction=0.3,   # 30% each side for short gaps
...     medium_taper_fraction=0.1,  # 10% each side for medium gaps
...     long_taper_fraction=0.02    # 2% each side for long gaps
... )
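
The sketch below illustrates the two steps the description implies: locating gaps in a mask, then choosing a per-side taper length from the gap duration and the category thresholds. It is an illustration under assumptions rather than the library's logic. The helper names are hypothetical, and the boundary handling (strict "less than" at each threshold, rounding to whole samples) is assumed. The default values are copied from the signature above.

```python
import numpy as np


def find_gaps(mask):
    """Return (start, stop) index pairs, stop exclusive, for each run of NaN/0 samples."""
    is_gap = np.isnan(mask) | (mask == 0)
    padded = np.concatenate(([False], is_gap, [False]))
    edges = np.flatnonzero(padded[1:] != padded[:-1])
    return list(zip(edges[0::2].tolist(), edges[1::2].tolist()))


def taper_samples(n_gap, dt, short_frac=0.25, medium_frac=0.05, long_frac=0.05,
                  short_threshold_min=10.0, long_threshold_hr=10.0):
    """Hypothetical helper: per-side taper length (in samples) for a gap of n_gap samples."""
    duration_s = n_gap * dt
    if duration_s < short_threshold_min * 60.0:      # short gap
        frac = short_frac
    elif duration_s < long_threshold_hr * 3600.0:    # medium gap
        frac = medium_frac
    else:                                            # long gap
        frac = long_frac
    return int(round(frac * n_gap))


mask = np.ones(20_000)
mask[100:500] = np.nan      # 400 s at dt = 1 s: short
mask[5000:12200] = 0.0      # 7200 s: medium
gaps = find_gaps(mask)
tapers = [taper_samples(stop - start, dt=1.0) for start, stop in gaps]
```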

apply_smooth_taper_to_mask(mask: ndarray[tuple[Any, ...], dtype[_ScalarT]], taper_gap_definitions: Dict[str, Dict[str, Dict[str, Any]]] | None)[source]

Apply Tukey taper smoothing to an existing gap mask.

This function takes a mask as input and applies smooth tapers at the edges of the gaps using Tukey windows, which helps reduce spectral artifacts in frequency-domain analysis.

Parameters:

mask (np.ndarray) – Original binary or NaN mask (1 = good, 0/NaN = gap).
taper_gap_definitions (dict) – Dictionary containing taper parameters per gap type. Expected structure:

    {
        "planned": {
            "gap_name": {"lobe_lengths_hr": float}
        },
        "unplanned": {
            "gap_name": {"lobe_lengths_hr": float}
        }
    }

Returns:

A smoothed mask with tapering applied around each gap.

Return type:

np.ndarray
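
As a rough picture of what a Tukey-style taper around a gap looks like, the sketch below multiplies the valid samples on either side of one gap by half-cosine ramps, the same shape as the edges of a Tukey window. It is not the library's implementation. The function name is hypothetical, and three choices are assumptions: that the lobes extend into the valid data, that NaN gaps become 0 in the output, and how lobe_lengths_hr converts to samples.

```python
import numpy as np


def taper_around_gap(mask, start, stop, lobe_samples):
    """Illustrative only: ramp the mask down before `start` and back up after `stop`."""
    out = np.where(np.isnan(mask), 0.0, mask).astype(float)   # gaps become 0
    # Half-cosine ramp rising from 0 towards 1 (the edge shape of a Tukey window).
    ramp = 0.5 * (1.0 - np.cos(np.pi * np.arange(lobe_samples) / lobe_samples))
    left = max(start - lobe_samples, 0)
    # Falling ramp before the gap, trimmed if the gap is close to the array start.
    out[left:start] *= ramp[::-1][lobe_samples - (start - left):]
    right = min(stop + lobe_samples, out.size)
    # Rising ramp after the gap, trimmed if the gap is close to the array end.
    out[stop:right] *= ramp[:right - stop]
    return out


dt = 360.0                                          # seconds per sample (made-up value)
lobe_lengths_hr = 1.0
lobe_samples = int(round(lobe_lengths_hr * 3600.0 / dt))

mask = np.ones(100)
mask[40:60] = np.nan
tapered = taper_around_gap(mask, start=40, stop=60, lobe_samples=lobe_samples)
```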

construct_planned_gap_mask(**kwargs)[source]

Generate a mask containing only planned gaps.

construct_unplanned_gap_mask(**kwargs)[source]

Generate a mask containing only unplanned gaps.

generate_window(include_planned: bool = True, include_unplanned: bool = True, apply_tapering: bool = False, taper_definitions: Dict[str, Dict[str, Dict[str, Any]]] | None = None, merge_close_gaps: bool = False, min_freq_resolution_hz: float = 9.259259259259259e-05) → tuple[ndarray[tuple[Any, ...], dtype[_ScalarT]], Dict[str, Any] | None][source]

Generate gap mask with optional tapering and gap merging.

This method combines gap generation from the underlying GapMaskGenerator with optional smooth tapering and gap merging to create production-ready masks.

Parameters:

include_planned (bool, optional) – Include planned gaps in the mask. Default is True.
include_unplanned (bool, optional) – Include unplanned gaps in the mask. Default is True.
apply_tapering (bool, optional) – Whether to apply smooth tapering around gaps. Default is False.
taper_definitions (dict, optional) – Tapering parameters for each gap type. Required if apply_tapering=True. Expected structure:

    {
        "planned": {
            "gap_name": {"lobe_lengths_hr": float}
        },
        "unplanned": {
            "gap_name": {"lobe_lengths_hr": float}
        }
    }
merge_close_gaps (bool, optional) – If True, merge gaps that are close together to avoid short segments with poor frequency resolution. Default is False.
min_freq_resolution_hz (float, optional) – Minimum frequency resolution for data segments. Segments shorter than 1/min_freq_resolution_hz will be merged with adjacent gaps. Default is 1/(3*3600) Hz ≈ 9.26e-5 Hz (3 hour minimum segments).

Returns:

mask (np.ndarray) – Gap mask with 1.0 for good data and 0.0/NaN for gaps. If tapering is applied, values between 0 and 1 indicate the tapering transition regions.
merge_stats (dict or None) – Dictionary containing merge statistics if merge_close_gaps=True, else None. Contains: ‘segments_merged’, ‘original_duty_cycle’, ‘merged_duty_cycle’, ‘additional_data_lost_hr’.

Examples

>>> # Basic mask generation
>>> mask, stats = window.generate_window()
>>>
>>> # With gap merging
>>> mask, stats = window.generate_window(
...     merge_close_gaps=True,
...     min_freq_resolution_hz=1/(2*3600)  # 2 hour minimum segments
... )
>>> print(f"Merged {stats['segments_merged']} segments")
>>>
>>> # With tapering and merging
>>> taper_defs = {
...     "planned": {"maintenance": {"lobe_lengths_hr": 2.0}}
... }
>>> mask, stats = window.generate_window(
...     apply_tapering=True,
...     taper_definitions=taper_defs,
...     merge_close_gaps=True
... )
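
The merging rule described for merge_close_gaps can be illustrated as follows: a valid segment of duration T resolves frequencies no finer than about 1/T, so segments shorter than 1/min_freq_resolution_hz are converted into gaps. The sketch is a standalone NumPy illustration with a hypothetical helper name. It does not reproduce the merge_stats dictionary, and marking merged samples as NaN is an assumption.

```python
import numpy as np


def merge_short_segments(mask, dt, min_freq_resolution_hz=1.0 / (3 * 3600)):
    """Illustrative only: turn valid segments shorter than 1/f_min into gaps."""
    min_samples = int(round(1.0 / (min_freq_resolution_hz * dt)))
    out = mask.astype(float).copy()
    valid = ~(np.isnan(out) | (out == 0))
    padded = np.concatenate(([False], valid, [False]))
    edges = np.flatnonzero(padded[1:] != padded[:-1])
    segments_merged = 0
    for start, stop in zip(edges[0::2], edges[1::2]):   # runs of valid samples
        if stop - start < min_samples:
            out[start:stop] = np.nan
            segments_merged += 1
    return out, segments_merged


mask = np.ones(5000)          # 50,000 s of data at dt = 10 s
mask[2000:2100] = np.nan
mask[2600:2700] = np.nan      # leaves a 500-sample (5000 s) segment in between
merged, segments_merged = merge_short_segments(mask, dt=10.0)
```

In this example the 5000 s segment between the two gaps is shorter than the default 3 hour minimum, so it is absorbed and the two gaps become one.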

property planned_durations: Get planned gap durations.

property planned_rates: Get planned gap rates.

save_to_hdf5(mask: ndarray[tuple[Any, ...], dtype[_ScalarT]], filename: str = 'gap_mask_data.h5', **kwargs)[source]: Save a gap mask to an HDF5 file.

summary(mask: ndarray[tuple[Any, ...], dtype[float64]] | None = None) → Dict[str, Any][source]

Return a summary of the gap configuration and mask statistics.

Parameters:

mask (np.ndarray, optional) – If provided, includes statistics about this specific mask.

Returns:

Summary dictionary with configuration and statistics.

Return type:

dict

property unplanned_durations: Get unplanned gap durations.

property unplanned_rates: Get unplanned gap rates.

DataSegmentGenerator

class lisagap.DataSegmentGenerator(mask: ndarray[tuple[Any, ...], dtype[_ScalarT]], data: ndarray[tuple[Any, ...], dtype[_ScalarT]], dt: float, t0: float = 0.0)[source]

Bases: object

Generator for segmenting time series data into continuous chunks based on gap masks.

This class takes a binary mask (1s for valid data, NaN/0s for gaps) and segments the corresponding data into continuous chunks. Each segment contains the data, time stamps, mask information, and indices for separate analysis.

Parameters:

mask (NDArray) – Binary mask where 1 indicates valid data and NaN/0 indicates gaps.
data (NDArray) – Time series data corresponding to the mask.
dt (float) – Sampling interval (time step between samples).
t0 (float, optional) – Start time for the time series. Default is 0.0.

Examples

>>> import numpy as np
>>> from lisagap import DataSegmentGenerator
>>>
>>> # Create sample data with gaps
>>> data = np.random.randn(1000)
>>> mask = np.ones_like(data)
>>> mask[200:300] = np.nan  # Create a gap
>>> mask[500:520] = np.nan  # Another gap
>>>
>>> # Create segmenter
>>> segmenter = DataSegmentGenerator(mask=mask, data=data, dt=0.1, t0=0.0)
>>>
>>> # Get time domain segments
>>> segments = segmenter.get_time_segments()
>>>
>>> # Get frequency domain information
>>> freq_info = segmenter.get_freq_info_from_segments()

__init__(mask: ndarray[tuple[Any, ...], dtype[_ScalarT]], data: ndarray[tuple[Any, ...], dtype[_ScalarT]], dt: float, t0: float = 0.0)[source]

Initialize the DataSegmentGenerator.

Parameters:

mask (NDArray) – Binary mask where 1 indicates valid data and NaN/0 indicates gaps.
data (NDArray) – Time series data corresponding to the mask.
dt (float) – Sampling interval (time step between samples).
t0 (float, optional) – Start time for the time series. Default is 0.0.

classmethod from_gap_generator(gap_window_generator: GapWindowGenerator, data: ndarray[tuple[Any, ...], dtype[_ScalarT]], dt: float, t0: float = 0.0, apply_tapering: bool = False, taper_definitions: Dict[str, Dict[str, Dict[str, Any]]] | None = None, **kwargs) → Tuple[DataSegmentGenerator, ndarray[tuple[Any, ...], dtype[_ScalarT]]][source]

Create DataSegmentGenerator from a GapWindowGenerator.

This class method generates a mask using the provided GapWindowGenerator and returns both the DataSegmentGenerator instance and the mask for downstream reuse.

Parameters:

gap_window_generator (GapWindowGenerator) – Configured GapWindowGenerator instance.
data (NDArray) – Time series data to segment.
dt (float) – Sampling interval.
t0 (float, optional) – Start time. Default is 0.0.
apply_tapering (bool, optional) – Whether to apply tapering to the mask. Default is False.
taper_definitions (dict, optional) – Tapering definitions for the mask.
**kwargs – Additional arguments passed to generate_window().

Returns:

Tuple containing:
- DataSegmentGenerator instance
- The generated mask (for downstream reuse)

Return type:

Tuple[DataSegmentGenerator, NDArray]

get_freq_info_from_segments() → Dict[str, Dict[str, Any]][source]

Get frequency domain information for each segment.

Returns:

Dictionary containing frequency information with keys:
- ‘frequencies’: Frequency bins for the segment
- ‘fft’: FFT of the segment data
- ‘start_idx’: Start index in original array
- ‘end_idx’: End index in original array

Return type:

Dict[str, Dict[str, Any]]
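
For a single segment, the ‘frequencies’ and ‘fft’ entries could be obtained along the following lines. This sketch uses NumPy's real-input FFT. Whether the library uses rfft or a full FFT, and how it normalises the result, is not stated here and is an assumption. The test signal is made up.

```python
import numpy as np

dt = 0.1                                                    # sampling interval in seconds
segment = np.sin(2 * np.pi * 1.0 * dt * np.arange(200))     # made-up 1 Hz tone, 20 s long

frequencies = np.fft.rfftfreq(segment.size, d=dt)           # bin spacing is 1 / (N * dt)
fft = np.fft.rfft(segment)
peak_hz = frequencies[np.argmax(np.abs(fft))]               # frequency of the largest bin
```

The bin spacing 1 / (N * dt) is the reason short segments have poor frequency resolution.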

get_time_segments(apply_window: bool = False, left_edge_taper: int | None = None, right_edge_taper: int | None = None) → Dict[str, Dict[str, Any]][source]

Get time domain segments of the data.

Parameters:

apply_window (bool, optional) – If True, apply the windowing/tapering to the segmented data. Default is False.
left_edge_taper (int, optional) – Number of samples to taper on the left edge of the first segment. Only applied when apply_window=True. Default is None (no edge tapering).
right_edge_taper (int, optional) – Number of samples to taper on the right edge of the last segment. Only applied when apply_window=True. Default is None (no edge tapering).

Returns:

Dictionary containing segmented data with keys:
- ‘data’: Data array for the segment (windowed if apply_window=True)
- ‘time’: Time array for the segment
- ‘mask’: Mask array for the segment (showing any tapering applied)
- ‘start_idx’: Start index in original array
- ‘end_idx’: End index in original array

Return type:

Dict[str, Dict[str, Any]]
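
The following sketch shows how a mask can be used to split data into contiguous valid segments carrying the keys listed above. It is a standalone illustration with a hypothetical function name. The outer key names ("segment_0", ...) and the choice of an exclusive end_idx are assumptions, and the library's conventions may differ.

```python
import numpy as np


def split_into_segments(mask, data, dt, t0=0.0):
    """Illustrative only: one dictionary entry per contiguous run of valid samples."""
    valid = ~(np.isnan(mask) | (mask == 0))
    padded = np.concatenate(([False], valid, [False]))
    edges = np.flatnonzero(padded[1:] != padded[:-1])
    time = t0 + dt * np.arange(data.size)
    segments = {}
    pairs = zip(edges[0::2].tolist(), edges[1::2].tolist())
    for k, (start, stop) in enumerate(pairs):
        segments[f"segment_{k}"] = {          # hypothetical key naming
            "data": data[start:stop],
            "time": time[start:stop],
            "mask": mask[start:stop],
            "start_idx": start,
            "end_idx": stop,                  # exclusive here (assumption)
        }
    return segments


data = np.arange(1000, dtype=float)
mask = np.ones_like(data)
mask[200:300] = np.nan
mask[500:520] = np.nan
segments = split_into_segments(mask, data, dt=0.1)
```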

summary() → Dict[str, Any][source]

Get summary information about the segmentation.

Returns:

Summary containing:
- Number of segments
- Total data length
- Total valid data length
- Segment lengths
- Gap information

Return type:

Dict[str, Any]

Key Methods and Features

Edge Tapering

The DataSegmentGenerator.get_time_segments() method supports edge tapering for spectral analysis through three parameters:

apply_window (bool): Apply windowing to data segments
left_edge_taper (int): Number of samples for left edge tapering on first segment
right_edge_taper (int): Number of samples for right edge tapering on last segment

Edge tapering uses one-sided Tukey windows to smoothly ramp data from 0 to full amplitude (left edge) or from full amplitude to 0 (right edge), which reduces spectral leakage in frequency-domain analysis.
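
A one-sided ramp of this kind can be sketched with half-cosine tapers, as below. The function name and the exact ramp formula are assumptions for illustration. The parameter names mirror left_edge_taper and right_edge_taper but the code does not call the library, and it assumes the segment is at least as long as each taper.

```python
import numpy as np


def taper_edges(segment, left_edge_taper=None, right_edge_taper=None):
    """Illustrative only: half-cosine ramps on the outer edges of a segment."""
    out = np.array(segment, dtype=float)
    if left_edge_taper:
        k = np.arange(left_edge_taper)
        # Rises from 0 at the first sample towards full amplitude.
        out[:left_edge_taper] *= 0.5 * (1.0 - np.cos(np.pi * k / left_edge_taper))
    if right_edge_taper:
        k = np.arange(right_edge_taper)
        # Falls from near full amplitude to 0 at the last sample.
        out[-right_edge_taper:] *= 0.5 * (1.0 + np.cos(np.pi * (k + 1) / right_edge_taper))
    return out


tapered = taper_edges(np.ones(100), left_edge_taper=10, right_edge_taper=20)
```

Only the outer edges are ramped, so the interior of the segment keeps its original amplitude.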

Proportional Tapering

The GapWindowGenerator.apply_proportional_tapering() static method provides intelligent gap tapering:

Automatically detects gaps in mask data
Applies different taper fractions based on gap duration (short/medium/long)
Uses Tukey windows for smooth transitions at gap edges
Preserves data integrity while reducing spectral artifacts

Utility Functions

The following utility functions are available for internal use but may be useful for advanced users: