SampleEntropy

Overview

The SampleEntropy class computes the Sample Entropy (SampEn) of multiple signals. It can also find the optimal SampEn parameters by minimizing a regularized MSE objective function with the Optuna Bayesian optimization framework, using the Tree-structured Parzen Estimator (TPE) as the surrogate function.

For more details on this optimization procedure, see Z. Blanks et al., Optimal Sample Entropy Parameter Selection for Short Time Series Signals via Bayesian Optimization, 2023.

The main methods are find_optimal_sampen_params() for performing the optimization and compute_all_sampen() for computing the SampEn of all signals with the optimized or user-provided parameters.
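For orientation, here is a minimal sketch of the quantity being estimated, written directly in NumPy. It is not the class's implementation: the function name sample_entropy is hypothetical, and scaling \(r\) by the signal's standard deviation is an assumption based on common SampEn practice.

```python
import numpy as np


def sample_entropy(x, m=2, r=0.2):
    """Naive O(N^2) SampEn sketch; assumes r is scaled by the signal's std."""
    x = np.asarray(x, dtype=float)
    n = x.size
    tol = r * x.std()

    def count_matches(length):
        # Use the same n - m starting points for both template lengths.
        templates = np.array([x[i:i + length] for i in range(n - m)])
        # Chebyshev distance between every pair of templates.
        dist = np.max(np.abs(templates[:, None, :] - templates[None, :, :]), axis=2)
        # Count ordered pairs within tolerance, excluding self-matches.
        return np.sum(dist <= tol) - len(templates)

    b = count_matches(m)      # matches of length m
    a = count_matches(m + 1)  # matches of length m + 1
    if a == 0 or b == 0:
        return float("inf")   # SampEn is undefined when no matches are found
    return -np.log(a / b)
```

A regular signal should score lower than an irregular one, so a sine wave is expected to give a smaller value than white noise of the same length.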

Attributes

  • df: (pd.DataFrame) The DataFrame containing the signals. Must contain columns for the signal_id, timestamp, and signal value at each timestamp.
  • signal_id: (str, optional) Column name in df containing the signal IDs. Default is 'signal_id'.
  • timestamp: (str, optional) Column name in df containing the timestamps. Default is 'timestamp'.
  • value_col: (str, optional) Column name in df containing the values. Default is 'value'.
  • objective: (str, optional) Objective function to minimize. Default is 'mse'. Choices are 'mse' and 'sampen_eff'.
  • n_boot: (int, optional) Number of bootstrap samples to use in the estimation. Default is 100.
  • n_trials: (int, optional) Number of trials for the optimization. Default is 100.
  • random_seed: (int, optional) Seed for the random number generator. Default is None.
  • r_range: (tuple[float, float], optional) Tuple specifying the range of \(r\) values for the optimization. Default is (0.10, 0.50).
  • m_range: (tuple[int, int], optional) Tuple specifying the range of \(m\) values for the optimization. Default is (1, 3).
  • p_range: (tuple[float, float], optional) Tuple specifying the range of \(p\) values for the stationary bootstrap. Default is (0.01, 0.99).
  • lam: (float, optional) The trade-off parameter that weights the \(r\)-based penalization in the regularized objective. Default is 0.33.
  • r: (float, optional) User-provided value for \(r\). Default is None.
  • m: (int, optional) User-provided value for \(m\). Default is None.
  • p: (float, optional) User-provided value for \(p\). Default is None.
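As a rough illustration of the expected input layout, the sketch below builds a long-format DataFrame with one row per signal and timestamp, using the default column names listed above. The data are synthetic and the sizes (3 signals, 50 observations each) are arbitrary choices for the example.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
n_signals, n_obs = 3, 50  # arbitrary sizes for illustration

df = pd.DataFrame({
    # One ID per signal, repeated for each of its observations.
    "signal_id": np.repeat(np.arange(n_signals), n_obs),
    # Timestamps restart at 0 for every signal.
    "timestamp": np.tile(np.arange(n_obs), n_signals),
    # Synthetic white-noise values standing in for real measurements.
    "value": rng.standard_normal(n_signals * n_obs),
})
```

If the columns in your data are named differently, the signal_id, timestamp, and value_col attributes are where those names would be supplied.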

Methods

find_optimal_sampen_params

Finds the optimal \((m, r)\) SampEn parameters for the input signal set.

Notes

This method uses the Optuna library with a TPE surrogate function to optimize the parameters \((m, r, p)\), where \(p\) is the stationary bootstrap parameter.

Example

>>> sampen = SampleEntropy(df, n_trials=50)
>>> sampen.find_optimal_sampen_params()

compute_all_sampen

Computes the SampEn of the input signal set given either the provided or optimized values of \((m, r)\).

Parameters

  • optimize: (bool, optional) If True, optimize the SampEn parameters before computing the SampEn for all signals. Defaults to False.
  • estimate_uncertainty: (bool, optional) If True, estimates the standard error of SampEn, SE(SampEn), for the given or optimized \((m, r)\) values. Defaults to False.

Returns

  • pd.DataFrame: SampEn estimates given \((m, r)\) for all signals in the data.

Example

>>> sampen = SampleEntropy(df)
>>> results = sampen.compute_all_sampen(optimize=True)
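The n_boot and p attributes suggest that uncertainty is estimated with a stationary bootstrap; how the class does this internally is not documented here, so the following is only a sketch of the general technique (Politis and Romano's stationary bootstrap). The function names stationary_bootstrap and bootstrap_se are hypothetical, and np.mean stands in for the SampEn statistic to keep the example self-contained.

```python
import numpy as np


def stationary_bootstrap(x, p, rng):
    """Draw one stationary-bootstrap resample of x with restart probability p."""
    x = np.asarray(x)
    n = x.size
    idx = np.empty(n, dtype=int)
    idx[0] = rng.integers(n)
    for t in range(1, n):
        if rng.random() < p:
            idx[t] = rng.integers(n)       # start a new block at a random position
        else:
            idx[t] = (idx[t - 1] + 1) % n  # continue the current block, wrapping around
    return x[idx]


def bootstrap_se(x, statistic, p=0.1, n_boot=100, seed=0):
    """Standard error of a statistic across n_boot stationary-bootstrap resamples."""
    rng = np.random.default_rng(seed)
    estimates = [statistic(stationary_bootstrap(x, p, rng)) for _ in range(n_boot)]
    return float(np.std(estimates, ddof=1))


signal = np.random.default_rng(1).standard_normal(200)
se = bootstrap_se(signal, np.mean, p=0.1, n_boot=100)
```

Smaller values of \(p\) give longer blocks on average (mean block length \(1/p\)), which preserves more of the signal's temporal dependence in each resample.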

get_optimization_results

Return a DataFrame of the optimization results.

Parameters

  • attrs: (tuple, optional) Attributes of the Optuna trials to include in the DataFrame. By default, it includes the trial number, the value of the objective function, and the parameters used. Refer to the Optuna documentation for other options.

Returns

  • pd.DataFrame: DataFrame of the optimization trials.

Example

>>> sampen = SampleEntropy(df)
>>> sampen.find_optimal_sampen_params()
>>> optimization_results = sampen.get_optimization_results()