environmentaltools.temporal.pot_method

environmentaltools.temporal.pot_method(df: DataFrame, var_: str, window_size: int, alpha: float = 0.05, sim_no: int = 10000, method: str = 'nearest')

Perform Peaks Over Threshold (POT) analysis with multiple thresholds.

Analyzes extreme values using a Generalized Pareto Distribution (GPD) fitted with L-moments. Tests multiple threshold values and provides bootstrap confidence intervals for the return-level estimates at each return period.

Parameters:
  • df (pd.DataFrame) – Time series data with datetime index

  • var_ (str) – Name of the variable column to analyze

  • window_size (int) – Window size for extracting independent maxima events

  • alpha (float, default=0.05) – Significance level for confidence intervals (0.05 gives 95% CI)

  • sim_no (int, default=10000) – Number of bootstrap simulations

  • method (str, default='nearest') – Interpolation method for the p-value lookup table (the code that uses it is currently commented out)
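To make the roles of alpha and sim_no concrete, here is a minimal sketch of a percentile bootstrap. It is an illustration under stated assumptions, not the code inside pot_method: the helper name bootstrap_ci, the nonparametric resampling scheme, and the use of the sample mean as the statistic are all hypothetical choices.

```python
import numpy as np

def bootstrap_ci(sample, statistic, alpha=0.05, sim_no=10000, seed=0):
    """Percentile bootstrap interval (hypothetical helper, not part of the library)."""
    rng = np.random.default_rng(seed)
    sample = np.asarray(sample, dtype=float)
    n = sample.size
    # Draw sim_no resamples with replacement, each of the original size
    idx = rng.integers(0, n, size=(sim_no, n))
    estimates = np.array([statistic(sample[row]) for row in idx])
    # alpha = 0.05 keeps the central 95% of the bootstrap estimates
    lower, upper = np.quantile(estimates, [alpha / 2, 1 - alpha / 2])
    return lower, upper

# Synthetic threshold excesses, used only for illustration
rng = np.random.default_rng(42)
excesses = rng.exponential(scale=2.0, size=200)
lower, upper = bootstrap_ci(excesses, np.mean, alpha=0.05, sim_no=2000)
print(f"95% CI for the mean excess: [{lower:.2f}, {upper:.2f}]")
```

A smaller alpha widens the interval, and a larger sim_no makes the bounds more stable at the cost of run time.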

Returns:

Results dictionary with keys:

  • 'mean_value_lmom' (np.ndarray) – Shape (n_thresholds, n_params+6+len(tr_eval)). Mean parameter estimates and return levels for each threshold

  • 'upper_lim' (np.ndarray) – Upper confidence interval bounds

  • 'lower_lim' (np.ndarray) – Lower confidence interval bounds

  • 'au2_lmom' (np.ndarray) – Anderson-Darling test statistics (currently zeros)

  • 'au2pv_lmom' (np.ndarray) – Anderson-Darling p-values (currently zeros)

  • 'nyears' (int) – Number of years in the dataset

  • 'thresholds' (np.ndarray) – Threshold values tested (90th to 99.9th percentiles)

  • 'tr_eval' (np.ndarray) – Return periods evaluated, in years: [1, 50, 100, 1000]

Return type:

dict

Notes

The POT method:

  1. Extracts independent events using moving window maxima

  2. Tests thresholds from 90th to 99.9th percentiles

  3. For each threshold:

     - Fits GPD using L-moments
     - Computes bootstrap confidence intervals
     - (Optionally) performs Anderson-Darling goodness-of-fit test
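The steps above can be sketched roughly as follows. This is a simplified illustration, not the library's implementation: the helper names (window_maxima, sample_lmoments, fit_gpd_lmom, return_level) are hypothetical, the centered rolling-maximum rule for picking independent events is an assumption, and only a single threshold is fitted. The GPD uses the shape convention of Coles (2001), so the L-moment estimators are shape = 2 - l1/l2 and scale = l1 * (1 - shape).

```python
import numpy as np
import pandas as pd

def window_maxima(series, window_size):
    """Keep points equal to the maximum of a centered window (assumed rule)."""
    rolling_max = series.rolling(window_size, center=True, min_periods=1).max()
    return series[series == rolling_max]

def sample_lmoments(x):
    """First two sample L-moments from probability-weighted moments."""
    x = np.sort(np.asarray(x, dtype=float))
    n = x.size
    b0 = x.mean()
    b1 = np.sum(np.arange(n) / (n - 1) * x) / n
    return b0, 2.0 * b1 - b0

def fit_gpd_lmom(excesses):
    """Return (scale, shape) of a GPD fitted to threshold excesses."""
    l1, l2 = sample_lmoments(excesses)
    shape = 2.0 - l1 / l2
    scale = l1 * (1.0 - shape)
    return scale, shape

def return_level(threshold, scale, shape, rate_per_year, tr):
    """Level exceeded on average once every tr years."""
    m = rate_per_year * np.asarray(tr, dtype=float)
    if abs(shape) < 1e-8:
        # Exponential limit of the GPD as shape -> 0
        return threshold + scale * np.log(m)
    return threshold + scale / shape * (m**shape - 1.0)

# Synthetic hourly series covering roughly ten years (illustration only)
rng = np.random.default_rng(1)
index = pd.date_range("2000-01-01", periods=87600, freq=pd.Timedelta(hours=1))
series = pd.Series(rng.exponential(2.0, size=index.size), index=index)

# Step 1: independent events via moving-window maxima
peaks = window_maxima(series, window_size=48)

# Step 2: one threshold, here the 90th percentile of the peaks
threshold = np.percentile(peaks, 90)
excesses = (peaks[peaks > threshold] - threshold).to_numpy()

# Step 3: L-moment fit and return levels
scale, shape = fit_gpd_lmom(excesses)
nyears = (index[-1] - index[0]).days / 365.25
rate = excesses.size / nyears
levels = return_level(threshold, scale, shape, rate, [1, 50, 100, 1000])
print(f"scale={scale:.2f}, shape={shape:.2f}, exceedances={excesses.size}")
print("return levels:", np.round(levels, 2))
```

Repeating steps 2 and 3 over a grid of thresholds, and wrapping the fit in a bootstrap loop, would give per-threshold estimates with confidence bounds similar in spirit to the arrays this function returns.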

Higher thresholds give a better asymptotic approximation (lower bias) but leave fewer exceedances to fit (higher variance). The optimal threshold balances this bias-variance tradeoff.
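A quick way to see the data-loss side of this tradeoff is to count exceedances as the threshold rises. The sketch below uses synthetic peaks and four hand-picked percentiles; the actual grid of thresholds used by pot_method is not documented here, so these values are assumptions for illustration.

```python
import numpy as np

# Synthetic "peaks" standing in for the extracted independent maxima
rng = np.random.default_rng(0)
peaks = rng.weibull(1.5, size=1000)

# Assumed percentile grid within the documented 90th-99.9th range
percentiles = [90.0, 95.0, 99.0, 99.9]
thresholds = np.percentile(peaks, percentiles)
counts = [int((peaks > u).sum()) for u in thresholds]

for p, u, n in zip(percentiles, thresholds, counts):
    print(f"{p:5.1f}th percentile: threshold={u:.2f}, exceedances={n}")
```

With only a handful of exceedances at the highest percentiles, the fitted parameters and their bootstrap intervals become very uncertain, which is why results are usually compared across several thresholds.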

References

Coles, S. (2001). “An Introduction to Statistical Modeling of Extreme Values”. Springer.

Examples

>>> import numpy as np
>>> import pandas as pd
>>> from environmentaltools.temporal import pot_method
>>> df = pd.DataFrame({
...     'Hs': np.random.weibull(1.5, 1000)
... }, index=pd.date_range('2000', periods=1000, freq='D'))
>>> results = pot_method(df, 'Hs', window_size=48, alpha=0.05)
>>> print(f"Number of thresholds tested: {len(results['thresholds'])}")