environmentaltools.temporal.pot_method

environmentaltools.temporal.pot_method(df: DataFrame, var_: str, window_size: int, alpha: float = 0.05, sim_no: int = 10000, method: str = 'nearest')

Perform Peaks Over Threshold (POT) analysis with multiple thresholds.

Analyzes extreme values using a Generalized Pareto Distribution (GPD) fitted with L-moments. Tests multiple threshold values and provides bootstrap confidence intervals for the return period estimates.
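As an illustration of fitting a GPD with L-moments, the following is a minimal sketch and not the library's own code. It assumes excesses measured from the threshold (location 0) and uses the standard relations between the first two L-moments and the GPD scale and shape. The names sample_lmoments and fit_gpd_lmom are hypothetical.

```python
import numpy as np

def sample_lmoments(x):
    """Return the first two sample L-moments (l1, l2) of a 1-D array."""
    x = np.sort(np.asarray(x, dtype=float))
    n = x.size
    # Unbiased probability-weighted moments b0 and b1
    b0 = x.mean()
    b1 = np.sum(np.arange(n) * x) / (n * (n - 1))
    return b0, 2.0 * b1 - b0

def fit_gpd_lmom(excesses):
    """Estimate GPD (scale, shape) for excesses over a threshold (location 0).

    Uses l1 = scale / (1 - shape) and l2 = scale / ((1 - shape) * (2 - shape)),
    with the shape sign convention in which positive values mean a heavy tail.
    """
    l1, l2 = sample_lmoments(excesses)
    shape = 2.0 - l1 / l2
    scale = l1 * (1.0 - shape)
    return scale, shape

# Synthetic check: draw GPD excesses by inverse-CDF sampling (assumed parameters)
rng = np.random.default_rng(0)
true_scale, true_shape = 1.0, 0.1
u = rng.uniform(size=200_000)
excesses = true_scale / true_shape * ((1.0 - u) ** (-true_shape) - 1.0)

scale_hat, shape_hat = fit_gpd_lmom(excesses)
print(f"scale={scale_hat:.3f}, shape={shape_hat:.3f}")
```

With a large synthetic sample the estimates should land close to the parameters used to generate it.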

Parameters:

df (pd.DataFrame) – Time series data with a datetime index
var_ (str) – Name of the variable column to analyze
window_size (int) – Window size for extracting independent maxima events
alpha (float, default=0.05) – Significance level for confidence intervals (0.05 gives 95% CI)
sim_no (int, default=10000) – Number of bootstrap simulations
method (str, default='nearest') – Interpolation method for the p-value table lookup (that lookup is currently commented out)
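To show how alpha and sim_no relate to a confidence interval, here is a minimal sketch of a percentile bootstrap. It assumes nonparametric resampling with replacement and a two-sided interval taken at the alpha/2 and 1 - alpha/2 quantiles. Whether pot_method resamples this way is not stated here, and bootstrap_ci is a hypothetical helper.

```python
import numpy as np

def bootstrap_ci(sample, statistic, alpha=0.05, sim_no=10000, seed=None):
    """Percentile-bootstrap mean and (lower, upper) bounds for a statistic."""
    rng = np.random.default_rng(seed)
    sample = np.asarray(sample, dtype=float)
    n = sample.size
    stats = np.empty(sim_no)
    for i in range(sim_no):
        # Resample with replacement and re-evaluate the statistic
        resample = rng.choice(sample, size=n, replace=True)
        stats[i] = statistic(resample)
    # alpha=0.05 selects the 2.5% and 97.5% quantiles, i.e. a 95% interval
    lower, upper = np.quantile(stats, [alpha / 2.0, 1.0 - alpha / 2.0])
    return stats.mean(), lower, upper

# Synthetic data, used only for illustration
rng = np.random.default_rng(1)
data = rng.exponential(scale=2.0, size=500)
mean_est, lower, upper = bootstrap_ci(data, np.mean, alpha=0.05, sim_no=2000, seed=2)
print(f"mean={mean_est:.3f}, CI=({lower:.3f}, {upper:.3f})")
```

A larger sim_no gives more stable interval bounds at a proportional cost in run time.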

Returns:

Results dictionary with keys:

'mean_value_lmom' (np.ndarray) – Shape (n_thresholds, n_params+6+len(tr_eval)). Mean parameter estimates and return levels for each threshold
'upper_lim' (np.ndarray) – Upper confidence interval bounds
'lower_lim' (np.ndarray) – Lower confidence interval bounds
'au2_lmom' (np.ndarray) – Anderson-Darling test statistics (currently zeros)
'au2pv_lmom' (np.ndarray) – Anderson-Darling p-values (currently zeros)
'nyears' (int) – Number of years in the dataset
'thresholds' (np.ndarray) – Threshold values tested (90th to 99.9th percentiles)
'tr_eval' (np.ndarray) – Return periods evaluated [1, 50, 100, 1000] years
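For context on how return levels at the tr_eval periods relate to fitted GPD parameters, the sketch below applies the standard N-year return level formula from Coles (2001). It is not taken from the library's source. The function gpd_return_level and all numeric inputs (threshold, scale, shape, event counts) are hypothetical values chosen for illustration.

```python
import numpy as np

def gpd_return_level(threshold, scale, shape, rate, tr):
    """Return level exceeded on average once every `tr` years.

    `rate` is the mean number of threshold exceedances per year.
    """
    tr = np.asarray(tr, dtype=float)
    m = rate * tr  # expected number of exceedances in tr years
    if abs(shape) < 1e-8:
        # Exponential limit of the GPD as shape -> 0
        return threshold + scale * np.log(m)
    return threshold + scale / shape * (m ** shape - 1.0)

tr_eval = np.array([1, 50, 100, 1000])   # return periods in years
n_exceed, nyears = 120, 30               # hypothetical counts
rate = n_exceed / nyears

levels = gpd_return_level(threshold=3.0, scale=0.8, shape=0.1, rate=rate, tr=tr_eval)
for t, z in zip(tr_eval, levels):
    print(f"T={t:5d} years -> return level {z:.2f}")
```

The rate term converts a per-exceedance probability into a per-year frequency, which is why the number of years in the record matters for the estimate.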

Return type:

dict

Notes

The POT method:

Extracts independent events using moving window maxima
Tests thresholds from 90th to 99.9th percentiles
For each threshold:
    Fits GPD using L-moments
    Computes bootstrap confidence intervals
    (Optionally) performs Anderson-Darling goodness-of-fit test
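The first two steps above can be sketched as follows. This is one common way to extract window maxima and is not necessarily how pot_method does it: a point is kept when it equals the maximum of a centered moving window. The helper window_maxima, the grid of 10 thresholds, and the choice to take percentiles of the extracted peaks are all assumptions made for illustration.

```python
import numpy as np
import pandas as pd

def window_maxima(series, window_size):
    """Keep the points that equal the maximum of a centered moving window."""
    rolling_max = series.rolling(window_size, center=True, min_periods=1).max()
    return series[series == rolling_max]

# Synthetic daily series, used only for illustration
rng = np.random.default_rng(3)
index = pd.date_range("2000-01-01", periods=2000, freq="D")
hs = pd.Series(rng.weibull(1.5, size=2000), index=index, name="Hs")

peaks = window_maxima(hs, window_size=5)

# Hypothetical grid: 10 thresholds between the 90th and 99.9th percentiles
thresholds = np.quantile(peaks.to_numpy(), np.linspace(0.90, 0.999, 10))
print(f"{len(peaks)} peaks, thresholds from {thresholds[0]:.2f} to {thresholds[-1]:.2f}")
```

A wider window enforces a longer separation between retained events, which supports the independence assumption at the cost of fewer peaks.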

Higher thresholds give a better asymptotic approximation (lower bias) but leave fewer exceedances to fit (higher variance). The optimal threshold balances this bias-variance tradeoff.
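To make the tradeoff concrete, the sketch below counts exceedances and computes the mean excess at each candidate threshold. The mean excess is a standard threshold-selection diagnostic described in Coles (2001), not an output of pot_method. The data are synthetic and the grid of 10 thresholds is an assumption.

```python
import numpy as np

rng = np.random.default_rng(4)
peaks = rng.exponential(scale=1.0, size=5000)  # synthetic stand-in for event maxima

probs = np.linspace(0.90, 0.999, 10)
thresholds = np.quantile(peaks, probs)

# Sample size available for the fit shrinks as the threshold rises
n_exceed = np.array([(peaks > u).sum() for u in thresholds])
# Mean excess over each threshold; roughly linear in u where a GPD is adequate
mean_excess = np.array([(peaks[peaks > u] - u).mean() for u in thresholds])

for u, n, me in zip(thresholds, n_exceed, mean_excess):
    print(f"u={u:.2f}  exceedances={n:4d}  mean excess={me:.2f}")
```

The estimates at the highest thresholds rest on only a handful of points, so they fluctuate far more than those at the lower thresholds.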

References

Coles, S. (2001). “An Introduction to Statistical Modeling of Extreme Values”. Springer.

Examples

>>> import numpy as np
>>> import pandas as pd
>>> from environmentaltools.temporal import pot_method
>>> df = pd.DataFrame({
...     'Hs': np.random.weibull(1.5, 1000)
... }, index=pd.date_range('2000', periods=1000, freq='D'))
>>> results = pot_method(df, 'Hs', window_size=48, alpha=0.05)
>>> print(f"Number of thresholds tested: {len(results['thresholds'])}")