environmentaltools.temporal.pot_method
- environmentaltools.temporal.pot_method(df: DataFrame, var_: str, window_size: int, alpha: float = 0.05, sim_no: int = 10000, method: str = 'nearest')
Perform Peaks Over Threshold (POT) analysis with multiple thresholds.
Analyzes extreme values using a Generalized Pareto Distribution (GPD) fitted with L-moments. Tests multiple threshold values and provides bootstrap confidence intervals for the return level estimates at each return period.
- Parameters:
df (pd.DataFrame) – Time series data with datetime index
var_ (str) – Name of the variable column to analyze
window_size (int) – Window size for extracting independent maxima events
alpha (float, default=0.05) – Significance level for confidence intervals (0.05 gives 95% CI)
sim_no (int, default=10000) – Number of bootstrap simulations
method (str, default='nearest') – Interpolation method for the p-value table lookup (that lookup is currently commented out in the code)
- Returns:
Results dictionary with keys:
- 'mean_value_lmom' (np.ndarray) – Shape (n_thresholds, n_params+6+len(tr_eval)). Mean parameter estimates and return levels for each threshold
- 'upper_lim' (np.ndarray) – Upper confidence interval bounds
- 'lower_lim' (np.ndarray) – Lower confidence interval bounds
- 'au2_lmom' (np.ndarray) – Anderson-Darling test statistics (currently zeros)
- 'au2pv_lmom' (np.ndarray) – Anderson-Darling p-values (currently zeros)
- 'nyears' (int) – Number of years in the dataset
- 'thresholds' (np.ndarray) – Threshold values tested (90th to 99.9th percentiles)
- 'tr_eval' (np.ndarray) – Return periods evaluated: [1, 50, 100, 1000] years
- Return type:
dict
Notes
The POT method:
1. Extracts independent events using moving-window maxima
2. Tests thresholds from the 90th to the 99.9th percentile
3. For each threshold:
   - Fits a GPD using L-moments
   - Computes bootstrap confidence intervals
   - (Optionally) performs an Anderson-Darling goodness-of-fit test
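The per-threshold steps listed above can be sketched as follows. This is a minimal illustration, not the package's implementation: the helper names (`sample_lmoments`, `fit_gpd_lmom`, `return_level`, `bootstrap_ci`), the synthetic Weibull peaks, the 20-year record length and the percentile bootstrap are all assumptions made for the example. The fit uses the standard L-moment estimators for a GPD with known threshold, xi = 2 - l1/l2 and sigma = (1 - xi) * l1.

```python
import numpy as np


def sample_lmoments(x):
    """Return the first two sample L-moments (l1, l2) of a 1-D sample."""
    x = np.sort(np.asarray(x, dtype=float))
    n = x.size
    b0 = x.mean()
    # Probability-weighted moment b1 weights the sorted sample by rank 0..n-1
    b1 = np.sum(np.arange(n) * x) / (n * (n - 1))
    return b0, 2.0 * b1 - b0


def fit_gpd_lmom(excess):
    """Estimate GPD shape (xi) and scale (sigma) from threshold excesses."""
    l1, l2 = sample_lmoments(excess)
    xi = 2.0 - l1 / l2
    sigma = (1.0 - xi) * l1
    return xi, sigma


def return_level(u, xi, sigma, rate, tr):
    """Return level for return periods tr (years); rate = exceedances per year."""
    tr = np.asarray(tr, dtype=float)
    if abs(xi) < 1e-8:  # exponential limit of the GPD
        return u + sigma * np.log(rate * tr)
    return u + sigma / xi * ((rate * tr) ** xi - 1.0)


def bootstrap_ci(excess, u, rate, tr, alpha=0.05, sim_no=1000, seed=0):
    """Percentile bootstrap bounds for the return levels (assumed scheme)."""
    rng = np.random.default_rng(seed)
    excess = np.asarray(excess, dtype=float)
    levels = np.empty((sim_no, len(tr)))
    for i in range(sim_no):
        resample = rng.choice(excess, size=excess.size, replace=True)
        xi_b, sigma_b = fit_gpd_lmom(resample)
        levels[i] = return_level(u, xi_b, sigma_b, rate, tr)
    lower = np.quantile(levels, alpha / 2, axis=0)
    upper = np.quantile(levels, 1 - alpha / 2, axis=0)
    return lower, upper


# Synthetic stand-in for already-declustered peaks (hypothetical data)
rng = np.random.default_rng(42)
peaks = rng.weibull(1.5, 2000)
nyears = 20.0  # assumed record length

u = np.quantile(peaks, 0.90)       # one candidate threshold
excess = peaks[peaks > u] - u      # excesses over the threshold
rate = excess.size / nyears        # mean number of exceedances per year
tr_eval = np.array([1, 50, 100, 1000])

xi, sigma = fit_gpd_lmom(excess)
levels = return_level(u, xi, sigma, rate, tr_eval)
lower, upper = bootstrap_ci(excess, u, rate, tr_eval, alpha=0.05, sim_no=500)
```

With `alpha=0.05` the bounds are the 2.5% and 97.5% quantiles of the bootstrap return levels, which matches the 95% interval described for the `alpha` parameter.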
Higher thresholds improve the asymptotic GPD approximation (lower bias) but leave fewer exceedances to fit (higher variance). The threshold choice balances this bias-variance trade-off.
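To make the trade-off concrete, the short sketch below counts how many values remain above each candidate threshold. The percentile grid and the synthetic data are illustrative assumptions; the grid actually used by `pot_method` between the 90th and 99.9th percentiles is not specified here.

```python
import numpy as np

rng = np.random.default_rng(0)
peaks = rng.weibull(1.5, 5000)  # synthetic stand-in for independent peaks

# Illustrative grid only; not necessarily the grid used by pot_method
percentiles = np.array([90.0, 95.0, 99.0, 99.9])
thresholds = np.percentile(peaks, percentiles)
n_exceed = np.array([(peaks > u).sum() for u in thresholds])

for p, u, n in zip(percentiles, thresholds, n_exceed):
    print(f"{p:5.1f}th percentile: u = {u:.3f}, exceedances = {n}")
```

The number of exceedances shrinks quickly as the threshold rises, which is why confidence intervals typically widen at the highest thresholds.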
References
Coles, S. (2001). “An Introduction to Statistical Modeling of Extreme Values”. Springer.
Examples
>>> import numpy as np
>>> import pandas as pd
>>> from environmentaltools.temporal import pot_method
>>> df = pd.DataFrame({
...     'Hs': np.random.weibull(1.5, 1000)
... }, index=pd.date_range('2000', periods=1000, freq='D'))
>>> results = pot_method(df, 'Hs', window_size=48, alpha=0.05)
>>> print(f"Number of thresholds tested: {len(results['thresholds'])}")
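As a companion to the example, the sketch below shows one way the role of `window_size` could be understood: a sample is kept as an event only if it is the largest value within a centered window around it. The helper `moving_window_peaks` is hypothetical and may differ from how the package extracts its moving-window maxima.

```python
import numpy as np
import pandas as pd


def moving_window_peaks(series, window_size):
    """Keep samples equal to the maximum of a centered window around them.

    Hypothetical helper for illustration; not part of environmentaltools.
    """
    rolling_max = series.rolling(window_size, center=True, min_periods=1).max()
    return series[series == rolling_max]


rng = np.random.default_rng(1)
index = pd.date_range("2000-01-01", periods=1000, freq="D")
hs = pd.Series(rng.weibull(1.5, 1000), index=index, name="Hs")

# With a 7-sample window, two retained peaks cannot fall within 3 days of each other
peaks = moving_window_peaks(hs, window_size=7)
print(f"{len(peaks)} peaks kept out of {len(hs)} samples")
```

A larger window enforces wider separation between events, giving fewer but more plausibly independent peaks.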