Common module

Common module for environmental data utilities.

This module provides essential utilities for data input/output operations, data manipulation, model configuration, and general-purpose functions used across the environmentaltools package.

Submodules:

  • read: Functions for reading various file formats (CSV, Excel, NetCDF, etc.)

  • save: Functions for saving data to various formats

  • load: Functions for loading model outputs and configurations

  • write: Functions for writing model input files (SWAN, CSHORE, COPLA)

  • utils: General utility functions for data analysis and transformations

environmentaltools.common.acorr(data: ndarray | Series, max_lags: int = 24)[source]

Compute autocorrelation function of a time series.

Calculates the normalized autocorrelation for a range of lags using matplotlib’s autocorrelation function.

Parameters:
  • data (np.ndarray or pd.Series) – Input time series data.

  • max_lags (int) – Maximum number of lags to compute. Defaults to 24.

Returns:

(lags, autocorrelation) - Arrays of lag values and corresponding autocorrelation coefficients.

Return type:

tuple
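
As a rough illustration of what a normalized autocorrelation over a range of lags computes, the sketch below uses only the standard library. The name acorr_sketch is hypothetical. It assumes non-negative lags and no mean removal; the documentation does not say whether the package returns negative lags or detrends the series.

```python
def acorr_sketch(series, max_lags=24):
    """Normalized autocorrelation for lags 0..max_lags (no mean removal)."""
    n = len(series)
    max_lags = min(max_lags, n - 1)
    # The lag-0 sum of squares is used to normalize every coefficient
    energy = sum(v * v for v in series)
    lags = list(range(max_lags + 1))
    coeffs = [
        sum(series[t] * series[t + k] for t in range(n - k)) / energy
        for k in lags
    ]
    return lags, coeffs


# An alternating series is strongly anti-correlated at lag 1
lags, coeffs = acorr_sketch([1.0, -1.0, 1.0, -1.0, 1.0, -1.0], max_lags=2)
```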

environmentaltools.common.as_float_bool(obj: dict)[source]

Convert string values in dictionary to appropriate types.

Performs type conversion on dictionary values: converts numeric strings to floats/integers and boolean strings (‘True’, ‘False’) to bool type.

Parameters:

obj (dict) – Dictionary with string values to convert.

Returns:

Dictionary with values converted to appropriate types (float, int, or bool).

Return type:

dict
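
The conversion described above can be sketched as follows. This is an illustrative stand-in, not the package's implementation: the name as_float_bool_sketch is hypothetical, and the order of checks (booleans, then integers, then floats) is an assumption.

```python
def as_float_bool_sketch(obj):
    """Convert string values to bool, int or float where possible."""
    converted = {}
    for key, value in obj.items():
        if not isinstance(value, str):
            converted[key] = value  # leave non-string values untouched
            continue
        text = value.strip()
        if text in ("True", "False"):
            converted[key] = text == "True"
            continue
        try:
            converted[key] = int(text)
        except ValueError:
            try:
                converted[key] = float(text)
            except ValueError:
                converted[key] = value  # not numeric: keep the original string
    return converted


result = as_float_bool_sketch(
    {"n": "3", "gamma": "0.8", "flag": "True", "name": "swan"}
)
```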

environmentaltools.common.ascii_tiff(file_name: str, output_format: str = 'row')[source]

Read ASCII or GeoTIFF raster files and extract coordinate data.

Reads georeferenced raster files using rasterio and extracts coordinates and values in either tabular (row) or grid format.

Parameters:
  • file_name (str) – Path to the ASCII or TIFF raster file.

  • output_format (str) – Output format type. Options are:
      - “row”: Returns flattened DataFrame with x, y, z columns (default)
      - “grid”: Returns dictionary with 2D arrays for x, y, z

Returns:

A tuple containing:
  • data (pd.DataFrame or dict): Coordinate and value data in specified format.

  • profile (dict): Rasterio profile containing metadata (CRS, transform, etc.).

Return type:

tuple

environmentaltools.common.best_params(data: DataFrame, bins: int, distrib: str, tail: bool = False)[source]

Computes the best parameters of a simple probability model based on the RMSE of the PDF.

Parameters:
  • data (pd.DataFrame) – Raw time series.

  • bins (int) – Number of bins for the histogram.

  • distrib (str) – Name of the probability model.

  • tail (bool, optional) – If True, fit only the tail. Defaults to False.

Returns:

The estimated parameters.

Return type:

list

environmentaltools.common.bias_adjustment(obs, hist, rcp, variable, funcs=['gumbel_l', 'gumbel_r'], quantiles=[0.1, 0.9], params=None)[source]

Bias adjustment for climate data using parametric quantile mapping.

Parameters:
  • obs (pd.DataFrame) – Observed data.

  • hist (pd.DataFrame) – Historical simulation data.

  • rcp (pd.DataFrame) – Scenario/projection data.

  • variable (str) – Variable name to adjust.

  • funcs (list, optional) – List of distribution names. Defaults to [“gumbel_l”, “gumbel_r”].

  • quantiles (list, optional) – Quantiles for tail adjustment. Defaults to [0.1, 0.9].

  • params (dict, optional) – Precomputed distribution parameters. Defaults to None.

Returns:

(hist, rcp) with bias-adjusted values in column ‘unbiased’.

Return type:

tuple

environmentaltools.common.bidimensional_ecdf(data1: ndarray, data2: ndarray, num_bins: int)[source]

Compute empirical 2D cumulative distribution function (ECDF).

Calculates the joint empirical CDF for two variables using a 2D histogram approach with cumulative summation.

Parameters:
  • data1 (np.ndarray) – Values of the first variable.

  • data2 (np.ndarray) – Values of the second variable.

  • num_bins (int) – Number of bins for the 2D histogram in each dimension.

Returns:

(x_mesh, y_mesh, ecdf_values) - 2D meshgrids of bin centers and corresponding cumulative probability values.

Return type:

tuple
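
The "2D histogram with cumulative summation" idea can be sketched in plain Python as below. The function name is hypothetical, it returns 1D bin centers instead of meshgrids, and the binning details (equal-width bins spanning the data range) are assumptions. The package's own binning may differ.

```python
def bidimensional_ecdf_sketch(data1, data2, num_bins):
    """Joint ECDF on a num_bins x num_bins grid of equal-width bins.

    Assumes each variable has a nonzero range.
    """
    lo1, hi1 = min(data1), max(data1)
    lo2, hi2 = min(data2), max(data2)
    w1 = (hi1 - lo1) / num_bins
    w2 = (hi2 - lo2) / num_bins

    def bin_index(value, low, width):
        # The maximum value is assigned to the last bin
        return min(int((value - low) / width), num_bins - 1)

    counts = [[0] * num_bins for _ in range(num_bins)]
    for a, b in zip(data1, data2):
        counts[bin_index(a, lo1, w1)][bin_index(b, lo2, w2)] += 1

    total = len(data1)
    ecdf = [[0.0] * num_bins for _ in range(num_bins)]
    for i in range(num_bins):
        for j in range(num_bins):
            # Cumulative sum along both axes via inclusion-exclusion
            ecdf[i][j] = (
                counts[i][j] / total
                + (ecdf[i - 1][j] if i else 0.0)
                + (ecdf[i][j - 1] if j else 0.0)
                - (ecdf[i - 1][j - 1] if i and j else 0.0)
            )

    centers1 = [lo1 + (i + 0.5) * w1 for i in range(num_bins)]
    centers2 = [lo2 + (j + 0.5) * w2 for j in range(num_bins)]
    return centers1, centers2, ecdf


x_centers, y_centers, joint = bidimensional_ecdf_sketch(
    [0.0, 1.0, 2.0, 3.0], [0.0, 1.0, 2.0, 3.0], num_bins=2
)
```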

environmentaltools.common.coords_name(ds)[source]

Detect standard coordinate variable names in dataset.

Identifies latitude and longitude coordinate names from common conventions.

Parameters:

ds (xarray.Dataset) – Input dataset to inspect.

Returns:

(lat_name, lon_name) - Names of latitude and longitude coordinates.

Return type:

tuple

Raises:

Exception – If standard coordinate names are not found in dataset.

environmentaltools.common.create_lat_lon_matrix(lat, lon)[source]

Create 2D coordinate meshgrid from 1D coordinate vectors.

Converts separate latitude and longitude vectors into 2D coordinate matrices suitable for spatial operations and interpolation.

Parameters:
  • lat (np.ndarray) – 1D array of latitude values.

  • lon (np.ndarray) – 1D array of longitude values.

Returns:

(lat_matrix, lon_matrix) - 2D arrays of shape (len(lat), len(lon)).

Return type:

tuple

environmentaltools.common.create_mesh_dictionary(file_name: str, sheet_name: str = None)[source]

Read Excel file and create mesh parameter dictionary.

Loads mesh configuration from an Excel file and optionally extracts a specific sheet as a dictionary.

Parameters:
  • file_name (str) – Path to the Excel file with mesh parameters.

  • sheet_name (str, optional) – Name of specific sheet to extract as dictionary. If None, returns the entire DataFrame. Defaults to None.

Returns:

Dictionary of parameters if sheet_name specified, otherwise returns full DataFrame.

Return type:

dict or pd.DataFrame

environmentaltools.common.create_project_directory(params: dict, data, global_db, local_db)[source]

Create project folder structure with initialized files for SWAN and COPLA models.

Generates a directory structure with subdirectories for each time step, containing bathymetry files and SWAN input files.

Parameters:
  • params (dict) – Dictionary with model parameters including:
      - directory: Root directory path for the project

  • data (pd.DataFrame) – Time series of boundary condition data with datetime index.

  • global_db (xr.Dataset) – Global mesh bathymetry dataset with ‘depth’ variable.

  • local_db (xr.Dataset) – Local mesh bathymetry dataset with ‘depth’ variable.

Returns:

Creates directory structure and files as side effect.

Return type:

None

environmentaltools.common.cshore_config()[source]

Create default CSHORE model configuration parameters.

Generates a dictionary with default configuration parameters for the CSHORE (Cross-shore) numerical model, including morphology, wave, and sediment transport settings.

Returns:

Dictionary containing CSHORE configuration parameters including:
  • Model flags (iline, iprofl, isedav, etc.)

  • Physical parameters (gamma, sporo, sg, etc.)

  • Boundary conditions (timebc_wave, swlbc, etc.)

  • Sediment properties (tanphi, blp, slp, etc.)

Return type:

dict

environmentaltools.common.csv(file_name: str, ts: bool = False, date_format=None, sep: str = ',', encoding: str = 'utf-8', index_col: list = [0], non_natural_date: bool = False, no_data_values: int = -999)[source]

Read CSV file with flexible datetime and encoding options.

Flexible CSV reader with support for time series data, custom separators, various encodings, and handling of non-natural date formats (e.g., 30-day months).

Parameters:
  • file_name (str) – Path to CSV file (supports .csv, .txt, .dat, .zip).

  • ts (bool) – If True, treats first column as datetime index. Defaults to False.

  • date_format (str, optional) – Date format string for parsing. Defaults to None.

  • sep (str) – Column separator character. Defaults to “,”.

  • encoding (str) – Character encoding. Defaults to “utf-8”.

  • index_col (list) – Columns to use as index. Defaults to [0].

  • non_natural_date (bool) – If True, handles model dates with 30-day months. Defaults to False.

  • no_data_values (int) – Value to treat as NaN. Defaults to -999.

Returns:

Loaded data with appropriate index type.

Return type:

pd.DataFrame

environmentaltools.common.cwriter(file_out: str)[source]

Create Excel workbook and worksheet for writing.

Initializes an Excel file with xlsxwriter engine for formatted output.

Parameters:

file_out (str) – Output file path.

Returns:

(workbook, worksheet) - Excel writer objects for formatting.

Return type:

tuple

environmentaltools.common.data_over_threshold(data: DataFrame, variable: str, threshold: float, duration: float)[source]

Extract extreme events exceeding a threshold for minimum duration.

Identifies continuous periods where variable values exceed a threshold and persist for at least the specified duration. Counts events per year.

Parameters:
  • data (pd.DataFrame) – Time series data with datetime index.

  • variable (str) – Column name of the variable to analyze.

  • threshold (float) – Threshold value to identify extreme events.

  • duration (float) – Minimum duration (in time units matching the index) required to classify as an extreme event.

Returns:

(events, events_per_year)
  • events (pd.DataFrame): Time series of all values during threshold exceedance events.

  • events_per_year (pd.DataFrame): Count of events per year with ‘eventno’ column.

Return type:

tuple
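
A minimal sketch of the event-detection logic, using only the standard library. The names are hypothetical, and two details are assumptions: exceedance is taken as strictly greater than the threshold, and an event's duration is measured from its first to its last sample.

```python
from datetime import datetime, timedelta


def events_over_threshold(times, values, threshold, min_duration):
    """Return runs of consecutive exceedances lasting at least min_duration."""
    events, run = [], []
    for t, v in zip(times, values):
        if v > threshold:
            run.append((t, v))
            continue
        # The run has ended: keep it only if it lasted long enough
        if run and run[-1][0] - run[0][0] >= min_duration:
            events.append(run)
        run = []
    if run and run[-1][0] - run[0][0] >= min_duration:
        events.append(run)
    return events


start = datetime(2020, 1, 1)
times = [start + timedelta(hours=h) for h in range(8)]
values = [0.5, 2.1, 2.4, 2.2, 0.7, 2.5, 0.6, 0.4]

events = events_over_threshold(
    times, values, threshold=2.0, min_duration=timedelta(hours=2)
)

# Count events per calendar year, keyed by the year of the first sample
events_per_year = {}
for run in events:
    year = run[0][0].year
    events_per_year[year] = events_per_year.get(year, 0) + 1
```

In this example the single-sample exceedance at hour 5 is discarded because it does not persist for two hours.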

environmentaltools.common.date_to_julian(dates: list, calendar: str = 'julian') ndarray[source]

Convert datetime objects to Julian dates.

Transforms a list of datetime objects into Julian date numbers based on the specified calendar system.

Parameters:
  • dates (list) – List of datetime objects to convert.

  • calendar (str, optional) – Calendar system to use for conversion. Currently only ‘julian’ is supported. Defaults to ‘julian’.

Returns:

Array of Julian date numbers.

Return type:

np.ndarray
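
For orientation, one common definition of a Julian date is the number of days (with fractions) counted from noon on a fixed epoch, and it can be derived from a Python datetime as sketched below. Whether the package uses this definition for calendar='julian' is not stated here, so treat the helper to_julian_date as a hypothetical illustration only.

```python
from datetime import datetime

# Offset between Python's proleptic Gregorian ordinal and the Julian date
JD_OFFSET = 1721424.5


def to_julian_date(moment):
    """Julian date of a naive datetime (microseconds ignored)."""
    seconds = moment.hour * 3600 + moment.minute * 60 + moment.second
    return moment.toordinal() + JD_OFFSET + seconds / 86400.0


# Noon on 1 January 2000 is the J2000.0 reference epoch
j2000 = to_julian_date(datetime(2000, 1, 1, 12, 0, 0))
```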

environmentaltools.common.delft_raw_files(folder: str, variables: dict, case_id: str)[source]

Load Delft3D raw output files for a specific case.

Reads multiple variable output files from Delft3D model for a given case, handling both communication (vars_com_guad) and wave (vars_wavm) variables.

Parameters:
  • folder (str) – Path to directory containing case subdirectories.

  • variables (dict) – Dictionary with keys ‘vars_com_guad’ and/or ‘vars_wavm’, each containing list of variable names to load.

  • case_id (str) – Case identifier (e.g., ‘case0001’).

Returns:

Dictionary where keys are variable names and values are numpy arrays containing the spatial data for each variable.

Return type:

dict

environmentaltools.common.delft_raw_files_point(point: list, mesh_filename: str, folder: str, variables: list, num_cases: int, filename: str = 'seastates')[source]

Extract time series data at a specific point from Delft3D model outputs.

Reads Delft3D raw output files and extracts time series at the nearest grid point to specified coordinates. Saves results to compressed CSV.

Parameters:
  • point (list) – [x, y] coordinates of extraction point.

  • mesh_filename (str) – Path to Delft3D mesh file containing grid coordinates.

  • folder (str) – Directory containing case subdirectories with model outputs.

  • variables (list) – Variable names to extract (e.g., [‘hs’, ‘tp’, ‘eta’]).

  • num_cases (int) – Number of model cases to process.

  • filename (str) – Output filename prefix. Defaults to “seastates”.

Returns:

Saves extracted data to a ZIP file named {filename}{point[0]}_{point[1]}.zip.

Return type:

None

environmentaltools.common.ecdf(df: DataFrame, variable: str, num_percentiles: int | bool = False) DataFrame[source]

Compute the empirical cumulative distribution function (ECDF).

Calculates non-exceedance probabilities for the variable values. Can optionally interpolate to a specified number of percentiles.

Parameters:
  • df (pd.DataFrame) – Raw time series data.

  • variable (str) – Name of the variable column to analyze.

  • num_percentiles (int | bool, optional) – Number of empirical percentiles to interpolate. If False, returns all data points. Defaults to False.

Returns:

DataFrame with variable values and their non-exceedance probabilities. Index represents probability values.

Return type:

pd.DataFrame
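
The non-exceedance probabilities of an ECDF can be sketched as follows. The plotting position i/n is an assumption (other conventions such as i/(n+1) are also common), and ecdf_sketch is a hypothetical name.

```python
def ecdf_sketch(values):
    """Return (probability, value) pairs sorted by value."""
    ordered = sorted(values)
    n = len(ordered)
    # The i-th smallest value is assigned non-exceedance probability i/n
    return [((i + 1) / n, v) for i, v in enumerate(ordered)]


pairs = ecdf_sketch([3.0, 1.0, 2.0, 4.0])
```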

environmentaltools.common.empirical_cdf_mapping(obs, hist, rcp, variable)[source]

Apply empirical CDF mapping for bias correction.

Parameters:
  • obs (pd.DataFrame) – Observed data.

  • hist (pd.DataFrame) – Historical simulation data.

  • rcp (pd.DataFrame) – Scenario/projection data.

  • variable (str) – Variable name to adjust.

Returns:

(hist, rcp) with bias-adjusted values in column ‘unbiased’.

Return type:

tuple
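
Empirical CDF (quantile) mapping can be sketched as: find the non-exceedance probability of each model value in the historical simulation, then read the observed value at that same probability. The helper names and the linear interpolation of quantiles below are assumptions, not the package's implementation.

```python
from bisect import bisect_right


def empirical_quantile(sorted_values, prob):
    """Linearly interpolate the quantile of sorted_values at probability prob."""
    position = prob * (len(sorted_values) - 1)
    lower = int(position)
    upper = min(lower + 1, len(sorted_values) - 1)
    weight = position - lower
    return sorted_values[lower] * (1 - weight) + sorted_values[upper] * weight


def quantile_map(obs, hist, model_values):
    """Map model values onto the observed distribution via the hist ECDF."""
    obs_sorted, hist_sorted = sorted(obs), sorted(hist)
    mapped = []
    for value in model_values:
        # Fraction of historical values less than or equal to this value
        prob = bisect_right(hist_sorted, value) / len(hist_sorted)
        mapped.append(empirical_quantile(obs_sorted, prob))
    return mapped


obs = [1.0, 2.0, 3.0, 4.0, 5.0]
hist = [2.0, 3.0, 4.0, 5.0, 6.0]  # simulation biased high by one unit
unbiased = quantile_map(obs, hist, [6.0, 1.0])
```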

environmentaltools.common.epdf(df: DataFrame, variable: str, num_bins: int = 14) DataFrame[source]

Compute the empirical probability density function (PDF).

Creates a histogram-based empirical PDF by binning the variable values and calculating probability densities for each bin.

Parameters:
  • df (pd.DataFrame) – Raw time series data.

  • variable (str) – Name of the variable column to analyze.

  • num_bins (int, optional) – Number of bins for the histogram. Defaults to 14.

Returns:

DataFrame with bin centers as index and ‘prob’ column containing probability densities.

Return type:

pd.DataFrame

environmentaltools.common.extract_isolines(data: dict, iso_values: list = None) dict[source]

Extract contour lines at specified values from gridded data.

Creates contour plot and extracts coordinates of contour lines at specified iso-values. Returns longest path for each iso-value.

Parameters:
  • data (dict) – Dictionary containing gridded data with keys:
      - ‘x’: x-coordinates (2D array)
      - ‘y’: y-coordinates (2D array)
      - ‘z’: values at each (x, y) point (2D array)

  • iso_values (list, optional) – Values at which to extract contours. Defaults to [0].

Returns:

Dictionary mapping each iso-value to a DataFrame with ‘x’ and ‘y’ columns containing coordinates along the contour line.

Return type:

dict

environmentaltools.common.find_indexes(latvar, lonvar, lat0, lon0)[source]

Find array indices of coordinates nearest to target location.

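Uses the straight-line chord (“tunnel through Earth”) distance between points on a sphere as the metric for finding the closest grid point. This distance ranks grid points in the same order as great-circle distance, so the nearest point is the same.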

Parameters:
  • latvar (np.ndarray) – Array of latitude values (degrees).

  • lonvar (np.ndarray) – Array of longitude values (degrees).

  • lat0 (float) – Target latitude (degrees).

  • lon0 (float) – Target longitude (degrees).

Returns:

(lat_index, lon_index) - Array indices of nearest point, or (None, None) if arrays are empty.

Return type:

tuple

References

https://www.unidata.ucar.edu/blogs/developer/en/entry/accessing_netcdf_data_by_coordinates
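
A nearest-point search based on the tunnel distance can be sketched as below: each coordinate pair is converted to a point on the unit sphere and the squared straight-line distance to the target is minimized. The function name and the use of 2D nested lists are assumptions made to keep the example free of dependencies.

```python
import math


def find_indexes_sketch(lats, lons, lat0, lon0):
    """Return (row, col) of the grid point nearest to (lat0, lon0).

    lats and lons are 2D nested lists of degrees with the same shape.
    """

    def to_unit_vector(lat_deg, lon_deg):
        lat, lon = math.radians(lat_deg), math.radians(lon_deg)
        return (
            math.cos(lat) * math.cos(lon),
            math.cos(lat) * math.sin(lon),
            math.sin(lat),
        )

    target = to_unit_vector(lat0, lon0)
    best, best_dist_sq = (None, None), math.inf
    for i, (lat_row, lon_row) in enumerate(zip(lats, lons)):
        for j, (lat, lon) in enumerate(zip(lat_row, lon_row)):
            point = to_unit_vector(lat, lon)
            # Squared chord length through the sphere
            dist_sq = sum((p - q) ** 2 for p, q in zip(point, target))
            if dist_sq < best_dist_sq:
                best, best_dist_sq = (i, j), dist_sq
    return best


lats = [[36.0, 36.0], [37.0, 37.0]]
lons = [[-4.0, -3.0], [-4.0, -3.0]]
nearest = find_indexes_sketch(lats, lons, 36.9, -3.2)
```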

environmentaltools.common.find_nearest_point(data, point: tuple)[source]

Find the nearest point in a spatial dataset to a given coordinate.

Uses Euclidean distance to identify the closest point in the dataset to the specified target coordinates.

Parameters:
  • data (xr.Dataset or pd.DataFrame) – Dataset with ‘x’ and ‘y’ coordinates.

  • point (tuple) – Target point coordinates as (x, y).

Returns:

Index of the nearest point in the dataset.

Return type:

int

environmentaltools.common.formats(wbook, style)[source]

Apply predefined formatting styles to Excel workbook.

Provides styling presets for headers and alternating rows.

Parameters:
  • wbook (xlsxwriter.Workbook) – Excel workbook object.

  • style (str) – Style name (‘header’, ‘even’, or ‘odd’).

Returns:

Format object with specified styling.

Return type:

xlsxwriter.Format

environmentaltools.common.gaps(data: DataFrame, variables: str | list, file_name: str = 'gaps', buoy: bool = False)[source]

Create summary table of data gaps for time series variables.

Analyzes time series to identify gaps, sampling frequency, and data quality metrics. Saves results to Excel file.

Parameters:
  • data (pd.DataFrame) – Time series data with datetime index.

  • variables (str or list) – Variable name(s) to analyze for gaps.

  • file_name (str) – Output filename (without extension) for the gap summary table. Defaults to “gaps”.

  • buoy (bool) – If True, includes quality control metrics assuming ‘Qc_e’ column exists. Defaults to False.

Returns:

Summary table with columns including cadency, accuracy, period, number of years, gap percentage, median gap, and maximum gap.

Return type:

pd.DataFrame

environmentaltools.common.keys_as_int(obj: dict)[source]

Convert dictionary keys to integers when reading a JSON file.

Parameters:

obj (dict) – Input dictionary.

Returns:

Dictionary with integer keys where possible, original keys otherwise.

Return type:

dict
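
JSON object keys are always strings, which is why a conversion like this is useful after loading. A plausible way to apply it is as an object_hook for json.loads, as sketched below. The name keys_as_int_sketch is hypothetical, and the documentation does not say how the package wires the function in.

```python
import json


def keys_as_int_sketch(obj):
    """Return a copy of obj with keys converted to int where possible."""
    converted = {}
    for key, value in obj.items():
        try:
            converted[int(key)] = value
        except ValueError:
            converted[key] = value  # keep keys that are not integers
    return converted


text = '{"1": "a", "2": {"10": 0.5}, "name": "x"}'
# object_hook is called for every JSON object, including nested ones
loaded = json.loads(text, object_hook=keys_as_int_sketch)
```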

environmentaltools.common.keys_as_nparray(obj: dict)[source]

Recursively convert dictionary values to numpy arrays when reading a JSON file.

Recursively processes nested dictionaries up to 3 levels deep, converting values to numpy arrays where possible.

Parameters:

obj (dict) – Input dictionary with nested structure.

Returns:

Dictionary with values converted to numpy arrays where applicable.

Return type:

dict

environmentaltools.common.kmz(file_name: str, joint: bool = False)[source]

Read KMZ or KML files and extract elevation contour data.

Parses KMZ (zipped) or KML files to extract coordinate and elevation data from placemarks, with support for multiple elevation detection methods.

Parameters:
  • file_name (str) – Path to KMZ or KML file.

  • joint (bool) – If True, combines all contours into single DataFrame. If False, returns separate lists for each contour. Defaults to False.

Returns:

If joint=True, returns DataFrame with columns [x, y, z].

If joint=False, returns tuple of lists (x, y, z) where each element corresponds to a separate contour line.

Return type:

pd.DataFrame or tuple

Raises:

SystemExit – If KML parsing fails or file structure is invalid.

environmentaltools.common.latslons_values(ds, lat_name, lon_name)[source]

Extract latitude and longitude values from dataset.

Retrieves coordinate values, creating 2D meshgrid if coordinates are 1D arrays.

Parameters:
  • ds (xarray.Dataset) – Input dataset containing coordinate variables.

  • lat_name (str) – Name of latitude coordinate variable.

  • lon_name (str) – Name of longitude coordinate variable.

Returns:

(lats, lons) - Arrays of latitude and longitude values.

Return type:

tuple

environmentaltools.common.mat(file_name: str, variable: str = 'x', julian: bool = False)[source]

Read MATLAB .mat files and extract time series data.

Loads MATLAB files using scipy.io.loadmat and converts time values to pandas datetime format. Assumes data structure with timestamps in first column and values in second column.

Parameters:
  • file_name (str) – Path to .mat file.

  • variable (str) – Variable name to extract from .mat file structure. Defaults to “x”.

  • julian (bool) – If True, keeps julian date format. If False, converts to datetime using matplotlib date conversion. Defaults to False.

Returns:

Time series with datetime index and ‘Q’ column containing values.

Return type:

pd.DataFrame

environmentaltools.common.max_moving_window(data: DataFrame, duration: int) DataFrame[source]

Select peaks from time series using a moving window.

Identifies peak values that occur at the center of a moving window, effectively filtering out values that are not local maxima within the specified duration.

Parameters:
  • data (pd.DataFrame) – Time series data.

  • duration (int) – Duration of the moving window (in index units).

Returns:

DataFrame containing only the detected peaks with their original timestamps.

Return type:

pd.DataFrame
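
The peak selection can be sketched as: keep a value only if it equals the maximum of a window centered on it. The version below works on a plain list with integer positions; the function name, the window truncation at the series edges, and the handling of ties are assumptions.

```python
def peaks_in_moving_window(values, duration):
    """Return (position, value) pairs that are maxima of a centered window."""
    half = duration // 2
    peaks = []
    for i, value in enumerate(values):
        # The window is truncated at the start and end of the series
        window = values[max(0, i - half): i + half + 1]
        if value == max(window):
            peaks.append((i, value))
    return peaks


peaks = peaks_in_moving_window([1, 3, 2, 5, 4, 2, 1], duration=3)
```

Note that with this simple rule a plateau of equal values yields several peaks; a production implementation would need a tie-breaking rule.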

environmentaltools.common.maximum_absolute_error(a, b)[source]

Calculate Maximum Absolute Error between two arrays.

The maximum absolute error measures the largest absolute difference between predicted and observed values. Useful for identifying worst-case errors.

Parameters:
  • a (array-like) – First array (e.g., observed values).

  • b (array-like) – Second array (e.g., predicted values).

Returns:

The maximum absolute error value.

Return type:

float

environmentaltools.common.mean_absolute_error(a, b)[source]

Calculate Mean Absolute Error between two arrays.

The Mean Absolute Error (MAE) measures the average magnitude of errors between predicted and observed values without considering their direction.

Parameters:
  • a (array-like) – First array (e.g., observed values).

  • b (array-like) – Second array (e.g., predicted values).

Returns:

The mean absolute error value.

Return type:

float
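
The difference between the mean and the maximum absolute error is easiest to see on a small example. The two helpers below are dependency-free sketches with hypothetical names; they assume inputs of equal length.

```python
def mean_absolute_error_sketch(a, b):
    """Average of |a_i - b_i|."""
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)


def maximum_absolute_error_sketch(a, b):
    """Largest |a_i - b_i|."""
    return max(abs(x - y) for x, y in zip(a, b))


observed = [1.0, 2.0, 3.0, 4.0]
predicted = [1.5, 2.0, 2.0, 6.0]

mean_err = mean_absolute_error_sketch(observed, predicted)
max_err = maximum_absolute_error_sketch(observed, predicted)
```

Here a single poor prediction dominates the maximum error while the mean error stays comparatively small.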

environmentaltools.common.mean_dt_param(B, Q)[source]

environmentaltools.common.netcdf(file_name: str, variables: str = None, latlon: list = None, depth: float = None, time_series: bool = True, glob: bool = False)[source]

Read NetCDF4 files with spatial/temporal subsetting options.

Reads NetCDF files using xarray with support for multi-file datasets, spatial point extraction, and time series conversion.

Parameters:
  • file_name (str) – Path to NetCDF file or directory pattern for glob.

  • variables (str or list, optional) – Variable name(s) to extract. Defaults to None (all).

  • latlon (list, optional) – [latitude, longitude] for point extraction. Defaults to None.

  • depth (float, optional) – Depth level for extraction. Defaults to None.

  • time_series (bool) – If True, converts to time series DataFrame. Defaults to True.

  • glob (bool) – If True, opens multiple files using pattern matching. Defaults to False.

Returns:

Extracted data. If latlon specified, returns tuple of (DataFrame, (nearest_lat, nearest_lon)).

Return type:

pd.DataFrame or xarray.Dataset

Raises:

ValueError – If glob files are inconsistent.

environmentaltools.common.nonstationary_ecdf(data: DataFrame, variable: str, wlen: float = 0.038329911019849415, equal_windows: bool = False, pemp: list = None)[source]

Computes empirical percentiles using a moving window.

Parameters:
  • data (pd.DataFrame) – Time series.

  • variable (str) – Name of the variable.

  • wlen (float) – Length of the moving window in years. Defaults to about 0.0383 (14 days).

  • equal_windows (bool) – If True, use equal window size.

  • pemp (list, optional) – Empirical percentiles to use.

Returns:

Values of the given non-stationary percentiles (pd.DataFrame) and the chosen empirical percentiles (list).

Return type:

pd.DataFrame
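
The default wlen in the signature is consistent with 14 days expressed as a fraction of a 365.25-day year. The year length is inferred from the number itself, not stated by the documentation.

```python
DAYS_PER_YEAR = 365.25  # assumed year length, inferred from the default value

# 14 days expressed in years
wlen = 14 / DAYS_PER_YEAR

# The same conversion for other window lengths, e.g. 30 days
wlen_30_days = 30 / DAYS_PER_YEAR
```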

environmentaltools.common.nonstationary_epdf(data: DataFrame, variable: str, wlen: float = 0.038329911019849415, no_values: int = 14)[source]

Computes the empirical PDF using a moving window.

Parameters:
  • data (pd.DataFrame) – Time series.

  • variable (str) – Name of the variable.

  • wlen (float) – Length of the moving window in years. Defaults to about 0.0383 (14 days).

  • no_values (int) – Number of values for the PDF.

Returns:

Values of the non-stationary PDF.

Return type:

pd.DataFrame

environmentaltools.common.npy(file_name: str)[source]

Read data from NumPy binary file (.npy).

Loads numpy array or pickled data from .npy file format, with fallback to pickle loading for complex objects.

Parameters:

file_name (str) – Path to .npy file (extension added automatically if missing).

Returns:

Loaded numpy array or dictionary if pickled object.

Return type:

np.ndarray or dict

environmentaltools.common.npy2json(params: dict)[source]

Convert dictionary with numpy arrays to JSON format and save to file.

Serializes numpy arrays to lists and performs custom transformations for specific parameter structures before saving to JSON file.

Parameters:

params (dict) – Dictionary containing parameters to transform. Must include ‘fname’ key for output filename. Arrays are converted to lists, ‘mode’ values to integers, and handles nested structures in ‘all’ and ‘fun’ keys.

Returns:

None

environmentaltools.common.optimize_rbf_epsilon(coords, data, n_train, method='gaussian', smooth=0.5, eps0=1, optimizer='local', metric='rmse')[source]

Optimize epsilon and smooth parameters for RBF by minimizing validation error (RMSE or MAE). Allows local (SLSQP) or global (differential_evolution) optimization.

Parameters:
  • coords (np.ndarray) – Input coordinates (n_samples, n_features).

  • data (np.ndarray) – Target values (n_samples,).

  • n_train (int) – Number of samples for training (rest for validation).

  • method (str, optional) – RBF function type. Default ‘gaussian’.

  • smooth (float, optional) – Initial smooth value. Default 0.5.

  • eps0 (float, optional) – Initial epsilon value. Default 1.

  • optimizer (str, optional) – ‘local’ (SLSQP) or ‘global’ (differential_evolution).

  • metric (str, optional) – ‘rmse’ or ‘mae’.

Returns:

(epsilon_opt, smooth_opt)

Return type:

tuple

environmentaltools.common.outliers_detection(data, outliers_fraction, method='Local Outlier Factor', scaler_method='MinMaxScaler')[source]

Detect outliers in data using various sklearn algorithms.

Parameters:
  • data (array-like) – Input data (2D array expected).

  • outliers_fraction (float) – Fraction of outliers to detect (contamination parameter).

  • method (str, optional) – Outlier detection method. Options are:
      - “Robust covariance”: Uses EllipticEnvelope
      - “One-Class SVM”: Uses OneClassSVM
      - “Isolation Forest”: Uses IsolationForest
      - “Local Outlier Factor”: Uses LocalOutlierFactor (default)

  • scaler_method (str, optional) – Scaling method to apply before detection. If None, no scaling is applied. Defaults to “MinMaxScaler”.

Returns:

Boolean mask indicating outliers (True) and inliers (False).

Return type:

np.ndarray

environmentaltools.common.pdf(file_name: str, encoding: str = 'latin-1', table: bool = False, guess: bool = False, area: list = None)[source]

Read PDF files and extract text or tabular data.

Extracts content from PDF files using either text extraction (PyPDF2) or table extraction (tabula-py) methods.

Parameters:
  • file_name (str) – Path to PDF file.

  • encoding (str) – Character encoding for table extraction. Defaults to “latin-1”.

  • table (bool) – If True, extracts tables using tabula. If False, extracts plain text from first page. Defaults to False.

  • guess (bool) – If True, tabula will guess table locations. Defaults to False.

  • area (list, optional) – Coordinates [top, left, bottom, right] defining table area for extraction. Defaults to None (auto-detect).

Returns:

Extracted text string (if table=False) or DataFrame with table data (if table=True).

Return type:

str or pd.DataFrame

environmentaltools.common.pre_ensemble_plot(models: list, param: dict, variable: str, file_name: str = None)[source]

Compute ensemble statistics for multiple probability models across percentiles.

Evaluates multiple probability models at various percentile levels, computes their mean and standard deviation across a normalized time grid, and prepares data for ensemble visualization.

Parameters:
  • models (list) – List of model names to evaluate.

  • param (dict) – Dictionary containing parameters for each model, including:
      - ‘fun’: Probability distribution functions for each variable
      - Model-specific parameters for statistical fitting

  • variable (str) – Variable name to analyze (e.g., ‘Hs’, ‘Tp’, or direction variables).

  • file_name (str, optional) – Filename for saving results. Defaults to None.

Returns:

Dictionary where keys are percentile strings (e.g., ‘0.05’, ‘0.5’, ‘0.95’) and values are DataFrames containing:
  • ‘n’: Normalized time coordinate [0, 1]

  • ‘prob’: Probability value for this percentile

  • One column per model with computed values

  • ‘mean’: Mean across all models

  • ‘std’: Standard deviation across all models

Return type:

dict

environmentaltools.common.probability_mapping(obs, hist, rcp, variable, func)[source]

Apply parametric probability mapping for bias correction.

Parameters:
  • obs (pd.DataFrame) – Observed data.

  • hist (pd.DataFrame) – Historical simulation data.

  • rcp (pd.DataFrame) – Scenario/projection data.

  • variable (str) – Variable name to adjust.

  • func (str) – Name of the distribution to use.

Returns:

(hist, rcp) with bias-adjusted values in column ‘unbiased’.

Return type:

tuple

environmentaltools.common.rbf_error_metric(params, coords, data, train_idx, valid_idx, method, metric='rmse')[source]

Compute the error of an RBF for given epsilon and smooth values.

Parameters:
  • params (list) – [epsilon, smooth].

  • coords (np.ndarray) – Input coordinates.

  • data (np.ndarray) – Target values.

  • train_idx (array) – Indices for training samples.

  • valid_idx (array) – Indices for validation samples.

  • method (str) – RBF function type.

  • metric (str) – ‘rmse’ or ‘mae’.

Returns:

Error value (RMSE or MAE).

Return type:

float

environmentaltools.common.read_copla(file_name: str, grid: dict = None)[source]

Read COPLA model velocity output files.

Loads velocity field data from COPLA model output and computes magnitude and direction on a 2D grid with ghost cells.

Parameters:
  • file_name (str) – Path to COPLA velocity output file.

  • grid (dict, optional) – Existing grid dictionary to update. If None, creates new dictionary. Defaults to None.

Returns:

Dictionary containing velocity components and derived fields:
  • ‘u’: East-west velocity component (m/s) with ghost cells

  • ‘v’: North-south velocity component (m/s) with ghost cells

  • ‘U’: Velocity magnitude (m/s)

  • ‘DirU’: Velocity direction (degrees, nautical convention)

Return type:

dict

environmentaltools.common.read_cshore(file_type: str, path: str, skiprows: int = 1)[source]

Read CSHORE model output files.

Loads and parses output files from the CSHORE numerical model, supporting various output types (profiles, transport rates, energy, velocities, etc.).

Parameters:
  • file_type (str) – Type of CSHORE output file to read. Options include: ‘bprof’, ‘bsusl’, ‘cross’, ‘energ’, ‘longs’, ‘param’, ‘rolle’, ‘setup’, ‘swase’, ‘timse’, ‘xmome’, ‘xvelo’, ‘ymome’, ‘yvelo’.

  • path (str) – Directory path containing CSHORE output files.

  • skiprows (int) – Number of rows to skip when reading file. Defaults to 1.

Returns:

Parsed CSHORE output data with appropriate column names and spatial index (x-distance in meters).

Return type:

pd.DataFrame

environmentaltools.common.read_json(file_name: str, conversion_type: str = None)[source]

Read data from JSON files with optional type conversion.

Loads JSON files and converts keys to integers or numpy arrays based on the specified conversion type.

Parameters:
  • file_name (str) – Path to the JSON file.

  • conversion_type (str, optional) – Type of data conversion. Defaults to None.
      - “td” (temporal dependency): Converts values to numpy arrays
      - None or other: Converts keys to integers

Returns:

Loaded and converted dictionary data.

Return type:

dict

environmentaltools.common.read_pde(file_name: str, new_format: bool = False)[source]

Read data from Spanish Puertos del Estado (PdE) wave buoy files.

Parses wave data files from the Spanish port authority, handling both new and legacy file formats.

Parameters:
  • file_name (str) – Path to the PdE data file.

  • new_format (bool) – If True, uses new PdE file format. If False, uses legacy format with auto-detection of data start row. Defaults to False.

Returns:

Wave parameters with datetime index. Columns include significant wave height (Hs), mean period (Tm), peak period (Tp), mean direction (DirM), and swell components. Invalid values (-100, -99.9, -9999) are replaced with NaN.

Return type:

pd.DataFrame

environmentaltools.common.read_swan(file_name: str, grid: dict = None, variables: list = None)[source]

Read SWAN model output from MATLAB file format.

Loads wave field data from SWAN (Simulating WAves Nearshore) model output stored in MATLAB format and computes wave number from wavelength.

Parameters:
  • file_name (str) – Path to MATLAB (.mat) file containing SWAN output.

  • grid (dict, optional) – Existing grid dictionary to update. If None, creates new dictionary. Defaults to None.

  • variables (list, optional) – List of output variable names in order: [x, y, depth, Qb, L, Setup, Hs, DirM]. If None, uses default names. Defaults to None.

Returns:

Dictionary containing wave parameters on grid:
  • ‘x’, ‘y’: Coordinates

  • ‘depth’: Water depth (m)

  • ‘Qb’: Wave breaking dissipation

  • ‘L’: Wavelength (m)

  • ‘Setup’: Wave setup (m)

  • ‘Hs’: Significant wave height (m)

  • ‘DirM’: Mean wave direction (degrees)

  • ‘kp’: Wave number (2π/L) computed from wavelength

Return type:

dict
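
The ‘kp’ entry follows directly from the wavelength field. A minimal sketch of that relation, using a made-up 2x2 wavelength array rather than real SWAN output:

```python
import numpy as np

# Hypothetical wavelength field (m) on a 2x2 grid; NaN marks a dry cell
L = np.array([[50.0, 100.0], [np.nan, 25.0]])

# Wave number k = 2*pi / L (rad/m); NaN cells stay NaN
kp = 2 * np.pi / L
```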

environmentaltools.common.rmse(a, b)[source]

Calculate Root Mean Square Error between two arrays.

The RMSE measures the square root of the average of squared differences between predicted and observed values. Lower values indicate better fit.

Parameters:
  • a (array-like) – First array (e.g., observed values).

  • b (array-like) – Second array (e.g., predicted values).

Returns:

The root mean square error value.

Return type:

float
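
For reference, the definition above can be written in a few lines of NumPy. The name `rmse_sketch` is illustrative; this is a sketch of the formula, not the package's code.

```python
import numpy as np


def rmse_sketch(a, b):
    """Square root of the mean squared difference between a and b."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    return float(np.sqrt(np.mean((a - b) ** 2)))


observed = [1.0, 2.0, 3.0]
predicted = [1.0, 2.0, 5.0]

# Squared differences are 0, 0 and 4, so the result is sqrt(4/3)
error = rmse_sketch(observed, predicted)
```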

environmentaltools.common.rotate_geo2nav(ang)[source]

Convert angles from geographic (0°=East, 90°=North) to navigational (0°=North, 90°=East).

Parameters:

ang (array-like) – Array or Series of angles in degrees.

Returns:

Rotated angles in degrees.

Return type:

np.ndarray
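
Given the two conventions stated above, the conversion amounts to reflecting the angle about 45° and wrapping it. A minimal sketch, assuming the result is wrapped to [0, 360) (the wrapping is an assumption, and `geo_to_nav_sketch` is an illustrative name):

```python
import numpy as np


def geo_to_nav_sketch(ang):
    """Map counter-clockwise-from-East angles to clockwise-from-North angles."""
    ang = np.asarray(ang, dtype=float)
    return np.mod(90.0 - ang, 360.0)


# East, North, West, South expressed in the geographic convention
nav = geo_to_nav_sketch([0, 90, 180, 270])
```

Under these assumptions East (0°) maps to 90°, North (90°) to 0°, West (180°) to 270°, and South (270°) to 180°.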

environmentaltools.common.scaler(data, method='MinMaxScaler', transform=True, scale=False)[source]

Scale or inverse-scale data using sklearn scalers.

Parameters:
  • data (array-like or pd.DataFrame) – Data to scale.

  • method (str, optional) – Scaling method. Defaults to “MinMaxScaler”.

  • transform (bool, optional) – If True, transform; if False, inverse transform. Defaults to True.

  • scale (sklearn scaler, optional) – Pre-fitted scaler to use. Defaults to False.

Returns:

(transformed_data, scaler)

Return type:

tuple

environmentaltools.common.shp(file_name: str, joint: bool = False, variable: str = None)[source]

Read shapefile and extract geometry coordinates.

Reads shapefiles using geopandas and extracts coordinates from various geometry types (Point, Polygon, LineString, MultiPoint, MultiLineString, MultiPolygon).

Parameters:
  • file_name (str) – Path to shapefile.

  • joint (bool) – If True, concatenates all geometries into single DataFrame. Defaults to False (returns list of DataFrames).

  • variable (str, optional) – Additional attribute column to extract alongside coordinates. Defaults to None.

Returns:

If joint=True or single geometry, returns DataFrame with columns [x, y] or [x, y, variable]. Otherwise returns list of DataFrames, one per geometry feature.

Return type:

pd.DataFrame or list

Raises:

ValueError – If geometry type cannot be processed with available methods.

environmentaltools.common.smooth_1d(data: ndarray, window_length: int = None, poly_order: int = 3) ndarray[source]

Apply Savitzky-Golay filter for 1D data smoothing.

Uses the Savitzky-Golay filter to smooth 1D data by fitting successive sub-sets of adjacent data points with a low-degree polynomial.

Parameters:
  • data (np.ndarray) – 1D array of data values to smooth.

  • window_length (int, optional) – Length of the filter window (number of coefficients). Must be a positive odd integer. If None, defaults to len(data)/51. Defaults to None.

  • poly_order (int, optional) – Order of the polynomial used to fit the samples. Must be less than window_length. Defaults to 3.

Returns:

Smoothed data array with the same length as input data.

Return type:

np.ndarray
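
The underlying filter is available as `scipy.signal.savgol_filter`. The sketch below calls it directly on synthetic data; the window length of 21 is an illustrative choice, not the package default.

```python
import numpy as np
from scipy.signal import savgol_filter

# Synthetic noisy sine wave (seeded for reproducibility)
rng = np.random.default_rng(0)
x = np.linspace(0, 2 * np.pi, 201)
noisy = np.sin(x) + 0.1 * rng.standard_normal(x.size)

window_length = 21  # positive odd integer
poly_order = 3      # must be less than window_length

smoothed = savgol_filter(noisy, window_length, poly_order)
```

A wider window removes more noise but flattens sharp features, so the window is usually chosen relative to the scale of the signal that should be preserved.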

environmentaltools.common.string_to_function(param: dict, variable: str = None)[source]

Convert string function names to scipy.stats function objects.

Replaces string representations of statistical distribution names with actual scipy.stats function objects in parameter dictionary.

Parameters:
  • param (dict) – Parameter dictionary containing ‘fun’ key with function name strings to convert.

  • variable (str, optional) – Specific variable name to process. If None, processes all entries in param[‘fun’]. Defaults to None.

Returns:

Updated parameter dictionary with function objects instead of strings.

Return type:

dict
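
One way to perform this kind of lookup is `getattr` on the `scipy.stats` module. In the sketch below only the ‘fun’ key comes from the description above; the integer sub-keys and the distribution names are assumptions made for illustration.

```python
from scipy import stats

# Hypothetical parameter dictionary; the layout under 'fun' is assumed
param = {"fun": {0: "norm", 1: "weibull_min"}}

# Resolve each name to the matching scipy.stats distribution object
param["fun"] = {key: getattr(stats, name) for key, name in param["fun"].items()}

# The resolved objects expose the usual distribution methods
median = param["fun"][0].ppf(0.5)  # median of the standard normal
```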

environmentaltools.common.to_csv(data: DataFrame, file_name: str, compression: str = 'infer')[source]

Save DataFrame to CSV file with optional compression.

Exports data to CSV format with automatic compression detection or explicit zip compression.

Parameters:
  • data (pd.DataFrame) – Data to save.

  • file_name (str) – Output file path.

  • compression (str) – Compression type (‘infer’, ‘zip’, ‘gzip’, etc.). Defaults to ‘infer’ (auto-detect from extension).

Returns:

None

environmentaltools.common.to_esriascii(data: ndarray, ncols: int, nrows: int, cellsize: float, file_name: str, x0: float = 0, y0: float = 0, nodata_value: float = -9999)[source]

Save gridded data to ESRI ASCII raster format.

Exports 2D array data to ESRI ASCII Grid format (.asc) with header information including grid dimensions, origin, cell size, and no-data value.

Parameters:
  • data (np.ndarray) – 2D array of grid values to save.

  • ncols (int) – Number of columns in the grid.

  • nrows (int) – Number of rows in the grid.

  • cellsize (float) – Cell size (resolution) in spatial units.

  • file_name (str) – Output file path.

  • x0 (float) – X-coordinate of lower-left corner. Defaults to 0.

  • y0 (float) – Y-coordinate of lower-left corner. Defaults to 0.

  • nodata_value (float) – Value representing missing/no data. Defaults to -9999.

Returns:

None
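
The ESRI ASCII Grid layout is a six-line header followed by the grid rows. A minimal sketch of writing it with `numpy.savetxt`; the function name `write_esri_ascii_sketch`, the `%.3f` number format, and the NaN-to-nodata substitution are assumptions, not the package's behavior.

```python
import os
import tempfile

import numpy as np


def write_esri_ascii_sketch(data, cellsize, file_name, x0=0.0, y0=0.0, nodata_value=-9999):
    """Write a 2D array as an ESRI ASCII grid (first array row written first)."""
    nrows, ncols = data.shape
    header = "\n".join([
        f"ncols {ncols}",
        f"nrows {nrows}",
        f"xllcorner {x0}",
        f"yllcorner {y0}",
        f"cellsize {cellsize}",
        f"NODATA_value {nodata_value}",
    ])
    # Replace NaN with the no-data marker before writing
    filled = np.where(np.isnan(data), nodata_value, data)
    # comments="" stops numpy from prefixing header lines with "# "
    np.savetxt(file_name, filled, fmt="%.3f", header=header, comments="")


grid = np.array([[1.0, 2.0, 3.0], [4.0, np.nan, 6.0]])
path = os.path.join(tempfile.mkdtemp(), "example.asc")
write_esri_ascii_sketch(grid, cellsize=10.0, file_name=path)

with open(path) as f:
    lines = f.read().splitlines()
```

In this format the first data row is the northernmost one, so an array stored south-to-north would need to be flipped before writing.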

environmentaltools.common.to_geotiff(data: ndarray, file_name: str, profile: dict = None, transform: Affine = None, auxiliary: dict = None)[source]

Save georeferenced raster data to GeoTIFF format.

Exports 2D array to GeoTIFF with spatial reference information. Profile can be provided directly or constructed from auxiliary parameters.

Parameters:
  • data (np.ndarray) – 2D array of raster values.

  • file_name (str) – Output GeoTIFF file path.

  • profile (dict, optional) – Rasterio profile dictionary with metadata (driver, dtype, nodata, dimensions, CRS, transform). If None, built from auxiliary.

  • transform (Affine, optional) – Affine transformation matrix. Ignored if profile provided. Defaults to None.

  • auxiliary (dict, optional) – Dictionary with keys: ‘corners’ (origin [x, y]), ‘dx’, ‘dy’ (cell sizes), ‘angle’ (rotation), ‘driver’, ‘dtype’, ‘nodata’, ‘nodesx’, ‘nodesy’ (dimensions), ‘count’ (bands), ‘crsno’ (EPSG code). Required if profile is None.

Returns:

None

environmentaltools.common.to_json(params: dict, file_name: str, numpy_array_serialization: bool = False)[source]

Save dictionary to JSON file with optional numpy array serialization.

Exports data to JSON format with optional automatic conversion of numpy arrays to lists for JSON compatibility.

Parameters:
  • params (dict) – Data dictionary to save.

  • file_name (str) – Output file path.

  • numpy_array_serialization (bool) – If True, recursively converts numpy arrays to lists in nested dictionaries. Defaults to False.

Returns:

None
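
The standard `json` module cannot encode numpy arrays, which is what the serialization option addresses. A minimal sketch of a recursive conversion, with the hypothetical helper `arrays_to_lists` standing in for whatever the package does internally:

```python
import json

import numpy as np


def arrays_to_lists(obj):
    """Recursively replace numpy arrays with plain lists in nested dicts."""
    if isinstance(obj, np.ndarray):
        return obj.tolist()
    if isinstance(obj, dict):
        return {key: arrays_to_lists(value) for key, value in obj.items()}
    return obj


params = {
    "par": np.array([1.0, 2.5]),
    "nested": {"mode": np.array([[1, 2], [3, 4]])},
    "name": "Hs",
}

text = json.dumps(arrays_to_lists(params))
restored = json.loads(text)
```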

environmentaltools.common.to_netcdf(data: DataFrame, file_path: str)[source]

Save DataFrame to NetCDF4 file format.

Exports time series data to NetCDF format for efficient storage and compatibility with climate/oceanographic data standards.

Parameters:
  • data (pd.DataFrame) – Time series or gridded data to save.

  • file_path (str) – Output file path (without .nc extension).

Returns:

None

environmentaltools.common.to_npy(data: ndarray, file_name: str)[source]

Save numpy array to binary .npy file.

Serializes numpy array to binary format for efficient storage and loading.

Parameters:
  • data (np.ndarray) – Array data to save.

  • file_name (str) – Output file path (without extension).

Returns:

None

environmentaltools.common.to_shp(file_name: str, lon: Series, lat: Series, geometry_type: str = 'point', values: Series = None)[source]

Save spatial data to ESRI shapefile format.

Creates shapefiles with point, multi-point, line, or multi-line geometries from coordinate data.

Parameters:
  • file_name (str) – Output shapefile path (without .shp extension).

  • lon (pd.Series or list) – Longitude or X coordinates.

  • lat (pd.Series or list) – Latitude or Y coordinates.

  • geometry_type (str) –

    Geometry type to create. Options:

    • ‘point’: Single point

    • ‘multi-point’: Multiple separate points

    • ‘line’: Single polyline

    • ‘multi-line’: Multiple polylines (requires values parameter)

    Defaults to ‘point’.

  • values (pd.Series, optional) – Values to group coordinates for multi-line geometries. Each unique value creates a separate line. Defaults to None.

Returns:

None

Raises:
  • ImportError – If pyshp package is not installed.

  • ValueError – If geometry_type is not recognized.

environmentaltools.common.to_txt(data: DataFrame, file_name: str, fmt: str = '%9.3f')[source]

Save DataFrame to text file with custom formatting.

Exports data to plain text file using numpy savetxt with specified format.

Parameters:
  • data (pd.DataFrame) – Data to save.

  • file_name (str) – Output file path.

  • fmt (str) – Format string for numeric values (e.g., ‘%9.3f’ for 9-character width with 3 decimal places). Defaults to “%9.3f”.

Returns:

None
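
The effect of the format string can be seen by calling `numpy.savetxt` directly. This sketch writes a small made-up array to a temporary file with the default ‘%9.3f’ format:

```python
import os
import tempfile

import numpy as np

values = np.array([[1.5, 20.25], [3.14159, 400.0]])
path = os.path.join(tempfile.mkdtemp(), "example.txt")

# Each value is right-aligned in a 9-character field with 3 decimals
np.savetxt(path, values, fmt="%9.3f")

with open(path) as f:
    lines = f.read().splitlines()
```

Each column occupies nine characters and columns are separated by a single space, so every line here is 19 characters long.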

environmentaltools.common.to_xlsx(data: DataFrame, file_name: str)[source]

Save DataFrame to formatted Excel file with styled headers and rows.

Exports data to Excel with alternating row colors and formatted headers for improved readability.

Parameters:
  • data (pd.DataFrame) – Data to save.

  • file_name (str) – Output Excel file path.

Returns:

None

environmentaltools.common.uv_to_magnitude_angle(u: Series | ndarray, v: Series | ndarray, labels: list = ['magnitude', 'angle'])[source]

Convert u, v vector components to magnitude and direction.

Transforms Cartesian velocity/wind components (u, v) to polar form (magnitude, direction) using standard meteorological convention.

Parameters:
  • u (pd.Series or np.ndarray) – Zonal (east-west) component.

  • v (pd.Series or np.ndarray) – Meridional (north-south) component.

  • labels (list) – Output column names for [magnitude, direction]. Defaults to [“magnitude”, “angle”].

Returns:

DataFrame with two columns containing magnitude (sqrt(u²+v²)) and angle in degrees [0, 360).

Return type:

pd.DataFrame
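
A minimal sketch of the polar conversion with NumPy. It uses the plain mathematical angle (counter-clockwise from the positive u axis) wrapped to [0, 360); that reference direction is an assumption, and the function's own direction convention may differ.

```python
import numpy as np

# Unit vectors pointing east, north, west and south
u = np.array([1.0, 0.0, -1.0, 0.0])
v = np.array([0.0, 1.0, 0.0, -1.0])

magnitude = np.hypot(u, v)  # sqrt(u**2 + v**2)

# arctan2 returns angles in (-180, 180]; the modulo wraps them to [0, 360)
angle = np.mod(np.degrees(np.arctan2(v, u)), 360.0)
```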

environmentaltools.common.write_copla_input(case_index: int, time_index, case_id: str, data, params: dict, mesh: str = 'local')[source]

Write COPLA wave propagation model input files.

Creates input files for the COPLA (Coastal Propagation of LArge waves) model, using SWAN output as boundary conditions.

Parameters:
  • case_index (int) – Sequential case number (0-based index).

  • time_index – Timestamp from the data index for this case.

  • case_id (str) – String identifier for this case (e.g., ‘0001’, ‘0002’).

  • data (pd.DataFrame) – Time series of boundary condition data.

  • params (dict) –

    Dictionary with model parameters including:

    • directory: Root directory for input/output files

    • Mesh configuration parameters

  • mesh (str, optional) – Mesh type identifier. Defaults to ‘local’.

environmentaltools.common.write_cshore_input(properties: dict, output_folder: str)[source]

Write CSHORE model input file.

Creates the infile for the CSHORE coastal hydrodynamics model with the specified configuration parameters.

Parameters:
  • properties (dict) –

    Dictionary containing CSHORE model parameters including:

    • header: Model header text

    • iline, iprofl, isedav, iperm, iover, iwtran, ipond, infilt, iwcint, iroll, iwind, itide, iveg: Integer flags for model options

    • dx: Spatial grid spacing

    • gamma: Wave breaking parameter

    • d50, wf, sg: Sediment parameters (grain size, fall velocity, specific gravity)

    • effb, efff, slp: Efficiency and slope parameters

    • and other model-specific parameters

  • output_folder (str) – Path to folder where infile will be created.

environmentaltools.common.write_swan_input(case_index: int, time_index, case_id: str, data, params: dict, mesh: str = 'global', local: bool = False, nested: bool = False)[source]

Write SWAN wave model input files.

Creates initialization and input files for the SWAN (Simulating WAves Nearshore) model for a specific case.

Parameters:
  • case_index (int) – Sequential case number (0-based index).

  • time_index – Timestamp from the data index for this case.

  • case_id (str) – String identifier for this case (e.g., ‘0001’, ‘0002’).

  • data (pd.DataFrame) – Time series of boundary condition data.

  • params (dict) –

    Dictionary with model parameters including:

    • directory: Root directory for output files

    • Mesh configuration parameters

    • Wave and wind parameters

  • mesh (str, optional) – Mesh type identifier (‘global’ or ‘local’). Defaults to ‘global’.

  • local (bool, optional) – Whether to use local mesh configuration. Defaults to False.

  • nested (bool, optional) – Whether to enable nested grid mode. Defaults to False.

environmentaltools.common.xlsx(file_name: str, sheet_name: str = 0, names: str = None)[source]

Read Excel file (.xls or .xlsx).

Reads Excel workbook with support for specific sheets and column naming.

Parameters:
  • file_name (str) – Path to Excel file.

  • sheet_name (str or int) – Sheet name or index to read. Defaults to 0 (first sheet).

  • names (list, optional) – Custom column names. Defaults to None (use file headers).

Returns:

Data from specified Excel sheet with first column as index.

Return type:

pd.DataFrame

environmentaltools.common.xrnearest(ds, lat, lon, lat_name=None, lon_name=None, variable_mask=None, time_mask=0)[source]

Find the nearest grid point to specified coordinates in xarray dataset.

Locates the dataset grid point closest to the given latitude and longitude coordinates, optionally applying a mask to exclude invalid data points.

Parameters:
  • ds (xarray.Dataset) – Input dataset with coordinate information.

  • lat (float) – Target latitude coordinate.

  • lon (float) – Target longitude coordinate.

  • lat_name (str, optional) – Name of latitude variable. Auto-detected if None.

  • lon_name (str, optional) – Name of longitude variable. Auto-detected if None.

  • variable_mask (str, optional) – Variable name to use for masking NaN values.

  • time_mask (int) – Time index to use for masking. Defaults to 0.

Returns:

Subset of dataset at the nearest grid point.

Return type:

xarray.Dataset
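
The idea of a masked nearest-point search can be sketched with NumPy alone. The grid, the mask, and the use of squared distance in degrees below are all assumptions for illustration; the function itself operates on an xarray dataset.

```python
import numpy as np

# Hypothetical regular grid
lat = np.array([36.0, 36.5, 37.0])
lon = np.array([-4.0, -3.5, -3.0])
lon2d, lat2d = np.meshgrid(lon, lat)

# Hypothetical masking variable: NaN marks an invalid (e.g. land) cell
field = np.ones_like(lat2d)
field[1, 1] = np.nan

target_lat, target_lon = 36.45, -3.6

# Squared distance in degrees, with masked cells excluded from the search
dist2 = (lat2d - target_lat) ** 2 + (lon2d - target_lon) ** 2
dist2[np.isnan(field)] = np.inf

i, j = np.unravel_index(np.argmin(dist2), dist2.shape)
nearest = (lat2d[i, j], lon2d[i, j])
```

Without the mask the closest cell would be the one at (36.5, -3.5); because that cell is NaN, the search falls back to the next closest valid cell at (36.5, -4.0).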

Raises:

Exception – If coordinate dimensions are not found in dataset.

Data Reading

read module

Functions for reading various file formats including Excel, CSV, NetCDF, and other data sources.

keys_as_int(obj)

Convert dictionary keys to integers when reading a JSON file.

keys_as_nparray(obj)

Recursively convert dictionary values to numpy arrays when reading a JSON file.

read_json(file_name[, conversion_type])

Read data from JSON files with optional type conversion.

read_pde(file_name[, new_format])

Read data from Spanish Puertos del Estado (PdE) wave buoy files.

csv(file_name[, ts, date_format, sep, ...])

Read CSV file with flexible datetime and encoding options.

npy(file_name)

Read data from NumPy binary file (.npy).

xlsx(file_name[, sheet_name, names])

Read Excel file (.xls or .xlsx).

netcdf(file_name[, variables, latlon, ...])

Read NetCDF4 files with spatial/temporal subsetting options.

ascii_tiff(file_name[, output_format])

Read ASCII or GeoTIFF raster files and extract coordinate data.

kmz(file_name[, joint])

Read KMZ or KML files and extract elevation contour data.

shp(file_name[, joint, variable])

Read shapefile and extract geometry coordinates.

mat(file_name[, variable, julian])

Read MATLAB .mat files and extract time series data.

pdf(file_name[, encoding, table, guess, area])

Read PDF files and extract text or tabular data.

Data Saving

save module

Functions for saving processed data to various formats.

npy2json(params)

Convert dictionary with numpy arrays to JSON format and save to file.

to_json(params, file_name[, ...])

Save dictionary to JSON file with optional numpy array serialization.

to_csv(data, file_name[, compression])

Save DataFrame to CSV file with optional compression.

to_npy(data, file_name)

Save numpy array to binary .npy file.

to_xlsx(data, file_name)

Save DataFrame to formatted Excel file with styled headers and rows.

cwriter(file_out)

Create Excel workbook and worksheet for writing.

formats(wbook, style)

Apply predefined formatting styles to Excel workbook.

to_esriascii(data, ncols, nrows, cellsize, ...)

Save gridded data to ESRI ASCII raster format.

as_float_bool(obj)

Convert string values in dictionary to appropriate types.

to_geotiff(data, file_name[, profile, ...])

Save georeferenced raster data to GeoTIFF format.

to_txt(data, file_name[, fmt])

Save DataFrame to text file with custom formatting.

to_shp(file_name, lon, lat[, geometry_type, ...])

Save spatial data to ESRI shapefile format.

to_netcdf(data, file_path)

Save DataFrame to NetCDF4 file format.

Model I/O

load module

Functions for loading model outputs and configurations.

create_mesh_dictionary(file_name[, sheet_name])

Read Excel file and create mesh parameter dictionary.

cshore_config()

Create default CSHORE model configuration parameters.

read_cshore(file_type, path[, skiprows])

Read CSHORE model output files.

read_copla(file_name[, grid])

Read COPLA model velocity output files.

read_swan(file_name[, grid, variables])

Read SWAN model output from MATLAB file format.

delft_raw_files_point(point, mesh_filename, ...)

Extract time series data at a specific point from Delft3D model outputs.

delft_raw_files(folder, variables, case_id)

Load Delft3D raw output files for a specific case.

write module

Functions for writing input files for numerical models (SWAN, CSHORE, COPLA).

write_cshore_input(properties, output_folder)

Write CSHORE model input file.

write_swan_input(case_index, time_index, ...)

Write SWAN wave model input files.

write_copla_input(case_index, time_index, ...)

Write COPLA wave propagation model input files.

create_project_directory(params, data, ...)

Create project folder structure with initialized files for SWAN and COPLA models.

Statistical Utilities

utils module

General utility functions for statistical analysis, data transformations, and bias correction.

max_moving_window(data, duration)

Select peaks from time series using a moving window.

gaps(data, variables[, file_name, buoy])

Create summary table of data gaps for time series variables.

ecdf(df, variable[, num_percentiles])

Compute the empirical cumulative distribution function (ECDF).

nonstationary_ecdf(data, variable[, wlen, ...])

Computes empirical percentiles using a moving window.

epdf(df, variable[, num_bins])

Compute the empirical probability density function (PDF).

nonstationary_epdf(data, variable[, wlen, ...])

Computes the empirical PDF using a moving window.

best_params(data, bins, distrib[, tail])

Computes the best parameters of a simple probability model based on the RMSE of the PDF.

acorr(data[, max_lags])

Compute autocorrelation function of a time series.

bidimensional_ecdf(data1, data2, num_bins)

Compute empirical 2D cumulative distribution function (ECDF).

bias_adjustment(obs, hist, rcp, variable[, ...])

Bias adjustment for climate data using parametric quantile mapping.

probability_mapping(obs, hist, rcp, ...)

Apply parametric probability mapping for bias correction.

empirical_cdf_mapping(obs, hist, rcp, variable)

Apply empirical CDF mapping for bias correction.

rotate_geo2nav(ang)

Convert angles from geographic (0°=East, 90°=North) to navigational (0°=North, 90°=East).

uv_to_magnitude_angle(u, v[, labels])

Convert u, v vector components to magnitude and direction.

optimize_rbf_epsilon(coords, data, n_train)

Optimize epsilon and smooth parameters for RBF by minimizing validation error (RMSE or MAE).

rbf_error_metric(params, coords, data, ...)

Compute the error of an RBF for given epsilon and smooth values.

outliers_detection(data, outliers_fraction)

Detect outliers in data using various sklearn algorithms.

scaler(data[, method, transform, scale])

Scale or inverse-scale data using sklearn scalers.

string_to_function(param[, variable])

Convert string function names to scipy.stats function objects.

data_over_threshold(data, variable, ...)

Extract extreme events exceeding a threshold for minimum duration.

extract_isolines(data[, iso_values])

Extract contour lines at specified values from gridded data.

pre_ensemble_plot(models, param, variable[, ...])

Compute ensemble statistics for multiple probability models across percentiles.

smooth_1d(data[, window_length, poly_order])

Apply Savitzky-Golay filter for 1D data smoothing.

find_nearest_point(data, point)

Find the nearest point in a spatial dataset to a given coordinate.

date_to_julian(dates[, calendar])

Convert datetime objects to Julian dates.

mean_dt_param(B, Q)

rmse(a, b)

Calculate Root Mean Square Error between two arrays.

maximum_absolute_error(a, b)

Calculate Maximum Absolute Error between two arrays.

mean_absolute_error(a, b)

Calculate Mean Absolute Error between two arrays.

xrnearest(ds, lat, lon[, lat_name, ...])

Find the nearest grid point to specified coordinates in xarray dataset.

latslons_values(ds, lat_name, lon_name)

Extract latitude and longitude values from dataset.

find_indexes(latvar, lonvar, lat0, lon0)

Find array indices of coordinates nearest to target location.

create_lat_lon_matrix(lat, lon)

Create 2D coordinate meshgrid from 1D coordinate vectors.

coords_name(ds)

Detect standard coordinate variable names in dataset.