nsEVDx Users Reference
A Python Library for modeling Non-stationary Extreme Value Distributions
Overview
The nsEVDx library provides comprehensive tools for modeling extreme value distributions under both stationary and non-stationary conditions. Designed for hydrologists, climate scientists, and engineers working with extreme events, it implements both frequentist and Bayesian inference frameworks with flexible covariate modeling capabilities.
GEV & GPD Models
Complete support for Generalized Extreme Value and Generalized Pareto distributions
Non-stationary Modeling
Time-varying parameters with arbitrary covariates
Bayesian & Frequentist
Dual inference frameworks with MCMC sampling
Advanced MCMC
MALA, HMC, and Metropolis-Hastings samplers
Model Diagnostics
Comprehensive convergence and goodness-of-fit metrics
Minimal Dependencies
Only numpy, scipy, matplotlib, and seaborn required
Key Applications
Climate Extremes
Analysis of extreme precipitation, temperature, and drought events under changing climate conditions.
Flood Frequency Analysis
Non-stationary flood frequency modeling with time-varying design quantiles for infrastructure planning.
Risk Assessment
Financial and environmental risk modeling with time-dependent extreme value parameters.
Engineering Design
Design value estimation for structures under non-stationary extreme loads.
Installation
pip install nsEVDx
# Or clone from GitHub:
git clone https://github.com/Nischalcs50/nsEVDx
cd nsEVDx
pip install .
For developers/contributors
git clone https://github.com/Nischalcs50/nsEVDx
cd nsEVDx
pip install -e .[dev]
Requirements
- Python 3.9+
- numpy
- scipy
- matplotlib
- seaborn
Theoretical Background
Generalized Extreme Value Distribution
The GEV distribution unifies the three types of extreme value distributions. The cumulative distribution function (CDF) of a GEV random variable X with location μ, scale σ > 0, and shape ξ is:
F(x; μ, σ, ξ) = exp{−[1 + ξ(x − μ)/σ]^(−1/ξ)} for ξ ≠ 0
F(x; μ, σ, 0) = exp{−exp[−(x − μ)/σ]} for ξ = 0
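The short check below evaluates this CDF directly and compares it against SciPy, using illustrative parameter values. Note that scipy.stats.genextreme parameterizes the shape as c = −ξ, so the sign must be flipped relative to the formula above.
import numpy as np
from scipy.stats import genextreme
mu, sigma, xi = 10.0, 2.0, 0.1                 # illustrative values
x = 14.0
z = (x - mu) / sigma
manual = np.exp(-(1 + xi * z) ** (-1 / xi))    # GEV CDF for xi != 0
via_scipy = genextreme.cdf(x, -xi, loc=mu, scale=sigma)  # sign flip: c = -xi
print(manual, via_scipy)                       # the two values agree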
Generalized Pareto Distribution
For a random variable X exceeding a threshold μ, the exceedances follow a GPD with scale σ > 0 and shape ξ, whose CDF is:
F(x; μ, σ, ξ) = 1 − [1 + ξ(x − μ)/σ]^(−1/ξ) for ξ ≠ 0
F(x; μ, σ, 0) = 1 − exp[−(x − μ)/σ] for ξ = 0
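An analogous check for the GPD, again with illustrative values. Here scipy.stats.genpareto uses the same sign convention as ξ, with loc playing the role of the threshold.
import numpy as np
from scipy.stats import genpareto
mu, sigma, xi = 5.0, 2.0, 0.1                  # illustrative threshold, scale, shape
x = 8.0
manual = 1 - (1 + xi * (x - mu) / sigma) ** (-1 / xi)
via_scipy = genpareto.cdf(x, xi, loc=mu, scale=sigma)
print(manual, via_scipy)                       # the two values agree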
Non-stationary Framework
Parameters are modeled as functions of covariates:
Location (linear):
μ(t) = β0 + β1Z1(t) + β2Z2(t) + ...
Scale (exponential):
σ(t) = exp(α0 + α1Z1(t) + α2Z2(t) + ...)
Shape (linear):
ξ(t) = κ0 + κ1Z1(t) + κ2Z2(t) + ...
where Z1(t), Z2(t), ... are covariates that change with time and drive the extreme value distribution's parameters.
Non-Stationarity Configuration
In nsEVDx, non-stationarity is controlled via a configuration vector:
config = [a, b, c]
Each element of config specifies the number of covariates used for the location (μ), scale (σ), and shape (ξ) parameters, respectively. A value of 0 indicates a stationary parameter; a value greater than 0 indicates non-stationary modeling with that many covariates. This framework allows flexible, parsimonious modeling of non-stationary extreme value distributions, including covariates only where the data support them.
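To make the mapping concrete, the sketch below shows how a configuration vector and a flattened parameter vector translate into time-varying parameters using the link functions above. This is an illustrative sketch, not the library's internal code; in particular, applying the exponential link only when the scale is non-stationary is an assumption, so check the source for the exact convention.
import numpy as np

def ns_parameters(params, cov, config):
    # Illustrative only: split a flattened parameter vector according to
    # config = [a, b, c] and evaluate the link functions described above.
    a, b, c = config
    params = np.asarray(params, dtype=float)
    beta = params[:a + 1]                      # location: intercept + a slopes
    alpha = params[a + 1:a + b + 2]            # scale:    intercept + b slopes
    kappa = params[a + b + 2:]                 # shape:    intercept + c slopes
    n = cov.shape[1]                           # cov has shape (n_covariates, n_samples)
    mu = np.full(n, beta[0]) + (beta[1:] @ cov[:a] if a else 0.0)
    if b:
        sigma = np.exp(np.full(n, alpha[0]) + alpha[1:] @ cov[:b])
    else:
        sigma = np.full(n, alpha[0])           # stationary scale used directly (assumption)
    xi = np.full(n, kappa[0]) + (kappa[1:] @ cov[:c] if c else 0.0)
    return mu, sigma, xi

# Example: config = [1, 0, 0] with a single linear-in-time covariate
cov = np.linspace(0, 1, 50).reshape(1, -1)
mu, sigma, xi = ns_parameters([10, 0.5, 2.0, 0.1], cov, [1, 0, 0])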
Quick Start
import nsEVDx as ns
import numpy as np
from scipy.stats import genextreme
# Generate sample GEV data (note: SciPy's genextreme uses shape c = −ξ)
data = genextreme.rvs(-0.1, loc=10, scale=2, size=50)  # ξ = 0.1
covariate = np.linspace(0, 1, 50).reshape(1, -1)
# Initialize model
sampler = ns.NonStationaryEVD(
config=[1, 0, 0], # Location varies with 1 covariate
data=data,
cov=covariate,
dist=genextreme
)
# Run MCMC
samples, acc_rate = sampler.MH_RandWalk(
num_samples=5000,
initial_params=[10, 0.1, 2, 0.1],
proposal_widths=[0.5, 0.05, 0.2, 0.05],
T=1.0
)
print(f"Acceptance rate: {acc_rate:.3f}")
print(f"Parameter means: {samples.mean(axis=0)}")
Module Structure
nsEVDx.NonStationaryEVD
class NonStationaryEVD(config, data, cov, dist, prior_specs=None, bounds=None)
Parameters
- config : list of int
- Configuration vector [location, scale, shape] specifying number of covariates for each parameter. 0 = stationary.
- data : array_like
- Observed extreme values in chronological order.
- cov : array_like
- Covariate matrix, shape (n_covariates, n_samples).
- dist : scipy.stats distribution
- Either genextreme or genpareto distribution object.
- prior_specs : list of tuples, optional
- Prior specifications for Bayesian inference. Format: [('dist_name', {'param': value}), ...]
- bounds : list of tuples, optional
- Parameter bounds for MLE optimization. Format: [(lower, upper), ...]
Configuration Examples:
config = [0, 0, 0] → Fully stationary model
config = [1, 0, 0] → Non-stationary location with 1 covariate
config = [2, 1, 0] → 2 covariates for location, 1 for scale
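Following this pattern, the length of the parameter vector is determined by the configuration: each of the three blocks contributes an intercept plus one coefficient per covariate, i.e. (a + 1) + (b + 1) + (c + 1) entries in total. A quick sketch:
for config in ([0, 0, 0], [1, 0, 0], [2, 1, 0]):
    a, b, c = config
    n_params = (a + 1) + (b + 1) + (c + 1)   # intercept plus one slope per covariate
    print(config, "->", n_params, "parameters")
# [0, 0, 0] -> 3, [1, 0, 0] -> 4, [2, 1, 0] -> 6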
Initialization
Initializes an nsEVDx.NonStationaryEVD object for sampling.
Examples
from scipy.stats import genextreme
# Stationary GEV model
sampler = ns.NonStationaryEVD([0,0,0], data, cov, genextreme)
# Non-stationary location only
sampler = ns.NonStationaryEVD([1,0,0], data, cov, genextreme)
# Multiple covariates for location and scale
sampler = ns.NonStationaryEVD([3,2,0], data, cov, genextreme)
get_param_description
@staticmethod
get_param_description(config, n_cov)
Returns descriptions of each parameter in the parameter vector based on configuration.
Parameters
- config : list of int
- Non-stationarity configuration [location, scale, shape].
- n_cov : int
- Total number of covariates available.
Returns
- list of str
- Descriptions of each parameter in order.
Example
# For config=[1,0,0] with n_cov=1
descriptions = ns.NonStationaryEVD.get_param_description([1,0,0], 1)
# Returns: ['B0 (location intercept)', 'B1 (location slope for covariate 1)',
# 'scale', 'shape']
Parameter Setup Methods
suggest_priors
suggest_priors()
Suggest default prior distributions for model parameters based on configuration and data statistics.
Returns
- prior_specs : list of tuples
- List of prior specifications for each parameter. Format: [('distribution_name', {'param': value}), ...]
Example
sampler = ns.NonStationaryEVD([1,0,0], data, cov, genextreme)
priors = sampler.suggest_priors()
# Returns: [('normal', {'loc': 0, 'scale': 10}), ...]
suggest_bounds
suggest_bounds(buffer=0.5)
Suggests bounds for MLE optimization based on config and distribution type.
Parameters
- buffer : float, default 0.5
- Fractional buffer around stationary parameter estimates.
Returns
- bounds : list of tuples
- List of (lower, upper) tuples for each parameter in order.
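A typical workflow, sketched from the signatures documented above, is to let the library propose priors and bounds, inspect them, and pass them back through the constructor via the prior_specs and bounds arguments.
sampler = ns.NonStationaryEVD([1, 0, 0], data, cov, genextreme)
priors = sampler.suggest_priors()            # default priors, one per parameter
bounds = sampler.suggest_bounds(buffer=0.5)  # (lower, upper) per parameter
# Re-initialize with the suggestions attached
sampler = ns.NonStationaryEVD([1, 0, 0], data, cov, genextreme,
                              prior_specs=priors, bounds=bounds)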
Bayesian Methods
MH_RandWalk
MH_RandWalk(num_samples, initial_params, proposal_widths, T=1.0)
Metropolis-Hastings sampler with random walk proposals.
Parameters
- num_samples : int
- Number of samples to generate.
- initial_params : array_like
- Starting parameter values.
- proposal_widths : array_like
- Standard deviations for proposal distribution.
- T : float, default 1.0
- Temperature parameter for tempering.
Returns
- samples : ndarray
- Array of shape (num_samples, n_parameters) containing parameter samples.
- acceptance_rate : float
- Fraction of proposals accepted.
Example
sampler = ns.NonStationaryEVD([1,0,0], data, cov, genextreme, prior_specs=priors)
samples, acc_rate = sampler.MH_RandWalk(
num_samples=10000,
initial_params=[0, 0, 1, 0.1],
proposal_widths=[0.1, 0.05, 0.05, 0.02],
T=1.0
)
print(f"Acceptance rate: {acc_rate:.3f}")
MH_Mala
MH_Mala(num_samples, initial_params, step_sizes, T=1.0)
Metropolis-Adjusted Langevin Algorithm using gradient information for more efficient sampling.
Parameters
- num_samples : int
- Number of samples to generate.
- initial_params : array_like
- Starting parameter values.
- step_sizes : list of float
- Step sizes epsilon for MALA proposals.
- T : float, default 1.0
- Temperature factor for acceptance ratio.
Returns
- samples : ndarray
- Array of shape (num_samples, n_parameters).
- acceptance_rate : float
- Fraction of proposals accepted.
Example
samples, acc_rate = sampler.MH_Mala(
num_samples=10000,
initial_params=[0, 0, 1, 0.1],
step_sizes=[0.01, 0.01, 0.01, 0.005]
)
numerical_grad_log_posterior
numerical_grad_log_posterior(params, h=1e-2)
Compute the numerical gradient of the log-posterior using the central-difference method; used internally by MH_Mala().
Parameters
- params : array_like
- Parameter values at which to evaluate the gradient.
- h : float or array_like, default 1e-2
- Step size for finite difference approximation.
Returns
- grad : ndarray
- Approximate gradient of the log-posterior with respect to each parameter.
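For reference, a generic central-difference gradient looks like the sketch below. It illustrates the method only and is not the library's internal implementation.
import numpy as np

def central_diff_grad(log_post, params, h=1e-2):
    # grad_i ≈ [f(θ + h·e_i) − f(θ − h·e_i)] / (2h)
    params = np.asarray(params, dtype=float)
    grad = np.zeros_like(params)
    for i in range(params.size):
        step = np.zeros_like(params)
        step[i] = h
        grad[i] = (log_post(params + step) - log_post(params - step)) / (2 * h)
    return grad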
MH_Hmc
MH_Hmc(num_samples, initial_params, step_size=0.1, num_leapfrog_steps=10, T=1.0)
Hamiltonian Monte Carlo sampler for efficient exploration of posterior distribution.
Parameters
- num_samples : int
- Number of samples to generate.
- initial_params : array_like
- Starting parameter values.
- step_size : float, default 0.1
- Leapfrog step size epsilon.
- num_leapfrog_steps : int, default 10
- Number of leapfrog steps per iteration.
- T : float, default 1.0
- Temperature scaling factor.
Returns
- samples : ndarray
- Array of shape (num_samples, n_parameters).
- acceptance_rate : float
- Fraction of proposals accepted.
Example
samples, acc_rate = sampler.MH_Hmc(
num_samples=10000,
initial_params=[0, 0, 1, 0.1],
step_size=0.05,
num_leapfrog_steps=20
)
hamiltonian
hamiltonian(params, momentum, T)
Compute the Hamiltonian (total energy) for HMC sampling.
Parameters
- params : array_like
- Current position in parameter space.
- momentum : array_like
- Auxiliary momentum variables.
- T : float
- Temperature scaling factor.
Returns
- float
- Total Hamiltonian energy (potential + kinetic).
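Conceptually, the Hamiltonian is the sum of a potential term (the tempered negative log-posterior) and a kinetic term in the auxiliary momentum. The sketch below shows the standard unit-mass form; the exact placement of the temperature factor T inside nsEVDx may differ, so treat this as illustrative.
import numpy as np

def hamiltonian_sketch(neg_log_posterior, params, momentum, T=1.0):
    potential = neg_log_posterior(params) / T          # U(θ); tempering by T is an assumption
    kinetic = 0.5 * float(np.dot(momentum, momentum))  # K(p) with a unit mass matrix
    return potential + kinetic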
Frequentist Methods
frequentist_nsEVD
frequentist_nsEVD(initial_params, max_retries=10)
Maximum likelihood estimation of parameters.
Parameters
- initial_params : array_like
- Initial parameter guess.
- max_retries : int, default 10
- Number of optimization retries with perturbed starting points.
Returns
- params : ndarray
- Estimated parameters.
- neg_loglik : float
- Negative log-likelihood at optimum.
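A minimal usage sketch, assuming a sampler constructed as in the earlier examples with config [1, 0, 0] (four parameters):
params_mle, neg_loglik = sampler.frequentist_nsEVD(
    initial_params=[0, 0, 1, 0.1],
    max_retries=10
)
print("MLE estimates:", params_mle)
print(f"Negative log-likelihood: {neg_loglik:.2f}")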
Simulation
ns_EVDrvs
@staticmethod
ns_EVDrvs(dist, params, cov, config, size)
Generate non-stationary GEV or GPD random samples.
Parameters
- dist : rv_continuous
- SciPy continuous distribution object (genextreme or genpareto).
- params : array_like
- Flattened parameter list according to config.
- cov : ndarray
- Covariate matrix, shape (n_covariates, n_samples).
- config : list of int
- Non-stationarity config [loc, scale, shape].
- size : int
- Number of random samples to generate.
Returns
- ndarray
- Generated non-stationary random variates.
Example
from scipy.stats import genextreme
# Generate synthetic data
synthetic_data = ns.NonStationaryEVD.ns_EVDrvs(
dist=genextreme,
params=[10, 0.5, 2, 0.1], # location intercept, slope, scale, shape
cov=time_covariate,
config=[1, 0, 0],
size=100
)
Utility Functions
nsEVDx.neg_log_likelihood
neg_log_likelihood(params, data, dist)
Compute negative log-likelihood for stationary EVD.
Parameters
- params : array_like
- Parameters [loc, scale, shape] for the distribution.
- data : array_like
- Observed data points.
- dist : scipy.stats distribution
- Distribution object (genpareto or genextreme).
Returns
- float
- Negative log-likelihood. Returns np.inf if parameters are invalid.
nsEVDx.neg_log_likelihood_ns
neg_log_likelihood_ns(params, data, cov, config, dist)
Calculate negative log-likelihood for non-stationary EVD.
Parameters
- params : array_like
- Parameter vector ordered according to config.
- data : array_like
- Observed extreme values.
- cov : array_like
- Covariate matrix, shape (n_covariates, n_samples).
- config : list of int
- Non-stationarity configuration [location, scale, shape].
- dist : rv_continuous
- SciPy distribution object.
Returns
- float
- Negative log-likelihood value. Returns np.inf if invalid.
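A quick sketch comparing the stationary and non-stationary helpers on the same data, assuming the data and covariate arrays from the Quick Start; the parameter values are illustrative.
nll_stationary = ns.neg_log_likelihood([10, 2, 0.1], data, genextreme)   # [loc, scale, shape]
nll_nonstationary = ns.neg_log_likelihood_ns([10, 0.5, 2, 0.1], data,    # ordered per config
                                             covariate, [1, 0, 0], genextreme)
print(nll_stationary, nll_nonstationary)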
nsEVDx.EVD_parsViaMLE
EVD_parsViaMLE(data, dist, verbose=False)
Estimate stationary EVD parameters via maximum likelihood.
Parameters
- data : array_like
- Observed data.
- dist : scipy.stats distribution
- genextreme or genpareto distribution.
- verbose : bool, default False
- Whether to print optimization details.
Returns
- ndarray
- Estimated parameters [shape, location, scale].
nsEVDx.GEV_parsViaLM
GEV_parsViaLM(arr)
Estimate GEV parameters using L-moments (Hosking & Wallis, 1987).
Parameters
- arr : array_like
- Observed data sample.
Returns
- ndarray
- Array of size 3 containing [shape, location, scale].
nsEVDx.GPD_parsViaLM
GPD_parsViaLM(arr)
Estimate GPD parameters using L-moments (Hosking & Wallis, 1987).
Parameters
- arr : array_like
- Observed data sample.
Returns
- ndarray
- Array of size 3 containing [shape, location, scale].
nsEVDx.l_moments
l_moments(data)
Compute L-moments from data sample.
Parameters
- data : array_like
- Sample data array.
Returns
- ndarray
- Array containing [n, mean, L1, L2, T3, T4].
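L-moment estimates are often convenient starting values for MLE or MCMC. A short sketch, assuming the data array from the Quick Start:
lmoms = ns.l_moments(data)                     # [n, mean, L1, L2, T3, T4]
gev_lmom = ns.GEV_parsViaLM(data)              # [shape, location, scale]
gev_mle = ns.EVD_parsViaMLE(data, genextreme)  # [shape, location, scale]
print("L-moment estimates:", gev_lmom)
print("MLE estimates:     ", gev_mle)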
nsEVDx.comb
comb(n, k)
Compute binomial coefficient "n choose k".
Parameters
- n : int
- Total number of items.
- k : int
- Number of items to choose.
Returns
- float
- The binomial coefficient C(n, k).
Diagnostics & Visualization
nsEVDx.plot_trace
plot_trace(samples, config, fig_size=None, param_names_override=None)
Generate MCMC trace plots for convergence diagnostics.
Parameters
- samples : ndarray
- MCMC samples of shape (n_iterations, n_parameters).
- config : list of int
- Non-stationarity config [loc, scale, shape].
- fig_size : tuple, optional
- Figure size (width, height).
- param_names_override : list of str, optional
- Custom parameter names to override default naming.
Example
import nsEVDx as ns
# After MCMC sampling
samples, acc_rate = sampler.MH_RandWalk(...)
ns.plot_trace(samples, config=[1,0,0], fig_size=(12, 8))
nsEVDx.plot_posterior
plot_posterior(samples, config, fig_size=None, param_names_override=None)
Generate posterior distribution plots with histograms and density curves.
Parameters
- samples : ndarray
- MCMC samples of shape (n_iterations, n_parameters).
- config : list of int
- Non-stationarity config [loc, scale, shape].
- fig_size : tuple, optional
- Figure size (width, height). Default based on number of parameters.
- param_names_override : list of str, optional
- Custom parameter names.
Example
ns.plot_posterior(samples, config=[1,0,0])
# With custom names
ns.plot_posterior(
samples,
config=[1,0,0],
param_names_override=['μ₀', 'μ₁', 'σ', 'ξ']
)
nsEVDx.bayesian_metrics
bayesian_metrics(samples, data, cov, config, dist)
Compute Bayesian model selection criteria (DIC, AIC, BIC) from posterior samples.
Parameters
- samples : ndarray
- Posterior samples, shape (n_samples, n_params).
- data : array_like
- Observed data used to compute likelihood.
- cov : array_like
- Covariates used in the non-stationary model.
- config : list of int
- Configuration settings for the model.
- dist : scipy.stats distribution
- Distribution type (genextreme or genpareto).
Returns
- dict
- Dictionary containing computed values of DIC, AIC, and BIC.
Metrics:
DIC = −2 × mean(log L) + 2pD
AIC = −2 × max(log L) + 2k
BIC = −2 × max(log L) + k × log(n)
Example
metrics = ns.bayesian_metrics(samples, data, cov, [1,0,0], genextreme)
print(f"DIC: {metrics['DIC']:.2f}")
print(f"AIC: {metrics['AIC']:.2f}")
print(f"BIC: {metrics['BIC']:.2f}")
nsEVDx.gelman_rubin
gelman_rubin(chains)
Compute Gelman-Rubin R-hat statistic for MCMC convergence diagnostics.
Parameters
- chains : list of ndarray
- List of chains, each with shape [n_samples, n_params].
Returns
- ndarray
- R-hat values for each parameter. Values close to 1.0 indicate convergence.
Convergence criterion:
R̂ < 1.1 indicates good convergence
R̂ > 1.1 suggests more iterations needed
Example
# Run multiple chains from perturbed starting points
init_params = np.array([0, 0, 1, 0.1])
chains = []
for i in range(4):
    samples, _ = sampler.MH_RandWalk(
        num_samples=10000,
        initial_params=init_params + np.random.randn(4) * 0.1,
        proposal_widths=[0.1, 0.05, 0.05, 0.02]
    )
    chains.append(samples)
# Check convergence
rhat = ns.gelman_rubin(chains)
print("R-hat values:", rhat)
print("Converged:", np.all(rhat < 1.1))
Examples
GEV Model Fitting
View the full example on GitHub: GEV Model Fitting Notebook
GPD Threshold Exceedances
View the full example on GitHub: GPD Frequentist Example Notebook
Citation
If you use nsEVDx in your research, please cite:
- Kafle, N., & Meier, C. I. (2025). nsEVDx: A Python library for modeling Non-Stationary Extreme Value Distributions. arXiv preprint arXiv:2509.07261.
- Kafle, N., & Meier, C. (2025). nsEVDx: A Python Library for Modeling Non-Stationary Extreme Value Distributions (v0.1.0). Zenodo. https://doi.org/10.5281/zenodo.15850043
References
- Betancourt, M. (2017). A Conceptual Introduction to Hamiltonian Monte Carlo. arXiv: Methodology.
- Coles, S. (2007). An Introduction to Statistical Modeling of Extreme Values (4th printing). Springer. https://doi.org/10.1007/978-1-4471-3675-0
- Gilleland, E. (2025). extRemes: Extreme Value Analysis. https://doi.org/10.32614/CRAN.package.extRemes
- Heffernan, J. E., Stephenson, A. G., & Gilleland, E. (2003). Ismev: An Introduction to Statistical Modeling of Extreme Values. https://CRAN.R-project.org/package=ismev
- Hosking, J. R. M., & Wallis, J. R. (1997). Regional Frequency Analysis: An Approach Based on L-Moments (Vol. 93). Cambridge University Press. https://doi.org/10.1017/cbo9780511529443
- IRSN. (2024). NSGEV: Non-Stationary GEV Time Series. https://github.com/IRSN/NSGEV/
- Kafle, N., & Meier, C. (n.d.). Detecting trends in short duration extreme precipitation over SEUS using neighborhood based method. Manuscript in Preparation.
- Kafle, N., & Meier, C. (2025). Evaluating Methodologies for Detecting Trends in Short-Duration Extreme Rainfall in the Southeastern United States. Extreme Hydrological or Critical Event Analysis-III, EWRI Congress 2025, Anchorage, AK, U.S. https://alaska2025.eventscribe.net
- Oriol Abril-Pla, V. Andreani, C. Carroll, L. Y. Dong, C. Fonnesbeck, M. Kochurov, R. Kumar, J. Lao, C. C. Luhmann, O. A. Martin, M. Osthege, R. Vieira, T. V. Wiecki, & R. Zinkov. (2023). PyMC: A modern, and comprehensive probabilistic programming framework in Python. PeerJ Computer Science, 9, e1516. https://doi.org/10.7717/peerj-cs.1516
- Paciorek, C. (2016). climextRemes: Tools for Analyzing Climate Extremes. https://CRAN.R-project.org/package=climextRemes
- Robert, C. P., & Casella, G. (2009). Introducing Monte Carlo Methods with R. https://doi.org/10.1007/978-1-4419-1576-4
- Roberts, G. O., & Tweedie, R. L. (1996). Exponential Convergence of Langevin Distributions and Their Discrete Approximations. Bernoulli, 2(4), 341. https://doi.org/10.2307/3318418
- Stan Development Team. (2023a). CmdStan: The command-line interface to Stan. https://mc-stan.org/users/interfaces/cmdstan
- Stan Development Team. (2023b). PyStan: The Python interface to Stan. https://pystan.readthedocs.io/