Overview

The nsEVDx library provides comprehensive tools for modeling extreme value distributions under both stationary and non-stationary conditions. Designed for hydrologists, climate scientists, and engineers working with extreme events, it implements both frequentist and Bayesian inference frameworks with flexible covariate modeling capabilities.

GEV & GPD Models

Complete support for Generalized Extreme Value and Generalized Pareto distributions

Non-stationary Modeling

Time-varying parameters with arbitrary covariates

Bayesian & Frequentist

Dual inference frameworks with MCMC sampling

Advanced MCMC

MALA, HMC, and Metropolis-Hastings samplers

Model Diagnostics

Comprehensive convergence and goodness-of-fit metrics

Minimal Dependencies

Only numpy, scipy, matplotlib, and seaborn required

Key Applications

Climate Extremes

Analysis of extreme precipitation, temperature, and drought events under changing climate conditions.

Flood Frequency Analysis

Non-stationary flood frequency modeling with time-varying design quantiles for infrastructure planning.

Risk Assessment

Financial and environmental risk modeling with time-dependent extreme value parameters.

Engineering Design

Design value estimation for structures under non-stationary extreme loads.

Installation

pip install nsEVDx  
# Or clone from GitHub:
git clone https://github.com/Nischalcs50/nsEVDx
cd nsEVDx
pip install .
For developers/contributors

git clone https://github.com/Nischalcs50/nsEVDx
cd nsEVDx
pip install -e .[dev]

Requirements

  • Python 3.9+
  • numpy
  • scipy
  • matplotlib
  • seaborn

Theoretical Background

Generalized Extreme Value Distribution

The GEV distribution unifies the three types of extreme value distributions. The cumulative distribution function (CDF) of a GEV random variable X with location μ, scale σ > 0, and shape ξ is:

CDF:

F(X; μ, σ, ξ) = exp{−[1 + ξ(xμ)/σ]−1/ξ}   for ξ ≠ 0

F(X; μ, σ, 0) = exp{−exp[−(xμ)/σ]}   for ξ = 0

Generalized Pareto Distribution

The CDF of exceedances Y = X-μ > 0 over threshold μ following a GPD, with scale σ > 0 and shape ξ

CDF:

F(Y; μ, σ, ξ) = 1 − (1 + ξ(yμ)/σ)−1/ξ   for ξ ≠ 0

F(Y; μ, σ, 0) = 1 − exp(−(yμ)/σ)   for ξ = 0

Non-stationary Framework

Parameters are modeled as functions of covariates:

Location (linear):

μ(t) = β0 + β1Z1(t) + β2Z2(t) + ...

Scale (exponential):

σ(t) = exp(α0 + α1Z1(t) + α2Z2(t) + ...)

Shape (linear):

ξ(t) = κ0 + κ1Z1(t) + κ2Z2(t) + ...

where Z(t) is a dynamic covariate that changes with time and affects the extreme value distributions.

Non-Stationarity Configuration

In nsEVDx, non-stationarity is controlled via a configuration vector: config = [a, b, c]

Each element in config specifies the number of covariates for the location (μ), scale (σ), and shape (ξ) parameters. A value of 0 indicates stationarity; values >0 indicate non-stationary modeling using the corresponding number of covariates for modeling. This framework allows flexible, parsimonious modeling of non-stationary extreme value distributions, including covariates only where supported by data.

Quick Start

import nsEVDx as ns
import numpy as np
from scipy.stats import genextreme

# Generate sample data
data = np.random.gev(0.1, loc=10, scale=2, size=50)
covariate = np.linspace(0, 1, 50).reshape(1, -1)

# Initialize model
sampler = ns.NonStationaryEVD(
    config=[1, 0, 0],  # Location varies with 1 covariate
    data=data,
    cov=covariate,
    dist=genextreme
)

# Run MCMC
samples, acc_rate = sampler.MH_RandWalk(
    num_samples=5000,
    initial_params=[10, 0.1, 2, 0.1],
    proposal_widths=[0.5, 0.05, 0.2, 0.05],
    T=1.0
)

print(f"Acceptance rate: {acc_rate:.3f}")
print(f"Parameter means: {samples.mean(axis=0)}")

Module Structure

nsEVDx.NonStationaryEVD

class NonStationaryEVD(config, data, cov, dist, prior_specs=None, bounds=None)

Parameters

config : list of int
Configuration vector [location, scale, shape] specifying number of covariates for each parameter. 0 = stationary.
data : array_like
Observed extreme values in chronological order.
cov : array_like
Covariate matrix, shape (n_covariates, n_samples).
dist : scipy.stats distribution
Either genextreme or genpareto distribution object.
prior_specs : list of tuples, optional
Prior specifications for Bayesian inference. Format: [('dist_name', {'param': value}), ...]
bounds : list of tuples, optional
Parameter bounds for MLE optimization. Format: [(lower, upper), ...]

Configuration Examples:

config = [0, 0, 0] → Full stationary model

config = [1, 0, 0] → Non-stationary location with 1 covariate

config = [2, 1, 0] → 2 covariates for location, 1 for scale

Initialization

Initializing nsEVDx.NonStationaryEVD object for sampling.

Examples

from scipy.stats import genextreme
# Stationary GEV model
sampler = ns.NonStationaryEVD([0,0,0], data, cov, genextreme)

# Non-stationary location only
sampler = ns.NonStationaryEVD([1,0,0], data, cov, genextreme)

# Multiple covariates for location and scale
sampler = ns.NonStationaryEVD([3,2,0], data, cov, genextreme)
                            

get_param_description

@staticmethod
get_param_description(config, n_cov)

Returns descriptions of each parameter in the parameter vector based on configuration.

Parameters
config : list of int
Non-stationarity configuration [location, scale, shape].
n_cov : int
Total number of covariates available.
Returns
list of str
Descriptions of each parameter in order.
Example
# For config=[1,0,0] with n_cov=1
descriptions = ns.NonStationaryEVD.get_param_description([1,0,0], 1)
# Returns: ['B0 (location intercept)', 'B1 (location slope for covariate 1)', 
#           'scale', 'shape']

Parameter Setup Methods

suggest_priors

suggest_priors()

Suggest default prior distributions for model parameters based on configuration and data statistics.

Returns
prior_specs : list of tuples
List of prior specifications for each parameter. Format: [('distribution_name', {'param': value}), ...]
Example
sampler = ns.NonStationaryEVD([1,0,0], data, cov, genextreme)
priors = sampler.suggest_priors()
# Returns: [('normal', {'loc': 0, 'scale': 10}), ...]

suggest_bounds

suggest_bounds(buffer=0.5)

Suggests bounds for MLE optimization based on config and distribution type.

Parameters
buffer : float, default 0.5
Fractional buffer around stationary parameter estimates.
Returns
bounds : list of tuples
List of (lower, upper) tuples for each parameter in order.

Bayesian Methods

MH_RandWalk

MH_RandWalk(num_samples, initial_params, proposal_widths, T=1.0)

Metropolis-Hastings sampler with random walk proposals.

Parameters
num_samples : int
Number of samples to generate.
initial_params : array_like
Starting parameter values.
proposal_widths : array_like
Standard deviations for proposal distribution.
T : float, default 1.0
Temperature parameter for tempering.
Returns
samples : ndarray
Array of shape (num_samples, n_parameters) containing parameter samples.
acceptance_rate : float
Fraction of proposals accepted.
Example
sampler = ns.NonStationaryEVD([1,0,0], data, cov, genextreme, prior_specs=priors)
samples, acc_rate = sampler.MH_RandWalk(
    num_samples=10000,
    initial_params=[0, 0, 1, 0.1],
    proposal_widths=[0.1, 0.05, 0.05, 0.02],
    T=1.0
)
print(f"Acceptance rate: {acc_rate:.3f}")

MH_Mala

MH_Mala(num_samples, initial_params, step_sizes, T=1.0)

Metropolis-Adjusted Langevin Algorithm using gradient information for more efficient sampling.

Parameters
num_samples : int
Number of samples to generate.
initial_params : array_like
Starting parameter values.
step_sizes : list of float
Step sizes epsilon for MALA proposals.
T : float, default 1.0
Temperature factor for acceptance ratio.
Returns
samples : ndarray
Array of shape (num_samples, n_parameters).
acceptance_rate : float
Fraction of proposals accepted.
Example
samples, acc_rate = sampler.MH_Mala(
    num_samples=10000,
    initial_params=[0, 0, 1, 0.1],
    step_sizes=[0.01, 0.01, 0.01, 0.005]
)

numerical_grad_log_posterior

numerical_grad_log_posterior(params, h=1e-2)

Compute the numerical gradient of the log-posterior using central difference method for MH_Mala( ).

Parameters
params : array_like
Parameter values at which to evaluate the gradient.
h : float or array_like, default 1e-2
Step size for finite difference approximation.
Returns
grad : ndarray
Approximate gradient of the log-posterior with respect to each parameter.

MH_Hmc

MH_Hmc(num_samples, initial_params, step_size=0.1, num_leapfrog_steps=10, T=1.0)

Hamiltonian Monte Carlo sampler for efficient exploration of posterior distribution.

Parameters
num_samples : int
Number of samples to generate.
initial_params : array_like
Starting parameter values.
step_size : float, default 0.1
Leapfrog step size epsilon.
num_leapfrog_steps : int, default 10
Number of leapfrog steps per iteration.
T : float, default 1.0
Temperature scaling factor.
Returns
samples : ndarray
Array of shape (num_samples, n_parameters).
acceptance_rate : float
Fraction of proposals accepted.
Example
samples, acc_rate = sampler.MH_Hmc(
    num_samples=10000,
    initial_params=[0, 0, 1, 0.1],
    step_size=0.05,
    num_leapfrog_steps=20
)

hamiltonian

hamiltonian(params, momentum, T)

Compute the Hamiltonian (total energy) for HMC sampling.

Parameters
params : array_like
Current position in parameter space.
momentum : array_like
Auxiliary momentum variables.
T : float
Temperature scaling factor.
Returns
float
Total Hamiltonian energy (potential + kinetic).

Frequentist Methods

frequentist_nsEVD

frequentist_nsEVD(initial_params, max_retries=10)

Maximum likelihood estimation of parameters.

Parameters
initial_params : array_like
Initial parameter guess.
max_retries : int, default 10
Number of optimization retries with perturbed starting points.
Returns
params : ndarray
Estimated parameters.
neg_loglik : float
Negative log-likelihood at optimum.

Simulation

ns_EVDrvs

@staticmethod
ns_EVDrvs(dist, params, cov, config, size)

Generate non-stationary GEV or GPD random samples.

Parameters
dist : rv_continuous
SciPy continuous distribution object (genextreme or genpareto).
params : array_like
Flattened parameter list according to config.
cov : ndarray
Covariate matrix, shape (n_covariates, n_samples).
config : list of int
Non-stationarity config [loc, scale, shape].
size : int
Number of random samples to generate.
Returns
ndarray
Generated non-stationary random variates.
Example
from scipy.stats import genextreme

# Generate synthetic data
synthetic_data = ns.NonStationaryEVD.ns_EVDrvs(
    dist=genextreme,
    params=[10, 0.5, 2, 0.1],  # location intercept, slope, scale, shape
    cov=time_covariate,
    config=[1, 0, 0],
    size=100
)

Utility Functions

nsEVDx.neg_log_likelihood

neg_log_likelihood(params, data, dist)

Compute negative log-likelihood for stationary EVD.

Parameters
params : array_like
Parameters [loc, scale, shape] for the distribution.
data : array_like
Observed data points.
dist : scipy.stats distribution
Distribution object (genpareto or genextreme).
Returns
float
Negative log-likelihood. Returns np.inf if parameters are invalid.

nsEVDx.neg_log_likelihood_ns

neg_log_likelihood_ns(params, data, cov, config, dist)

Calculate negative log-likelihood for non-stationary EVD.

Parameters
params : array_like
Parameter vector ordered according to config.
data : array_like
Observed extreme values.
cov : array_like
Covariate matrix, shape (n_covariates, n_samples).
config : list of int
Non-stationarity configuration [location, scale, shape].
dist : rv_continuous
SciPy distribution object.
Returns
float
Negative log-likelihood value. Returns np.inf if invalid.

nsEVDx.EVD_parsViaMLE

EVD_parsViaMLE(data, dist, verbose=False)

Estimate stationary EVD parameters via maximum likelihood.

Parameters
data : array_like
Observed data.
dist : scipy.stats distribution
genextreme or genpareto distribution.
verbose : bool, default False
Whether to print optimization details.
Returns
ndarray
Estimated parameters [shape, location, scale].

nsEVDx.GEV_parsViaLM

GEV_parsViaLM(arr)

Estimate GEV parameters using L-moments (Hosking & Wallis, 1987).

Parameters
arr : array_like
Observed data sample.
Returns
ndarray
Array of size 3 containing [shape, location, scale].

nsEVDx.GPD_parsViaLM

GPD_parsViaLM(arr)

Estimate GPD parameters using L-moments (Hosking & Wallis, 1987).

Parameters
arr : array_like
Observed data sample.
Returns
ndarray
Array of size 3 containing [shape, location, scale].

nsEVDx.l_moments

l_moments(data)

Compute L-moments from data sample.

Parameters
data : array_like
Sample data array.
Returns
ndarray
Array containing [n, mean, L1, L2, T3, T4].

nsEVDx.comb

comb(n, k)

Compute binomial coefficient "n choose k".

Parameters
n : int
Total number of items.
k : int
Number of items to choose.
Returns
float
The binomial coefficient C(n, k).

Diagnostics & Visualization

nsEVDx.plot_trace

plot_trace(samples, config, fig_size=None, param_names_override=None)

Generate MCMC trace plots for convergence diagnostics.

Parameters
samples : ndarray
MCMC samples of shape (n_iterations, n_parameters).
config : list of int
Non-stationarity config [loc, scale, shape].
fig_size : tuple, optional
Figure size (width, height).
param_names_override : list of str, optional
Custom parameter names to override default naming.
Example
import nsEVDx as ns

# After MCMC sampling
samples, acc_rate = sampler.MH_RandWalk(...)
ns.plot_trace(samples, config=[1,0,0], fig_size=(12, 8))

nsEVDx.plot_posterior

plot_posterior(samples, config, fig_size=None, param_names_override=None)

Generate posterior distribution plots with histograms and density curves.

Parameters
samples : ndarray
MCMC samples of shape (n_iterations, n_parameters).
config : list of int
Non-stationarity config [loc, scale, shape].
fig_size : tuple, optional
Figure size (width, height). Default based on number of parameters.
param_names_override : list of str, optional
Custom parameter names.
Example
ns.plot_posterior(samples, config=[1,0,0])

# With custom names
ns.plot_posterior(
    samples, 
    config=[1,0,0],
    param_names_override=['μ₀', 'μ₁', 'σ', 'ξ']
)

nsEVDx.bayesian_metrics

bayesian_metrics(samples, data, cov, config, dist)

Compute Bayesian model selection criteria (DIC, AIC, BIC) from posterior samples.

Parameters
samples : ndarray
Posterior samples, shape (n_samples, n_params).
data : array_like
Observed data used to compute likelihood.
cov : array_like
Covariates used in the non-stationary model.
config : list of int
Configuration settings for the model.
dist : scipy.stats distribution
Distribution type (genextreme or genpareto).
Returns
dict
Dictionary containing computed values of DIC, AIC, and BIC.

Metrics:

DIC = −2 × mean(log L) + 2pD

AIC = −2 × max(log L) + 2k

BIC = −2 × max(log L) + k × log(n)

Example
metrics = ns.bayesian_metrics(samples, data, cov, [1,0,0], genextreme)
print(f"DIC: {metrics['DIC']:.2f}")
print(f"AIC: {metrics['AIC']:.2f}")
print(f"BIC: {metrics['BIC']:.2f}")

nsEVDx.gelman_rubin

gelman_rubin(chains)

Compute Gelman-Rubin R-hat statistic for MCMC convergence diagnostics.

Parameters
chains : list of ndarray
List of chains, each with shape [n_samples, n_params].
Returns
ndarray
R-hat values for each parameter. Values close to 1.0 indicate convergence.

Convergence criterion:

R̂ < 1.1 indicates good convergence

R̂ > 1.1 suggests more iterations needed

Example
# Run multiple chains
chains = []
for i in range(4):
    samples, _ = sampler.MH_RandWalk(
        num_samples=10000,
        initial_params=init_params + np.random.randn(4)*0.1,
        proposal_widths=[0.1, 0.05, 0.05, 0.02]
    )
    chains.append(samples)

# Check convergence
rhat = ns.gelman_rubin(chains)
print("R-hat values:", rhat)
print("Converged:", np.all(rhat < 1.1))
for more detail visit API

Examples

GEV Model Fitting

View the full example on GitHub: GEV Model Fitting Notebook

GPD Threshold Exceedances

View the full example on GitHub: GPD Frequentist Example Notebook

For more examples on Jupyternotebook visit Examples

Citation

If you use nsEVDx in your research, please cite:

  1. Kafle, N., & Meier, C. I. (2025). nsEVDx: A Python library for modeling Non-Stationary Extreme Value Distributions. arXiv preprint arXiv:2509.07261.
  2. Kafle, N., & Meier, C. (2025). nsEVDx: A Python Library for Modeling Non-Stationary Extreme Value Distributions (v0.1.0). Zenodo. https://doi.org/10.5281/zenodo.15850043

References

  1. Betancourt, M. (2017). A Conceptual Introduction to Hamiltonian Monte Carlo. arXiv: Methodology.
  2. Coles, S. (2007). An Introduction to Statistical Modeling of Extreme Values (4th printing). Springer. https://doi.org/10.1007/978-1-4471-3675-0
  3. Gilleland, E. (2025). extRemes: Extreme Value Analysis. https://doi.org/10.32614/CRAN.package.extRemes
  4. Heffernan, J. E., Stephenson, A. G., & Gilleland, E. (2003). Ismev: An Introduction to Statistical Modeling of Extreme Values. https://CRAN.R-project.org/package=ismev
  5. Hosking, J. R. M., & Wallis, J. R. (1997). Regional Frequency Analysis: An Approach Based on L-Moments (Vol. 93). Cambridge University Press. https://doi.org/10.1017/cbo9780511529443
  6. IRSN. (2024). NSGEV: Non-Stationary GEV Time Series. https://github.com/IRSN/NSGEV/
  7. Kafle, N., & Meier, C. (n.d.). Detecting trends in short duration extreme precipitation over SEUS using neighborhood based method. Manuscript in Preparation.
  8. Kafle, N., & Meier, C. (2025). Evaluating Methodologies for Detecting Trends in Short-Duration Extreme Rainfall in the Southeastern United States. Extreme Hydrological or Critical Event Analysis-III, EWRI Congress 2025, Anchorage, AK, U.S. https://alaska2025.eventscribe.net
  9. Oriol Abril-Pla, V. Andreani, C. Carroll, L. Y. Dong, C. Fonnesbeck, M. Kochurov, R. Kumar, J. Lao, C. C. Luhmann, O. A. Martin, M. Osthege, R. Vieira, T. V. Wiecki, & R. Zinkov. (2023). PyMC: A modern, and comprehensive probabilistic programming framework in Python. PeerJ Computer Science, 9, e1516. https://doi.org/10.7717/peerj-cs.1516
  10. Paciorek, C. (2016). climextRemes: Tools for Analyzing Climate Extremes. https://CRAN.R-project.org/package=climextRemes
  11. Robert, C. P., & Casella, G. (2009). Introducing Monte Carlo Methods with R. https://doi.org/10.1007/978-1-4419-1576-4
  12. Roberts, G. O., & Tweedie, R. L. (1996). Exponential Convergence of Langevin Distributions and Their Discrete Approximations. Bernoulli, 2(4), 341. https://doi.org/10.2307/3318418
  13. Stan Development Team. (2023a). CmdStan: The command-line interface to Stan. https://mc-stan.org/users/interfaces/cmdstan
  14. Stan Development Team. (2023b). PyStan: The Python interface to Stan. https://pystan.readthedocs.io/