nsEVDx package

Submodules

nsEVDx.evd_model module

class NonStationaryEVD(config, data, cov, dist, prior_specs=None, bounds=None)[source]

Bases: object

MH_Hmc(num_samples, initial_params, step_size=0.01, num_leapfrog_steps=10, burn_in=1000, num_chains=1, show_progress=True, n_jobs=1, T=1.0)[source]

Hamiltonian Monte Carlo (HMC) sampler. Wrapper around the HMC engine class to run multi-chain HMC sampling.

Parameters:
  • num_samples (int) – Total iterations per chain (excluding burn-in).

  • initial_params (array-like) – Starting parameter vector.

  • step_size (float) – Leapfrog step size epsilon.

  • num_leapfrog_steps (int) – Number of leapfrog steps per proposal.

  • burn_in (int) – Number of initial samples to discard per chain.

  • num_chains (int) – Number of independent chains.

  • show_progress (bool) – Display tqdm progress bars.

  • n_jobs (int) – Parallel jobs via joblib.

  • T (float) – Temperature scaling factor.

Returns:

dict – A dictionary containing:

  • 'chains': List of sample arrays [chain, iteration, parameter].

  • 'r_hats': Gelman-Rubin convergence values for each parameter.

  • 'acceptance_rates': List of acceptance rates per chain.

  • 'step_sizes': List of final step sizes per chain.

  • 'divergences': List of divergence counts per chain.

Return type:

Union[Tuple[ndarray, float], Tuple[List[ndarray], List[float], ndarray]]
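The roles of step_size and num_leapfrog_steps can be illustrated with a minimal single-chain HMC sketch. It is not the package's HMC engine: the target (independent standard normals), the function names, and the default values are assumptions chosen to keep the example dependency-free. As in the parameter list above, num_samples is treated as excluding burn-in.

```python
import math
import random

def log_post(q):
    # Illustrative target: independent standard normals (not the package's posterior).
    return -0.5 * sum(x * x for x in q)

def grad_log_post(q):
    return [-x for x in q]

def leapfrog(q, p, step_size, num_leapfrog_steps):
    """Integrate Hamiltonian dynamics: half momentum step, full steps, half momentum step."""
    q, p = list(q), list(p)
    grad = grad_log_post(q)
    p = [pi + 0.5 * step_size * gi for pi, gi in zip(p, grad)]
    for step in range(num_leapfrog_steps):
        q = [qi + step_size * pi for qi, pi in zip(q, p)]
        grad = grad_log_post(q)
        # The final momentum update is only a half step.
        scale = 1.0 if step < num_leapfrog_steps - 1 else 0.5
        p = [pi + scale * step_size * gi for pi, gi in zip(p, grad)]
    return q, p

def hamiltonian(q, p):
    # Potential energy (negative log posterior) plus kinetic energy.
    return -log_post(q) + 0.5 * sum(pi * pi for pi in p)

def hmc_chain(num_samples, initial_params, step_size=0.2,
              num_leapfrog_steps=10, burn_in=200, seed=0):
    rng = random.Random(seed)
    q = list(initial_params)
    samples, accepted = [], 0
    for it in range(burn_in + num_samples):
        p = [rng.gauss(0.0, 1.0) for _ in q]          # resample momentum
        q_new, p_new = leapfrog(q, p, step_size, num_leapfrog_steps)
        log_accept = hamiltonian(q, p) - hamiltonian(q_new, p_new)
        if rng.random() < math.exp(min(0.0, log_accept)):
            q = q_new
            if it >= burn_in:
                accepted += 1
        if it >= burn_in:
            samples.append(q)
    return samples, accepted / num_samples

samples, rate = hmc_chain(2000, [0.0, 0.0])
```

A smaller step_size keeps the energy error (and hence the rejection rate) low, while num_leapfrog_steps controls how far each proposal travels.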

MH_Mala(num_samples, initial_params, step_sizes, T=1.0, burn_in=0, num_chains=1, show_progress=True, n_jobs=1)[source]

Metropolis-Adjusted Langevin Algorithm (MALA) sampler.

Parameters:
  • num_samples (int) – Total iterations per chain (including burn-in).

  • initial_params (array-like) – Starting parameter vector.

  • step_sizes (array-like) – Per-parameter step sizes (epsilon).

  • T (float) – Temperature scaling factor.

  • burn_in (int) – Number of initial samples to discard per chain.

  • num_chains (int) – Number of independent chains.

  • show_progress (bool) – Display tqdm progress bars.

  • n_jobs (int) – Parallel jobs via joblib.

Return type:

Same convention as MH_RandWalk.
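As a rough illustration of what a MALA update does with per-parameter step sizes, the sketch below uses the textbook gradient-drifted proposal and its asymmetric Metropolis-Hastings correction. The standard-normal target and all function names are assumptions, temperature scaling is left out, and the package's implementation may differ. Following the parameter list above, num_samples here counts all iterations including burn-in.

```python
import math
import random

def log_post(theta):
    # Illustrative target: independent standard normals.
    return -0.5 * sum(x * x for x in theta)

def grad_log_post(theta):
    return [-x for x in theta]

def log_proposal(to, frm, step_sizes):
    """Log density (up to a constant) of proposing `to` when standing at `frm`."""
    grad = grad_log_post(frm)
    return -sum((t - f - 0.5 * e * e * g) ** 2 / (2.0 * e * e)
                for t, f, e, g in zip(to, frm, step_sizes, grad))

def mala_chain(num_samples, initial_params, step_sizes, burn_in=0, seed=0):
    rng = random.Random(seed)
    theta = list(initial_params)
    samples, accepted = [], 0
    for it in range(num_samples):
        grad = grad_log_post(theta)
        # Langevin proposal: drift along the gradient plus Gaussian noise.
        prop = [x + 0.5 * e * e * g + e * rng.gauss(0.0, 1.0)
                for x, e, g in zip(theta, step_sizes, grad)]
        # The proposal is asymmetric, so both directions enter the ratio.
        log_accept = (log_post(prop) - log_post(theta)
                      + log_proposal(theta, prop, step_sizes)
                      - log_proposal(prop, theta, step_sizes))
        if rng.random() < math.exp(min(0.0, log_accept)):
            theta = prop
            accepted += 1
        if it >= burn_in:
            samples.append(theta)
    return samples, accepted / num_samples

samples, rate = mala_chain(3000, [0.0, 0.0], [0.8, 0.8], burn_in=500)
```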

MH_RandWalk(num_samples, initial_params, proposal_widths, T=1.0, burn_in=0, num_chains=1, show_progress=True, n_jobs=1)[source]

Metropolis-Hastings Random-Walk sampler.

Parameters:
  • num_samples (int) – Total iterations per chain (excluding burn-in).

  • initial_params (array-like) – Starting parameter vector (the same start is used for all chains, plus a small jitter for chains > 1).

  • proposal_widths (array-like) – Per-parameter proposal standard deviations.

  • T (float) – Temperature (default 1.0 = no tempering).

  • burn_in (int) – Number of initial samples to discard per chain.

  • num_chains (int) – Number of independent chains. R-hat is automatically computed when num_chains >= 2.

  • show_progress (bool) – Display tqdm progress bars (default True).

  • n_jobs (int) – Parallel jobs via joblib (-1 = all cores). Requires joblib.

Return type:

Union[Tuple[ndarray, float], Tuple[List[ndarray], List[float], ndarray]]

Returns:

  • If num_chains == 1 – (samples [n_post, n_params], acceptance_rate)

  • If num_chains > 1 – (list_of_chains, list_of_acceptance_rates). Use run_chains() for automatic R-hat reporting.
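The single-chain return convention, (samples, acceptance_rate), can be mimicked with a small random-walk Metropolis sketch. The one-dimensional normal target and the function name rw_chain are hypothetical, and dividing the log-posterior difference by T is an assumption about how tempering enters; the package may apply the temperature differently.

```python
import math
import random

def log_post(theta):
    # Illustrative target: normal with mean 3 and standard deviation 2.
    return -0.5 * ((theta[0] - 3.0) / 2.0) ** 2

def rw_chain(num_samples, initial_params, proposal_widths,
             T=1.0, burn_in=0, seed=0):
    rng = random.Random(seed)
    theta = list(initial_params)
    current = log_post(theta)
    samples, accepted = [], 0
    for it in range(burn_in + num_samples):
        # Symmetric Gaussian proposal with per-parameter standard deviations.
        prop = [x + w * rng.gauss(0.0, 1.0)
                for x, w in zip(theta, proposal_widths)]
        candidate = log_post(prop)
        # Assumption: T > 1 flattens the target, T = 1 leaves it unchanged.
        log_accept = (candidate - current) / T
        if rng.random() < math.exp(min(0.0, log_accept)):
            theta, current = prop, candidate
            accepted += 1
        if it >= burn_in:
            samples.append(theta)
    return samples, accepted / (burn_in + num_samples)

samples, acceptance_rate = rw_chain(5000, [0.0], [2.0], burn_in=500)
```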

frequentist_nsEVD(initial_params, max_retries=10)[source]

Estimate non-stationary EVD parameters via MLE with retries.

Parameters:
  • initial_params (List[float] | ndarray) – Initial guess for parameters.

  • max_retries (int) – Number of retry attempts with modified initial guess.

Returns:

params – Estimated parameters.

Return type:

array-like

static get_param_description(config, n_cov)[source]

Returns a list of strings describing each parameter’s role in the parameter vector, based on the provided configuration vector.

Parameters:
  • config (list of int) – Non-stationarity configuration [location, scale, shape].

  • n_cov (int) – Total number of covariates available.

Returns:

Descriptions of each parameter in order.

Return type:

list of str
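To make the idea of a configuration vector concrete, here is a hypothetical sketch of how [location, scale, shape] entries could map to positions in a flat parameter vector. It assumes that a non-stationary parameter with k covariates contributes one intercept plus k slopes and that a stationary parameter contributes a single value. The function name and the wording of the labels are illustrative; the strings produced by get_param_description may look different.

```python
def describe_params(config, names=("location", "scale", "shape")):
    """Label each slot of the flat parameter vector implied by `config`."""
    labels = []
    for name, n_cov in zip(names, config):
        if n_cov == 0:
            # Stationary parameter: a single constant.
            labels.append(f"{name} (stationary)")
        else:
            # Assumed layout: intercept first, then one slope per covariate.
            labels.append(f"{name} intercept")
            labels.extend(f"{name} slope for covariate {j + 1}"
                          for j in range(n_cov))
    return labels

labels = describe_params([1, 0, 0])
```

Under this assumed layout, config [1, 0, 0] yields four parameters and [2, 1, 0] yields six.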

static ns_EVDrvs(dist, params, cov, config, size)[source]

Generate non-stationary GEV or GPD random samples.

Parameters:
  • dist (rv_continuous) – SciPy continuous distribution object (e.g., genextreme or genpareto).

  • params (List[float] | ndarray) – Flattened parameter list according to config.

  • cov (ndarray) – Covariate matrix, shape (n_covariates, n_samples).

  • config (List[int]) – Non-stationarity config [loc, scale, shape].

  • size (int) – Number of random samples to generate.

Returns:

Generated non-stationary random variates.

Return type:

np.ndarray
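One way to picture non-stationary sampling is inverse-CDF sampling from a GEV whose location follows a linear trend in a covariate. The sketch below is not ns_EVDrvs itself: the parameter layout [b0, b1, scale, xi] for config [1, 0, 0] and the function names are assumptions, and it draws one value per covariate column. It uses the convention in which xi > 0 is the heavy-tailed case; SciPy's genextreme uses the opposite sign (c = -xi).

```python
import math
import random

def gev_quantile(u, loc, scale, xi):
    """Inverse CDF of the GEV for u in (0, 1); xi > 0 means a heavy upper tail."""
    y = -math.log(u)
    if abs(xi) < 1e-12:
        # Gumbel limit as xi -> 0.
        return loc - scale * math.log(y)
    return loc + scale * (y ** (-xi) - 1.0) / xi

def ns_gev_sample(params, cov, seed=0):
    # Hypothetical layout for config [1, 0, 0]: intercept, slope, scale, shape.
    b0, b1, scale, xi = params
    rng = random.Random(seed)
    draws = []
    for c in cov[0]:                      # cov has shape (n_covariates, n_samples)
        u = max(rng.random(), 1e-12)      # keep u strictly inside (0, 1)
        draws.append(gev_quantile(u, b0 + b1 * c, scale, xi))
    return draws

cov = [[0.0, 1.0, 2.0, 3.0]]
draws = ns_gev_sample([10.0, 2.0, 1.5, 0.1], cov)
```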

suggest_bounds(buffer=0.5)[source]

Suggests bounds for MLE optimization based on the configuration vector and distribution.

Parameters:

buffer (float) – Fractional buffer around stationary parameter estimates.

Returns:

bounds – List of (lower, upper) tuples for each parameter in order.

Return type:

List[Tuple[float, float]]

suggest_priors()[source]

Suggest default prior distributions for model parameters based on the current configuration and data statistics.

Returns:

prior_specs – List of prior specifications for each parameter in the order expected by the sampler. Each element is a tuple like (distribution_name, distribution_parameters_dict).

Return type:

list of tuples

nsEVDx.hmc_engine module

class HMCEngine(model, grad_method='analytical')[source]

Bases: object

Analytical-gradient HMC engine for NonStationaryEVD.

Parameters:
  • model (NonStationaryEVD) – A fitted (or partially set-up) NonStationaryEVD model instance. Must have data, cov, config, dist, and prior_specs set. Call model.prior_specs = model.suggest_priors() first if you have not already done so.

  • grad_method (str) – 'analytical' (default) or 'numerical'. Analytical is ~20-50x faster per leapfrog step. Numerical is used automatically as a fallback for unsupported dists.
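The difference between the two gradient methods can be sketched on a small stationary example. Below, a closed-form gradient of a Gumbel log-likelihood is compared against central finite differences, which need two extra likelihood evaluations per parameter. The Gumbel model and all function names are assumptions for illustration and are unrelated to the internals of HMCEngine.

```python
import math

def gumbel_loglik(params, data):
    mu, sigma = params
    total = 0.0
    for x in data:
        z = (x - mu) / sigma
        total += -math.log(sigma) - z - math.exp(-z)
    return total

def analytical_grad(params, data):
    """Closed-form derivatives of the Gumbel log-likelihood w.r.t. mu and sigma."""
    mu, sigma = params
    d_mu = d_sigma = 0.0
    for x in data:
        z = (x - mu) / sigma
        e = math.exp(-z)
        d_mu += (1.0 - e) / sigma
        d_sigma += (-1.0 + z - z * e) / sigma
    return [d_mu, d_sigma]

def numerical_grad(f, params, data, h=1e-6):
    """Central differences: two evaluations of f per parameter."""
    grad = []
    for i in range(len(params)):
        up, down = list(params), list(params)
        up[i] += h
        down[i] -= h
        grad.append((f(up, data) - f(down, data)) / (2.0 * h))
    return grad

data = [1.2, 0.4, 2.5, 3.1, 0.9]
params = [1.0, 1.5]
exact = analytical_grad(params, data)
approx = numerical_grad(gumbel_loglik, params, data)
```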

nsEVDx.utils module

EVD_parsViaMLE(data, dist, verbose=False)[source]

Estimate EVD (GEV or GPD) parameters via MLE.

Parameters:
  • data (array-like) – Observed data.

  • dist (scipy.stats distribution object) – genextreme or genpareto distribution.

Returns:

Estimated parameters [xi (shape), mu (location), sigma (scale)].

Return type:

np.ndarray

Raises:

ValueError – If optimization fails.

GEV_parsViaLM(arr)[source]

Estimate Generalized Extreme Value (GEV) parameters using L-moments, based on the formulations given in Hosking and Wallis (1987).

Parameters:

arr (array-like) – Observed data sample.

Returns:

A NumPy array of size 3 containing the estimated GEV parameters: [shape, location, scale].

Return type:

np.ndarray
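As a sketch of how L-moment estimation for the GEV can work, the function below applies Hosking's well-known rational approximation for the shape parameter, then solves for scale and location. It takes the L-moments directly rather than raw data, and it returns the shape in Hosking's convention k (k > 0 gives a bounded upper tail). Whether GEV_parsViaLM uses exactly these formulas or this sign convention is an assumption, and the function name is hypothetical.

```python
import math

def gev_from_lmoments(l1, l2, t3):
    """Return [shape k, location, scale] from L1, L2 and L-skewness t3."""
    c = 2.0 / (3.0 + t3) - math.log(2.0) / math.log(3.0)
    k = 7.8590 * c + 2.9554 * c * c          # Hosking's approximation
    if abs(k) < 1e-8:
        # Gumbel limit: avoid dividing by k.
        scale = l2 / math.log(2.0)
        loc = l1 - 0.5772156649 * scale      # Euler-Mascheroni constant
        return [0.0, loc, scale]
    g = math.gamma(1.0 + k)
    scale = l2 * k / ((1.0 - 2.0 ** (-k)) * g)
    loc = l1 - scale * (1.0 - g) / k
    return [k, loc, scale]

# Round trip: theoretical L-moments of a GEV with k=0.2, location 10, scale 2.
k0, loc0, scale0 = 0.2, 10.0, 2.0
g0 = math.gamma(1.0 + k0)
l1 = loc0 + scale0 * (1.0 - g0) / k0
l2 = scale0 * (1.0 - 2.0 ** (-k0)) * g0 / k0
t3 = 2.0 * (1.0 - 3.0 ** (-k0)) / (1.0 - 2.0 ** (-k0)) - 3.0
estimate = gev_from_lmoments(l1, l2, t3)
```

The round trip recovers the inputs only approximately because the shape formula is itself an approximation.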

GPD_parsViaLM(arr)[source]

Estimate Generalized Pareto Distribution (GPD) parameters using L-moments, based on the formulations given in Hosking and Wallis (1987).

Parameters:

arr (array-like) – Observed data sample.

Returns:

A NumPy array of size 3 containing the estimated GPD parameters: [shape, location, scale].

Return type:

np.ndarray
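For the three-parameter GPD, the L-moment relations invert in closed form, which the sketch below demonstrates. It uses Hosking's convention for the shape k (SciPy's genpareto uses c = -k) and takes the L-moments as inputs. Whether GPD_parsViaLM estimates the location this way or treats it as known is an assumption, and the function name is hypothetical.

```python
def gpd_from_lmoments(l1, l2, t3):
    """Return [shape k, location, scale] from L1, L2 and L-skewness t3."""
    k = (1.0 - 3.0 * t3) / (1.0 + t3)
    scale = (1.0 + k) * (2.0 + k) * l2
    loc = l1 - (2.0 + k) * l2
    return [k, loc, scale]

# Round trip from the theoretical L-moments of a GPD with k=0.1, location 5, scale 2.
k0, loc0, scale0 = 0.1, 5.0, 2.0
l1 = loc0 + scale0 / (1.0 + k0)
l2 = scale0 / ((1.0 + k0) * (2.0 + k0))
t3 = (1.0 - k0) / (3.0 + k0)
estimate = gpd_from_lmoments(l1, l2, t3)
```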

bayesian_metrics(samples, data, cov, config, dist)[source]

Compute Bayesian model selection criteria (DIC, AIC, BIC) from posterior samples.

This function evaluates the model’s performance using Deviance Information Criterion (DIC), Akaike Information Criterion (AIC), and Bayesian Information Criterion (BIC) based on the log-likelihoods computed from the posterior samples.

Parameters:
  • samples (ndarray of shape (n_samples, n_params)) – Posterior samples of model parameters obtained from MCMC or another Bayesian method.

  • data (array-like) – Observed data used to compute the likelihood.

  • cov (array-like or None) – Covariates used in the non-stationary model, if applicable.

  • config (dict) – Configuration settings for the likelihood computation, e.g., fixed parameters, link functions.

  • dist (str or callable) – Distribution type used for modeling the data (e.g., “gev”, “gumbel”), passed to the likelihood function.

Returns:

A dictionary containing the computed values of DIC, AIC, and BIC.

Return type:

dict

Notes

  • DIC is computed using the effective number of parameters (pD = 2 * (max_ll - mean_ll)).

  • AIC and BIC are computed using the maximum log-likelihood and number of parameters.

  • The log-likelihood is computed using the negative log-likelihood function for each sample.
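The arithmetic in these notes can be sketched from a list of per-sample log-likelihoods. The pD expression is the one stated above; combining it as DIC = -2 * max_ll + 2 * pD is an assumption based on the usual definition, and the function name, argument names, and dictionary keys are hypothetical.

```python
import math

def information_criteria(log_liks, n_params, n_obs):
    """Compute pD, DIC, AIC and BIC from log-likelihoods of posterior samples."""
    max_ll = max(log_liks)
    mean_ll = sum(log_liks) / len(log_liks)
    p_d = 2.0 * (max_ll - mean_ll)                 # effective number of parameters
    return {
        "pD": p_d,
        "DIC": -2.0 * max_ll + 2.0 * p_d,          # assumed combination
        "AIC": 2.0 * n_params - 2.0 * max_ll,
        "BIC": n_params * math.log(n_obs) - 2.0 * max_ll,
    }

metrics = information_criteria([-100.0, -102.0, -101.0], n_params=3, n_obs=50)
```

With these three log-likelihoods, max_ll is -100 and mean_ll is -101, so pD is 2.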

gelman_rubin(chains)[source]

Compute the Gelman-Rubin R-hat statistic for each parameter.

Parameters:

chains (list of np.ndarray) – List of chains (arrays of shape [n_samples, n_params])

Returns:

R-hat values for each parameter

Return type:

np.ndarray
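A minimal sketch of the classic Gelman-Rubin computation is shown below, using plain lists of shape [n_samples][n_params] per chain. It follows the textbook formula (within-chain variance W, between-chain variance B); the package's gelman_rubin may differ in details such as chain splitting, and the function name r_hat is hypothetical.

```python
import math
import statistics

def r_hat(chains):
    """Gelman-Rubin statistic per parameter for equally long chains."""
    n = len(chains[0])
    n_params = len(chains[0][0])
    values = []
    for j in range(n_params):
        columns = [[row[j] for row in chain] for chain in chains]
        # W: average of the within-chain sample variances.
        within = statistics.fmean([statistics.variance(c) for c in columns])
        # B: n times the sample variance of the chain means.
        between = n * statistics.variance([statistics.fmean(c) for c in columns])
        var_hat = (n - 1) / n * within + between / n
        values.append(math.sqrt(var_hat / within))
    return values

chain_a = [[0.0], [1.0], [2.0], [3.0]]
chain_b = [[10.0], [11.0], [12.0], [13.0]]
agreeing = r_hat([chain_a, chain_a])
disagreeing = r_hat([chain_a, chain_b])
```

Chains that explore the same region give values near 1, while chains stuck in different regions give values well above 1.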

l_moments(data)[source]

Compute L-moments from the given data sample.

Parameters:

data (array-like) – Sample data array.

Returns:

Array containing [n, mean, L1, L2, T3, T4], where:

  • n: sample size

  • mean: sample mean

  • L1, L2: first and second L-moments

  • T3, T4: L-skewness and L-kurtosis

Return type:

np.ndarray
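Sample L-moments are commonly computed from unbiased probability-weighted moments of the sorted data, as in the sketch below. It returns the same six quantities in the same order as described above, but it is an independent illustration with a hypothetical function name and requires at least four observations.

```python
def sample_l_moments(data):
    """Return [n, mean, L1, L2, T3, T4] using unbiased probability-weighted moments."""
    x = sorted(data)
    n = len(x)
    b = [0.0, 0.0, 0.0, 0.0]
    for i, value in enumerate(x):          # i is the 0-based rank
        weight = 1.0
        for r in range(4):
            if r > 0:
                # Builds i(i-1)...(i-r+1) / ((n-1)(n-2)...(n-r)).
                weight *= (i - (r - 1)) / (n - r)
            b[r] += weight * value
    b = [total / n for total in b]
    l1 = b[0]
    l2 = 2.0 * b[1] - b[0]
    l3 = 6.0 * b[2] - 6.0 * b[1] + b[0]
    l4 = 20.0 * b[3] - 30.0 * b[2] + 12.0 * b[1] - b[0]
    return [n, sum(x) / n, l1, l2, l3 / l2, l4 / l2]

result = sample_l_moments([3.0, 1.0, 5.0, 2.0, 4.0])
```

For this evenly spaced sample the L-skewness and L-kurtosis are both zero.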

neg_log_likelihood(params, data, dist)[source]

Compute the negative log-likelihood of data for given parameters of a stationary distribution.

Parameters:
  • params (list or np.ndarray) – Parameters [loc, scale, shape] for the distribution.

  • data (array-like) – Observed data points.

  • dist (scipy.stats distribution object) – Distribution object (e.g., genpareto or genextreme).

Returns:

Negative log-likelihood. Returns np.inf if parameters are invalid or evaluation fails.

Return type:

float

neg_log_likelihood_ns(params, data, cov, config, dist)[source]

Calculate the negative log-likelihood of the non-stationary extreme value distribution.

Parameters:
  • params (np.ndarray) – Parameter vector ordered according to the config.

  • data (list or np.ndarray) – Observed extreme values (e.g., annual maxima).

  • cov (list of lists or np.ndarray) – Covariate matrix with shape (n_covariates, n_samples).

  • config (list of int) – Non-stationarity configuration [location, scale, shape], where 0 = stationary, >=1 = number of covariates for non-stationary.

  • dist (rv_continuous) – SciPy continuous distribution object (e.g., genextreme or genpareto).

Returns:

Negative log-likelihood value. Returns np.inf if invalid parameters.

Return type:

float
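The following sketch shows what a non-stationary negative log-likelihood can look like for one specific case: a GEV with a linear trend in location (config [1, 0, 0]) and constant scale and shape. The parameter layout [b0, b1, sigma, xi], the function name, and the sign convention (xi > 0 heavy-tailed, the opposite of SciPy's genextreme c) are assumptions. Like the documented function, it returns infinity for invalid parameters.

```python
import math

def nll_gev_location_trend(params, data, cov):
    """Negative log-likelihood of a GEV whose location is b0 + b1 * covariate."""
    b0, b1, sigma, xi = params
    if sigma <= 0:
        return math.inf
    nll = 0.0
    for x, c in zip(data, cov[0]):         # cov has shape (n_covariates, n_samples)
        z = (x - (b0 + b1 * c)) / sigma
        if abs(xi) < 1e-12:
            # Gumbel limit of the GEV density.
            nll += math.log(sigma) + z + math.exp(-z)
        else:
            t = 1.0 + xi * z
            if t <= 0:
                return math.inf            # observation outside the support
            nll += math.log(sigma) + (1.0 + 1.0 / xi) * math.log(t) + t ** (-1.0 / xi)
    return nll

cov = [[0.0, 1.0, 2.0]]
data = [10.0, 12.0, 14.0]
value = nll_gev_location_trend([10.0, 2.0, 1.0, 0.1], data, cov)
```

Here every observation sits exactly on the trend line, so each contributes log(sigma) + 1 to the total.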

plot_posterior(chains, config, fig_size=None, param_names_override=None)[source]

Plot histograms with density curves for each parameter, based on the configuration vector.

Parameters:
  • chains (np.ndarray) – MCMC samples of shape (n_iterations, n_parameters)

  • config (list of int) – Non-stationarity config [loc, scale, shape]

  • fig_size (tuple, optional) – Optional figure size (width, height). Default is based on number of parameters.

  • param_names_override (list of str, optional) – Custom parameter names to override default naming from config.

plot_trace(chains, config, fig_size=None, param_names_override=None)[source]

Plot MCMC trace plots for each parameter, based on the configuration vector.

Parameters:
  • chains (np.ndarray) – MCMC samples of shape (n_iterations, n_parameters)

  • config (list of int) – Non-stationarity config [loc, scale, shape]

  • fig_size (tuple) – Optional figure size.

  • param_names_override (list of str) – Optional custom names for parameters.

Module contents

The package namespace exposes the same classes and functions documented in the submodules above: EVD_parsViaMLE, GEV_parsViaLM, GPD_parsViaLM, HMCEngine, NonStationaryEVD, bayesian_metrics, gelman_rubin, l_moments, neg_log_likelihood, neg_log_likelihood_ns, plot_posterior, and plot_trace.