nsEVDx package
Submodules
nsEVDx.evd_model module
- class NonStationaryEVD(config, data, cov, dist, prior_specs=None, bounds=None)
Bases: object
- MH_Hmc(num_samples, initial_params, step_size=0.01, num_leapfrog_steps=10, burn_in=1000, num_chains=1, show_progress=True, n_jobs=1, T=1.0)
Hamiltonian Monte Carlo (HMC) sampler. Wrapper around the HMC engine class to run multi-chain HMC sampling.
- Parameters:
num_samples (int) – Total iterations per chain (excluding burn-in).
initial_params (array-like) – Starting parameter vector.
step_size (float) – Leapfrog step size epsilon.
num_leapfrog_steps (int) – Number of leapfrog steps per proposal.
burn_in (int) – Number of initial samples to discard per chain.
num_chains (int) – Number of independent chains.
show_progress (bool) – Display tqdm progress bars.
n_jobs (int) – Parallel jobs via joblib.
T (float) – Temperature scaling factor.
- Returns:
dict – A dictionary containing:
- 'chains': List of sample arrays [chain, iteration, parameter].
- 'r_hats': Gelman-Rubin convergence values for each parameter.
- 'acceptance_rates': List of acceptance rates per chain.
- 'step_sizes': List of final step sizes per chain.
- 'divergences': List of divergence counts per chain.
- Return type:
Union[Tuple[ndarray,float],Tuple[List[ndarray],List[float],ndarray]]
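To illustrate how step_size and num_leapfrog_steps interact, here is a minimal sketch of one HMC transition on a standard normal target. It is not the package's implementation: hmc_step and std_normal are hypothetical names, and burn-in, tempering, step-size adaptation, and divergence tracking are all omitted.

```python
import numpy as np

def hmc_step(theta, log_prob_grad, step_size, num_leapfrog_steps, rng):
    """One HMC transition (illustrative helper, not part of nsEVDx).

    log_prob_grad(x) must return (log density, gradient of log density).
    """
    p0 = rng.standard_normal(theta.shape)      # fresh Gaussian momentum
    logp0, grad = log_prob_grad(theta)
    q, p = theta.copy(), p0.copy()
    p += 0.5 * step_size * grad                # initial half step for momentum
    for i in range(num_leapfrog_steps):
        q += step_size * p                     # full step for position
        logp, grad = log_prob_grad(q)
        if i < num_leapfrog_steps - 1:
            p += step_size * grad              # full step for momentum
    p += 0.5 * step_size * grad                # final half step for momentum
    h0 = -logp0 + 0.5 * p0 @ p0                # Hamiltonian before
    h1 = -logp + 0.5 * p @ p                   # Hamiltonian after
    if np.log(rng.uniform()) < h0 - h1:        # Metropolis correction
        return q, True
    return theta, False

def std_normal(x):
    """Log density (up to a constant) and gradient of a standard normal."""
    return -0.5 * x @ x, -x

rng = np.random.default_rng(0)
theta = np.zeros(2)
draws, n_accept = [], 0
for _ in range(2000):
    theta, ok = hmc_step(theta, std_normal, step_size=0.2,
                         num_leapfrog_steps=10, rng=rng)
    draws.append(theta)
    n_accept += ok
draws = np.array(draws)
acceptance_rate = n_accept / len(draws)
```

A smaller step_size lowers the energy error and raises the acceptance rate, at the cost of shorter trajectories unless num_leapfrog_steps is increased to compensate.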
- MH_Mala(num_samples, initial_params, step_sizes, T=1.0, burn_in=0, num_chains=1, show_progress=True, n_jobs=1)
Metropolis-Adjusted Langevin Algorithm (MALA) sampler.
- Parameters:
num_samples (int) – Total iterations per chain (including burn-in).
initial_params (array-like) – Starting parameter vector.
step_sizes (array-like) – Per-parameter step sizes (epsilon).
T (float) – Temperature scaling factor.
burn_in (int) – Number of initial samples to discard per chain.
num_chains (int) – Number of independent chains.
show_progress (bool) – Display tqdm progress bars.
n_jobs (int) – Parallel jobs via joblib.
- Return type:
Same convention as MH_RandWalk.
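The role of the per-parameter step_sizes can be seen in a minimal MALA sketch on a standard normal target. This is a generic textbook version under stated assumptions, not the package's code: mala_step is a hypothetical helper, the step sizes act as a diagonal preconditioner, and tempering and burn-in are left out.

```python
import numpy as np

def mala_step(theta, log_prob, grad_log_prob, step_sizes, rng):
    """One MALA transition (illustrative helper, not part of nsEVDx)."""
    eps = np.asarray(step_sizes, dtype=float)

    def drift(x):
        # Langevin drift: move along the gradient, scaled per parameter
        return x + 0.5 * eps**2 * grad_log_prob(x)

    def log_q(x_to, x_from):
        # Log proposal density of x_to given x_from, up to a constant
        return -0.5 * np.sum(((x_to - drift(x_from)) / eps) ** 2)

    prop = drift(theta) + eps * rng.standard_normal(theta.shape)
    # Metropolis-Hastings ratio with the asymmetric proposal correction
    log_alpha = (log_prob(prop) + log_q(theta, prop)
                 - log_prob(theta) - log_q(prop, theta))
    if np.log(rng.uniform()) < log_alpha:
        return prop, True
    return theta, False

rng = np.random.default_rng(0)
theta = np.zeros(2)
draws, n_accept = [], 0
for _ in range(5000):
    theta, ok = mala_step(theta, lambda x: -0.5 * x @ x, lambda x: -x,
                          step_sizes=[0.8, 0.8], rng=rng)
    draws.append(theta)
    n_accept += ok
draws = np.array(draws)
acceptance_rate = n_accept / len(draws)
```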
- MH_RandWalk(num_samples, initial_params, proposal_widths, T=1.0, burn_in=0, num_chains=1, show_progress=True, n_jobs=1)
Metropolis-Hastings Random-Walk sampler.
- Parameters:
num_samples (int) – Total iterations per chain (excluding burn-in).
initial_params (array-like) – Starting parameter vector (same start used for all chains + small jitter for chains > 1).
proposal_widths (array-like) – Per-parameter proposal standard deviations.
T (float) – Temperature (default 1.0 = no tempering).
burn_in (int) – Number of initial samples to discard per chain.
num_chains (int) – Number of independent chains. R-hat is automatically computed when num_chains >= 2.
show_progress (bool) – Display tqdm progress bars (default True).
n_jobs (int) – Parallel jobs via joblib (-1 = all cores). Requires joblib.
- Return type:
Union[Tuple[ndarray, float], Tuple[List[ndarray], List[float], ndarray]]
- Returns:
If num_chains == 1 – (samples [n_post, n_params], acceptance_rate)
If num_chains > 1 – (list_of_chains, list_of_acceptance_rates). Use run_chains() for automatic R-hat reporting.
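For readers unfamiliar with the algorithm, a single-chain random-walk Metropolis loop can be sketched as follows. The function rw_metropolis and its log_post argument are hypothetical, and the treatment of T (dividing the log acceptance ratio) is an assumption about how tempering might be applied, not a statement about the package.

```python
import numpy as np

def rw_metropolis(log_post, initial_params, proposal_widths, num_samples,
                  burn_in=0, T=1.0, seed=0):
    """Single-chain random-walk Metropolis (illustrative, not nsEVDx code)."""
    rng = np.random.default_rng(seed)
    theta = np.asarray(initial_params, dtype=float)
    widths = np.asarray(proposal_widths, dtype=float)
    lp = log_post(theta)
    kept, accepted = [], 0
    total = burn_in + num_samples
    for i in range(total):
        # Symmetric Gaussian proposal with per-parameter standard deviations
        prop = theta + widths * rng.standard_normal(theta.shape)
        lp_prop = log_post(prop)
        # Assumed tempering: T > 1 flattens the target, T = 1 leaves it as is
        if np.log(rng.uniform()) < (lp_prop - lp) / T:
            theta, lp = prop, lp_prop
            accepted += 1
        if i >= burn_in:                      # discard the burn-in draws
            kept.append(theta.copy())
    return np.array(kept), accepted / total

samples, acceptance_rate = rw_metropolis(
    lambda x: -0.5 * x @ x,                   # standard normal target
    initial_params=[0.0, 0.0],
    proposal_widths=[1.0, 1.0],
    num_samples=3000,
    burn_in=500,
)
```

The returned array has one row per retained iteration and one column per parameter, matching the (samples [n_post, n_params], acceptance_rate) convention described above.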
- frequentist_nsEVD(initial_params, max_retries=10)
Estimate non-stationary EVD parameters via MLE with retries.
- Parameters:
initial_params (List[float] | ndarray) – Initial guess for parameters.
max_retries (int) – Number of retry attempts with modified initial guess.
- Returns:
params – Estimated parameters.
- Return type:
array-like
- static get_param_description(config, n_cov)
Returns a list of strings describing each parameter's role in the parameter vector, based on the provided configuration vector (config).
- Parameters:
config (list of int) – Non-stationarity configuration [location, scale, shape].
n_cov (int) – Total number of covariates available.
- Returns:
Descriptions of each parameter in order.
- Return type:
list of str
- static ns_EVDrvs(dist, params, cov, config, size)
Generate non-stationary GEV or GPD random samples.
- Parameters:
dist (rv_continuous) – SciPy continuous distribution object (e.g., genextreme or genpareto).
params (List[float] | ndarray) – Flattened parameter list according to config.
cov (ndarray) – Covariate matrix, shape (n_covariates, n_samples).
config (List[int]) – Non-stationarity config [loc, scale, shape].
size (int) – Number of random samples to generate.
- Returns:
Generated non-stationary random variates.
- Return type:
np.ndarray
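As a rough picture of what non-stationary sampling means, the sketch below draws GEV variates whose location is a linear function of one covariate, using the inverse CDF with NumPy only. It does not call ns_EVDrvs: the function gev_rvs_linear_loc, the linear form b0 + b1 * cov, and the use of the xi sign convention are all assumptions. Note that scipy.stats.genextreme uses a shape parameter c with the opposite sign (c = -xi).

```python
import numpy as np

def gev_rvs_linear_loc(b0, b1, sigma, xi, cov, seed=0):
    """GEV draws with location mu_t = b0 + b1 * cov[0, t] (illustrative)."""
    rng = np.random.default_rng(seed)
    cov = np.atleast_2d(cov)                  # shape (n_covariates, n_samples)
    mu = b0 + b1 * cov[0]
    # Uniform draws bounded away from 0 so that log(u) stays finite
    u = rng.uniform(np.finfo(float).tiny, 1.0, size=mu.shape)
    w = -np.log(u)
    if abs(xi) < 1e-12:
        return mu - sigma * np.log(w)         # Gumbel limit (xi -> 0)
    return mu + sigma * (w ** (-xi) - 1.0) / xi

cov = np.linspace(0.0, 10.0, 500)[None, :]    # one covariate, 500 time steps
x = gev_rvs_linear_loc(b0=1.0, b1=1.0, sigma=2.0, xi=0.1, cov=cov)
```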
- suggest_bounds(buffer=0.5)
Suggests bounds for MLE optimization based on the configuration vector and distribution.
- Parameters:
buffer (float) – Fractional buffer around stationary parameter estimates.
- Returns:
bounds – List of (lower, upper) tuples for each parameter in order.
- Return type:
List[Tuple[float, float]]
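One plausible reading of a "fractional buffer" is an interval of plus or minus buffer times the magnitude of each estimate. The sketch below shows that reading only; fractional_bounds is a hypothetical helper and the package may construct its bounds differently.

```python
def fractional_bounds(estimates, buffer=0.5):
    """Return (lower, upper) tuples at +/- buffer * |estimate| (assumed rule)."""
    bounds = []
    for est in estimates:
        half_width = buffer * abs(est)
        bounds.append((est - half_width, est + half_width))
    return bounds

example_bounds = fractional_bounds([10.0, -2.0], buffer=0.5)
```

Under this rule an estimate of exactly zero yields a zero-width interval, so a real implementation would need a minimum width or a distribution-specific override for such parameters.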
- suggest_priors()
Suggest default prior distributions for model parameters based on the current configuration and data statistics.
- Returns:
prior_specs – List of prior specifications for each parameter in the order expected by the sampler. Each element is a tuple like (distribution_name, distribution_parameters_dict).
- Return type:
list of tuples
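To make the (distribution_name, distribution_parameters_dict) layout concrete, the following sketch evaluates a joint log-prior from such a list. The distribution names "normal" and "uniform" and the dictionary keys loc, scale, low, and high are assumptions chosen for illustration; check the output of suggest_priors() for the names the package actually uses.

```python
import math

def log_prior(params, prior_specs):
    """Sum independent log-prior terms (illustrative, hypothetical spec keys)."""
    total = 0.0
    for value, (name, kw) in zip(params, prior_specs):
        if name == "normal":
            z = (value - kw["loc"]) / kw["scale"]
            total += -0.5 * z * z - math.log(kw["scale"]) - 0.5 * math.log(2 * math.pi)
        elif name == "uniform":
            if not kw["low"] <= value <= kw["high"]:
                return -math.inf              # outside the prior support
            total -= math.log(kw["high"] - kw["low"])
        else:
            raise ValueError(f"unsupported prior: {name}")
    return total

specs = [("normal", {"loc": 0.0, "scale": 1.0}),
         ("uniform", {"low": 0.0, "high": 1.0})]
lp_inside = log_prior([0.0, 0.5], specs)
lp_outside = log_prior([0.0, 1.5], specs)
```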
nsEVDx.hmc_engine module
- class HMCEngine(model, grad_method='analytical')
Bases: object
Analytical-gradient HMC engine for NonStationaryEVD.
- Parameters:
model (NonStationaryEVD) – A fitted (or partially set-up) NonStationaryEVD model instance. Must have data, cov, config, dist, and prior_specs set. Call model.prior_specs = model.suggest_priors() first if you have not already done so.
grad_method (str) – 'analytical' (default) or 'numerical'. Analytical is ~20-50x faster per leapfrog step. Numerical is used automatically as a fallback for unsupported distributions.
nsEVDx.utils module
- EVD_parsViaMLE(data, dist, verbose=False)
Estimate EVD (GEV or GPD) parameters via MLE.
- Parameters:
data (array-like) – Observed data.
dist (scipy.stats distribution object) – genextreme or genpareto distribution.
- Returns:
Estimated parameters [xi (shape), mu (location), sigma (scale)].
- Return type:
np.ndarray
- Raises:
ValueError – If optimization fails.
- GEV_parsViaLM(arr)
Estimate Generalized Extreme Value (GEV) parameters using L-moments, based on the formulations given in Hosking and Wallis (1987).
- Parameters:
arr (array-like) – Observed data sample.
- Returns:
A NumPy array of size 3 containing the estimated GEV parameters: [shape, location, scale].
- Return type:
np.ndarray
- GPD_parsViaLM(arr)
Estimate Generalized Pareto Distribution (GPD) parameters using L-moments, based on the formulations given in Hosking and Wallis (1987).
- Parameters:
arr (array-like) – Observed data sample.
- Returns:
A NumPy array of size 3 containing the estimated GPD parameters: [shape, location, scale].
- Return type:
np.ndarray
- bayesian_metrics(samples, data, cov, config, dist)
Compute Bayesian model selection criteria (DIC, AIC, BIC) from posterior samples.
This function evaluates the model’s performance using Deviance Information Criterion (DIC), Akaike Information Criterion (AIC), and Bayesian Information Criterion (BIC) based on the log-likelihoods computed from the posterior samples.
- Parameters:
samples (ndarray of shape (n_samples, n_params)) – Posterior samples of model parameters obtained from MCMC or another Bayesian method.
data (array-like) – Observed data used to compute the likelihood.
cov (array-like or None) – Covariates used in the non-stationary model, if applicable.
config (dict) – Configuration settings for the likelihood computation, e.g., fixed parameters, link functions.
dist (str or callable) – Distribution type used for modeling the data (e.g., “gev”, “gumbel”), passed to the likelihood function.
- Returns:
A dictionary containing the computed values of DIC, AIC, and BIC.
- Return type:
dict
Notes
- DIC is computed using the effective number of parameters (pD = 2 * (max_ll - mean_ll)).
- AIC and BIC are computed using the maximum log-likelihood and number of parameters.
- The log-likelihood is computed using the negative log-likelihood function for each sample.
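The notes above can be turned into a short calculation once the per-sample log-likelihoods are available. In this sketch, information_criteria is a hypothetical helper, the pD formula is taken from the notes, AIC and BIC use their standard definitions, and DIC = -2 * mean_ll + pD is an assumed (standard) form that the package may or may not follow exactly.

```python
import numpy as np

def information_criteria(log_liks, n_params, n_obs):
    """DIC, AIC and BIC from posterior log-likelihoods (illustrative)."""
    ll = np.asarray(log_liks, dtype=float)
    max_ll, mean_ll = ll.max(), ll.mean()
    p_d = 2.0 * (max_ll - mean_ll)            # effective number of parameters
    return {
        "DIC": -2.0 * mean_ll + p_d,          # mean deviance plus pD (assumed)
        "AIC": 2.0 * n_params - 2.0 * max_ll,
        "BIC": n_params * np.log(n_obs) - 2.0 * max_ll,
    }

metrics = information_criteria([-102.0, -100.0, -101.0], n_params=3, n_obs=50)
```

With these three log-likelihood values, max_ll is -100 and mean_ll is -101, so pD is 2 and the assumed DIC is 204.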
- gelman_rubin(chains)
Compute the Gelman-Rubin R-hat statistic for each parameter.
- Parameters:
chains (list of np.ndarray) – List of chains (arrays of shape [n_samples, n_params])
- Returns:
R-hat values for each parameter
- Return type:
np.ndarray
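For reference, the classic Gelman-Rubin statistic compares between-chain and within-chain variance. The sketch below implements that textbook form for equal-length chains; r_hat is a hypothetical name, and the package's gelman_rubin may differ in details such as chain splitting or rank normalization.

```python
import numpy as np

def r_hat(chains):
    """Classic Gelman-Rubin R-hat per parameter (illustrative)."""
    x = np.stack(chains)                      # shape (m chains, n draws, p params)
    m, n, _ = x.shape
    chain_means = x.mean(axis=1)
    W = x.var(axis=1, ddof=1).mean(axis=0)    # mean within-chain variance
    B = n * chain_means.var(axis=0, ddof=1)   # between-chain variance
    var_hat = (n - 1) / n * W + B / n         # pooled variance estimate
    return np.sqrt(var_hat / W)

rng = np.random.default_rng(0)
good = [rng.standard_normal((1000, 2)) for _ in range(4)]
bad = good[:3] + [good[3] + 3.0]              # one chain stuck elsewhere
r_good, r_bad = r_hat(good), r_hat(bad)
```

Values close to 1 suggest the chains agree; the shifted chain in the second case pushes R-hat well above 1.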
- l_moments(data)
Compute L-moments from the given data sample.
- Parameters:
data (array-like) – Sample data array.
- Returns:
Array containing [n, mean, L1, L2, T3, T4], where:
- n: sample size
- mean: sample mean
- L1, L2: first and second L-moments
- T3, T4: L-skewness and L-kurtosis
- Return type:
np.ndarray
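Sample L-moments are commonly computed from unbiased probability-weighted moments of the sorted data. The sketch below follows that standard recipe and returns values in the same [n, mean, L1, L2, T3, T4] order; sample_l_moments is a hypothetical name, it requires at least four observations, and the package's estimator may use a different variant.

```python
import numpy as np

def sample_l_moments(data):
    """L-moments via unbiased probability-weighted moments (illustrative)."""
    x = np.sort(np.asarray(data, dtype=float))
    n = x.size
    i = np.arange(n)                          # zero-based rank of each value
    b0 = x.mean()
    b1 = np.sum(i / (n - 1) * x) / n
    b2 = np.sum(i * (i - 1) / ((n - 1) * (n - 2)) * x) / n
    b3 = np.sum(i * (i - 1) * (i - 2) / ((n - 1) * (n - 2) * (n - 3)) * x) / n
    l1 = b0
    l2 = 2 * b1 - b0
    l3 = 6 * b2 - 6 * b1 + b0
    l4 = 20 * b3 - 30 * b2 + 12 * b1 - b0
    # T3 and T4 are the L-skewness and L-kurtosis ratios
    return np.array([n, x.mean(), l1, l2, l3 / l2, l4 / l2])

lm = sample_l_moments([1.0, 2.0, 3.0, 4.0, 5.0])
```

For this evenly spaced, symmetric sample the first L-moment is 3, the second is 1, and the L-skewness is 0.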
- neg_log_likelihood(params, data, dist)
Compute the negative log-likelihood of data for given parameters of a stationary distribution.
- Parameters:
params (list or np.ndarray) – Parameters [loc, scale, shape] for the distribution.
data (array-like) – Observed data points.
dist (scipy.stats distribution object) – Distribution object (e.g., genpareto or genextreme).
- Returns:
Negative log-likelihood. Returns np.inf if parameters are invalid or evaluation fails.
- Return type:
float
- neg_log_likelihood_ns(params, data, cov, config, dist)
Calculate the negative log-likelihood of the non-stationary extreme value distribution.
- Parameters:
params (np.ndarray) – Parameter vector ordered according to the config.
data (list or np.ndarray) – Observed extreme values (e.g., annual maxima).
cov (list of lists or np.ndarray) – Covariate matrix with shape (n_covariates, n_samples).
config (list of int) – Non-stationarity configuration [location, scale, shape], where 0 = stationary, >=1 = number of covariates for non-stationary.
dist (rv_continuous) – SciPy continuous distribution object (e.g., genextreme or genpareto).
- Returns:
Negative log-likelihood value. Returns np.inf if invalid parameters.
- Return type:
float
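As an illustration of the quantity being computed, the sketch below writes out a GEV negative log-likelihood with a location that is linear in one covariate, which corresponds to a configuration like [1, 0, 0]. It is a NumPy-only stand-in, not the package function: gev_nll_linear_loc, the parameter order [b0, b1, sigma, xi], and the xi sign convention are assumptions.

```python
import numpy as np

def gev_nll_linear_loc(params, data, cov):
    """GEV negative log-likelihood with mu_t = b0 + b1 * cov[0, t] (illustrative)."""
    b0, b1, sigma, xi = params
    if sigma <= 0:
        return np.inf                         # invalid scale
    x = np.asarray(data, dtype=float)
    mu = b0 + b1 * np.atleast_2d(cov)[0]
    z = (x - mu) / sigma
    if abs(xi) < 1e-8:                        # Gumbel limit (xi -> 0)
        return float(np.sum(np.log(sigma) + z + np.exp(-z)))
    t = 1.0 + xi * z
    if np.any(t <= 0):
        return np.inf                         # observation outside the support
    return float(np.sum(np.log(sigma) + (1.0 + 1.0 / xi) * np.log(t)
                        + t ** (-1.0 / xi)))

# Synthetic data with a known trend, generated by inverse-CDF sampling
rng = np.random.default_rng(1)
cov = np.linspace(0.0, 10.0, 200)[None, :]
w = -np.log(rng.uniform(1e-12, 1.0, size=200))
data = (1.0 + 0.5 * cov[0]) + 1.0 * (w ** (-0.1) - 1.0) / 0.1
nll_trend = gev_nll_linear_loc([1.0, 0.5, 1.0, 0.1], data, cov)
nll_flat = gev_nll_linear_loc([3.5, 0.0, 1.0, 0.1], data, cov)
```

The parameters that include the trend give a lower negative log-likelihood than the flat-location alternative, which is the comparison an optimizer or sampler relies on.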
- plot_posterior(chains, config, fig_size=None, param_names_override=None)
Plot histograms with density curves for each parameter based on the configuration vector.
- Parameters:
chains (np.ndarray) – MCMC samples of shape (n_iterations, n_parameters)
config (list of int) – Non-stationarity config [loc, scale, shape]
fig_size (tuple, optional) – Optional figure size (width, height). Default is based on number of parameters.
param_names_override (list of str, optional) – Custom parameter names to override default naming from config.
- plot_trace(chains, config, fig_size=None, param_names_override=None)
Plot MCMC trace plots for each parameter based on the configuration vector.
- Parameters:
chains (np.ndarray) – MCMC samples of shape (n_iterations, n_parameters)
config (list of int) – Non-stationarity config [loc, scale, shape]
fig_size (tuple) – Optional figure size.
param_names_override (list of str) – Optional custom names for parameters.
Module contents
The package namespace re-exports the following names, each documented in the submodule sections above: EVD_parsViaMLE, GEV_parsViaLM, GPD_parsViaLM, HMCEngine, NonStationaryEVD, bayesian_metrics, gelman_rubin, l_moments, neg_log_likelihood, neg_log_likelihood_ns, plot_posterior, plot_trace.