API Documentation

Demographic models

class momi.DemographicModel(N_e, gen_time=1, muts_per_gen=None)

Object for representing and inferring a demographic history.

Parameters:
  • N_e (float) – the population size, unless manually changed by DemographicModel.set_size()

  • gen_time (float) – units of time per generation. For example, if you wish to specify time in years, and a generation is 29 years, set this to 29. Default value is 1.

  • muts_per_gen (float,None) – mutation rate per base per generation. If unknown, set to None (the default). Can be changed with DemographicModel.set_mut_rate()

add_growth_param(name, g0=None, lower=-0.001, upper=0.001, rgen=None)

Add growth rate parameter to the demographic model.

Parameters:
  • name (str) – Parameter name

  • g0 (float) – Starting value. If None, randomly sample with rgen

  • lower (float) – Lower bound

  • upper (float) – Upper bound

  • rgen (function) – Function to sample random value. If None, use uniform distribution

add_leaf(pop_name, t=0, N=None, g=None)

Add a sampled leaf population to the model.

The arguments t, N, g can be floats or parameter names (strings).

If N or g are specified, then the size or growth rate are also set at the sampling time t. Otherwise they remain at their previous (default) values.

Note that this does not affect the population size and growth below time t, which may be an issue if lineages are moved in from other populations below t. If you need to set the size or growth below t, use DemographicModel.set_size().

Parameters:
  • pop_name (str) – Name of the population

  • t (float,str) – Time the population was sampled

  • N (float,str) – Population size

  • g (float,str) – Population growth rate

add_parameter(name, start_value=None, rgen=None, scale_transform=None, unscale_transform=None, scaled_lower=None, scaled_upper=None)

Low-level method to add a new parameter. Most users should instead call DemographicModel.add_size_param(), DemographicModel.add_pulse_param(), DemographicModel.add_time_param(), or DemographicModel.add_growth_param().

In order for internal gradients to be correct, scale_transform and unscale_transform should be constructed using autograd.

Parameters:
  • name (str) – Name of the parameter.

  • start_value (float) – Starting value. If None, use rgen to sample a random starting value.

  • rgen (function) – Function to get a random starting value.

  • scale_transform (function) – Function for internally transforming and rescaling the parameter during optimization.

  • unscale_transform (function) – Inverse function of scale_transform

  • scaled_lower (float) – Lower bound after scaling by scale_transform

  • scaled_upper (float) – Upper bound after scaling by scale_transform

add_pulse_param(name, p0=None, lower=0.0, upper=1.0, rgen=None)

Add a pulse parameter to the demographic model.

Parameters:
  • name (str) – Parameter name.

  • p0 (float) – Starting value. If None, randomly sample with rgen

  • lower (float) – Lower bound

  • upper (float) – Upper bound

  • rgen (function) – Function to sample random value. If None, use a uniform distribution.

add_size_param(name, N0=None, lower=1, upper=10000000000.0, rgen=None)

Add a size parameter to the demographic model.

Parameters:
  • name (str) – Parameter name

  • N0 (float) – Starting value. If None, use rgen to randomly sample

  • lower (float) – Lower bound

  • upper (float) – Upper bound

  • rgen (function) – Function to sample a random starting value. If None, a truncated exponential with rate 1 / N_e

add_time_param(name, t0=None, lower=0.0, upper=None, lower_constraints=[], upper_constraints=[], rgen=None)

Add a time parameter to the demographic model.

Parameters:
  • name (str) – Parameter name

  • t0 (float) – Starting value. If None, use rgen to randomly sample

  • lower (float) – Lower bound

  • upper (float) – Upper bound

  • lower_constraints (list) – List of parameter names that are lower bounds

  • upper_constraints (list) – List of parameter names that are upper bounds

  • rgen – Function to sample a random starting value. If None, a truncated exponential with rate 1 / (N_e * gen_time) constrained to satisfy the bounds and constraints.

fit_within_pop_diversity()

Estimates mutation rate using within-population nucleotide diversity.

The within-population nucleotide diversity is the average number of hets per individual in the population, assuming it is at Hardy Weinberg equilibrium.

This returns an estimated mutation rate for each (ascertained) population. Note these are non-independent estimates of the same value. It also returns standard deviations from the jacknife.

If DemographicModel.muts_per_gen is set, will also return Z-scores of the residuals.

Return type:

pandas.DataFrame

get_params(scaled=False)

Return an ordered dictionary with the current parameter values.

If scaled=True, returns the parameters scaled in the internal representation used by momi during optimization (see also: DemographicModel.add_parameter())

kl_div()

The KL-divergence at the current parameter values

log_likelihood()

The log likelihood at the current parameter values

move_lineages(pop_from, pop_to, t, p=1, N=None, g=None)

Move each lineage in pop_from to pop_to at time t with probability p.

The arguments t, p, N, g can be floats or parameter names (strings).

If N or g are specified, then the size or growth rate of pop_to is also set at this time, otherwise these parameters remain at their previous values.

Parameters:
  • pop_from (str) – Population lineages are moved from (backwards in time)

  • pop_to (str) – Population lineages are moved to (backwards in time)

  • t (float,str) – Time of the event

  • p (float,str) – Probability that lineage in pop_from moves to pop_to

  • N (float,str) – Population size of pop_to

  • g (float,str) – Growth rate of pop_to

optimize(method='tnc', jac=True, hess=False, hessp=False, printfreq=1, **kwargs)

Search for the maximum likelihood value of the parameters.

This is a wrapper around scipy.optimize.minimize(), and arguments for that function can be passed in via **kwargs. Note the following arguments are constructed by momi and not be passed in by **kwargs: fun, x0, jac, hess, hessp, bounds.

Parameters:
  • method (str) – Optimization method. Default is “tnc”. For large models “L-BFGS-B” is recommended. See scipy.optimize.minimize().

  • jac (bool) – Whether or not to provide the gradient (computed via autograd) to the optimizer. If False, optimizers requiring gradients will typically approximate it via finite differences.

  • hess (bool) – Whether or not to provide the hessian (computed via autograd) to the optimizer.

  • hessp (bool) – Whether or not to provide the hessian-vector-product (via autograd) to the optimizer

  • printfreq (int) – Log current progress via logging.info() every printfreq iterations

Return type:

scipy.optimize.OptimizeResult

set_data(sfs, length=None, mem_chunk_size=1000, non_ascertained_pops=None, use_pairwise_diffs=True)

Set dataset for the model.

Parameters:
  • sfs (Sfs) – Observed SFS

  • length (float) – Length of data in bases. Overrides sfs.length if set. Required if DemoModel.muts_per_gen is set and sfs.length is not.

  • mem_chunk_size – Controls memory usage by computing likelihood in chunks of SNPs. If -1 then no chunking is done.

  • non_ascertained_pops – Don’t ascertain SNPs within these populations. That is, ignore all SNPs that are not polymorphic on the other populations. The SFS is adjusted to represent probabilities conditional on this ascertainment scheme.

  • use_pairwise_diffs – Only has an effect if DemoModel.muts_per_gen is set. If False, assumes the total number of mutations is Poisson. If True, models the within population nucleotide diversity (i.e. the average number of heterozygotes per population) as independent Poissons. If there is missing data this is required to be True.

set_mut_rate(muts_per_gen)

Set the mutation rate.

Parameters:

muts_per_gen (float,None) – Mutation rate per base per generation. If unknown, set to None.

set_params(new_params=None, randomize=False, scaled=False)

Set the current parameter values

Parameters:
  • new_params (dict/list) – dict mapping parameter names to new values, or list of new values whose length is the current number of parameters

  • randomize (bool) – if True, parameters not in new_params get randomly sampled new values

  • scaled (bool) – if True, values in new_params have been pre-scaled according to the internal representation used by momi during optimization (see also: DemographicModel.add_parameter())

set_size(pop_name, t, N=None, g=0)

Set population size and/or growth rate at time t.

The arguments t, N, g can be floats or parameter names (strings).

If N is not specified then only the growth rate is changed.

If N is specified and g is not then the growth rate is reset to 0. Currently it is not possible to change the size without also setting the growth rate.

Parameters:
simulate_data(length, recoms_per_gen, num_replicates, muts_per_gen=None, sampled_n_dict=None, **kwargs)

Simulate data, using msprime as backend

Parameters:
  • length (int) – Length of each locus in bases

  • recoms_per_gen (float) – Recombination rate per generation per base

  • muts_per_gen (float) – Mutation rate per generation per base

  • num_replicates (int) – Number of loci to simulate

  • sampled_n_dict (dict) – Number of haploids per population. If None, use sample sizes from the current dataset as set by DemographicModel.set_data()

Returns:

Dataset of SNP allele counts

Return type:

SnpAlleleCounts

simulate_vcf(out_prefix, recoms_per_gen, length, muts_per_gen=None, chrom_name='1', ploidy=1, random_seed=None, sampled_n_dict=None, **kwargs)

Simulate a chromosome using msprime and write it to VCF

Parameters:
  • outfile (str,file) – Output VCF file. If a string ending in “.gz”, gzip it.

  • muts_per_gen (float) – Mutation rate per generation per base

  • recoms_per_gen (float) – Recombination rate per generation per base

  • length (int) – Length of chromosome in bases

  • chrom_name (str) – Name of chromosome

  • ploidy (int) – Ploidy

  • random_seed (int) – Random seed

  • sampled_n_dict (dict) – Number of haploids per population. If None, use sample sizes from the current dataset as set by DemographicModel.set_data()

stochastic_optimize(num_iters, n_minibatches=None, snps_per_minibatch=None, rgen=None, printfreq=1, start_from_checkpoint=None, save_to_checkpoint=None, svrg_epoch=-1, **kwargs)

Use stochastic optimization (ADAM+SVRG) to search for MLE

Exactly one of of n_minibatches and snps_per_minibatch should be set, as one determines the other.

Parameters:
  • num_iters (int) – Number of steps

  • n_minibatches (int) – Number of minibatches

  • snps_per_minibatch (int) – Number of SNPs per minibatch

  • rgen (numpy.RandomState) – Random generator

  • printfreq (int) – How often to log progress

  • start_from_checkpoint (str) – Name of checkpoint file to start from

  • save_to_checkpoint (str) – Name of checkpoint file to save to

  • svrg_epoch (int) – How often to compute full likelihood for SVRG. -1=never.

Return type:

scipy.optimize.OptimizeResult

Plotting

class momi.DemographyPlot(model, pop_x_positions, ax=None, figsize=None, linthreshy=None, minor_yticks=None, major_yticks=None, color_map='cool', pulse_color_bounds=(0, 1), draw=True)

Object for plotting a demography.

After creating this object, call DemographyPlot.draw() to draw the demography.

Parameters:
  • model (DemographicModel) – model to plot

  • pop_x_positions (list,dict) – list ordering populations along the x-axis, or dict mapping population names to x-axis positions

  • ax (matplotlib.axes.Axes) – Canvas to draw the figure on. Defaults to matplotlib.pyplot.gca()

  • figsize (tuple) – If non-None, calls matplotlib.pyplot.figure() to create a new figure with this width and height. Ignored if ax is non-None.

  • linthreshy (float) – Scale y-axis linearly below this value and logarithmically above it. (Default: use linear scaling only)

  • minor_yticks (list) – Minor ticks on y axis

  • major_yticks (list) – Major ticks on y axis

  • color_map (str,matplotlib.colors.Colormap) – Colormap mapping pulse probabilities to colors

  • pulse_color_bounds (tuple) – pair of (lower, upper) bounds for mapping pulse probabilities to colors

add_bootstrap(params, alpha, rad=-0.1, rand_rad=True)

Add an inferred bootstrap demography to the plot.

Parameters:
  • params (dict) – Inferred bootstrap parameters

  • alpha (float) – Transparency

  • rad (float) – Arc of pulse arrows in radians

  • rand_rad (bool) – Add random jitter to the arc of the pulse arrows

draw(alpha=1.0, tree_color='C0', draw_frame=True, rad=0, pulse_label=True)

Draw the demography.

This is method draws the demography by calling DemographyPlot.draw_tree(), DemographyPlot.draw_leafs(), DemographyPlot.draw_pulse(), and DemographyPlot.draw_frame(). Call those methods directly for finer control.

Parameters:
  • alpha (float) – Level of transparency (0=transparent, 1=opaque)

  • tree_color (str) – Color of tree

  • draw_frame (bool) – If True, call DemographyPlot.draw_frame() to draw tickmarks and legends

  • rad (float) – Arc of pulse arrows in radians.

  • pulse_label (bool) – If True, label each pulse with its strength

draw_N_legend(N_legend_values=None, title='N', **kwargs)

Draw legend of population sizes.

**kwargs are passed onto matplotlib.axes.Axes.legend().

Parameters:
  • N_legend_values – Values of N to plot.

  • title – Title of legend

Return type:

matplotlib.legend.Legend

draw_frame(pops=None, rename_pops=None, rotation=-30)

Draw tickmarks, legend, and colorbar.

Parameters:
  • pops (list) – Populations to label on x-axis. If None, add all populations

  • rename_pops (dict) – Dict to give new names to populations on x-axis.

  • rotation (float) – Degrees to rotate x-axis population labels by.

draw_leafs(leafs=None, facecolors='none', edgecolors='black', s=100, zorder=2, **kwargs)

Draw symbols for the leaf populations.

Parameters:

leafs (list) – Leaf populations to draw symbols for. If None, add all leafs.

Extra parameters facecolors, edgecolors, s, zorder, **kwargs are passed to matplotlib.axes.Axes.scatter().

draw_pulse(pop1, pop2, t, p, rad=-0.1, alpha=1.0, pulse_label=True, adj_label_x=0, adj_label_y=0)

Draw a pulse.

Use DemographyPlot.iter_pulses() to iterate over the pulses, which can then be plotted with this method.

Parameters:
  • pop1 (str) – Population the arrow is pointing into

  • pop2 (str) – Population the arrow is pointing away from

  • t (float) – Time of the pulse

  • p (float) – Strength of the pulse

  • rad (float) – Arc of the pulse arrow in radians.

  • alpha (float) – Transparency

  • pulse_label (bool) – Add a label for pulse strength?

  • adj_label_x (float) – Adjust pulse label x position

  • adj_label_y (float) – Adjust pulse label y position

draw_tree(tree_color='C0', alpha=1.0)

Draw the demographic tree (without pulse migrations)

Parameters:
  • tree_color (str) – Color of the tree

  • alpha (float) – Transparency

iter_pulses()

Iterate over the pulse arrows, to pass to DemographyPlot.draw_pulse()

Returns:

iterator over tuples (pop1, pop2, t, p)

Data

momi.snp_allele_counts(chrom_ids, positions, populations, ancestral_counts, derived_counts, length=None, use_folded_sfs=False)

Create a SnpAlleleCounts object.

Parameters:
  • chrom_ids (iterator) – the CHROM at each SNP

  • positions (iterator) – the POS at each SNP

  • populations (list) – the population names

  • ancestral_counts (iterator) – iterator over tuples of the ancestral counts at each SNP. tuple length should be the same as the number of populations

  • derived_counts (iterator) – iterator over tuples of the derived counts at each SNP. tuple length should be the same as the number of populations

  • use_folded_sfs (bool) – whether the folded SFS should be used when computing likelihoods. Set this to True if there is uncertainty about the ancestral allele.

Return type:

SnpAlleleCounts

class momi.SnpAlleleCounts

The allele counts for a list of SNPs.

To create a SnpAlleleCounts object, use SnpAlleleCounts.read_vcf(), SnpAlleleCounts.load(), or snp_allele_counts(). Do NOT call the class constructor directly, it is for internal use only.

classmethod concatenate(to_concatenate)

Combine a list of SnpAlleleCounts into a single object.

Parameters:

to_concatenate (iterator) – Iterator over SnpAlleleCounts

Return type:

SnpAlleleCounts

down_sample(sampled_n_dict)

Randomly subsample the alleles at each site.

Parameters:

sampled_n_dict (dict) – dictionary mapping population names to the subsample size (number of alleles to subsample from each population).

Return type:

SnpAlleleCounts

dump(f)

Write data in JSON format.

Parameters:

f (str,file) – filename or file object. If a filename, the resulting file is gzipped.

extract_sfs(n_blocks)

Extracts SFS from data.

Parameters:

n_blocks (int) – Number of blocks to split SFS into, for jackknifing and bootstrapping

Return type:

Sfs

classmethod load(f)

Load SnpAlleleCounts created from SnpAlleleCounts.dump() or python -m momi.read_vcf ...

Parameters:

f (str,file) – file object or file name to read in

Return type:

SnpAlleleCounts

classmethod read_vcf(vcf_file, ind2pop, bed_file=None, ancestral_alleles=True, info_aa_field='AA')

Read in a VCF file and return the allele counts at biallelic SNPs.

Parameters:
  • vcf_file (str) – VCF file to read in. The VCF file should be indexed, e.g. with tabix or bcftools. Alternatively, if equal to “-”, reads the VCF from stdin.

  • ind2pop (dict) – Maps individual samples to populations.

  • bed_file (str,None) – BED accessibility regions file. Only regions in the BED file are read from the VCF. The BED file is also used to determine the size of the data in bases, so the same BED file should NOT be used across multiple VCFs (otherwise regions will be double counted towards the data length). If no BED is provided, all SNPs in the VCF are read, and the length of the data is set to be unknown.

  • ancestral_alleles (bool,str) – If True, use the AA INFO field to determine ancestral allele, skipping SNPs missing this field. If False, ignore ancestral allele information, and set the SnpAlleleCounts.use_folded_sfs property so that the folded SFS is used by default when computing likelihoods. Finally, if ancestral_alleles is a string that is the name of a population in ind2pop, then treat that population as an outgroup, using its consensus allele as the ancestral allele; SNPs without consensus are skipped.

  • info_aa_field (str) – The INFO field to read Ancestral Allele from. Default is “AA”. Only has effect if ancestral_alleles=True.

Return type:

SnpAlleleCounts

momi.site_freq_spectrum(sampled_pops, freqs_by_locus, length=None)

Create Sfs from a list of dicts of counts.

freqs_by_locus[l][config] gives the frequency of config on locus l.

A config is a (n_pops,2)-array, where config[p][a] is the count of allele a in population p. a=0 represents the ancestral allele while a=1 represents the derived allele.

Parameters:
  • sampled_pops (list) – list of population names

  • freqs_by_locus (list) – list of dict

  • length (float,None) – Number of bases in the dataset. Set to None if unknown.

Return type:

Sfs

class momi.Sfs

Class for representing observed site frequnecy spectrum data.

To create a Sfs object, use SnpAlleleCounts.extract_sfs(), Sfs.load(), or site_freq_spectrum(). Do NOT call the class constructor directly, it is for internal use only.

property config_array

Array of unique configs.

This is a 3-d array with shape (n_configs, n_pops, 2). config_array[i, p, a] gives the count of allele a in population p in configuration i. Allele a=0 is ancestral, a=1 is derived. Sfs.sampled_pops gives the ordering of the populations.

Returns:

array of configs

Return type:

numpy.ndarray

dump(f)

Write Sfs to file

Parameters:

f (str,file) – Filename or object. If name ends with “.gz” gzip it.

fold()
Returns:

A copy of the SFS, but with folded entries.

Return type:

Sfs

property freqs_matrix

Sparse matrix representing the frequencies at each locus.

The (i,j)-th entry gives the frequency of the i-th config in Sfs.config_array at locus j

Return type:

scipy.sparse.csr_matrix

property length

Length of data in bases. None if unknown.

Return type:

float

classmethod load(f)

Load Sfs from file created by Sfs.dump() or python -m momi.extract_sfs

Parameters:

f (str,file) – file object or file name to read in

Return type:

Sfs

property n_loci

The number of loci in the dataset

Return type:

int

n_snps(vector=False)

Number of SNPs, either per-locus or overall.

Parameters:

vector (bool) – Whether to return a single total count, or an array of counts per-locus

p_missing

Estimate of probability that a random allele from each population is missing.

Missingness is estimated as follows: from each SNP remove a random allele; if the resulting config is monomorphic, then ignore. If polymorphic, then count whether the removed allele is missing or not.

This avoids bias from fact that we don’t observe some polymorphic configs that appear monomorphic after removing the missing alleles.

Returns:

1-d array of missingness per population

Return type:

numpy.ndarray

resample()

Create a new SFS by resampling blocks with replacement.

Note the resampled SFS is assumed to have the same length in base pairs as the original SFS, which may be a poor assumption if the blocks are not of equal length.

Returns:

Resampled SFS

Return type:

Sfs

property sampled_n

Number of samples per population

sampled_n[i] is the number of haploids in the i-th population of Sfs.sampled_pops

Returns:

1-d array of integers

Return type:

numpy.ndarray

property sampled_pops

Names of sampled populations

Returns:

1-d array of strings

Return type:

numpy.ndarray

subset_populations(populations, non_ascertained_pops=None)

Returns lower-dimensional SFS restricted to a subset of populations.

Parameters:
  • populations (list) – list of populations to subset to

  • non_ascertained_pops (list) – list of populations to treat as non-ascertained (SNPs must be polymorphic with respect to the ascertained populations)

Returns:

Lower-dimensional SFS restricted to a subset of populations.

Return type:

Sfs

momi.sfs_from_dadi(infile, outfile=None)

Generate a momi formatted Sfs object from the dadi frequency spectrum format as documented in section 3.1 of the dadi manual. One slight modification is that if you include population names after the “folding” flag on the info line, then this function will honor these pop names.

Parameters:
  • infile (str) – dadi-style sfs to be converted

  • outfile (str,None) – Optional output file to write the Sfs to

Return type:

Sfs

Statistics

class momi.SfsModelFitStats(demo_model, sampled_n_dict=None)

Class to compare expected vs. observed statistics of the SFS.

All methods return JackknifeGoodnessFitStat unless otherwise stated.

Currently, all goodness-of-fit statistics are based on the multinomial SFS (i.e., the SFS normalized to be a probability distribution summing to 1). Thus the mutation rate has no effect on these statistics.

See Patterson et al 2012, for definitions of f2, f3, f4 (abba-baba), and D statistics.

Note this class does NOT get updated when the underlying demo_model changes; a new SfsModelFitStats needs to be created to reflect any changes in the demography.

Parameters:
  • demo_model (momi.DemographicModel) – Demography to compute expected statistics under.

  • sampled_n_dict (dict) – The number of samples to use per population. SNPs with fewer than this number of samples are ignored. The default is to use the full sample size of the data, i.e. to remove all SNPs with any missing data. For datasets with large amounts of missing data, this could potentially lead to most or all SNPs being removed, so it is important to specify a smaller sample size in such cases.

abba_baba(A, B, C, D=None)

Same as f4()

all_pairs_ibs(fig=True)

Fit the IBS fraction for all pairs of populations, and optionally plot it.

Parameters:

fig (bool) – whether to plot it

Return type:

pandas.DataFrame

f2(A, B)

Computes f2 statistic (A-B)*(A-B)

Parameters:
  • A (str) – First population

  • B (str) – Second population

f3(A, B, O)

Computes f3 statistic (O-A)*(O-B)

Parameters:
  • A (str) – First population

  • B (str) – Second population

  • O (str) – Third population.

f4(A, B, C, D=None)

Returns the ABBA-BABA (f4) statistic for testing admixture.

Parameters:
  • A (str) – First population

  • B (str) – Second population

  • C (str) – Third population

  • D (str) – Fourth population. If None, use ancestral allele.

f_st(A, B)

Returns (pi_between - pi_within) / pi_between, where pi_between, pi_within represent the average number of pairwise diffs between 2 individuals sampled from different or the same population, respectively.

log_abba_baba(A, B, C, D=None)

Returns log(BABA/ABBA) = log(BABA)-log(ABBA)

pattersons_d(A, B, C, D=None)

Returns Patterson’s D, defined as (BABA-ABBA)/(BABA+ABBA).

Parameters:
  • A (str) – First population

  • B (str) – Second population

  • C (str) – Third population

  • D (str) – Fourth population. If None, use ancestral allele.

tensor_prod(derived_weights_dict)

Compute rank-1 tensor products of the SFS, which can be used to express a wide range of SFS-based statistics.

More specifically, this computes the sum

\[\sum_{i,j,\ldots} SFS_{i,j,\ldots} w^{(1)}_i w^{(2)}_j \cdots\]

where \(w^{(1)}_i\) is the weight corresponding to SFS entries with i derived alleles in population 1, etc. Note the SFS is normalized to sum to 1 here (it is a probability).

Parameters:

derived_weights_dict (dict) – Maps leaf populations to vectors (numpy.ndarray). If a population has n samples then the corresponding vector w should have length n+1, with w[i] being the weight for SFS entries with i copies of the derived allele in the population.

Return type:

JackknifeGoodnessFitStat

class momi.JackknifeGoodnessFitStat(expected, observed, jackknifed_array)

Object returned by methods of SfsModelFitStats.

Basic arithmetic operations are supported, allowing to build up complex statistics out of simpler ones.

The raw expected, observed, and jackknifed_array values can be accessed as attributes of this class.

Parameters:
  • expected (float) – the expected value of the statistic

  • observed (float) – the observed value of the statistic

  • jackknifed_array (numpy.ndarray) – array of the jackknifed values of the statistic.

property sd

Standard deviation of the statistic, estimated via jackknife

property z_score

Z-score of the statistic, defined as (observed-expected)/sd