
Statistical inference

Statistical inference is the process of using data analysis to infer properties of an underlying probability distribution.[1] Inferential statistical analysis infers properties of a population, for example by testing hypotheses and deriving estimates. It is assumed that the observed data set is sampled from a larger population.

Not to be confused with Statistical interference.

Inferential statistics can be contrasted with descriptive statistics. Descriptive statistics is solely concerned with properties of the observed data, and it does not rest on the assumption that the data come from a larger population. In machine learning, the term inference is sometimes used instead to mean "make a prediction, by evaluating an already trained model";[2] in this context inferring properties of the model is referred to as training or learning (rather than inference), and using a model for prediction is referred to as inference (instead of prediction); see also predictive inference.
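To make the terminological contrast concrete, the following is a minimal, hypothetical Python sketch (not drawn from the article; the data and model are invented). In machine-learning usage, fitting the model would be called training or learning, and evaluating it on new inputs would be called inference; in statistical usage, "inference" would instead refer to statements about the fitted coefficients themselves.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.uniform(0, 1, size=100)
y = 2.0 * x + 1.0 + rng.normal(scale=0.1, size=100)   # hypothetical data

# "Training" / "learning" (ML usage): estimate the model's parameters from data.
slope, intercept = np.polyfit(x, y, deg=1)

# "Inference" (ML usage): evaluate the trained model on new inputs, i.e. predict.
x_new = np.array([0.25, 0.75])
y_pred = slope * x_new + intercept

# "Inference" (statistical usage) would instead concern the parameters themselves,
# e.g. interval estimates for the slope and intercept.
print(slope, intercept, y_pred)
```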

The conclusion of a statistical inference is a statistical proposition.[6] Some common forms of statistical proposition are the following (a brief code sketch illustrating the first three follows this list):

- a point estimate, i.e. a particular value that best approximates some parameter of interest;
- an interval estimate, e.g. a confidence interval (or set estimate), i.e. an interval constructed using a dataset drawn from a population so that, under repeated sampling of such datasets, such intervals would contain the true parameter value with the probability at the stated confidence level;
- a credible interval, i.e. a set of values containing, for example, 95% of posterior belief;
- rejection of a hypothesis;[note 1]
- clustering or classification of data points into groups.
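As a concrete, entirely hypothetical illustration of the first three kinds of proposition, the Python sketch below draws samples from an assumed normal population, computes a point estimate and a 95% confidence interval for the mean, checks the interval's repeated-sampling coverage, and forms a simple 95% credible interval under a flat prior with the population variance treated as known. The population parameters, sample size, and number of repetitions are arbitrary choices made for the example.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
true_mu, sigma, n = 10.0, 2.0, 50          # hypothetical population and sample size

def mean_ci(sample, confidence=0.95):
    """Point estimate of the mean and a t-based confidence interval."""
    m = sample.mean()
    se = sample.std(ddof=1) / np.sqrt(len(sample))
    t = stats.t.ppf(0.5 + confidence / 2, df=len(sample) - 1)
    return m, (m - t * se, m + t * se)

sample = rng.normal(true_mu, sigma, size=n)
point, (lo, hi) = mean_ci(sample)
print(f"point estimate: {point:.3f}, 95% confidence interval: ({lo:.3f}, {hi:.3f})")

# "Under repeated sampling": across many datasets drawn the same way, roughly
# 95% of such intervals should contain the true parameter value.
reps = 2000
covered = 0
for _ in range(reps):
    _, (l, h) = mean_ci(rng.normal(true_mu, sigma, size=n))
    covered += l <= true_mu <= h
print(f"empirical coverage over {reps} repeated datasets: {covered / reps:.3f}")

# A 95% credible interval for the mean, assuming sigma known and a flat prior:
# the posterior is Normal(sample mean, sigma^2 / n), so take its central 95%.
post_sd = sigma / np.sqrt(n)
cred = stats.norm.interval(0.95, loc=sample.mean(), scale=post_sd)
print(f"95% credible interval (flat prior, sigma known): ({cred[0]:.3f}, {cred[1]:.3f})")
```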

Statistical inference makes propositions about a population, using data drawn from the population with some form of sampling. Given a hypothesis about a population, for which we wish to draw inferences, statistical inference consists of (first) selecting a statistical model of the process that generates the data and (second) deducing propositions from the model.[3]


Konishi & Kitagawa state, "The majority of the problems in statistical inference can be considered to be problems related to statistical modeling".[4] Relatedly, Sir David Cox has said, "How [the] translation from subject-matter problem to statistical model is done is often the most critical part of an analysis".[5]


Statisticians distinguish between three levels of modeling assumptions (illustrated by a brief code sketch after the three descriptions below):

Fully parametric: The probability distributions describing the data-generation process are assumed to be fully described by a family of probability distributions involving only a finite number of unknown parameters.[7] For example, one may assume that the distribution of population values is truly Normal, with unknown mean and variance, and that datasets are generated by 'simple' random sampling. The family of generalized linear models is a widely used and flexible class of parametric models.

Non-parametric: The assumptions made about the process generating the data are much less than in parametric statistics and may be minimal.[9] For example, every continuous probability distribution has a median, which may be estimated using the sample median or the Hodges–Lehmann–Sen estimator, which has good properties when the data arise from simple random sampling.

Semi-parametric: This term typically implies assumptions 'in between' fully and non-parametric approaches. For example, one may assume that a population distribution has a finite mean. Furthermore, one may assume that the mean response level in the population depends in a truly linear manner on some covariate (a parametric assumption) but not make any parametric assumption describing the variance around that mean (i.e. about the presence or possible form of any heteroscedasticity). More generally, semi-parametric models can often be separated into 'structural' and 'random variation' components. One component is treated parametrically and the other non-parametrically. The well-known Cox model is a set of semi-parametric assumptions.
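The following NumPy-only sketch illustrates one estimator of each kind on invented data. The semi-parametric part is deliberately simpler than the Cox model mentioned above: it assumes a linear mean (the parametric component) while making no assumption about the error variance, and uses a heteroscedasticity-robust (sandwich) variance estimate for the slope. All numerical settings are hypothetical.

```python
import numpy as np
from itertools import combinations_with_replacement

rng = np.random.default_rng(0)

# --- Fully parametric: Normal(mu, sigma^2) model, maximum-likelihood estimates ---
normal_sample = rng.normal(loc=5.0, scale=3.0, size=200)
mu_hat = normal_sample.mean()                 # MLE of the mean
sigma2_hat = normal_sample.var(ddof=0)        # MLE of the variance (1/n form)
print(f"parametric MLE: mu = {mu_hat:.2f}, sigma^2 = {sigma2_hat:.2f}")

# --- Non-parametric: estimate a median without assuming any distributional family ---
heavy_tailed = 3.0 + rng.standard_t(df=3, size=100)
sample_median = np.median(heavy_tailed)
# Hodges-Lehmann estimator: median of all pairwise (Walsh) averages (x_i + x_j) / 2.
walsh = [(a + b) / 2 for a, b in combinations_with_replacement(heavy_tailed, 2)]
hodges_lehmann = np.median(walsh)
print(f"non-parametric: median = {sample_median:.2f}, Hodges-Lehmann = {hodges_lehmann:.2f}")

# --- Semi-parametric: linear mean (parametric part) with no model for the variance ---
n = 500
x = rng.uniform(0, 10, size=n)
y = 1.5 + 0.8 * x + rng.normal(scale=0.5 + 0.3 * x)   # deliberately heteroscedastic noise
X = np.column_stack([np.ones(n), x])
beta_hat = np.linalg.lstsq(X, y, rcond=None)[0]       # least-squares fit of the mean
resid = y - X @ beta_hat
bread = np.linalg.inv(X.T @ X)
meat = X.T @ (X * (resid ** 2)[:, None])              # sandwich "meat"
robust_se = np.sqrt(np.diag(bread @ meat @ bread))    # heteroscedasticity-robust SEs
print(f"semi-parametric: slope = {beta_hat[1]:.2f} (robust SE {robust_se[1]:.3f})")
```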


Predictive inference

Predictive inference is an approach to statistical inference that emphasizes the prediction of future observations based on past observations.


Initially, predictive inference was based on observable parameters and it was the main purpose of studying probability, but it fell out of favor in the 20th century due to a new parametric approach pioneered by Bruno de Finetti. The approach modeled phenomena as a physical system observed with error (e.g., celestial mechanics). De Finetti's idea of exchangeability—that future observations should behave like past observations—came to the attention of the English-speaking world with the 1974 translation from French of his 1937 paper,[63] and has since been propounded by such statisticians as Seymour Geisser.[64]
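As a minimal sketch of a predictive statement, the following Python example (hypothetical data; the normal-model assumption is an illustration, not taken from the article) computes a 95% prediction interval for a single future observation given a set of past observations.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(4)
past = rng.normal(loc=20.0, scale=4.0, size=30)   # hypothetical past observations

# Under a Normal model, a 95% prediction interval for one future observation
# widens the usual interval for the mean by a sqrt(1 + 1/n) factor, because it
# must also cover the new observation's own variability.
n = len(past)
m, s = past.mean(), past.std(ddof=1)
t = stats.t.ppf(0.975, df=n - 1)
half_width = t * s * np.sqrt(1 + 1 / n)
print(f"95% prediction interval for the next observation: "
      f"({m - half_width:.2f}, {m + half_width:.2f})")
```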

See also

Algorithmic inference

Induction (philosophy)

Informal inferential reasoning

Information field theory

Population proportion

Philosophy of statistics

Prediction interval

Predictive analytics

Predictive modelling

Stylometry

Further reading

Casella, G.; Berger, R. L. (2002). Statistical Inference. Duxbury Press. ISBN 0-534-24312-6.

Freedman, D. A. (1991). "Statistical models and shoe leather". Sociological Methodology. 21: 291–313. doi:10.2307/270939. JSTOR 270939.

Held L., Bové D.S. (2014). Applied Statistical Inference—Likelihood and Bayes (Springer).

Lenhard, Johannes (2006). "Models and Statistical Inference: the controversy between Fisher and Neyman–Pearson" (PDF). British Journal for the Philosophy of Science. 57: 69–91. doi:10.1093/bjps/axi152. S2CID 14136146.

Lindley, D. (1958). "Fiducial distribution and Bayes' theorem". Journal of the Royal Statistical Society, Series B. 20: 102–7. doi:10.1111/j.2517-6161.1958.tb00278.x.

Rahlf, Thomas (2014). "Statistical Inference", in Claude Diebolt and Michael Haupert (eds.), Handbook of Cliometrics (Springer Reference Series), Berlin/Heidelberg: Springer.

Reid, N.; Cox, D. R. (2014). "On Some Principles of Statistical Inference". International Statistical Review. 83 (2): 293–308. doi:10.1111/insr.12067. S2CID 17410547.

Sagitov, Serik (2022). "Statistical Inference". Wikibooks. http://upload.wikimedia.org/wikipedia/commons/f/f9/Statistical_Inference.pdf

Young, G. A.; Smith, R. L. (2005). Essentials of Statistical Inference, CUP. ISBN 0-521-83971-8.

External links

Statistical Inference – lecture on the MIT OpenCourseWare platform

Statistical Inference – lecture by the National Programme on Technology Enhanced Learning

An online, Bayesian (MCMC) demo/calculator is available at causaScientia.