SIX SIGMA Glossary: Statistical Concepts

Central Limit Theorem

If you:

take a large number of samples from a population that does not conform to a normal distribution
calculate the mean of each of those samples
find the shape of the distribution formed by these sample means

You will find that the distribution of the sample means resembles a normal distribution. The larger the number of items in each sample, the better the approximation.
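The effect can be checked with a small simulation. The sketch below is illustrative only: the exponential population, the sample sizes, and the helper names (draw_skewed, sample_means, skewness) are assumptions chosen for the demonstration. It uses the skewness coefficient, which is zero for a normal distribution, as a rough measure of how close the sample means are to normal.

```python
import random
import statistics

def draw_skewed(rng):
    """One item from an exponential population (strongly right-skewed)."""
    return rng.expovariate(1.0)

def sample_means(sample_size, num_samples, rng):
    """Means of num_samples samples, each holding sample_size items."""
    return [
        statistics.fmean([draw_skewed(rng) for _ in range(sample_size)])
        for _ in range(num_samples)
    ]

def skewness(values):
    """Skewness coefficient: close to 0 for a normal distribution."""
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values)
    return statistics.fmean([((v - mean) / sd) ** 3 for v in values])

rng = random.Random(1)  # fixed seed so the run is repeatable
skew_by_size = {}
for size in (1, 5, 30):
    skew_by_size[size] = skewness(sample_means(size, 5000, rng))
    print(f"items per sample = {size:2d}: "
          f"skewness of sample means = {skew_by_size[size]:.2f}")
```

The skewness should shrink towards zero as the number of items per sample grows, which is the behaviour the theorem describes.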

The Central Limit Theorem is of considerable practical importance because many methods used in inferential statistics rely on the samples being taken from a population that conforms to a normal distribution. Many populations do not conform to a normal distribution, but this can be overcome by using the means of samples taken from the population. Control charts are a good example of this.

Confidence Interval

A random sample taken from a population is used to estimate the population mean. The sample mean is a point estimate, and is unlikely to exactly equal the true population mean.

The confidence interval defines a band around the sample mean within which the true population mean will lie, to some degree of confidence.

For example, if many samples were taken and a 95% confidence interval calculated from each, about 95% of those intervals would contain the true population mean. The method used to calculate the confidence interval will vary, but usually involves the normal distribution for large samples, or the t-distribution for small samples.

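The 100(1-a)% confidence interval for the mean of a small sample (t distribution) is:

\bar{x} \pm t_{a/2,\,n-1} \frac{s}{\sqrt{n}}

where x-bar is the sample mean, 's' is the sample standard deviation, 'n' is the sample size, and t_{a/2, n-1} is the value of the t distribution with n-1 degrees of freedom that leaves a probability of a/2 in the upper tail.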
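As a rough illustration, the small-sample interval can be computed as below, assuming the usual form x-bar plus or minus t times s / sqrt(n). The function name t_confidence_interval and the measurements are hypothetical, and the critical value of t is taken from a standard t table rather than calculated.

```python
import math
import statistics

def t_confidence_interval(sample, t_crit):
    """Return (lower, upper) for the mean: x-bar +/- t_crit * s / sqrt(n)."""
    n = len(sample)
    mean = statistics.fmean(sample)
    s = statistics.stdev(sample)            # divides by n-1
    half_width = t_crit * s / math.sqrt(n)
    return mean - half_width, mean + half_width

# Hypothetical measurements, made up for illustration only.
data = [9.8, 10.2, 10.1, 9.9, 10.0]

# Two-sided 95% critical value of t for n-1 = 4 degrees of freedom,
# read from a standard t table.
T_95_DF4 = 2.776

lower, upper = t_confidence_interval(data, T_95_DF4)
print(f"95% confidence interval for the mean: ({lower:.3f}, {upper:.3f})")
```

Passing the critical value in as a parameter keeps the sketch free of third-party libraries. A statistics package would normally look it up from the confidence level and the degrees of freedom.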

Degrees of Freedom

The number of independent data values that are used in estimating the value of a population parameter.

The number of degrees of freedom in the standard deviation formula is n-1:

s = \sqrt{\frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n-1}}

If 'n' were used, instead of 'n-1', the value of 's' would be biased; the standard deviation calculated from small samples would underestimate the population standard deviation.

The number of degrees of freedom is 'n-1' because only 'n-1' of the data values 'x_i' are independent; if any 'n-1' are known then the other can be calculated (using x-bar).

The mean x-bar is an estimate of the true population mean and was calculated using the same x_i values that are being used in the standard deviation calculation. It can be shown that, because of this, errors between the estimate x-bar and the true population mean tend to bias the value of 's'.
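The bias can be seen in a short simulation. This is a sketch under assumed values: the normal population, its standard deviation SIGMA, the sample size N, and the number of trials are all chosen for illustration. It compares the average of the variance estimates obtained by dividing by n with those obtained by dividing by n-1.

```python
import random
import statistics

rng = random.Random(2)   # fixed seed so the run is repeatable
SIGMA = 2.0              # assumed population standard deviation (variance 4.0)
N = 5                    # small sample size
TRIALS = 10000

var_n, var_n_minus_1 = [], []
for _ in range(TRIALS):
    sample = [rng.gauss(50.0, SIGMA) for _ in range(N)]
    var_n.append(statistics.pvariance(sample))         # divides by n
    var_n_minus_1.append(statistics.variance(sample))  # divides by n-1

mean_var_n = statistics.fmean(var_n)
mean_var_n_minus_1 = statistics.fmean(var_n_minus_1)
print(f"true variance:              {SIGMA ** 2:.2f}")
print(f"average estimate using n:   {mean_var_n:.2f}")
print(f"average estimate using n-1: {mean_var_n_minus_1:.2f}")
```

On average the 'n' version should fall short of the true variance by a factor of about (n-1)/n, while the 'n-1' version should centre on it.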

Efficiency of Estimators

An estimator is a statistic, calculated from a sample, that is used to estimate a property of a population. Several estimators may be available to represent a particular property.

When selecting the one you prefer you might consider the efficiency and bias of the alternatives.

The most efficient estimator is the one that gives the lowest expected variance of the error. Technically, the efficiency is the lowest possible variance from any estimator divided by the expected variance of the selected estimator.

The bias is the expected (average) difference between the estimator value and the actual population value.
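As an illustration, the sample mean and the sample median can be compared as estimators of the centre of a normal population. The sketch below assumes a standard normal population, a sample size of 25, and 5000 trials, all chosen for the demonstration. For a normal population the sample mean has the lowest variance of any unbiased estimator, so the ratio of the two variances approximates the efficiency of the median.

```python
import random
import statistics

rng = random.Random(3)   # fixed seed so the run is repeatable
TRUE_CENTRE = 0.0        # assumed population mean (and median)
N = 25
TRIALS = 5000

mean_estimates, median_estimates = [], []
for _ in range(TRIALS):
    sample = [rng.gauss(TRUE_CENTRE, 1.0) for _ in range(N)]
    mean_estimates.append(statistics.fmean(sample))
    median_estimates.append(statistics.median(sample))

# Variance of each estimator across the repeated samples.
var_mean = statistics.pvariance(mean_estimates)
var_median = statistics.pvariance(median_estimates)
relative_efficiency = var_mean / var_median

# Bias: average difference between the estimate and the true value.
bias_mean = statistics.fmean(mean_estimates) - TRUE_CENTRE
bias_median = statistics.fmean(median_estimates) - TRUE_CENTRE

print(f"variance of sample mean:   {var_mean:.4f}")
print(f"variance of sample median: {var_median:.4f}")
print(f"efficiency of the median relative to the mean: {relative_efficiency:.2f}")
print(f"bias of mean: {bias_mean:+.4f}, bias of median: {bias_median:+.4f}")
```

Both estimators should show a bias close to zero here, but the median should vary more from sample to sample, so it is the less efficient choice for this population.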

Population

The entire collection of the items under study. In inferential statistics the population under study might be the hypothetical future output of a process, given certain parameter values.

Prediction Interval

The confidence interval is used to predict the interval within which the population mean falls. The prediction interval is used to predict the interval within which a single future observation will fall.

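The 100(1-a)% prediction interval for a small sample (t distribution) is:

\bar{x} \pm t_{a/2,\,n-1} \, s \sqrt{1 + \frac{1}{n}}

where x-bar is the sample mean, 's' is the sample standard deviation, 'n' is the sample size, and t_{a/2, n-1} is the value of the t distribution with n-1 degrees of freedom that leaves a probability of a/2 in the upper tail.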
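A minimal sketch of the calculation follows, assuming the usual form x-bar plus or minus t times s times sqrt(1 + 1/n). The function name t_prediction_interval and the measurements are hypothetical, and the critical value of t is taken from a standard t table. The half-width of the confidence interval for the mean is printed alongside for comparison.

```python
import math
import statistics

def t_prediction_interval(sample, t_crit):
    """Return (lower, upper) for one future observation:
    x-bar +/- t_crit * s * sqrt(1 + 1/n)."""
    n = len(sample)
    mean = statistics.fmean(sample)
    s = statistics.stdev(sample)            # divides by n-1
    half_width = t_crit * s * math.sqrt(1 + 1 / n)
    return mean - half_width, mean + half_width

# Hypothetical measurements, made up for illustration only.
data = [9.8, 10.2, 10.1, 9.9, 10.0]

# Two-sided 95% critical value of t for n-1 = 4 degrees of freedom,
# read from a standard t table.
T_95_DF4 = 2.776

pred_lower, pred_upper = t_prediction_interval(data, T_95_DF4)
pred_half_width = (pred_upper - pred_lower) / 2
ci_half_width = T_95_DF4 * statistics.stdev(data) / math.sqrt(len(data))

print(f"95% prediction interval: ({pred_lower:.3f}, {pred_upper:.3f})")
print(f"prediction half-width {pred_half_width:.3f} "
      f"vs confidence half-width {ci_half_width:.3f}")
```

The prediction interval is always the wider of the two, because it has to allow for the scatter of an individual observation as well as the uncertainty in the estimated mean.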

Standard Error

The standard deviation of the mean of a sample. If you:

take a large number of samples, of equal size, from a population
calculate the mean of each sample
calculate the standard deviation of the sample means

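you will have found the standard error. The standard error is related to the population (process) standard deviation by:

\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}}

where \sigma_{\bar{x}} is the standard error, \sigma is the population standard deviation and 'n' is the sample size.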
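The three steps above can be sketched as a simulation and compared with the population standard deviation divided by the square root of the sample size. The normal population, its parameters, and the number of samples are assumptions chosen for illustration.

```python
import math
import random
import statistics

rng = random.Random(4)   # fixed seed so the run is repeatable
SIGMA = 3.0              # assumed population (process) standard deviation
N = 9                    # items in each sample
NUM_SAMPLES = 10000

# Steps 1 and 2: take many samples of equal size and find each mean.
means = [
    statistics.fmean([rng.gauss(100.0, SIGMA) for _ in range(N)])
    for _ in range(NUM_SAMPLES)
]

# Step 3: the standard deviation of the sample means.
observed_se = statistics.stdev(means)
theoretical_se = SIGMA / math.sqrt(N)

print(f"standard deviation of sample means: {observed_se:.3f}")
print(f"sigma / sqrt(n):                    {theoretical_se:.3f}")
```

The two values should agree closely, and the agreement should improve as the number of samples increases.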

Links & Copyright

If you want to link to this glossary you are welcome but please let us know.

This glossary is copyright. We will take action against anybody who downloads, copies or otherwise breaches the copyright law.