# Statistics

## Introduction

With Desmos, students can investigate the shape, center, and spread of various data sets, run regression to model bivariate data, or (with a little bit of elbow grease) create and explore dynamic displays of important stats topics. You'll find lists at the core of our statistics experience. Start there, then dive deeper with the resources and challenges below.

The calculator provides several functions for computing statistical properties from lists of data, performing basic statistical tests, counting combinations and permutations, working with distributions, and generating random values. These functions are accessible from the "Statistics" and "Distribution" sections in the keypad, or can be typed directly into the expressions list using a keyboard.

## General Statistical Functions

| Function | Name | Result |
| --- | --- | --- |
| total(list) or total(a,b,c,...) | total | Output the sum of a list of numbers. |
| count(list) or count(a,b,c,...) | count | Output the number of elements in a list of numbers. |
| mean(list) or mean(a,b,c,...) or mean(distribution) | mean | Output the mean of a list of numbers. This function will also return the mean of a distribution, if it exists. See the section on distributions below. |
| median(list) or median(a,b,c,...) or median(distribution) | median | Output the median of a list of numbers. This function will also return the median of a distribution, if it exists. See the section on distributions below. |
| min(list) or min(a,b,c,...) | minimum | Output the minimum value contained in a list of numbers. |
| max(list) or max(a,b,c,...) | maximum | Output the maximum value contained in a list of numbers. |
| quartile(list, q) | quartile | Output the qth quartile of list. q must be a number between 0 and 4 (inclusive), otherwise the result will be undefined. Note that this function uses the Moore and McCabe method, which discards the median in odd-length data sets before computing the upper and lower quartiles. |
| quantile(list, q) or quantile(distribution, q) | quantile | Output the qth quantile of list. q must be a number between 0 and 1 (inclusive), otherwise the result will be undefined. Passing a distribution as the first argument to quantile allows you to evaluate its inverse CDF. See the section on distributions below. |
| inversecdf(distribution, q) | inverse CDF | An alias for quantile. See the section on distributions below. |
| ~ | regression | Used for performing regressions. |
| stdev(list) or stdev(a,b,c,...) or stdev(distribution) | standard deviation | Output the sample standard deviation of a list of numbers. This function will also return the standard deviation of a distribution, if it exists. See the section on distributions below. |
| stdevp(list) or stdevp(a,b,c,...) | population standard deviation | Output the population standard deviation of a list of numbers. |
| mad(list) or mad(a,b,c,...) | mean absolute deviation | Output the mean absolute deviation of a list of numbers. |
| var(list) or var(a,b,c,...) or var(distribution) | variance | Output the sample variance of a list of numbers. This function will also return the variance of a distribution, if it exists. See the section on distributions below. |
| varp(list) or varp(a,b,c,...) | population variance | Output the population variance of a list of numbers. |
| cov(list1, list2) | covariance | Output the sample covariance between two lists of numbers. |
| covp(list1, list2) | population covariance | Output the population covariance between two lists of numbers. |
| corr(list1, list2) | Pearson correlation coefficient | Output the Pearson correlation coefficient between two lists of numbers. |
| spearman(list1, list2) | Spearman rank correlation coefficient | Output Spearman's rank correlation coefficient between two lists of numbers. Note that any repeated data values are assigned their average (possibly fractional) rank before computing the correlation. |
| nCr(n, r) | combination | Output the number of r-sized combinations (unordered arrangements) that can be selected from a set of size n. |
| nPr(n, r) | permutation | Output the number of r-sized permutations (ordered arrangements) that can be selected from a set of size n. |
| n! | factorial | Output the factorial of n. |
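
Two method notes in the list above are easier to see with numbers: the Moore and McCabe quartile rule and the average-rank treatment of ties in Spearman's coefficient. The Python sketch below illustrates both under the textbook definitions. The function names are hypothetical, and this is not the calculator's own code.

```python
from math import sqrt
from statistics import median


def quartiles_moore_mccabe(data):
    """Return (Q1, median, Q3), excluding the median from both halves
    when the data set has odd length. Needs at least two values."""
    s = sorted(data)
    n = len(s)
    lower = s[: n // 2]          # values below the middle position
    upper = s[(n + 1) // 2:]     # values above the middle position
    return median(lower), median(s), median(upper)


def average_ranks(values):
    """Assign 1-based ranks; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the run of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1    # average of 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(xs, ys):
    """Pearson correlation; assumes neither list is constant."""
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


def spearman(xs, ys):
    """Spearman's coefficient as the Pearson correlation of average ranks."""
    return pearson(average_ranks(xs), average_ranks(ys))


print(quartiles_moore_mccabe([1, 2, 3, 4, 5, 6, 7]))  # (2, 4, 6)
print(average_ranks([10, 20, 20, 30]))                # [1.0, 2.5, 2.5, 4.0]
```

In the odd-length example the median 4 belongs to neither half, so the quartiles are the medians of 1, 2, 3 and 5, 6, 7.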

## Statistical Tests

ttest(list, value = 0)

Perform a one-sample t-test of whether the mean of the population from which list is sampled differs from value (the null hypothesis). The output includes p-values for both the one-tailed versions (labeled "less than" and "greater than") and the two-tailed version (labeled "not equal") of the test. Note that if the second argument is omitted the hypothesized mean defaults to 0.

tscore(list, value = 0)

Output the raw test statistic used in the one-sample ttest function.
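
For reference, the textbook one-sample t statistic is the difference between the sample mean and the hypothesized mean, divided by the standard error of the mean. The Python sketch below assumes tscore follows that standard definition, which this page does not spell out. The function name `t_score` is hypothetical.

```python
from math import sqrt
from statistics import mean, stdev


def t_score(data, value=0.0):
    """Textbook one-sample t statistic: (mean - value) / (s / sqrt(n)).

    Uses the sample standard deviation (n - 1 denominator) and needs
    at least two data points with nonzero spread.
    """
    n = len(data)
    standard_error = stdev(data) / sqrt(n)
    return (mean(data) - value) / standard_error


t = t_score([1, 2, 3, 4, 5], value=2)
print(round(t, 4))  # 1.4142
```

The p-values reported by a t-test come from comparing this statistic with a t distribution on n - 1 degrees of freedom.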

ittest(list1, list2)

Perform an independent (unpaired) two-sample t-test of whether the mean of the population from which list1 is sampled differs from the mean of the population from which list2 is sampled. The output includes p-values for both the one-tailed versions (labeled "less than" and "greater than") and the two-tailed version (labeled "not equal") of the test. Note that, while the sample sizes may differ (list1 and list2 need not have equal length), this test does assume that the underlying populations have equal variance.
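
The equal-variance assumption corresponds to the classic pooled two-sample t statistic, which combines both sample variances into one estimate. The Python sketch below shows that textbook calculation. It is an illustration under that assumption rather than the calculator's implementation, and `pooled_t` is a hypothetical name.

```python
from math import sqrt
from statistics import mean, variance


def pooled_t(list1, list2):
    """Return (t, degrees_of_freedom) for the pooled two-sample t-test.

    Each list needs at least two values; the lengths may differ.
    """
    n1, n2 = len(list1), len(list2)
    df = n1 + n2 - 2
    # Pooled variance: sample variances weighted by their degrees of freedom.
    pooled_var = ((n1 - 1) * variance(list1) + (n2 - 1) * variance(list2)) / df
    standard_error = sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (mean(list1) - mean(list2)) / standard_error, df


t, df = pooled_t([1, 2, 3, 4, 5], [2, 4, 6])
print(round(t, 4), df)  # -0.7906 6
```

Here the sample variances are 2.5 and 4, so the pooled variance is (4 × 2.5 + 2 × 4) / 6 = 3.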

## Distributions

The calculator can plot probability density functions (PDFs) or probability mass functions (PMFs) and cumulative distribution functions (CDFs), and can compute cumulative probabilities, for each distribution it supports.
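
To make the relationship between a PDF, a CDF, and an inverse CDF concrete, the sketch below uses the `NormalDist` class from Python's standard library as a stand-in. It is an analogy only: the Python names are not Desmos functions, and the standard normal distribution is chosen purely for illustration.

```python
from statistics import NormalDist

# Standard normal distribution: mean 0, standard deviation 1.
z = NormalDist(mu=0.0, sigma=1.0)

density_at_zero = z.pdf(0.0)           # height of the PDF at x = 0
prob_below_zero = z.cdf(0.0)           # cumulative probability P(X <= 0)
prob_within_one = z.cdf(1.0) - z.cdf(-1.0)  # P(-1 < X <= 1) from two CDF values
upper_cutoff = z.inv_cdf(0.975)        # inverse CDF: x with P(X <= x) = 0.975

print(round(density_at_zero, 4))  # 0.3989
print(round(prob_below_zero, 4))  # 0.5
print(round(prob_within_one, 4))  # 0.6827
print(round(upper_cutoff, 4))     # 1.96
```

The inverse CDF undoes the CDF, which is why quantile(distribution, q) and inversecdf(distribution, q) are described as the same operation.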

## Generating Random Values

The calculator offers a single function called random() for generating different kinds of random values in different contexts, depending on the provided arguments. For example, calling random() on a list will uniformly select elements from the list, and calling random() on a distribution will sample numbers with a frequency defined by that distribution. Regardless of the context, calling random() without additional arguments will return a single value; calling random(n) with a single additional argument will return a list of n values; and calling random(n, seed) will return a list of n values, using seed to influence the random number generator. See the note on seeds below.

| Type | Result |
| --- | --- |
| random() | Generate a random value sampled uniformly from the interval [0,1). |
| random(n) | Generate a list of n random values sampled uniformly from the interval [0,1). |
| random(n, seed) | Generate a list of n random values sampled uniformly from the interval [0,1), using seed to influence the random number generator. |
| list.random() | Return a single item selected uniformly from list. |
| list.random(n) | Return a list of n items selected uniformly (with replacement) from list. |
| distribution.random() | Return a single random number sampled from distribution. |
| distribution.random(n) | Return a list of n samples drawn from distribution. |
| distribution.random(n, seed) | Return a list of n samples drawn from distribution, using seed to influence the random number generator. |
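
For readers who know Python, the three kinds of sampling described here have close analogs in the standard `random` module. The sketch below is only a comparison: the helper names are hypothetical, and the normal distribution stands in for "a distribution" as an assumption.

```python
import random

rng = random.Random()  # an independent pseudorandom generator


def uniform_values(n):
    """n values drawn uniformly from [0, 1), like random(n)."""
    return [rng.random() for _ in range(n)]


def sample_list(items, n):
    """n items drawn uniformly with replacement, like list.random(n)."""
    return rng.choices(items, k=n)


def sample_normal(mu, sigma, n):
    """n draws from a normal distribution, like distribution.random(n)."""
    return [rng.gauss(mu, sigma) for _ in range(n)]


print(uniform_values(3))
print(sample_list([2, 4, 6, 8], 5))
print(sample_normal(0.0, 1.0, 3))
```

Because the list sampling is done with replacement, the same item can appear more than once and n may exceed the length of the list.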

### A Note on Random Seeds

Like many computer programs, Desmos uses a pseudorandom number generator (PRNG) to produce sequences of numbers that are in practice reasonably indistinguishable from random even though they are in fact deterministic. The sequence of numbers produced by a PRNG is fixed by an initial value called its seed. The calculator does most of the work of creating and managing these seeds for you, but also offers two ways for you to force a seed update.

1. If any expression contains a random() call, a small "randomize" icon will appear at the top of the expressions list. Clicking it will set the global seed to a new value, which will simultaneously re-randomize all expressions that use random().
2. It is also possible to pass an optional seed argument to any individual random() call, which will only affect the seed for that specific call. This is mainly useful because it allows you to re-randomize a single expression in response to updates elsewhere in the expressions list. For instance, using a slider as a seed argument allows you to generate new random values whenever the slider is moved, perhaps in an animation.

It's important to note that the seed argument you pass to random() is only one small part of the overall seed consumed by the PRNG. The other parts are beyond user control, and are not necessarily stable as expressions are edited, so you should not rely on random() results being reproducible. Specifying a seed argument to the random function allows you to incorporate additional information into the seed (such as the value of a slider), but does not give you full control over the seed or the value produced by random().
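
One way to picture a seed argument that is only part of the overall seed is the toy model below, written in Python. Everything in it is an assumption made for illustration: the names `global_seed`, `expression_id`, and `seed_arg`, and the way they are combined, are hypothetical and do not describe how Desmos actually builds its seeds.

```python
import random


def sample(global_seed, expression_id, n, seed_arg=None):
    """Toy model: the PRNG seed mixes a global part, a per-expression
    part, and the optional user-supplied seed argument."""
    combined = f"{global_seed}|{expression_id}|{seed_arg}"
    rng = random.Random(combined)  # string seeds are hashed deterministically
    return [rng.random() for _ in range(n)]


# Identical inputs reproduce the same values: the generator is deterministic.
a = sample("g1", "expr-1", 3, seed_arg=1)
b = sample("g1", "expr-1", 3, seed_arg=1)

# Changing only the seed argument (for example, moving a slider)
# re-randomizes this one call.
c = sample("g1", "expr-1", 3, seed_arg=2)

# Changing the global part (as the "randomize" icon does) changes the
# output even though the seed argument is unchanged.
d = sample("g2", "expr-1", 3, seed_arg=1)

print(a == b, a == c, a == d)  # True False False
```

In this model the user controls only `seed_arg`, so fixing it does not pin down the output, which matches the caution above about not relying on reproducible results.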