The statistical study of data deals with two fundamental questions: How can we describe and understand a situation when we have all the pertinent data about it? How can we infer features of all the data when we know only some of the data?
The first three rules of statistics should be: Draw a picture, draw a picture, draw a picture. A visual representation of data reveals patterns and relationships, for example, the distribution of one variable, or an association between two variables.
The logic of statistical inference is to compare data that we collect to expectations about what the data would be if the world were random in some particular respect. Randomness and probability are the cornerstones of all methods for testing hypotheses.
This lecture defines and explores standard deviation, which measures how widely data are spread from the mean. Different measures of dispersion have different properties, and those properties determine which measure is best to use in a given setting.
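As a concrete illustration (the numbers below are invented, and the lecture itself may compute this differently), here is a minimal Python sketch of the sample standard deviation:

```python
import math

data = [4, 8, 6, 5, 3, 7]  # hypothetical data set

mean = sum(data) / len(data)
# Sample variance: average squared deviation from the mean,
# divided by n - 1 (Bessel's correction).
variance = sum((x - mean) ** 2 for x in data) / (len(data) - 1)
std_dev = math.sqrt(variance)

print(f"mean = {mean:.2f}, standard deviation = {std_dev:.2f}")
```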
A curve of virtually any shape can model a data set. This lecture looks at skewed and bimodal shapes, and describes other characteristically shaped classes of distributions, including exponential and Poisson. Each shape arises naturally in specific settings.
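A sketch of two of these shapes, assuming only Python's standard library and arbitrary parameters of my own choosing:

```python
import random
from collections import Counter

random.seed(1)

# Exponential: waiting times between random events (mean 2.0 here).
waits = [random.expovariate(1 / 2.0) for _ in range(10_000)]
print(f"exponential sample mean ~ {sum(waits) / len(waits):.2f}")

# Poisson: counts of events per unit interval. The standard library
# has no Poisson generator, so count exponential waits fitting in one unit.
def poisson_draw(rate):
    count, total = 0, random.expovariate(rate)
    while total < 1.0:
        count += 1
        total += random.expovariate(rate)
    return count

counts = Counter(poisson_draw(3.0) for _ in range(10_000))
for k in sorted(counts):
    print(k, "#" * (counts[k] // 200))  # crude text histogram
```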
The most famous shape of distributions is the bell-shaped curve, also called a normal curve or a Gaussian distribution. This lecture explores its properties and why it arises so frequently, as in the central limit theorem, one of the core insights on which statistical inference is based.
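A minimal simulation sketch of the central limit theorem (my illustration, not the lecture's): averages of many draws from a decidedly non-bell-shaped distribution still pile up in a bell shape around the true mean.

```python
import random
from collections import Counter

random.seed(0)

# Roll a fair die (uniform, not bell-shaped) and average 50 rolls;
# repeat many times and look at how the averages distribute.
def sample_mean(n_rolls=50):
    return sum(random.randint(1, 6) for _ in range(n_rolls)) / n_rolls

means = [sample_mean() for _ in range(20_000)]

# Crude text histogram: the averages cluster symmetrically near 3.5.
bins = Counter(round(m, 1) for m in means)
for b in sorted(bins):
    print(f"{b:3.1f} {'#' * (bins[b] // 100)}")
```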
One way we attempt to understand the world is to identify cases of cause and effect. In statistics, the challenge is to describe and measure the relationship between two variables, for example, incoming SAT scores and college grade point averages.
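To make the measurement concrete, here is a sketch of the Pearson correlation coefficient, the standard measure of linear association; the SAT and GPA figures are invented for illustration:

```python
import math

# Hypothetical paired data: SAT scores and college GPAs.
sat = [1100, 1250, 1320, 1400, 1180, 1500]
gpa = [2.8, 3.1, 3.4, 3.6, 3.0, 3.9]

n = len(sat)
mx, my = sum(sat) / n, sum(gpa) / n

# Pearson r: covariance divided by the product of the spreads.
cov = sum((x - mx) * (y - my) for x, y in zip(sat, gpa))
sx = math.sqrt(sum((x - mx) ** 2 for x in sat))
sy = math.sqrt(sum((y - my) ** 2 for y in gpa))

r = cov / (sx * sy)
print(f"correlation r = {r:.3f}")  # near +1: strong positive association
```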
Probability accomplishes the seemingly impossible feat of putting a useful, numerical value on the likelihood of random events. Our intuition about what to expect from randomness is often far from accurate. This lecture looks at several examples that place intuition and reality far apart.
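One classic case of intuition and reality far apart (chosen here for illustration; the lecture's own examples may differ) is the birthday problem: among just 23 people, a shared birthday is more likely than not. A quick simulation:

```python
import random

random.seed(42)

def has_shared_birthday(group_size=23):
    # Assign each person a random birthday (ignoring leap years).
    birthdays = [random.randint(1, 365) for _ in range(group_size)]
    return len(set(birthdays)) < group_size

trials = 100_000
hits = sum(has_shared_birthday() for _ in range(trials))
print(f"P(shared birthday among 23) ~ {hits / trials:.3f}")  # about 0.507
```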
Sampling is a technique for inferring features of a whole population from information about some of its members. A familiar example is a political poll. Interesting issues and problems arise in taking and using samples. Examples of potential pitfalls are explored.
This lecture introduces a fundamental strategy of statistical inference called hypothesis testing. The method involves assessing whether observed data are consistent with a claim about the population in order to determine whether the claim might be false. Drug testing is a common application.
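A minimal sketch of the logic, using an invented drug trial: under the null hypothesis the drug does nothing, and we ask how surprising the observed data would be if only chance were at work.

```python
from math import comb

# Hypothetical trial: 100 patients, 62 improve. Null hypothesis:
# the drug is useless and each patient improves with probability 0.5.
n, observed = 100, 62

# One-sided p-value: probability of seeing 62 or more improvements
# by chance alone under the null (binomial tail probability).
p_value = sum(comb(n, k) * 0.5 ** n for k in range(observed, n + 1))
print(f"p-value = {p_value:.4f}")  # roughly 0.01: surprising if the drug does nothing
```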
Headlines at election time frequently trumpet statistics such as: "Candidate A will receive 59 percent of the vote, with a margin of error of plus or minus 3 percent." This lecture investigates what this "margin of error" statement means and why it is incomplete as written.
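What the headline omits is the confidence level. A sketch of the standard calculation for a proportion, assuming a simple random sample and the usual unstated default of 95 percent confidence (sample size invented):

```python
import math

p_hat = 0.59   # reported sample proportion
n = 1067       # hypothetical sample size

# 95% margin of error for a proportion: z * sqrt(p(1-p)/n), z ~ 1.96.
z = 1.96
moe = z * math.sqrt(p_hat * (1 - p_hat) / n)
print(f"margin of error = +/- {moe * 100:.1f} percentage points")
# The "+/- 3 percent" claim is incomplete without the confidence
# level: even at 95% confidence, about 1 poll in 20 will still miss
# by more than the stated margin.
```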
To gather data from which conclusions can be drawn confidently, it's important to think ahead. Double-blind experiments and other strategies can help meet the goal of good experimental design.
Opening the second part of the course, which deals with applying statistics, this lecture focuses on two examples of courtroom drama: a hit-and-run accident and a gender-discrimination case. In both, the analysis of statistics aids in reaching a fair verdict.
An election assembles individual opinions into one societal decision. This lecture considers a surprising reality about elections: The outcome may have less to do with voters' preferences than with the voting method used, especially when three or more candidates are involved.
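A small sketch with an invented preference profile shows how the same ballots can crown different winners under different counting methods:

```python
from collections import Counter

# Invented profile: (ranking, number of voters); first name is first choice.
ballots = [
    (("A", "B", "C"), 8),   # 8 voters rank A first, then B, then C
    (("B", "C", "A"), 7),
    (("C", "B", "A"), 6),
]

# Plurality: count only first choices.
plurality = Counter()
for ranking, voters in ballots:
    plurality[ranking[0]] += voters
print("plurality winner:", plurality.most_common(1)[0][0])  # A

# Borda count: 2 points for 1st place, 1 for 2nd, 0 for 3rd.
borda = Counter()
for ranking, voters in ballots:
    for points, candidate in zip((2, 1, 0), ranking):
        borda[candidate] += points * voters
print("Borda winner:", borda.most_common(1)[0][0])  # B
```

With these same ballots, B also beats both A and C in head-to-head contests, so the plurality winner A would lose every pairwise matchup: the "will of the voters" depends on how the votes are counted.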
The challenge of choosing an election winner can be thought of as taking voters' rank orderings of candidates and returning a societal rank ordering. A mathematically similar situation occurs when trying to determine what type of engine lasts longest among competing versions.
Analyzing statistical data in sports is a sport of its own. This lecture asks, "Who is the best hitter in baseball history?" The question presents statistical challenges in comparing performances in different eras. Another mystery is also probed: "Is the 'hot hand' phenomenon real, or is it random?"
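On the "hot hand" question, a simulation sketch (my illustration): even a purely random shooter who makes 50 percent of shots produces streaks that look hot.

```python
import random

random.seed(7)

# A "shooter" who makes each shot independently with probability 0.5.
shots = [random.random() < 0.5 for _ in range(1000)]

# Longest run of consecutive makes in a purely random sequence.
longest = current = 0
for made in shots:
    current = current + 1 if made else 0
    longest = max(longest, current)

print(f"longest streak of makes in 1000 random shots: {longest}")
# Typically around 9 or 10: streaks alone are not evidence of a hot hand.
```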
A discussion of strategies for estimating the number of Mark V tanks produced by the Germans in World War II brings up the idea of expected value, a central concept in the risky business of buying and selling insurance.
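This estimation puzzle is often called the German tank problem. One standard estimator (not necessarily the one featured in the lecture) scales up the largest serial number observed; a sketch with a simulated fleet:

```python
import random

random.seed(3)

N = 1000                       # true (unknown) number of tanks
k = 15                         # tanks captured
serials = random.sample(range(1, N + 1), k)

# Estimate: largest serial seen, plus the average gap between serials,
# N_hat = m * (1 + 1/k) - 1, where m is the maximum observed.
m = max(serials)
n_hat = m * (1 + 1 / k) - 1
print(f"largest serial = {m}, estimate = {n_hat:.0f} (true N = {N})")
```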
Tax authorities often need to set valuations for every house in a tax district. The challenge is to use the data about recently sold houses to assess the values of all the houses. This classic example of statistical inference introduces the idea of multiple linear regression.
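A minimal sketch of the idea, with invented house data and numpy's least-squares solver standing in for whatever software the lecture uses:

```python
import numpy as np

# Invented data: [square feet, bedrooms] -> sale price (thousands).
X = np.array([[1500, 3], [2100, 4], [900, 2], [1800, 3], [1200, 2]],
             dtype=float)
y = np.array([250, 340, 160, 290, 200], dtype=float)

# Add an intercept column, then solve for the coefficients that
# minimize the sum of squared residuals.
A = np.column_stack([np.ones(len(X)), X])
coef, *_ = np.linalg.lstsq(A, y, rcond=None)
intercept, per_sqft, per_bedroom = coef

# Assess an unsold 1600 sq ft, 3-bedroom house from the fitted model.
price = intercept + per_sqft * 1600 + per_bedroom * 3
print(f"predicted price: {price:.0f} thousand")
```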
Statistics can be used to deceive as well as enlighten. This lecture explores deceptive practices such as concealing lurking variables, using biased samples, focusing on rare events, reporting handpicked data, extrapolating trends unrealistically, and confusing correlation with causation.
This lecture addresses two topics that come up when applying statistics to social sciences: factor analysis, which seeks to identify underlying factors that explain correlation among a larger group of measured quantities, and possible limitations of hypothesis testing.
Medical treatments are commonly based on statistical studies. Aspects to consider in contemplating treatment include the characteristics of the study group and the difference between correlation and causation. Another statistical concept, regression to the mean, explains why quack medicines can appear to work.
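A simulation sketch of regression to the mean (my illustration): patients enroll when they feel worst, and on average they improve afterward even with no treatment at all.

```python
import random

random.seed(5)

# Each "patient" has a stable baseline plus day-to-day random noise.
def symptom_score(baseline):
    return baseline + random.gauss(0, 10)

patients = [random.gauss(50, 5) for _ in range(10_000)]  # baselines

# Enroll only those whose score today is extreme (feeling worst).
enrolled = [b for b in patients if symptom_score(b) > 70]

# Measure again later, with no treatment given in between.
after = sum(symptom_score(b) for b in enrolled) / len(enrolled)
print(f"average score falls from >70 at enrollment to about {after:.0f}")
# Scores drift back toward each patient's baseline, so an ineffective
# remedy taken in the meantime will look like it worked.
```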
Economics relies on a wealth of statistical data, including income levels, the balance of trade, the deficit, the stock market, and the consumer price index. A surprising result of such data is that the leading digits of numbers do not occur with equal frequency, and that provides a statistical method for detecting fraud.
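The pattern in question is Benford's law: in many naturally occurring data sets, the leading digit d appears with probability log10(1 + 1/d), so 1 leads about 30 percent of the time. A sketch comparing the law to a simple multiplicative growth process (parameters invented):

```python
import math
import random

# Benford's law: expected frequency of each leading digit 1..9.
benford = {d: math.log10(1 + 1 / d) for d in range(1, 10)}

# Quantities that grow multiplicatively (prices, populations) tend to
# follow it; simulate compounding growth and tally first digits.
random.seed(11)
tallies = {d: 0 for d in range(1, 10)}
value = 1.0
for _ in range(100_000):
    value *= 1 + random.uniform(0.0, 0.2)   # random growth step
    if value > 1e12:
        value /= 1e12                        # rescale to avoid overflow
    tallies[int(str(value)[0])] += 1

for d in range(1, 10):
    print(d, f"benford {benford[d]:.3f}", f"simulated {tallies[d] / 100_000:.3f}")
```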
Statistics is essential in sciences from weather forecasting to quantum physics. This lecture discusses the statistics-based research of Johannes Kepler, Edwin Hubble, and Gregor Mendel. In Mendel's case, statisticians have looked at his studies of the genetics of pea plants and discovered data that are too good to be true.
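The "too good to be true" judgment rests on a goodness-of-fit idea: Mendel's counts sit closer to the predicted 3:1 ratios than chance should allow. A sketch of the chi-square statistic for one hypothetical cross (the counts below are invented, not Mendel's):

```python
# Hypothetical counts for a 3:1 cross of 1000 plants:
# 748 dominant and 252 recessive, versus expected 750 and 250.
observed = [748, 252]
expected = [750, 250]

# Chi-square statistic: sum of (observed - expected)^2 / expected.
chi_sq = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
print(f"chi-square = {chi_sq:.3f}")
# A very small statistic, repeated across many experiments, is itself
# suspicious: real random samples should stray from expectation more.
```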
The importance of statistics will only increase as greater computer speed and capacity make dealing with ever-larger data sets possible. Statistics has limits that need to be respected, but its potential for helping us find meaning in our data-driven world is enormous and growing.