We would like to build a collection of datasets that can be used for teaching introductory statistics courses. Please send your data, along with a short description and any interesting points that can be made about them, to


  • Brain size vs Body weight in animals. Recommended by Eugenia:

    It's a very good example to work through in lecture. A scatterplot of the original data is highly non-linear, yet after a log transformation the pattern becomes linear.
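
The effect of the log transformation can be sketched with a few lines of Python. This is only an illustration: the data below are synthetic, generated from an assumed power law with made-up parameters (`exponent`, `scale`, and the noise level are hypothetical), and are not the actual brain and body weight measurements. The helper names are also ours.

```python
import numpy as np

def make_power_law_data(n=60, exponent=0.75, scale=0.01, seed=0):
    """Generate synthetic (body, brain) pairs; NOT the real animal data."""
    rng = np.random.default_rng(seed)
    # Body weights spread over several orders of magnitude (assumed range).
    body = 10 ** rng.uniform(-2, 4, size=n)
    # Multiplicative noise, so the scatter is roughly constant on the log scale.
    noise = np.exp(rng.normal(0.0, 0.3, size=n))
    brain = scale * body ** exponent * noise
    return body, brain

def linear_fit_r2(x, y):
    """Least-squares line through (x, y); returns slope, intercept and R^2."""
    slope, intercept = np.polyfit(x, y, 1)
    residuals = y - (slope * x + intercept)
    r2 = 1.0 - residuals.var() / y.var()
    return slope, intercept, r2

body, brain = make_power_law_data()

# Straight-line fit on the original scale.
_, _, r2_raw = linear_fit_r2(body, brain)

# Straight-line fit after taking logs of both variables.
slope_log, _, r2_log = linear_fit_r2(np.log10(body), np.log10(brain))

print(f"R^2 on original scale: {r2_raw:.3f}")
print(f"R^2 on log-log scale:  {r2_log:.3f}")
print(f"Slope on log-log scale: {slope_log:.3f}")
```

On the log-log scale the fitted slope estimates the exponent of the assumed power law, which is one way to explain to students why the transformed pattern looks linear.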

Data collections

  • Data and Story Library at Carnegie Mellon. Recommended by Eugenia:

    The data sets from the Data and Story Library are interesting. I use them from time to time for making lecture examples.

Data sources recommended by Mike Whitlock

Mike W kindly contributed this collection of URLs after the Intro Stats meeting on May 26, 2009.

The data archive I'm involved with is called Dryad. They also have a list of other data repositories in related fields.

Of these, the ones that I know might have suitable data sets are KNB, FishBase, Pangaea, and NOAA.

There are also data sets or examples available through CHANCE news, the Utts book web site, Bandolier, The Cochrane Collaboration, Swivel, Rice, Canada fitness, and, of course, Statscan.

The data sets for our book.

The lab manual and data sets for my BIO300 labs are available. I also have instructor materials for those labs and will be happy to send them to anyone who wants them.

Links to other resources

  • The Department of Statistics has some useful links under "Teaching Resources" on their Resources page.

