News about the course
These news items are listed oldest first, with the newest ones last.
- The course meets in S110 Foege Building, the seminar room on the
1st floor of Foege (its door is across from the white phone in the hall).
It meets at 9:00-10:20 Tuesdays and Thursdays, May 5 onwards.
- There is a course mailing list. Registered students are on it
automatically. For others, to join it or read past postings,
go to this
link. It requires a UW login, and past postings can be read only once
you are a list member. Members can post to the mailing list. I will try to
keep anyone from abusing the list and sending spam or really off-topic stuff.
- The course will be graded on homework assignments, posted here
each Thursday and due the following Tuesday. They will make use
of the R package.
- The R statistical package will be used in homeworks. Students
should load it onto their own laptop and bring the laptop to class, as
we will do exercises, group learning, and group confusion the last 20-30
minutes of each lecture. R is free and can be downloaded from the CRAN-R site
here. A brief introductory
document on R is available there as a PDF. Two quick example
sheets were made up by Josh Akey last year. They are
An R tutorial
and An R descriptive statistics
A rough syllabus (to be improved)
- (May 5) Probability. Stochastic processes (coins, phone calls, normals).
Distributions (uniform, binomial, geometric, exponential, Poisson, normal)
- (May 7) Distributions, cont'd. Histograms, etc. Quantiles, distributions of
functions of variables (multiples, averages, sums, sums of squares, differences)
- (May 12) Confidence intervals, t-test, experimental design, tests
- (May 14) Chi-squares, contingency tables
- (May 19) Regression, curve fitting, ANOVA, F-test
- (May 21) Bayesian inference, likelihood
- (May 26) Jackknife, bootstrap, permutation tests, cross-validation
- (May 28) More on ANOVA
- (June 2) Multiple testing: Bonferroni and modifications of it, FDR
- (June 4) Principal (not "principle") components, SVD, etc.
The lecture PDFs will be posted here. Now available are:
There is no textbook for the course. Josh Akey, in last year's
web pages, lists some books and a number of on-line statistics
texts available free on the web. They are
In fact, a whole bunch of on-line statistics textbooks can be
found if you Google: "online statistics text"
Josh's 2008 course web pages are excellent, especially his lecture
PDFs. Although the order of material is different, they are
very much worth looking at.
The R language
R is a free interactive computer environment (in old-fashioned terms,
an "interpreter") that can be used for many purposes. It was originally
designed by statisticians (R is a clone of a language called S, which is
now commercial). It has many built-in statistics functions, which is
why we will use it. (At the main CRAN-R project site there are links to
many other analysis packages that can be loaded into R.)
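To give a flavor of those built-in statistics functions, here is a minimal sketch of a short R session. It is only an illustration: the data are simulated, and the sample size, mean, standard deviation, and random seed are arbitrary choices made up for this example, not anything taken from the course exercises.

```r
# Simulated data, purely for illustration (not course data).
set.seed(1)                        # arbitrary seed, so the draws are repeatable
x <- rnorm(20, mean = 5, sd = 2)   # 20 draws from a normal distribution

# Typing an expression at the R prompt prints its value.
mean(x)                            # sample mean
sd(x)                              # sample standard deviation
quantile(x, c(0.25, 0.5, 0.75))    # quartiles

# One-sample t-test of the null hypothesis that the true mean is 5.
result <- t.test(x, mu = 5)
result$conf.int                    # 95% confidence interval for the mean
result$p.value                     # p-value of the test
```

All of the functions used here (rnorm, mean, sd, quantile, t.test) come with the standard R installation, so nothing extra needs to be loaded.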
R can be downloaded and installed on Windows, Mac OS X, or Linux machines
(and some other types as well). The CRAN-R site here has executables, source code, and
many other resources, including a terse PDF
introductory manual. When
using this manual, skip over parts that go too deeply into stuff
you don't yet understand, as there is valuable stuff after them.
Come back to the skipped parts later.
Is R great? For many things, yes. Is it good at everything? I would
say that its array operations stand a good chance of driving the
puzzled beginner absolutely bonkers, so no. In this it
reminds me of a programming language called APL ("A Programming Language")
which could do many things interactively, had fervent evangelists,
was mostly about arrays,
and drove me absolutely bonkers. But what do I know about it,
Josh Akey produced two quick introduction sheets:
R in this course
We will do an R exercise in each class session. Students are
expected to bring a laptop with R loaded on it (kudos to the present
class for doing this successfully). I will distribute exercise sheets
at each class, and we will try to do them. As I make them I will
post them here.
- Exercise 2, May 7
- Exercise 3, May 12
- Exercise 4, May 14
- Exercise 5, May 19
- Exercise 6, May 21, and Exercise 6 (corrected), May 21
- Exercise 7, May 26
- Exercise 8, May 28, and the data set example8.txt
- Exercise 9, June 2
- Exercise 10, June 4, and the data set journals.txt
Who is this guy who is teaching the course, anyway?
this page maintained fitfully by Joe Felsenstein