Course Description

An increasing amount of data is now generated in a variety of disciplines, ranging from finance and economics to the natural and social sciences. Making use of this information requires both statistical tools and an understanding of how the substantive scientific questions should drive the analysis. In this hands-on course, we learn to explore and analyze real-world datasets. We cover techniques for summarizing and describing data, methods for statistical inference, and principles for effectively communicating results.

Prerequisites: MS&E 120 or equivalent, and CS 106A or equivalent

Please take note of the following two course policies.

  1. On-time attendance at lectures is required, and attendance at discussion sections is encouraged. Our aim is to create a collaborative and supportive learning environment. One of the best ways to learn the course material is to engage with the lectures by asking questions. If you need to miss a class (e.g., for an illness or sporting event) or will be late, please notify Sharad prior to the lecture. Attendance checks will be carried out in class periodically throughout the quarter.

  2. Please do not use electronics (laptops, tablets, phones) during lectures. However, please bring a laptop to the Thursday discussion sections, as there will be in-class coding and analysis. See here and here for why we institute this policy. (We're happy to make exceptions in special circumstances.)

We encourage you to attend our 2-part crash course on R. The first part will be held during discussion section on Thursday, January 10. The second part will be offered at 6-9pm on Monday, January 14, and repeated at 6-9pm on Tuesday, January 15 in 200-034. You can view the R course materials here.

Sharad Goel
Jongbin Jung (TA)
Jerry Lin (TA)
Camelia Simoiu (TA)
Class: Tuesdays & Thursdays @ 1:30 PM - 2:50 PM in STLC 114
Discussion Section: Thursdays @ 3:00 PM - 4:20 PM in STLC 114

We use Piazza to manage course questions and discussion. Please sign up here.

Office Hours
Mondays @ 5 PM - 7 PM in Shriram 366 (Camelia)
Tuesdays @ 3 PM - 5 PM in Huang 251 (Sharad)
Wednesdays @ 10 AM - 12 PM in GESB 131 (Jongbin)
Thursdays @ 5 PM - 7 PM in Spilker 317 (Jerry)

There are no office hours during the first week of class. Feel free to schedule an appointment if you would like to meet.

[ Optional ] Textbooks
All of Statistics by Larry Wasserman (available online)
R for Data Science by Garrett Grolemund and Hadley Wickham
Statistics by David Freedman, Robert Pisani, and Roger Purves
Natural Experiments in the Social Sciences by Thad Dunning

Computing Environment
We primarily use R (RStudio is the recommended interface), including the suite of tidyverse packages.

Grading
8 homework assignments (50%)
Final project (20%)
Final exam (20%)
Attendance (10%)