Introduction to Econometrics

Prof. F.X. Diebold

Welcome!

This course provides an undergraduate introduction to
modern econometrics, in both cross-section and time-series environments.

Prerequisites: Econ 103 must be taken *prior* to Econ 104. (Certain other
introductory Penn courses or course sequences in probability and statistics
that include regression, such as Penn's Stat 430+431 or Penn's Engineering equivalent, may be acceptable, again if taken *prior* to Econ 104. Any other background must be explicitly approved.)

Heavily-used site: *Econometric Data Science* (open text, slides, data,
code, etc.). The site is constantly evolving, so check frequently for updates.
The course outline is effectively the text's table of contents, but the slides
and lectures, *not the text*, are the
centerpiece of everything. Although the slides are self-contained, there is no
way to understand them without attending lectures, where I will interpret/embellish/generalize/specialize
them, guiding you selectively. Hence regular class attendance is absolutely essential.

Relevant texts, recommended but not required, include:
Gujarati, *Econometrics by Example*,
latest edition (pragmatic, easy to read); Wooldridge, *Introductory Econometrics: A Modern Approach*,
latest edition (balanced and comprehensive); Stock and Watson, *Introduction to Econometrics*,
latest edition (deep and insightful, worth the time investment).

Software: Your choice. The *Econometric Data Science* site
has some R, EViews, Stata, and Python code samples.

TA office hours as
announced in class. Professor Diebold’s office
hours (held in PCPSE 607) here.

Grading will be based on three equally-weighted
problem sets (50% of final grade) and three equally-weighted exams (50% of
final grade). Problem sets are due at the start of class on the assigned
day. *Under no circumstances will
late problem sets be accepted*, so be sure to start (and finish) them early,
to insure against illness and emergencies.

**Important Administrative Policies:** Here. (*Read them carefully.*)

**Important Dates and Assignments (All are tentative until
confirmed/discussed in class)**:

*** Feb 4 ***

Add period ends.

*** Thurs Feb 14 ***

In-Class Exam 1 (No books, notes, electronic devices, etc.)

*** Tues Feb 19 ***

Problem Set 1 (Must be done alone. Show all code in an appendix.)

Obtain the test score dataset. (1) Display a scatterplot of
test score (TESTSCR) vs. student/teacher ratio (STR). (2) Regress TESTSCR on an
intercept alone. Interpret this regression and discuss your results. In this
intercept regression framework, how would you test the hypothesis that the
(population) mean score is 660? Do it, and discuss your results. (3) Regress
TESTSCR on an intercept and STR. Discuss your results. Do you need an
intercept? Again graph TESTSCR vs. STR, this
time with your preferred fitted regression line superimposed.
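As a rough illustration of the intercept-only idea in part (2), here is a minimal sketch in Python using only the standard library. It relies on the fact that OLS on a constant alone returns the sample mean, with standard error s/sqrt(n). The function name `intercept_only_t_test` and the five scores are made up for illustration and are not the course dataset; any of the packages listed above would report the same quantities directly.

```python
import math
import statistics


def intercept_only_t_test(y, mu0):
    """Regress y on an intercept alone and form the t-statistic for H0: mean == mu0.

    OLS on a constant gives the sample mean as the intercept estimate;
    its standard error is s / sqrt(n), where s uses the n-1 divisor.
    """
    n = len(y)
    b0 = statistics.fmean(y)      # estimated intercept = sample mean
    s = statistics.stdev(y)       # sample standard deviation (n-1 divisor)
    se = s / math.sqrt(n)         # standard error of the intercept
    t_stat = (b0 - mu0) / se      # compare with t(n-1) critical values
    return b0, se, t_stat


# Hypothetical scores, for illustration only (NOT the test score dataset).
scores = [645.0, 650.0, 655.0, 660.0, 665.0]
b0, se, t_stat = intercept_only_t_test(scores, mu0=660.0)
print(f"intercept = {b0:.2f}, s.e. = {se:.3f}, t = {t_stat:.3f}")
```

The t-statistic would then be compared with critical values from the t distribution with n-1 degrees of freedom (or, in large samples, the standard normal).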

*** Feb 22 ***

Drop period ends.

*** Thurs March 14 ***

In-Class Exam 2 (No books, notes, electronic devices, etc.)

*** Tues March 26 ***

Problem Set 2 (May be done in groups of at most three. I expect a creative
analysis, well-defended yet qualified as appropriate,
thorough yet concise, maximum 15 pages. Show all code in an appendix.)

1. Regress READING score on student/teacher ratio.

2. Select a "best" predictive regression model for
reading score. Among other things, you may want to consider non-normality,
outliers, group effects, nonlinearities, and heteroskedasticity.
Do the results differ from those of Regression 1? Interpret your results.

3. Repeat 1 and 2 with a predictive regression model for MATH
score. Are your selected models the same for reading and math?

4. Suppose California creates a new school district, and that
legislators mandate a 15/1 student/teacher ratio. Based on that
information alone, predict the new district's average reading score
(point, interval, density).

5. Now suppose that, in addition, you learn that the new
district has average income $8,000, 40% English
learners, 60% qualifying for a reduced-price lunch, and all other variables at
their dataset sample mean. Predict the district's average reading score
(point, interval, density).
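To make "point, interval, density" concrete, here is a minimal sketch in Python (standard library only) for a one-regressor model. It assumes Gaussian disturbances with constant variance and, as a simplification, ignores parameter estimation uncertainty, so the interval is the point forecast plus or minus z times the regression standard error. The helper names `fit_line` and `gaussian_forecast` and all data values are hypothetical and are not the course dataset.

```python
import math
from statistics import NormalDist, fmean


def fit_line(x, y):
    """OLS of y on an intercept and x; returns (b0, b1, s)."""
    xbar, ybar = fmean(x), fmean(y)
    sxx = sum((xi - xbar) ** 2 for xi in x)
    sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
    b1 = sxy / sxx
    b0 = ybar - b1 * xbar
    resid = [yi - (b0 + b1 * xi) for xi, yi in zip(x, y)]
    # Standard error of the regression, with n-2 degrees of freedom.
    s = math.sqrt(sum(e * e for e in resid) / (len(x) - 2))
    return b0, b1, s


def gaussian_forecast(b0, b1, s, x_new, level=0.95):
    """Point, interval, and density forecasts, treating b0, b1, s as known."""
    point = b0 + b1 * x_new
    density = NormalDist(mu=point, sigma=s)      # forecast density N(point, s^2)
    z = NormalDist().inv_cdf(0.5 + level / 2)    # about 1.96 for a 95% interval
    return point, (point - z * s, point + z * s), density


# Hypothetical district data, for illustration only.
str_ratio = [14.0, 16.0, 18.0, 20.0, 22.0, 24.0]
reading = [672.0, 667.0, 664.0, 657.0, 654.0, 648.0]

b0, b1, s = fit_line(str_ratio, reading)
point, (lower, upper), density = gaussian_forecast(b0, b1, s, x_new=15.0)
print(f"point = {point:.1f}, 95% interval = [{lower:.1f}, {upper:.1f}]")
```

With several regressors the same logic applies, with the point forecast formed from the full fitted equation evaluated at the assumed values.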

*** Tues Apr 30 ***

In-Class Exam 3 (No books, notes, electronic devices, etc.)

*** Tues May 7 ***

Problem Set 3 (May be done in groups of at most three. I expect a creative
analysis, well-defended yet qualified as appropriate,
thorough yet concise, maximum 15 pages. Show all code in an appendix.)

1. Specify and
estimate a model of U.S. monthly domestic auto sales, NSA (series DAUTONSA from
FRED), using ONLY data for January 1967 - February 2019. Among other
things, you may want to consider trend, seasonality and other calendar effects,
nonlinearity, and autoregressive dynamics (with at most six lags). Do NOT worry
about possible structural change, non-normality, or heteroskedasticity.

2. Use your preferred model from part 1 to make out-of-sample
point, interval, and density forecasts of DAUTONSA for March 2019. Evaluate the
performance of your forecasts. Again, do not worry about possible structural
change, non-normality, or heteroskedasticity. (In
particular, construct your forecasts assuming structural stability and Gaussian
disturbances with constant variance.)

3. Now worry about possible structural change and/or non-normality
and/or heteroskedasticity, and re-do the analyses of
parts 1 and 2. How, if at all, do your preferred model and forecasts change?

4. Bonus (+5 points max, +4 pages max): How would you extend your
point, interval, and density forecasts to September 2019?
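As background on how forecast uncertainty evolves with the horizon, here is a minimal sketch in Python (standard library only) for a bare AR(1) process with Gaussian disturbances and known parameters. Point forecasts follow the chain rule, and the h-step forecast error variance is sigma^2 times the sum of phi^(2j) for j = 0, ..., h-1. The function name `ar1_forecasts` and all parameter values are hypothetical; a realistic DAUTONSA model would also carry trend, seasonal, and higher-order lag terms.

```python
import math
from statistics import NormalDist


def ar1_forecasts(y_last, c, phi, sigma, horizon, level=0.95):
    """Iterated forecasts for y_t = c + phi * y_{t-1} + e_t, e_t ~ N(0, sigma^2).

    Returns a list of (h, point, lower, upper), treating c, phi, sigma as known.
    """
    z = NormalDist().inv_cdf(0.5 + level / 2)
    point, var = y_last, 0.0
    out = []
    for h in range(1, horizon + 1):
        point = c + phi * point              # chain rule of forecasting
        var = sigma ** 2 + phi ** 2 * var    # sigma^2 * sum_{j<h} phi^(2j)
        sd = math.sqrt(var)
        out.append((h, point, point - z * sd, point + z * sd))
    return out


# Made-up parameter values, for illustration only.
fc = ar1_forecasts(y_last=100.0, c=10.0, phi=0.5, sigma=2.0, horizon=3)
for h, point, lower, upper in fc:
    print(f"h={h}: point={point:.2f}, interval=[{lower:.2f}, {upper:.2f}]")
```

The h-step density forecast under these assumptions is normal, centered at the point forecast with the variance computed in the loop, so the intervals widen as the horizon grows.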

**Note Well:** *Changes may be implemented at any time. Check
this site frequently, and attend class, for updates and explanations.*