Econ 444: Elementary Econometrics
The Ohio State University
Winter 2011
Jason R. Blevins
Lecture 5: Gretl Tutorial
1. Introduction
Gretl, the GNU Regression, Econometrics and Time-series Library, is an open source, cross-
platform package for econometric analysis. It is loosely comparable to other packages you might
be familiar with, such as EViews and Stata. It is available for download at the following website:
2. A Simple Example
To introduce Gretl, we will create a data file using one of the simple datasets we used in class
to calculate the OLS coefficients by hand. Then, we will use Gretl to import the dataset, plot
the data, calculate the regression coefficients, and plot the regression line. These tasks were
relatively easy to do by hand for small datasets with n = 3 observations and a single independent
variable, but for more realistic datasets, using software such as Gretl is much more practical.
2.1. Creating a Plain-Text Data File
Recall our simple example dataset on stock price ($Y_i$) and trade volume ($X_i$):

  $X_i$   $Y_i$
    1       2
    4       2
    1       3
Open Notepad (any similar text editor will do) and create a new file consisting of a header
containing variable names (use “volume” and “price”) followed by one observation per line and
with whitespace (one or more spaces or a tab) between the X value and the Y value. The file
should look something like this (it’s easier to read if the columns line up):
volume  price
1       2
4       2
1       3
Save the file with a name that you can remember, something like stocks.txt, and remember the
location of this file.
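If you would rather generate the file from a script than type it into an editor, the same layout can be produced programmatically. The following is a minimal Python sketch and is not part of the Gretl workflow; the helper name `format_dataset` and the column width are illustrative assumptions.

```python
def format_dataset(names, rows, width=8):
    """Return a whitespace-delimited table: one header line, then one line per observation."""
    lines = ["".join(str(name).ljust(width) for name in names).rstrip()]
    for row in rows:
        lines.append("".join(str(value).ljust(width) for value in row).rstrip())
    return "\n".join(lines) + "\n"


# The three observations from the example, as (volume, price) pairs.
observations = [(1, 2), (4, 2), (1, 3)]
text = format_dataset(["volume", "price"], observations)

if __name__ == "__main__":
    # Write the file to the current directory; adjust the path as needed.
    with open("stocks.txt", "w") as f:
        f.write(text)
```

Padding each field to a fixed width keeps the columns aligned, but any amount of whitespace between the two values would serve equally well.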
2.2. Loading the Data into Gretl
1. Open Gretl and select File | Open data | Import | ASCII.
2. Navigate to the folder where you stored the dataset file you just created, select the file,
   and click Open.
3. Gretl will automatically parse the data file and provide some output, such as the following:
parsing /home/jblevins/projects/444/stocks.txt...
using delimiter ' '
longest line: 15 characters
first field: 'volume'
number of columns = 2
number of variables: 2
number of non-blank lines: 4
scanning for variable names...
line: volume price
scanning for row labels and data...
treating these as undated data
4. Gretl should then show the variables from the dataset in rows.
   • The first row is const, which represents the “constant term” in the regression,
     corresponding to the $\beta_0$ coefficient.
   • The second and third variables should be volume and price, respectively.
5. If you expect to continue working in Gretl with data loaded from a plain text file like this,
   you can save the dataset in Gretl’s own format (with a .gdt extension).
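To make the import step less of a black box, here is a rough Python sketch of reading a whitespace-delimited file with a header row. It only illustrates the idea; Gretl's own parser is more general, and `parse_table` is a hypothetical helper name.

```python
def parse_table(text):
    """Parse whitespace-delimited text whose first non-blank line holds variable names.

    Returns a dict mapping each variable name to a list of floats.
    """
    lines = [line for line in text.splitlines() if line.strip()]
    names = lines[0].split()
    data = {name: [] for name in names}
    for line in lines[1:]:
        fields = line.split()
        if len(fields) != len(names):
            raise ValueError(f"expected {len(names)} fields, got {len(fields)}: {line!r}")
        for name, field in zip(names, fields):
            data[name].append(float(field))
    return data


sample = """volume  price
1       2
4       2
1       3
"""
data = parse_table(sample)
print(len(data), "variables,", len(data["volume"]), "observations")
```

As in the Gretl output, the four non-blank lines yield two variables and three observations, since the first line is used for the variable names.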
2.3. Manipulating the Dataset
The Data menu contains many useful functions for viewing and manipulating datasets. You can
experiment with some of these features (but be careful not to actually modify the dataset yet):
• Display values
• Edit values
• Print description
2.4. Understanding the Dataset
Gretl also has many tools for summarizing and plotting data.
1. Select a particular variable, such as volume, and try View | Summary statistics.
2. Select View | Graph specified vars | X-Y scatter. Choose volume as the X-axis variable and
   price as the Y-axis variable and click OK.
   This graph isn’t great because the dataset has only three points, which lie on the axes and
   overlap with the tick marks, but for richer datasets this tool is more useful.
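As a cross-check on the summary statistics, the basic quantities are easy to compute by hand or with a few lines of code. The sketch below uses Python's standard `statistics` module and assumes the sample (n − 1) definition of the standard deviation; the helper name `summarize` is illustrative.

```python
import statistics

volume = [1, 4, 1]
price = [2, 2, 3]


def summarize(values):
    """Return basic summary statistics for a list of numbers."""
    return {
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "min": min(values),
        "max": max(values),
        "std_dev": statistics.stdev(values),  # sample standard deviation (divides by n - 1)
    }


for name, values in [("volume", volume), ("price", price)]:
    print(name, summarize(values))
```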
2.5. Running a Regression
To calculate the OLS coefficients $\hat{\beta}_0$ and $\hat{\beta}_1$ for this dataset, select
Model | Ordinary Least Squares. Select price as the dependent variable and const (preselected)
and volume as the independent variables and click OK. You should see the following output:
Model 1: OLS estimates using the 3 observations 1-3
Dependent variable: price

  VARIABLE   COEFFICIENT   STDERROR   T STAT   P-VALUE
  const         2.66667    0.707107    3.771   0.16501
  volume       -0.166667   0.288675   -0.577   0.66667
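The numbers in this table can be reproduced with the hand-calculation formulas for a regression with a single independent variable. The following Python sketch is one way to do so, using the standard textbook expressions for the standard errors. The p-value helper relies on the fact that a t distribution with 1 degree of freedom is the standard Cauchy distribution, so it applies only to this n = 3 case and is not a general-purpose routine.

```python
import math

volume = [1, 4, 1]   # X
price = [2, 2, 3]    # Y
n = len(volume)

x_bar = sum(volume) / n
y_bar = sum(price) / n

# Sums of squares and cross-products around the means.
s_xx = sum((x - x_bar) ** 2 for x in volume)
s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(volume, price))

# OLS slope and intercept.
b1 = s_xy / s_xx
b0 = y_bar - b1 * x_bar

# Residual variance with n - 2 degrees of freedom.
residuals = [y - (b0 + b1 * x) for x, y in zip(volume, price)]
s2 = sum(e ** 2 for e in residuals) / (n - 2)

# Standard errors and t statistics.
se_b1 = math.sqrt(s2 / s_xx)
se_b0 = math.sqrt(s2 * (1 / n + x_bar ** 2 / s_xx))
t_b0 = b0 / se_b0
t_b1 = b1 / se_b1


def p_value_df1(t):
    """Two-sided p-value for a t statistic with exactly 1 degree of freedom."""
    return 1 - 2 * math.atan(abs(t)) / math.pi


print(f"const  {b0:.5f} {se_b0:.6f} {t_b0:.3f} {p_value_df1(t_b0):.5f}")
print(f"volume {b1:.6f} {se_b1:.6f} {t_b1:.3f} {p_value_df1(t_b1):.5f}")
```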
Gretl can also calculate:
• fitted values
• residuals
• fitted regression line
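For this small dataset, the fitted values and residuals are also easy to verify directly. A minimal Python sketch, taking the estimated coefficients from the regression output as given (written here as the fractions 8/3 and −1/6, which round to the reported values):

```python
volume = [1, 4, 1]
price = [2, 2, 3]

# Coefficients from the regression output.
b0 = 8 / 3    # const, approximately 2.66667
b1 = -1 / 6   # volume, approximately -0.166667

# Fitted value for each observation, and the residual (actual minus fitted).
fitted = [b0 + b1 * x for x in volume]
residuals = [y - y_hat for y, y_hat in zip(price, fitted)]

for x, y, y_hat, e in zip(volume, price, fitted, residuals):
    print(f"volume={x} price={y} fitted={y_hat:.4f} residual={e:+.4f}")

# With a constant in the model, OLS residuals sum to zero (up to rounding).
print(f"sum of residuals = {sum(residuals):.1e}")
```

The fitted regression line is simply the set of fitted values traced out over the range of volume.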
3. A More Realistic Example
Download and import the dataset finaid.txt from the course homepage at
org/courses/econ444w11/finaid.txt. This dataset contains four variables related to financial
aid at a liberal arts school (Occidental College):
  finaid   amount of financial aid (dollars per year)
  hsrank   GPA rank in high school (percentage)
  male     indicator variable for male students
  parent   expected family contribution (dollars per year)
Use Gretl to answer the following questions:
1. How many observations are in this dataset?
2. What are the mean and standard deviation of high school GPA rank?
3. Generate a scatter plot of financial aid (dependent variable) versus high school GPA rank
   (independent variable). Does there appear to be a positive or negative correlation?
4. Consider the linear regression model
   $$\text{finaid}_i = \beta_0 + \beta_1 \, \text{hsrank}_i + \varepsilon_i$$
   What is the expected sign of $\beta_1$?
5. Now regress financial aid on a constant and high school GPA rank using OLS and write
   down $\hat{\beta}_0$ and $\hat{\beta}_1$.
6. Plot the fitted regression line. Does the slope coincide with the positive or negative
   correlation you saw above?
7. Are these results at all surprising? Are there important omitted variables or other relevant
factors that we might not be controlling for?
8. Now, add an additional regressor, parent, to control for need (parental contribution). What
   are the resulting coefficients?
9. How do we interpret the coefficients on hsrank and parent?
10. Plot the fitted regression of financial aid (Y-axis) on high school GPA rank (X-axis),
    controlling for parental contribution (control variable). To do this, in the main Gretl
    window, select View | Graph specified vars | X-Y with control. Does this relationship
    coincide better with your previous expectations?
11. Finally, add the male indicator variable to the regression and write down the OLS
    coefficients $\hat{\beta}_k$ for $k = 0, 1, 2, 3$.
12. What does the coefficient on male indicate?
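When more than one regressor is included, as in the later questions, the hand formulas for the single-variable case no longer apply, but the OLS coefficients still solve the normal equations (X'X)b = X'y. The following Python sketch shows the idea using only the standard library. The function names `solve` and `ols` are illustrative, and the code is not how Gretl computes its estimates. Because the finaid data are not reproduced here, the example call uses the small stocks dataset from Section 2; with the finaid variables loaded as lists, a call along the lines of `ols(finaid, [hsrank, parent, male])` would be the analogue.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    k = len(b)
    # Work on an augmented copy so the inputs are not modified.
    m = [list(row) + [rhs] for row, rhs in zip(a, b)]
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("regressors are perfectly collinear")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, k):
            factor = m[r][col] / m[col][col]
            for c in range(col, k + 1):
                m[r][c] -= factor * m[col][c]
    # Back substitution on the upper-triangular system.
    x = [0.0] * k
    for i in reversed(range(k)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, k))
        x[i] = (m[i][k] - tail) / m[i][i]
    return x


def ols(y, regressors):
    """Return OLS coefficients [b0, b1, ..., bK] for y on a constant and the regressors.

    `regressors` is a list of columns, each the same length as y.
    """
    columns = [[1.0] * len(y)] + [list(map(float, col)) for col in regressors]
    # Normal equations: (X'X) b = X'y.
    xtx = [[sum(u * v for u, v in zip(ci, cj)) for cj in columns] for ci in columns]
    xty = [sum(u * v for u, v in zip(ci, y)) for ci in columns]
    return solve(xtx, xty)


# Check against the small stocks example from Section 2 (price on const and volume).
print(ols([2, 2, 3], [[1, 4, 1]]))
```

Solving the normal equations directly is fine for small, well-conditioned problems like these; statistical packages generally use more numerically stable decompositions.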