Ordinary Least Squares Regression With Shazam


  • 8/6/2019 Ordinary Least Squares Regression With Shazam

    1/4

    Ordinary Least Squares Regression with Shazam

    Data

    Data files in the statistics class are usually Excel files stored

    on the hard drive or a floppy disk, but data may be entered directly by

    selecting file, new, and dataset on the standard toolbar (first

    toolbar or second row) of the Shazam screen. Usually it is easier to

    create Excel files with your data. Each row of the Excel spreadsheet

    should be an observation and each column a variable. Suppose pcenrgy,

    popdnsty, pcincome, imptergy, and tropics are five variables in a

    cross-section analysis of 31 countries. Each observation or row would be a

    country and each column would contain data for one of the variables in

    that country. To import the Excel data into Shazam, the first row

    should contain the variable names. Shazam will look for these names

    and show them as column headings when the data is imported. Choose a

    variable name of 8 or fewer characters; the first character of the

    variable name should be a letter. Also do not put any special

    characters or spaces in the variable name. The Excel file should

    contain no blank cells (missing data) if you wish to avoid

    complications. The first column should be the dependent variable to

    simplify matters. Shown below is the file in Excel.

    pcenrgy popdnsty pcincome imptergy tropics

    1525 13 8570 -25 2

    5215 2 20540 -98 1

    3279 97 27980 68 2

    5167 310 26420 78 2

    772 19 4720 40 1

    7879 3 19290 -50 2

    47 19 110 5 1

    707 129 860 -2 2

    6918 123 32500 24 2

    5613 17 24080 55 3

    4150 106 26050 47 2

    4156 234 28260 58 2

    92 75 370 66 1

    2454 111 4430 47 2

    260 313 390 18 1

    3003 269 15810 97 1

    3964 333 37850 80 2

    109 47 330 82 1

    692 61 4680 -88 1

    1456 48 3680 -51 1

    33 150 210 86 2

    4741 456 25820 10 2

    4290 13 19480 19 2

    265 36 410 74 1

    308 12 2010 -141 2

    1939 108 10450 90 2

    7162 4896 32940 100 1

    5736 21 26220 38 3

    878 116 2800 63 1

    3786 243 20710 -15 2

    7905 29 28740 20 2

    2158 25 3450 -298 1

    Notice there is a row that must be deleted because there is no data in

    it. Save the file where you can find it on a disk or the hard drive.
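
The layout rules above can be checked before importing. The following is a minimal sketch in Python, run outside Shazam; the function name check_sheet and the sample rows are hypothetical, and allowing digits after the first letter of a name is an assumption not stated in the text.

```python
import re

# Rule set taken from the text above: 8 or fewer characters, first
# character a letter, no spaces or special characters.
# Assumption: digits are acceptable after the first letter.
NAME_RULE = re.compile(r"[A-Za-z][A-Za-z0-9]{0,7}")


def check_sheet(rows):
    """Return a list of problems in a header-plus-data table.

    rows[0] holds the variable names; each later row is one observation.
    """
    problems = []
    header = rows[0]
    for name in header:
        if not NAME_RULE.fullmatch(str(name)):
            problems.append(f"bad variable name: {name!r}")
    # Start counting at 2 so messages match spreadsheet row numbers.
    for number, row in enumerate(rows[1:], start=2):
        cells = ["" if c is None else str(c).strip() for c in row]
        if all(c == "" for c in cells):
            problems.append(f"row {number}: empty row, delete it")
        elif len(cells) != len(header) or any(c == "" for c in cells):
            problems.append(f"row {number}: blank cell (missing data)")
    return problems


# Hypothetical sheet: two observations from the listing with an empty
# row between them.
sheet = [
    ["pcenrgy", "popdnsty", "pcincome", "imptergy", "tropics"],
    [1525, 13, 8570, -25, 2],
    [None, None, None, None, None],
    [5215, 2, 20540, -98, 1],
]
for problem in check_sheet(sheet):
    print(problem)
```

An empty list from check_sheet means the sheet satisfies the rules listed in this section; it does not guarantee that Shazam will accept the file.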

    Shazam

    In Shazam, the top row across the screen says Shazam

    Professional Edition etc. The second row is a toolbar with a Shazam

    symbol first, then file, edit, project, data etc. to the right along

    the toolbar. In the third row, a toolbar has New and Open as the

    first two options with a file symbol next to Open. When Open is

    clicked (selected using the mouse), Shazam brings up a Windows

    dialog that enables you to select the appropriate Excel file you

    stored earlier.

    Be sure to identify the appropriate type of file so that

    Windows will show all files of that type that are stored on the drive.

    The default type of file Shazam looks for is a Shazam file. Change

    this to Microsoft Excel or All Files so that the window will show

    Excel files on the drive. Select the file that you have stored and

    then select Open. Shazam will then show a message about variable

    names and data and ask whether you want to continue; you should answer

    yes. Another menu will appear that will permit you to select a

    spreadsheet; normally select Sheet 1 and Open. Another popup will

    appear asking if you wish to add the data set to the current project;

    you should indicate yes. Then you will give the data set a name and

    select Save, and when another window appears give the project a name

    and save.

    Check again to be sure the variable names are correctly entered

    and that there are no blank cells in the data set. At this point, load

    the data. You are using Shazam to do ordinary least squares

    regression and test to see if the assumptions of regression have been

    met by testing for multicollinearity, autocorrelation, and

    heteroskedasticity. You will want a regression equation, t tests of

    significance for the partial regression coefficients (variable

    coefficients), a Durbin-Watson statistic, an R squared and adjusted R

    squared, an F test for significance of the equation, a variable

    correlation matrix to test for multicollinearity, and a White test for

    heteroskedasticity.

    Shazam will do all these things, some automatically with the OLS

    command; others will be obtained as options or as a second command.

    The Shazam edition we are using provides wizards that assist you in

    writing the appropriate commands for ordinary least squares and other

    procedures. Although the program has these helps, it continues to

    function as a command-driven program. Commands are entered in command

    windows, and output windows show the results of these commands.

    At this point, select Command Editor on the third toolbar of the

    Shazam screen. The data window will recede into the background and a

    new window will appear with a fourth toolbar. To obtain the variable

    correlation coefficient matrix, enter the following command on the

    blank command editor screen:

    stat pcenrgy popdnsty pcincome imptergy tropics/pcor

    Notice that if the abbreviations are correctly entered, Shazam will

    recognize them and show them as blue characters.
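
To see what a correlation matrix contains, the calculation can be sketched in Python outside Shazam. This is an illustration under stated assumptions: it uses the standard Pearson formula, only the first five observations and three of the variables from the data listing, and the hypothetical helper names pearson and correlation_matrix. Its numbers will therefore not match Shazam's full-sample output.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def correlation_matrix(columns):
    """Map each pair of variable names to its correlation coefficient."""
    names = list(columns)
    return {a: {b: pearson(columns[a], columns[b]) for b in names}
            for a in names}


# First five observations from the data listing (subset for illustration).
data = {
    "pcenrgy": [1525, 5215, 3279, 5167, 772],
    "popdnsty": [13, 2, 97, 310, 19],
    "pcincome": [8570, 20540, 27980, 26420, 4720],
}
matrix = correlation_matrix(data)
for a in data:
    print(a.ljust(9), " ".join(f"{matrix[a][b]:7.3f}" for b in data))
```

When checking for multicollinearity, the entries of interest are the correlations between pairs of independent variables, not those involving the dependent variable.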

    Next use the wizard to help write commands for the ordinary least

    squares procedure. Select Wizards on the second toolbar and a window

    appears that describes the purpose of Wizards. Use the wizard to

    construct commands to complete the multiple regression project. Select

    Next and a menu of choices will appear. Select Ordinary least

    squares regression and Next. In the Tasks to Perform menu select

    all of the boxes except the one for forecasting. Go to the next window

    which is a summary of what you have chosen to this point. Move to the

    following window by selecting Next.

    A window now appears that allows you to select the dependent and

    independent variables. Shade in per capita consumption of energy and

    use the Add button to move it to the dependent variable box. Shade

    in the other variables (population density, per capita income, imports

    of energy, tropics) and add them to the independent variable list.

    Notice lags could be introduced at this point. In this practice problem

    you do not want to lag anything, however. If you want to use only a

    part of the data you could specify the part which you will use at this

    point. In this practice problem, you will use the entire sample so make

    sure the use existing box is marked. Then go on to the next window.

    This window gives a number of options that could be used in the

    regression. For this regression nothing in the window will be selected.

    Supposedly, by not selecting suppress ANOVA the program will

    automatically perform analysis of variance. This feature does not work

    as it should: you will have to put in a command to obtain analysis of

    variance. Notice that Model form is Linear. If you wanted to do a

    regression in logarithms, at this point you would change the linear to

    one of the other options. In this practice problem, leave it as

    Linear. Go to the next window that is a menu of diagnostics. Select

    print observed, predicted and residuals and heteroskedasticity

    tests and move to the next window. In the practice problem there are

    no restrictions, so select Next. There are no hypotheses to specify

    so move to the next window. This window provides an opportunity to

    obtain confidence intervals for the variables. Shade in all

    the variables and move them to the selected side; go to the next window

    which is entitled Final Step. Be sure to select Generate commands and

    insert into currently active editor box. After you select Finish the

    wizard returns you to the command editor box. You should see the

    following in the command editor:

    stat pcenrgy popdnsty pcincome imptergy tropics/pcor

    ols pcenrgy popdnsty pcincome imptergy tropics

    confid popdnsty pcincome imptergy tropics

    diagnos / list het

    Notice that new commands have been added to the command editor other

    than the stat command specified earlier. You need to insert a

    command to obtain analysis of variance. This is done by adding a slash

    and anova after the variable list in the ols command. The command

    editor window should now look like this:

    stat pcenrgy popdnsty pcincome imptergy tropics/pcor

    ols pcenrgy popdnsty pcincome imptergy tropics/anova

    confid popdnsty pcincome imptergy tropics

    diagnos / list het

    Select Run on the fourth toolbar and Shazam will complete all

    the tasks you have selected for it. Be sure to look at the bottom of

    the window for any errors or warnings. Pay attention to these because

    they indicate data problems. You may need to correct your data.
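
For readers who want to see where the regression output comes from, here is a minimal sketch in Python of least squares with an intercept, together with R squared, adjusted R squared, and the F statistic from the analysis of variance. It uses the standard textbook formulas, not Shazam's own code. The function names solve and ols are hypothetical, and the example uses only the first eight observations and two independent variables, so its results will differ from the full regression.

```python
def solve(a, b):
    """Solve a*x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [list(map(float, row)) + [float(b[i])] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [v - factor * p for v, p in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def ols(y, xs):
    """Regress y on the columns in xs plus an intercept."""
    n, k = len(y), len(xs)
    rows = [[1.0] + [col[i] for col in xs] for i in range(n)]
    # Normal equations: (X'X) b = X'y
    xtx = [[sum(r[p] * r[q] for r in rows) for q in range(k + 1)]
           for p in range(k + 1)]
    xty = [sum(r[p] * yi for r, yi in zip(rows, y)) for p in range(k + 1)]
    coef = solve(xtx, xty)
    fitted = [sum(c * v for c, v in zip(coef, r)) for r in rows]
    resid = [yi - fi for yi, fi in zip(y, fitted)]
    ybar = sum(y) / n
    sst = sum((yi - ybar) ** 2 for yi in y)   # total sum of squares
    sse = sum(e * e for e in resid)           # error sum of squares
    ssr = sst - sse                           # regression sum of squares
    df_resid = n - k - 1
    return {
        "coef": coef,                         # intercept first, then slopes
        "resid": resid,
        "r2": ssr / sst,
        "adj_r2": 1 - (sse / df_resid) / (sst / (n - 1)),
        "f": (ssr / k) / (sse / df_resid),
    }


# First eight observations from the data listing (subset for illustration).
pcenrgy = [1525, 5215, 3279, 5167, 772, 7879, 47, 707]
popdnsty = [13, 2, 97, 310, 19, 3, 19, 129]
pcincome = [8570, 20540, 27980, 26420, 4720, 19290, 110, 860]

fit = ols(pcenrgy, [popdnsty, pcincome])
print("coefficients:", [round(c, 4) for c in fit["coef"]])
print("R squared:", round(fit["r2"], 4))
print("adjusted R squared:", round(fit["adj_r2"], 4))
print("F:", round(fit["f"], 4))
```

Solving the normal equations directly keeps the sketch short, but it is less numerically robust than the methods a statistics package would normally use.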

    Print

    The print command will give you everything in the Command Editor

    (output) window. A copy of the data is obtained by selecting the

    energy2.xls (or whatever you have named the data) file on the third

    toolbar and then Print.

    If you have problems with the data, correct them. You will then

    have to reload the data. To begin a new regression, it is necessary to

    obtain a new command editor box. This is done by selecting New on

    the second toolbar. If you have several different command editors

    and data sets, the third toolbar becomes filled and an arrow appears at

    the right of the third toolbar to allow you to see all the previous

    command editors and data sets.

    It is wise at this point to consider the output from your

    commands to be sure you have everything needed: correlation matrix of

    variables; R squared and adjusted R squared; analysis of variance;

    confidence intervals; variable coefficients, t ratios, and p values;

    Durbin-Watson statistic; and heteroskedasticity tests.
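
As a reading aid for the Durbin-Watson statistic in that list, the following Python sketch computes it from a list of residuals using the standard formula: the sum of squared differences between successive residuals divided by the sum of squared residuals. The function name durbin_watson and the residual values are made up for illustration and are not output from the energy regression.

```python
def durbin_watson(residuals):
    """Durbin-Watson statistic for residuals in observation order."""
    numerator = sum((residuals[t] - residuals[t - 1]) ** 2
                    for t in range(1, len(residuals)))
    denominator = sum(e ** 2 for e in residuals)
    return numerator / denominator


# Made-up residual patterns, not taken from the energy regression.
print(durbin_watson([1, -1, 1, -1]))   # alternating signs -> 3.0
print(durbin_watson([1, 1, -1, -1]))   # runs of equal signs -> 1.0
```

The statistic lies between 0 and 4. Values near 2 suggest little first-order autocorrelation, values toward 0 suggest positive autocorrelation, and values toward 4 suggest negative autocorrelation.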