Social Statistics S519: Evaluation of Information Systems


Social Statistics

• Statistics describes a set of tools and techniques for describing, organizing and interpreting information or data.

• Do we need statistics? When and Why?


Why we need statistics

• Everybody relies on data in one way or another:
  – corporate presidents decide company policy based on quarterly sales figures
  – politicians decide on campaign strategy based on polls
  – teachers decide grading curves based on a bell curve
  – you and I decide whether to smoke or not based on health records of other people

• Therefore, we need a comprehensive and understandable way to deal with data:

• Statistics is the study of making sense of data.


Descriptive statistics

• Used to organize and describe the characteristics of a collection of data


Descriptive statistics

• How can you describe this table?

Name     Gender  Major       Age  Score
Sara     Female  LIS         27   A
Richard  Male    Psychology  30   C
Andrea   Male    Education   33   B
Emily    Female  Language    25   B
Bill     Male    LIS         28   C
Leo      Female  Psychology  26   A
Liz      Female  LIS         26   B
Alice    Female  LIS         28   C
Steven   Male    Psychology  24   C
Jeff     Male    LIS         30   B
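
One possible way to describe the table is to summarize each column. The sketch below is only an illustration in Python (the slides themselves use Excel); the variable names are made up for this example, and the rows are copied from the table above.

```python
from collections import Counter
from statistics import mean, median

# Rows copied from the table: (name, gender, major, age, score)
students = [
    ("Sara", "Female", "LIS", 27, "A"),
    ("Richard", "Male", "Psychology", 30, "C"),
    ("Andrea", "Male", "Education", 33, "B"),
    ("Emily", "Female", "Language", 25, "B"),
    ("Bill", "Male", "LIS", 28, "C"),
    ("Leo", "Female", "Psychology", 26, "A"),
    ("Liz", "Female", "LIS", 26, "B"),
    ("Alice", "Female", "LIS", 28, "C"),
    ("Steven", "Male", "Psychology", 24, "C"),
    ("Jeff", "Male", "LIS", 30, "B"),
]

# Numeric column: summarize with count, range, and central tendency
ages = [row[3] for row in students]
age_summary = {
    "n": len(ages),
    "min": min(ages),
    "max": max(ages),
    "mean": mean(ages),
    "median": median(ages),
}

# Categorical columns: summarize with frequency counts
gender_counts = Counter(row[1] for row in students)
major_counts = Counter(row[2] for row in students)
score_counts = Counter(row[4] for row in students)

print(age_summary)
print(gender_counts)
print(major_counts)
print(score_counts)
```

Numeric columns such as Age lend themselves to means and ranges, while categorical columns such as Gender, Major, and Score are usually described with counts.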


Inferential statistics

• Make inferences from a smaller group of data to a possibly larger one
  – Sample: a smaller group of data
  – Population: the whole group of a certain subject


Population & Sample

population
  – the set of all photographs of Mars
  – the set of heights of people in the US Army
  – the set of all measurements of water quality taken from the Hudson River
  – the set of all problems that can be solved using statistics

sample
  – the pictures selected from a specific region of Mars
  – the heights of people in a particular division of the US Army
  – the set of water measurements of the Hudson River taken on 7/24/2009
  – the statistical problems we are solving in this class
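
The relationship between a population and a sample can be sketched in a few lines of Python. Everything numeric here is an assumption for illustration: the population is 10,000 made-up heights (not real Army data), the sample size of 200 is arbitrary, and the seed is fixed only so the run is repeatable.

```python
import random
from statistics import mean

rng = random.Random(519)  # fixed seed so the run is repeatable

# Hypothetical population: 10,000 simulated heights in cm (invented data)
population = [rng.gauss(175, 7) for _ in range(10_000)]

# Sample: 200 members drawn at random, without replacement
sample = rng.sample(population, 200)

population_mean = mean(population)
sample_mean = mean(sample)

print(f"population mean: {population_mean:.2f} cm (n = {len(population)})")
print(f"sample mean:     {sample_mean:.2f} cm (n = {len(sample)})")
```

The sample mean will usually be close to, but not exactly equal to, the population mean; quantifying that gap is the job of inferential statistics.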


Steps for statistical analysis

• Problem definition: what is the population of interest, and what are the variables to be investigated?

• Data collection: describe and select the sample from the population

• Data analysis: make statistical inferences from the sample about the population

• Analysis reporting: report the inference together with a measure of reliability for the inference

Here the term variable means a characteristic or property of an individual member of the population that can vary from one observation to another.


An example

Example: A tax auditor is responsible for 25,000 accounts. How many accounts are in error?

Defining the problem: The entire population consists of all 25,000 accounts. Our goal is to obtain a reasonable estimate of the number of accounts that are, in all likelihood, in error. Our variable x records whether an account is in error.

Data collection and summary: The auditor decides to select 2000 accounts at random, tests each of these, and finds that 84 of them are in error.

Data analysis: In this case, the analysis amounts to computing the sample proportion 84/2000 = 4.2%.

Analysis reporting: Based on our data analysis we infer that approximately 4.2% of the accounts, or about 1,050 of the 25,000, are in error.
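
The auditor example can be worked through in Python as a rough sketch. The counts come from the example above; the 95% interval is an added assumption, using the common normal approximation for a proportion as one possible "measure of reliability". It is not a method stated in the slides.

```python
import math

# Figures from the example
population_size = 25_000
sample_size = 2_000
errors_in_sample = 84

# Data analysis: sample proportion and the implied count in the population
p_hat = errors_in_sample / sample_size
estimated_errors = p_hat * population_size

# Assumption: normal-approximation 95% interval (z = 1.96) for a proportion
z = 1.96
standard_error = math.sqrt(p_hat * (1 - p_hat) / sample_size)
low = p_hat - z * standard_error
high = p_hat + z * standard_error

print(f"sample proportion: {p_hat:.1%}")
print(f"estimated accounts in error: {estimated_errors:.0f}")
print(f"approximate 95% interval: {low:.1%} to {high:.1%}")
```

Reporting the interval alongside the point estimate shows how much the 4.2% figure could plausibly shift if a different random sample of 2000 accounts had been drawn.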


Tools

• Excel
• Excel Toolpak
• SPSS/PASW


Excel Toolpak (1)

1. Click the Microsoft Office Button, and then click Excel Options.
2. Click Add-Ins, and then in the Manage box, select Excel Add-ins.
3. Click Go.
4. In the Add-Ins available box, select the Analysis ToolPak check box, and then click OK.
5. If you get prompted that the Analysis ToolPak is not currently installed on your computer, click Yes to install it.
6. After you load the Analysis ToolPak, the Data Analysis command is available in the Analysis group on the Data tab.


Excel Toolpak (2)

• Powerful, reliable, accessible, easy, and free


Formula

Operator        Symbol  Example  What it does
Addition        +       =2+5     Adds 2 and 5
Subtraction     -       =5-3     Subtracts 3 from 5
Division        /       =10/5    Divides 10 by 5
Multiplication  *       =2*5     Multiplies 2 by 5
Power of        ^       =4^2     Raises 4 to the power of 2

How does it work in Excel?
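
For comparison outside Excel, the same five operations can be sketched in Python. This is only an illustrative aside; the variable names are made up, and the one real difference to watch for is the power operator.

```python
# The same arithmetic as the Excel examples =2+5, =5-3, =10/5, =2*5, =4^2
addition = 2 + 5
subtraction = 5 - 3
division = 10 / 5        # true division always returns a float in Python
multiplication = 2 * 5
power = 4 ** 2           # Python uses ** for powers; ^ is bitwise XOR here

print(addition, subtraction, division, multiplication, power)
```

In Excel the caret (^) raises a number to a power, whereas in Python the caret is a bitwise operator, so `4 ^ 2` would not give 16.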


Basics of a Spreadsheet

• So let's get started digging into what makes a spreadsheet work. Spreadsheets are made up of:
  – columns
  – rows
  – cells

• In each cell there may be the following types of data:
  – text (labels)
  – number data (constants)
  – formulas (mathematical equations)


Column


Row


Cell


Types of Data

Data type  Examples              Description
LABEL      Name or Wage or Days  anything that is just text
CONSTANT   5 or 3.75 or -7.4     any number
FORMULA    =5+3 or =8*5+3        math equation

ALL formulas MUST begin with an equal sign (=).
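
The three data types and the equal-sign rule can be expressed as a small Python sketch. The function `classify_cell` is a hypothetical helper written for this illustration; it is not how Excel itself decides a cell's type, and it ignores cases such as dates or empty cells.

```python
def classify_cell(text: str) -> str:
    """Hypothetical helper: classify typed cell content as in the table."""
    stripped = text.strip()
    # Anything starting with an equal sign is treated as a formula
    if stripped.startswith("="):
        return "FORMULA"
    # Anything that parses as a number is treated as a constant
    try:
        float(stripped)
        return "CONSTANT"
    except ValueError:
        # Everything else is plain text, i.e. a label
        return "LABEL"


for entry in ["Name", "Wage", "5", "3.75", "-7.4", "=5+3", "= 8*5+3"]:
    print(f"{entry!r:12} -> {classify_cell(entry)}")
```

Note that typing 5+3 without the leading equal sign would be classified as a LABEL here, which mirrors why the equal sign matters.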


Formulas – SUM

• The SUM function adds up the values in the specified cells. The syntax is: =SUM(first value, second value, etc.)


Formulas – AVERAGE

• The AVERAGE function finds the average (arithmetic mean) of the specified data. The syntax is: =AVERAGE(first value, second value, etc.)
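
As a rough Python analogue of SUM and AVERAGE, the sketch below totals and averages the Age column from the student table earlier in the deck. The variable names are illustrative only.

```python
# Age column from the student table
ages = [27, 30, 33, 25, 28, 26, 26, 28, 24, 30]

total = sum(ages)               # plays the role of =SUM(...)
average = total / len(ages)     # plays the role of =AVERAGE(...)

print(f"sum = {total}, average = {average:.1f}")
```

The average is simply the sum divided by the number of values, which is why the two functions are usually introduced together.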


Formulas – MAX/MIN

• MAX: This will return the largest (max) value in the selected range of cells.

• MIN: This will return the smallest (Min) value in the selected range of cells.


Formulas – COUNT

• COUNT: This will return the number of cells in the selected range that contain number data.
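
MAX, MIN, and COUNT can be approximated in Python as follows. The `cells` list is a made-up column mixing a text label, numbers, and an empty cell, chosen to show that only numeric entries are counted; this is a sketch of the idea, not a reimplementation of Excel's exact rules.

```python
# Hypothetical column: a header label, four numbers, and one empty cell
cells = ["Age", 27, 30, 33, None, 25]

# Keep only numeric entries (bool is excluded because it subclasses int)
numbers = [
    c for c in cells
    if isinstance(c, (int, float)) and not isinstance(c, bool)
]

largest = max(numbers)    # plays the role of MAX
smallest = min(numbers)   # plays the role of MIN
count = len(numbers)      # plays the role of COUNT: numeric cells only

print(f"max = {largest}, min = {smallest}, count = {count}")
```

The list holds six cells, but the count is four because the text label and the empty cell contain no number data.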
