DMC-617: Statistics and Informatics for Disaster Managers

Welcome to Chapter 3: Data Summarization and Presentation

Md. Khalid Hasan, MS(SWE), MPS, MDM (khalidhasan@du.ac.bd)
Institute of Disaster Management and Vulnerability Studies, University of Dhaka
1/16/2017

Learning Objectives

By the end of this chapter, you will be able to:
1. Explain the purpose of descriptive statistics in making data comprehensible.
2. Compute and interpret percentages, proportions, ratios, rates, and percentage change.
3. Construct and analyze frequency distributions for variables at each of the three levels of measurement.
4. Construct and analyze bar and pie charts, histograms, and line graphs.

Outline

Data Summarization
• Percentages and Proportions
• Rates, Ratios and Percent Changes
• Frequency Distributions

Data Presentation
• Pie Charts
• Bar Charts
• Histograms
• Line Charts
• Stem-and-Leaf Displays
• Dot Plots
• Scatter Plots
• Box Plots


Percentages and Proportions

Percentages and proportions supply a frame of reference for reporting research results in the sense that they standardize the raw data: percentages to the base 100 and proportions to the base 1.00.

The mathematical definition of percentages is:

$$\text{Percentage: } \% = \left(\frac{f}{N}\right) \times 100$$

Where,

f = frequency, or the number of cases in any category

N = the number of cases in all categories


Social scientists use proportions as well as percentages. Proportions vary from 0.00 to 1.00: they standardize results to a base of 1.00 instead of the base of 100 used for percentages.

The mathematical definition of proportions is:

$$\text{Proportion: } p = \frac{f}{N}$$

Where,

f = frequency, or the number of cases in any category

N = the number of cases in all categories
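As a minimal sketch of the two definitions above, the following Python computes a proportion and a percentage from a frequency f and a total N. The function names and the household counts are illustrative assumptions, not data from the course tables.

```python
def proportion(f, n):
    """Proportion of cases in a category: f / N."""
    if n <= 0:
        raise ValueError("N must be positive")
    return f / n


def percentage(f, n):
    """Percentage of cases in a category: (f / N) * 100."""
    return proportion(f, n) * 100


# Hypothetical counts, not taken from the course tables.
households = {"flooded": 12, "not flooded": 28}
n_total = sum(households.values())  # N, the number of cases in all categories

for category, f in households.items():
    print(f"{category}: p = {proportion(f, n_total):.2f}, "
          f"% = {percentage(f, n_total):.1f} (N = {n_total})")
```

Both statistics carry the same information; the percentage is simply the proportion rescaled to a base of 100.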


Applying Statistics: Communicating with Statistics


How can we choose between Percentages and Proportions?

• Percentages are easier for most people (including statisticians) to comprehend and generally would be preferred, especially when the goal is to express results clearly and quickly.

• Proportions are used less frequently, generally when we are dealing with probabilities. The preference for percentages is based solely on ease of communication; these statistics are equally valid ways of expressing results and you should be proficient in both.

• When the number of cases is small (say, fewer than 20), it is usually preferable to report the actual frequencies rather than percentages or proportions. With a small number of cases, percentages are unstable and can change drastically with relatively minor changes in the size of the data set.


• Always report the number of observations along with proportions and percentages. This permits the reader to judge the adequacy of the sample size and, conversely, helps to prevent the researcher from lying with statistics.

• Percentages and proportions can be computed for variables at any level of measurement.
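The instability of percentages for small samples can be seen with a short calculation. This is only a sketch with hypothetical numbers: it compares how far the percentage moves when a single additional case falls into a category, for N = 10 and for N = 1000.

```python
def percentage(f, n):
    """Percentage of cases in a category: (f / N) * 100."""
    return f / n * 100


def shift_from_one_case(f, n):
    """Percentage-point change when one more case falls in the category (N fixed)."""
    return percentage(f + 1, n) - percentage(f, n)


# Hypothetical samples: same starting percentage (30%), very different N.
for f, n in [(3, 10), (300, 1000)]:
    print(f"N = {n}: {percentage(f, n):.1f}% -> {percentage(f + 1, n):.1f}% "
          f"(shift of {shift_from_one_case(f, n):.1f} points)")
```

With N fixed, one case always moves the percentage by 100 / N points, which is why reporting N alongside the percentage lets the reader judge how much weight it can bear.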

Ratios, Rates, and Percentage Change

Ratios are especially useful for comparing the relative size of different categories of a variable and are determined by dividing the frequency of one category by the frequency of another. The formula for a ratio is:

$$\text{Ratio} = \frac{f_1}{f_2}$$

Where,

f_1 = the number of cases in the first category

f_2 = the number of cases in the second category
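A ratio can be computed directly from two category frequencies. The sketch below uses hypothetical shelter counts (not from the course material) and also rescales the ratio to a base of 100, which is a common way of reporting it.

```python
def ratio(f1, f2):
    """Ratio of the first category to the second: f1 / f2."""
    if f2 == 0:
        raise ValueError("f2 must be non-zero")
    return f1 / f2


# Hypothetical counts of people registered at a shelter.
women, men = 60, 40
r = ratio(women, men)

print(f"{r:.2f} women for every man")
print(f"{r * 100:.0f} women per 100 men")
```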


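Rates provide still another way of summarizing the distribution of a single variable. A rate is defined as the number of actual occurrences of some phenomenon divided by the number of possible occurrences per some unit of time. Rates are usually multiplied by some power of 10 to eliminate decimal points:

$$\text{Rate} = \frac{\text{number of actual occurrences}}{\text{number of possible occurrences}} \times 10^{k}$$

For example, a crude death rate is the number of deaths in a year divided by the total population, multiplied by 1,000.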
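A rate calculation can be sketched in a few lines of Python. The `rate` helper, its `per` parameter, and the death and population figures are all hypothetical and chosen only to illustrate the definition of a rate given above.

```python
def rate(actual, possible, per=1000):
    """Actual occurrences per `per` possible occurrences in a given time period."""
    if possible <= 0:
        raise ValueError("number of possible occurrences must be positive")
    return actual / possible * per


# Hypothetical district: deaths in one year and total population.
deaths, population = 350, 50_000
crude_death_rate = rate(deaths, population, per=1000)

print(f"{crude_death_rate:.1f} deaths per 1,000 population per year")
```

The multiplier is a presentation choice: rarer events are usually reported per 100,000 so that the result is not a small decimal.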


Measuring social change, in all its variety, is an important task for all social sciences. One very useful statistic for this purpose is the percentage change, which tells us how much a variable has increased or decreased over a certain span of time.

To compute this statistic, we need the scores of a variable at two different points in time. The scores could be in the form of frequencies, rates, or percentages. The percentage change will tell us how much the score has changed at the later time relative to the earlier time.

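The formula for percentage change is:

$$\text{Percent change} = \left(\frac{f_2 - f_1}{f_1}\right) \times 100$$

Where,

f_1 = first score, frequency, or value

f_2 = second score, frequency, or value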
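The percentage change between two time points can be sketched as follows. The values are hypothetical; the second call shows that a change is always measured relative to the earlier score, so an increase and the matching decrease do not have the same size.

```python
def percent_change(f1, f2):
    """Change from the earlier score f1 to the later score f2, as a percentage of f1."""
    if f1 == 0:
        raise ValueError("the earlier score f1 must be non-zero")
    return (f2 - f1) / f1 * 100


# Hypothetical counts of affected households at two points in time.
print(f"200 -> 250: {percent_change(200, 250):+.1f}%")
print(f"250 -> 200: {percent_change(250, 200):+.1f}%")
```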


TABLE: Projected Population Growth for Six Nations, 1990–2020


Frequency distributions

Frequency distributions are tables that report the number of cases in each category of a variable. They are very helpful and commonly used ways of organizing and working with data. In fact, the construction of frequency distributions is almost always the first step in any statistical analysis.
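As an illustration of how such a table might be built from raw data, the sketch below counts the cases in each category with Python's `collections.Counter` and adds a percentage column. The survey responses are hypothetical.

```python
from collections import Counter


def frequency_distribution(values):
    """Return (category, frequency, percentage) rows, most frequent first."""
    counts = Counter(values)
    n = len(values)
    return [(cat, f, f / n * 100) for cat, f in counts.most_common()]


# Hypothetical responses to "Which hazard affected you most recently?"
responses = ["Flood", "Cyclone", "Flood", "Drought", "Flood",
             "Cyclone", "Flood", "Flood", "Cyclone", "Drought"]

rows = frequency_distribution(responses)
for category, f, pct in rows:
    print(f"{category:<8} {f:>3} {pct:>6.1f}%")
print(f"{'Total':<8} {len(responses):>3}")
```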

TABLE: Data from Counseling Center Survey


Frequency Distributions for Variables Measured at the Nominal Level

TABLE: Sex of Respondents, Counseling Center Survey

TABLE: Marital Status of Respondents, Counseling Center Survey


Frequency Distributions for Variables Measured at the Interval-Ratio Level

TABLE: Satisfaction with Services, Counseling Center Survey

TABLE: Age of Respondents, Counseling Center Survey (interval width = one year of age)


Cumulative Frequency and Cumulative Percentage

TABLE: Age of Respondents, Counseling Center Survey

TABLE: Age of Respondents, Counseling Center Survey
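Cumulative columns are running totals of the frequency and percentage columns. The following sketch, which uses a hypothetical age distribution rather than the survey table itself, builds them with `itertools.accumulate`.

```python
from itertools import accumulate


def cumulative_table(rows):
    """rows: (score, frequency) pairs sorted by score.

    Returns (score, f, cumulative f, %, cumulative %) for each row.
    """
    n = sum(f for _, f in rows)
    running = accumulate(f for _, f in rows)
    return [(score, f, cf, f / n * 100, cf / n * 100)
            for (score, f), cf in zip(rows, running)]


# Hypothetical ages and frequencies (N = 20).
ages = [(18, 5), (19, 6), (20, 3), (21, 2), (22, 4)]

table = cumulative_table(ages)
for score, f, cf, pct, cpct in table:
    print(f"{score:>3} {f:>3} {cf:>4} {pct:>6.1f}% {cpct:>6.1f}%")
```

The last row of the cumulative columns should always equal N and 100%, which is a quick check on the arithmetic.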


Graphic Presentations of Data


Pie Charts

A circle divided into portions that represent the relative frequencies or percentages of a population or a sample belonging to different categories is called a pie chart.

TABLE: Marital Status of Respondents, Counseling Center Survey

FIGURE: Marital Status of Respondents, Counseling Center Survey (N = 20)
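When a pie chart is drawn by hand, each category's slice is its share of the 360 degrees of the circle. The sketch below shows that calculation; the category counts are hypothetical (they merely sum to 20) and are not the values from the survey table.

```python
def slice_angles(freqs):
    """Angle in degrees of each pie slice: (f / N) * 360."""
    n = sum(freqs.values())
    return {category: f / n * 360 for category, f in freqs.items()}


# Hypothetical frequencies, N = 20.
marital_status = {"Single": 10, "Married": 7, "Divorced": 3}

angles = slice_angles(marital_status)
for category, angle in angles.items():
    print(f"{category:<9} {angle:>6.1f} degrees")
```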


Bar Charts

When the data are qualitative or categorical, bar graphs can be used to represent the data. A bar graph represents the data by using vertical or horizontal bars whose heights or lengths represent the frequencies of the data.

TABLE: Marital Status of Respondents, Counseling Center Survey

FIGURE: Marital Status of Respondents, Counseling Center Survey (N = 20)
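The idea that bar length encodes frequency can be shown with a simple text-based horizontal bar chart. This is a sketch only: the `bar_chart` helper is hypothetical and the counts are invented for illustration.

```python
def bar_chart(freqs, symbol="#"):
    """One text line per category; the bar length equals the frequency."""
    width = max(len(category) for category in freqs)
    return [f"{category:<{width}} | {symbol * f} ({f})"
            for category, f in freqs.items()]


# Hypothetical frequencies, N = 20.
marital_status = {"Single": 10, "Married": 7, "Divorced": 3}

lines = bar_chart(marital_status)
print("\n".join(lines))
```

The bars are separated from one another because the categories are distinct; their order carries no numerical meaning for a nominal variable.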


Histograms

Histograms look a lot like bar charts and, in fact, are constructed in much the same way. However, the bars in a histogram touch each other in a continuous series from the lowest to the highest scores. These graphs are most appropriate for interval-ratio level variables that have many scores covering a wide range.

FIGURE: Distribution of Income by Household, United States, 2009
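Before a histogram can be drawn, the scores have to be grouped into equal-width class intervals. A minimal sketch of that step follows; the incomes (in thousands) and the interval width of 20 are hypothetical choices, not the figures behind the household income chart.

```python
def bin_counts(values, start, width, n_bins):
    """Count the values in each interval [start + i*width, start + (i+1)*width)."""
    counts = [0] * n_bins
    for v in values:
        i = int((v - start) // width)
        if 0 <= i < n_bins:  # values outside the covered range are ignored
            counts[i] += 1
    return counts


# Hypothetical household incomes, in thousands.
incomes = [12, 18, 25, 27, 31, 34, 38, 42, 47, 55, 58, 71]

counts = bin_counts(incomes, start=0, width=20, n_bins=4)
for i, f in enumerate(counts):
    low = i * 20
    print(f"{low:>2}-{low + 20:<3} {'#' * f}")
```

Each interval includes its lower limit and excludes its upper limit, so no score can be counted twice and the bars meet without gaps.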


Line Charts

The construction of a line chart (or frequency polygon) is similar to the construction of a histogram. However, instead of using bars to represent the frequencies, use a dot at the midpoint of each interval. Straight lines then connect the dots.

Histograms and frequency polygons are alternative ways of displaying essentially the same message. Thus, the choice between the two techniques is left to the aesthetic pleasures of the researcher.

FIGURE: U.S. Rates of Marriage and Divorce, 1950–2008 (rates per 100,000 population)
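The points of a frequency polygon are the interval midpoints paired with the interval frequencies. The sketch below computes them for a hypothetical set of four class intervals of width 20; connecting the resulting points with straight lines gives the polygon.

```python
def polygon_points(start, width, counts):
    """(midpoint, frequency) pairs, one per class interval."""
    return [(start + (i + 0.5) * width, f) for i, f in enumerate(counts)]


# Hypothetical frequencies for the intervals 0-20, 20-40, 40-60, 60-80.
points = polygon_points(start=0, width=20, counts=[2, 5, 4, 1])

for midpoint, f in points:
    print(f"midpoint {midpoint:>5.1f}: frequency {f}")
```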


Stem-and-Leaf Displays

In a stem-and-leaf display of quantitative data, each value is divided into two portions—a stem and a leaf. The leaves for each stem are shown separately in a display.

An advantage of a stem-and-leaf display over a frequency distribution is that by preparing a stem-and-leaf display we do not lose information on individual observations. A stem-and-leaf display is constructed only for quantitative data.
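The construction can be sketched in Python for two-digit scores, taking the tens digit as the stem and the units digit as the leaf. The scores are hypothetical and the helper assumes non-negative whole numbers.

```python
def stem_and_leaf(values):
    """Map each stem (value // 10) to its sorted leaves (value % 10)."""
    plot = {}
    for v in sorted(values):
        stem, leaf = divmod(v, 10)
        plot.setdefault(stem, []).append(leaf)
    # Keep empty stems so that gaps in the data remain visible.
    return {s: plot.get(s, []) for s in range(min(plot), max(plot) + 1)}


# Hypothetical test scores.
scores = [52, 75, 61, 79, 68, 93, 71, 75, 87, 64, 50, 96]

display = stem_and_leaf(scores)
for stem, leaves in display.items():
    print(f"{stem} | {' '.join(str(leaf) for leaf in leaves)}")
```

Every original score can be read back from the display (stem 7 with leaf 5 is 75), which is the advantage over a grouped frequency distribution noted above.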

Try It Yourself

The following are the numbers of text messages sent last week by the cellular phone users on one floor of a college dormitory. Display the data in a stem-and-leaf plot. What can you conclude?


Dotplots

Dotplots can help us detect outliers (also called extreme values) in a data set. Outliers are the values that are extremely large or extremely small with respect to the rest of the data values.

TABLE: Distances of Longest Field Goals (in Yards) Made by AFC Kickers During the 2008 NFL Season

FIGURE: Dotplot for Longest Completed Field Goal.
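A text version of a dotplot stacks one mark per observation above each value on a number line. The sketch below uses hypothetical distances (not the figures from the table above) to show how an outlier appears as an isolated mark far from the rest.

```python
from collections import Counter


def dot_plot(values, symbol="*"):
    """One line per integer from min to max, with one symbol per observation."""
    counts = Counter(values)
    return [f"{v:>3} | {symbol * counts[v]}"
            for v in range(min(values), max(values) + 1)]


# Hypothetical distances in yards; 63 is far from the other values.
distances = [48, 50, 51, 51, 52, 52, 52, 53, 54, 63]

lines = dot_plot(distances)
print("\n".join(lines))
```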


Scatter Plot

When each entry in one data set corresponds to one entry in a second data set, the sets are called paired data sets. For instance, suppose a data set contains the costs of an item and a second data set contains sales amounts for the item at each cost. Because each cost corresponds to a sales amount, the data sets are paired.

One way to graph paired data sets is to use a scatter plot, where the ordered pairs are graphed as points in a coordinate plane. A scatter plot is used to show the relationship between two quantitative variables.

Case Study

The British statistician Ronald Fisher introduced a famous data set called Fisher’s Iris data set. This data set describes various physical characteristics, such as petal length and petal width (in millimeters), for three species of iris. In the scatter plot shown, the petal lengths form the first data set and the petal widths form the second data set. As the petal length increases, what tends to happen to the petal width? (Source: Fisher, R. A., 1936)


The horizontal axis represents the petal length, and the vertical axis represents the petal width. Each point in the scatter plot represents the petal length and petal width of one flower.

Interpretation

From the scatter plot, you can see that as the petal length increases, the petal width also tends to increase.

Boxplots (box–whisker diagrams)

Boxplots, or box-and-whisker plots, are graphs that show the locations of scores and the amount of spread in a data set, while also providing a way to define scores as notably extreme. A boxplot consists of a box with two lines (called whiskers) extending from it; in addition, there are sometimes symbols beyond the whiskers that mark outliers.

FIGURE: Boxplot of hygiene scores on day 1 of the Download Festival split by gender
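The quantities behind a boxplot can be computed with Python's `statistics.quantiles`. This sketch makes two assumptions that the slides do not state: quartiles are taken with the "inclusive" method, and a score is flagged as an outlier when it lies more than 1.5 times the interquartile range beyond the box, which is one common convention. The scores are hypothetical.

```python
from statistics import quantiles


def boxplot_stats(values):
    """Quartiles, whisker ends and outliers under the 1.5 * IQR convention."""
    data = sorted(values)
    q1, median, q3 = quantiles(data, n=4, method="inclusive")
    iqr = q3 - q1
    low_fence, high_fence = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    inside = [v for v in data if low_fence <= v <= high_fence]
    return {
        "q1": q1, "median": median, "q3": q3,
        "whisker_low": inside[0], "whisker_high": inside[-1],
        "outliers": [v for v in data if v < low_fence or v > high_fence],
    }


# Hypothetical scores with one extreme value.
scores = [1, 2, 3, 4, 5, 6, 7, 8, 30]

stats = boxplot_stats(scores)
for name, value in stats.items():
    print(f"{name}: {value}")
```

Statistical packages differ in how they compute quartiles, so the box edges from another tool may not match these exactly for the same data.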


Create a chart in Excel

Visit the following websites and watch the videos:

1. https://support.office.com/en-us/article/Create-a-chart-4d95c6a5-42d2-4cfc-aede-0ebf01d409a8?ui=en-US&rs=en-US&ad=US

2. https://www.youtube.com/watch?v=HacWD9HSww0&list=PLXP4h6BgzlN38suXd6gi0-psW0z0Fmv27&index=1

3. https://www.youtube.com/watch?v=KOOGje8W3P0

4. http://www.excel-easy.com/examples/line-chart.html

5. https://www.youtube.com/watch?v=cD5qvJfrh7w

Md. Khalid Hasan, Assistant Professor, IDMVS, DU (khalidhasan@du.ac.bd)

Institute of Disaster Management and Vulnerability Studies, University of Dhaka

