A dot plot is a graphical summary of data. A horizontal axis shows the range of values for
the observations. Each data value is represented by a dot placed above the axis.
Dot plots show the details of the data and are useful for comparing the distribution of data
for two or more variables.
DOT DIAGRAM (DOT PLOT)
The data shown below are the number of case files for 30 “law and consulting firms” in Ankara in 2010.
83 83 75 80 76 80 81 84 79 80 84 86 72 82 82 79 81 79 80 73
90 82 81 75 77 80 79 76 85 85
Construct a dot plot of the data given above.
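One way to check a hand-drawn dot plot is to tally the values in a short script. The sketch below is only an illustration: it prints a text version of the plot with the axis running down the page instead of across it, and the function name `text_dot_plot` is a hypothetical helper, not part of the original exercise.

```python
from collections import Counter

# Case-file counts for the 30 firms, copied from the data above.
case_files = [
    83, 83, 75, 80, 76, 80, 81, 84, 79, 80, 84, 86, 72, 82, 82, 79, 81, 79, 80, 73,
    90, 82, 81, 75, 77, 80, 79, 76, 85, 85,
]


def text_dot_plot(values):
    """Return one text row per integer on the axis, with one dot per observation."""
    counts = Counter(values)
    rows = []
    for x in range(min(values), max(values) + 1):
        # Counter returns 0 for values that never occur, so gaps stay visible.
        rows.append(f"{x:>3} | " + "." * counts[x])
    return rows


plot_lines = text_dot_plot(case_files)
print("\n".join(plot_lines))
```

The longest row belongs to 80, which occurs five times, and the isolated dot at 90 stands out as the largest value.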
THE STEM-AND-LEAF DISPLAY
This technique can be used to show both the rank order and the shape of a data set
simultaneously.
To develop a stem-and-leaf display, we first arrange the leading digits of each data value
to the left of a vertical line. To the right of the vertical line, we record the last digit of
each data value as we pass through the observations in the order they were recorded.
The numbers to the left of the vertical line form the STEM.
Each digit to the right of the vertical line is a LEAF.
Advantages:
1- It is easier to construct by hand
2- Since it shows the actual data, this display provides more information than histograms
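The construction described above can be sketched in a few lines. This example reuses the case-file data from the dot plot exercise and assumes non-negative integers whose last digit is the leaf; `stem_and_leaf` and its `sort_leaves` option are hypothetical names chosen for illustration.

```python
from collections import defaultdict

# Case-file counts for the 30 firms from the dot plot exercise.
case_files = [
    83, 83, 75, 80, 76, 80, 81, 84, 79, 80, 84, 86, 72, 82, 82, 79, 81, 79, 80, 73,
    90, 82, 81, 75, 77, 80, 79, 76, 85, 85,
]


def stem_and_leaf(values, sort_leaves=False):
    """Build the display rows: leading digits as the stem, last digit as the leaf."""
    stems = defaultdict(list)
    for v in values:
        stems[v // 10].append(v % 10)  # leaves are kept in recording order
    rows = []
    for stem in sorted(stems):
        leaves = sorted(stems[stem]) if sort_leaves else stems[stem]
        rows.append(f"{stem} | " + " ".join(str(d) for d in leaves))
    return rows


rows = stem_and_leaf(case_files)
ordered_rows = stem_and_leaf(case_files, sort_leaves=True)
print("\n".join(rows))
print("\n".join(ordered_rows))
```

The first call records the leaves in the order the observations were listed; sorting the leaves in a second pass gives the rank order of the data.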
FREQUENCY DISTRIBUTION FOR QUALITATIVE DATA
FREQUENCY DISTRIBUTION FOR QUANTITATIVE DATA
With quantitative data, we must be more careful in defining the non-overlapping classes to be used in the frequency distribution. Three steps are necessary to define the classes for a frequency distribution with quantitative data:
1- Determine the number of non-overlapping classes.
2- Determine the width of each class.
3- Determine the class limits.
1- NUMBER OF CLASSES: Classes are formed by specifying ranges that will be used to group the data. As a general guideline, we recommend using between 5 and 20 classes. For a small number of data items, as few as five or six classes may be used to summarize the data.
Sample Size      Number of Classes
Fewer than 50    5 – 6 classes
50 to 100        7 – 8 classes
Over 100         9 – 10 classes
3- CLASS LIMITS: Class limits must be chosen so that each item belongs to one and only one class.
The lower class limit identifies the smallest possible data value assigned to the class.
The upper class limit identifies the largest possible data value assigned to the class.
CLASS MIDPOINT (M): The class midpoint is the value halfway between the lower and upper class limits.
The definitions of the relative frequency and percent frequency distributions are the same as for qualitative data.
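The three steps can be tried out on the case-file data from the dot plot exercise. The sketch below rests on assumptions that are not stated in the text: five classes (the sample has fewer than 50 items), integer data, and a class width found by dividing the number of integer values between the smallest and largest observation by the number of classes and rounding up. The function name `frequency_distribution` is hypothetical.

```python
import math

# Case-file counts for the 30 firms from the dot plot exercise.
case_files = [
    83, 83, 75, 80, 76, 80, 81, 84, 79, 80, 84, 86, 72, 82, 82, 79, 81, 79, 80, 73,
    90, 82, 81, 75, 77, 80, 79, 76, 85, 85,
]


def frequency_distribution(values, num_classes):
    """Group integer data into equal-width, non-overlapping classes."""
    lowest, highest = min(values), max(values)
    # Assumed rule of thumb: spread the covered integers evenly, rounding up.
    width = math.ceil((highest - lowest + 1) / num_classes)
    table = []
    for k in range(num_classes):
        lower = lowest + k * width
        upper = lower + width - 1  # inclusive upper class limit
        freq = sum(1 for v in values if lower <= v <= upper)
        table.append({
            "lower": lower,
            "upper": upper,
            "midpoint": (lower + upper) / 2,
            "frequency": freq,
            "relative": freq / len(values),
        })
    return table


table = frequency_distribution(case_files, num_classes=5)
for row in table:
    print(f'{row["lower"]}-{row["upper"]}  M={row["midpoint"]}  '
          f'f={row["frequency"]}  rel={row["relative"]:.3f}')
```

With these assumptions the classes are 72-75, 76-79, 80-83, 84-87 and 88-91, and every observation falls in exactly one of them.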
Cumulative Distributions:
A cumulative frequency distribution is another tabular summary of quantitative data. It uses
the number of classes, class widths and class limits developed for the
frequency distribution.
A cumulative frequency distribution shows the number of data items with values “less than or equal to
the upper class limit” of each class.
Similarly, a cumulative relative frequency distribution shows the proportion of data
items, and a cumulative percent frequency distribution shows the percentage of data
items, with values less than or equal to the upper limit of each class.
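A short sketch of the three cumulative distributions, using the case-file data from the dot plot exercise. The class limits below (five classes of width 4) are an assumption made for illustration; any set of non-overlapping classes covering the data would work the same way.

```python
from itertools import accumulate

# Case-file counts for the 30 firms from the dot plot exercise.
case_files = [
    83, 83, 75, 80, 76, 80, 81, 84, 79, 80, 84, 86, 72, 82, 82, 79, 81, 79, 80, 73,
    90, 82, 81, 75, 77, 80, 79, 76, 85, 85,
]

# Assumed class limits (inclusive at both ends), chosen for this example only.
class_limits = [(72, 75), (76, 79), (80, 83), (84, 87), (88, 91)]

frequencies = [sum(1 for v in case_files if lo <= v <= hi) for lo, hi in class_limits]

# Running totals: items with values less than or equal to each upper class limit.
cumulative = list(accumulate(frequencies))
n = len(case_files)
cumulative_relative = [c / n for c in cumulative]
cumulative_percent = [100 * c / n for c in cumulative]

for (lo, hi), c, r, p in zip(class_limits, cumulative, cumulative_relative, cumulative_percent):
    print(f"<= {hi}: {c:>2} items, proportion {r:.3f}, {p:.1f}%")
```

The last cumulative frequency always equals the number of observations, so the last cumulative relative frequency is 1 and the last cumulative percent frequency is 100.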
In the year 2000, the average salaries of elementary school teachers in Oregon, Washington and Alaska were (in thousands of USD) 40.9, 41.1 and 47.3, respectively. There were 19.7, 28.1 and 5.3 thousand teachers in these states, respectively. Find the average salary of all the elementary school teachers in the three states.
A wholesaler sold 575, 410 and 520 microwave ovens at prices (in USD) of 75, 125 and 100, respectively. What is the mean price of the ovens sold?
Somebody invests TL 1,000 at 7%, TL 3,000 at 7.5% and TL 16,000 at 8%. What is the overall percentage yield of these investments?
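All three problems call for a weighted mean, in which each value is weighted by how many items (teachers, ovens, liras) it applies to. The following is a minimal sketch for checking hand calculations; `weighted_mean` is a hypothetical helper name.

```python
def weighted_mean(values, weights):
    """Return sum(w * x) / sum(w) for paired values and weights."""
    if len(values) != len(weights):
        raise ValueError("values and weights must have the same length")
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)


# Salaries (thousand USD) weighted by the number of teachers (thousands).
teacher_salary = weighted_mean([40.9, 41.1, 47.3], [19.7, 28.1, 5.3])

# Oven prices (USD) weighted by the number of ovens sold.
oven_price = weighted_mean([75, 125, 100], [575, 410, 520])

# Interest rates (%) weighted by the amount invested (TL).
overall_yield = weighted_mean([7.0, 7.5, 8.0], [1000, 3000, 16000])

print(round(teacher_salary, 2), round(oven_price, 2), round(overall_yield, 3))
```

Rounded, the results are about 41.64 thousand USD, 97.26 USD and 7.875%. In each case the answer lies closer to the value carrying the largest weight than a simple average of the three values would.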
The numbers of employees of the municipalities in a certain area of a country are given in the following table.
Find the mean and the modal class.
Number of Employees   Frequency
202-204 2
205-207 7
208-210 16
211-213 26
214-216 18
217-219 4
Eighty randomly selected light bulbs were tested to determine their lifetimes (in hours). This frequency distribution was obtained.
Find the mean and the modal class.
Class Boundaries Frequency
52.5-63.5 6
63.5-74.5 12
74.5-85.5 25
85.5-96.5 18
96.5-107.5 14
107.5-118.5 5
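For grouped data such as the two frequency tables above, the mean is usually approximated by treating every item in a class as if it sat at the class midpoint, and the modal class is the class with the highest frequency. The sketch below follows that approach; the function name is hypothetical, and the result is an approximation because the individual values are not known.

```python
def grouped_mean_and_modal_class(classes):
    """classes: list of (lower, upper, frequency) tuples.

    Returns the midpoint-based mean and the (lower, upper) pair of the modal class.
    """
    total = sum(freq for _, _, freq in classes)
    weighted_sum = sum((lower + upper) / 2 * freq for lower, upper, freq in classes)
    modal = max(classes, key=lambda c: c[2])
    return weighted_sum / total, (modal[0], modal[1])


# Municipal employees table (class limits, frequency).
municipalities = [
    (202, 204, 2), (205, 207, 7), (208, 210, 16),
    (211, 213, 26), (214, 216, 18), (217, 219, 4),
]

# Light bulb lifetimes table (class boundaries, frequency).
bulbs = [
    (52.5, 63.5, 6), (63.5, 74.5, 12), (74.5, 85.5, 25),
    (85.5, 96.5, 18), (96.5, 107.5, 14), (107.5, 118.5, 5),
]

employee_mean, employee_modal = grouped_mean_and_modal_class(municipalities)
bulb_mean, bulb_modal = grouped_mean_and_modal_class(bulbs)
print(round(employee_mean, 2), employee_modal)
print(round(bulb_mean, 4), bulb_modal)
```

Under the midpoint assumption, the first table gives a mean of about 211.59 employees with modal class 211-213, and the second gives a mean of about 85.09 hours with modal class 74.5-85.5.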
THE MODE
- A data set with only one value that occurs with the greatest frequency is said to be UNIMODAL.
- A data set with two values that occur with the same greatest frequency is said to be BIMODAL.
- If a data set has more than two values that occur with the same greatest frequency, each value is used as the mode, and the data set is said to be MULTIMODAL.
- When no data value occurs more than once, the data set is said to have NO MODE.
Example:
The following data represent the duration (in days) of Space Shuttle voyages for the years 1992-1994. (18 values)
8,9,9,14,8,8,10,7,6,9,7,8,10,14,11,8,14,11
Q: Find the mode.
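The mode can be found by tallying how often each value occurs. The sketch below returns every value that shares the highest frequency, and an empty list when no value repeats, which matches the unimodal, bimodal, multimodal and no-mode cases described above. The function name `modes` is a hypothetical helper.

```python
from collections import Counter


def modes(values):
    """Return all values with the greatest frequency, or [] if nothing repeats."""
    counts = Counter(values)
    highest = max(counts.values())
    if highest == 1:
        return []  # no data value occurs more than once: NO MODE
    return sorted(v for v, c in counts.items() if c == highest)


# Duration (in days) of the 18 Space Shuttle voyages listed above.
voyages = [8, 9, 9, 14, 8, 8, 10, 7, 6, 9, 7, 8, 10, 14, 11, 8, 14, 11]

voyage_modes = modes(voyages)
print(voyage_modes)
```

The value 8 occurs five times, more often than any other duration, so the data set is unimodal with mode 8.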
MONTHLY STARTING SALARY (In TRL)
Graduate Monthly Starting Salary
1 2,850
2 2,950
3 3,050
4 2,880
5 2,755
6 2,710
7 2,890
8 3,130
9 2,940
10 3,325
11 2,920
12 2,880
TOTAL: 35,280
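The table does not state a question, so the following is only a sketch under the assumption that it is meant to illustrate the mean, median and mode. It uses Python's standard `statistics` module and the twelve salaries listed above.

```python
import statistics

# Monthly starting salaries of the 12 graduates, copied from the table above.
salaries = [2850, 2950, 3050, 2880, 2755, 2710, 2890, 3130, 2940, 3325, 2920, 2880]

total = sum(salaries)                      # matches the TOTAL row of the table
mean_salary = statistics.mean(salaries)    # total divided by the number of graduates
median_salary = statistics.median(salaries)  # average of the 6th and 7th ordered values
mode_salary = statistics.mode(salaries)    # the only salary that occurs twice

print(total, mean_salary, median_salary, mode_salary)
```

With an even number of observations, the median is the average of the two middle values of the ordered data, here 2,890 and 2,920.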
A customer in a supermarket selected 6 cartons of eggs (each containing a dozen) from a large display.
The egg-filled cartons weighed 25.9, 27.8, 25.8, 26.1, 23.5, and 45.4 ounces, respectively.
a-) Find the mean weight of these cartons.
b-) Find the median weight of these cartons.
c-) Is the mean a good average in this exercise?
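Parts a and b can be checked with the standard `statistics` module. This is a sketch for verifying hand calculations, not a prescribed solution.

```python
import statistics

# Weights (in ounces) of the six egg cartons.
carton_weights = [25.9, 27.8, 25.8, 26.1, 23.5, 45.4]

mean_weight = statistics.mean(carton_weights)
median_weight = statistics.median(carton_weights)

# Count how many cartons weigh less than the mean.
below_mean = sum(1 for w in carton_weights if w < mean_weight)

print(round(mean_weight, 2), round(median_weight, 2), below_mean)
```

The mean is about 29.08 ounces and the median is 26.0 ounces. Five of the six cartons weigh less than the mean, because the single 45.4-ounce carton pulls it upward; this is the point to consider in part c.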
Five light bulbs burned out after lasting, respectively, 867, 849, 840, 852, and 822 hours of continuous use. Find the mean, and also determine what the mean would have been if the second value had been recorded incorrectly as 489 instead of 849.
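A quick sketch of the comparison, showing how far one recording error moves the mean. The variable names are chosen for illustration.

```python
import statistics

# Lifetimes (in hours) as observed, and with the second value mis-recorded.
lifetimes = [867, 849, 840, 852, 822]
misrecorded = [867, 489, 840, 852, 822]

correct_mean = statistics.mean(lifetimes)
wrong_mean = statistics.mean(misrecorded)
shift = correct_mean - wrong_mean

print(correct_mean, wrong_mean, shift)
```

The correct mean is 846 hours; with the error it drops to 774 hours. The error of 360 hours in one value shifts the mean by 360 / 5 = 72 hours.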
Properties of the Mean
1- The mean can be calculated for any set of NUMERICAL data.
2- The mean is a unique and unambiguous value.
3- The means of several sets of data can always be combined into the overall mean of all the data.
4- If each value in a sample were replaced by the mean, then ∑X would remain unchanged.
5- The mean takes into account the value of each item in a set of data.
6- The mean is relatively reliable in the sense that means of many samples drawn from the same population generally do not fluctuate, or vary, as widely as other statistics used to estimate the mean of a population.
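Properties 3 and 4 can be checked numerically. The sketch below uses five light bulb lifetimes (867, 849, 840, 852 and 822 hours) and splits them into two groups; the split is arbitrary and chosen only for illustration.

```python
import statistics

lifetimes = [867, 849, 840, 852, 822]
group_a, group_b = lifetimes[:2], lifetimes[2:]  # arbitrary split for the example

overall_mean = statistics.mean(lifetimes)

# Property 3: combine the group means, weighting each by its group size.
combined_mean = (
    len(group_a) * statistics.mean(group_a) + len(group_b) * statistics.mean(group_b)
) / len(lifetimes)

# Property 4: replacing every value by the mean leaves the sum unchanged.
sum_after_replacement = sum([overall_mean] * len(lifetimes))

print(overall_mean, combined_mean, sum_after_replacement, sum(lifetimes))
```

Combining the group means only works when each is weighted by the size of its group; a plain average of the two group means (858 and 838) would give 848, not the overall mean of 846.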
Properties of the Median
In addition to the properties of the mean:
1- It splits the ordered data into two equal parts.
2- The median is preferable to the mean when the data contain extreme values, because it is not so easily affected by them.
Problem: The following list gives the duration in minutes of 24 power failures.
18 125 44 96 31 53
26 80 49 125 63 58
45 33 89 12 103 127
75 40 80 61 28 129
Find the median.
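The median is found by ordering the data and taking the middle value, or the average of the two middle values when the number of observations is even. The following is a minimal sketch; the function name `median` is a hypothetical helper, and the standard library's `statistics.median` gives the same result.

```python
def median(values):
    """Return the middle value of the ordered data (mean of the two middle values if n is even)."""
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2


# Durations (in minutes) of the 24 power failures listed above.
durations = [
    18, 125, 44, 96, 31, 53,
    26, 80, 49, 125, 63, 58,
    45, 33, 89, 12, 103, 127,
    75, 40, 80, 61, 28, 129,
]

power_failure_median = median(durations)
print(power_failure_median)
```

With 24 observations, the median is the average of the 12th and 13th ordered values, 58 and 61, which gives 59.5 minutes.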