An Introduction to Statistics
Statistics:
The subject concerned with scientific methods for collecting, summarising, presenting and
analysing data, as well as drawing conclusions or making predictions on the basis of such
analysis.
Descriptive statistics:
The branch of statistics that seeks only to describe and analyse data is called
descriptive statistics.
Inferential statistics:
The branch of statistics dealing with drawing conclusions about the population with the help
of the analysis of a sample, drawn from it, is known as inferential statistics.
Classification and tabulation:
Classification is the first step in tabulation. Classification implies bringing together the items
which are similar in some respect(s).
Example: students of a class may be grouped together with respect to the marks they obtained in an
examination, their age, their area of specialisation, etc.
After classification, tabulation is done to condense the data in a compact form which can be
easily comprehended.
Diagrammatic / Graphical presentation:
There are several diagrams/graphs used for presentation of data.
Bar chart
Pareto chart
Pie chart
Histogram
Ogive
Line graph
Lorenz curve.
(i) Bar chart:
It comprises a series of bars of equal width, the base of each bar being equal to
the width of the class interval of the grouped data. The bars stand on a common
base line, and the height of each bar is proportional to the frequency of the
interval.
The following data give the distribution of 215 MBA students at a management
institute according to educational qualifications.
Educational Qualification    No of students
B.Tech                       55
B.Com                        70
B.Sc                         25
B.A                          45
C.A                          20
(a) Subdivided bar chart:
A subdivided bar chart is a bar chart in which each bar is divided into further
components.
In the above example, suppose information about the type of city in which the
students graduated is also available, as given below.
Educational Qualification    Metro    Large    Medium    No of students
B.Tech                       15       25       15        55
B.Com                        35       20       15        70
B.Sc                         10       10       5         25
B.A                          15       10       20        45
C.A                          10       5        5         20
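As a rough illustration, the components of each subdivided bar, and each component's share of its bar, can be worked out with a short Python sketch. The names `students` and `bar_segments` are illustrative and not part of the original notes.

```python
# Rows from the table above: (Metro, Large, Medium) counts per qualification.
students = {
    "B.Tech": (15, 25, 15),
    "B.Com": (35, 20, 15),
    "B.Sc": (10, 10, 5),
    "B.A": (15, 10, 20),
    "C.A": (10, 5, 5),
}


def bar_segments(counts):
    """Return the bar total and each component's percentage share of it."""
    total = sum(counts)
    shares = [round(100 * c / total, 1) for c in counts]
    return total, shares


segments = {q: bar_segments(c) for q, c in students.items()}
for qualification, (total, shares) in segments.items():
    print(qualification, total, shares)
```

The totals give the overall bar heights, while the shares are what would be plotted if every bar were rescaled to 100.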
(b) Percentage bar chart:
A percentage bar chart is one in which each bar is divided into components,
each expressed as a percentage of the bar's total.
Automaker        Average Sales    Average Net Profit    Percentage of profit to sales
                 Estimates (i)    Estimates (ii)        (iii) = (ii)/(i) * 100
Tata Motors      6848.8           466                   7.2
Hero Honda       2196.5           224.2                 10
Bajaj Auto       2444.7           345.4                 14
TVS Motor        1032.9           35.1                  3.4
Bharat Forge     461.6            63.4                  14
Ashok Leyland    1635.8           94.7                  5.8
M&M              2365.5           200.6                 8.5
Maruti Udyog     3426.5           315.7                 9.2
(c) Multiple bar chart:
A multiple bar chart is one in which two or more bars are placed together
for each entity.
The bars are placed side by side to allow a comparative assessment of the values of
some parameter over two periods of time, two different locations, etc.
Pain Killer 2005 2006
Voveran 16.5 23.2
Calpol 13.2 18.2
Nise 15.2 18.6
Combiflam 9.4 14.1
Dolonex 6.8 10.3
Sumo 5.1 7.4
Volini 6.9 9.6
Moov 3.8 4.9
Nimulid 3.5 4.9
Another example…
Name                           Net worth in $ Billion    Net worth in $ Billion
                               March 06                  March 07
Lakshmi Mittal                 20                        32
Mukesh Ambani                  7                         20.1
Anil Ambani                    5.5                       18.2
Azim Premji                    11                        17.1
Kushal Pal Singh               5                         10
Sunil Mittal & Family          4.9                       9.5
Kumar Mangalam Birla           4.4                       8
Shashi & Ravi Ruia             2.7                       8
Pallonji Mistry                3.3                       5.6
Adi Godrej & Family            2.3                       4.1
Shiv Nadar                     3                         4
Anil Agarwal                   2.1                       3.8
Dilip Shanghvi                 2                         3.1
Tulsi Tanti                    2.4                       3.7
Malvinder & Shivinder Singh    2                         1.55
Venugopal Dhoot                1.6                       1.6
Naresh Goyal                   1.3                       1.9
Rahul Bajaj                    1.1                       1.5
(ii) Pareto chart:
This specialised bar chart, named after the Italian economist Vilfredo Pareto,
orders the groups or intervals of a variable from largest to smallest frequency.
It facilitates identification of the most frequent occurrences or causes of an event
or phenomenon. It can be used to sort data by any criterion, such as
geographical regions, organisations (management institutes, banks), countries,
cities, etc.
Academic Background Frequency
Commerce 18
Economics 6
Engineering 17
Information Technology 7
Science 8
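The ordering step behind a Pareto chart can be sketched in Python as follows. The cumulative percentage column is a common companion to Pareto charts but is an addition here, not something the notes specify; `pareto_table` is an illustrative name.

```python
# Frequencies from the academic background table above.
background = {
    "Commerce": 18,
    "Economics": 6,
    "Engineering": 17,
    "Information Technology": 7,
    "Science": 8,
}


def pareto_table(freq):
    """Sort categories by descending frequency and add cumulative percentages."""
    total = sum(freq.values())
    rows, running = [], 0
    for name, f in sorted(freq.items(), key=lambda kv: kv[1], reverse=True):
        running += f
        rows.append((name, f, round(100 * running / total, 1)))
    return rows


rows = pareto_table(background)
for name, f, cumulative in rows:
    print(f"{name:<24}{f:>4}{cumulative:>8}")
```

The bars of the chart would be drawn in the order of `rows`, tallest first.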
(iii) Pie chart:
It is one of the most popular charts for showing how a whole divides into parts. It is a
circular chart divided into sectors representing the relative magnitudes of the various
components.
A pie chart is obtained by dividing a circle into sectors such that the sectors
have areas, or central angles, proportional to the different components given in the
data.
Sources of Funds       Percentage of Total    Uses of Funds                          Percentage of Total
Excise                 17                     Central Plan                           20
Customs                12                     Non-plan Assistance and Expenditure    23
Corporate Tax          21                     Defence                                12
Income Tax             13                     Interest Payments                      20
Service Tax            7                      States' Share                          18
Borrowings & Others    30                     Subsidies                              7
Total                  100                                                           100

Sources of Funds       Percentage of Total    Size of Segment (Degrees)
Excise                 17                     61.2
Customs                12                     43.2
Corporate Tax          21                     75.6
Income Tax             13                     46.8
Service Tax            7                      25.2
Borrowings & Others    30                     108
Total                  100                    360

Uses of Funds                          Percentage of Total    Size of Segment (Degrees)
Central Plan                           20                     72
Non-plan Assistance and Expenditure    23                     82.8
Defence                                12                     43.2
Interest Payments                      20                     72
States' Share                          18                     64.8
Subsidies                              7                      25.2
Total                                  100                    360
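The segment sizes in the tables above follow from multiplying each percentage by 360/100. A minimal Python sketch of that conversion, with `sector_angles` as an illustrative name:

```python
# Percentages of total from the "Sources of Funds" table above.
sources = {
    "Excise": 17,
    "Customs": 12,
    "Corporate Tax": 21,
    "Income Tax": 13,
    "Service Tax": 7,
    "Borrowings & Others": 30,
}


def sector_angles(percentages):
    """Convert percentage shares into central angles in degrees."""
    return {name: round(p * 360 / 100, 1) for name, p in percentages.items()}


angles = sector_angles(sources)
for name, angle in angles.items():
    print(f"{name:<22}{angle:>7}")
```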
(iv) Histogram / Frequency polygon:
A histogram consists of vertical rectangles whose bases are proportional to the
class intervals and whose heights are proportional to the frequencies of the intervals.
The polygon formed by joining the midpoints of the tops of the rectangles of the
histogram is called a frequency polygon.
(v) Line graphs:
A line graph is a visual presentation of a set of data values joined by straight
lines.
Bank                     Business Per Employee    Business Per Employee
                         2005-06                  2001-02
Allahabad Bank           336                      153
Andhra Bank              426.75                   195.96
Bank of Baroda           396                      222.76
Bank of India            381                      218.74
Bank of Maharashtra      306.18                   191.44
Canara Bank              441.57                   214.88
Central Bank of India    240.46                   148.77
Corporation Bank         527                      290.44
Dena Bank                364                      221
Indian Bank              295                      156
Indian Overseas Bank     354.73                   175.41
(vi) Lorenz curve:
The Lorenz curve indicates the extent of inequality in the distribution of a financial
parameter such as income.
Descriptive statistics
Measures of central tendency:
1) Arithmetic mean
2) Median
3) Mode
4) Geometric mean
5) Harmonic mean
Arithmetic mean:
An average is a single value within the range of the data that is used to represent all of the values
in the series.
“The arithmetic mean is the quotient of the sum of the given values and the number of given values.”
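The definition can be tried out with a short Python sketch, using the data of practice problems 1 and 2 below. For a frequency table each value is weighted by its frequency. The function names are illustrative.

```python
def arithmetic_mean(values):
    """Sum of the values divided by the number of values."""
    return sum(values) / len(values)


def weighted_mean(values, freqs):
    """Mean of a frequency table: sum of f*x divided by sum of f."""
    return sum(x * f for x, f in zip(values, freqs)) / sum(freqs)


# Problem 1: marks of 10 students.
marks = [25, 30, 21, 55, 47, 10, 15, 17, 45, 35]
mean_marks = arithmetic_mean(marks)

# Problem 2: marks with the number of students obtaining each mark.
mean_table = weighted_mean([52, 58, 60, 65, 68, 70, 75], [7, 5, 4, 6, 3, 3, 2])

print(mean_marks, round(mean_table, 2))
```

For grouped data with class intervals, the same `weighted_mean` applies with the class midpoints used as the values.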
Arithmetic mean: Problems for Practice
1) Find the arithmetic mean of the marks obtained by 10 students of class X in mathematics in
a certain examination. The marks obtained are
25,30,21,55,47,10,15,17,45,35
Ans=30.
2) Find the Arithmetic Mean from the following frequency table:
Marks:             52  58  60  65  68  70  75
No of Students:     7   5   4   6   3   3   2
Ans= 61.6
3) The following table gives the distribution of 100 accidents in New Delhi during seven days of
a week of a given month. During that month there were 5 Mondays, 5 Tuesdays and 5
Wednesdays, and only four each of the other days. Calculate the average number of accidents per
day.
Day:                Sunday  Monday  Tuesday  Wednesday  Thursday  Friday  Saturday
No of Accidents:    26      16      12       10         8         10      18
Ans= 14.13
4) The data on the number of patients attending a hospital in a month are given below. Find the
average number of patients attending the hospital in a day.
Number of patients:    0-10  10-20  20-30  30-40  40-50  50-60
Number of days:        2     6      9      7      4      2
Ans=28.67
5) Ten coins were tossed together and the number of tails resulting from them was observed.
The operation was performed 1050 times and the frequencies thus obtained for the different
numbers of tails (x) are shown in the following table. Calculate the arithmetic mean by the
shortcut method.
X: 0 1 2 3 4 5 6 7 8 9 10
Y: 2 8 43 133 207 260 213 120 54 9 1
Ans=5.0114
6) For the following frequency table, find the mean.
Class: 100-120 120-140 140-160 160-180 180-200 200-220 220-240
Frequency 10 8 4 4 3 1 2
Ans=145.625
7) In a study on patients, the following data were obtained. Find the arithmetic mean.
Age (in years):    10-19  20-29  30-39  40-49  50-59  60-69  70-79  80-89
No of cases:       1      0      1      10     17     38     9      3
Ans=60.7
8) Find the value of p for the following distribution, whose mean is 16.6.
F: 12 16 20 24 16 8 4
X: 8 12 15 p 20 25 30
Ans: p=18
9) The mean height of 25 male workers in a factory is 61 inches and the mean height of 35
female workers in the same factory is 58 inches. Find the combined mean height of 60
workers in a factory. Ans=59.25
10) A firm of readymade garments makes both men’s and women’s shirts. Its average profit is 6%
of sales. Its profit on men’s shirts averages 8% of sales, and women’s shirts comprise 60% of
output. What is the average profit per sales rupee in women’s shirts? Ans= 4.67
11) The average score of girls in the class X examination in a school is 67 and that of boys is 63. The
average score for the whole class is 64.5. Find the percentage of girls and boys in the class.
Ans: girls=37.5 and boys=62.5
12) There are 50 students in a class of which 40 are boys and rest girls. The average weight of
the class is 44 kg and the average weight of the girls is 40 kg. Find the average weight of the
boys. Ans=45
13) The mean annual salary of all employees in a company is Rs. 25,000. The mean salary of
male and female employees is Rs. 27,000 and Rs. 17,000 respectively. Find the percentage of
males and females employed by the company. Ans: males=80 and females=20.
14) The mean marks of 100 students were found to be 40. Later on it was discovered that a
score of 53 was misread as 83. Find the correct mean corresponding to the correct score.
Ans=39.7
15) The mean of 100 observations is found to be 40. At the time of computation two items were
wrongly taken as 30 and 27 instead of 3 and 72. Find the correct mean. Ans =40.18
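Problems 9 to 15 all rest on the fact that a mean multiplied by its number of observations gives the total. A hedged Python sketch of the two recurring calculations, a combined mean and a mean corrected for misread values, is given here; the function names are illustrative.

```python
def combined_mean(groups):
    """groups is a list of (size, mean) pairs; returns the pooled mean."""
    n = sum(size for size, _ in groups)
    return sum(size * mean for size, mean in groups) / n


def corrected_mean(n, mean, wrong, right):
    """Remove the wrongly recorded values from the total and add the right ones."""
    return (n * mean - sum(wrong) + sum(right)) / n


# Problem 9: 25 male workers (mean 61) and 35 female workers (mean 58).
height = combined_mean([(25, 61), (35, 58)])

# Problem 14: 53 misread as 83; problem 15: 3 and 72 taken as 30 and 27.
marks_fixed = corrected_mean(100, 40, wrong=[83], right=[53])
obs_fixed = corrected_mean(100, 40, wrong=[30, 27], right=[3, 72])

print(height, marks_fixed, obs_fixed)
```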
Median
Problems for Practice
1) The numbers of runs scored by the 11 players of a school cricket team are
5 19 42 11 50 30 21 0 52 36 27. Find the median.
Ans=27 runs.
2) Find the median of the following items:
6 10 4 3 9 11 22 18
Ans=9.5
3) The following table represents the marks obtained by a batch of 12 students in certain class
tests in Statistics and Physics.
Sr. no:                1   2   3   4   5   6   7   8   9  10  11  12
Marks (Statistics):   53  54  32  30  60  46  28  25  48  72  33  65
Marks (Physics):      55  41  48  49  27  25  23  20  28  60  43  67
Ans: median (Statistics)=47, median (Physics)=42.
4) Calculate median for the following data:
No of students:     6   4  16   7   8   2
Marks:             20   9  25  50  40  80
Ans= 25
5) Find the median of the following frequency distribution:
X: 5 7 9 12 14 17 19 21
Y: 6 5 3 6 5 3 2 4
Ans=12
6) The following table gives the weekly expenditure of 100 families. Find the median weekly
expenditure.
Weekly Expenditure 0-10 10-20 20-30 30-40 40-50
Number of Families 14 23 27 21 15
Ans=24.815
7) Calculate the mean and median for the following data:
Height (in cm)    No of boys    Height (in cm)    No of boys
135-140           4             155-160           24
140-145           9             160-165           10
145-150           18            165-170           5
150-155           28            170-175           2
Ans: mean=153.45, median=153.39
8) Calculate the median from the following data.
Weight (gms) No of apples Weight (gms) No of apples
410-419 14 450-459 45
420-429 20 460-469 18
430-439 42 470-479 7
440-449 54
Ans=443.94
9) Calculate the median:
Marks No of students Marks No of students
Less than 5 29 Less than 30 644
Less than 10 224 Less than 35 650
Less than 15 465 Less than 40 653
Less than 20 582 Less than 45 655
Less than 25 634
Ans=12.15
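The median calculations above can be sketched in Python. For grouped data the sketch uses the usual interpolation formula, median = L + ((N/2 - cf) / f) * h, where L is the lower boundary of the median class, cf the cumulative frequency before it, f its frequency and h the class width. The notes do not state this formula, so treat it as an assumption; it does reproduce the answers to problems 1, 2 and 6.

```python
def median_ungrouped(values):
    """Middle value of the sorted data (average of the two middle values if even)."""
    s = sorted(values)
    n = len(s)
    mid = n // 2
    return s[mid] if n % 2 else (s[mid - 1] + s[mid]) / 2


def median_grouped(lower_bounds, width, freqs):
    """Interpolated median for equal-width classes given their lower boundaries."""
    half = sum(freqs) / 2
    cumulative = 0
    for lower, f in zip(lower_bounds, freqs):
        if cumulative + f >= half:
            return lower + (half - cumulative) / f * width
        cumulative += f


runs = median_ungrouped([5, 19, 42, 11, 50, 30, 21, 0, 52, 36, 27])
items = median_ungrouped([6, 10, 4, 3, 9, 11, 22, 18])
# Problem 6: weekly expenditure classes starting at 0, 10, 20, 30, 40.
expenditure = median_grouped([0, 10, 20, 30, 40], 10, [14, 23, 27, 21, 15])

print(runs, items, round(expenditure, 3))
```

For classes written as 410-419, 420-429 and so on, the lower boundaries passed in would be 409.5, 419.5, etc.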
Mode
Problems for practice
1) A shoe shop in Delhi sold 100 pairs of shoes of a particular brand on a certain day, with
the following distribution. Find the mode of the distribution.
Size of Shoes 4 5 6 7 8 9 10
No of pairs: 10 15 20 35 16 3 1
Ans=7
2) Find the mode for the following data:
Marks: 1-5 6-10 11-15 16-20 21-25
No of Students: 7 10 16 32 24
Ans=18.83
3) Calculate the median and mode for the following distribution:
Production per day (in tons):    21-22  23-24  25-26  27-28  29-30
No of days:                      7      13     22     10     8
Ans: median=25.41, mode=25.36
4) Calculate AM, median and mode from the following frequency distribution.
Variable Frequency Variable Frequency
10-13 8 25-28 54
13-16 15 28-31 36
16-19 27 31-34 18
19-22 51 34-37 9
22-25 75 37-40 7
(Mean =24.19, median=23.96, mode=23.6)
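For grouped data the mode is usually interpolated within the modal class as mode = L + ((f1 - f0) / (2*f1 - f0 - f2)) * h, where f1 is the frequency of the modal class, f0 and f2 those of the classes before and after it, L its lower boundary and h the class width. The notes do not spell this formula out, so the Python sketch below is an assumption that happens to reproduce the mode given for problem 4.

```python
def mode_grouped(lower_bounds, width, freqs):
    """Interpolated mode for equal-width classes given their lower boundaries."""
    i = max(range(len(freqs)), key=freqs.__getitem__)  # index of the modal class
    f1 = freqs[i]
    f0 = freqs[i - 1] if i > 0 else 0
    f2 = freqs[i + 1] if i + 1 < len(freqs) else 0
    return lower_bounds[i] + (f1 - f0) / (2 * f1 - f0 - f2) * width


# Problem 4: classes 10-13, 13-16, ..., 37-40.
lower = [10, 13, 16, 19, 22, 25, 28, 31, 34, 37]
freqs = [8, 15, 27, 51, 75, 54, 36, 18, 9, 7]
mode_value = mode_grouped(lower, 3, freqs)

print(round(mode_value, 2))
```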
Measures of dispersion
The degree to which numerical data tend to spread about an average value is called the variation or
dispersion of the data.
Significance of measuring variation:
To determine the reliability of an average.
To serve as a basis for the control of the variability.
To compare two or more series with regard to their variability.
To facilitate the use of other statistical measures.
Methods of studying variation:
The range
The quartile deviation
The mean deviation
The standard deviation.
Range
1) The following are the prices of shares of AB Co Ltd from Monday to Saturday. Calculate
range and its coefficient.
Day Price Day price
Monday 200 Thursday 160
Tuesday 210 Friday 220
Wednesday 208 Saturday 250
Ans: range=90 and coefficient of range=0.22
2) Calculate the coefficient of range from the following:
Marks No of students Marks No of students
10-20 8 40-50 8
20-30 10 50-60 4
30-40 12
Ans=0.714
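The range is the difference between the largest and smallest values, and its coefficient divides that difference by their sum. A small Python sketch using the share prices of problem 1; the function name is illustrative.

```python
def range_and_coefficient(values):
    """Return (L - S, (L - S) / (L + S)) for the largest L and smallest S."""
    largest, smallest = max(values), min(values)
    spread = largest - smallest
    return spread, spread / (largest + smallest)


# Problem 1: share prices from Monday to Saturday.
prices = [200, 210, 208, 160, 220, 250]
price_range, price_coeff = range_and_coefficient(prices)

print(price_range, round(price_coeff, 2))
```

For a grouped distribution such as problem 2, the same function can be applied to the lowest and highest class limits, here 10 and 60.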
The quartile deviation
1) Find out the value of quartile deviation and its coefficient from the following data:
Marks:             10  20  30  40  50  60
No of students:     4   7  15   8   7   2
Ans: QD=10 and coeff=0.333
2) Calculate quartile deviation and its coefficient from the following data:
Wages in Rs per week:    Less than 35  35-37  38-40  41-43  Over 43
No of wage earners:      14            62     99     18     7
Ans: QD=1.67 and coeff=0.044
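The quartile deviation is half the distance between the third and first quartiles, and its coefficient is (Q3 - Q1) / (Q3 + Q1). The Python sketch below assumes the quartiles of a discrete series sit at positions (N+1)/4 and 3(N+1)/4. That convention is not stated in the notes, but it reproduces the answer to problem 1.

```python
def quartile(values, k):
    """k-th quartile at 1-based position k*(N+1)/4, interpolating if fractional."""
    s = sorted(values)
    pos = k * (len(s) + 1) / 4
    lo = min(max(int(pos), 1), len(s))
    hi = min(lo + 1, len(s))
    frac = pos - int(pos)
    return s[lo - 1] + frac * (s[hi - 1] - s[lo - 1])


def quartile_deviation(values):
    """Return (QD, coefficient of QD)."""
    q1, q3 = quartile(values, 1), quartile(values, 3)
    return (q3 - q1) / 2, (q3 - q1) / (q3 + q1)


# Problem 1: expand the frequency table into individual marks.
marks = [10, 20, 30, 40, 50, 60]
freqs = [4, 7, 15, 8, 7, 2]
expanded = [m for m, f in zip(marks, freqs) for _ in range(f)]
qd, qd_coeff = quartile_deviation(expanded)

print(qd, round(qd_coeff, 3))
```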
Mean deviation
1) Calculate the mean deviation and its coefficient of the two income groups of five and seven
members.
1st group 4000 4200 4400 4600 4800
2nd group 3000 4000 4200 4400 4600 4800 5800
Ans: 1st: MD=240 coeff=0.054 & 2nd: MD=571.43, coeff=0.130
2) Calculate the mean deviation:
X 10 11 12 13 14
F 3 12 18 12 3
Ans=0.75
3) Calculate mean deviation and its coefficient.
Class frequency Class Frequency
0-10 5 40-50 20
10-20 8 50-60 14
20-30 12 60-70 12
30-40 15 70-80 6
Ans: MD (about the median)=15.37 & coeff=0.357
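The mean deviation averages the absolute deviations from a central value, and its coefficient divides the result by that central value. The Python sketch below measures deviations from the arithmetic mean, which is an assumption on our part, although it matches the answers to problems 1 and 2.

```python
def mean_deviation(values, freqs=None):
    """Return (mean deviation about the mean, coefficient of mean deviation)."""
    freqs = freqs or [1] * len(values)
    n = sum(freqs)
    mean = sum(x * f for x, f in zip(values, freqs)) / n
    md = sum(f * abs(x - mean) for x, f in zip(values, freqs)) / n
    return md, md / mean


# Problem 1: the two income groups.
md1, coeff1 = mean_deviation([4000, 4200, 4400, 4600, 4800])
md2, coeff2 = mean_deviation([3000, 4000, 4200, 4400, 4600, 4800, 5800])

# Problem 2: a frequency table.
md3, _ = mean_deviation([10, 11, 12, 13, 14], [3, 12, 18, 12, 3])

print(md1, round(md2, 2), md3)
```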
Standard deviation
1) Blood serum cholesterol levels of 10 persons are as under
240,260,290,245,255,288,272,263,277,251.
Calculate standard deviation.
2) The annual salaries of a group of employees are given in the following table.
Salaries (in Rs '000):    45  50  55  60  65  70  75  80
Number of persons:         3   5   8   7   9   7   4   7
Calculate SD of the salaries. Ans =10.35
3) Calculate mean and SD of the following frequency distribution of marks:
Marks No of students Marks No of students
0-10 5 40-50 50
10-20 12 50-60 37
20-30 30 60-70 21
30-40 45
Ans : mean=40.9 & SD=14.839
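The standard deviation of a frequency distribution is the square root of the frequency-weighted average of squared deviations from the mean. The Python sketch below divides by N (the population form), which is the variant that matches the answers given for problems 2 and 3. For grouped data the class midpoints stand in for the values.

```python
import math


def mean_and_sd(values, freqs):
    """Return (mean, population standard deviation) of a frequency table."""
    n = sum(freqs)
    mean = sum(x * f for x, f in zip(values, freqs)) / n
    variance = sum(f * (x - mean) ** 2 for x, f in zip(values, freqs)) / n
    return mean, math.sqrt(variance)


# Problem 2: salaries in Rs '000.
salary_mean, salary_sd = mean_and_sd(
    [45, 50, 55, 60, 65, 70, 75, 80], [3, 5, 8, 7, 9, 7, 4, 7]
)

# Problem 3: marks 0-10, 10-20, ..., 60-70, represented by their midpoints.
marks_mean, marks_sd = mean_and_sd(
    [5, 15, 25, 35, 45, 55, 65], [5, 12, 30, 45, 50, 37, 21]
)

print(round(salary_sd, 2), round(marks_mean, 1), round(marks_sd, 3))
```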
Coefficient of variation
1) From the prices of shares of X and Y below, find out which is more stable in value:
X 35 54 52 53 56 58 52 50 51 49
Y 108 107 105 105 106 107 104 103 104 101
Ans: CV of X=11.6 & CV of Y=1.905
2) Two brands of tyres are tested with the following results:
Life (in '000 miles)    No of tyres
                        Brand X    Brand Y
20-25                   1          0
25-30                   22         24
30-35                   64         76
35-40                   10         0
40-45                   3          0
a) Which brand of tyres has the greater average life?
b) Compare the variability and state which brand of tyres you would use on your fleet of
trucks.
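The coefficient of variation expresses the standard deviation as a percentage of the mean, so series with different price levels can be compared. A minimal Python sketch for problem 1, using the population standard deviation from the standard library; the function name is illustrative.

```python
import statistics


def coefficient_of_variation(values):
    """Population standard deviation as a percentage of the mean."""
    return 100 * statistics.pstdev(values) / statistics.mean(values)


# Problem 1: share prices of X and Y.
x = [35, 54, 52, 53, 56, 58, 52, 50, 51, 49]
y = [108, 107, 105, 105, 106, 107, 104, 103, 104, 101]
cv_x = coefficient_of_variation(x)
cv_y = coefficient_of_variation(y)

print(round(cv_x, 1), round(cv_y, 3))
```

The series with the smaller coefficient of variation is the more consistent of the two.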
*********************************************************************************
Probability
1. A can solve 80% of the problems, while B can solve 90% of problems in a Statistics book. A problem is selected at random. What is the probability that at least one of them will solve it?
2. In a box, there are 2 white and 4 black balls. What is the probability that both of the two balls drawn, one after the other, are white?
3. In families with two children, what is the probability that a family will have
i. One boy and one girl?
ii. Two girls?
iii. Two boys?
In the absence of any other information, it is assumed that the probability of a child being a boy or a girl is ½.
4. A speaks the truth in 60% and B in 75% of the cases. In what percentage of cases are they likely to contradict each other in stating the same fact?
5. An investment consultant predicts that the odds against the price of a certain stock going up are 2:1, and odds in favour of the price remaining the same are 1:3. What is the probability that the stock will go down?
6. The probability that A can solve a problem in Statistics is ½, that B can solve it is 1/3, and that C can solve it is 1/5. If all of them try independently, find the probability that the problem will be solved.
7. One salesman is known to sell a product in 3 out of 5 attempts, while another sells it in 2 out of 3 attempts. Find the probability that
i. No sale will take place when they both try to sell the product
ii. Either of them will succeed in selling the product.
8. An investment analyst presents the following table giving probabilities of next year’s economic conditions (normal, good or very good) in the country, and probabilities of the movement (increase or decline).
9. A class consists of 100 students; 25 of them are girls and 75 boys; 80 of them are rich and 20 are poor; 40 of them have brown eyes and 60 have black eyes. What is the probability of selecting a brown eyed rich girl?
10. A candidate is selected for interviews for 3 posts. For the first post there are 3 candidates, for the second there are 4, and for the third there are 2. What is the probability that the candidate is selected for at least one post?
11. Three machines producing 40%, 35% and 25% of the total output are known to produce defective items in proportions of 0.04, 0.06 and 0.03, respectively. On a particular day, a unit of output is selected at random and is found to be defective. What is the probability that it was produced by the second machine?
12. In a basin area where oil is likely to be found underneath the surface, there are three locations with three different types of earth composition, say C1, C2 and C3.
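Two patterns recur in the problems above: the probability that at least one of several independent events occurs (problems 1 and 6), and Bayes' rule for tracing an outcome back to its source (problem 11). The Python sketch below works them with exact fractions. It assumes independence throughout, which problem 1 does not state explicitly, and its function names are illustrative.

```python
from fractions import Fraction


def p_at_least_one(probs):
    """1 minus the probability that every independent event fails."""
    all_fail = Fraction(1)
    for p in probs:
        all_fail *= 1 - p
    return 1 - all_fail


def bayes_posterior(priors, likelihoods):
    """Posterior probability of each source given that the outcome occurred."""
    joint = [p * l for p, l in zip(priors, likelihoods)]
    total = sum(joint)
    return [j / total for j in joint]


# Problem 1: A solves 80% and B solves 90% of the problems.
p1 = p_at_least_one([Fraction(4, 5), Fraction(9, 10)])

# Problem 6: A, B and C solve with probabilities 1/2, 1/3 and 1/5.
p6 = p_at_least_one([Fraction(1, 2), Fraction(1, 3), Fraction(1, 5)])

# Problem 11: output shares and defective proportions of the three machines.
posterior = bayes_posterior(
    [Fraction(40, 100), Fraction(35, 100), Fraction(25, 100)],
    [Fraction(4, 100), Fraction(6, 100), Fraction(3, 100)],
)

print(p1, p6, posterior[1])
```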