Medical Computing and Statistics


<p>PRESENTED BY DR. RAJ KUMAR SINGH (JR-1), DEPTT. OF ORTHODONTICS AND DENTAL ANATOMY</p>
<p>SUPERVISOR: DR. SANJEEV KUMAR VERMA, CHAIRMAN, DEPTT. OF ORTHODONTICS AND DENTAL ANATOMY, DR. Z.A. DENTAL COLLEGE, AMU, ALIGARH</p>
<p>CO-SUPERVISOR: DR. MD. SAIF KHAN, LECTURER, DEPTT. OF PERIODONTICS, DR. Z.A. DENTAL COLLEGE, AMU, ALIGARH</p>
<p>Overview of seminar: Introduction to medical computing. Role of medical computing. Introduction to statistics. How to use statistics. Role of statistics. Conclusion.</p>
<p>What does "computers in medicine" mean?</p>
<p>The Computer Meets Medicine and Biology: Emergence of a Discipline</p>
<p>After taking this course, you should know the answers to these questions:</p>
<p>Why is information management a central issue in biomedical research and clinical practice?</p>
<p>What are integrated information-management environments, and how might we expect them to affect the practice of medicine and biomedical research in coming years?</p>
<p>What do we mean by the terms medical computer science, medical computing, medical informatics, clinical informatics, nursing informatics, bioinformatics, and health informatics?</p>
<p>Why should health professionals and students of the health professions learn about medical-informatics concepts and informatics applications?</p>
<p>How has the development of minicomputers, microprocessors, and the Internet changed the nature of biomedical computing?</p>
<p>How is medical informatics related to clinical practice, biomedical engineering, molecular biology, decision science, information science, and computer science?</p>
<p>Role of computing: Medical decision making: probabilistic medical reasoning. Patient care systems. Patient monitoring systems. Computer-aided surgery. Computer-based patient record systems. Clinical decision support systems. The Internet. Standards in medical informatics.</p>
<p>Imaging modalities. Image management systems. Telemedicine. 
Bioinformatics.</p>
<p>Conventional data collection for a clinical trial: medical records, data sheets, computer database, analyses, results.</p>
<p>WHAT IS STATISTICS</p>
<p>Introduction: Statistics is a science that comprises data collection methods, processing of data into useful information, and using this information to take decisions with the least error.</p>
<p>Medical statistics: a collection of statistical procedures particularly well suited to the analysis of healthcare-related data.</p>
<p>Medicine is an empirical science that depends on observations. Any medical decision, be it for diagnosis, treatment planning or prognosis, requires that some information is available for the patient.</p>
<p>Decisions must also be justified for medico-legal or ethical reasons. Thus, they must be evidence based.</p>
<p>Uncertainties in medicine arise mainly due to: 1) biological variability 2) environmental variability 3) sampling fluctuations 4) chance variability 5) instrument variability</p>
<p>To deal with the enormous uncertainties that pervade all aspects of medical practice, a separate science has developed, called biostatistics.</p>
<p>It provides methods to measure uncertainties by probabilities, and helps to control the impact of uncertainties on medical practice by laying down principles for choosing decisions that judiciously combine the probabilities with judgement.</p>
<p>How to use statistics: Develop an underlying question of interest. Generate a hypothesis. Design a study. Collect data. Analyze data: descriptive statistics, statistical inference.</p>
<p>Hypothesis: a tentative assumption of the study, or the expected results of the study. It should be very specific and limited to the piece of research in hand, because it has to be tested.</p>
<p>The role of the hypothesis is to guide the researcher by delimiting the area of research and keeping the researcher on the right track.</p>
<p>Develop study design: research question, study 
sample, sample size, enrollment/follow-up strategies, ongoing monitoring.</p>
<p>Sampling: A sample is that part of the target population which is actually enquired on or investigated.</p>
<p>Types of sampling: 1) simple random 2) systematic random 3) stratified random 4) cluster random 5) multistage random</p>
<p>Existing data: Primary data are those which one elicits directly from individual patients, subjects or other units (such as hospitals or laboratories). Secondary data are those that are elicited by others. Secondary data sources include disease-specific databases on the web, the medical literature, and records of surveys and registrations done by the government.</p>
<p>Generation of new data: Existing data may be incomplete and insufficient to provide answers to specific questions. For these, data are specially generated through new surveys and experiments. Basically there are two types of studies to generate new data: descriptive and analytical. In either setup, it is necessary that a sample of subjects is studied.</p>
<p>Data collection designs</p>
<p>Objective: descriptive, analytical</p>
<p>Method: surveys, observational, experimental</p>
<p>Time frame: prospective (cohort, cause to effect), retrospective (effect to cause), cross-sectional (one point in time)</p>
<p>Setting: animal trial</p>
<p>Describing data with tables: 1) frequency table 2) relative and cumulative frequency 3) grouped frequency 4) open-ended groups 5) cross-tabulation</p>
<p>Frequency table (variable and frequency)</p>
<table>
<tr><th>Mortality (%)</th><th>Tally</th><th>No. of ICUs</th></tr>
<tr><td>11.2-15.1</td><td>||||| ||||</td><td>9</td></tr>
<tr><td>15.2-20.1</td><td>||||| |||</td><td>8</td></tr>
<tr><td>20.2-25.1</td><td>|||||</td><td>5</td></tr>
<tr><td>25.2-30.1</td><td>|||</td><td>3</td></tr>
<tr><td>30.2-35.1</td><td>|</td><td>1</td></tr>
</table>
<p>Relative and cumulative frequency</p>
<table>
<tr><th>Parity</th><th>No. of women</th><th>Percentage (relative frequency)</th><th>Cumulative percentage</th></tr>
<tr><td>0</td><td>5</td><td>12.5</td><td>12.5</td></tr>
<tr><td>1</td><td>6</td><td>15</td><td>27.5</td></tr>
<tr><td>2</td><td>14</td><td>35</td><td>62.5</td></tr>
<tr><td>3</td><td>10</td><td>25</td><td>87.5</td></tr>
<tr><td>4</td><td>3</td><td>7.5</td><td>95</td></tr>
<tr><td>7</td><td>1</td><td>2.5</td><td>97.5</td></tr>
<tr><td>8</td><td>1</td><td>2.5</td><td>100</td></tr>
</table>
<p>Cross-tabulation: two variables within a single group of individuals. Each cell shows the count, the column percentage, and then the row percentage.</p>
<table>
<tr><th>Caries</th><th>2 or fewer children: Yes</th><th>2 or fewer children: No</th><th>Totals</th></tr>
<tr><td>Occlusal</td><td>21 (84%) (66)</td><td>11 (73%) (34)</td><td>32 (100)</td></tr>
<tr><td>Proximal</td><td>4 (16%) (50)</td><td>4 (27%) (50)</td><td>8 (100)</td></tr>
<tr><td>Totals</td><td>25 (100%)</td><td>15 (100%)</td><td>40</td></tr>
</table>
<p>Describing data with charts</p>
<p>1) Charting nominal data: (1) the pie chart (2) the simple bar chart (3) the cluster bar chart (4) the stacked bar chart</p>
<p>2) Charting ordinal data: (1) the pie chart (2) the bar chart</p>
<p>3) Charting discrete metric data</p>
<p>4) Charting continuous metric data: (1) the histogram</p>
<p>Pie chart: 4-5 categories. One variable. Start at 0, in the same order as the table.</p>
<p>[Figure: pie chart of the hair color of children receiving d-phenothrin. Blonde 18 (18%), brown 55 (57%), red 4 (4%), dark 21 (21%).]</p>
<p>Simple bar diagram</p>
<p>Clustered bar diagram</p>
<p>[Figure: cluster percentage bar chart of the hair color of children receiving malathion and d-phenothrin. Malathion: blonde 16, brown 52, red 4, dark 28. d-phenothrin: blonde 18, brown 56, red 4, dark 22.]</p>
<p>Histogram</p>
<p>[Figure: Exercise 3-5, histogram of the percentage age distribution of pregnant women, thrombosis cases. Age groups on the horizontal axis: 19, 20-24, 25-29, 30-34, 35.]</p>
<p>Step chart</p>
<p>[Figure: Exercise 3.8, step chart of the cumulative percentage of infants.]</p>
<p>Charting cumulative ordinal or discrete metric data</p>
<p>Cumulative frequency curve</p>
<p>[Figure: Exercise 3.9, ogive. Series: attempting suicide, later successful. Age groups: 15-24, 25-34, 35-44, 45-54, 55-64, 65-74, 75-84, &gt; 85.]</p>
<p>Percentage cumulative frequency curves of age for male suicide attempters and later 
succeeders</p>
<p>Data collection, types and quality: Evidence-based decisions are only as good as the evidence itself. Thus it is important that the data gathered for creating evidence are correct.</p>
<p>Methods such as interview, examination and investigations are available.</p>
<p>The investigator must decide which method is best for a particular piece of information.</p>
<p>Data can be either quantitative or qualitative. Qualitative data can be on a nominal scale or an ordinal scale. Quantitative data are on a metric scale.</p>
<p>Nominal scale data: Each value can be allocated to one of a number of categories. Examples: blood type, sex (male/female). There is no meaningful order.</p>
<p>Ordinal scale data: Each value can be allocated to one of a number of categories, and the categories can be put in a meaningful order. Example: very satisfied, satisfied, neutral, unsatisfied, very unsatisfied.</p>
<p>Discrete metric data: Countable variables, in integer form. Numbers of things. Examples: age, number of men.</p>
<p>Continuous metric data: Measurable variables, which may be rounded to the nearest integer. Examples of units: kg, m, mmHg, hours, years.</p>
<p>Quality of data is assessed in terms of the validity and reliability of the measurements, or of the tools used to obtain the data.</p>
<p>Validity: the ability to correctly measure the characteristic that the tool purports to measure.</p>
<p>For tests, this is assessed in terms of sensitivity and specificity, and positive and negative predictive values.</p>
<p>Reliability: the ability to give the same result when used repeatedly in identical conditions.</p>
<p>Statistical analyses: Descriptive statistics describe the sample. Inference makes inferences about the population.</p>
<p>Inference is primarily performed in two ways: hypothesis testing and estimation (more important!).</p>
<p>Prediction</p>
<p>Descriptive statistics: Descriptive statistics are a way of summarizing the complexity of the data with a single number.</p>
<p>A. 
For one variable ("univariate analysis"): measures of CENTRAL TENDENCY (averages) and of DISPERSION or variance around that average. Examples: means, modes, medians, standard deviation, quartiles.</p>
<p>B. Descriptive statistics for the strength of the relationship between two variables (bivariate analysis) or among a set of variables (multivariate analysis) are measures of ASSOCIATION or correlation.</p>
<p>Measure of central tendency. Nominal &amp; ordinal: frequencies, percents, medians, modes (all). Interval &amp; ratio: means.</p>
<p>Measure of dispersion. Nominal &amp; ordinal (qualitative): range, quartiles. Interval &amp; ratio (quantitative): standard deviation.</p>
<p>Measure of association. Nominal &amp; ordinal: cross-tabulation, non-parametric measures (phi, gamma, eta, lambda, tau-b, etc.). Interval &amp; ratio: Pearson's r.</p>
<p>Measure of significance. Nominal &amp; ordinal: chi-square. Interval &amp; ratio: t-test, ANOVA (F-ratio).</p>
<p>Inferential statistics are measures of the SIGNIFICANCE of the relationship between two or more variables. Significance refers to the probability that the findings could be attributed to sampling error. 
Appropriate statistics depend on the LEVEL OF MEASUREMENT OF THE DEPENDENT VARIABLE (and of the independent variable).</p>
<p>Parameters: Summary measures, such as the mean and standard deviation, can be obtained for a sample as well as for the entire population. Summary measures obtained for the entire target population are called parameters.</p>
<p>The values of parameters are hardly ever known, because nobody has the time and resources to study the entire population.</p>
<p>When parameter values are unknown, as is almost invariably the case, it becomes necessary to fall back on samples to get some tangible lead regarding the characteristics of the population.</p>
<p>Measures such as the mean and SD, when obtained for sample subjects, are called statistics.</p>
<p>Standard deviation and normal distribution (figure label: mean)</p>
<p>Tests of parametric significance</p>
<p>1) Student t-test: for comparison of means between 2 groups.</p>
<p>2) ANOVA F-test: for comparison of means in three or more groups.</p>
<p>(Both of the above tests require that the means follow a Gaussian distribution, and hence they are called parametric tests.)</p>
<p>Nonparametric tests: When the sample size is very small and the distribution is skewed, parametric tests cannot be used. In such cases, non-parametric tests (which are less powerful than parametric tests) are used.</p>
<p>For paired data, the commonly used non-parametric tests are the sign test and the Wilcoxon signed-rank test.</p>
<p>For unpaired two-sample data, the non-parametric test is the Mann-Whitney test.</p>
<p>Another important non-parametric test is the chi-square test (used for nominal data), a test of proportions. It is used to test the significance of the association between two or more qualitative characteristics.</p>
<p>Point estimation and standard error: It is a reality that samples in all likelihood will differ from one another.</p>
<p>Even so, there is rarely a need for a second sample in scientific endeavours, provided the first is chosen with due precautions such as random selection and 
inclusion of a sufficient number of individuals.</p>
<p>In such cases, summary measures based on one sample alone are considered good estimates of the respective characteristics of the target population. These are called point estimates.</p>
<p>Although point estimates obtained from a carefully drawn sample are fairly representative of the population parameters, uncertainties arising out of sampling variation must be taken into account.</p>
<p>Sampling variation is the reality that samples in all likelihood will differ from one another.</p>
<p>The standard error (SE) of the mean quantifies these uncertainties. Point estimates are reliable only when the SE is small.</p>
<p>Confidence interval: When the SE is large, an interval estimate should be obtained. This is also called the confidence interval. It is the range that is very likely to contain the parameter value.</p>
<p>This likelihood is called the confidence level. Generally a 95% confidence level is used. The 95% CI is obtained as statistic ± 2 SE of that statistic.</p>
<p>Null hypothesis: It is the hypothesis that says that there is no difference, or that asserts the existing knowledge or claim, and it is tested for refutation by the study.</p>
<p>For example: newer drug B is not better than existing drug A for relieving toothache.</p>
<p>A null hypothesis is sought to be refuted by conducting a study.</p>
<p>A null hypothesis is either rejected or not rejected; it is never accepted.</p>
<p>The alternative hypothesis is the assertion that is accepted when the null is rejected.</p>
<p>Note that the alternative is accepted when the null is rejected, but nothing is accepted when the null is not rejected.</p>
<p>Evidence against the null: In medical studies, evidence is provided in terms of the results of a trial conducted on some patients, or observations regarding natural occurrences in one or more groups of people.</p>
<p>The evidence is considered sufficient against the null hypothesis if (1) the study is unbiased, (2) there are no 
confounders that can affect the findings, and (3) the sample size is sufficient to inspire confidence in the results and sampling fluctuations are minimal.</p>
<p>Type I error and P-values: A Type I error occurs when a true null hypothesis is rejected because of wrong evidence provided by the data. This is a serious error.</p>
<p>The P-value is the probability, calculated assuming the null hypothesis is true, of obtaining a difference at least as large as the one observed. Loosely, it is the chance that the presence of a difference is concluded when actually there is none.</p>
<p>It is this Type I error that later forces a ban on some drugs after they have been licensed for marketing.</p>
<p>The maximum tolerable probability of a Type I error is called the significance level. It is denoted by α and is fixed in advance, generally at 0.05 (5 percent).</p>
<p>The P-value is calculated on the basis of the data, whereas α is fixed in advance.</p>
<p>When P</p>