Chow, S. L. (2002). STATISTICS AND ITS ROLE IN PSYCHOLOGICAL RESEARCH. In Methods in Psychological Research, In Encyclopedia of Life Support Systems (EOLSS), Eolss Publishers, Oxford, UK, [http://www.eolss.net]

Siu L. Chow
Department of Psychology, University of Regina, Canada

Keywords: associated probability, conditional probability, confidence-interval estimate, correlation, descriptive statistics, deviation score, effect size, inferential statistics, random sampling distribution, regression, standard deviation, standard error, statistical power, statistical significance, sum of squares, test statistic, Type I error, Type II error

Contents
1. Introduction
2. Descriptive Statistics
3. Bridging Descriptive and Inferential Statistics
4. Inferential Statistics
5. Effect Size and Statistical Power

Summary

As readers will have noticed, some everyday words are given technical meanings in statistical parlance (e.g. “mean,” “normal,” “significance,” “effect,” and “power”). It is necessary to resist the temptation of conflating their vernacular and technical meanings. A failure to do so may have a lot to do with the ready acceptance of the “effect size” and “power” arguments in recent years.

To recapitulate, statistics is used (i) to describe succinctly data in terms of the shape, central tendency, and dispersion of their simple frequency distribution, and (ii) to make decisions about the properties of the statistical populations on the basis of sample statistics. Statistical decisions are made with reference to a body of theoretical distributions: the distributions of various test statistics that are in turn derived from the appropriate sample statistics. In every case, the calculated test statistic is compared to the theoretical distribution, which is made up of an infinite number of tokens of the test statistic in question. Hence, the “in the long run” caveat should be made explicit in every probabilistic statement based on inferential statistics (e.g. “the result is significant at the 0.05 level in the long run”).

Despite the recent movement to discourage psychologists from conducting significance tests, significance tests can be (and ought to be) defended by (i) clarifying some concepts, (ii) examining the role of statistics in empirical research, and (iii) showing that the sampling distribution of the test statistic is both the bridge between descriptive and inferential statistics and the probability foundation of significance tests.

1. Introduction

Statistics, as a branch of applied mathematics, consists of univariate and multivariate procedures. Psychologists use univariate procedures when they measure only one variable; they use multivariate procedures when multiple variables are used (a) to ascertain the relationship between two or more variables, (b) to derive the test statistic, or (c) to extract factors (or latent variables). As multivariate statistics is introduced in The Construction and Use of Psychological Tests and Measures, this article is almost exclusively about univariate statistics. The exception is the topic of linear correlation and regression.

Before proceeding, a distinction needs to be made between the substantive population and the statistical population. Suppose that an experiment is carried out to study the effects of diet supplements on athletic performance. The substantive population consists of all athletes. The sample selected from the substantive population is divided into two sub-samples. The experimental sub-sample receives the prescribed diet supplements and the control sub-sample receives a placebo. In this experimental context, the two groups are not samples of the substantive population, “all athletes.” Instead, they are samples of two statistical populations defined by the experimental manipulation: “athletes given diet supplements” and “athletes given the placebo.” In general terms, even if there is only one substantive population in an empirical study, there are as many statistical populations as there are data-collection conditions. This has the following five implications.

First, statistics deal with methodologically defined statistical populations. Second, statistical conclusions are about data in their capacity to represent the statistical populations, not about substantive issues. Third, apart from very exceptional cases, research data (however numerous) are treated as sample data. Fourth, testing the statistical hypothesis is not corroborating the substantive theory. Fifth, data owe their substantive meanings to the theoretical foundation of the research (for the three embedding conditional syllogisms, see Experimentation in Psychology--Rationale, Concepts, and Issues).

Henceforth, “population” and “sample” refer to statistical population and statistical sample, respectively. A parameter is a property of the population, whereas a statistic is a characteristic of the sample. A test statistic (e.g. Student's t) is an index derived from the sample statistic. The test statistic is used to make a statistical decision about the population.

In terms of utility, statistics is divided into descriptive and inferential statistics. Psychologists use descriptive statistics to describe research data succinctly. The sample statistic (e.g. the sample mean, X̄) thus obtained is used to derive the test statistic (e.g. Student's t) that features in inferential statistics. This is made possible by virtue of the “random sampling distribution” of the sample statistic. Inferential statistics consists of procedures used for (a) drawing conclusions about a population parameter on the basis of a sample statistic, and (b) testing statistical hypotheses.

2. Descriptive Statistics

To measure something is to assign numerical values to observations according to some well-defined rules. The rules give rise to data at four levels: categorical, ordinal, interval, or ratio. A preliminary step in statistical analysis is to organize the data in terms of the research design. Psychologists use descriptive statistics to transform and describe succinctly their data in either tabular or graphical form. These procedures provide the summary indices used in further analyses.

2.1. Four Levels of Measurement

Using numbers to designate or categorize observation units is measurement at the nominal or categorical level. An example is the number on the bus that signifies its route. Apart from counting, nominal data are amenable to no other statistical procedure.

An example of ordinal data is the result of ranking or rating research participants in terms of some quality (e.g. their enthusiasm). The interval between two successive ranks (or ratings) is indeterminate. Consequently, the difference between any two consecutive ranks (e.g. Ranks 1 and 2) may not be the same as that between another pair of consecutive ranks (e.g. Ranks 2 and 3).

Temperature is an example of interval-scale measurement. The size of two successive intervals is constant.
For example, the difference between 20 °C and 30 °C is the same as that between 10 °C and 20 °C. However, owing to the fact that 0 °C does not mean the complete absence of heat (i.e. there is no absolute zero in the Celsius scale), it is not possible to say that 30 °C is twice as warm as 15 °C.

In addition to having a constant difference between two successive intervals, it is possible to make a definite statement about the ratio between two distances by virtue of the fact that 0 m means no distance. Hence, a distance of 4 km is twice as far as 2 km because of the absolute zero in the variable, distance. Measurements that have properties like those of distance are ratio data.

2.2. Data: Raw and Derived

Suppose that subjects are given 60 minutes to solve as many anagram problems as possible. The scores thus obtained are raw scores when they are not changed numerically in any way. In a slightly different data collection situation, the subjects may be allowed as much time as they need. Their data may be converted into the average number of problems solved in a 30-minute period or the average amount of time required to solve a problem. That is, derived data may be obtained by applying an appropriate arithmetic operation to the raw scores so as to render the research data more meaningful.

2.3. Data Tabulation and Distributions

Data organization is guided by considering the best way (i) to describe the entire set of data without enumerating them individually, (ii) to compare any score to the rest of the scores, (iii) to determine the probability of obtaining a score with a particular value, (iv) to ascertain the probability of obtaining a score within or outside a specified range of values, (v) to represent the data graphically, and (vi) to describe the graphical representation thus obtained.

2.3.1. Simple Frequency Distribution

The entries in panel 1 of Table 1 represent the performance of 25 individuals. This method of presentation becomes impracticable if scores are more numerous. Moreover, it is not conducive to carrying out the six objectives just mentioned. Hence, the data are described in a more useful way by (a) identifying the various distinct scores (the “Score” row in panel 2), and (b) counting the number of times each score occurs (i.e. the “Frequency” row in panel 2). This way of representing the data is the tabular “simple frequency distribution” (or “frequency distribution” for short).

Table 1.
Various ways of tabulating data

Panel 1: A complete enumeration of all the scores
15 14 14 13 13 13 12 12 12 12 11 11 11 11 11 10 10 10 10 9 9 9 8 8 7

Panel 2: The simple frequency distribution
Score      15  14  13  12  11  10   9   8   7
Frequency   1   2   3   4   5   4   3   2   1

Panel 3: Distributions derived from the simple frequency distribution
(1)    (2)        (3)         (4)         (5)        (6)
Score  Frequency  Cumulative  Cumulative  Relative   Cumulative relative
                  frequency   percentage  frequency  frequency
15      1         25          100         0.04       1.00
14      2         24           96         0.08       0.96
13      3         22           88         0.12       0.88
12      4         19           76         0.16       0.76
11      5         15           60         0.20       0.60
10      4         10           40         0.16       0.40
 9      3          6           24         0.12       0.24
 8      2          3           12         0.08       0.12
 7      1          1            4         0.04       0.04
Total  25                                 1.00

2.3.2. Derived Distributions

The frequency distribution tabulated in panel 2 of Table 1 is reproduced in columns 1 and 2 of panel 3. It is used to derive other useful distributions: (a) the cumulative frequency distribution (column 3), (b) the cumulative percentage distribution (column 4), (c) the relative frequency (probability) distribution (column 5), and (d) the cumulative probability distribution (column 6).

Cumulative frequencies are obtained by answering the question “How many scores equal or are smaller than X?” where X assumes every value in ascending order of numerical magnitude. For example, when X is 8, the answer is 3 (i.e. the sum of 1 plus 2) because there is one occurrence of 7 and two occurrences of 8. A cumulative percentage is obtained when a cumulative relative frequency is multiplied by 100.

A score’s frequency is transformed into its corresponding relative frequency when the frequency is divided by the total number of scores. As relative frequency is probability, the entries in column 5 are the respective probabilities of occurrence of the scores. Relative frequencies may be cumulated in the same way as are the frequencies. The results are the cumulative probabilities.

2.3.3. Utilities of Various Distributions

Psychologists derive various distributions from the simple frequency distribution to answer different questions. For example, the simple frequency distribution is used to determine the shape of the distribution (see Section 2.4.1. The Shape of the Simple Frequency Distribution). The cumulative frequency and cumulative percentage distributions make it easy to determine the standing of a score relative to the rest of the scores. For example, it can be seen from column 3 in panel 3 of Table 1 that 22 out of 25 scores have a value equal to or smaller than 13.
Similarly, column 4 shows that a score of 13 equals, or is better than, 88% of the scores.

The relative frequencies make it easy to determine the probability, or proportion of times, with which a particular score may occur (e.g. the probability of getting a score of 12 is 0.16 from column 5). Likewise, it is easily seen that the probability of getting a score between 9 and 12, inclusive, is 0.64 (i.e. 0.12 + 0.16 + 0.20 + 0.16). The cumulative probability distribution in column 6 is used to answer the following questions:
(a) What is the probability of getting a score whose value is X or larger?
(b) What is the probability of getting a score whose value is X or smaller?
(c) What are X1 and X2 such that they include 95% of all scores?

The probability in (a) or (b) is the associated probability of X. In like manner, psychologists answer questions about the associated probability of the test statistic with a cumulative probability distribution at a higher level of abstraction (see Section 3.2. Random Sampling Distribution of Means). The ability to do so is the very ability required in making statistical decisions about chance influences or using many of the statistical tables.

2.4. Succinct Description of Data

Research data are described succinctly by reporting three properties of their simple frequency distribution: its shape, central tendency, and dispersion (or variability).
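
As an aside on Sections 2.3.2 and 2.3.3, the derived distributions can be computed mechanically from the scores. The Python sketch below is illustrative only. The 25 scores are an assumption: they are reconstructed from the frequencies and probabilities stated in the text, not taken from an original data file. The names FREQUENCIES, derived_distributions, and prob_between are hypothetical.

```python
from collections import Counter

# Assumed data: score -> frequency, reconstructed from the values quoted in
# the text (e.g. one 7, two 8s, P(12) = 0.16, 22 of 25 scores <= 13).
FREQUENCIES = {7: 1, 8: 2, 9: 3, 10: 4, 11: 5, 12: 4, 13: 3, 14: 2, 15: 1}
scores = [score for score, f in FREQUENCIES.items() for _ in range(f)]


def derived_distributions(scores):
    """Return one row per distinct score, in ascending order:
    (score, frequency, cumulative frequency, cumulative percentage,
     relative frequency, cumulative relative frequency)."""
    counts = Counter(scores)
    n = len(scores)
    rows = []
    cum_f = 0
    for score in sorted(counts):
        f = counts[score]
        cum_f += f  # how many scores equal or are smaller than this score
        rows.append((score, f, cum_f, 100 * cum_f / n, f / n, cum_f / n))
    return rows


def prob_between(rows, low, high):
    """Probability of a score between low and high, inclusive."""
    return sum(row[4] for row in rows if low <= row[0] <= high)


table = derived_distributions(scores)
print("score  f  cum_f  cum_%  rel_f  cum_rel_f")
for score, f, cum_f, cum_pct, rel_f, cum_rel in reversed(table):
    print(f"{score:>5} {f:>2} {cum_f:>6} {cum_pct:>6.0f} {rel_f:>6.2f} {cum_rel:>10.2f}")
```

Calling prob_between(table, 9, 12) corresponds to the "between 9 and 12, inclusive" question in Section 2.3.3. Rows are built in ascending order because each cumulative frequency depends on all smaller scores; they are reversed only for printing.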

2.4.1. The Shape of the Simple Frequency Distribution

The shape of the simple frequency distribution depicted by columns 1 and 2 in panel 3 of Table 1 is seen when the frequency distribution is represented graphically in the form of a histogram (Figure 1a) or a polygon (Figure 1b). Columns 1 and 6 jointly depict the cumulative probability distribution whose shape is shown in Figure 1c. In all cases, the score-values are shown on the X or horizontal axis, whereas the frequency of occurrence of a score-value is represented on the Y or vertical axis.

A frequency distribution may be normal or non-normal in shape. The characterization “normal” in this context does not have any clinical connotation. It refers to the properties of being symmetrical and looking like a bell, as well as having two tails that extend to positive and negative infinity without touching the X axis. Any distribution that does not have these features is a non-normal distribution.

2.4.2. Measures of Central Tendency

Suppose that a single v