Statistics Overview - University of Kentucky

Statistics Overview
Some New, Some Old, Some to Come

Science of Statistics
Descriptive Statistics: methods of summarizing or describing a set of data (tables, graphs, numerical summaries).
Inferential Statistics: methods of making inference about a population based on the information in a sample.

Levels of Measurement
Nominal: The numerical values just "name" the attribute uniquely; no ordering of the cases is implied.

Ordinal: Attributes can be rank-ordered; here, distances between attributes do not have any meaning.
Interval: The distance between attributes does have meaning.
Ratio: There is always an absolute zero that is meaningful; this means that you can construct a meaningful ratio.

It's important to recognize that there is a hierarchy implied in the level of measurement idea. At each level up the hierarchy, the current level includes all of the qualities of the one below it and adds something new. In general, it is desirable to have a higher level of measurement.

Variables
Individuals are the objects described by a set of data; they may be people, animals, or things.
A variable is any characteristic of an individual.
A categorical variable places an individual into one of several groups or categories.
A quantitative variable takes numerical values for which arithmetic operations make sense.
The distribution of a variable tells us what values it takes and how often it takes these values.

Correlation

Correlation can be used to summarize the amount of linear association between two continuous variables x and y. A positive association between the x and y variables is indicated by an increase in x accompanied by an increase in y. A negative association is indicated by an increase in x accompanied by a decrease in y. For more information see http://www.anu.edu.au/nceph/surfstat/surfstat-home/1-4-2.html

Chi-square
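Correlation example: the positive and negative linear associations described for x and y can be illustrated with a short Python sketch. It assumes the Pearson product-moment coefficient is the intended measure (the slides do not name one), and the function name and the small data sets are made up for illustration.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient for two equal-length numeric sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    # Sum of cross-products and sums of squares around the means.
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined when a variable is constant")
    return sxy / math.sqrt(sxx * syy)


# Hypothetical data: y rises with x in the first case and falls in the second.
x = [1, 2, 3, 4, 5]
y_rising = [2, 4, 6, 8, 10]
y_falling = [10, 8, 6, 4, 2]

r_positive = pearson_r(x, y_rising)
r_negative = pearson_r(x, y_falling)
print(r_positive, r_negative)
```

A coefficient near +1 corresponds to the positive association described above, and a coefficient near -1 to the negative one.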

A chi-square statistic is used to investigate whether distributions of categorical variables differ from one another. The chi-square distribution, like the t distribution, forms a family described by a single parameter, the degrees of freedom. For a table with r rows and c columns, df = (r - 1) x (c - 1). For a detailed example, see http://math.hws.edu/javamath/ryan/ChiSquare.html

Hypothesis Testing
Hypothesis testing in science is a lot like the criminal court system in the United States. Consider: how do we decide guilt?
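Chi-square example: the statistic and its degrees of freedom, df = (r - 1) x (c - 1), can be sketched in Python as follows. This assumes the usual Pearson chi-square test of independence on a table of observed counts; the function name and the 2 x 2 counts are hypothetical, and 3.841 is the standard 5% critical value for df = 1.

```python
def chi_square_independence(table):
    """Return (statistic, df) for an r x c table of observed counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)

    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count if the row and column variables were independent.
            expected = row_totals[i] * col_totals[j] / grand_total
            statistic += (observed - expected) ** 2 / expected

    df = (len(table) - 1) * (len(table[0]) - 1)
    return statistic, df


# Hypothetical counts: two groups (rows) by two response categories (columns).
observed = [[20, 30],
            [30, 20]]
chi2, df = chi_square_independence(observed)

CRITICAL_VALUE_05_DF1 = 3.841  # upper 5% point of chi-square with df = 1
reject_h0 = chi2 > CRITICAL_VALUE_05_DF1
print(chi2, df, reject_h0)
```

For larger tables the critical value would have to be looked up for the corresponding df; the sketch hard-codes only the df = 1 case.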

Assume innocence until "proven" guilty.
Proof has to be "beyond a reasonable doubt."
Two possible decisions: guilty or not guilty.
The jury cannot declare someone innocent.

Statistical Hypotheses
Statistical hypotheses are statements about population parameters. Hypotheses are not necessarily true.

The hypothesis that we want to prove is called the alternative hypothesis, Ha. The hypothesis that contradicts Ha is called the null hypothesis, Ho. After taking the sample, we must either:
Reject Ho and believe Ha, or
Fail to reject Ho because there was not sufficient evidence to reject it.

Type I and II Error
Consider the jury trial.

If a person is really innocent, but the jury decides (s)he's guilty, then they've sent an innocent person to jail. This is a Type I error. If a person is really guilty, but the jury finds him/her not guilty, a criminal is walking free on the streets. This is a Type II error. In our criminal court system, a Type I error is considered more serious than a Type II error, so we protect against a Type I error to the detriment of a Type II error. This is typically the same in statistics.

Decision             Truth: Ho is true    Truth: Ho is false
Reject Ho            Type I Error         OK
Fail to Reject Ho    OK                   Type II Error

P-value
The choice of alpha is subjective. The smaller alpha is, the smaller the critical region, and thus the harder it is to reject Ho. The p-value of a hypothesis test is the smallest value of alpha for which Ho would have been rejected. If the p-value is less than or equal to alpha, reject Ho. If the p-value is greater than alpha, do not reject Ho.

Confidence Intervals
Statisticians prefer interval estimates.
Point Estimate +/- Critical Value * Standard Error
The degree of certainty that we are correct is known as the level of confidence.
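The p-value decision rule and the interval formula (point estimate +/- critical value * standard error) can be sketched together in Python. This is a minimal illustration that assumes a two-sided one-sample z test with a known population standard deviation; the function name and the numbers are hypothetical, and `statistics.NormalDist` requires Python 3.8 or later.

```python
from math import sqrt
from statistics import NormalDist


def z_test_and_interval(sample_mean, mu0, sigma, n, alpha=0.05):
    """Two-sided z test of Ho: mu = mu0, plus a (1 - alpha) confidence interval."""
    standard_normal = NormalDist()
    std_error = sigma / sqrt(n)
    z = (sample_mean - mu0) / std_error

    # Two-sided p-value: probability of a |Z| at least this large under Ho.
    p_value = 2 * (1 - standard_normal.cdf(abs(z)))

    # Point estimate +/- critical value * standard error.
    critical_value = standard_normal.inv_cdf(1 - alpha / 2)
    margin = critical_value * std_error

    return {
        "z": z,
        "p_value": p_value,
        "reject_h0": p_value <= alpha,  # reject Ho when p-value <= alpha
        "interval": (sample_mean - margin, sample_mean + margin),
    }


# Hypothetical sample: n = 100, mean 52, known sigma 10, testing Ho: mu = 50.
at_95 = z_test_and_interval(52.0, 50.0, 10.0, 100, alpha=0.05)
at_99 = z_test_and_interval(52.0, 50.0, 10.0, 100, alpha=0.01)
print(at_95)
print(at_99)
```

With these made-up numbers the same p-value leads to rejecting Ho at alpha = 0.05 but not at alpha = 0.01, and the 99% interval is wider than the 95% interval.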

Common levels are 90%, 95%, and 99%.
Increasing the level of confidence:
  decreases the probability of error
  increases the critical point
  widens the interval
Increasing n decreases the width of the interval.

Gamma
This is a statistic used in cross-tabulation tables.

Typically viewed as a nonparametric statistic. The Gamma statistic is preferable to Spearman R or Kendall tau when the data contain many tied observations. Gamma is built from probabilities; specifically, it is computed as the probability that the rank orderings of the two variables agree minus the probability that they disagree, divided by 1 minus the probability of ties. It is basically equivalent to Kendall tau, except that ties are explicitly taken into account. Detailed discussions of the Gamma statistic can be found in Goodman and Kruskal (1954, 1959, 1963, 1972), Siegel (1956), and Siegel and Castellan (1988).
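The definition above can be sketched in Python by counting pairs of observations. Dividing the agree and disagree probabilities by 1 minus the probability of ties is the same as computing (C - D) / (C + D), where C and D are the numbers of concordant and discordant pairs and tied pairs are dropped. The function name and the ordinal scores are hypothetical.

```python
from itertools import combinations


def goodman_kruskal_gamma(x, y):
    """Gamma = (C - D) / (C + D), ignoring pairs tied on x or on y."""
    concordant = 0
    discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        product = (x1 - x2) * (y1 - y2)
        if product > 0:
            concordant += 1  # both variables order the pair the same way
        elif product < 0:
            discordant += 1  # the variables order the pair in opposite ways
        # product == 0 means a tie on at least one variable: skip the pair
    if concordant + discordant == 0:
        raise ValueError("gamma is undefined when every pair is tied")
    return (concordant - discordant) / (concordant + discordant)


# Hypothetical ordinal scores (1 = low, 3 = high) with several ties.
x_rank = [1, 1, 2, 2, 3, 3]
y_rank = [1, 2, 1, 2, 2, 3]
gamma = goodman_kruskal_gamma(x_rank, y_rank)
print(gamma)
```

This pairwise loop is quadratic in the number of observations, which is fine for a small illustration; for large cross-tabulations one would normally count pairs from the table cells instead.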

Gamma
This statistic also tells us about the strength of a relationship. It can be used with ordinal or higher-level data. For a more detailed discussion of Lambda, Gamma and Tau, see http://72.14.209.104/search?q=cache:8ZS4_FvVqrgJ:ms.cc.sunysb.edu/~mlebo/_private/Classes/POL501/Lecture%252012.pdf+gamma+AND+lambda+AND+tau+AND+statistics&hl=en&gl=us&ct=clnk&cd=39

Considering Bias

A sample is expected to mirror the population from which it comes; however, there is no guarantee that any sample will be precisely representative of that population. The difference between the sample and the population is referred to as bias.

Sampling Bias
A tendency to favor selecting people that have a particular characteristic or set of characteristics. Sampling bias is usually the result of a poor sampling plan. The most notable is nonresponse bias, when people with specific characteristics have no chance of appearing in the sample.

Non-Sampling Error
In surveys of personal characteristics, unintended errors may result from:
  The manner in which the response is elicited
  The social desirability of the persons surveyed
  The purpose of the study
  The personal biases of the interviewer or survey writer

Enjoy the exploration! Questions or comments?
