What is Statistics?
- Statistics is a way to get information from data.
- Statistics is the study of the collection, organization, analysis, interpretation, and presentation of data.
Two Kinds of Statistics
Descriptive statistics consists of methods for organizing, summarizing, and presenting data in a convenient and informative way. These methods include graphical techniques and numerical techniques.
Inferential statistics consists of methods for drawing conclusions about a population based on information obtained from a sample of the population.
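As a minimal sketch of the two kinds of statistics, the Python snippet below first summarizes a small sample numerically (descriptive) and then uses the sample to estimate a population quantity (inferential). The exam scores are made-up values used only for illustration, and the standard error shown is just one simple inferential quantity, not a full inference procedure.

```python
import math
import statistics

# Hypothetical sample of exam scores (made-up values, not real data).
scores = [62, 75, 75, 81, 90]

# Descriptive statistics: numerical summaries of the sample itself.
summary = {
    "mean": statistics.mean(scores),
    "median": statistics.median(scores),
    "mode": statistics.mode(scores),
    "stdev": statistics.stdev(scores),  # sample standard deviation (divides by n - 1)
}

# Inferential statistics: use the sample to say something about the population.
# The sample mean serves as a point estimate of the unknown population mean,
# and the standard error indicates how much that estimate could vary
# from one sample to another.
point_estimate = summary["mean"]
standard_error = summary["stdev"] / math.sqrt(len(scores))

print(summary)
print(point_estimate, standard_error)
```

A graphical technique, such as a histogram of the scores, would be the other half of the descriptive toolkit; it is omitted here to keep the sketch dependency-free.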
Population and Sample
The population is the collection of all individuals, items, or data under consideration in a statistical study.
The sample is a part of the population from which information is collected.
Census and Survey
A census is a study of the entire population. We collect information from the whole population.
A survey is a study of a sample drawn from the population. We collect information from the sample only.
Parameter and Statistic
A parameter is a number/numerical quantity that describes a characteristic of the population.
A statistic is a number/numerical quantity that describes a characteristic of a sample.
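The relationship between population, sample, parameter, and statistic can be illustrated with a short Python sketch. The population here is simulated (1,000 hypothetical heights in centimetres), and the sample size of 50 is an arbitrary choice; neither comes from a real study.

```python
import random
import statistics

random.seed(0)  # fixed seed so the sketch is repeatable

# Hypothetical population: simulated heights (cm) of 1,000 individuals.
population = [random.gauss(170, 10) for _ in range(1000)]

# Parameter: describes the whole population (what a census would measure).
population_mean = statistics.mean(population)

# Sample: a part of the population, drawn at random without replacement.
sample = random.sample(population, 50)

# Statistic: describes the sample only (what a survey would measure).
sample_mean = statistics.mean(sample)

print(f"parameter (population mean): {population_mean:.2f}")
print(f"statistic (sample mean):     {sample_mean:.2f}")
```

The two means will generally differ somewhat; in practice the parameter is usually unknown, and the statistic is what we actually get to compute.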
Variables and Data
Types of variables
A variable is a characteristic of each person or thing in the population, one that varies from one person or thing to another. There are several types of variables, as described below.
Types of Data
Data are values of a variable. There are two ways to classify data, as described below.
Qualitative data are obtained by observing values of a qualitative variable. They are most often nonnumerical.
Quantitative data are obtained by observing values of a quantitative variable. They are inherently numerical.
Discrete data are obtained by observing values of a discrete variable. These data can take on only certain values. These values are often integers or whole numbers.
Nominal data are categorical data, including numbers that are used simply as identifiers or names.
Ordinal data represent an ordered series of relationships or a rank order. Here the order of the values is important and significant, but the differences between them are not known.
Interval data are numeric data for which we know not only the order but also the exact differences between the values. The limitation of interval data is that they do not have a “true zero,” so ratios between values are not meaningful.
Ratio data tell us the order and the exact differences between values, and they also have an absolute zero, which allows a wide range of both descriptive and inferential statistics to be applied.
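The difference between interval and ratio data can be made concrete with temperature, a common textbook example that is not taken from the notes above: Celsius readings are usually treated as interval data (0 °C is not "no temperature"), while Kelvin readings are ratio data (0 K is an absolute zero). The Python sketch below, with made-up readings, shows that differences agree on both scales but ratios do not.

```python
def celsius_to_kelvin(celsius):
    """Convert a temperature from degrees Celsius to kelvin."""
    return celsius + 273.15

# Two hypothetical temperature readings in degrees Celsius.
a_c, b_c = 10.0, 20.0
a_k, b_k = celsius_to_kelvin(a_c), celsius_to_kelvin(b_c)

# Differences are meaningful for interval data: both scales agree
# (up to floating-point rounding) that the readings are 10 units apart.
diff_c = b_c - a_c
diff_k = b_k - a_k

# Ratios are only meaningful for ratio data. On the Celsius scale the
# ratio is 2.0, but 20 °C is not "twice as hot" as 10 °C; the Kelvin
# ratio, based on a true zero, is only slightly above 1.
ratio_c = b_c / a_c
ratio_k = b_k / a_k

print(diff_c, diff_k)
print(ratio_c, ratio_k)
```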
Classification of Data Based on Who Collected the Data
Primary data are data collected by the investigator himself/herself for the specific purpose of addressing the problem at hand (for example, through questionnaires, telephone interviews, or face-to-face interviews).
Some Advantages of Using Primary Data
- The investigator collects data specific to the problem under study.
- The investigator has no doubt about the quality of the data collected.
- If required, it may be possible to obtain additional data during the study period.
Some Disadvantages of Using Primary Data
- The investigator has to contend with all the hassles of data collection.
- The investigator must ensure that the data collected are of high quality.
- Cost of obtaining the data is often the major expense in studies.
Secondary data are data collected by someone else for some other purpose, but used by the investigator for his/her own purpose.
Some Advantages of Using Secondary Data
- There are no hassles of data collection.
- It is less expensive.
- The investigator is not personally responsible for the quality of data (“I didn’t do it”).
Some Disadvantages of Using Secondary Data
- The investigator cannot decide what is collected.
- One can only hope that the data is of good quality.
- Obtaining additional data is usually not possible.