After psychologists develop a theory, form a hypothesis, make observations, and collect data, they end up with a lot of information, usually in the form of numerical data. The term statistics refers to the analysis and interpretation of these numerical data. Psychologists use statistics to organize, summarize, and interpret the information they collect.
To organize and summarize their data, researchers need numbers to describe what happened. These numbers are called descriptive statistics. Researchers may use histograms or bar graphs to show the way data are distributed. Presenting data this way makes it easy to compare results, see trends in data, and evaluate results quickly.
Example: Suppose a researcher wants to find out how many hours students study for three different courses. Each course has 100 students. The researcher surveys ten students in each course, asking them to write down the number of hours per week they spend studying for that course. The data look like this:
Course A student | Hours per week | Course B student | Hours per week | Course C student | Hours per week |
Joe | 9 | Hannah | 5 | Meena | 6 |
Peter | 7 | Ben | 6 | Sonia | 6 |
Zoey | 8 | Iggy | 6 | Kim | 7 |
Ana | 8 | Louis | 6 | Mike | 5 |
Jose | 7 | Keesha | 7 | Jamie | 6 |
Lee | 9 | Lisa | 6 | Ilana | 6 |
Joshua | 8 | Mark | 5 | Lars | 5 |
Ravi | 9 | Ahmed | 5 | Nick | 20 |
Kristen | 8 | Jenny | 6 | Liz | 5 |
Loren | 1 | Erin | 6 | Kevin | 6 |
To get a better sense of what these data mean, the researcher can plot them on a bar graph. Histograms or bar graphs for the three courses might look like this:
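The counts behind such graphs can also be tallied directly from the table. The following is a minimal sketch, not the researcher's own method: the names `hours`, `frequency_table`, and `text_bar_graph` are made up for illustration, and the text bars only approximate a real bar graph.

```python
from collections import Counter

# Hours per week reported by the ten surveyed students in each course
# (values copied from the table above).
hours = {
    "Course A": [9, 7, 8, 8, 7, 9, 8, 9, 8, 1],
    "Course B": [5, 6, 6, 6, 7, 6, 5, 5, 6, 6],
    "Course C": [6, 6, 7, 5, 6, 6, 5, 20, 5, 6],
}

def frequency_table(scores):
    """Count how many students reported each number of hours."""
    return dict(sorted(Counter(scores).items()))

def text_bar_graph(scores):
    """Return one line per reported value, with one '#' per student."""
    table = frequency_table(scores)
    return [f"{value:>2} h | {'#' * count}" for value, count in table.items()]

for course, scores in hours.items():
    print(course)
    for line in text_bar_graph(scores):
        print("  " + line)
```

In this output, Course A shows one isolated bar at 1 hour (Loren) and Course C shows one isolated bar at 20 hours (Nick), while the rest of the scores cluster together.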
Researchers summarize their data by calculating measures of central tendency, such as the mean, the median, and the mode. The most commonly used measure of central tendency is the mean, which is the arithmetic average of the scores. The mean is calculated by adding up all the scores and dividing the sum by the number of scores.
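To make the calculation concrete, here is a small sketch that applies the "add up and divide" rule to the survey data from the example. The function name `mean` and the dictionary `hours` are illustrative choices, not part of the original study.

```python
# Hours per week from the survey table in the example.
hours = {
    "Course A": [9, 7, 8, 8, 7, 9, 8, 9, 8, 1],
    "Course B": [5, 6, 6, 6, 7, 6, 5, 5, 6, 6],
    "Course C": [6, 6, 7, 5, 6, 6, 5, 20, 5, 6],
}

def mean(scores):
    """Arithmetic average: the sum of the scores divided by the number of scores."""
    return sum(scores) / len(scores)

for course, scores in hours.items():
    print(f"{course}: sum = {sum(scores)}, n = {len(scores)}, mean = {mean(scores)}")

# Course A: sum = 74, n = 10, mean = 7.4
# Course B: sum = 58, n = 10, mean = 5.8
# Course C: sum = 72, n = 10, mean = 7.2
```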
However, the mean is not a good summary measure to use when the data include a few extremely high or extremely low scores. A distribution with a few very high scores is called a positively skewed distribution. A distribution with a few very low scores is called a negatively skewed distribution. The mean of a positively skewed distribution will be deceptively high, and the mean of a negatively skewed distribution will be deceptively low. When working with a skewed distribution, the median is a better measure of central tendency. The median is the middle score when all the scores are arranged in order from lowest to highest. When there is an even number of scores, the median is the average of the two middle scores.
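The survey data from the example illustrate both kinds of skew. The sketch below compares the mean and the median for Course C, where Nick's 20 hours is an extremely high score, and for Course A, where Loren's 1 hour is an extremely low score. The helper names are illustrative, and the `median` function follows the standard convention of averaging the two middle scores when the number of scores is even.

```python
def mean(scores):
    """Sum of the scores divided by the number of scores."""
    return sum(scores) / len(scores)

def median(scores):
    """Middle score after sorting; average of the two middle scores if n is even."""
    ordered = sorted(scores)
    n = len(ordered)
    middle = n // 2
    if n % 2 == 1:
        return ordered[middle]
    return (ordered[middle - 1] + ordered[middle]) / 2

course_a = [9, 7, 8, 8, 7, 9, 8, 9, 8, 1]   # one very low score (1)
course_c = [6, 6, 7, 5, 6, 6, 5, 20, 5, 6]  # one very high score (20)

# Positively skewed: the single high score pulls the mean above the median.
print(mean(course_c), median(course_c))  # 7.2 6.0

# Negatively skewed: the single low score pulls the mean below the median.
print(mean(course_a), median(course_a))  # 7.4 8.0
```

In both courses the median stays at the value most students actually reported, while the mean shifts toward the one extreme score.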