If you work with statistics or other technical data, you may need to read a histogram. A histogram is a specific visual representation of data: a graph that uses bars without spaces between them to show how many values fall into each group or range of a set of samples. For beginners who need to understand the data in a histogram and how to interpret it, this article presents some essential steps.
Part 1 of 2: Read a histogram
Step 1. Recognize the difference between a bar chart and a histogram
Bar charts and histograms are similar, but they have some very specific differences. A bar chart groups numbers into categories, while a histogram groups numbers into ranges. The latter are generally used to display the results of a continuous set of data, such as height, weight, time, etc.
- A bar chart usually has spaces between the bars, while histograms do not.
- A histogram usually shows frequency: how many times an event occurs within each defined range.
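The difference between grouping by category and grouping by range can be sketched in a few lines of Python. The sample values below are invented purely for illustration and do not come from any real data set.

```python
from collections import Counter

# Hypothetical sample data, invented purely for illustration.
positions = ["pitcher", "catcher", "pitcher", "outfielder", "pitcher"]
heights_m = [1.70, 1.84, 1.86, 1.91, 1.84]

# Bar chart logic: count how many items fall into each named category.
category_counts = Counter(positions)

# Histogram logic: count how many values fall into each numeric range.
# Each range includes its lower limit but not its upper limit.
ranges = [(1.65, 1.75), (1.75, 1.85), (1.85, 1.95)]
range_counts = [
    sum(1 for h in heights_m if low <= h < high)
    for low, high in ranges
]

print(category_counts)  # one count per category name
print(range_counts)     # one count per numeric range
```

The categories have no natural order or width, which is why bar chart bars stand apart. The ranges sit side by side on a number line, which is why histogram bars touch.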
Step 2. Read the axes of the graph
The X axis is the horizontal axis, and the Y axis is the vertical axis. Both provide essential information for reading the histogram. Most histograms show how often an event occurs, so the Y axis indicates the frequency. The X axis indicates the ranges into which the data is grouped.
For example, a histogram detailing the heights of pitchers in professional baseball will have an X axis for height and a Y axis for frequency.
Step 3. Identify the ranges used
The data is grouped into ranges or intervals for the chart. Choosing a suitable interval size is important for producing a graph that helps you interpret the results. Choose ranges that are neither too wide nor too narrow, so that you can observe the underlying pattern of frequency in the data.
- For example, the average height of a professional baseball pitcher is 1.88 meters (6 feet, 2 inches), but there are exceptions. Since heights are likely to fall between 1.68 meters (5 feet, 6 inches) and 1.98 meters (6 feet, 6 inches), intervals 2.5 to 5 centimeters (1 to 2 inches) wide suit the data.
- Each range includes its lower limit but not its upper limit. The first group can run from 1.68 meters (5 feet, 6 inches) up to, but not including, 1.73 meters (5 feet, 8 inches). A height of exactly 1.73 meters belongs to the next group.
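As a rough sketch of how these ranges behave, the following Python builds intervals 5 centimeters wide between 168 and 198 centimeters and looks up which interval a height belongs to. The function names `make_intervals` and `find_interval` are made up for this example, and heights are kept in whole centimeters so that values exactly on a limit compare cleanly.

```python
def make_intervals(start, stop, width):
    """Return (low, high) intervals of equal width covering start up to stop."""
    intervals = []
    low = start
    while low < stop:
        intervals.append((low, low + width))
        low += width
    return intervals


def find_interval(value, intervals):
    """Return the interval that contains value, or None if it lies outside."""
    for low, high in intervals:
        # The lower limit is included, the upper limit is not.
        if low <= value < high:
            return (low, high)
    return None


# Heights in whole centimeters: 1.68 m to 1.98 m in steps of 5 cm.
height_intervals = make_intervals(168, 198, 5)

print(height_intervals)
print(find_interval(172, height_intervals))  # still in the first group
print(find_interval(173, height_intervals))  # exactly on a limit: next group
```

A height of 173 centimeters lands in the second interval, not the first, which is the rule described above.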
Step 4. Use the top of the bar to read the frequency for that group
If you want to know how many times an event has occurred within a specific range, simply read the top of the bar and the value on the Y axis at that point.
For example, looking at the histogram, the number of players in the range from 1.83 meters (6 feet) to less than 1.88 meters (6 feet, 2 inches) is 50.
Part 2 of 2: Plot a histogram
Step 1. Gather the data to be plotted
If you have gathered data on how often something occurs, plotting it on a histogram is a good way to look at it. Whether you're looking at the number of copies sold of a specific book or the weight distribution of cows on a farm, histograms are an easy way to get an overview of how the data is distributed.
Step 2. Choose the intervals of the range
When representing your data, first decide how you want to divide it into ranges. Choose intervals that give you a good representation. Remember that they should be neither too wide nor too narrow.
- For example, suppose you have 10 data points for the weight of cows on your farm: 520, 635, 500, 725, 815, 700, 750, 615, 635, and 590 kilos (1150, 1400, 1100, 1600, 1800, 1550, 1650, 1350, 1400 and 1300 pounds). The values differ from one another by tens to hundreds of kilos, so the interval width should also be somewhere between tens and hundreds of kilos.
- Set intervals every 90 kilograms (200 pounds), starting at 500 kilograms (1,100 pounds) and ending at 860 kilograms (1,900 pounds).
- This gives a total of four intervals: 500-590, 590-680, 680-770 and 770-860 kilos (1100-1300, 1300-1500, 1500-1700 and 1700-1900 pounds).
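The effect of the interval width can be checked with a short calculation. The sketch below works in pounds, where the limits in the example are evenly spaced, and uses the lightest and heaviest cows from the example (1,100 and 1,800 pounds). The helper name `interval_edges` is made up for illustration.

```python
import math


def interval_edges(start, width, largest_value):
    """Return evenly spaced limits, beginning at start, that cover largest_value."""
    # Enough intervals that largest_value falls below the final limit.
    count = math.floor((largest_value - start) / width) + 1
    return [start + i * width for i in range(count + 1)]


lightest_lb = 1100
heaviest_lb = 1800

edges = interval_edges(lightest_lb, 200, heaviest_lb)
print(edges)

# Compare a width that is too wide, the chosen width, and one that is too narrow.
interval_counts = {}
for width in (1000, 200, 10):
    interval_counts[width] = len(interval_edges(lightest_lb, width, heaviest_lb)) - 1
    print(f"width {width} lb: {interval_counts[width]} interval(s)")
```

With only 10 data points, a single interval hides the pattern entirely, and dozens of intervals leave most bars empty. Four intervals is a reasonable middle ground for this data.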
Step 3. Separate the data into intervals
Once you have chosen the intervals, sort and organize the data based on them. Start by sorting the values in ascending order. Then draw a line at each division between intervals. Count the number of values that fall in each interval. This count is the frequency for that range.
- Remember that if a value is exactly equal to the limit between two intervals, it belongs to the interval on the right.
- For example, suppose you have 10 data points for the weight of cows on your farm: 520, 635, 500, 725, 815, 700, 750, 615, 635, and 590 kilos (1150, 1400, 1100, 1600, 1800, 1550, 1650, 1350, 1400 and 1300 pounds).
- Sort them in ascending order: 500, 520, 590, 615, 635, 635, 700, 725, 750, 815 kilos (1100, 1150, 1300, 1350, 1400, 1400, 1550, 1600, 1650, 1800 pounds).
- Divide them by intervals: 500 520 | 590 615 635 635 | 700 725 750 | 815 (1100 1150 | 1300 1350 1400 1400 | 1550 1600 1650 | 1800).
- Count the frequencies: interval 1: 2, interval 2: 4, interval 3: 3, interval 4: 1.
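The sorting and counting steps above can be sketched in Python. This is one possible approach, not the only one: it stores just the lower limit of each interval and uses the standard library's `bisect_right` so that a value equal to a limit is counted in the interval to its right. It assumes every value is at least as large as the first lower limit.

```python
from bisect import bisect_right

# Cow weights in kilos, as listed in the example.
weights_kg = [520, 635, 500, 725, 815, 700, 750, 615, 635, 590]

# Lower limit of each of the four intervals.
lower_limits = [500, 590, 680, 770]

frequencies = [0] * len(lower_limits)
for weight in sorted(weights_kg):
    if weight < lower_limits[0]:
        raise ValueError(f"{weight} is below the first interval")
    # bisect_right counts how many lower limits are <= weight, so subtracting 1
    # gives the interval index; a weight equal to a limit goes to the right.
    index = bisect_right(lower_limits, weight) - 1
    frequencies[index] += 1

print(frequencies)  # [2, 4, 3, 1]
```

The weight of 590 kilos sits exactly on a limit and is counted in the second interval, matching the rule above. Note that this sketch places anything at or above the last lower limit in the final interval without checking an upper limit.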
Step 4. Plot the histogram
You can build the histogram by hand with the data you have sorted, or you can use a program like Excel or another statistics program. If you want to do it by hand, just draw an X and Y axis, and set the scale for each. The X-axis shows the intervals you have chosen, and the Y-axis scale shows the frequency of the data. Draw a bar for each interval that goes up to its frequency. Color the bars and make sure they all touch each other.
- For the cow weight example, the X-axis ranges from 500 to 860 kilograms (1,100 to 1,900 pounds) in increments of 90 kilograms (200 pounds). The Y-axis scale ranges from 0 to 4 in increments of 1.
- The first interval, 500 to 590 kilos (1100 to 1300 pounds), has a frequency of 2, so draw a bar up to 2 and color it. Directly next to the first bar, draw a second bar for the second interval, which has a frequency of 4. The third bar will go up to 3 and the last up to 1.
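For a quick look without graph paper or a spreadsheet, the same bars can be drawn as plain text. The sketch below is only an illustration: the function name `text_histogram` is made up, each bar is labeled with the lower limit of its interval, and the frequencies are the ones counted in the cow weight example.

```python
def text_histogram(labels, frequencies, bar_width=9):
    """Return text rows that draw touching vertical bars, tallest row first."""
    rows = []
    # One row per frequency level, from the tallest bar down to 1.
    for level in range(max(frequencies), 0, -1):
        cells = [
            "#" * bar_width if frequency >= level else " " * bar_width
            for frequency in frequencies
        ]
        rows.append(f"{level:>2} |" + "".join(cells))
    # X axis line, then the label under the left edge of each bar.
    rows.append("   +" + "-" * (bar_width * len(frequencies)))
    rows.append("    " + "".join(label.ljust(bar_width) for label in labels))
    return rows


labels = ["500", "590", "680", "770"]  # lower limit of each interval, in kilos
frequencies = [2, 4, 3, 1]

rows = text_histogram(labels, frequencies)
print("Frequency")
print("\n".join(rows))
print("    Weight (kg)")
```

Because each bar is exactly `bar_width` characters wide and the cells are joined with no separator, the bars touch each other, as they should in a histogram.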
Step 5. Label both axes
No graph is complete without labeled axes. Make the labels big and bold so they stand out, and make sure they accurately represent the data presented. The Y-axis is labeled with the frequency, while the label for the X-axis depends on the type of data collected, including its unit.