DI Fundamentals
As mentioned earlier, DI is not primarily about the mastery of esoteric
math concepts; instead, it’s kind of like math’s version of Reading
Comprehension. Familiarity with the question type is crucial for DI success. The
following fundamentals will acclimate you to the kinds of diagrams and questions
you’re likely to face and the approaches that will take you through them:
- Eyeballing
- Simple Arithmetic
- Percentage Questions
- Multistep Questions
- Standard Data Analysis
- Not Enough Information
Eyeballing
The most basic DI questions test nothing more than whether you can
track down and properly interpret relevant data to answer a question. These
don’t require any actual calculations; they require you to
eyeball, or scan the diagrams, for the data you need. There
are two main kinds of questions that benefit from eyeballing. Let’s look at
both.
Single Eyeballing
The most basic question of this type asks you to scan a diagram to
locate a single piece of information. Even if the DI set comes with more
than one diagram, only one will be in play for this type of question. We
return to our X-ray example for a question of this type:
In which of the following years was the number of new homes built with
a wood exterior less than the number of new homes built with a siding
exterior?
(A) 1940
(B) 1950
(C) 1960
(D) 1970
(E) 1980
As you’ll see later in our discussion of the DI step method, the
first step is to get a handle on the data presented, so let’s take a
quick look at the graph before eyeballing the answer.
This graph depicts three types of exterior surface materials used
for new houses over the course of six decades. The graph is a
“multipart” bar graph in that each bar contains data concerning three
different categories—in this case, siding, brick, and wood. When asked
to estimate the number of new homes built with siding in 1970, some test
takers might say a little less than 70,000, since that’s where the top
of the siding bar reaches in that year. However, a little less than
70,000 is the total number of new homes built with all
three surfaces in 1970—not the number of homes built with siding. So for
multipart bar graphs, make sure you estimate the difference
between the top and bottom of each segment to obtain the
correct values.
Notice the caveat at the bottom stating that all the homes
represented in the data were built with a single material. That’s simply
to ward off any complications regarding double or triple surfaces. If a
house falls into the siding category, for example, it can’t also be
counted in the brick or wood categories. Finally, notice that the
y-axis values for the data are all given in 1,000s.
This means that eight units on the chart (for example, corresponding to
the approximate number of new homes built with siding in 1970) means
8,000 houses, not 8. Once you’ve become accustomed to multipart bar
graphs, analyzing the diagram should take only a few moments.
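The segment-reading rule above can be sketched in a few lines of Python. This is only an illustration: the axis readings (68 and 60) are assumed values chosen to match the "about eight units" siding estimate for 1970, not figures taken from the graph itself, and `segment_value` is a hypothetical helper name.

```python
def segment_value(top, bottom, units_per_tick=1000):
    """Return the count represented by one segment of a multipart bar.

    top and bottom are the y-axis readings at the segment's upper and
    lower edges; units_per_tick reflects the "in 1,000s" axis label.
    """
    return (top - bottom) * units_per_tick


# Assumed readings for the 1970 siding segment (illustrative only)
siding_top = 68
siding_bottom = 60

siding_homes = segment_value(siding_top, siding_bottom)
print(siding_homes)  # 8000
```

Reading only the top of the segment (68) would give the running total for the whole bar, which is exactly the mistake described above.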
Now that we have a good handle on the data, we can eyeball it to
answer the question. We’re asked to compare the number of new homes with
wood exteriors to the number of new homes with siding. According to the
boxed labels on the side, the graph represents wood in gray and siding
in white, which means we need to find the bar in which the white section
is bigger than the gray. Eyeballing the graph immediately tells us that
1970, 1980, and 1990 are out, since the gray portions of these bars
dwarf the white portions. The year 1990 isn’t even a choice, but we can
at least chop D and E.
Of the others, it’s probably easier to see that the siding bar is
bigger than the wood bar in 1940 than it is to compare the two surface
materials in 1950 and 1960, so you may have been comfortable at this
point choosing A and moving on. If you wanted to make sure,
you’d have to take a closer look at those other two years, and let’s do
just that to get some more practice eyeballing this kind of graph. The
wood bar in 1950 reaches through two full boxes and almost to a third,
while the siding bar in that year covers a little more than one box. So
wood trumps siding in 1950, showing B to be incorrect.
Similarly, eyeballing the 1960 bar shows that wood clocks in at over
30,000, while siding comes in under 30,000, so C is out,
confirming A as the correct choice.
Double Eyeballing
The test makers may ratchet up the difficulty level by asking you
to eyeball multiple scenarios, as in the following question:
Price per Square Foot of Exterior Surface Materials from 1940 to 1990
|        | 1940   | 1950   | 1960   | 1970   | 1980   | 1990   |
| Siding | $4.59  | $4.79  | $5.79  | $6.29  | $8.99  | $8.99  |
| Brick  | $12.00 | $12.00 | $15.00 | $16.50 | $22.50 | $34.00 |
| Wood   | $9.00  | $11.50 | $11.00 | $13.00 | $15.00 | $18.00 |
Which of the following is true of the year in which the price per
square foot of one of the exterior building materials decreased from
that of the previous decade?
(A) No new brick homes were built.
(B) The number of new siding homes built outnumbered the number of new
wood homes built.
(C) The number of new wood homes built was approximately double the
number of new brick homes built.
(D) The number of new brick homes built outnumbered the number of new
siding homes built.
(E) More than 90,000 new homes were built with siding.
Notice now that another diagram—a table—has been added to the mix.
(We’re introducing it here for the sake of instruction. On the actual
test, all diagrams included in a DI question set will appear on the
screen from the get-go.) The question concerns prices, but the choices
concern number of homes built, which means we’ll need to look in two
places to get our answer. First we should eyeball the table, looking for
a price decrease from one decade to the next. Indeed, this happens only
in one place: The price per square foot of wood decreased in 1960 from
its 1950 level. So our first eyeballing venture yields 1960 as the
target year. Now we need to eyeball the 1960 bar of the new homes graph
to see which choice accords with that bar. Let’s test the choices in
order, with our eyeballing skills at the ready:
A: No, there were brick homes built in 1960. This choice
seems to refer to 1990.
B: As we saw in the previous question, wood homes
outnumbered siding homes in 1960, so B is out.
C: At a glance, the wood and brick segments look nearly
equal in 1960, so this “approximately double” business is off the mark.
D has it right: The 1960 black brick bar (try that ten
times fast) spans two full boxes plus most of two other boxes, while the
white bar in that year covers just a little more than two full boxes.
Eyeballing shows that the brick section of the 1960 bar is bigger than
the siding section of that bar, so D is correct for this
double eyeballing challenge.
E: Nuh-uh! This one’s written as a trap to tempt people who
confuse the total number of homes with the individual segment numbers.
Total homes built top 90,000 in 1960, but
siding homes account for only a little more than
20,000 of those.
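The first eyeballing pass, scanning the price table for a drop from one decade to the next, can be written out as a short Python sketch. The prices come straight from the table; the function name `find_decreases` is just an illustrative choice.

```python
years = [1940, 1950, 1960, 1970, 1980, 1990]

# Price per square foot, copied from the table
prices = {
    "Siding": [4.59, 4.79, 5.79, 6.29, 8.99, 8.99],
    "Brick": [12.00, 12.00, 15.00, 16.50, 22.50, 34.00],
    "Wood": [9.00, 11.50, 11.00, 13.00, 15.00, 18.00],
}


def find_decreases(prices, years):
    """Return (material, year) pairs where the price fell from the prior decade."""
    decreases = []
    for material, row in prices.items():
        for i in range(1, len(row)):
            if row[i] < row[i - 1]:
                decreases.append((material, years[i]))
    return decreases


decreases = find_decreases(prices, years)
print(decreases)  # [('Wood', 1960)]
```

Only one entry comes back, which is why 1960 is the single target year for the second pass through the bar graph.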
Simple Arithmetic
Eyeballing questions involve reading the answers right off the
diagrams. The next step up in difficulty is questions requiring you to
do something with the values you eyeball. In simple
arithmetic questions, you need to find the relevant data and then perform
some basic calculations. Sometimes you’ll be looking for a precise answer;
other times, an approximation. Let’s look at an example of each.
Precise Calculations
If a question looking for a numerical answer doesn’t include the
word approximate or approximately,
then that answer must be exact. That means that you’ll need to read
precise figures off of a graph or table provided and perform some simple
math based on those figures. Consider the following, based on the
materials pricing table from the previous question:
Price per Square Foot of Exterior Surface Materials from 1940 to 1990
|        | 1940   | 1950   | 1960   | 1970   | 1980   | 1990   |
| Siding | $4.59  | $4.79  | $5.79  | $6.29  | $8.99  | $8.99  |
| Brick  | $12.00 | $12.00 | $15.00 | $16.50 | $22.50 | $34.00 |
| Wood   | $9.00  | $11.50 | $11.00 | $13.00 | $15.00 | $18.00 |
The difference between the lowest price per square foot of brick and
the highest price per square foot of brick for the years cited is
I. greater than the combined price per square foot of all three
building materials in 1950
II. one dollar more than the combined price per square foot of brick
and wood in 1940
III. two dollars less than the combined 1960 and 1970 prices per square
foot of wood
(A) I only
(B) II only
(C) III only
(D) I and II
(E) II and III
The question and choices may sound confusing, and the Roman
numeral format may seem a bit odd as well, but all the question requires
is that you track down the right information and then do a bit of adding
and subtracting. First let’s work with the information in the question.
The lowest price per square foot of brick is $12, both in 1940 and 1950.
The highest price per square foot of brick is $34 in 1990. The
difference is therefore 34 – 12 = 22. Now all we have to do is check the
three Roman numeral statements to see which ones accord with this value
of 22.
Even leaving out the cents, the combined price per square foot of
all three building materials in 1950 is 4 + 12 + 11 = 27. Our difference
of 22 is less than this amount, not greater, so statement I is
incorrect. The combined price per
square foot of brick and wood in 1940 is 12 + 9 = 21, and the 22 figure
we calculated is in fact one dollar more than this amount, so II
provides an accurate completion of the question. The combined 1960 and
1970 prices per square foot of wood is 11 + 13 = 24. Our 22 figure is
certainly two dollars less than this amount, so III works also.
Therefore E is correct.
It may seem involved at the outset, but all we really did was get
the numbers and do some very simple math.
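As a check on the three Roman numeral statements, here is a minimal Python sketch using the exact table prices. Working in whole cents is our own choice, made to sidestep floating-point rounding; the variable names are illustrative.

```python
# Table prices converted to cents
brick = [1200, 1200, 1500, 1650, 2250, 3400]  # 1940 through 1990
siding_1950, brick_1950, wood_1950 = 479, 1200, 1150
brick_1940, wood_1940 = 1200, 900
wood_1960, wood_1970 = 1100, 1300

# Highest brick price minus lowest brick price: $34.00 - $12.00
difference = max(brick) - min(brick)

# I. greater than the combined 1950 price of all three materials
statement_i = difference > siding_1950 + brick_1950 + wood_1950
# II. one dollar more than the combined 1940 brick and wood prices
statement_ii = difference == brick_1940 + wood_1940 + 100
# III. two dollars less than the combined 1960 and 1970 wood prices
statement_iii = difference == wood_1960 + wood_1970 - 200

print(statement_i, statement_ii, statement_iii)  # False True True
```

Statements II and III hold and statement I does not, which matches choice E.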
Approximating Values
If a question does include the word
approximate or approximately, then
estimate the relevant values via eyeballing, and then work through the
math with the values you get. Here’s an example:
Approximately how many times greater is the number of juniors who take
the bus to school X than the combined number of juniors who drive and
walk to school X?
(A)
(B)
(C)
(D) 24
(E) 50
First, approximate the figures: The bus bar clocks in at a tad
over 80, so go with 80 for now. The drive bar looks roughly equal to 10,
while the walk bar is a shade under 20. The amount that the bus bar is
over 80 roughly cancels out the amount the walk bar is below 20, so we
can simply go with these values as our approximations.
Now let’s do the math: If there are 80 bussers and 10 + 20 = 30
combined drivers and walkers, then we need to calculate how many times
greater 80 is than 30. To calculate how many times greater x is than y,
divide x by y: 80 ÷ 30 = 8 ÷ 3 = 2⅔ (about 2.67), choice B.
Notice that D and E are “left field”
choices that are far too big to fit the scenario here. The bus figure is
only roughly ten times the drive figure, so the bus figure can’t be more
than ten times the drive and walk figures combined. Moreover, they’re
traps to boot: You’d get 24 if you multiplied 8 by 3, and 50 is the
difference between 80 and 30, not the
number of times greater 80 is than
30.
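The "how many times greater" calculation, and the two traps just described, can be sketched in Python. The values 80, 10, and 20 are the eyeballed approximations from the explanation above, not exact readings.

```python
from fractions import Fraction

# Approximate bar heights eyeballed from the juniors bar graph
bus, drive, walk = 80, 10, 20

combined = drive + walk            # 30 drivers and walkers
ratio = Fraction(bus, combined)    # "times greater" means divide

print(ratio)                       # 8/3
print(round(float(ratio), 2))      # 2.67

# The two traps: multiplying the reduced terms, and subtracting
trap_product = 8 * 3               # 24
trap_difference = bus - combined   # 50
```

Using `Fraction` keeps the result in lowest terms, which makes it easy to compare against fractional answer choices.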
Percentage Questions
There are many ways that DI questions might test your understanding of
percentages, so if you’re shaky in this area, we advise you to go back to
chapter 2 to review the Math 101 percentages concepts before going any
further. As with simple arithmetic questions, sometimes the test makers are
looking for a precise answer and other times an approximation. A third kind
of common percentage problem involves percent increases and decreases, based
on the formulas you learned in Math 101. Let’s look at all three kinds.
Precise Calculations Using Percentages
Try the following:
How many more juniors and seniors combined take the bus to school X
than walk to school X?
(A) 15
(B) 30
(C) 36
(D) 60
(E) 72
The first thing to notice is that the data is presented in terms
of percentages, not raw numbers. Overlooking this fact would lead one to
simply calculate 40 – 10 = 30. Not surprisingly, the test makers have
included 30 as an enticement for those who make this mistake. We need to
calculate both figures, and then subtract.
If 40% of 240 juniors and seniors take the bus to school, then we
need to multiply 240 × .4 to get 96 juniors and seniors who bus it to
school. You may have had to use your scratch paper for this calculation,
but hey, that’s what it’s for. The walking figure could be done in your
head, since taking 10% of any number means moving the decimal one
place to the left. The number of walkers therefore equals 10%
of 240, or 24. 96 – 24 = 72, choice E.
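Here is the same percent-to-count conversion as a short Python sketch. The 240 total and the 40% and 10% shares come from the explanation above; `count_from_percent` is a hypothetical helper name.

```python
total_students = 240  # juniors and seniors combined


def count_from_percent(percent, total):
    """Convert a whole-number percentage of total into a head count.

    Integer division is exact here because 240 times each percentage
    is a multiple of 100.
    """
    return total * percent // 100


bus_riders = count_from_percent(40, total_students)  # 96
walkers = count_from_percent(10, total_students)     # 24

print(bus_riders - walkers)  # 72
print(40 - 10)               # 30, the trap of subtracting raw percentages
```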
Approximating Percentages
Again, if the word approximate or
approximately shows up in a question, then you
shouldn’t expect the figures to be particularly tidy or the calculations
to be simple. In such cases, use your powers of approximation to get
into the ballpark, as we advised in our introduction to GRE math in
chapter 1. Try it out in the following question. Hint: You’ll need to do
a precise percentage calculation before approximating for the final
answer.
If a total of 16 juniors and seniors who currently bus to school X
begin driving to school X, what would be the approximate percentage of
juniors and seniors who drive to school?
(A) 19%
(B) 26%
(C) 32%
(D) 37%
(E) 41%
We know the number of additional drivers (16), but we need to add
that to the number of current drivers before we can approximate the new
percentage of drivers overall. Here’s where a precise calculation comes
in: The number of current drivers equals 25% of 240, or 60. Combining
this with the 16 new drivers, we now have 76 junior and senior drivers
out of 240 total juniors and seniors. The new percentage of drivers is
therefore (76 ÷ 240) × 100%. The number 76 is fairly awkward, but 80,
which is not too far from 76, works better: 80 ÷ 240 reduces to 1 ÷ 3,
which is roughly equal to 33%. Since we rounded up from 76, 76 out of
240 is a little less than 33%, so C is the closest approximation.
A is a
left-field choice, since the percentage of drivers can’t decrease if
more drivers are added and the total number stays the same, and
E is a trap that you’d get if you simply added 16 to
25.
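To confirm the approximation, a minimal Python sketch can compute the exact percentage and pick the nearest choice. The dictionary of choices mirrors the question; the variable names are ours.

```python
total_students = 240
current_drivers = total_students * 25 // 100   # 25% of 240 = 60
new_drivers = current_drivers + 16             # 76 after the switch

new_percent = new_drivers / total_students * 100
print(round(new_percent, 1))  # 31.7

# Answer choices from the question, in percent
choices = {"A": 19, "B": 26, "C": 32, "D": 37, "E": 41}
closest = min(choices, key=lambda k: abs(choices[k] - new_percent))
print(closest)  # C
```

On the test itself you would not compute 76 ÷ 240 exactly; the sketch only shows that the rounded 80 ÷ 240 estimate lands on the same choice.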
Percent Increase and Decrease
When a value changes, it’s possible to calculate the percentage
that that value goes up or down, whichever the case may be. You may be
tested on this concept in any of the three math question types, but the
GRE test makers particularly enjoy utilizing this concept in Data
Interpretation. In all cases, the formulas remain the same, and we’ll
repeat them here for your convenience:
percent increase = difference between the two numbers ÷ smaller of
the two numbers × 100%
percent decrease = difference between the two numbers ÷ greater of
the two numbers × 100%
Try the following question to see how this concept plays out in
the context of DI.
Price per Square Foot of Exterior Surface Materials from 1940 to 1990
|        | 1940   | 1950   | 1960   | 1970   | 1980   | 1990   |
| Siding | $4.59  | $4.79  | $5.79  | $6.29  | $8.99  | $8.99  |
| Brick  | $12.00 | $12.00 | $15.00 | $16.50 | $22.50 | $34.00 |
| Wood   | $9.00  | $11.50 | $11.00 | $13.00 | $15.00 | $18.00 |
The price per square foot of wood in 1990 represents what percentage
increase compared to the price per square foot of wood in 1980?
(A) 3%
(B) 17%
(C) 18%
(D) 20%
(E) 33%
The question isolates wood as the featured surface material, and
tracking down the relevant figures we see that the price of wood
increased from $15 in 1980 to $18 in 1990. Plugging these values into
our handy percent increase formula yields:
(18 – 15) ÷ 15 × 100% = 3 ÷ 15 × 100% = 20%.
So the answer is a 20% increase, choice D. Notice the
distractors included among the choices:
A is the total difference per square foot between the
dollar amounts, 3, not the percent increase in price
per square foot from 1980 to 1990.
B, 17%, is approximately what you’d get if you mixed
up the formula and divided the difference by the greater amount (18)
instead of the smaller amount (15).
C repeats a number from the problem (18).
E (33) is what you’d get if for some reason you added
the 15 and 18 figures together. As an extra enticement, 33% is a fairly
common percentage (roughly 1/3), which may have caught your eye for
that reason as well.
But if you isolated the correct information from the table,
plugged it into the correct formula, and did the math correctly, no
distractor would deter you from D.
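The two percent-change formulas can be sketched as small Python functions. Both divide by the starting value, which is the smaller number for an increase and the greater number for a decrease; the function names are our own.

```python
def percent_increase(old, new):
    """Percent increase from old to new (old is the smaller value)."""
    return (new - old) * 100 / old


def percent_decrease(old, new):
    """Percent decrease from old to new (old is the greater value)."""
    return (old - new) * 100 / old


# Wood rose from $15 in 1980 to $18 in 1990
print(percent_increase(15, 18))  # 20.0

# Distractor B: dividing the difference by the greater amount instead
print(round((18 - 15) * 100 / 18))  # 17
```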
Multistep Questions
So far the questions you’ve seen aren’t overly complex, but if you’re
doing well on the section, the CAT software program may see fit to throw
more difficult DI questions your way. One way to toughen these up is to
require you to perform multiple steps to get the answer. You’ve already seen
some basic examples. For instance, a double eyeballing question requires you
to bounce from one part of a diagram to another or from one diagram to a
different one altogether. Still, your main task in those is to just find the
relevant information. Harder questions may involve bouncing between the
diagrams, multiple calculations, and occasionally even some math reasoning.
Let’s look at a difficult example of this type. See what you can make of
this one:
Approximately what percentage of seniors drives to school X?
(A) 10%
(B) 20%
(C) 25%
(D) 40%
(E) 55%
Doesn’t seem so tough, but the problem is that the circle graph
doesn’t differentiate between juniors and seniors. Fortunately we have the
bar chart depicting the travel arrangements of the juniors, which makes it
possible to bounce between the diagrams to arrive at a solution.
Our general percentage formula is percent of a specific occurrence =
the number of specific occurrences ÷ the total number × 100%. The
percent of seniors that drives to school is therefore equal to the
number of seniors who drive to school ÷ the total number of seniors ×
100%.
We don’t have either of these numbers yet, but we can get them. By
adding together the bars on the bar graph, we can calculate the number of
juniors as approximately 80 + 40 + 20 + 10 = 150. The circle graph tells us
that there are 240 total juniors and seniors, so if we subtract the 150
juniors, we’re left with 240 – 150 = 90 seniors total (approximately). That
gives us our denominator. From the circle graph we can calculate the total
number of juniors and seniors who drive as 240 × 0.25 = 60. Bouncing back
to our bar graph, we see that roughly 10 juniors drive, which means that
60 – 10 = 50 seniors drive (approximately). Now we can finally calculate
the percentage of seniors who drive as approximately (50 ÷ 90) × 100% ≈
56%. That’s more than half, leaving only E as a possibility.
What makes this multistep question hard is that we need to employ a
bit of math logic to extract the numbers we need from the data, as well as
bounce around quite a bit on our way to our final approximation. This is the
kind of maneuvering you should expect to see on the more difficult DI
questions.
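The chain of steps in this multistep solution can be laid out as a Python sketch. The bar heights are the approximations used above, and the 25% driving share is the circle graph figure cited in the earlier percentage question; everything else follows by subtraction.

```python
# Approximate heights of the four juniors bars, as eyeballed above
junior_bars = [80, 40, 20, 10]
junior_drivers = 10

total_students = 240    # juniors and seniors, from the circle graph
drive_share = 25        # percent of all students who drive

juniors = sum(junior_bars)                           # 150
seniors = total_students - juniors                   # 90
all_drivers = total_students * drive_share // 100    # 60
senior_drivers = all_drivers - junior_drivers        # 50

senior_driver_percent = senior_drivers * 100 / seniors
print(round(senior_driver_percent))  # 56
```

Each line depends on the one before it, so a misread bar early on carries through to the final percentage. That is a good reason to double-check the bar readings first.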
Standard Data Analysis
We began the chapter by mentioning some math topics you’ll probably
never see in a DI set, such as the Pythagorean theorem. Now we’ll tell you
about some topics you very well might see. Since concepts such as average,
mean, median, and mode all deal with various ways to analyze data, they’re
all fair game in DI. You’ve gotten practice with some of these concepts in
the previous chapters, and will see others in the practice test at the end
of the book. But let’s now see how another data analysis topic—frequency
distribution—might appear as the basis of a DI question. Try your hand at
the following:
| Size of Litter (number of kittens) | Number of Litters |
| 3                                  | 9                 |
| 4                                  | 5                 |
| 5                                  | 16                |
| 6                                  | 11                |
| 7                                  | 26                |
The table shows the frequency distribution of litter size for feline
litters in a certain study.
What is the average (arithmetic mean) of the number of kittens in all
litters in the study containing fewer than five kittens or more than
six kittens?
(A) 3.275
(B) 4.125
(C) 5.725
(D) 6.125
(E) It cannot be determined from the information given.
First off, we’ll need to pull the frequency distribution concept from
our reservoir of Math 101 knowledge. A frequency distribution lists a
set of values and the frequency of occurrence for each value. In
this example, the first row tells us that nine litters contained three
kittens each. The second row tells us that five litters contained four
kittens each, and so on. We’re asked for the average of specific litters, so
be careful: This “fewer than five, more than six” business is simply another
obstacle set in your path to deter you from finding the data you need. If
you read carefully, however, you’ll see that you need to average the litters
of three and four kittens (“fewer than five”) with the litters of seven
kittens (“more than six”). That means we need to work with the first,
second, and last rows of the table.
So how do we average these? We need to use the standard average
formula: Average equals the sum of the terms divided by the number of
terms. In this case, the sum of the terms means
the total number of kittens in the three litter size categories we singled
out. The number of terms is the number of litters
contributing to this total number of kittens. First let’s figure out the
number of kittens: Nine litters of three kittens is simply 9 × 3 = 27
kittens. Doing the same for the four- and seven-kitten litters gives us 5 ×
4 = 20 kittens and 26 × 7 = 182 kittens. The total number of kittens is
therefore 27 + 20 + 182 = 229. That’s a lot of kitties!
Now we have to divide this total number of kittens by the number of
litters producing these kittens to get our average. There are 9 + 5 + 26 =
40 litters in the sample we’re considering, so the final average litter size
comes to 229 ÷ 40 = 5.725, choice C.
Now surely no one would want .725 of a kitten, but averages are often
expressed this specifically and do provide useful information (think of the
common average of 2.3 children per American household). Here we know the
litter average is between five and six kittens, closer to six.
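The weighted average from the frequency table can be sketched in Python as follows. The dictionary maps litter size to number of litters, exactly as in the table; the filter keeps litters with fewer than five or more than six kittens.

```python
# Litter size -> number of litters, from the frequency table
litters = {3: 9, 4: 5, 5: 16, 6: 11, 7: 26}

# Keep only litters with fewer than five or more than six kittens
selected = {size: n for size, n in litters.items() if size < 5 or size > 6}

total_kittens = sum(size * n for size, n in selected.items())  # 229
total_litters = sum(selected.values())                         # 40

average = total_kittens / total_litters
print(average)  # 5.725
```

Note that the divisor is the number of litters (40), not the number of rows selected (3). Averaging 3, 4, and 7 directly would ignore how often each litter size occurs.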
So, what’s the deal with funky choice E? Glad you asked.
That’s the very topic of the next section.
Not Enough Information
Have you ever heard someone in a group conversation discuss an overly
personal situation, and someone else mutter the phrase “too much
information . . .”? Our next form of DI question has the opposite problem:
not enough information. In some DI questions, the data presented is
insufficient to allow you to calculate an answer. When that’s the case,
choice E will state that there’s not enough information to
answer the question. That doesn’t mean that every time you see a choice like
this, it will be correct—the previous question is a case in point. But it
does mean that when a question contains a “cannot be determined” choice, you
should be alert to the possibility that you may not be able to solve the
problem. Some test takers, ignoring the choices, spend many minutes
agonizing over a question, only to find to their dismay that there’s no way
to solve it and all along a choice said just that. So if you can’t see a way
into a problem and don’t think you have the data you need, check the choices
to see if “not enough information” is an option.
Let’s look at an example. If you’ve been paying attention, you’ll know
the answer to the following question is E. But for practice,
think about why you don’t have enough information to solve it.
How many more seniors take the bus to school X than drive to school X?
(A) 8
(B) 15
(C) 32
(D) 57
(E) It cannot be determined from the information given.
Okay, so the answer is E—not a real shocker, since we
told you that above. But did you figure out why? The
percentages in the graph represent the combined junior and senior
populations; there’s no way to figure out from this circle graph alone the
breakdown of juniors and seniors within each category. We therefore can’t
answer the question since we don’t know the actual number of seniors that
take the bus or drive to school. If we had the bar chart from earlier
listing the number of juniors in each category, that would be a different
story. But given only the circle chart to work with, no-go. Be aware of the
fact that some DI questions test not whether you can find the right
information to answer a question but whether you recognize that it doesn’t
exist.
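One way to convince yourself that the circle graph alone is not enough is to try two different junior breakdowns that both fit its totals. In the Python sketch below, the 40% and 25% shares are the circle graph figures used earlier, while the junior counts passed in are purely hypothetical values invented for illustration.

```python
total_students = 240
bus_total = total_students * 40 // 100     # 96 students take the bus
drive_total = total_students * 25 // 100   # 60 students drive


def senior_bus_minus_drive(junior_bus, junior_drive):
    """Seniors who bus minus seniors who drive, for an assumed junior split."""
    senior_bus = bus_total - junior_bus
    senior_drive = drive_total - junior_drive
    return senior_bus - senior_drive


# Two hypothetical junior splits, both consistent with the circle graph
print(senior_bus_minus_drive(60, 40))  # 16
print(senior_bus_minus_drive(50, 10))  # -4
```

Because different junior splits give different answers, no single value can be pinned down from the combined percentages, which is what "cannot be determined" means here.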