Vision, in general, involves a lot of unconscious pattern recognition. If we can harness that power, we can show people data in a way that doesn't require a lot of thought for them to engage with the data.
Life Magazine, 19 Feb 1965
Graphics are a form of external cognition that allows us to think about the data rather than the chart.
Good graphics take advantage of how the brain works
preattentive processing
perceptual grouping
awareness of visual limitations
In good graphics, the
all work together to pass information to the brain via the visual system.
The structure and aesthetics used to create the chart should contribute to the understanding of what is being shown!
Of course, even decent visualizations can't compensate for lousy data... or rather, chart designers who aren't thinking about how to represent the data well.
But perfectly reasonable data can also be ruined by bad aesthetic choices.
As with anything, graphics require a combination of "art" and "science" - you not only have to use the best method to display the data (which this isn't, necessarily), you also have to use some judgment as to how to show what you're hoping to show... and this is a good example of what happens when that doesn't happen.
Occurs automatically (no effort)
Color, shape, angle
Combinations of preattentive features require attention
Using preattentive features reduces the amount of work your viewer has to expend to understand your chart
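To make the difference concrete, here is a minimal sketch that builds two search displays: one where the target differs by color alone, and one where it is defined only by a combination of color and shape. The data frame `pts` and its columns are made up for illustration; nothing here comes from the original slides.

```r
library(ggplot2)

set.seed(1)
n <- 30
# Hypothetical point positions, purely for illustration
pts <- data.frame(x = runif(n), y = runif(n))

# Display 1: the target (row 1) differs from every distractor in color only
pts$single_hue <- ifelse(seq_len(n) == 1, "red", "blue")
p_single <- ggplot(pts, aes(x, y, colour = single_hue)) +
  geom_point(size = 4) +
  scale_colour_identity()

# Display 2: distractors are red squares and blue circles;
# the target is the only red circle, so neither color nor shape alone finds it
pts$hue <- rep(c("red", "blue"), length.out = n)
pts$form <- ifelse(pts$hue == "red", "square", "circle")
pts$form[1] <- "circle"
p_conjunction <- ggplot(pts, aes(x, y, colour = hue, shape = form)) +
  geom_point(size = 4) +
  scale_colour_identity() +
  scale_shape_manual(values = c(circle = 16, square = 15))
```

Printing `p_single` and then `p_conjunction` should make the contrast obvious: the red point in the first display pops out, while the red circle in the second has to be searched for.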
What do you see here? 3 pac-men shapes and 3 acute angles? No?
I see 3 circles, a triangle with a black outline, and a white triangle with no outline. But... that's not really what's there, is it?
I'll talk next about the Gestalt laws, but if you can't remember them, just remember this saying - "The whole is greater than the sum of the parts" - just as here, what we see is more orderly than what is actually there.
The Gestalt laws are a set of rules for how we interpret ambiguity in the visual scene.
The law of Closure says that it's easier to interpret things if you imagine them as a closed figure - it's more likely that a closed figure is, for instance, obstructed than that it is a set of more complex, less meaningful figures. This is sometimes also stated as the "law of good figure".
The law of Proximity says that things that are close together are likely part of the same unit. So you might interpret things as a Dalmatian instead of a series of blobs of black ink.
The law of continuation says that figures with edges that are smooth are more likely to be continuous than things with edges that are sharp angles.
The law of similarity says that things are likely to be viewed as part of a group if they look similar.
Then, the law of figure/ground helps explain why we see both the tree and the AL figure combination here - we have contextual information that helps us simplify the picture into two groups - the figure (the tree), and the background (the AL); this also helps us separate the AL from the white background behind it.
There are a few other gestalt laws, but these are the main ones.
Now, let's talk about how these laws apply to charts! I swear, I didn't forget that I am supposed to be talking about data visualization.
Proximity
Similarity
Add annotations to your plots based on what you want to emphasize. If you want to show the trend (or deviations from it), add a line and maybe a confidence band. If you want to show clustering, use ellipses and color and/or shape.
What you add to the plot helps to determine what people will see in the data!
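As a rough sketch of the two annotation strategies, the code below draws the same scatterplot twice: once with a fitted line and confidence band to emphasize the trend, and once with color, shape, and ellipses to emphasize clusters. The built-in `mtcars` data is only a stand-in chosen for this example; it is not used in the original slides.

```r
library(ggplot2)

# Emphasize the trend: linear fit plus a 95% confidence band
p_trend <- ggplot(mtcars, aes(x = wt, y = mpg)) +
  geom_point() +
  geom_smooth(method = "lm", formula = y ~ x, se = TRUE)

# Emphasize clustering: color and shape by group, plus one ellipse per group
p_cluster <- ggplot(mtcars, aes(x = wt, y = mpg, colour = factor(cyl))) +
  geom_point(aes(shape = factor(cyl))) +
  stat_ellipse()
```

The points are identical in both plots; only the added layers change, and with them what the viewer is likely to notice first.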
Not all graphical representations are equally accurate
Optical illusions
Designing plots for disabilities
Color choices
When you design a visualization, try to make the most important variables represented by dimensions that are accurate.
In some cases, we only care about relative accuracy - for those, things like color saturation are fine for encoding information.
You may have heard people talk about how awful pie charts are - that's because anything that can be put into a pie chart can also be put into a bar chart, which will be read more accurately.
We're really bad at judging vertical distance, as well. If you need to show the difference between two curves, you should attempt to find a different way to do it than showing both curves on the same chart - for instance, plot the difference alongside the two curves.
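One way to do that, sketched here with made-up curves: put the two curves in one panel and their difference in a second panel. The curves `exp(x)` and `exp(x) + 2` are hypothetical and chosen because their vertical gap is constant, even though it tends to look smaller where the curves are steep.

```r
library(ggplot2)

# Hypothetical curves: B sits exactly 2 units above A everywhere
x <- seq(0, 3, length.out = 100)
y_a <- exp(x)
y_b <- exp(x) + 2

# Long format: both curves in one panel, their difference in another
plot_data <- rbind(
  data.frame(x = x, value = y_a, series = "A", panel = "Curves"),
  data.frame(x = x, value = y_b, series = "B", panel = "Curves"),
  data.frame(x = x, value = y_b - y_a, series = "B - A", panel = "Difference")
)

p_diff <- ggplot(plot_data, aes(x = x, y = value, colour = series)) +
  geom_line() +
  facet_wrap(~ panel, scales = "free_y")
```

In the "Difference" panel the gap is a flat line at 2, which is hard to read off the "Curves" panel by eye.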
Low visual acuity:
Colorblindness:
R packages for accessibility
Unfortunately, there is relatively little research on other disabilities + statistical graphics
Hue: shade of color (red, orange, yellow...)
Intensity: amount of color
Both hue and intensity are preattentive. Bigger contrast corresponds to faster detection.
Use color to your advantage
When choosing color schemes, we will want mappings from data to color that are not just numerically but also perceptually uniform
Distinguish between sequential scales and categorical scales
Color is context-sensitive: A and B are the same intensity and hue, but appear to be different.
Which is bigger?
ggplot2: Scales
Aesthetics: features such as color, shape, and size that map other characteristics to structural features
Scales map data values to the visual values of an aesthetic
Qualitative schemes: no more than 7 colors
Can use colorRampPalette() (from the grDevices package that ships with base R) to produce larger palettes by interpolating existing ones, such as those from the RColorBrewer package
Quantitative schemes: use color gradient with only one hue for positive values
Quantitative schemes: use color gradient with two hues for positive and negative values. Gradient should go through a light, neutral color (white)
Small objects or thin lines need more contrast than larger areas
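A small sketch of the interpolation idea: take an existing RColorBrewer palette and stretch it to more colors than it ships with. The choice of the "Blues" palette and of 15 colors is arbitrary, and the variable names are only for illustration.

```r
library(RColorBrewer)

# "Blues" offers at most 9 colors
base_cols <- brewer.pal(9, "Blues")

# colorRampPalette() returns a function that interpolates between the colors
make_blues <- colorRampPalette(base_cols)

# Ask that function for as many colors as needed
blues_15 <- make_blues(15)
```

Interpolation makes most sense for sequential palettes like this one; stretching a qualitative palette mostly produces colors that are hard to tell apart.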
scale_colour_discrete
scale_colour_brewer(palette = ...)
scale_colour_gradient (define low, high values)
scale_colour_gradient2 (define low, mid, and high values)
scale_fill_...
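For instance, the scale functions might be used roughly like this. The toy data frame `df`, the color names, and the "Dark2" palette are assumptions made for this sketch, and `mtcars` is again just a stand-in dataset.

```r
library(ggplot2)

# Toy data: values running from -5 to 5
df <- data.frame(x = 1:11, y = 1, z = seq(-5, 5))

# Sequential scale: a single hue for non-negative values
p_seq <- ggplot(df, aes(x, y, fill = abs(z))) +
  geom_tile() +
  scale_fill_gradient(low = "white", high = "darkblue")

# Diverging scale: two hues that meet at a light, neutral midpoint
p_div <- ggplot(df, aes(x, y, fill = z)) +
  geom_tile() +
  scale_fill_gradient2(low = "blue", mid = "white", high = "red", midpoint = 0)

# Categorical scale: a qualitative ColorBrewer palette for a factor
p_cat <- ggplot(mtcars, aes(wt, mpg, colour = factor(cyl))) +
  geom_point() +
  scale_colour_brewer(palette = "Dark2")
```

The `fill` versions are used for the tiles and the `colour` version for the points, since tiles are filled areas while points are drawn with the colour aesthetic.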
There are packages (ggsci, viridis, wesanderson, RColorBrewer) that have color schemes for any occasion.
data(diamonds)
In the diamonds data, clarity and cut are ordinal, while price and carat are continuous
Find a graphic that gives an overview of these four variables while respecting their types
Maps in ggplot2
General references