
Graphical Insights from Data:
Perception

Heike Hofmann

1 / 39

Introduction

2 / 39

What do you see?

4 / 39

Vision, in general, involves a lot of unconscious pattern recognition. If we can harness that power, we can show people data in a way that doesn't require a lot of thought for them to engage with the data.

It's not just an illusion - it's a photo

1965 Life magazine cover showing the Dalmatian illusion

Life Magazine, 19 Feb 1965

5 / 39

Why Graphics Matter

Graphics are a form of external cognition that allow us to think about the data rather than the chart

Good graphics take advantage of how the brain works

  • preattentive processing

  • perceptual grouping

  • awareness of visual limitations

6 / 39

Good Graphics

In good graphics, the

  1. graph form
  2. data (and structure)
  3. aesthetics

all work together to pass information to the brain via the visual system.

The structure and aesthetics used to create the chart should contribute to the understanding of what is being shown!

7 / 39

Bad Graphics

8 / 39

Of course, even decent visualizations can't compensate for lousy data... or rather, chart designers who aren't thinking about how to represent the data well.

But perfectly reasonable data can also be ruined by bad aesthetic choices.

As with anything, graphics require a combination of "art" and "science": you have to choose the best method to display the data (which this chart doesn't, necessarily), and you also have to use some judgment about how to show what you're hoping to show. This is a good example of what happens when that judgment is missing.

Spot the Difference

10 / 39

Spot the Difference

11 / 39

Preattentive perception

  • Occurs automatically (no effort)

  • Color, shape, angle

  • Combinations of preattentive features require attention

    • Unless you double-encode
      (use different features for the same variable)

Using preattentive features reduces the amount of work your viewer has to expend to understand your chart
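
A minimal sketch of double encoding in ggplot2, assuming the built-in iris data as a stand-in (it is not part of the original slides). The same variable is mapped to both colour and shape, so either feature alone is enough to pick out a group:

```r
library(ggplot2)

# Double encoding: Species drives BOTH colour and shape,
# so the two preattentive features reinforce each other
# instead of competing for attention.
p_double <- ggplot(iris, aes(x = Sepal.Length, y = Petal.Length,
                             colour = Species, shape = Species)) +
  geom_point(size = 2)
```

Printing p_double draws the chart. Mapping colour and shape to two different variables instead would force the viewer to search for each combination.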

12 / 39

What do you see?

13 / 39

What do you see here? Three Pac-Man shapes and three acute angles? No?

I see 3 circles, a triangle with a black outline, and a white triangle with no outline. But... that's not really what's there, is it?

I'll talk next about the Gestalt laws, but if you can't remember them, just remember this saying - "The whole is greater than the sum of the parts" - just as here, what we see is more orderly than what is actually there.

Gestalt Principles

What sorts of relationships are inferred, and under what circumstances?

14 / 39

Gestalt Laws of Perception

15 / 39

The Gestalt laws are a set of rules for how we interpret ambiguity in the visual scene.

The law of closure says that it's easier to interpret things if you imagine them as a closed figure. It's more likely that a closed figure is, for instance, partly obstructed than that it is a set of more complex, less meaningful figures. This is sometimes also stated as the "law of good figure".

The law of proximity says that things that are close together are likely part of the same unit. So you might interpret things as a Dalmatian instead of a series of blobs of black ink.

The law of continuation says that figures with smooth edges are more likely to be seen as continuous than figures whose edges meet at sharp angles.

The law of similarity says that things are likely to be viewed as part of a group if they look similar.

Then, the law of figure/ground helps explain why we see both the tree and the AL figure combination here. We have contextual information that helps us simplify the picture into two groups: the figure (the tree) and the background (the AL). This also helps us separate the AL from the white background behind it.

There are a few other gestalt laws, but these are the main ones.

Now, let's talk about how these laws apply to charts! I swear, I didn't forget that I am supposed to be talking about data visualization.

Gestalt Laws in Data Visualization

  • Proximity

  • Similarity

16 / 39

Gestalt Laws in Data Visualization

17 / 39

Gestalt Laws in Data Visualization

  • Good continuation
18 / 39

Which one is different?

Lineup with trend lines

19 / 39

Which one is different?

Lineup with color lines

20 / 39

Plot Annotations Matter!

  • Plot 12: 59.1%
  • Plot 5: 9.1%
  • Other plots: 31.7%

  • Plot 12: 9.7%
  • Plot 5: 29.0%
  • Plot 18: 32.3%
  • Other plots: 29.0%
21 / 39

Add annotations to your plots based on what you want to emphasize. If you want to show the trend (or deviations from it), add a line and maybe a confidence band. If you want to show clustering, use ellipses and color and/or shape.

What you add to the plot helps to determine what people will see in the data!
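
As a rough illustration of the two annotation strategies, here is a sketch using the built-in mtcars data (an assumption; the lineup data from the slides is not available here). The same scatter plot is annotated once for trend and once for clusters:

```r
library(ggplot2)

# Emphasise the trend: fitted line plus a confidence band.
p_trend <- ggplot(mtcars, aes(x = wt, y = mpg)) +
  geom_point() +
  geom_smooth(method = "lm", formula = y ~ x, se = TRUE)

# Emphasise clusters: colour by group and draw an ellipse per group.
p_cluster <- ggplot(mtcars, aes(x = wt, y = mpg, colour = factor(cyl))) +
  geom_point() +
  stat_ellipse()
```

The points are identical in both plots; only the annotation changes which structure the eye picks up first.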

Visual Limitations

22 / 39

Visual Limitations

  • Not all graphical representations are equally accurate

  • Optical illusions

  • Designing plots for disabilities

  • Color choices

23 / 39

Accuracy of Graphical Judgements

  1. Position along a common scale (most accurate)
    • scatter plot
  2. Position along nonaligned scale
    • multiple scatter plots
  3. Length
    • bar chart
  4. Angle, Slope
    • pie chart
  5. Area
    • bubble chart
  6. Volume, Density, Color saturation
    • heatmap
  7. Color hue (least accurate)
24 / 39

When you design a visualization, map the most important variables to the encodings that are read most accurately: position first, then length, and so on down the list.

In some cases, we only care about relative accuracy - for those, things like color saturation are fine for encoding information.

You may have heard people talk about how awful pie charts are - that's because anything that can be put into a pie chart can also be put into a bar chart, which will be read more accurately.
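
To make the comparison concrete, the sketch below draws the same counts as a bar chart (length along a common scale) and as a pie chart (angle). It assumes the diamonds data shipped with ggplot2; the object names are illustrative only:

```r
library(ggplot2)

# One row per cut, with its frequency.
counts <- as.data.frame(table(cut = diamonds$cut))

# Bar chart: values are compared by position/length on a common scale.
p_bar <- ggplot(counts, aes(x = cut, y = Freq)) +
  geom_col()

# Pie chart: the same stacked bar in polar coordinates, read by angle.
p_pie <- ggplot(counts, aes(x = "", y = Freq, fill = cut)) +
  geom_col(width = 1) +
  coord_polar(theta = "y")
```

In ggplot2 a pie chart is just a stacked bar with coord_polar(), which makes it easy to switch between the two and compare how readable each is.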

Optical Illusions

25 / 39

We're really bad at judging vertical distance, as well. If you need to show the difference between two curves, you should attempt to find a different way to do it than showing both curves on the same chart - for instance, plot the difference alongside the two curves.
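
One way to sketch this advice, using two made-up curves (f and g are hypothetical, chosen so that their vertical difference is constant):

```r
library(ggplot2)

x <- seq(0, 3, length.out = 100)
wide <- data.frame(x = x, f = exp(x), g = exp(x) + 2)

# Both curves on one chart: the gap appears to shrink where the
# curves get steep, although the vertical difference never changes.
long <- rbind(
  data.frame(x = x, y = wide$f, curve = "f"),
  data.frame(x = x, y = wide$g, curve = "g")
)
p_curves <- ggplot(long, aes(x = x, y = y, colour = curve)) +
  geom_line()

# Plotting the difference directly shows it as a flat line.
wide$diff <- wide$g - wide$f
p_diff <- ggplot(wide, aes(x = x, y = diff)) +
  geom_line()
```

Showing p_diff alongside p_curves lets the viewer read the difference as a position on a common scale instead of estimating it between two curves.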

Designing for Accessibility

  • Low visual acuity:

    • High contrast (bright/dark)
    • large font size
    • textures/patterns can be hard to make out
  • Colorblindness:

    • Safest: design for a black-and-white photocopier
    • Avoid rainbow gradients
    • If you need a 2-color gradient, use blue/purple - white - orange (safe for most types of colorblindness)
  • R packages for accessibility

    • ajrgodfrey/BrailleR - translate plots into text descriptions for screen readers
    • sonify - represent data using sound
    • gt - tables with metadata that is easy for screen readers
26 / 39

Unfortunately, there is relatively little research on how other disabilities interact with statistical graphics.

Color

  • Hue: shade of color (red, orange, yellow...)

  • Intensity: amount of color

  • Both intensity and hue are pre-attentive. Bigger contrast corresponds to faster detection.

  • Use color to your advantage

  • When choosing color schemes, we will want mappings from data to color that are not just numerically but also perceptually uniform

  • Distinguish between sequential scales and categorical scales
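
A small sketch of the last two points, assuming the faithfuld and mpg data sets that ship with ggplot2. The viridis scale is designed to be perceptually uniform; the Brewer "Dark2" palette is one possible categorical choice, not a recommendation from the slides:

```r
library(ggplot2)

# Sequential scale for a continuous variable:
# viridis is built to be perceptually uniform.
p_seq <- ggplot(faithfuld, aes(x = waiting, y = eruptions, fill = density)) +
  geom_raster() +
  scale_fill_viridis_c()

# Categorical scale for an unordered variable:
# distinct hues with no implied ordering.
p_cat <- ggplot(mpg, aes(x = displ, y = hwy, colour = class)) +
  geom_point() +
  scale_colour_brewer(palette = "Dark2")
```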

27 / 39

Color

Color is context-sensitive: A and B are the same intensity and hue, but appear to be different.

Edward Adelson’s checkershadow illusion

28 / 39

Ordering Variables

Which is bigger?

  • Position: higher is bigger (y), items to the right are bigger (x)
  • Size, Area
  • Color: not always ordered. More contrast = bigger.
  • Shape: Unordered.

29 / 39

Aesthetics in ggplot2: Scales

30 / 39

Aesthetics in ggplot2

Aesthetics: visual properties such as color, shape, and size to which variables in the data are mapped

Scales map data values to the visual values of an aesthetic

  • to change a mapping, add a new scale
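
For instance, a sketch with the mpg data from ggplot2 (the colour choices here are arbitrary): the plot first uses the default continuous colour scale, then the mapping from data values to colours is changed by adding a scale.

```r
library(ggplot2)

# Default scale: ggplot2 picks the colours for cty.
p_default <- ggplot(mpg, aes(x = displ, y = hwy, colour = cty)) +
  geom_point()

# Same aesthetic mapping, different scale: only the translation
# from data values to colours changes.
p_custom <- p_default +
  scale_colour_gradient(low = "grey80", high = "darkblue")
```

The aes() call is untouched; only the scale that translates cty into colours is replaced.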

31 / 39

Scales

32 / 39

Gradients

Qualitative schemes: no more than 7 colors

Can use colorRampPalette() (from grDevices, part of base R) to produce larger palettes by interpolating existing ones, such as those from the RColorBrewer package
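
A short sketch of that interpolation, assuming the "Set2" palette as the starting point (any Brewer palette would do):

```r
library(RColorBrewer)

# Start from an 8-colour Brewer palette.
base_cols <- brewer.pal(8, "Set2")

# colorRampPalette() (grDevices) returns a function that
# interpolates between the colours it was given.
make_palette <- colorRampPalette(base_cols)

# Ask for more colours than the original palette provides.
twelve <- make_palette(12)
```

Interpolated colours sit closer together than the originals, so they are harder to tell apart; the limit of about 7 qualitative colours still applies to what viewers can distinguish.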

Quantitative schemes: use color gradient with only one hue for positive values

33 / 39

More Gradients

Quantitative schemes: use color gradient with two hues for positive and negative values. Gradient should go through a light, neutral color (white)

Small objects or thin lines need more contrast than larger areas
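
A minimal sketch of a two-hue gradient through white, using a made-up grid of positive and negative values (the data and the colour names are assumptions for illustration):

```r
library(ggplot2)

# Toy data: z ranges from -9 to 9 and crosses zero.
grid <- expand.grid(x = 1:10, y = 1:10)
grid$z <- grid$x - grid$y

# Diverging scale: one hue for negative values, another for
# positive values, with white at the midpoint (zero).
p_div <- ggplot(grid, aes(x = x, y = y, fill = z)) +
  geom_tile() +
  scale_fill_gradient2(low = "purple4", mid = "white",
                       high = "darkorange", midpoint = 0)
```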

34 / 39

Factors vs. Continuous variables

  • Factor variable:
    • scale_colour_discrete
    • scale_colour_brewer(palette = ...)
  • Continuous variable:
    • scale_colour_gradient (define low, high values)
    • scale_colour_gradient2 (define low, mid, and high values)
    • Equivalents for fill: scale_fill_...
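
The distinction can be sketched with the built-in mtcars data (an assumption; the palette and colours are arbitrary). The same column is plotted once as a factor and once as a number, and each needs a matching scale:

```r
library(ggplot2)

# cyl converted to a factor: needs a discrete scale.
p_factor <- ggplot(mtcars, aes(x = wt, y = mpg, colour = factor(cyl))) +
  geom_point() +
  scale_colour_brewer(palette = "Set1")

# cyl left as a number: needs a continuous scale.
p_numeric <- ggplot(mtcars, aes(x = wt, y = mpg, colour = cyl)) +
  geom_point() +
  scale_colour_gradient(low = "lightblue", high = "darkblue")
```

Pairing a discrete scale with a numeric variable fails when the plot is built, so the type of the variable decides which family of scales is available.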

35 / 39

Color in ggplot2

  • There are packages available (ggsci, viridis, wesanderson, RColorBrewer) that have color schemes for any occasion.

36 / 39

Your Turn

data(diamonds)
  • In the diamonds data, clarity and cut are ordinal, while price and carat are continuous

  • Find a graphic that gives an overview of these four variables while respecting their types

37 / 39

Additional Resources

38 / 39
