This lecture focuses on the principles of effective chart creation and the importance of graphical integrity in data presentation. We'll cover different chart types, the choice between single or multiple series, and how to represent data accurately to avoid misrepresentation. Key concepts include maximizing the Data-to-Ink ratio, calculating the Lie Factor, and identifying distortions that can mislead viewers. By the end of this session, you will gain practical insights into how to create clear, informative, and accurate graphics using tools like Excel while avoiding common pitfalls.
QMCS 200 Types of Charts Graphical Integrity R. Smith - qmcs 200
Today’s Class
• Review last TIA homework
• Review Exam
• Finish up the first exercise
• Talk about Network Lab
• Talk about multi-series charts
• Talk about graphical integrity
• Chapter 2, HOE #2
Using Charts: What are we doing?
• One series or many?
  • 1 series: it’s easy – stacked and clustered are identical!
  • Multiple: decide between cluster, stacked, 100% stacked
• Looking at absolute or relative values?
  • Relative numbers: Pie or 100% Stacked Chart
  • Absolute numbers: Column, Area, Bar, Line, etc.
• Categories, or some sort of sequence?
  • Categories = Column or Bar Chart
  • Sequence = Line or Area Chart
• Comparing category totals or subtotals?
  • Totals = Stacked Chart
  • Subtotals = Cluster Chart
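The questions on this slide can be read as a small decision procedure. The sketch below is one possible encoding, not part of the original slides: the function name `suggest_chart` and its parameters are hypothetical, and the rule "pie for a single series, 100% stacked for several" is an assumption layered on top of the slide's "Pie or 100% Stacked Chart".

```python
def suggest_chart(series_count, relative, sequence, compare_totals=True):
    """Suggest a chart family from the slide's four questions.

    series_count   -- how many data series are plotted
    relative       -- True for shares of a whole, False for absolute numbers
    sequence       -- True if the x-axis is an ordered sequence (e.g. years),
                      False if it is a set of unordered categories
    compare_totals -- for multiple series: True to compare category totals,
                      False to compare the subtotals inside each category
    """
    if series_count < 1:
        raise ValueError("need at least one data series")

    if relative:
        # Assumption: a pie shows one series; several series need 100% stacked.
        return "pie chart" if series_count == 1 else "100% stacked chart"

    base = "line or area chart" if sequence else "column or bar chart"
    if series_count == 1:
        # With one series, stacked and clustered layouts look identical.
        return base

    layout = "stacked" if compare_totals else "clustered"
    return f"{layout} {base}"


print(suggest_chart(1, relative=False, sequence=True))
print(suggest_chart(3, relative=False, sequence=False, compare_totals=False))
```

The point of writing it this way is that each question narrows the choice independently, which mirrors how the slide walks through them.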
Handling multiple series
• Graph the data only, not the totals!
  • Including totals throws off the whole appearance of your graph
  • In a stacked chart, it makes the chart unreadable
• Pick the right axis for the chart
  • What clusters or stacked totals do you want to compare?
Building Good Graphics
Some ideas regarding quality graphics
• Graphical Integrity
  • A graphic’s most obvious interpretations must be consistent with the underlying data
  • Avoid accidental or intentional misrepresentation
• Forgo “chartjunk” that draws attention away from the data
  • Examples: moiré vibration, grids, graphical “ducks”
• Maximize the Data-to-Ink ratio
  • Data-to-Ink Ratio = (ink used to print the actual data) / (all ink used to print the graphic)
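To make the Data-to-Ink Ratio concrete, here is a minimal sketch of the calculation. The ink amounts are hypothetical numbers in arbitrary units, chosen only for illustration; in practice ink is rarely measured this directly, and the function name `data_ink_ratio` is an assumption, not something from the slides.

```python
def data_ink_ratio(data_ink, total_ink):
    """Share of a graphic's ink that presents the data itself."""
    if total_ink <= 0:
        raise ValueError("total_ink must be positive")
    if not 0 <= data_ink <= total_ink:
        raise ValueError("data_ink must lie between 0 and total_ink")
    return data_ink / total_ink


# Hypothetical ink amounts (arbitrary units) for one chart.
ink = {"data": 30.0, "grid_lines": 25.0, "decoration": 45.0}

# Ratio for the chart as drawn, with grid lines and decoration.
busy = data_ink_ratio(ink["data"], sum(ink.values()))

# Ratio for the same chart after dropping the decoration.
cleaner = data_ink_ratio(ink["data"], ink["data"] + ink["grid_lines"])

print(f"busy: {busy:.2f}, cleaner: {cleaner:.2f}")
```

Removing non-data ink leaves the numerator unchanged and shrinks the denominator, so the ratio rises toward 1.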
The “Lie Factor” – an example
• New York Times, August 9, 1978
The Lie Factor
• Computes the effect of distortion in a quantitative graphic like a chart
• Lie Factor = (size of effect shown in graphic) / (size of effect in the data)
• A Lie Factor outside the range 0.95 to 1.05 (off by more than 5%) indicates substantial distortion
  • That is more than minor drafting errors would cause
• Lie Factors of 2 to 5 are common
Calculating the Lie Factor
• Change in fuel economy from 1978-1985 = 53% (0.53)
• Change in graphic = change from 0.6” to 5.3”
  • (5.3 - 0.6)/0.6 = 7.83 = 783%
• Lie Factor = 7.83/0.53 = 14.8, almost 15 times reality
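The arithmetic on this slide can be reproduced in a few lines. This is a sketch using only the numbers given above (0.6 and 5.3 inches in the graphic, a 0.53 change in the data); the helper names and the `tolerance` parameter are assumptions introduced for illustration.

```python
def relative_change(start, end):
    """Fractional change from start to end, e.g. 0.53 for a 53% increase."""
    if start == 0:
        raise ValueError("start must be non-zero")
    return (end - start) / start


def lie_factor(effect_in_graphic, effect_in_data):
    """Size of effect shown in the graphic divided by size of effect in the data."""
    if effect_in_data == 0:
        raise ValueError("effect_in_data must be non-zero")
    return effect_in_graphic / effect_in_data


def is_substantial(factor, tolerance=0.05):
    """True if the factor falls outside 1 +/- tolerance (0.95 to 1.05 by default)."""
    return abs(factor - 1.0) > tolerance


# Line lengths in the graphic (inches) and the change in the data, from the slide.
graphic_effect = relative_change(0.6, 5.3)
factor = lie_factor(graphic_effect, 0.53)

print(f"graphic effect: {graphic_effect:.2f}, lie factor: {factor:.1f}")
print("substantial distortion:", is_substantial(factor))
```

A factor of exactly 1 would mean the graphic shows the same proportional change as the data.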
An Accurate View
• Accurate in 2 dimensions
• Puts numbers in context by including current and expected average MPG of cars on the road
Good Graphics and Excel
• It’s hard to build an inaccurate graph, unless you make a mistake
  • Including totals, swapping the axes by mistake, etc.
• It’s easy to build a “visually busy” graph
  • Excel “guesses” about colors
  • You have to adjust things to work with your printing environment
• It’s easy to drive down the data-to-ink ratio
  • 3D graphs are mostly wasted ink
  • 3D graphs can be misleading, too
Possible Distortions in Excel
• Data Problems (GIGO)
  • Ignoring monetary inflation
  • Comparing time periods of different durations
  • Omitting data that provides context
  • Ignoring other factors that differ between the categories
• Visual Problems
  • Data-to-Ink Ratio – Excel uses lots of ink and color even when not needed
  • 3D graphics – can visually misrepresent relative values
Examples
• The mis-scaled gas mileage
• Oil Prices and Inflation
• Nobel Prizes and time periods
• Context and traffic enforcement
• Influences on state budgets
• 2D and 3D ambiguities
Oil Prices and Inflation
Inconsistent Time Periods
• National Science Foundation, 1974
Context is Essential
• Two data points can’t possibly tell the story alone
So, what’s the real story?
• Different contexts yield different interpretations
• Is this a blip, or a real change, or part of a cycle?
Connecticut Statistics
• Actually, the drop was a return to previous levels
• “Why did deaths go up 1954-56?”
Regional Statistics
• Compare per capita deaths across the region over nearby years
• All states enjoyed a reduction, not just the state that cracked down
Budget Inflation Revisited
• New York State budgets and aid to municipalities
• Dramatic growth, right?
Visual Trickery
• Material on the left and the perspective applied to the early years make those years appear small
• Horizontal arrows emphasize small size
• Perspective makes the rightmost cluster stand forward, looking extra large
• Vertical arrows emphasize height
Improving the Data-to-Ink Ratio
• If we take out the extra graphics and perspective (left), we get a simple story (right)
• But the story is still misleading
Adjusting for Change
• Take into account both inflation and population change
• This story shows no significant changes in 7 years
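Adjusting for both inflation and population usually means converting each year's figure to constant dollars per person. The sketch below shows one common way to do that. The budget, price index, and population values are made up for illustration and are not the New York State figures from the slide; `real_per_capita` and its parameters are hypothetical names.

```python
import math


def real_per_capita(nominal, price_index, population, base_index=100.0):
    """Convert a nominal amount to constant (base-year) dollars per person."""
    if price_index <= 0 or base_index <= 0:
        raise ValueError("price indexes must be positive")
    if population <= 0:
        raise ValueError("population must be positive")
    deflator = price_index / base_index  # how much prices rose since the base year
    return nominal / deflator / population


# Hypothetical data: (year label, nominal budget, price index, population).
years = [
    ("base year", 1_000_000.0, 100.0, 1_000),
    ("later year", 1_650_000.0, 150.0, 1_100),
]

for label, budget, index, people in years:
    adjusted = real_per_capita(budget, index, people)
    print(f"{label}: nominal {budget:,.0f}, real per capita {adjusted:,.2f}")

# In this made-up case a 65% nominal increase is flat once adjusted.
assert math.isclose(
    real_per_capita(*years[0][1:]), real_per_capita(*years[1][1:])
)
```

Plotting the adjusted series instead of the nominal one is what removes the apparent "dramatic growth".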
A 2-D Graphics Problem
• Washington Post, October 25, 1978
• Inflation is one-dimensional
• Shrinking money in 2 dimensions is misleading
  • Dollar’s area should reflect its purchasing power
  • The 1978 dollar should be about twice as large as shown
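The reason a two-dimensional picture exaggerates a one-dimensional change is that shrinking both width and height by a ratio r shrinks the area by r squared. A minimal sketch of the corrected scaling follows. The value ratio of 0.5 is a hypothetical example, not the figure from the Washington Post graphic, and the function names are assumptions.

```python
import math


def linear_scale(value_ratio, dimensions=2):
    """Factor to apply to each side so area (or volume) tracks value_ratio."""
    if value_ratio <= 0:
        raise ValueError("value_ratio must be positive")
    if dimensions < 1:
        raise ValueError("dimensions must be at least 1")
    return value_ratio ** (1.0 / dimensions)


def area_if_sides_scaled(value_ratio):
    """Area ratio a viewer sees when both sides are naively scaled by value_ratio."""
    return value_ratio ** 2


# Hypothetical: a dollar that keeps half of its original purchasing power.
ratio = 0.5

naive_area = area_if_sides_scaled(ratio)   # what the naive drawing shows
side = linear_scale(ratio)                 # correct factor for each side
understatement = ratio / naive_area        # how much too small the naive icon is

print(f"naive area: {naive_area:.2f} of original, should be {ratio:.2f}")
print(f"scale each side by {side:.3f} instead of {ratio:.3f}")
print(f"naive icon is {understatement:.1f}x too small in area")
assert math.isclose(side * side, ratio)
```

With a ratio near one half, the naive drawing shows about half the area it should, which matches the kind of gap the slide describes.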
A 3-D Graphics Problem
• Problem with 3D image
  • Prices increased 454%
  • Graphic size increased 4,280%
  • Lie Factor = 9.4
• If we take the “barrel metaphor” seriously, the barrel volume increases 27,000%
  • Lie Factor = 59.4
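The two Lie Factors here follow directly from the percentages on the slide, as this short check shows. The function name is hypothetical. Note that the second quotient is about 59.47, so the slide's 59.4 appears to be truncated rather than rounded.

```python
def lie_factor_from_percent(graphic_change_pct, data_change_pct):
    """Lie Factor when both changes are given as percentages."""
    if data_change_pct == 0:
        raise ValueError("data_change_pct must be non-zero")
    return graphic_change_pct / data_change_pct


price_change_pct = 454.0  # increase in the data, from the slide

# Reading the barrels by their drawn size on the page.
as_drawn = lie_factor_from_percent(4280.0, price_change_pct)

# Reading the barrels as three-dimensional objects (volume).
as_volume = lie_factor_from_percent(27000.0, price_change_pct)

print(f"as drawn: {as_drawn:.2f}, as volume: {as_volume:.2f}")
```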
Excel 3D Pie Chart
• Gives Denver (19%) more ink than Miami (21%)
• Makes Boston look overwhelming
• Makes New York look much less pitiful
Maximizing the Data-to-Ink Ratio
• How much ink do you really need to display the data you have?
• Extra ink is chartjunk
  • Makes the chart feel like it says more than it really does
• A checklist
  • Extra “dimensions” like 3D blocks in a 2D bar chart
  • Shaded backgrounds – do they make the data easier to see, or do they waste ink?
  • Grid lines – do they really help people compare the data?
For More Information
• Edward Tufte’s The Visual Display of Quantitative Information
  • Chapter 2: “Graphical Integrity”
  • Also, notes from other chapters
• www.edwardtufte.com