Presentation Transcript


  1. Understanding the Methodological Foundations of Public Library National Rating Systems • Ray Lyons • Library Statistics for the 21st Century World • IFLA Satellite Meeting - Montréal, Québec, August 18-19, 2008

  2. Milestones in Statistical Thinking • Historical account by Alain Desrosières: The Politics of Large Numbers: A History of Statistical Reasoning (Harvard University Press, 1998) • Traces statistical ideas from the 15th century to the present day • Ideas that evolved from German descriptive statistics and English political arithmetic are pertinent to library statistics collection

  3. Milestones in Statistical Thinking Desrosières describes two key statistical practices: • Creation of equivalences • Establishing standard classifications to describe phenomena by focusing on similarities and ignoring differences • Encoding • Specification and use of definitions to assign individual cases to classifications

  4. Purpose of this Presentation To suggest that creating equivalent classes and other aspects of statistical data collection limit the accuracy and validity of national and local public library comparisons.

  5. National Collection of Public Library Statistics in the USA • Federal-State Cooperative System (FSCS) • Collaborative initiated in the 1980s • 1991 - First FSCS data published by the U.S. Department of Education’s National Center for Education Statistics

  6. National Collection of Public Library Statistics in the USA • Collection system renamed Public Library Statistics Cooperative (PLSC) in 2007 • PLSC recently transferred to Institute of Museum and Library Services, an agency of the federal government (www.imls.gov)

  7. National Public Library Ratings Introduced in USA • Hennen’s American Public Library Ratings (HAPLR) • Used FSCS input and output statistics for 9,000+ public libraries • Created by library consultant Thomas Hennen (a library director in Wisconsin) • Issued annually since 1999 (except for 2000 and 2007)

  8. Hennen’s American Public Library Ratings (HAPLR) • Published (and endorsed) by the American Library Association (ALA) • Calculation methods were controversial within the profession • Highly rated public libraries were delighted with the calculation methods • “Methodologically indefensible and politically priceless” - a US library director

  9. Hennen’s American Public Library Ratings (HAPLR) • Utilizes 5 enabling (input) indicators and 3 use (output) indicators
  Enabling (Input): Total staff; Materials expenditures; Total operating expenditures; Number of printed volumes; Serial subscriptions
  Use (Output): Visits; Circulation (loans); Reference transactions

  10. Hennen’s American Public Library Ratings (HAPLR) • Recombines the 8 indicators into 15 rates (ratios), similar in style to BIX
  (1) Total expenditures per capita
  (2) Materials expenditures per total expenditures
  (3) Materials expenditures per capita
  (4) Staff per 1000 population
  (5) Periodical subscriptions per 1000 population
  . . . and 10 others
  • See www.haplr-index.com
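The rates listed above are simple ratios of raw counts to a library's service population. A minimal sketch of that arithmetic in Python, using invented field names and values (these are not the actual FSCS/HAPLR data elements; HAPLR's 15 rates are defined at www.haplr-index.com):

```python
# Hypothetical library record; field names and figures are invented
# for illustration only.
library = {
    "population": 25_000,
    "total_expenditures": 1_250_000.0,
    "materials_expenditures": 150_000.0,
    "staff_fte": 20.0,
    "subscriptions": 300,
}

# The five rates named on the slide, each normalized by population
# (or, for the materials share, by total expenditures).
rates = {
    "expenditures_per_capita": library["total_expenditures"] / library["population"],
    "materials_share_of_expenditures": library["materials_expenditures"] / library["total_expenditures"],
    "materials_per_capita": library["materials_expenditures"] / library["population"],
    "staff_per_1000": library["staff_fte"] / library["population"] * 1000,
    "subscriptions_per_1000": library["subscriptions"] / library["population"] * 1000,
}

for name, value in rates.items():
    print(f"{name}: {value:.2f}")
```

Because every rate divides by a population or expenditure base, libraries of very different sizes are placed on a common scale, which is what makes the subsequent ranking step possible.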

  11. Study Conducted at NCLIS • 2006 research on the HAPLR ratings methodology overseen by the U.S. National Commission on Libraries and Information Science (NCLIS) • “Unsettling Scores: An Evaluation of the Hennen American Public Library Ratings,” Public Library Quarterly, Volume 26, Numbers 3/4, 2007 (ISSN 016-6846)

  12. Library Journal’s Public Library National Ratings Announced • LJ Index introduced June 2008* • Rates 9,200 public libraries using PLSC data • Co-designed with Keith Lance (Library Research Service, State Library of Colorado) • Emphasizes disclosing limitations of the rating methodology * Keith Curry Lance and Ray Lyons, “The New LJ Index,” Library Journal, June 15, 2008, pp. 38-41.

  13. Library Journal’s Public Library National Ratings Announced • Encourages responsible interpretation of rating results • LJ Index based upon 4 use (output) indicators: • Visits • Loans (circulation) • Internet terminal use • Program attendance

  14. Public Library Ratings Use Composite Scoring • HAPLR, BIX, and the LJ Index each calculate a single composite score summarizing each library’s performance:
  Circulation per capita + FTE staffing + Subscriptions + Internet computers + . . . + Total program attendance → Calculation Algorithm → 635 (Composite Score)
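How a calculation algorithm might collapse several rates into one number can be sketched as follows. This is an illustration only, assuming a simple percentile-rank-and-sum scheme; it is not the actual HAPLR, BIX, or LJ Index algorithm, and the library names and rates are invented:

```python
# Hypothetical composite-scoring sketch: convert each indicator rate to a
# within-group percentile rank, then sum the ranks into one score.
# NOT the actual HAPLR/BIX/LJ Index method; for illustration only.

def percentile_rank(values, v):
    """Share of values less than or equal to v, scaled to 0-100."""
    return 100.0 * sum(x <= v for x in values) / len(values)

# Invented per-capita rates (visits, circulation, program attendance)
# for three peer libraries.
peers = {
    "A": [5.2, 9.1, 0.4],
    "B": [3.8, 11.0, 0.9],
    "C": [6.5, 7.3, 0.2],
}

scores = {}
for name, lib_rates in peers.items():
    score = 0.0
    for i, rate in enumerate(lib_rates):
        column = [r[i] for r in peers.values()]  # all libraries' values for indicator i
        score += percentile_rank(column, rate)
    scores[name] = round(score)

print(scores)  # → {'A': 200, 'B': 233, 'C': 167}
```

Note how the single score hides which indicators drove it: library B outscores A here on circulation and program attendance while trailing on visits, illustrating the slides' point that composite scores treat non-equivalent units as interchangeable.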

  15. Context of Comparative Library Statistics • Comparative measures are: • Measures used as part of a more general process to assess library value and effectiveness • Measures intended for use in an ongoing process of performance measurement

  16. Performance Measurement Model [Diagram: Efforts (Resources → Services Utilized) lead to Results (Intermediate Outcomes → End Outcomes); outcome measures assess the intermediate and end outcomes.]

  17. Performance Management as Envisioned by the Public Library Association (PLA) • Planning-for-Results approach to management • Abandonment of established operational and performance standards • ALA / PLA 1987 publication, Output Measures for Public Libraries: A Manual of Standardized Procedures, defines standard statistics and collection procedures

  18. Comparative Performance Measurement • Public sector management practice • Used by state and local governments for: accountability, planning/budgeting, program monitoring, operational improvement • Uses established standards and benchmarking (Ammons, 2001)

  19. Key Problems with Library Statistics • Lack of reliable methods for identifying peer libraries • Lack of instruments for measuring comparability • Comparisons are either approximate or inaccurate • Can result in incorrect or misleading conclusions

  20. Key Problems with Library Statistics • National rating systems use simplistic or imprecise criteria for identifying peers (library type, community size, etc.) • Ignore library mission, unique community needs, institutional context, etc. • Accuracy and validity of comparisons are compromised

  21. Key Problems with Library Statistics • Lack of criteria for evaluating measures There are no ‘right’ or ‘wrong’ scores on an output measure; ‘high’ and ‘low’ values are relative. The scores must be interpreted in terms of library goals, scores on other measures, and a broad range of other factors. - Van House, Weill, and McClure (1990)

  22. A Diversionary ‘Visual Aid’ [Image: Fig Newton cookies versus a granola bar snack labeled “100% more fruit!!!”]

  23. Key Problems with Library Statistics • National rating systems apply the “More-is-Better Rule” • Views higher numbers as favorable performance, lower as unfavorable • “More activity does not necessarily mean better activity” - Van House, Weill, and McClure (1990) • Striving to earn higher numbers may compromise service quality

  24. Key Problems with Library Statistics • Collection of standard statistics assumes that all counted library resources and activities are equivalent • Standardization ignores differences in: - Complexity - Sophistication - Relevance - Quality (Merit) - Value (Worth) - Effectiveness - Efficiency - Significance

  25. Key Problems with Library Statistics • National rating systems add, subtract, and multiply these non-equivalent ‘units’ of library resources, services, and products as if they were equivalent • Final scores appear arithmetically consistent and correct even though they combine unequal units

  26. Key Problems with Library Statistics • Data imprecision due to • Inconsistent collection methods • Mistakes • Sampling error • “Gaming” • Imprecision makes individual library comparisons less accurate

  27. Key Problems with Library Statistics • Variety of reasons for insufficient scores: - Inadequate knowledge of community needs - Staff skill deficiencies - Inadequate staffing - Inefficient workflows - Inadequate planning - Limited user competencies - ... and others (Adapted from Poll and te Boekhorst, 2007)

  28. Output measures “reflect the interaction of users and library resources, constrained by the environment in which they operate. The meaning of a specific score on any measure depends on a broad range of factors including the library’s goals, the current circumstances of the library and its environment, the users, the manner in which the measure was constructed, and how the data were collected.” [emphasis added] - Van House, Weill, and McClure (1990)

  29. Improvements Needed • Fuller understanding of limitations of statistical indicators and comparison methods “The [input and output] measures are best used with other information about the library.” - Van House, Weill, and McClure (1990)

  30. Improvements Needed • Relate amounts and types of resources and services to verifiable levels of community need • Increased understanding of measurement and interpretation • Draw reasonable conclusions and interpretations • Study behavioral science measurement practices

  31. Behavioral Science Measurement Model Conceptualization → Nominal Definition → Operational Definition → Measurement in the Real World - Babbie (2007)

  32. Recommendations • Promote view of public library national rating systems as ‘contests’ with simple, arbitrary rules • Ratings are: • Quite approximate indicators of performance • Inadequate indicators of quality, excellence, and effectiveness

  33. Recommendations • Advise libraries to interpret national ratings cautiously and conscientiously • Develop and test measurement instruments for identifying peer libraries • Use ratings to inspire more robust measurement of public library performance
