
Distribution of Chi Square

                            Probability
df      .99      .98      .95      .90      .80      .70      .50

 1  .000157  .000628   .00393    .0158    .0642     .148     .455
 2    .0201    .0404     .103     .211     .446     .713    1.386
 3     .115     .185     .352     .584    1.005    1.424    2.366
 4     .297     .429     .711    1.064    1.649    2.195    3.357
 5     .554     .752    1.145    1.610    2.343    3.000    4.351

 6     .872    1.134    1.635    2.204    3.070    3.828    5.348
 7    1.239    1.564    2.167    2.833    3.822    4.671    6.346
 8    1.646    2.032    2.733    3.490    4.594    5.528    7.344
 9    2.088    2.532    3.325    4.168    5.380    6.393    8.343
10    2.558    3.059    3.940    4.865    6.179    7.267    9.342

11    3.053    3.609    4.575    5.578    6.989    8.148   10.341
12    3.571    4.178    5.226    6.304    7.807    9.034   11.340
13    4.107    4.765    5.892    7.042    8.634    9.926   12.340
14    4.660    5.368    6.571    7.790    9.467   10.821   13.339
15    5.229    5.985    7.261    8.547   10.307   11.721   14.339

16    5.812    6.614    7.962    9.312   11.152   12.624   15.338
17    6.408    7.255    8.672   10.085   12.002   13.531   16.338
18    7.015    7.906    9.390   10.865   12.857   14.440   17.338
19    7.633    8.567   10.117   11.651   13.716   15.352   18.338
20    8.260    9.237   10.851   12.443   14.578   16.266   19.337

21    8.897    9.915   11.591   13.240   15.445   17.182   20.337
22    9.542   10.600   12.338   14.041   16.314   18.101   21.337
23   10.196   11.293   13.091   14.848   17.187   19.021   22.337
24   10.856   11.992   13.848   15.659   18.062   19.943   23.337
25   11.524   12.697   14.611   16.473   18.940   20.867   24.337

26   12.198   13.409   15.379   17.292   19.820   21.792   25.336
27   12.879   14.125   16.151   18.114   20.703   22.719   26.336
28   13.565   14.847   16.928   18.939   21.588   23.647   27.336
29   14.256   15.574   17.708   19.768   22.475   24.577   28.336
30   14.953   16.306   18.493   20.599   23.364   25.508   29.336

For larger values of df, the expression $\sqrt{2\chi^2} - \sqrt{2df - 1}$ may be used as a normal deviate with unit variance, remembering that the probability of $\chi^2$ corresponds with that of a single tail of the normal curve.
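As a rough illustration of the large-df note above, the sketch below takes the square root of twice the chi-square value, subtracts the square root of (2 × df − 1), treats the result as a standard normal deviate, and reads off the single upper tail. The function name `chi_square_tail_large_df` is a hypothetical label chosen for this example; it relies only on Python's standard library (`math` and `statistics.NormalDist`, available in Python 3.8 and later).

```python
import math
from statistics import NormalDist


def chi_square_tail_large_df(chi_square: float, df: int) -> float:
    """Approximate P(X >= chi_square) for a chi-square variable with large df.

    Hypothetical helper for illustration. It applies the normal approximation
        z = sqrt(2 * chi_square) - sqrt(2 * df - 1)
    and returns the upper (single) tail of the standard normal curve at z.
    """
    if chi_square < 0 or df < 1:
        raise ValueError("chi_square must be >= 0 and df must be >= 1")
    z = math.sqrt(2.0 * chi_square) - math.sqrt(2.0 * df - 1.0)
    return 1.0 - NormalDist().cdf(z)


if __name__ == "__main__":
    # Sanity check against two entries in the df = 30 row of the table.
    for tabled_p, value in [(0.95, 18.493), (0.50, 29.336)]:
        approx = chi_square_tail_large_df(value, 30)
        print(f"df=30, chi-square={value}: table p={tabled_p}, approx p={approx:.3f}")
```

Even at df = 30, the last row of the table, the approximation should land close to the tabled probabilities, which suggests why it is offered for larger df where no tabled row exists.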

continued on the inside back cover

Basics of Research Methods for

CRIMINAL JUSTICE and CRIMINOLOGY

Second Edition

Michael G. Maxfield
Rutgers University

Earl Babbie
Chapman University

Australia • Brazil • Japan • Korea • Mexico • Singapore • Spain • United Kingdom • United States

© 2009, 2006 Wadsworth, Cengage Learning

ALL RIGHTS RESERVED. No part of this work covered by the copyright herein may be reproduced, transmitted, stored, or used in any form or by any means graphic, electronic, or mechanical, including but not limited to photocopying, recording, scanning, digitizing, taping, Web distribution, information networks, or information storage and retrieval systems, except as permitted under Section 107 or 108 of the 1976 United States Copyright Act, without the prior written permission of the publisher.

For product information and technology assistance, contact us at Cengage Learning Customer & Sales Support, 1-800-354-9706.

For permission to use material from this text or product, submit all requests online at cengage.com/permissions.

Further permissions questions can be emailed to permissionrequest@cengage.com.

Library of Congress Control Number:

ISBN-13: 978-0-495-50385-9
ISBN-10: 0-495-50385-1

Wadsworth
10 Davis Drive
Belmont, CA 94002-3098
USA

Cengage Learning is a leading provider of customized learning solutions with office locations around the globe, including Singapore, the United Kingdom, Australia, Mexico, Brazil, and Japan. Locate your local office at international.cengage.com/region.

Cengage Learning products are represented in Canada by Nelson Education, Ltd.

For your course and learning solutions, visit academic.cengage.com.

Purchase any of our products at your local college store or at our preferred online store www.ichapters.com.

Basics of Research Methods for Criminal Justice and Criminology, Second Edition

Michael G. Maxfield and Earl Babbie

Senior Editor, Criminal Justice: Carolyn Henderson Meier

Assistant Editor: Meaghan Banks

Editorial Assistant: John Chell

Technology Project Manager: Bessie Weiss

Marketing Manager: Michelle Williams

Marketing Assistant: Jillian Myers

Marketing Communications Manager: Tami Strang

Project Manager, Editorial Production: Jennie Redwitz

Creative Director: Rob Hugel

Art Director: Maria Epes

Print Buyer: Paula Vang

Permissions Editor: Bob Kauser

Production Service: Linda Jupiter Productions

Copy Editor: Lunaea Weatherstone

Proofreader: Henrietta Bensussen

Indexer: Katherine Simpson

Illustrator: Newgen

Cover Designer: Yvo Riezebos, Riezebos Holzbaur Design Group

Cover Image: (c) George Hammerstein/Solus-Veer/Corbis

Compositor: Newgen

Printed in Canada
1 2 3 4 5 6 7 12 11 10 09 08

To Max Jacob Fauth


About the Authors

Michael G. Maxfield is Professor of Criminal Justice at Rutgers University, Newark. He is the author of numerous articles and books on a variety of topics, including victimization, policing, homicide, community corrections, and long-term consequences of child abuse and neglect. He is the coauthor (with Earl Babbie) of the textbook Research Methods for Criminal Justice and Criminology, now in its fifth edition, and co-editor (with Mike Hough) of Surveying Crime in the 21st Century, in the Crime Prevention Studies series. Other recent work includes a POP Center guide on the problem of abandoned vehicles (forthcoming) and a special issue of Criminal Justice Policy Review on environmental criminology. Formerly a Visiting Fellow at the National Institute of Justice, Maxfield works with a variety of public agencies and other organizations, acting as a consultant and advocate of frugal evaluation for justice policy. Recent projects initiated collaboration with police departments and other justice agencies in the areas of repeat domestic violence, performance measurement systems, and auto theft. Maxfield received his Ph.D. in political science from Northwestern University.

Earl Babbie grew up in small-town Vermont and New Hampshire, venturing into the outer world by way of Harvard, the U.S. Marine Corps, the University of California, Berkeley, and 12 years of teaching at the University of Hawai’i. Along the way, he married Sheila (two months after their first date), and created Aaron three years after that. He resigned from teaching in 1980 and wrote full-time for seven years, until the call of the classroom became too loud to ignore. To him, teaching is like playing jazz: even if you perform the same number over and over, it never comes out the same twice, and you don’t know exactly what it’ll sound like until you hear it. Teaching is like writing with your voice. Recently he has rediscovered his roots in summer trips to Vermont. Rather than a return to the past, it feels more like the next turn in a widening spiral, and he can’t wait to see what’s around the next bend.


Brief Contents

PART ONE: An Introduction to Criminal Justice Inquiry 1

Chapter 1: Criminal Justice and Scientific Inquiry 2

Chapter 2: Ethics and Criminal Justice Research 26

PART TWO: Structuring Criminal Justice Inquiry 49

Chapter 3: General Issues in Research Design 50

Chapter 4: Concepts, Operationalization, and Measurement 80

Chapter 5: Experimental and Quasi-Experimental Designs 112

PART THREE: Modes of Observation 139

Chapter 6: Sampling 140

Chapter 7: Survey Research and Other Ways of Asking Questions 169

Chapter 8: Field Research 200

Chapter 9: Agency Records, Content Analysis, and Secondary Data 229

PART FOUR: Application and Analysis 253

Chapter 10: Evaluation Research and Problem Analysis 254

Chapter 11: Interpreting Data 287


Contents

Preface xiii

PART ONE: An Introduction to Criminal Justice Inquiry 1

Chapter 1: Criminal Justice and Scientific Inquiry 2

Introduction 3
HOME DETENTION 4
What Is This Book About? 4
Two Realities 4
The Role of Science 6
Personal Human Inquiry 6
Tradition 7
Authority 7
ARREST AND DOMESTIC VIOLENCE 8
Errors in Personal Human Inquiry 8
Inaccurate Observation 8
Overgeneralization 8
Selective Observation 9
Illogical Reasoning 10
Ideology and Politics 10
To Err Is Human 10
Foundations of Social Science 11
Theory, Not Philosophy or Belief 11
Regularities 13
What about Exceptions? 13
Aggregates, Not Individuals 13
A Variable Language 14
Variables and Attributes 15
Variables and Relationships 18
Purposes of Research 18
Exploration 18
Description 19
Explanation 19
Application 20
Differing Avenues for Inquiry 20
Idiographic and Nomothetic Explanations 21
Inductive and Deductive Reasoning 22
Quantitative and Qualitative Data 23
Knowing through Experience: Summing Up and Looking Ahead 24
Main Points 24

Chapter 2: Ethics and Criminal Justice Research 26

Introduction 27
Ethical Issues in Criminal Justice Research 27
No Harm to Participants 27
ETHICS AND EXTREME FIELD RESEARCH 28
Voluntary Participation 31
Anonymity and Confidentiality 32
Deceiving Subjects 33
Analysis and Reporting 33
Legal Liability 34
Special Problems 35
Promoting Compliance with Ethical Principles 37
Codes of Professional Ethics 37
Institutional Review Boards 38
Institutional Review Board Requirements and Researcher Rights 41
ETHICS AND JUVENILE GANG MEMBERS 42
Ethical Controversies 42
The Stanford Prison Experiment 42
Discussion Examples 45
Main Points 46

PART TWO: Structuring Criminal Justice Inquiry 49

Chapter 3: General Issues in Research Design 50

Introduction 51
Causation in the Social Sciences 51
Criteria for Causality 52
Necessary and Sufficient Causes 53
Validity and Causal Inference 53
Statistical Conclusion Validity 53


Internal Validity 55
External Validity 55
Construct Validity 55
Validity and Causal Inference Summarized 57
Does Drug Use Cause Crime? 57
CAUSATION AND DECLINING CRIME IN NEW YORK CITY 58
Introducing Scientific Realism 60
Units of Analysis 61
Individuals 61
Groups 61
Organizations 62
Social Artifacts 62
The Ecological Fallacy 63
Units of Analysis in Review 63
UNITS OF ANALYSIS IN THE NATIONAL YOUTH GANG SURVEY 64
The Time Dimension 65
Cross-Sectional Studies 66
Longitudinal Studies 66
Approximating Longitudinal Studies 67
The Time Dimension Summarized 70
How to Design a Research Project 70
The Research Process 71
Getting Started 73
Conceptualization 73
Choice of Research Method 74
Operationalization 74
Population and Sampling 74
Observations 75
Analysis 75
Application 75
Research Design in Review 75
The Research Proposal 76
Elements of a Research Proposal 76
Answers to the Units-of-Analysis Exercise 78
Main Points 78

Chapter 4: Concepts, Operationalization, and Measurement 80

Introduction 81
Conceptions and Concepts 81
Conceptualization 83
Indicators and Dimensions 83
WHAT IS RECIDIVISM? 84
Creating Conceptual Order 84
Operationalization Choices 86
Measurement as Scoring 87
JAIL STAY 88
Exhaustive and Exclusive Measurement 88
Levels of Measurement 89
Implications of Levels of Measurement 91
Criteria for Measurement Quality 92
Reliability 93
Validity 94
Measuring Crime 97
General Issues in Measuring Crime 97
UNITS OF ANALYSIS AND MEASURING CRIME 98
Measures Based on Crimes Known to Police 98
Victim Surveys 102
Surveys of Offending 103
Measuring Crime Summary 104
Composite Measures 105
Typologies 106
An Index of Disorder 107
Measurement Summary 109
Main Points 109

Chapter 5: Experimental and Quasi-Experimental Designs 112

Introduction 113
The Classical Experiment 113
Independent and Dependent Variables 114
Pretesting and Posttesting 114
Experimental and Control Groups 115
Double-Blind Experiments 116
Selecting Subjects 116
Randomization 117
Experiments and Causal Inference 117
Experiments and Threats to Validity 118
Threats to Internal Validity 118


Ruling Out Threats to Internal Validity 120
Generalizability and Threats to Validity 121
Variations in the Classical Experimental Design 123
Quasi-Experimental Designs 124
Nonequivalent-Groups Designs 125
Cohort Designs 128
Time-Series Designs 128
Variations in Time-Series Designs 132
Variable-Oriented Research and Scientific Realism 133
Experimental and Quasi-Experimental Designs Summarized 135
Main Points 136

PART THREE: Modes of Observation 139

Chapter 6: Sampling 140

Introduction 141
The Logic of Probability Sampling 141
Conscious and Unconscious Sampling Bias 143
Representativeness and Probability of Selection 144
Probability Theory and Sampling Distribution 145
The Sampling Distribution of 10 Cases 145
From Sampling Distribution to Parameter Estimate 149
Estimating Sampling Error 150
Confidence Levels and Confidence Intervals 151
Probability Theory and Sampling Distribution Summed Up 152
Populations and Sampling Frames 153
Types of Sampling Designs 154
Simple Random Sampling 154
Systematic Sampling 154
Stratified Sampling 155
Disproportionate Stratified Sampling 156
Multistage Cluster Sampling 157
Multistage Cluster Sampling with Stratification 158
Illustration: Two National Crime Surveys 160
The National Crime Victimization Survey 160
The British Crime Survey 161
Probability Sampling in Review 162
Nonprobability Sampling 162
Purposive Sampling 162
Quota Sampling 163
Reliance on Available Subjects 164
Snowball Sampling 165
Nonprobability Sampling in Review 166
Main Points 166

Chapter 7: Survey Research and Other Ways of Asking Questions 169

Introduction 170
Topics Appropriate to Survey Research 171
Counting Crime 171
Self-Reports 171
Perception and Attitudes 172
Targeted Victim Surveys 172
Other Evaluation Uses 172
Guidelines for Asking Questions 173
Open-Ended and Closed-Ended Questions 173
Questions and Statements 174
Make Items Clear 174
Short Items Are Best 174
Avoid Negative Items 174
Biased Items and Terms 175
Designing Self-Report Items 175
Questionnaire Construction 177
General Questionnaire Format 177
Contingency Questions 177
Matrix Questions 178
Ordering Items in a Questionnaire 180
DON’T START FROM SCRATCH! 181
Self-Administered Questionnaires 181
Mail Distribution and Return 182
Warning Mailings and Cover Letters 182
Follow-Up Mailings 183
Acceptable Response Rates 183
Computer-Based Self-Administration 184
In-Person Interview Surveys 185
The Role of the Interviewer 185


Coordination and Control 186
Computer-Assisted In-Person Interviews 187
Telephone Surveys 189
Computer-Assisted Telephone Interviewing 190
Comparison of the Three Methods 191
Strengths and Weaknesses of Survey Research 192
Other Ways of Asking Questions 194
Specialized Interviewing 194
Focus Groups 195
Should You Do It Yourself? 196
Main Points 198

Chapter 8: Field Research 200

Introduction 201
Topics Appropriate to Field Research 202
The Various Roles of the Observer 203
Asking Questions 205
Gaining Access to Subjects 207
Gaining Access to Formal Organizations 207
Gaining Access to Subcultures 210
Selecting Cases for Observation 210
Purposive Sampling in Field Research 212
Recording Observations 214
Cameras and Voice Recorders 214
Field Notes 215
Structured Observations 216
Linking Field Observations and Other Data 217
Illustrations of Field Research 219
Field Research on Speeding and Traffic Enforcement 219
CONDUCTING A SAFETY AUDIT 220
Bars and Violence 222
Strengths and Weaknesses of Field Research 224
Validity 224
Reliability 225
Generalizability 226
Main Points 227

Chapter 9: Agency Records, Content Analysis, and Secondary Data 229

Introduction 230
Topics Appropriate for Agency Records and Content Analysis 230
Types of Agency Records 232
Published Statistics 232
Nonpublic Agency Records 234
New Data Collected by Agency Staff 236
IMPROVING POLICE RECORDS OF DOMESTIC VIOLENCE 238
Reliability and Validity 239
Sources of Reliability and Validity Problems 240
HOW MANY PAROLE VIOLATORS WERE THERE LAST MONTH? 242
Content Analysis 244
Coding in Content Analysis 244
Illustrations of Content Analysis 246
Secondary Analysis 247
Sources of Secondary Data 248
Advantages and Disadvantages of Secondary Data 249
Main Points 250

PART FOUR: Application and Analysis 253

Chapter 10: Evaluation Research and Problem Analysis 254

Introduction 255
Topics Appropriate for Evaluation Research and Problem Analysis 255
The Policy Process 256
Linking the Process to Evaluation 257
Getting Started 260
Evaluability Assessment 260
Problem Formulation 261
Measurement 263
Designs for Program Evaluation 266
Randomized Evaluation Designs 266
Home Detention: Two Randomized Studies 269
Quasi-Experimental Designs 271
Other Types of Evaluation Studies 273
Problem Analysis and Scientific Realism 273
Problem-Oriented Policing 274
Auto Theft in Chula Vista 275


Other Applications of Problem Analysis 276
Space- and Time-Based Analysis 276
Scientific Realism and Applied Research 280
The Political Context of Applied Research 282
Evaluation and Stakeholders 282
WHEN POLITICS ACCOMMODATES FACTS 283
Politics and Objectivity 284
Main Points 285

Chapter 11: Interpreting Data 287

Introduction 288
Univariate Description 288
Distributions 288
Measures of Central Tendency 289
Measures of Dispersion 291
Comparing Measures of Dispersion and Central Tendency 293
Computing Rates 295
Describing Two or More Variables 296
Bivariate Analysis 296
MURDER ON THE JOB 298
Multivariate Analysis 301
Inferential Statistics 303
Univariate Inferences 304
Tests of Statistical Significance 305
Visualizing Statistical Significance 306
Chi Square 307
Cautions in Interpreting Statistical Significance 309
Main Points 311

Glossary 313
References 321
Name Index 332
Subject Index 334


Preface

Since the first edition of Research Methods for Criminal Justice and Criminology (RMCJC) was published in 1995, we have been delighted to hear comments from instructors who have used the text (and from a few who do not use it!). Though it is always gratifying to learn of positive reactions, we have also listened to suggestions for revising the book through its five editions. Some colleagues suggested trimming the text substantially to focus on the most important principles of research methods in criminal justice. Students and instructors are also increasingly sensitive to the cost of college texts.

As a result, we introduced Basics of Research Methods for Criminal Justice and Criminology about three years ago. Our objective in producing that text was fivefold: (1) retain the key elements of the parent text; (2) concentrate on fundamental principles of research design; (3) appeal to a broad variety of teaching and learning styles; (4) retain salient examples that illustrate various methods; (5) reduce less-central points of elaboration and the examples used to illustrate them. That proved to be more challenging than we initially thought. At one point we were tempted to do something simple like drop two chapters, wrap the result in a soft cover, and declare what was left to be the basics. Fortunately that sentiment was reined in and we pursued a more deliberate approach that involved planning from the ground up.

Basics is shorter, more concise, and focused on what we believe is the most central material for introductory courses in research methods. Rather than simply offering a truncated version of the full text, Basics has been crafted to appeal to those seeking a more economical alternative while retaining the big book’s highly successful formula. Many instructors teaching shorter courses, or courses where students are better served by concentrating on basic principles of criminal justice research, have used the Basics edition. Others, especially instructors teaching introductory graduate courses, prefer the more extensive coverage offered in RMCJC.

Organization

The overall organization of Basics follows RMCJC. Part One introduces research methods. Chapter 1 begins with a brief treatment of the epistemology of social science. We then describe the role of theory and different general approaches to empirical research in criminal justice. Chapter 2 considers the ethics of conducting research in such a sensitive area of social life. We trace the foundations of efforts to protect human subjects, then describe different ways ethical principles are operationalized by researchers.

Part Two examines the main elements of planning empirical research. Chapter 3 describes three important topics: causation, units of analysis, and the time dimension. This chapter concludes with a step-by-step consideration of how to plan research and prepare a research proposal. In Chapter 4 we describe measurement in general, including a brief version of material on measuring crime from RMCJC. Even with the abbreviated presentation here, we believe the coverage of measurement concerns is the most rigorous available in any undergraduate text on criminal justice research methods. Chapter 5 examines research design, with extensive treatment of experimental and quasi-experimental approaches. As in RMCJC, we also consider scientific realism as an approach to research that complements traditional treatments of design.

Part Three covers data collection in some detail, albeit more concisely than in the larger text. Chapter 6 describes sampling, with its foundations in probability theory. We also discuss different techniques of nonprobability sampling. Finally, we consider combined approaches such as adaptive sampling. Each of the next three chapters centers on a general category of data collection: survey research and other ways of asking questions (Chapter 7); field observation, including systematic and ethnographic approaches (Chapter 8); existing data collected by justice agencies, and secondary data analysis (Chapter 9).

In Part Four we present chapters on applied research and an introduction to data analysis. This follows the organization of RMCJC, though these chapters are somewhat briefer than in the larger book. Our treatment of applied research (Chapter 10) has always been well received by instructors and students using RMCJC.

Features of the New Edition

We are gratified that both texts have been so well received. At the same time, we are grateful to have been given a number of ideas from colleagues about how to improve Basics. Some of the changes in this new edition stem from suggestions by reviewers or colleagues who have used the text. Other revisions are drawn from the fifth edition of RMCJC.

• Our discussion of theory and criminal justice research has been streamlined and moved into Chapter 1, “Criminal Justice and Scientific Inquiry.” This responds to suggestions that a more concise presentation of theory would aid student understanding.

• Chapter 2, “Ethics and Criminal Justice Research,” is now devoted exclusively to ethics in criminal justice research, drawing on the more complete discussion in RMCJC. Students will benefit from the more complete consideration of ethics.

• Chapter 3, “General Issues in Research Design,” includes guidelines on developing a research proposal. This chapter also includes updated examples of scientific realism. Each of these features, adapted from the larger text, will help students better understand the research process.

• Sorting out validity in measurement and causal inference can be difficult for students. We present a more concise discussion of these topics in Chapter 4, “Concepts, Operationalization, and Measurement,” and Chapter 5, “Experimental and Quasi-Experimental Designs,” which will help students grasp these important concepts.

• The introductory material on data collection modes has been cut from Chapter 6, “Sampling,” which now focuses entirely on sampling. We have added new material on probability sampling from RMCJC. Likewise, Chapter 7, “Survey Research and Other Ways of Asking Questions,” includes revised guidance on computer-assisted interviewing and the scope for greater use of web-based survey techniques. Updated material draws largely on a book edited by Mike Hough and Mike Maxfield: Surveying Crime in the 21st Century (Monsey, NY: Criminal Justice Press; London: Willan; 2007). Together, these revisions highlight the important role of case selection, while presenting updated material on different approaches to sampling.

• Chapter 10, “Evaluation Research and Problem Analysis,” follows the RMCJC shift in focus from policy analysis to problem analysis. This reflects the growing use of evidence-based planning by justice agencies. Among other things, this produces broader coverage of applied research methods.

Popular features from the first edition have been retained, resulting in an up-to-date, concise presentation of evolving methods in criminal justice research. We are happy to present this revised edition and look forward to hearing from instructors and students.

Learning Tools

As has always been the case in RMCJC, our approach to this text is student-centered. We combine a solid discussion of principles with a number of examples. Over the five editions of RMCJC we have struck a good balance, and that is carried over into this edition of Basics. The end of each chapter presents additional tools to aid student learning. Main Points summarizes topics with a brief statement that should trigger student retention. This is followed by Key Terms, each of which is introduced in the chapter. Each key term is also presented in a glossary at the end of the book. We offer a few Review Questions and Exercises that are designed for class discussion. In our own teaching, we ask students to review these items before class.

Ancillaries

A number of supplements are provided by Wadsworth to help instructors use Basics of Research Methods for Criminal Justice and Criminology, Second Edition, in their courses and to aid students in preparing for exams. Supplements are available to qualified adopters. Please consult your local sales representative for details.

Instructor Resources

• Instructor’s Resource Manual with Test Bank: Fully updated and revised, the Instructor’s Resource Manual with Test Bank for this edition includes learning objectives, detailed chapter outlines, chapter summaries, key terms, class discussion exercises, lecture suggestions, and a complete test bank. Each chapter’s test bank contains multiple-choice, true-false, fill-in-the-blank, and essay questions (approximately 75 questions in all), along with a complete answer key.

• eBank Microsoft® PowerPoint® slides: Microsoft PowerPoint slides are provided to assist you in preparing for your lectures. Available online, the slides are fully customizable to your course.

• Classroom Activities for Criminal Justice: Stimulate student engagement with a compilation of the best of the best in criminal justice classroom activities. Novice and seasoned instructors will find this booklet a powerful course customization tool containing tried-and-true favorites and exciting new projects drawn from the spectrum of criminal justice subjects, including introduction to criminal justice, criminology, corrections, criminal law, policing, and juvenile justice.

Student Resources

• Crime Scenes 2.0: Bring criminal justice to life with this interactive simulation CD-ROM featuring six scenarios of various crimes (juvenile murder, prostitution, assault, arresting force/DUI, search and seizure, and embezzlement/white-collar crime) to illustrate all the stages of the criminal justice system. Students make choices about the outcomes at various decision points in each scenario, illustrating the consequences of each choice. Use the scenarios to introduce or review concepts, spark class discussion, or as a basis for group research projects. Written by Bruce Berg (California State University, Long Beach), this CD-ROM was awarded gold and silver medals by New Media magazine.

• Current Perspectives: Designed to give students a deeper understanding of special topics in criminal justice, the timely articles in the Current Perspectives readers are selected by experts in each topic from within InfoTrac® College Edition. Each reader includes access to InfoTrac® College Edition. Topics available include:

    • Juvenile Justice
    • Cybercrime
    • Terrorism and Homeland Security
    • Public Policy and Criminal Justice
    • New Technologies and Criminal Justice
    • Racial Profiling
    • White Collar Crime
    • Victimology (publishing 2008)
    • Forensics and Criminal Investigation (publishing 2008)
    • Ethics and Criminal Justice (publishing 2008)

• Guide to Careers in Criminal Justice, Third Edition: This handy guide, compiled by Caridad Sanchez-Leguelinel of John Jay College of Criminal Justice, gives students information on a wide variety of career paths, including requirements, salaries, training, contact information for key agencies, and employment outlooks.

• Handbook of Selected Supreme Court Cases for Criminal Justice: This supplementary handbook covers almost 40 landmark cases, each of which includes a full case citation, an introduction, a summary from Westlaw, excerpts from the case, and the decision. The updated edition includes Hamdi v. Rumsfeld, Roper v. Simmons, Ring v. Arizona, Atkins v. Virginia, Illinois v. Caballes, and much more.

• Internet Activities for Criminal Justice: In addition to providing a wide range of activities for any criminal justice class, this booklet familiarizes students with Internet resources useful both to students of and professionals in criminal justice. Internet Activities for Criminal Justice integrates Internet resources and addresses with important topics such as criminal and police law, policing organizations, policing challenges, corrections systems, juvenile justice, criminal trials, and current issues in criminal justice.

• Internet Guide for Criminal Justice, Second Edition: Intended for the novice user, this guide provides students with background and vocabulary necessary to navigate and understand the Web, then provides them with a wealth of criminal justice websites and Internet project ideas.

• Writing and Communicating for Criminal Justice: This booklet provides students with a basic introduction to academic, professional, and research writing in criminal justice. It contains articles on writing skills, a basic grammar review, and a survey of verbal communication on the job that will benefit students in their professional careers.

• Companion Website: The book-specific website at academic.cengage.com/criminaljustice/maxfield offers students a variety of study tools and useful resources such as a tutorial quiz, glossary, flash cards, and additional study aids.

Acknowledgments

Several reviewers made perceptive and useful comments on the first edition of Basics. We thank them for their insights and suggestions:

Brian Forst, American University
Shaun Gabbidon, Pennsylvania State University, Harrisburg
David Jenks, California State University, Los Angeles
Elizabeth McConnell, University of Houston, Downtown
J. Mitchell Miller, University of South Carolina
Wayne Pitts, University of Memphis
Sudipto Roy, Indiana State University
Michael Sabath, San Diego State University
Theodore Skotnicki, Niagara County Community College
Clete Snell, University of Houston, Downtown
Dennis Stevens, University of Southern Mississippi

This edition continues to benefit from contributions by students at the Rutgers University School of Criminal Justice. We thank Dr. Carsten Andresen (now at the Travis County Department of Community Corrections and Supervision), Dr. Gisela Bichler (now at California State University, San Bernardino), Dr. Sharon Chamard (now at University of Alaska), Shuryo Fujita, Galma Jahic (now at Istanbul Bilgi University, Turkey), Dr. Jarret Lovell (now at California State University, Fullerton), Dr. Marie Mele (now at Monmouth University), Dr. Dina Perrone (now at Bridgewater State College), and Dr. Christopher Sullivan (now at the University of South Florida).

We are especially grateful for the excellent support and assistance we can always count on from people at Cengage Learning: Carolyn Henderson Meier, Jennie Redwitz, Michelle Williams, and Meaghan Banks. Special thanks to copy editor Lunaea Weatherstone and production coordinator Linda Jupiter.


Part One

An Introduction to Criminal Justice Inquiry

What comes to mind when you encounter the word science? What do you think of when we describe criminal justice as a social science? For some people, science is mathematics; for others, it is white coats and laboratories. Some confuse it with technology or equate it with difficult high school or college courses.

Science is, of course, none of these things per se, but it is difficult to specify what exactly science is. Scientists, in fact, disagree on the proper definition. Some object to the whole idea of social science; others question more specifically whether criminal justice can be a social science.

For the purposes of this book, we view science as a method of inquiry: a way of learning and knowing things about the world around us. Like other ways of learning and knowing about the world, science has some special characteristics. We’ll examine these traits in this opening set of chapters. We’ll also see how the scientific method of inquiry can be applied to the study of crime and criminal justice.

Part One lays the groundwork for the rest of the book by examining the fundamental characteristics and issues that make science different from other ways of knowing things. Chapter 1 begins with a look at native human inquiry, the sort of thing all of us have been doing all our lives. Because people sometimes go astray in trying to understand the world around them, we’ll consider the primary characteristics of scientific inquiry that guard against those errors.

Chapter 2 deals with the ethics of social science research. The study of crime and criminal justice often presents special challenges with regard to ethics. We’ll see that most ethical questions are rooted in two fundamental principles: (1) research subjects should not be harmed, and (2) their participation must be voluntary.

The overall purpose of Part One, therefore, is to construct a backdrop against which to view more specific aspects of research design and execution. By the time you complete the chapters in Part One, you’ll be ready to look at some of the more concrete aspects of criminal justice research.


Chapter 1

Criminal Justice and Scientific Inquiry

People learn about their world through a variety of methods, and they often make mistakes along the way. Science is different from other ways of learning and knowing. We’ll consider the foundations of social science, different purposes of research, and different general approaches to social science.

Introduction
HOME DETENTION
What Is This Book About?
Two Realities
The Role of Science
Personal Human Inquiry
Tradition
Authority
ARREST AND DOMESTIC VIOLENCE
Errors in Personal Human Inquiry
Inaccurate Observation
Overgeneralization
Selective Observation
Illogical Reasoning
Ideology and Politics
To Err Is Human
Foundations of Social Science
Theory, Not Philosophy or Belief
Regularities
What about Exceptions?
Aggregates, Not Individuals
A Variable Language

Chapter 1 Criminal Justice and Scientific Inquiry 3

Introduction

Criminal justice professionals are both consumers and producers of research.

Spending a semester studying criminal justice research methodology may not be high on your list of “Fun Things to Do.” Perhaps you are or plan to be a criminal justice professional and are thinking, “Why do I have to study research methods? When I graduate, I’ll be working in probation (or law enforcement, or corrections, or court services), not conducting research! I would benefit more from learning about probation counseling (or police management, or corrections policy, or court administration).” Fair enough. But as a criminal justice professional, you will need to be a consumer of research. One objective of this book is to help you become an informed consumer of research.

For example, findings from an experimental study of policing, the Kansas City Preventive Patrol Experiment, appeared to contradict a fundamental belief that a visible police patrol force prevents crime. Acting as a consumer of research findings, a police officer, supervisor, or executive should be able to understand how that research was conducted and how the study’s findings might apply in his or her department.

Most criminal justice professionals, especially those in supervisory roles, routinely review various performance reports and statistical tabulations. A continually growing number of research reports may now be found on the Internet. For example, the National Criminal Justice Reference Service (NCJRS) was established to archive and distribute research reports to criminal justice professionals and researchers around the world. Many such reports are prepared specifically to keep the criminal justice community informed about new research developments and may be downloaded from the NCJRS website (www.ncjrs.gov, accessed May 6, 2008). An understanding of research methods can help decision makers critically evaluate such reports and recognize when methods are properly and improperly applied. The box titled “Home Detention” describes an example of how knowledge of research methods can help policy makers avoid mistakes.

Another objective of this book is to help you produce research. In other courses you take or in your job, you may become a producer of research. Probation officers sometimes test new approaches to supervising or counseling clients, and police officers try new methods of dealing with recurring problems. Many cities

Variables and Attributes
Variables and Relationships
Purposes of Research
Exploration
Description
Explanation
Application
Differing Avenues for Inquiry
Idiographic and Nomothetic Explanations
Inductive and Deductive Reasoning
Quantitative and Qualitative Data
Knowing through Experience: Summing Up and Looking Ahead

4 Part One An Introduction to Criminal Justice Inquiry

Two Realities

Ultimately, we live in a world of two realities. Part of what we know could be called our “experiential reality”: the things we know from direct experience. If you dive into a glacial stream flowing down through the Canadian Rockies, you don’t need anyone to tell you the water is cold; you notice that all by yourself. And if you step on a piece of broken glass, you know it hurts without anyone telling you. These are things you experience.

The other part of what we know could be called our “agreement reality”: the things we consider real because we’ve been told they’re real, and everyone else seems to agree they are real. A big part of growing up in any society, in fact, is learning to accept what everybody around us “knows” to be true. If we don’t know those same things, we can’t really be a part of society. If you were to seriously question a geography professor as to whether the sun really

and states have a compelling need to evaluate services provided to offenders released from prison or jail. Determining whether changes or existing programs are effective is an example of applied research. A problem-solving approach, rooted in systematic research, is being used in more and more police departments and in many other criminal justice agencies as well. Therefore criminal justice professionals need to know not only how to interpret research accurately but also how to produce accurate research.

What Is This Book About?

This book focuses on how we know what we know.

This book focuses on how we learn. Although you will come away from the book knowing many things you don’t know right now, our primary purpose is to help you look at how you know things, not what you know.

HOME DETENTION

Home detention with electronic monitoring (ELMO) was widely adopted as an alternative punishment in the United States in the 1980s. The technology for this new sanction was made possible by advances in telecommunications and computer systems. Prompted by growing prison and jail populations, not to mention sales pitches by equipment manufacturers, criminal justice officials embraced ELMO. Questions about the effectiveness of these programs quickly emerged, however, and led to research to determine whether the technology worked. Comprehensive evaluations were conducted in Marion County (Indianapolis), Indiana. Selected findings from these studies illustrate the importance of understanding research methods in general and the meaning of various ways to measure program success in particular.

ELMO programs directed at three groups of people were studied: (1) convicted adult offenders, (2) adults charged with a crime and awaiting trial, and (3) juveniles convicted of burglary or theft. People in each of the three groups were assigned to home detention for a specified time. They could complete the program in one of three ways: (1) successful release after serving their term, (2) removal due to rule violations, such as being arrested again or violating program rules, or (3) running away, or absconding. The agencies that administered each program were required to submit regular reports to county officials on how many individuals in each category completed their home-detention terms. The accompanying table summarizes the program-completion types during the evaluation study.

                  Convicted     Pretrial      Juveniles
                  Adults (%)    Adults (%)    (%)
Success           81            73            99
Rule violation    14            13            1
Abscond           5             14            0


sets in the west, you’d quickly find yourself set apart from other people. The first reality is a product of our own experience; the second is a product of what people have told us.

To illustrate the difference between agreement and experiential realities, consider preventive police patrol. The term “preventive” implies that when police patrol their assigned beats they prevent crime. Police do not prevent all crime, of course, but it is a commonsense belief that a visible, mobile police force will prevent some crimes. In fact, the value of patrol in preventing crime was a fundamental principle of police operations for many years. A 1967 report on policing for President Lyndon Johnson by the President’s Commission on Law Enforcement and Administration of Justice (p. 1) stated that “the heart of the police effort against crime is patrol. . . . The object of patrol is to disperse policemen in a way that will eliminate or reduce the opportunity for misconduct and to increase the probability that a criminal will be apprehended while he is committing a crime or immediately thereafter.”

Seven years later, the Police Foundation, a private research organization, published results from an experimental study that presented a dramatic challenge to the conventional wisdom on police patrol. Known as the Kansas City Preventive Patrol Experiment, this study compared police beats with three levels of preventive patrol: (1) control beats, with one car per beat; (2) proactive beats, with two or three cars per beat; and (3) reactive beats, with no routine preventive patrol. After almost one year, researchers examined data from the three types of beats and found no differences in crime rates, citizen satisfaction with police, fear of crime, or other measures of police performance (Kelling, Pate, Dieckman, and Brown 1974).

Additional studies conducted in the 1970s cast doubt on other fundamental assumptions about police practices. A quick response to crime reports made no difference in arrests,

These percentages, reported by agencies to county officials, indicate that the juvenile program was a big success; virtually all juveniles were successfully released.

Now consider some additional information on each program collected by the evaluation team. Data were gathered on new arrests of program participants and on the number of successful computerized telephone calls to participants’ homes.

                    Convicted     Pretrial      Juveniles
                    Adults (%)    Adults (%)    (%)
New arrest          5             1             11
Successful calls    53            52            17

As the table shows, many more juveniles were arrested, and juveniles successfully answered a much lower percentage of telephone calls to their homes. What happened?

The simple answer is that the staff responsible for administering the juvenile program were not keeping track of offenders. The ELMO equipment was not maintained properly, and police were not visiting the homes of juveniles as planned. Because staff were not keeping track of program participants, they were not aware that many juveniles were violating the conditions of home detention. And because they did not detect violations, they naturally reported that the vast majority of young burglars and thieves completed their home detention successfully.

A county official who relied only on agency reports of program success would have made a big mistake in judging the juvenile program to be 99 percent successful. In contrast, an informed consumer of such reports would have been skeptical of a 99 percent success rate and searched for more information.

Source: Adapted from Maxfield and Baumer (1991) and Baumer, Maxfield, and Mendelsohn (1993).
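The kind of cross-check an informed consumer might make can be sketched in a few lines of Python. The percentages are copied from the two tables in this box; the dictionary layout, the function name `inconsistent_programs`, and the flagging rule itself are illustrative assumptions, not part of the original evaluation.

```python
# Percentages copied from the two tables in the Home Detention box.
programs = {
    "Convicted adults": {"success": 81, "new_arrest": 5, "successful_calls": 53},
    "Pretrial adults": {"success": 73, "new_arrest": 1, "successful_calls": 52},
    "Juveniles": {"success": 99, "new_arrest": 11, "successful_calls": 17},
}


def inconsistent_programs(programs):
    """Hypothetical rule: flag a program that reports the highest success
    rate while also showing the highest new-arrest rate or the lowest
    share of successful monitoring calls."""
    top_success = max(p["success"] for p in programs.values())
    top_arrest = max(p["new_arrest"] for p in programs.values())
    low_calls = min(p["successful_calls"] for p in programs.values())
    return [
        name
        for name, p in programs.items()
        if p["success"] == top_success
        and (p["new_arrest"] == top_arrest or p["successful_calls"] == low_calls)
    ]


flagged = inconsistent_programs(programs)
print(flagged)
```

Under this assumed rule, the only program flagged is the juvenile one, which matches the box's conclusion that the agency-reported success rate deserved a second look.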


The Role of Science

Science offers an approach to both agreement reality and experiential reality. Scientists have certain criteria that must be met before they will agree on the reality of something they haven’t personally experienced. In general, an assertion must have both logical and empirical support: it must make sense, and it must agree with actual observations. For example, why do earthbound scientists accept the assertion that it’s cold on the dark side of the moon? First, it makes sense because the surface heat of the moon comes from the sun’s rays. Second, scientific measurements made on the moon’s dark side confirm the assertion. Therefore scientists accept the reality of things they don’t personally experience (they accept an agreement reality), but they have special standards for doing so.

More to the point of this book, however, science offers a special approach to the discovery of reality through personal experience. Epistemology is the science of knowing; methodology (a subfield of epistemology) might be called the science of finding out. This book focuses on criminal justice methodology: how social scientific methods can be used to better understand crime and criminal justice policy. To understand scientific inquiry, let’s first look at the kinds of inquiry we all do each day.

Personal Human Inquiry

Everyday human inquiry draws on personal experience and secondhand authority.

Most of us would like to be able to predict how things are going to be for us in the future. We seem quite willing, moreover, to undertake this task using causal and probabilistic reasoning. First, we generally recognize that future circumstances are somehow caused or conditioned by present ones. For example, we learn that getting an education will affect what kind of job we have later in life and that running stoplights may result in an unhappy encounter with an alert traffic officer. As students, we learn that

according to a research study in Kansas City (Van Kirk 1977). And criminal investigation by police detectives rarely resulted in an arrest (Greenwood 1975).

We mention these examples not to attack routine law enforcement practices but to show that systematic research on policing has illustrated how traditional beliefs, as examples of agreement reality, can be misleading. Simply increasing the number of police officers on patrol does not reduce crime because police patrol often lacks direction. Faster response time to calls for police assistance does not increase arrests because there is often a long delay between the time when a crime occurs and when it is reported to police. Clever detective work seldom solves crimes because investigators get most of their information from reports prepared by patrol officers, who in turn get their information from victims and witnesses.

Traditional beliefs about patrol effectiveness, response time, and detective work are examples of agreement reality. In contrast, the research projects that produced alternative views about each law enforcement practice represent experiential reality. These studies are examples of empirical research, the production of knowledge based on experience or observation. In each case, researchers conducted studies of police practices and based their conclusions on observations and experience. Empirical research is a way of learning about crime and criminal justice, and explaining how to conduct empirical research is the purpose of this book.

In focusing on empirical research, we do not intend to downplay the importance of other ways of knowing things. Law students are trained in how to interpret statutes and judicial opinions. Historians take courses on methods of historical interpretation, mathematics majors learn numerical analysis, and students of philosophy study logic. If you are a criminal justice major, many of the other courses you take (say, a course on theories of crime and deviance) will add to your agreement reality.


studying hard will result in better examination grades.

Second, we recognize that such patterns of cause and effect are probabilistic in nature: the effects occur more often when the causes occur than when the causes are absent, but not always. Thus, as students, we learn that studying hard produces good grades in most instances, but not every time. We recognize the danger of ignoring stoplights without believing that every such violation will produce a traffic ticket.

The concepts of causality and probability play a prominent role in this book. Science makes causality and probability more explicit and provides techniques for dealing with them more rigorously than does casual human inquiry.

However, our attempts to learn about the world are only partly linked to personal inquiry and direct experience. Another, much larger, part comes from the agreed-on knowledge that others give us. This agreement reality both assists and hinders our attempts to find out things for ourselves. Two important sources of agreement reality, tradition and authority, deserve brief consideration here.

Tradition

Each of us is born into and inherits a culture made up, in part, of firmly accepted knowledge about the workings of the world. We may learn from others that planting corn in the spring will result in the greatest assistance from the gods, that the circumference of a circle is approximately 3.14 times its diameter, or that driving on the left side of the road (in the United States) is dangerous. We may test a few of these “truths” on our own, but we simply accept the great majority of them. These are the things that “everybody knows.”

Tradition, in this sense, has some clear advantages for human inquiry. By accepting what everybody knows, we are spared the overwhelming task of starting from scratch in our search for regularities and understanding. Knowledge is cumulative, and an inherited body of information and understanding is the jumping-off point for the development of more knowledge.

Authority

Despite the power of tradition, new knowledge appears every day. Throughout life we learn about new discoveries and understandings from others. However, our acceptance of this new knowledge often depends on the status of the discoverer. For example, you are more likely to believe a judge who declares that your next traffic violation will result in a suspension of your driver’s license than your parents when they say the same thing.

Like tradition, authority can both help and hinder human inquiry. We do well to trust the judgment of individuals who have special training, expertise, and credentials in a matter, especially in the face of contradictory arguments on a given question. At the same time, inquiry can be greatly hindered by the legitimate authorities who err within their own special province. Biologists, after all, do make mistakes in the field of biology, and biological knowledge changes over time. Criminal justice research sometimes yields mistaken results, and we are wise to not uncritically accept research findings only because they come from experts. The box titled “Arrest and Domestic Violence” illustrates the problems that can result when criminal justice policy makers accept too quickly the results from criminal justice research.

Inquiry is also hindered when we depend on the authority of experts speaking outside their realm of expertise. Consider a political or religious leader, lacking any biochemical expertise, who declares marijuana to be a dangerous drug. The advertising industry plays heavily on this misleading use of authority by having popular athletes discuss the value of various sports drinks and having movie stars evaluate the performance of automobiles.

Both tradition and authority, then, are double-edged swords in the search for knowledge about the world. Simply put, they provide us with a starting point for our own inquiry,


but they may lead us to start at the wrong point or push us in the wrong direction.

Errors in Personal Human Inquiry

Everyday personal human inquiry reveals a number of potential biases.

Aside from the potential dangers of relying on tradition and authority, we often stumble when we set out to learn for ourselves. Let’s consider some of the common errors we make in our own casual inquiries and then look at the ways science provides safeguards against those errors.

Inaccurate Observation

The keystone of inquiry is observation. But quite frequently we fail to observe things right in front of us or mistakenly observe things that aren’t so. Do you recall what your instructor was wearing on the first day of this class? If you had to guess now, what are the chances you would be right?

In contrast to casual human inquiry, scientific observation is a conscious activity. Simply making observations in a more deliberate way helps to reduce error. If you had gone to the first class meeting with a conscious plan to observe and record what your instructor was wearing, you’d have increased your chances of accuracy.

In many cases, using both simple and complex measurement devices helps to guard against inaccurate observations. Suppose that you had taken color photographs of your instructor on the first day. The photos would have added a degree of precision well beyond that provided by unassisted human memory.

Overgeneralization

When we look for patterns among the specific things we observe around us, we often assume that a few similar events are evidence of a general pattern. The tendency to overgeneralize is probably greatest when there is pressure to reach a general understanding, yet overgeneralization also occurs in the absence of pressure.

ARREST AND DOMESTIC VIOLENCE

In 1983, preliminary results were released from a study on the deterrent effects of arrest in cases of domestic violence. The study reported that male abusers who were arrested were less likely to commit future assaults than offenders who were not arrested. Conducted by researchers from the Police Foundation, the study used rigorous experimental methods adapted from the natural sciences. Criminal justice scholars generally agreed that the research was well designed and executed. Public officials were quick to embrace the study’s findings that arresting domestic violence offenders deterred them from future violence.

Here, at last, was empirical evidence to support an effective policy in combating domestic assaults. Results of the Minneapolis Domestic Violence Experiment were widely disseminated, in part because of aggressive efforts by the researchers to publicize their findings (Sherman and Cohn 1989). The attorney general of the United States recommended that police departments make arrests in all cases of misdemeanor domestic violence. Within five years, more than 80 percent of law enforcement agencies in U.S. cities adopted arrest as the preferred way of responding to domestic assaults (Sherman 1992, 2).

Several things contributed to the rapid adoption of arrest policies to deter domestic violence. First, the experimental study was conducted carefully by highly respected researchers. Second, results were widely publicized in newspapers, in professional journals, and on television programs. Third, officials could understand the study, and most believed that its findings made sense. Finally, mandating arrest in less serious cases of domestic violence was a straightforward and politically attractive approach to a growing problem.


Whenever overgeneralization does occur, it can misdirect or impede inquiry.

Imagine you are a rookie police officer newly assigned to foot patrol in an urban neighborhood. Your sergeant wants to meet with you at the end of your shift to discuss what you think are the major law enforcement problems on the beat. Eager to earn favor with your supervisor, you interview the manager of a popular store in a small shopping area. If the manager mentions vandalism as the biggest concern, you might report that vandalism is the main problem on your beat, even though other business owners and area residents believe that drug dealing contributes to the neighborhood problems of burglary, street robbery, and vandalism. Overgeneralization leads to misrepresentation and simplification of the problems on your beat.

Criminal justice researchers guard against overgeneralization by committing themselves in advance to a sufficiently large sample of observations and by being attentive to how representative those observations are. The replication of inquiry provides another safeguard. Replication means repeating a study, checking to see whether similar results are obtained each time. The study may also be repeated under slightly different conditions or in different locations. The box titled “Arrest and Domestic Violence” describes an example of why replication can be especially important in applied research.

Selective Observation

Another danger of overgeneralization is that it may lead to selective observation. Once we have concluded that a particular pattern exists and have developed a general understanding of why, we will be tempted to pay attention to future events and situations that correspond with the pattern and to ignore those that don’t. Racial, ethnic, and other prejudices are reinforced by selective observation.

Research plans often specify in advance the number and kind of observations to be made as a basis for reaching a conclusion. For example, if we wanted to learn whether women were more likely than men to support long prison sentences for sex offenders, we would have to

Sherman and Berk (1984), however, urged caution in uncritically embracing the results of their study. Others urged that similar research be conducted in other cities to check on the Minneapolis findings (Lempert 1984). Recognizing the need for more research, the U.S. National Institute of Justice sponsored more experiments (known as replications) in six other cities. Not everyone was happy about the new studies. For example, a feminist group in Milwaukee opposed the replication in that city because it believed that the effectiveness of arrest had already been proved (Sherman and Cohn 1989, 138).

Results from the replication studies brought into question the effectiveness of arrest policies. In three cities, no deterrent effect was found in police records of domestic violence. In other cities, there was no evidence of deterrence for longer periods (6 to 12 months), and in three cities researchers found that violence actually escalated when offenders were arrested (Sherman 1992, 30). For example, Sherman and associates (1992, 167) report that in Milwaukee “the initial deterrent effects observed for up to thirty days quickly disappear. By one year later [arrests] produce an escalation effect.” Arrest works in some cases but not in others. In responding to domestic assaults, as in many other cases, it’s important to carefully consider the characteristics of offenders and the nature of the relationship between offender and victim.

After police departments throughout the country embraced arrest policies following the Minneapolis study, researchers were faced with the difficult task of explaining why initial results must be qualified. Arrest seemed to make sense; officials and the general public believed what they read in the papers and saw on television. Changing their minds by reporting complex findings was more difficult.


bias in police practices and sentencing policies. Ideological or political views on such issues can undermine objectivity in the research process. Criminal justice professionals may have particular difficulty separating ideology and politics from a more detached, scientific study of crime.

Criminologist Samuel Walker (1994, 16) compares ideological bias in criminal justice research to theology: “The basic problem . . . is that faith triumphs over facts. For both liberals and conservatives, certain ideas are unchallenged articles of faith, almost like religious beliefs that remain unshaken by empirical facts.”

Most of us have our own beliefs about public policy, including policies for dealing with crime. The danger lies in allowing such beliefs to distort how research problems are defined and how research results are interpreted. The scientific approach to the study of crime and criminal justice policy guards against, but does not prevent, ideology and theology coloring the research process. In empirical research, so-called articles of faith are compared with experience.

To Err Is Human

We have seen some of the ways that we can go astray in our attempts to know and understand the world and some of the ways that science protects its inquiries from these pitfalls. Social science differs from our casual, day-to-day inquiry in two important respects. First, social scientific inquiry is a conscious activity. Although we engage in continual observation in daily life, much of it is unconscious or semiconscious. In social scientific inquiry, we make a conscious decision to observe, and we stay alert while we do it. Second, social scientific inquiry is a more careful process than our casual efforts; we are more wary of making mistakes and take special precautions to avoid doing so.

Do social scientific research methods offer total protection against the errors that people commit in personal inquiry? No. Not only do individuals make every kind of error we’ve looked at, but social scientists as a group also

make a specified number of observations on that question. We might select a thousand people to be interviewed. Even if the first 10 women supported long sentences and the first 10 men opposed them, we would continue to interview everyone selected for the study and record each observation. We would base our conclusion on an analysis of all the observations, not just those first 20.

Illogical Reasoning

People have various ways of handling observations that contradict their judgments about the way things are. Surely one of the most remarkable creations of the human mind is the maxim about the exception that proves the rule, an idea that makes no sense at all. An exception can draw attention to a rule or to a supposed rule, but in no system of logic can it prove the rule it contradicts. Yet we often use this pithy saying to brush away contradictions with a simple stroke of illogic.

What statisticians call the gambler’s fallacy is another illustration of illogic in day-to-day reasoning. According to this fallacy, a consistent run of good or bad luck is presumed to foreshadow its opposite. An evening of bad luck at poker may kindle the belief that a winning hand is just around the corner; many a poker player has stayed in a game too long because of that mistaken belief. Conversely, an extended period of good weather may lead us to worry that it is certain to rain on our weekend picnic.
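A short simulation can make the fallacy concrete. The sketch below assumes a simplified game in which every round is an independent 50/50 bet (real poker is more complicated), and the function name and parameters are hypothetical. If a losing streak really foreshadowed a win, the win rate right after a streak would rise above one half; under independence it should stay close to one half.

```python
import random


def win_rate_after_losing_streak(n_rounds, streak, seed=0):
    """Estimate the chance of winning the round that immediately
    follows at least `streak` consecutive losses."""
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    losses_in_a_row = 0
    followups = 0          # rounds played right after a long losing streak
    wins_after_streak = 0  # how many of those rounds were won
    for _ in range(n_rounds):
        win = rng.random() < 0.5  # assumption: independent, fair bet
        if losses_in_a_row >= streak:
            followups += 1
            wins_after_streak += win
        losses_in_a_row = 0 if win else losses_in_a_row + 1
    return wins_after_streak / followups


rate = win_rate_after_losing_streak(n_rounds=200_000, streak=5)
print(f"Win rate after 5 straight losses: {rate:.3f}")
```

Changing `streak` or `seed` should leave the estimate near one half, which is the point: in this simplified game, past losses carry no information about the next round.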

Although we all sometimes use embarrassingly illogical reasoning, scientists avoid this pitfall by using systems of logic consciously and explicitly. Chapters 2 and 4 examine the logic of science in more depth.

Ideology and PoliticsCrime is, of course, an important social prob-lem, and a great deal of controversy surrounds policies for dealing with crime. Many people feel strongly one way or another about the death penalty, gun control, and long prison terms for drug users as approaches to reducing crime. There is ongoing concern about racial

Chapter 1 Criminal Justice and Scientific Inquiry 11

more or less universal principles. Barney Glaser and Anselm Strauss (1967) coined the term grounded theory to describe this method of theory construction. Field research, the direct observation of events in progress, is frequently used to develop theories, or survey research may reveal patterns of attitudes that suggest particular theoretical explanations.

Once developed, theories provide general statements about social life that are used to guide research. For example, routine activity theory states that crimes are more likely to occur when a motivated offender encounters a suitable victim in the absence of a capable guardian (Cohen and Felson 1979). Mike Townsley and associates used routine activity theory to guide their research on “contagious” burglaries (Townsley, Homel, and Chaseling 2003). They argued that once burglars struck one house in a neighborhood, they were more likely to break into nearby houses because they had become more familiar with an area and its potential targets. The research results were generally consistent with these expectations: burglary tended to cluster around houses of similar type in a neighborhood.

Townsley and associates used routine activity theory to generate a hypothesis about patterns of burglary. A hypothesis is a specified expectation about empirical reality. Taking a different example, a theory might contain the hypothesis “Working-class youths have higher delinquency rates than upper-class youths.” Such a hypothesis could then be tested through research.

Drawing on theories to generate hypotheses that are tested through research is the traditional image of science, illustrated in Figure 1.2 on page 14. Here we see the researcher beginning with an interest in something or an idea about it. Next comes the development of a theoretical understanding of how a number of concepts, represented by the letters A, B, C, and so on, may be related to each other. The theoretical considerations result in a hypothesis, or an expectation about the way things would be in the world if the theoretical expectations were

succumb to the pitfalls and stay trapped for long periods.

Foundations of Social Science

Social scientific inquiry generates knowledge through logic and observation.

The two pillars of science are (1) logic, or rationality, and (2) observation. A scientific understanding of the world must make sense and must agree with what we observe. Both of these elements are essential to social science and relate to three key aspects of the overall scientific enterprise: theory, data collection, and data analysis.

As a broad generalization, scientific theory deals with the logical aspect of science, data collection deals with the observational aspect, and data analysis looks for patterns in what is observed. This book focuses mainly on issues related to data collection, demonstrating how to conduct empirical research, but social science involves all three elements. With this in mind, the theoretical context of designing and executing research is an important part of the overall process. Chapter 11 presents a conceptual introduction to the statistical analysis of data. Figure 1.1 offers a schematic view of how the book addresses these three aspects of social science.

Let’s turn now to some of the fundamental things that distinguish social science from other ways of looking at social phenomena.

Theory, Not Philosophy or Belief

Social scientific theory has to do with what is, not what should be. A theory is a systematic explanation for the observed facts and laws that relate to a particular aspect of life: juvenile delinquency, for example, or perhaps social stratification or political revolution. Joseph Maxwell (2005, 42) defines theory as “a set of concepts and the proposed relationships among these, a structure that is intended to represent or model something about the world.”

Often, social scientists begin constructing a theory by observing aspects of social life, seeking to discover patterns that may point to

12 Part One An Introduction to Criminal Justice Inquiry

Figure 1.1 Social Science = Theory + Data Collection + Data Analysis

[Figure: three panels. THEORY; DATA COLLECTION (Planning to do research, Chapters 3–5; Sampling, Chapter 6; Observation, Chapters 7–10); DATA ANALYSIS (Chapter 11).]


What about Exceptions?

The objection that there are always exceptions to any social regularity misses the point. The existence of exceptions does not invalidate the existence of regularities. Thus it is not important that a particular police officer earns more money than a particular judge if, overall, judges earn more than police officers. The pattern still exists. Social regularities represent probabilistic patterns, and a general pattern does not have to be reflected in 100 percent of the observable cases to be a pattern.

This rule applies in the physical as well as the social sciences. In genetics, the mating of a blue-eyed person with a brown-eyed person will probably result in brown-eyed offspring. The birth of a blue-eyed child does not challenge the observed regularity, however. Rather, the geneticist states only that brown-eyed offspring are more likely and, furthermore, that a brown-eyed offspring will be born in only a certain percentage of cases. The social scientist makes a similar, probabilistic prediction, that women overall are less likely to murder anybody, but when they do, their victims are most often males.

Aggregates, Not Individuals

Social scientists primarily study social patterns rather than individual ones. All regular patterns reflect the aggregate, or combined, actions and situations of many individuals. Although social scientists study motivations that affect individuals, aggregates are more often the subject of social scientific research.

A focus on aggregate patterns rather than on individuals distinguishes the activities of criminal justice researchers from the daily routines of many criminal justice practitioners. Consider the task of processing and classifying individuals newly admitted to a correctional facility. Prison staff administer psychological tests and review the prior record of each new inmate to determine security risks, program needs, and job options. A researcher who is studying whether white inmates tend to be

correct. The notation Y = f(X) is a conventional way of saying that Y (for example, auto theft rate) is a function of or is in some way caused by X (for example, availability of off-street parking). At that level, however, X and Y have general rather than specific meanings.

In the operationalization process, general concepts are translated into specific indicators. Thus the lowercase x is a concrete indicator of capital X. As an example, census data on the number of housing units that have garages (x) are a concrete indicator of off-street parking (X). This operationalization process results in the formation of a testable hypothesis: is the rate of auto theft higher in areas where fewer housing units have garages? Observations aimed at finding out are part of what is typically called hypothesis testing. We consider the logic of hypothesis testing more fully in Chapters 3 and 5.

Regularities

Ultimately, social scientific theory aims to find patterns of regularity in social life. This assumes, of course, that life is regular, not chaotic or random. That assumption applies to all science, but it is sometimes a barrier for people when they first approach social science.

A vast number of norms and rules in society create regularity. Only persons who have reached a certain age may obtain a driver’s license. In the National Hockey League, only men participate on the ice. Such informal and formal prescriptions regulate, or regularize, social behavior.

In addition to regularities produced by norms and rules, social science is able to identify other types of regularities. For example, teenagers commit more crimes than middle-aged people. When males commit murder, they usually kill another male, but female murderers more often kill a male. On average, white urban residents view police more favorably than nonwhites do. Judges receive higher salaries than police officers. Probation officers have more empathy for the people they supervise than prison guards do.


That’s just the way we think. Suppose someone says to you, “Women are too soft-hearted and weak to be police officers.” You are likely to hear that comment in terms of what you know about the speaker. If it’s your old Uncle Albert, who, you recall, is also strongly opposed to daylight saving time, zip codes, and computers, you are likely to think his latest pronouncement simply fits into his dated views about things in general. If the statement comes from a candidate for sheriff who is trailing a female challenger and who has begun making other statements about women being unfit for public office, you may hear his latest comment in the context of this political challenge.

In both of these examples, you are trying to understand the thoughts of a particular, concrete individual. In social science, however, we go beyond that level of understanding to seek insights into classes or types of individuals. In the two preceding examples, we might use terms

assigned to more desirable jobs than nonwhite inmates would be more interested in patterns of job assignment. The focus would be on aggregates of white and nonwhite persons rather than the assignment for any particular individual.

Social scientific theories, then, typically deal with aggregate, not individual, behavior. Their purpose is to explain why aggregate patterns of behavior are so regular even when the individuals who perform them change over time. In another important sense, social science doesn’t seek to explain people. Rather, it seeks to understand the systems within which people operate, the systems that explain why people do what they do. The elements in such systems are not people but variables.

A Variable Language

Our natural attempts at understanding usually take place at the concrete, idiosyncratic level.

Figure 1.2 The Traditional Image of Science

[Figure: Idea/Interest; THEORETICAL UNDERSTANDING (a network of related concepts labeled A through L, X, Y, and Z); HYPOTHESIS, Y = f(X); [Operationalization], y = f(x); [Hypothesis testing], x ? y.]


employed, and intoxicated. Any quality we might use to describe ourselves or someone else is an attribute.

Variables are logical groupings of attributes. Male and female are attributes, and gender is the variable composed of the logical grouping of those two attributes. The variable occupation is composed of attributes such as dentist, professor, and truck driver. Prior record is a variable composed of a set of attributes such as prior convictions, prior arrests without convictions, and no prior arrests. It’s helpful to think of attributes as the categories that make up a variable. See Figure 1.3 for a schematic view of what social scientists mean by variables and attributes. Panel A lists both variables and attributes, mixing them together. Panel B separates these concepts to distinguish variables from attributes. Panel C presents variables together with the attributes they carry.

The relationship between attributes and variables lies at the heart of both description and explanation in science. We might describe a prosecutor’s office in terms of the variable gender by reporting the observed frequencies of the attributes male and female: “The office staff is 60 percent men and 40 percent women.” An incarceration rate can be thought of as a description of the variable incarceration status of a state’s population in terms of the attributes incarcerated and not incarcerated. Even the report of family income for a city is a summary of attributes composing the income variable: $27,124, $44,980, $76,000, and so forth.
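Describing a variable by the frequencies of its attributes can be sketched in a few lines. The example below is only an illustration: the ten-person office, the list name `staff_gender`, and the function `describe` are hypothetical, chosen so that the result matches the 60/40 split quoted above.

```python
from collections import Counter

# Hypothetical office of 10 staff members; each entry is one person's
# attribute on the variable "gender".
staff_gender = ["male"] * 6 + ["female"] * 4


def describe(values):
    """Return the percentage of cases carrying each attribute."""
    counts = Counter(values)
    total = len(values)
    return {attribute: 100 * n / total for attribute, n in counts.items()}


print(describe(staff_gender))  # {'male': 60.0, 'female': 40.0}
```

The same function would summarize any variable whose attributes are recorded one case at a time, such as incarceration status.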

The relationship between attributes and variables becomes more complicated as we try to explain how concepts are related to each other. Here’s a simple example involving two variables: type of defense attorney and sentence. For the sake of simplicity, let’s assume that the variable defense attorney has only two attributes: private attorney and public defender. Similarly, let’s give the variable sentence two attributes: probation and prison.

Now let’s suppose that 90 percent of people represented by public defenders are sentenced to prison and the other 10 percent are sentenced

like old-fashioned or bigoted to describe the person who made the comment. In other words, we try to identify the actual individual with some set of similar individuals, and that identification operates on the basis of abstract concepts.

One implication of this approach is that it enables us to make sense out of more than one person. In understanding what makes the bigoted candidate think the way he does, we can also learn about other people who are like him. This is possible because we have not been studying bigots as much as we have been studying bigotry.

Bigotry is considered a variable in this case because the level of bigotry varies; that is, some people in an observed group are more bigoted than others. Social scientists may be interested in understanding the system of variables that causes bigotry to be high in one instance and low in another. However, bigotry is not the only variable here. Gender, age, and economic status also vary among members of the observed group.

Here’s another example. Consider the problem of whether police should make arrests in cases of domestic violence. The object of a police officer’s attention in handling a domestic assault is the individual case. Of course, each case includes a victim and an offender, and police are concerned with preventing further harm to the victim. The officer must decide whether to arrest an assailant or to take some other action. The criminal justice researcher’s subject matter is different: does arrest as a general policy prevent future assaults? The researcher may study an individual case (victim and offender), but that case is relevant only as a situation in which an arrest policy might be invoked, which is what the researcher is really studying.

Variables and Attributes

Social scientists study variables and the attributes that compose them. Social scientific theories are written in a variable language, and people get involved mostly as the carriers of those variables.

Attributes are characteristics or qualities that describe some object, such as a person. Examples are bigoted, old-fashioned, married, un-


we bet on your ability to guess whether a person is sentenced to prison or probation. We’ll pick the people one at a time (not telling you which ones we’ve picked), and you have to guess which sentence each person receives. We’ll do it for all 20 people in Figure 1.4A. Your best strategy in this case is to always guess prison because 12 out of the 20 people are categorized that way. You’ll get 12 right and 8 wrong, for a net success score of 4.

Now suppose that we pick a person from Figure 1.4A and we have to tell you whether the person has a private attorney or a public defender. Your best strategy now is to guess prison for each person with a public defender and pro-

to probation. And let’s suppose that 30 percent of people with private attorneys go to prison and the other 70 percent receive probation. This is shown visually in Figure 1.4A.

Figure 1.4A illustrates a relationship between the variables defense attorney and sentence. This relationship can be seen by the pairings of attributes on the two variables. There are two predominant pairings: (1) persons represented by private attorneys who are sentenced to probation and (2) persons represented by public defenders who are sentenced to prison. But there are two other useful ways of viewing that relationship.

First, imagine that we play a game in which

Figure 1.3 Variables and Attributes

A. Some Common Criminal Justice Concepts

Female, Probation, Thief, Gender, Sentence, Property crime, Middle-aged, Age, Auto theft, Occupation

B. Two Different Kinds of Concepts

Variables         Attributes
Gender            Female
Sentence          Probation
Property crime    Auto theft
Age               Middle-aged
Occupation        Thief

C. The Relationship between Variables and Attributes

Variables         Attributes
Gender            Female, male
Age               Young, middle-aged, old
Sentence          Fine, prison, probation
Property crime    Auto theft, burglary, larceny
Occupation        Judge, lawyer, thief


in Figure 1.4B. Notice that half the people have private attorneys and half have public defenders. Also notice that 12 of the 20 (60 percent) are sentenced to prison: 6 who have private attorneys and 6 who have public defenders. The equal distribution of those sentenced to probation and those sentenced to prison, no matter what type of defense attorney each person had, allows us to conclude that the two variables are unrelated. Here, knowing what type of attorney a person had would not be of any value to you in guessing whether that person was sentenced to prison or probation.
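The "unrelated" case in Figure 1.4B can be verified by comparing the share sentenced to prison within each attorney type. This is a minimal sketch: the counts follow from the text (10 people per attorney type, 6 of each sentenced to prison), while the names `counts_b` and `prison_share` are hypothetical.

```python
from fractions import Fraction

# Counts implied by the description of Figure 1.4B.
counts_b = {
    "private attorney": {"prison": 6, "probation": 4},
    "public defender": {"prison": 6, "probation": 4},
}


def prison_share(sentences):
    """Fraction of one attorney type's clients who were sentenced to prison."""
    return Fraction(sentences["prison"], sum(sentences.values()))


shares = {attorney: prison_share(s) for attorney, s in counts_b.items()}
for attorney, share in shares.items():
    print(f"{attorney}: {float(share):.0%} sentenced to prison")

# The variables are unrelated when the share is identical across attorney types.
unrelated = len(set(shares.values())) == 1
print(unrelated)
```

Because both shares equal 60 percent, learning the attorney type does not change the best guess about the sentence.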

bation for each person represented by a private attorney. If you follow that strategy, you will get 16 right and 4 wrong. Your improvement in guessing the sentence on the basis of knowing the type of defense attorney illustrates what it means to say that the variables are related. You would have made a probabilistic statement on the basis of some empirical observations about the relationship between type of lawyer and type of sentence.
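The guessing game can be worked through explicitly. The sketch below assumes 10 defendants per attorney type, which is consistent with the percentages and the 12-of-20 prison total given for Figure 1.4A; the function names are hypothetical.

```python
# Counts implied by the percentages for Figure 1.4A, assuming 10 defendants
# per attorney type: 90% of public-defender clients and 30% of
# private-attorney clients are sentenced to prison.
counts_a = {
    "public defender": {"prison": 9, "probation": 1},
    "private attorney": {"prison": 3, "probation": 7},
}


def guess_without_attorney(counts):
    """Always guess the most common sentence overall; return (right, wrong)."""
    totals = {}
    for sentences in counts.values():
        for sentence, n in sentences.items():
            totals[sentence] = totals.get(sentence, 0) + n
    right = max(totals.values())
    return right, sum(totals.values()) - right


def guess_with_attorney(counts):
    """Guess the most common sentence within each attorney type."""
    right = sum(max(s.values()) for s in counts.values())
    total = sum(sum(s.values()) for s in counts.values())
    return right, total - right


print(guess_without_attorney(counts_a))  # (12, 8)
print(guess_with_attorney(counts_a))     # (16, 4)
```

Knowing the attorney type raises the number of correct guesses from 12 to 16, which is the improvement in prediction that signals a relationship between the two variables.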

Second, let’s consider how the 20 people would be distributed if type of defense attorney and sentence were unrelated. This is illustrated

Figure 1.4 Illustration of Relationships between Two Variables

[Figure: two panels, each showing SENTENCE (Prison, Probation) by DEFENSE ATTORNEY (Public Defender, Private Attorney). A. Defendants represented by public defenders are sentenced to prison more often than those represented by private attorneys. B. There is no relationship between type of attorney and sentence.]


on what we know about each. For example, we know that private attorneys tend to be more experienced than public defenders. Many law school graduates gain a few years of experience as public defenders before they enter private practice. Logically, then, we would expect the more experienced private attorneys to be better able to get more lenient sentences for their clients. We might explore this question directly by examining the relationship between attorney experience and sentence, perhaps comparing inexperienced public defenders with public defenders who have been working for a few years. Pursuing this line of reasoning, we could also compare experienced private attorneys with private attorneys fresh out of law school.

Notice that the theory has to do with the variables defense attorney, sentence, and years of experience, not with individual people per se. People are the carriers of those variables. We study the relationship between the variables by observing people. Ultimately, however, the theory is constructed in terms of variables. It describes the associations that might logically be expected to exist between particular attributes of different variables.

Purposes of Research

We conduct criminal justice research to serve various purposes.

Criminal justice research, of course, serves many purposes. Explaining associations between two or more variables is one of those purposes; others include exploration, description, and application. Although a given study can have several purposes, it is useful to examine them separately because each has different implications for other aspects of research design.

Exploration

Much research in criminal justice is conducted to explore a specific problem. A researcher or official may be interested in a crime or criminal justice policy issue about which little is known. Or perhaps an innovative approach to policing,

Variables and Relationships

We will look more closely at the nature of the relationships between variables later in this book. For now, let’s consider some basic observations about variables and relationships that illustrate the logic of social scientific theories and their use in criminal justice research.

Theories describe relationships that might logically be expected among variables. This expectation often involves the notion of causation: a person’s attributes on one variable are expected to cause or encourage a particular attribute on another variable. In the example just given, having a private attorney or a public defender seemed to cause a person to be sentenced to probation or prison, respectively. Apparently there is something about having a public defender that leads people to be sentenced to prison more often than if they are represented by a private attorney.

Type of defense attorney and sentence are examples of independent and dependent variables, respectively. In this example, we assume that criminal sentences are determined or caused by something; the type of sentence depends on something and so is called the dependent variable. The dependent variable depends on an independent variable; in this case, sentence depends on type of defense attorney.

Notice, at the same time, that type of defense attorney might be found to depend on something else: our subjects’ employment status, for example. People who have full-time jobs are more likely to be represented by private attorneys than those who are unemployed. In this latter relationship, the type of attorney is the dependent variable, and the subject’s employment status is the independent variable. In cause-and-effect terms, the independent variable is the cause and the dependent variable is the effect.

How does this relate to theory? Our discussion of Figure 1.4 involved the interpretation of data. We looked at the distribution of the 20 people in terms of the two variables. In constructing a theory, we form an expectation about the relationship between the two variables based


or public official observes and then describes what was observed. Criminal justice observation and description, methods grounded in the social sciences, tend to be more accurate than the casual observations people may make about how much crime there is or how violent teenagers are today. Descriptive studies are often concerned with counting or documenting observations; exploratory studies focus more on developing a preliminary understanding about a new or unusual problem.

Descriptive studies are frequently conducted in criminal justice. The FBI has compiled Uniform Crime Reports (UCR) since 1930. UCR data are routinely reported in newspapers and widely interpreted as accurately describing crime in the United States. For example, 2006 UCR figures (Federal Bureau of Investigation 2007) showed that Nevada had the highest rate of auto theft (1080.4 per 100,000 residents) in the nation and South Dakota had the lowest (91.8 per 100,000 residents).
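Rates such as "per 100,000 residents" put places of different sizes on a common scale. The sketch below shows the arithmetic only; the function name and the input figures are hypothetical and are not UCR data.

```python
def rate_per_100k(incidents, population):
    """Number of incidents per 100,000 residents."""
    return 100_000 * incidents / population


# Hypothetical jurisdiction: 500 auto thefts among 250,000 residents.
print(rate_per_100k(500, 250_000))  # 200.0
```

Comparing rates rather than raw counts keeps a large state from appearing more theft-prone simply because more people live there.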

Descriptive studies in criminal justice have other uses. A researcher may attend meetings of neighborhood anticrime groups and observe their efforts to organize block watch committees. These observations form the basis for a case study that describes the activities of neighborhood anticrime groups. Such a descriptive study might present information that officials and residents of other cities can use to promote such organizations themselves. Or consider research by Richard Wright and Scott Decker (1994), in which they describe in detail how burglars search for and select targets, how they gain entry into residences, and how they dispose of the goods they steal.

Explanation

A third general purpose of criminal justice research is to explain things. Recall our earlier example, in which we sought to explain the relationship between type of attorney and sentence length. Reporting that urban residents have generally favorable attitudes toward police is a descriptive activity, but reporting why some

court management, or corrections has been tried in some jurisdiction, and the researcher wishes to determine how common such practices are in other cities or states. An exploratory project might collect data on a measure to establish a baseline with which future changes will be compared.

For example, heightened concern over drug use might prompt efforts to estimate the level of drug abuse in the United States. How many people are arrested for drug sales or possession each year? How many high school seniors report using marijuana in the past week or the past month? How many hours per day do drug dealers work, and how much money do they make? These are examples of research questions intended to explore different aspects of the problem of drug abuse. Exploratory questions may also be formulated in connection with criminal justice responses to drug problems. How many cities have created special police or prosecutor task forces to crack down on drug sales? What sentences are imposed on major dealers or on casual users? How much money is spent on treatment for drug users? What options exist for treating different types of addiction?

Exploratory studies are also appropriate when a policy change is being considered. One of the first questions public officials typically ask when they consider some new policy is how other cities (or states) have handled this problem.

Exploratory research in criminal justice can be simple or complex, using a variety of methods. A mayor anxious to learn about drug arrests in his or her city might simply phone the police chief and request a report. Estimating how many high school seniors have used marijuana requires more sophisticated survey methods. Since the early 1970s, the National Institute on Drug Abuse has supported nationwide surveys of students regarding drug use.

Description

A key purpose of many criminal justice studies is to describe the scope of the crime problem or policy responses to the problem. A researcher


Rather than observing and analyzing current or past behavior, policy analysis tries to anticipate the future consequences of alternative actions.

Similarly, justice organizations are increasingly using techniques of problem analysis to study patterns of cases and devise appropriate responses. Problem-oriented policing is perhaps the best-known example, in which crime analysts work with police and other organizations to examine recurring problems. Ron Clarke and John Eck (2005) have prepared a comprehensive guide for this type of applied research.

Our brief discussion of distinct research purposes is not intended to imply that research purposes are mutually exclusive. Many criminal justice studies have elements of more than one purpose. Suppose you want to evaluate a new program to reduce bicycle theft at your university. First, you need some information that describes the problem of bicycle theft on campus. Let’s assume your evaluation finds that thefts from some campus locations have declined but that there was an increase in bikes stolen from racks outside dormitories. You might explain these findings by noting that bicycles parked outside dorms tend to be unused for longer periods and that there is more coming and going among bikes parked near classrooms. One option to further reduce thefts would be to purchase more secure bicycle racks. A policy analysis might compare the costs of installing the racks with the predicted savings resulting from a reduction in bike theft.

Differing Avenues for Inquiry

Social scientific research is conducted in a variety of ways.

There is no one way of doing criminal justice research. If there were, this would be a much shorter book. In fact, much of the power and potential of social scientific research lies in the many valid approaches it comprises.

Three broad and interrelated distinctions underlie many of the variations of social scientific research: (1) idiographic and nomothetic

people believe that police are doing a good job while other people do not is an explanatory activity. Similarly, reporting why Nevada has the highest auto-theft rate in the nation is explanation; simply reporting auto-theft rates for different states is description. A researcher has an explanatory purpose if he or she wishes to know why the number of 14-year-olds involved in gangs has increased, as opposed to simply describing changes in gang membership.

Application

Researchers also conduct criminal justice studies of an applied nature. Applied research stems from a need for specific facts and findings with policy implications. Another purpose of criminal justice research, therefore, is its application to public policy. We can distinguish two types of applied research: evaluation and policy/problem analysis.

Applied research is often used to evaluate the effects of specific criminal justice programs. Determining whether a program designed to reduce burglary actually had the intended effect is an example of evaluation. In its most basic form, evaluation involves comparing the goals of a program with the results. If one goal of increased police foot patrol is to reduce fear of crime, then an evaluation of foot patrol might compare levels of fear before and after increasing the number of police officers on the beat on foot. In most cases, evaluation research uses social scientific methods to test the results of a program or policy change.

The second type of applied research is policy and problem analysis. What would happen to court backlogs if we designated a judge and prosecutor who would handle only drug-dealing cases? How many new police officers would have to be hired if a department shifted to community policing? These are examples of what if questions addressed by policy analysis. Answering such questions is sort of a counterpart to program evaluation. Policy analysis is different from other forms of criminal justice research primarily in its focus on future events.


effi ciently, using only one or just a few explana-tory factors. Finally, it settles for a partial rather than a full explanation of a type of situation.

In each of the preceding nomothetic ex-amples, you might qualify your causal state-ments with phrases such as “on the whole” or “usually.” You usually do better on exams when you’ve studied in a group, but there have been exceptions. Your team has won some games on the road and lost some at home. And last week you got a speeding ticket on the way to Tuesday’s chemistry class, but you did not get one over the weekend. Such exceptions are an acceptable price to pay for a broader range of overall explanation.

Both idiographic and nomothetic ap-proaches to understanding can be useful in daily life. They are also powerful tools for crim-inal justice research. The researcher who seeks an exhaustive understanding of the inner work-ings of a particular juvenile gang or the rulings of a specifi c judge is engaging in idiographic re-search. The aim is to understand that particu-lar group or individual as fully as possible.

Rick Brown and Ron Clarke (2004) sought to understand thefts of a particular model of Nissan trucks in the south of England. Most stolen trucks were never recovered. Their research led Brown and Clarke to a shipping yard where trucks were taken apart and shipped to ports in France and Nigeria as scrap metal. They later learned that trucks were reassembled and sold to individuals and small companies. In the course of their research, they linked most thieves in England and most resellers abroad to legitimate shipping and scrap metal businesses. Even though Brown and Clarke sought answers to the idiosyncratic problem of stolen trucks in one region of England, they came to some tentative conclusions about loosely organized international theft rings.

Sometimes, however, the aim is a more generalized understanding across a class of events, even though the level of understanding is inevitably more superficial. For example, researchers who seek to uncover the chief factors that lead to

explanations, (2) inductive and deductive reasoning, and (3) quantitative and qualitative data. Although it is possible to see them as competing choices, a good researcher masters each of these orientations.

Idiographic and Nomothetic Explanations

All of us go through life explaining things; we do it every day. You explain why you did poorly or well on an exam, why your favorite team is winning or losing, and why you keep getting speeding tickets. In our everyday explanations, we engage in two distinct forms of causal reasoning—idiographic and nomothetic explanation—although we do not ordinarily distinguish them.

Sometimes we attempt to explain a single situation exhaustively. You might have done poorly on an exam because (1) you had forgotten there was an exam that day, (2) it was in your worst subject, (3) a traffic jam caused you to be late to class, and (4) your roommate kept you up the night before with loud music. Given all these circumstances, it is no wonder that you did poorly on the exam.

This type of causal reasoning is idiographic explanation. Idio in this context means “unique, separate, peculiar, or distinct,” as in the word idiosyncrasy. When we complete an idiographic explanation, we feel that we fully understand the many causes of what happened in a particular instance. At the same time, the scope of our explanation is limited to the case at hand. Although parts of the idiographic explanation might apply to other situations, our intention is to explain one case fully.

Now consider a different kind of explanation. For example, every time you study with a group, you do better on an exam than if you study alone. Your favorite team does better at home than on the road. You get more speeding tickets on weekends than during the week. This type of explanation—called nomothetic—seeks to explain a class of situations or events rather than a single one. Moreover, it seeks to explain


or in a group? It suddenly occurs to you that you almost always do better on exams when you studied with others than when you studied alone. This is known as the inductive mode of inquiry.

Inductive reasoning (induction) moves from the specific to the general, from a set of particular observations to the discovery of a pattern that represents some degree of order among the varied events under examination. Notice, incidentally, that your discovery doesn’t necessarily tell you why the pattern exists—merely that it does.

There is a second, and very different, way you might reach the same conclusion about studying for exams. As you approach your first set of exams in college, you might wonder about the best ways to study. You might consider how much you should review the readings and how much you should focus on your class notes. Should you study at a measured pace over time or pull an all-nighter just before the exam? Among these musings, you might ask whether you should get together with other students in the class or study on your own. You decide to evaluate the pros and cons of both options. On the one hand, studying with others might not be as efficient because a lot of time might be spent on material you already know. Or the group might get distracted from studying. On the other hand, you can understand something even better when you’ve explained it to someone else. And other students might understand material that you’ve been having trouble with and reveal perspectives that might have escaped you.

So you add up the pros and the cons and conclude, logically, that you’d benefit from studying with others. This seems reasonable to you in theory. To see whether it is true in practice, you test your idea by studying alone for half your exams and studying with others for half. This second approach is known as the deductive mode of inquiry.
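The test described above can be sketched in a few lines of Python. This is only an illustration: the exam scores are invented, and the function name `mean_difference` is a hypothetical helper, not something from the text. The sketch compares average scores under the two study conditions, which is one simple way to check whether the expected pattern shows up in practice.

```python
from statistics import mean

# Hypothetical exam scores, invented purely for illustration.
scores_alone = [72, 68, 75, 70]   # exams prepared for alone
scores_group = [81, 78, 85, 80]   # exams prepared for with others


def mean_difference(group, alone):
    """Return mean(group) - mean(alone); a positive value favors group study."""
    return mean(group) - mean(alone)


diff = mean_difference(scores_group, scores_alone)
print(f"Mean score studying alone:  {mean(scores_alone):.2f}")
print(f"Mean score studying others: {mean(scores_group):.2f}")
print(f"Difference (group - alone): {diff:.2f}")
```

A positive difference is consistent with the expectation, but with only a handful of exams it is far from conclusive; other circumstances, such as the subject or the exam format, could account for the gap.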

Deductive reasoning (deduction) moves from the general to the specific. It moves from a pattern that might be logically or theoreti-

juvenile delinquency are pursuing a nomothetic inquiry. They might discover that children who frequently skip school are more likely to have records of delinquency than those who attend school regularly. This explanation would extend well beyond any single juvenile, but it would do so at the expense of a complete explanation.

In contrast to the idiographic study of Nissan truck theft, Pierre Tremblay and associates (2001) explored how a theory of offending helped explain different types of offender networks. Examining auto thefts over 25 years in Quebec, the authors concluded that different types of relationships were involved in different types of professional car theft. However, Tremblay and associates found that persons involved in legitimate car sales and repair businesses were key members in all networks. The researchers showed how complex relationships among people involved in legitimate and illegitimate activities helped explain patterns of car theft over a quarter-century. This is an illustration of the nomothetic approach to understanding.

Thus social scientists have access to two distinct logics of explanation. We can alternate between searching for broad, albeit less detailed, universals (nomothetic) and probing more deeply into more specific cases (idiographic).

Inductive and Deductive Reasoning

The distinction between inductive and deductive reasoning exists in daily life, as well as in criminal justice research. You might take two different routes to reach the conclusion that you do better on exams if you study with others. Suppose you find yourself puzzling, halfway through your college career, over why you do so well on exams sometimes and poorly at other times. You list all the exams you’ve taken, noting how well you did on each. Then you try to recall any circumstances shared by all the good exams and by all the poor ones. Do you do better on multiple-choice exams or essay exams? Morning exams or afternoon exams? Exams in the natural sciences, the humanities, or the social sciences? After you studied alone


may tend to be a little younger than you in age but to act more mature. Or we might have been thinking of how young or old your friends look or of the variation in their life experiences, their worldliness. All these other meanings are lost in the numerical calculation of average age.

In addition to greater detail, nonnumerical observations seem to convey a greater richness of meaning than do quantified data. Think of the cliché “he is older than his years.” The meaning of that expression is lost in attempts to specify how much older. In this sense, the richness of meaning is partly a function of ambiguity. If the expression meant something to you when you read it, that meaning came from your own experiences, from people you have known who might fit the description of being older than their years.

This concept can be quantified to a certain extent, however. For example, we could make a list of life experiences that contribute to what we mean by worldliness:

Getting married
Getting divorced
Having a parent die
Seeing a murder committed
Being arrested
Being fired from a job
Running away with a rock band

We could quantify people’s worldliness by counting how many of these experiences they have had: the more such experiences, the more worldly we say they are. If we think that some experiences are more powerful than others, we can give those experiences more points than others. Once we decide on the specific experiences to be considered and the number of points each warrants, scoring people and comparing their worldliness is fairly straightforward.
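A minimal sketch of such a scoring scheme follows. The point values, the function name `worldliness_score`, and the two example people are all hypothetical, chosen only to show the mechanics; the text does not prescribe any particular weights. Setting every point value to 1 reduces the score to a simple count.

```python
# Hypothetical point values for each experience on the list.
# These weights are assumptions made for illustration only.
EXPERIENCE_POINTS = {
    "getting married": 1,
    "getting divorced": 1,
    "having a parent die": 2,
    "seeing a murder committed": 3,
    "being arrested": 2,
    "being fired from a job": 1,
    "running away with a rock band": 1,
}


def worldliness_score(experiences, points=EXPERIENCE_POINTS):
    """Sum the points for each distinct listed experience a person has had.

    Experiences that are not on the list contribute nothing to the score.
    """
    return sum(points[e] for e in set(experiences) if e in points)


# Two invented people for comparison.
person_a = ["getting married", "being arrested"]
person_b = [
    "getting married",
    "getting divorced",
    "seeing a murder committed",
    "backpacking abroad",  # not on the list, so it is ignored
]

print(worldliness_score(person_a))
print(worldliness_score(person_b))
```

Note that the unlisted experience adds nothing to the second person's score: whatever the measure leaves out simply does not count.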

To quantify a concept like worldliness, we must be explicit about what we mean. By focusing specifically on what we will include in our measurement of the concept, as we did here, we also exclude the other possible meanings. Inevitably, then, quantitative measures will be

cally expected to observations that test whether the expected pattern actually occurs in the real world. Notice that deduction begins with why and moves to whether, whereas induction moves in the opposite direction.

Both inductive and deductive reasoning are valid avenues for criminal justice and other social scientific research. Moreover, they work together to provide ever more powerful and complete understandings.

Quantitative and Qualitative Data

Simply put, the distinction between quantitative and qualitative data is the distinction between numerical and nonnumerical data. When we say that someone is witty, we are making a qualitative assertion. When we say that the same person has appeared three times in a local comedy club, we are attempting to quantify our assessment.

Most observations are qualitative at the outset, whether it is our experience of someone’s sense of humor, the location of a pointer on a measuring scale, or a check mark entered in a questionnaire. None of these things is inherently numerical. But it is often useful to convert observations to a numerical form. Quantification often makes our observations more explicit, makes it easier to aggregate and summarize data, and opens up the possibility of statistical analyses, ranging from simple descriptions to more complex testing of relationships between variables.

Quantification requires focusing our attention and specifying meaning. Suppose someone asks whether your friends tend to be older or younger than you. A quantitative answer seems easy. You think about how old each of your friends is, calculate an average, and see whether it is higher or lower than your own age. Case closed.

Or is it? Although we focused our attention on “older or younger” in terms of the number of years people have been alive, we might mean something different with that idea—for example, “maturity” or “worldliness.” Your friends


entists sometimes use tensiometers, spectrographs, and other such equipment for measurement, criminal justice researchers use a variety of techniques, examined in Part Three.

The other key to criminal justice research is interpretation. Much of interpretation is based on data analysis, which is introduced in Part Four. More generally, however, interpretation very much depends on how observations are structured, a point we will encounter repeatedly.

As we put the pieces together—measurement and interpretation—we are in a position to describe, explain, or predict something. And that is what social science is all about.

✪ Main Points

• Knowledge of research methods is valuable to criminal justice professionals as consumers and producers of research.

• The study of research methods is the study of how we know what we know.

• Inquiry is a natural human activity for gaining an understanding of the world around us.

• Much of our knowledge is based on agreement rather than direct experience.

• Tradition and authority are important sources of knowledge.

• Empirical research is based on experience and produces knowledge through systematic observation.

• In day-to-day inquiry, we often make mistakes. Science offers protection against such mistakes.

• Whereas people often observe inaccurately, science avoids such errors by making observation a careful and deliberate activity.

• Sometimes we jump to general conclusions on the basis of only a few observations. Scientists avoid overgeneralization through replication.

• Scientists avoid illogical reasoning by being as careful and deliberate in their thinking as in their observations.

• The scientific study of crime guards against, but does not prevent, ideological and political beliefs influencing research findings.

• Social science involves three fundamental aspects: theory, data collection, and data analysis.

• Social scientific theory addresses what is, not what should be.

more superficial than qualitative descriptions. This is the trade-off.

What a dilemma! Which approach should we choose? Which is better? Which is more appropriate to criminal justice research?

The good news is that we don’t have to choose. In fact, by choosing to undertake a qualitative or quantitative study, researchers run the risk of artificially limiting the scope of their inquiry. Both qualitative and quantitative methods are useful and legitimate. And some research situations and topics require elements of both approaches.

Knowing through Experience: Summing Up and Looking Ahead

Empirical research involves measurement and interpretation.

This chapter introduced the foundation of criminal justice research: empirical research, or learning through experience. Each avenue for inquiry—nomothetic or idiographic description, inductive or deductive reasoning, qualitative or quantitative data—is fundamentally empirical. It’s worth keeping that in mind as we examine the various forms criminal justice research can take.

It is also helpful to think of criminal justice research as organized around two basic activities: measurement and interpretation. Researchers measure aspects of reality and then draw conclusions about the meaning of what they have measured. All of us are observing all the time, but measurement refers to something more deliberate and rigorous. Part Two of this book describes ways of structuring observations to produce more deliberate, rigorous measures.

Our ability to interpret observations in criminal justice research depends crucially on how those observations are structured. After deciding how to structure observations, we have to actually measure them. Whereas physical sci-


related violence, it is easy to assume that these are real problems identified by systematic study. Choose a criminal justice topic or claim that’s currently prominent in news stories or entertainment. Consult a recent edition of the Sourcebook of Criminal Justice Statistics (citation below) for evidence to refute the claim.

✪ Additional Readings

Babbie, Earl, The Sociological Spirit (Belmont, CA: Wadsworth, 1994). This primer in some sociological points of view introduces some of the concepts commonly used in the social sciences.

Hoover, Kenneth R., and Todd Donovan, The Elements of Social Scientific Thinking, 9th edition (Belmont, CA: Wadsworth, 2007). This book provides an excellent overview of the key elements in social scientific analysis.

Levine, Robert, A Geography of Time: The Temporal Misadventures of a Social Psychologist (New York: Basic Books, 1997). Most of us think of time as absolute. Levine’s book is fun and fascinating as he explores how agreement reality plays a major role in how people from different cultures think about time.

Pastore, Ann L., and Kathleen Maguire (eds.), Sourcebook of Criminal Justice Statistics (Washington, DC: U.S. Department of Justice, Office of Justice Programs, Bureau of Justice Statistics, annual; www.albany.edu/sourcebook; accessed May 6, 2008). For 30 years this annual publication has been a source of basic data on criminal justice. If you’re not yet familiar with this compendium, Chapter 1 is a good place to start.

Tilley, Nick, and Gloria Laycock, Working Out What to Do: Evidence-Based Crime Reduction (London: Home Office Policing and Reducing Crime Unit, Crime Reduction Series, no. 11, 2002; www.homeoffice.gov.uk/rds/crimreducpubs1.html; accessed May 6, 2008). One of many excellent publications from the British Home Office, this guide helps justice professionals develop policies based on empirical experience. The guide is clearly written and useful as an illustration of how practitioners use social science.

• Theory guides research. In grounded theory, observations contribute to theory development.

• Social scientists are interested in explaining aggregates, not individuals.

• Although social scientists observe people, they are primarily interested in discovering relationships that connect variables.

• Explanations may be idiographic or nomothetic.

• Data may be quantitative or qualitative.

• Theories may be inductive or deductive.

✪ Key Terms

These terms are defined in the chapter where they are set in boldface and can also be found in the glossary at the end of the book.

aggregate, p. 13
attribute, p. 15
deductive reasoning, p. 22
dependent variable, p. 18
empirical, p. 6
grounded theory, p. 11
hypothesis, p. 11
hypothesis testing, p. 13
idiographic, p. 21
independent variable, p. 18
inductive reasoning, p. 22
nomothetic, p. 21
replication, p. 9
theory, p. 11
variable, p. 15

✪ Review Questions and Exercises

1. Review the common errors of personal inquiry discussed in this chapter. Find a newspaper or magazine article about crime that illustrates one or more of those errors. Discuss how a scientist would avoid making that error.

2. Briefly discuss examples of descriptive research and explanatory research about changes in crime rates in some major city.

3. Often things we think are true and supported by considerable experience and evidence turn out not to be true, or at least not true with the certainty we expected. Criminal justice seems especially vulnerable to this phenomenon, perhaps because crime and criminal justice policy are so often the subjects of mass and popular media attention. If news stories, movies, and TV shows all point to growing gang- or drug-


Chapter 2

Ethics and Criminal Justice Research

We’ll examine some of the ethical considerations that must be taken into account along with the scientific ones in the design and execution of research. We’ll consider different types of ethical issues and ways of handling them.

Introduction

Ethical Issues in Criminal Justice Research

No Harm to Participants

ETHICS AND EXTREME FIELD RESEARCH

Voluntary Participation

Anonymity and Confidentiality

Deceiving Subjects

Analysis and Reporting

Legal Liability

Special Problems

Promoting Compliance with Ethical Principles

Codes of Professional Ethics

Institutional Review Boards

Institutional Review Board Requirements and Researcher Rights

ETHICS AND JUVENILE GANG MEMBERS

Ethical Controversies

The Stanford Prison Experiment

Discussion Examples


Introduction

Despite our best intentions, we don’t always recognize ethical issues in research.

Most of this book focuses on scientific procedures and constraints. We’ll see that the logic of science suggests certain research procedures, but we’ll also see that some scientifically “perfect” study designs are not feasible, because they would be too expensive or take too long to execute. Throughout the book, we’ll deal with workable compromises.

Before we get to scientific and practical constraints on research, it’s important to explore another essential consideration in doing criminal justice research in the real world—ethics. Just as certain designs or measurement procedures are impractical, others are constrained by ethical problems.

All of us consider ourselves ethical—not perfect perhaps, but more ethical than most of humanity. The problem in criminal justice research—and probably in life—is that ethical considerations are not always apparent to us. As a result, we often plunge into things without seeing ethical issues that may be obvious to others and even to ourselves when they are pointed out. Our excitement at the prospect of a new research project may blind us to obstacles that ethical considerations present.

Any of us can immediately see that a study that requires juvenile gang members to demonstrate how they steal cars is unethical. You’d speak out immediately if we suggested interviewing people about drug use and then publishing what they said in the local newspaper. But, as ethical as we think we are, we are likely to miss the ethical issues in other situations—not because we’re bad, but because we’re human.

Ethical Issues in Criminal Justice Research

A few basic principles encompass the variety of ethical issues in criminal justice research.

In most dictionaries and in common usage, ethics is typically associated with morality, and

both deal with matters of right and wrong. But what is right and what is wrong? What is the source of the distinction? Depending on the individual, sources vary from religion to political ideology to pragmatic observations of what seems to work and what doesn’t.

Webster’s New World Dictionary (4th ed.) is typical among dictionaries in defining ethical as “conforming to the standards of conduct of a given profession or group.” Although the relativity embedded in this definition may frustrate those in search of moral absolutes, what we regard as moral and ethical in day-to-day life is no more than a matter of agreement among members of a group. And, not surprisingly, different groups have agreed on different ethical codes of conduct. If someone is going to live in a particular society, it is extremely useful to know what that society considers ethical and unethical. The same holds true for the criminal justice research “community.”

Anyone preparing to do criminal justice research should be aware of the general agreements shared by researchers about what’s proper and improper in the conduct of scientific inquiry. Ethical issues in criminal justice can be especially challenging because our research questions frequently address illegal behavior that people are anxious to conceal. This is true of offenders and, sometimes, people who work in criminal justice agencies.

The sections that follow explore some of the more important ethical issues and agreements in criminal justice research. Our discussion is restricted to ethical issues in research, not in policy or practice. Thus, we will not consider such issues as the morality of the death penalty, acceptable police practices, the ethics of punishment, or codes of conduct for attorneys and judges. If you are interested in substantive ethical issues in criminal justice policy, consult Jocelyn Pollock (2003) or Richard Hall and associates (1999) for an introduction.

No Harm to Participants

Weighing the potential benefits from doing research against the possibility of harm to the


as well as embarrassment. Although the likelihood of physical harm may seem remote, it is worthwhile to consider possible ways it might occur.

Harm to subjects, researchers, or third parties is possible in studies that collect information from or about persons engaged in criminal activity; this is especially true for field research. Studies of drug crimes may involve lo-

people being studied—or harm to other people—is a fundamental ethical dilemma in all research. For example, biomedical research can involve potential physical harm to people or animals. Social research may cause psychological harm or embarrassment in people who are asked to reveal information about themselves. Criminal justice research has the potential to produce both physical and psychological harm,

ETHICS AND EXTREME FIELD RESEARCH

Dina Perrone
Bridgewater State College

As a female ethnographer studying active drug use in a New York dance club, I have encountered awkward and difficult situations. The main purpose of my research was to study the use of ecstasy and other drugs in rave club settings. I became a participant observer in an all-night dance club (The Plant) where the use of club drugs was common. I covertly observed activities in the club, partly masking my role as a researcher by assuming the role of club-goer.

Though I was required to comply with university institutional review board guidelines, published codes and regulations offered limited guidance for many of the situations I experienced. As a result, I had to use my best judgment, learning from past experiences to make immediate decisions regarding ethical issues. I was forced to make decisions about how to handle drug episodes, so as not to place my research or my informants in any danger. Because my research was conducted in a dance club that is also a place for men to pick up women, I faced problems in getting information from subjects while watching out for my physical safety.

Drug Episodes and Subject Safety

I witnessed many drug episodes—adverse reactions to various club drugs—in my visits to The Plant. I watched groups trying to get their friends out of K-holes resulting from ketamine, or Special K. I even aided a subject throwing up. Being a covert observer made it difficult to handle these episodes. There were times in the club when I felt as though I was the only person not under the influence of a mind-altering substance. This led me to believe that I had better judgment than the other patrons. Getting involved in these episodes, however, risked jeopardizing my research.

During my first observation, I tried to intervene in what appeared to be a serious drug episode but was warned off by an informant. I was new to the club and unsure what would happen if I got involved. If I sought help from club staff or outsiders in dealing with acute drug reactions, patrons as well as the bouncers would begin to question why I kept coming there. I needed to gain the trust of the patrons to enlist participants in my research. Furthermore, the bouncers could throw me out of the club, fearing I was a troublemaker who would summon authorities.

As a researcher, I have an ethical responsibility to my participants, and as a human being, I have an ethical responsibility to my conscience. I decided to be extra cautious during my research and to pay close attention to how drug episodes are handled. I would first consult my informants and follow their suggestions. But if I ever thought a person suffering a drug episode was at risk while other patrons were neither able nor inclined to help, I would intervene to the best of my ability.

Sexual Advances in the Dance Club

The Plant is also partly a “meat market.” Unlike most bars and dance clubs, the patrons’ attire and the dance club entertainment are highly erotic. Most of the males inside the club are shirtless, and the majority of females wear extremely reveal-


Potential danger to field researchers should also be considered. For instance, Peter Reuter and associates (1990) selected their drug dealer subjects by consulting probation department records. The researchers recognized that sampling persons from different Washington, D.C., neighborhoods would have produced a more generalizable group of subjects, but they rejected that approach because mass media

cating and interviewing active users and dealers. Bruce Johnson and associates (1985) studied heroin users in New York, recruiting subjects by spreading the word through various means. Other researchers have studied dealers in Detroit (Mieczkowski 1990) and St. Louis (Jacobs 1999). Collecting information from active criminals presents at least the possibility of violence against research subjects by other drug dealers.

ing clothes. In staged performances, males and females perform dances with sexual overtones, and clothing is partly shed. This atmosphere promotes sexual encounters; men frequently approach single women in search of a mate. Men had a tendency to approach me—I appeared to be unattached, and because of my research role, I made it a point to talk to as many people as possible. It’s not difficult to imagine how this behavior could be misinterpreted.

There were times when men became sexually aggressive and persistent. In most instances, I walked away, and the men usually got the hint. However, some men are more persistent than others, especially when they are on ecstasy. In situations in which men make sexual advances, Terry Williams and colleagues (1992) suggest developing a trusting relationship with key individuals who can play a protective role. Throughout my research, I established a good rapport with my informants, who assumed that protective role. Unfortunately, acting in this role has had the potential to place my informants in physically dangerous circumstances.

During one observation, “Tom” grabbed me after I declined his invitation to dance. Tom persisted, grabbed me again, then began to argue with “Jerry,” one of my regular informants, who came to my aid. This escalated to a fistfight broken up only after two bouncers ejected Tom from the club.

I had placed my informant and myself in a dangerous situation. Although I tried to convince myself that I really had no control over Jerry’s actions, I felt responsible for the fight. A basic principle of field research is to not invite harm to participants. In most criminal justice research, harm is associated mainly with the possibility of arrest

or psychological harm from discussing private issues. Afterward, I tried to think about how the incident escalated and how I could prevent similar problems in the future.

Ethical Decision Rules Evolving from Experience

Academic associations have formulated codes of ethics and professional conduct, but limited guidance is available for handling issues that arise in some types of ethnographic research. Instead, like criminal justice practitioners, those researchers have to make immediate decisions based on experience and training, without knowing how a situation will unfold. Throughout my research, I found myself in situations that I would normally avoid and would probably never confront. Should I help the woman over there get through a drug episode? If I don’t, will she be okay? If I walk away from this aggressive guy, will he follow me? Does he understand that I wanted to talk to him just for research?

The approach I developed to tackle these issues was mostly gained by consulting with colleagues and reading other studies. An overarching theme regarding all codes of ethics is that ethnographers must put the safety and interests of their participants first, and they must recognize that their informants are more knowledgeable about many situations than they are. Throughout the research, I used my judgment to make the best decisions possible when handling these situations. To decide when to intervene during drug episodes, I followed the lead of my informants. Telling men that my informant was my boyfriend and walking away were successful tactics in turning away sexual advances.


computer questionnaires in the British Crime Survey (Mirrlees-Black 1999). Rather than verbally respond to questions from interviewers, respondents read and answer questions on a laptop computer. This procedure affords a greater degree of privacy for research subjects.

Although the fact often goes unrecognized, subjects can also be harmed by the analysis and reporting of data. Every now and then, research subjects read the books published about the studies they participated in. Reasonably sophisticated subjects can locate themselves in the various indexes and tables of published studies. Having done so, they may find themselves characterized—though not identified by name—as criminals, deviants, probation violators, and so forth.

Largely for this reason, information on the city of residence of victims identified in the National Crime Victimization Survey is not available to researchers or the public. The relative rarity of some types of crime means that if crime victimization is reported by city of residence, individual victims might recognize the portrayal of their experience or might be identified by third parties.

Recent developments in the use of crime-mapping software have raised similar concerns. Many police departments now use some type of computer-driven crime map, and some have made maps of small areas available to the public on the Web. As Tom Casady (1999) points out, this raises new questions of privacy as individuals might be able to identify crimes directed against their neighbors. Researchers and police alike must recognize the potential for such problems before publishing or otherwise displaying detailed crime maps. See crime maps for cities in the San Diego metropolitan area for examples (www.arjis.org; accessed January 16, 2008).

By now, it should be apparent that virtually all research runs some risk of harming other people somehow. A researcher can never completely guard against all possible injuries, yet some study designs make harm more likely

reports of widespread drug-related violence generated concern about the safety of research staff (Reuter, MacCoun, and Murphy 1990, 119). Whether such fears were warranted is unclear, but this example does illustrate how safety issues can affect criminal justice research.

Other researchers acknowledge the potential for harm in the context of respect for ethical principles. The box titled “Ethics and Extreme Field Research” gives examples of subtle and not-so-subtle ethical dilemmas encountered by a Rutgers University graduate in her study of drug use in rave clubs.

More generally, John Monahan and associates (1993) distinguish three different groups at potential risk of physical harm in their research on violence. First are research subjects themselves. Women at risk of domestic violence may be exposed to greater danger if assailants learn they have disclosed past victimizations to researchers. Second, researchers might trigger attacks on themselves when they interview subjects who have a history of violent offending. Third, and most problematic, is the possibility that collecting information from unstable individuals might increase the risk of harm to third parties. The last category presents a new dilemma if researchers learn that subjects intend to attack some third party. Should researchers honor a promise of confidentiality to subjects or intervene to prevent the harm?

The potential for psychological harm to subjects exists when interviews are used to collect information. Crime surveys that ask respondents about their experiences as victims of crime may remind them of a traumatic, or at least an unpleasant, experience. Surveys may also ask respondents about illegal behaviors such as drug use or crimes they have committed. Talking about such actions with interviewers can be embarrassing.

Some researchers have taken special steps to reduce the potential for emotional trauma in interviews of domestic violence victims (Tjaden and Thoennes 2000). One of the most interesting examples involves the use of self-completed

Chapter 2 Ethics and Criminal Justice Research 31

periment; they are told that participation is completely voluntary; and they are further instructed that they can expect no special rewards (such as early parole) for participation. Even under these conditions, volunteers often are motivated by the belief that they will personally benefit from their cooperation. In other cases, prisoners, or other subjects, may be offered small cash payments in exchange for participation. To people with very low incomes, small payments may be an incentive to participate in a study they would not otherwise endure.

When an instructor in an introductory criminal justice class asks students to fill out a questionnaire that she or he plans to analyze and publish, students should always be told that their participation in the survey is completely voluntary. Even so, students might fear that nonparticipation will somehow affect their grade. The instructor should therefore be especially sensitive to the implied sanctions and make provisions to obviate them, such as allowing students to drop the questionnaires in a box near the door prior to the next class.

Notice how this norm of voluntary participation works against a number of scientific concerns or goals. In the most general terms, the goal of generalizability is threatened if experimental subjects or survey respondents are only the people who willingly participate. The same is true when subjects’ participation can be bought with small payments. Research results may not be generalizable to all kinds of people. Most clearly, in the case of a descriptive study, a researcher cannot generalize the study findings to an entire population unless a substantial majority of a scientifically selected sample actually participates, both the willing respondents and the somewhat unwilling.

Field research (the subject of Chapter 8) has its own ethical dilemmas in this regard. Often, a researcher who conducts observations in the field cannot even reveal that a study is being done, for fear that this revelation might significantly affect what is being studied. Imagine that you are interested in whether the way stereo

than others. If a particular research procedure seems likely to produce unpleasant effects for subjects, such as asking survey respondents to report deviant behavior, the researcher should have firm scientific grounds for doing so. If researchers pursue a design that is essential and also likely to be unpleasant for subjects, they will find themselves in an ethical netherworld, forced to do some personal agonizing.

As a general principle, possible harm to subjects may be justified if the potential benefits of the study outweigh the harm. Of course, this raises a further question of how to determine whether possible benefits offset possible harms. There is no simple answer, but as we will see, the research community has adopted certain safeguards that help subjects to make such determinations themselves.

Not harming people is an easy norm to accept in theory, but it is often difficult to ensure in practice. Sensitivity to the issue and experience in research methodology, however, should improve researchers’ efforts in delicate areas of inquiry. Review Dina Perrone’s observations in the box “Ethics and Extreme Field Research” for examples.

Voluntary Participation

Criminal justice research often intrudes into people’s lives. The interviewer’s telephone call or the arrival of a questionnaire via e-mail signals the beginning of an activity that respondents have not requested and that may require a significant portion of their time and energy. Being selected to participate in any sort of research study disrupts subjects’ regular activities.

A major tenet of medical research ethics is that experimental participation must be voluntary. The same norm applies to research in criminal justice. No one should be forced to participate. But this norm is far easier to accept in theory than to apply in practice.

For example, prisoners are sometimes used as subjects in experimental studies. In the most rigorously ethical cases, prisoners are told the nature (and the possible dangers) of the ex-


been interviewed, because researchers did not record their names. Nevertheless, in some situations, the price of anonymity is worth paying. In a survey of drug use, for example, we may decide that the likelihood and accuracy of responses will be enhanced by guaranteeing anonymity.

Respondents in many surveys cannot be considered anonymous because an interviewer collects the information from individuals whose names and addresses are known. Other means of data collection may similarly make it impossible to guarantee anonymity for subjects. If we wished to examine juvenile arrest records for a sample of ninth-grade students, we would need to know their names even though we might not be interviewing them or having them fill out a questionnaire.

Confidentiality. Confidentiality means that a researcher is able to link information with a given person’s identity but essentially promises not to do so publicly. In a survey of self-reported drug use, the researcher is in a position to make public the use of illegal drugs by a given respondent, but the respondent is assured that this will not be done. Similarly, if field interviews are conducted with juvenile gang members, researchers can certify that information will not be disclosed to police or other officials. Studies using court or police records that include individuals’ names may protect confidentiality by not including any identifying information.

Some techniques ensure better performance on this guarantee. To begin, field or survey interviewers who have access to respondent identifications should be trained in their ethical responsibilities. As soon as possible, all names and addresses should be removed from data collection forms and replaced by identification numbers. A master identification file should be created linking numbers to names to permit the later correction of missing or contradictory information. This file should be kept under lock and key and be made available only for legitimate purposes.
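The steps just described (strip names and addresses, substitute identification numbers, keep a separate master file) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `deidentify`, the field names `name` and `address`, and the use of simple sequential numbers are all hypothetical choices, not part of any procedure described in this chapter. Secure storage of the master file (the "lock and key") is outside the sketch.

```python
import itertools


def deidentify(records, id_fields=("name", "address")):
    """Split raw records into de-identified data and a master ID file.

    `records` is a list of dicts and `id_fields` names the identifying
    keys; both are hypothetical conventions chosen for this sketch.
    """
    counter = itertools.count(1)
    data, master = [], {}
    for record in records:
        respondent_id = next(counter)
        # The master file links each number back to the identifying fields.
        master[respondent_id] = {k: record[k] for k in id_fields if k in record}
        # The analysis file keeps the number but drops names and addresses.
        clean = {k: v for k, v in record.items() if k not in id_fields}
        clean["respondent_id"] = respondent_id
        data.append(clean)
    return data, master


# Made-up example records, for illustration only.
raw = [
    {"name": "Respondent A", "address": "1 Example St", "used_drugs": True},
    {"name": "Respondent B", "address": "2 Example St", "used_drugs": False},
]
data, master = deidentify(raw)
```

In this sketch, `data` would go to analysts, while `master` would be stored separately and consulted only to correct missing or contradictory information.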

Whenever a survey is confidential rather

headphones are displayed in a discount store affects rates of shoplifting. Therefore, you plan a field study in which you will make observations of store displays and shoplifting. You cannot very well ask all shoppers whether they agree to participate in your study.

The norm of voluntary participation is an important one, but it is sometimes impossible to follow. In cases in which researchers ultimately feel justified in violating it, it is all the more important to observe the other ethical norms of scientific research.

Anonymity and Confidentiality

The clearest concern in the protection of the subjects’ interests and well-being is the protection of their identity. If revealing their behavior or responses would injure them in any way, adherence to this norm becomes crucial. Two techniques, anonymity and confidentiality, assist researchers in this regard, although the two are often confused.

Anonymity. A research subject is considered anonymous when the researcher cannot associate a given piece of information with the person. Anonymity addresses many potential ethical difficulties. Studies that use field observation techniques are often able to ensure that research subjects cannot be identified. Researchers may also gain access to nonpublic records from courts, corrections departments, or other criminal justice agencies in which the names of persons have been removed.

One example of anonymity is a web-based survey where no login or other identifying information is required. Respondents anonymously complete online questionnaires that are then tabulated. Likewise, a telephone survey is anonymous if residential phone numbers are selected at random and respondents are not asked for identifying information. Interviews with subjects in the field are anonymous if the researchers neither ask for nor record the names of subjects.

Assuring anonymity makes it difficult to keep track of which sampled respondents have


pate in a study of human development. She also prepared a brochure describing her research on human development that was distributed to respondents.

Although we might initially think that concealing our research purpose by deception would be particularly useful in studying active offenders, James Inciardi (1993), in describing methods for studying “crack houses,” makes a convincing case that this is inadvisable. First, concealing our research role when investigating drug dealers and users implies that we are associating with them for the purpose of obtaining illegal drugs. Faced with this situation, a researcher would have the choice of engaging in illegal behavior or offering a convincing explanation for declining to do so. Second, masquerading as a crack-house patron would have exposed the researcher to the considerable danger of violence that was found to be common in such places. Because the choice of committing illegal acts or becoming a victim of violence is really no choice at all, Inciardi (1993, 152) advises researchers who study active offenders in field settings: “Don’t go undercover.”

Analysis and Reporting

As criminal justice researchers, we have ethical obligations to our subjects of study. At the same time, we have ethical obligations to our colleagues in the scientific community; a few comments on those obligations are in order. In any rigorous study, the researcher should be more familiar than anyone else with the technical shortcomings and failures of the study. Researchers have an obligation to make those shortcomings known to readers. Even though it’s natural to feel foolish admitting mistakes, researchers are ethically obligated to do so.

Any negative findings should be reported. There is an unfortunate myth in social scientific reporting that only positive discoveries are worth reporting (and journal editors are sometimes guilty of believing that as well). And this is not restricted to social science. Helle Krogh Johansen and Peter Gotzsche (1999) describe how published research on new drugs tends

than anonymous, it is the researcher’s responsibility to make that fact clear to respondents. He or she must never use the term anonymous to mean confidential. Note, however, that research subjects and others may not understand the difference. For example, a former assistant attorney general in New Jersey once demanded that Maxfield disclose the identities of police officers who participated in an anonymous study. It required repeated explanations of the difference between anonymous and confidential before the lawyer finally understood that it was not possible to identify participants. In any event, subjects should be assured that the information they provide will be used for research purposes only and not be disclosed to third parties.

Deceiving Subjects

We’ve seen that the handling of subjects’ identities is an important ethical consideration. Handling our own identity as researchers can be tricky, too. Sometimes it’s useful and even necessary to identify ourselves as researchers to those we want to study. It would take a master con artist to get people to participate in a laboratory experiment or complete a lengthy questionnaire without letting on that research was being conducted. We should also keep in mind that deceiving people is unethical; in criminal justice research, deception needs to be justified by compelling scientific or administrative concerns.

Sometimes, researchers admit that they are doing research but fudge about why they are doing it or for whom. Cathy Spatz Widom and associates interviewed victims of child abuse some 15 years after their cases had been heard in criminal or juvenile courts (Widom, Weiler, and Cottler 1999). Widom was interested in whether child abuse victims were more likely than a comparison group of nonvictims to have used illegal drugs. Interviewers could not explain the purpose of the study without potentially biasing responses. Still, it was necessary to provide a plausible explanation for asking detailed questions about personal and family experiences. Widom’s solution was to inform subjects that they had been selected to partici-


data may be subject to subpoena by a criminal court. Because disclosure of research data that could be traced to individual subjects violates the ethical principle of confidentiality, a new dilemma emerges.

Fortunately, federal law protects researchers from legal action in most circumstances, provided that appropriate safeguards are used to protect research data. Research plans for 2002 published by organizations in the Office of Justice Programs summarized this protection: “[Research] information and copies thereof shall be immune from legal process, and shall not, without the consent of the person furnishing such information, be admitted as evidence or used for any purpose in any action, suit, or other judicial, legislative, or administrative proceedings” (42 U.S. Code §22.28a). This not only protects researchers from legal action but also can be valuable in assuring subjects that they cannot be prosecuted for crimes they describe to an interviewer or field worker. Bruce Johnson and associates (1985, 219) prominently displayed a Federal Certificate of Confidentiality at their research office to assure heroin dealers that they could not be prosecuted for crimes disclosed to interviewers. More savvy than many people about such matters, the heroin users were duly impressed.

Note that such immunity requires confidential information to be protected. We have already discussed the principle of confidentiality, so this bargain should be an easy one to keep.

Somewhere between legal liability and physical danger lies the potential risk to field researchers from law enforcement. Despite being up-front with crack users about his role as a researcher, Inciardi (1993) points out that police could not be expected to distinguish him from his subjects. Visibly associating with offenders in natural settings brings some risk of being arrested or inadvertently being an accessory to crime. On one occasion, Inciardi fled the scene of a robbery and on another was caught up in a crack-house raid. Another example

to focus on successful experiments. Unsuccessful research on new formulations is less often published, which leads pharmaceutical researchers to repeat studies of drugs already shown to be ineffective. Largely because of this bias, researchers at the Johns Hopkins University Medical School have established the Journal of Negative Observations in Genetic Oncology (NOGO), dedicated to publishing negative findings from cancer research (www.path.jhu.edu/nogo/; accessed May 6, 2008). In social science, as in medical research, it is often as important to know that two things are not related as to know that they are.

In general, science progresses through honesty and openness, and is retarded by ego defenses and deception. We can serve our fellow researchers, and the scientific community as a whole, by telling the truth about all the pitfalls and problems experienced in a particular line of inquiry. With luck, this will save others from the same problems.

Legal Liability

Two types of ethical problems expose researchers to potential legal liability. To illustrate the first, assume you are making field observations of criminal activity, such as street prostitution, that is not reported to police. Under criminal law in many states, you might be arrested for obstructing justice or being an accessory to a crime. Potentially more troublesome is the situation in which participant observation of crime or deviance draws researchers into criminal or deviant roles themselves, such as smuggling cigarettes into a lockup in order to obtain the cooperation of detainees.

The second and more common potential source of legal problems involves knowledge that research subjects have committed illegal acts. Self-report surveys or field interviews may ask subjects about crimes they have committed. If respondents report committing offenses they have never been arrested for or charged with, the researcher’s knowledge of them might be construed as obstruction of justice. Or research


We will tell you what the researchers decided at the end of this chapter. You should recognize, however, how applied research in criminal justice agencies can involve a variety of ethical issues.

Research Causes Crime. Because criminal acts and their circumstances are complex and imperfectly understood, some research projects have the potential to produce crime or influence its location or target. Certainly, this is a potentially serious ethical issue for researchers.

Most people agree that it is unethical to encourage someone to commit an offense solely for the purpose of a research project. What’s more problematic is recognizing situations in which research might indirectly promote offending. Scott Decker and Barrik Van Winkle (1996) discuss such a possibility in their research on gang members. Some gang members offered to illustrate their willingness to use violence by inviting researchers to witness a drive-by shooting. Researchers declined all such invitations (1996, 46). Another ethical issue was the question of how subjects used the $20 cash payments they received in exchange for being interviewed (1996, 51):

We set the fee low enough that we were confident that it would not have a criminogenic effect. While twenty dollars is not a small amount of money, it is not sufficient to purchase a gun or bankroll a large drug buy. We are sure that some of our subjects used the money for illegal purposes. But, after all, these were individuals who were regularly engaged in delinquent and criminal acts.

You may or may not agree with the authors’ reasoning in the last sentence. But their consideration of how cash payments would be used by active offenders represents an unusually careful recognition of the ethical dilemmas that emerge in studying active offenders.

A different type of ethical problem is the possibility of crime displacement in studies of

is the account Bruce Jacobs (1996) gives of his contacts with police while he was studying street drug dealers. Exercises presented at the end of the chapter ask you to think more carefully about the ethical issues involved in Jacobs’s contact with police.

Special Problems

Certain types of criminal justice studies present special ethical problems in addition to those we have mentioned. Applied research, for example, may evaluate some existing or new program. Evaluations frequently have the potential to disrupt the routine operations of agencies being studied. Obviously, it is best to minimize such interferences whenever possible.

Staff Misbehavior. While conducting applied research, researchers may become aware of irregular or illegal practices by staff in public agencies. They are then faced with the ethical question of whether to report such information. For example, investigators conducting an evaluation of an innovative probation program learned that police visits to the residences of probationers were not taking place as planned. Instead, police assigned to the program had been submitting falsified log sheets and had not actually checked on probationers.

What is the ethical dilemma in this case? On the one hand, researchers were evaluating the probation program and so were obliged to report reasons it did or did not operate as planned. Failure to deliver program treatments (home visits) is an example of a program not operating as planned. Investigators had guaranteed confidentiality to program clients (the offenders assigned to probation), but no such agreement had been struck with program staff. On the other hand, researchers had assured agency personnel that their purpose was to evaluate the probation program, not individuals’ job performance. If researchers disclosed their knowledge that police were falsifying reports, they would violate this implied trust.

What would you have done in this situation?


Suppose researchers believe that diverting domestic violence offenders from prosecution to counseling reduces the possibility of repeat violence. Is it ethical to conduct an experiment in which some offenders are prosecuted but others are not?

You may recognize the similarity between this question and those faced by medical researchers who test the effectiveness of experimental drugs. Physicians typically respond to such questions by pointing out that the effectiveness of a drug cannot be demonstrated without such experiments. Failure to conduct research, even at the potential expense of subjects not receiving the trial drugs, would therefore make it impossible to develop new drugs or distinguish beneficial treatments from those that are ineffective and even harmful.

One solution to this dilemma is to interrupt an experiment if preliminary results indicate that a new policy, or drug, does in fact produce improvements in a treatment group. Michael Dennis (1990) describes how such plans were incorporated into a long-term evaluation of enhanced drug treatment counseling. If preliminary results had indicated that the new counseling program reduced drug use, researchers and program staff were prepared to provide enhanced counseling to subjects in the control group. Dennis recognized this potential ethical issue and planned his elaborate research design to accommodate such midstream changes. Similarly, Martin Killias and associates (2000) planned to interrupt their experimental study of heroin prescription in Switzerland if compelling evidence pointed to benefits from that approach to treating drug dependency.

Mandatory Reporting. The situation is somewhat murkier for researchers studying certain kinds of family violence. Following the Federal Child Abuse Prevention and Treatment Act of 1974, all states developed child protection agencies and adopted mandatory reporting laws. Specific provisions vary, but in general, people who learn about possible cases of child abuse must report them to designated state agencies.

crime prevention programs. Consider an experimental program to reduce street prostitution in one area of a city. Researchers studying such a program might designate experimental target areas for enhanced enforcement, as well as nearby comparison areas that will not receive an intervention. If prostitution is displaced from target areas to adjacent neighborhoods, the evaluation study contributes to an increase in prostitution in the comparison areas.

In a review of more than 50 evaluations of crime prevention projects, René Hesseling (1994) concludes that displacement tended to be associated with programs targeting street prostitution, bank robbery, and certain combinations of offenses. The type of crime prevention action also made a difference, with displacement more common for target-hardening programs. For example, installing security screens on ground-floor windows in some buildings seemed to displace burglary to less protected structures. Similarly, adding steering column locks to new cars tended to increase thefts of older cars (Felson and Clarke 1998).

At the same time, Hesseling demonstrates that displacement is by no means inevitable. Ronald Clarke and John Eck (2005) further argue that researchers and public officials unrealistically assume a deterministic model of offending behavior. Instead, offenders are easily dissuaded by a variety of crime prevention measures.

In any event, when it does occur, displacement tends to follow major policy changes that are not connected with criminal justice research. Researchers cannot be expected to control actions by criminal justice officials that may benefit some people at the expense of others. However, it is reasonable to expect researchers involved in planning an evaluation study to anticipate the possibility of such things as displacement and bring them to the attention of program staff.

Withholding of Desirable Treatments. Certain kinds of research designs in criminal justice can lead to different kinds of ethical questions.


safeguards are used. In 1974, the National Research Act was signed into law after a few highly publicized examples of unethical practices in medical and social science research. A few years later, what has become known as The Belmont Report prescribed a brief but comprehensive set of ethical principles for protecting human subjects (National Commission for the Protection of Human Subjects of Biomedical and Behavioral Research 1979). In only six pages, three principles were presented:

1. Respect for persons: Individuals must be allowed to make their own decisions about participation in research, and those with limited capacity to make such decisions should have special protection.

2. Beneficence: Research should do no harm to participants, and seek to produce benefits.

3. Justice: The benefits and burdens of participating in research should be distributed fairly.

Copious federal regulations have stemmed from these three principles. But in most cases, the research community has adopted two general mechanisms for promoting ethical research practices: codes of professional ethics and institutional review boards.

Codes of Professional Ethics

If the professionals who design and conduct research projects can fail to recognize ethical problems, how can such problems be avoided? One approach is for researchers to consult one of the codes of ethics produced by professional associations. Formal codes of conduct describe what is considered acceptable and unacceptable professional behavior. The American Psychological Association (2002) code of ethics is quite detailed, reflecting the different professional roles of psychologists in research, clinical treatment, and educational contexts.

Many of the ethical questions criminal justice researchers are likely to encounter are addressed in the ethics code of the American Sociological Association (1997). Paul

This certainly seems to be a worthwhile goal, but what about researchers who learn about possible child maltreatment in the course of a survey? In most states, such requirements apply only to health professionals and teachers. But in eight states, anyone who suspects a case of child maltreatment must report it to designated authorities.

Notice how this is consistent with one ethical principle (protection of human subjects by reporting possible victims) but at odds with another principle, confidentiality. A Bureau of Justice Statistics report on human subjects protection suggests that researchers warn subjects at the beginning of an interview that any information disclosed about child abuse must be reported to authorities (Sieber 2001). But that threatens researchers’ ability to learn about child abuse. Another approach, adopted by Lianne Woodward and David Fergusson (2000), is to interview subjects age 18 and older, asking about experiences of abuse victimization when they were children. This is an imperfect solution, but it illustrates the trade-offs between our interest in protecting research subjects and our interest in studying the phenomenon of child abuse. For more examples and guidance in this vexing research/ethical area, see Seth Kalichman’s (2000) book, published by the American Psychological Association.

Research in criminal justice, especially applied research, can pose a variety of ethical dilemmas, only some of which we have mentioned here. See the “Additional Readings” at the end of this chapter for more information.

Promoting Compliance with Ethical Principles

Codes of ethics and institutional review boards are two main ways of promoting compliance with ethical principles.

No matter how sensitive they might be to the rights of individuals and possible ways subjects might be harmed, researchers are not always the best judges of whether or not adequate


Links to a number of ethics codes for social science associations are listed, including those listed above and the British Society of Criminology.

What can we make of the inability of the largest professional association of criminolo-gists to agree on a code of ethics? The wide vari-ety of approaches to doing research in this area probably has something to do with it. Crimi-nologists also encounter a range of ethical is-sues and have diverging views on how those is-sues should be addressed. Finally, we have seen examples of the special problems criminolo-gists face in balancing ethics and research. Not all of these problems have easy solutions that can be embodied in a code.

Even when they exist, professional codes of ethics for social scientists cannot be expected to prevent unethical practices in criminal justice research any more than the American Bar Association’s Code of Professional Responsibility eliminates breaches of ethics by lawyers. For this reason, and in reaction to some controversial medical and social science research, the U.S. Department of Health and Human Services (HHS) has established regulations protecting human research subjects. These regulations do not apply to all social science or criminal justice research. It is, however, worthwhile to understand some of their general provisions. Material in the following section is based on the Code of Federal Regulations, Title 45, Part 46. Those regulations are themselves rooted in The Belmont Report.

Institutional Review Boards
Government agencies and nongovernment organizations (including universities) that conduct research involving human subjects must establish review committees, known as institutional review boards (IRBs). These IRBs have two general purposes. First, board members make judgments about the overall risks to human subjects and whether these risks are acceptable, given the expected benefits from actually doing the research. Second, they determine whether

Reynolds (1979, 442–449) has created a composite code for the use of human subjects in research, drawing on 24 codes of ethics published by national associations of social scientists. The National Academy of Sciences publishes a very useful booklet on a variety of ethical issues, including the problem of fraud and other forms of scientific misconduct (Committee on Science, Engineering, and Public Policy 1995).

The two national associations representing criminology and criminal justice researchers have one code of ethics between them. The Academy of Criminal Justice Sciences (ACJS) based its code of ethics on that developed by the American Sociological Association. ACJS members are bound by a very general code that reflects the diversity of its membership: “Most of the ethical standards are written broadly, to provide applications in varied roles and varied contexts. The Ethical Standards are not exhaustive; conduct that is not included in the Ethical Standards is not necessarily ethical or unethical” (Academy of Criminal Justice Sciences 2000, 1).

After years of inaction, a committee of the American Society of Criminology (ASC) proposed a draft code of ethics in 1998, which drew extensively on the code for sociology. But the ASC withdrew its draft code from circulation in 1999, and no ethics code had been adopted as of July 2007. In personal correspondence with Maxfield in 2003, a prominent ASC officer expressed doubt that any sort of code would be approved soon; eventually, this person felt, a very brief statement of general principles might be approved. In the meantime, the ASC website includes the following statement on a page titled “Code of Ethics”:

The American Society of Criminology has not formally adopted a code of ethics. We would suggest that persons interested in this general topic examine the various codes of ethics adopted by other professional associations. (www.asc41.com/ethicspg.html; accessed January 11, 2008)


about the purpose of the research (human development), one component of which is being a victim of child abuse, which the subjects were not told.

Another potential problem with obtaining informed consent is ensuring that subjects have the capacity to understand the descriptions of risks, benefits, procedures, and so forth. Researchers may have to provide oral descriptions to participants who are unable to read. For subjects who do not speak English, researchers should be prepared to describe procedures in their native language. And if researchers use specialized terms or language common in criminal justice research, participants may not understand the meaning and thus be unable to grant informed consent. Consider this statement: “The purpose of this study is to determine whether less restrictive sanctions such as restitution produce heightened sensitivity to social responsibility among persistent juvenile offenders and a decline in long-term recidivism.” Can you think of a better way to describe this study to delinquent 14-year-olds? Figure 2.1 presents a good example of an informed consent statement that was used in a study of juvenile burglars. Notice how the statement describes research procedures clearly and unambiguously tells subjects that participation is voluntary.

Other guidelines for obtaining informed consent include explicitly telling people that their participation is voluntary and assuring them of confidentiality. However, it is more important to understand how informed consent addresses key ethical issues in conducting criminal justice research. First, it ensures that participation is voluntary. Second, by informing subjects of procedures, risks, and benefits, researchers are empowering them to resolve the fundamental ethical dilemma of whether the possible benefits of the research offset the possible risks of participation.

Special Populations Federal regulations on human subjects include special provisions

the procedures to be used include adequate safeguards regarding the safety, confidentiality, and general welfare of human subjects.

Under HHS regulations, virtually all research that uses human subjects in any way, including simply asking people questions, is subject to IRB review. The few exceptions potentially include research conducted for educational purposes and studies that collect anonymous information only. However, even those studies may be subject to review if they use certain special populations (discussed later) or procedures that might conceivably harm participants. In other words, it’s safe to assume that most research is subject to IRB review if original data will be collected from individuals whose identities will be known. If you think about the various ways subjects might be harmed and the difficulty of conducting anonymous studies, you can understand why this is the case.

Federal regulations and IRB guidelines address other potential ethical issues in social research. Foremost among these is the typical IRB requirement for dealing with the ethical principle of voluntary participation.

Informed Consent The norm of voluntary participation is usually satisfied through informed consent: informing subjects about research procedures and then obtaining their consent to participate. Although this may seem like a simple requirement, obtaining informed consent can present several practical difficulties. It requires that subjects understand the purpose of the research, possible risks and side effects, possible benefits to subjects, and the procedures that will be used.

If you accept that deception may sometimes be necessary, you will realize how the requirement to inform subjects about research procedures can present something of a dilemma. Researchers usually address this problem by telling subjects at least part of the truth or offering a slightly revised version of why the research is being conducted. In Widom’s study of child abuse, subjects were partially informed


ethical principles and satisfied the concerns of their university’s IRB.

Prisoners are treated as a special population for somewhat different reasons. Because of their ready accessibility for experiments and interviews, prisoners have frequently been used in biomedical experiments that produced serious harm (Mitford 1973). Recognizing this, HHS regulations specify that prisoner subjects may not be exposed to risks that would be considered excessive for nonprison subjects. Furthermore, undue influence or coercion cannot be used in recruiting prisoner subjects. Informed consent statements presented to prospective subjects must indicate that a decision not to participate in a study will have no influence on work assignments, privileges, or parole decisions. To help ensure that these ethical

for certain types of subjects, and two of these are particularly important in criminal justice research: juveniles and prisoners. Juveniles, of course, are treated differently from adults in most aspects of the law. Their status as a special population of human subjects reflects the legal status of juveniles, as well as their capacity to grant informed consent. In most studies that involve juveniles, consent must be obtained both from parents or guardians and from the juvenile subjects themselves.

In some studies, however, such as those that focus on abused children, it is obviously not desirable to obtain parental consent. Decker and Van Winkle faced this problem in their study of St. Louis gang members. See the box “Ethics and Juvenile Gang Members” for a discussion of how they reconciled the conflict between two

You and your parents or guardian are invited to participate in a research study of the monitoring program that you were assigned to by the Juvenile Court. The purpose of this research is to study the program and your reactions to it. In order to do this a member of the research team will need to interview you and your parents/guardians when you complete the monitoring program. These interviews will take about 15 minutes and will focus on your experiences with the court and monitoring program, the things you do, things that have happened to you, and what you think. In addition, we will record from the court records information about the case for which you were placed in the monitoring program, prior cases, and other information that is put in the records after you are released from the monitoring program.

Anything you or your parents or guardian tell us will be strictly confidential. This means that only the researchers will have your answers. They will not under any conditions (except at your request) be given to the court, the police, probation officers, your parents, or your child!

Your participation in this research is voluntary. If you don’t want to take part, you don’t have to! If you decide to participate, you can change your mind at any time. Whether you participate or not will have no effect on the program, probation, or your relationship with the court.

The research is being directed by Dr. Terry Baumer and Dr. Robert Mendelsohn from the Indiana University School of Public and Environmental Affairs here in Indianapolis. If you ever have any questions about the research or comments about the monitoring program that you think we should know about, please call one of us at 274-0531.

Consent Statement

We agree to participate in this study of the Marion County Juvenile Monitoring Program. We have read the above statement and understand what will be required and that all information will be confidential. We also understand that we can withdraw from the study at any time without penalty.

Juvenile Date: ________________

Parent /Guardian

Parent /Guardian

Researcher

Figure 2.1 Informed Consent Statement for Evaluation of Marion County Juvenile Monitoring Program


regulations actually create problems by setting constraints on their freedom and professional judgments in conducting research. Recall that potential conflict between the rights of researchers to discover new knowledge and the rights of subjects to be free from unnecessary harm is a fundamental ethical dilemma. It is at least inconvenient to have outsiders review a research proposal. Or a researcher may feel insulted by the implication that the potential benefits of research will not outweigh the potential harm or inconvenience to human subjects.

Many university IRBs have become extremely cautious in reviewing research proposals. See Christopher Shea’s (2000) discussion for examples of problems resulting from this. Professional associations and research-oriented federal agencies have tried to offer guidance on what is and is not subject to various levels of IRB approval. Joan Sieber (2001) prepared an analysis of human subjects issues associated with large surveys for the Bureau of Justice Statistics. Always alert for possible restrictions on academic freedom, the American Association of University Professors (2001) published a useful summary of how IRBs have come to regulate social science research.

issues are recognized, if an IRB reviews a project in which prisoners will be subjects, at least one member of that IRB must be either a prisoner or someone specifically designated to represent the interests of prisoners. Figure 2.2 presents a checklist required by the Rutgers University IRB for proposed research involving prisoners.

Regarding the last item in Figure 2.2, random selection is generally recognized as an ethical procedure for selecting subjects or deciding which subjects will receive an experimental treatment. HHS regulations emphasize this in describing special provisions for using prison subjects: “Unless the principal investigator provides to the [IRB] justification in writing for following some other procedures, control subjects must be selected randomly from the group of available prisoners who meet the characteristics needed for that particular research project” (45 CFR 46.305[4]).
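The idea of selecting control subjects randomly from the group of available prisoners can be made concrete with a minimal sketch. This is an illustration only, not a procedure prescribed by HHS or any IRB: the function name, the study ID format, the pool of 40 eligible subjects, the number of controls, and the seed are all hypothetical.

```python
import random


def select_control_subjects(eligible_ids, n_controls, seed=None):
    """Draw n_controls subjects at random, without replacement.

    eligible_ids: IDs of available prisoners who already meet the
    study's inclusion characteristics (hypothetical input).
    """
    if n_controls > len(eligible_ids):
        raise ValueError("cannot select more controls than eligible subjects")
    rng = random.Random(seed)
    # Sorting first makes the draw reproducible for a given seed,
    # regardless of the order in which IDs were supplied.
    return rng.sample(sorted(eligible_ids), n_controls)


# Hypothetical pool of 40 eligible subjects, identified only by study ID.
eligible = [f"P{n:03d}" for n in range(1, 41)]
controls = select_control_subjects(eligible, n_controls=10, seed=2008)
print(controls)
```

Because every eligible subject has the same chance of being drawn, neither researchers nor prison staff decide who ends up in the control group. Recording the seed would let a reviewer reproduce the draw later.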

Institutional Review Board Requirements and Researcher Rights
Federal regulations contain many more provisions for IRBs and other protections for human subjects. Some researchers believe that such

Figure 2.2 Excerpts from Checklist for Research Involving Prisoners

• Does the research entail any possible advantages accrued to the prisoner through their participation in the research that impairs their ability to weigh the risk/benefits of the participation in the limited choice environment that exists in a prison? This comparison is to be made with respect to the general living conditions, medical care, amenities and earning opportunities which exist in a prison.

• Are the risks of the research commensurate with those that would be accepted by non-prisoner participants? Provide Rationale below.

• Is the information presented to the prisoners in the Consent Form or Oral consent script done so in a language understandable by the participants?

• Does the Consent Form explicitly state to the subject that “Do not tell us any information about past or future crimes that are unknown to the authorities as we cannot guarantee confidentiality of that information. Additionally, I [the researcher] must report to the authorities information you tell me about harming yourself or other people, or any plans you have to escape.”

• Does adequate assurance exist that parole boards will not take into account participation in the research when determining parole and that the prisoners were clearly informed of this prior to participation in the research?

• Describe what specific steps were taken to ensure that the Informed Consent Form includes information specific to the prisoner subject population. This is not necessary if the research is limited to data analysis.

• Is the selection of prisoner research participants fair and equitable and immune from arbitrary intervention by prisoner authorities? If not, has the PI provided sufficient justification for the implementation of alternative procedures?

Source: Adapted from Rutgers University Office of Research and Sponsored Programs document, available at: http://orsp.rutgers.edu/Humans/default.php#general


Virtually all colleges and universities have IRBs. Consult the Rutgers University website (http://orsp.rutgers.edu/; accessed May 6, 2008) for an example, or visit the IRB website at your institution.

Ethical Controversies
Examples illustrate how ethics is a problem in justice research.

By way of illustrating the importance of ethics principles, together with problems that may be encountered in applying those principles, we now describe a research project that provoked widespread ethical controversy and discussion. This is followed by further examples of ethical questions for discussion.

The Stanford Prison Experiment
Few people would disagree that prisons are dehumanizing. Inmates forfeit freedom, of course, but their incarceration also results in a loss of privacy and individual identity. Violence

There is some merit in such concerns; however, we should not lose sight of the reasons IRB requirements and other regulations were created. Researchers are not always the best judges of the potential for their work to harm individuals. In designing and conducting criminal justice research, they may become excited about learning how to better prevent crime or more effectively treat cocaine addiction. That excitement and commitment to scientific advancement may lead researchers to overlook possible harms to individual rights or well-being. You may recognize this as another way of asking whether the ends justify the means. Because researchers are not always disinterested parties in answering such questions, IRBs are established to provide outside judgments. Also recognize that IRBs can be sources of expert advice on how to resolve ethical dilemmas. Decker and Van Winkle shared their university’s concern about balancing confidentiality against the need to obtain informed consent from juvenile subjects; together, they were able to fashion a workable compromise.

ETHICS AND JUVENILE GANG MEMBERS

Scott Decker and Barrik Van Winkle faced a range of ethical issues in their study of gang members. Many of these should be obvious given what has been said so far in this chapter. Violence was common among subjects and presented a real risk to researchers. Decker and Van Winkle (1996, 252) reported that 11 of the 99 members of the original sample had been killed since the project began in 1990. There was also the obvious need to assure confidentiality to subjects.

Their project was supported by a federal agency and administered through a university, so Decker and Van Winkle had to comply with federal human subjects guidelines as administered by the university institutional review board (IRB). And because many of the subjects were juveniles, they had to address federal regulations concerning that special population. Foremost among these was the normal requirement that informed consent for juveniles include parental notification and approval.

You may immediately recognize the conflicting ethical principles at work here, together with the potential for conflict. The promise of confidentiality to gang members is one such principle that was essential for the researchers to obtain candid reports of violence and other law-breaking behavior. But the need for confidentiality conflicted with initial IRB requirements to obtain parental consent for their children to participate in the research:

This would have violated our commitment to maintain the confidentiality of each subject, not to mention the ethical and practical difficulties of finding and informing each parent. We told the Human Subjects Committee that we would not,


was constructed in the basement of a psychology department building: three 6-foot by 9-foot “cells” furnished with only a cot, a prison “yard” in a corridor, and a 2-foot by 7-foot “solitary confinement cell.” Twenty-one subjects were selected from 75 volunteers after screening to eliminate those with physical or psychological problems. Offered $15 per day for their participation, the 21 subjects were randomly assigned to be either guards or prisoners.
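Random assignment of this kind can be sketched in a few lines. The sketch below is a hypothetical illustration, not a reconstruction of how the Stanford researchers actually assigned roles: the subject numbers, the 11/10 split between groups, and the seed are assumptions, since the text states only that 21 subjects were randomly assigned.

```python
import random


def randomly_assign(subject_ids, n_first_group, seed=None):
    """Shuffle subjects, then split them into two groups.

    Returns (first_group, second_group). Every subject has the same
    chance of landing in either role.
    """
    rng = random.Random(seed)
    shuffled = list(subject_ids)  # copy so the caller's list is untouched
    rng.shuffle(shuffled)
    return shuffled[:n_first_group], shuffled[n_first_group:]


# 21 screened subjects, numbered 1 to 21 (hypothetical labels).
subjects = list(range(1, 22))
# The 11/10 split is an assumption made for this example.
guards, prisoners = randomly_assign(subjects, n_first_group=11, seed=1973)
print("guards:", sorted(guards))
print("prisoners:", sorted(prisoners))
```

Shuffling and then splitting means a subject's role depends on chance alone, which is what lets researchers attribute later differences in behavior to the roles rather than to the kinds of people placed in them.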

All subjects signed contracts that included instructions about prisoner and guard roles for the planned two-week experiment. “Prisoners” were told that they would be confined and under surveillance throughout the experiment, and their civil rights would be suspended; they were, however, guaranteed that they would not be physically abused.

“Guards” were given minimal instructions, most notably that physical aggression or physical punishment of “prisoners” was prohibited. Together with a “warden,” however, they were generally free to develop prison rules and procedures. The researchers planned to study how

is among the realities of prison life that people point to as evidence of the failure of prisons to rehabilitate inmates.

Although the problems of prisons have many sources, psychologists Craig Haney, Curtis Banks, and Philip Zimbardo (1973) were interested in two general explanations. The first was the dispositional hypothesis: prisons are brutal and dehumanizing because of the types of people who run them and are incarcerated in them. Inmates have demonstrated their disrespect for legal order and their willingness to use deceit and violence; persons who work as prison guards may be disproportionately authoritarian and sadistic. The second was the situational hypothesis: the prison environment itself creates brutal, dehumanizing conditions independent of the kinds of people who live and work in the institutions.

Haney and associates set out to test the situational hypothesis by creating a functional prison simulation in which healthy, psychologically normal male college students were assigned to roles as prisoners and guards. The “prison”

in effect, tell parents that their child was being interviewed because they were an active gang member, knowledge that the parents may not have had. (Decker and Van Winkle 1996, 52)

You might think deception would be a possibility: informing parents that their child was selected for a youth development study, for example. This would not, however, solve the logistical difficulty of locating parents or guardians, some of whom had lost contact with their children. Furthermore, it was likely that even if parents or guardians could be located, suspicions about the research project and the reasons their children were selected would prevent many parents from granting consent. Loss of juvenile subjects in this way would compromise the norm of generality as we have described it in this chapter and elsewhere.

Finally, waiving the requirement for parental consent would have undermined the legal principle that the interests of juveniles must be protected by a supervising adult. Remember that researchers are not always the best judges of whether sufficient precautions have been taken to protect subjects. Here is how Decker and Van Winkle (1996, 52) resolved the issue with their IRB:

We reached a compromise in which we found an advocate for each juvenile member of our sample; this person, a university employee, was responsible for making sure that the subject understood (1) their rights to refuse or quit the interview at any time without penalty and (2) the confidential nature of the project. All subjects signed the consent form.

Source: Adapted from Decker and Van Winkle (1996).


Subjects in each group accepted their roles all too readily. Prisoners and guards could interact with each other in friendly ways because guards had the power to make prison rules. But interactions turned out to be overwhelmingly hostile and negative. Guards became aggressive, and prisoners became passive. When the experiment ended prematurely, prisoners were happy about their early “parole,” but guards were disappointed that the study would not continue.

Haney and colleagues justify the prison simulation study in part by claiming that the dispositional/situational hypotheses could not be evaluated using other research designs. Clearly, the researchers were sensitive to ethical issues. They obtained subjects’ consent to the experiment through signed contracts. Prisoners who showed signs of acute distress were released early. The entire study was terminated after less than half of the planned two weeks had elapsed when its unexpectedly harsh impact on subjects became evident. Finally, researchers conducted group therapy debriefing sessions with prisoners and guards and maintained follow-up contacts for a year to ensure that subjects’ negative experiences were temporary.

Two related features of this experiment raise ethical questions, however. First, subjects were not fully informed of the procedures. Although we have seen that deception, including something less than full disclosure, can often be justified, in this case deception was partially due to the researchers’ uncertainty about how the prison simulation would unfold. This relates to the second and more important ethical problem: guards were granted the power to make up and modify rules as the study progressed, and their behavior became increasingly authoritarian. Comments by guards illustrate their reactions as the experiment unfolded (Haney, Banks, and Zimbardo 1973, 88):

“They [the prisoners] didn’t see it as an experiment. It was real and they were fighting to keep their identity. But we were always there to show them just who was boss.”

both guards and prisoners reacted to their roles, but guards were led to believe that the purpose of the experiment was to study prisoners.

If you had been a prisoner in this experiment, you would have experienced something like the following after signing your contract: First, you would have been arrested without notice at your home by a real police officer, perhaps with neighbors looking on. After being searched and taken to the police station in handcuffs, you would have been booked, fingerprinted, and placed in a police detention facility. Next, you would have been blindfolded and driven to “prison,” where you would have been stripped, sprayed with a delousing solution, and left to stand naked for a period of time in the “prison yard.” Eventually, you would have been issued a prison uniform (a loose overshirt stamped with your ID number), fitted with an ankle chain, led to your cell, and ordered to remain silent. Your prison term would then have been totally controlled by the guards.

Wearing mirrored sunglasses, khaki uniforms, and badges and carrying nightsticks, guards supervised prisoner work assignments and held lineups three times per day. Although lineups initially lasted only a few minutes, guards later extended them to several hours. Prisoners were fed bland meals and accompanied by guards on three authorized toilet visits per day.

The behavior of all subjects in the prison yard and other open areas was videotaped; audiotapes were made continuously while prisoners were in their cells. Researchers administered brief questionnaires throughout the experiment to assess emotional changes in prisoners and guards. About four weeks after the experiment concluded, researchers conducted interviews with all subjects to assess their reactions.

Haney and associates (1973, 88) had planned to run the prison experiment for two weeks, but they halted the study after six days because subjects displayed “unexpectedly intense reactions.” Five prisoners had to be released even before that time because they showed signs of acute depression or anxiety.


chapters in this book and whenever you plan a research project.

To further sensitize you to the ethical component of criminal justice and other social research, we’ve prepared brief descriptions of real and hypothetical research situations. Can you see the ethical issue in each? How do you feel about it? Are the procedures described ultimately acceptable or unacceptable? It would be very useful to discuss these examples with other students in your class.

1. In a federally funded study of a probation program, a researcher discovers that one participant was involved in a murder while on probation. Public disclosure of this incident might threaten the program that the researcher believes, from all evidence, is beneficial. Judging the murder to be an anomaly, the researcher does not disclose it to federal sponsors or describe it in published reports.

2. As part of a course on domestic violence, a professor requires students to telephone a domestic violence hotline, pretend to be a victim, and request help. Students then write up a description of the assistance offered by hotline staff and turn it in to the professor.

3. Studying aggression in bars and nightclubs, a researcher records observations of a savage fight in which three people are seriously injured. Ignoring pleas for help from one of the victims, the researcher retreats to a restroom to write up notes from these observations.

4. In a study of state police, researchers learn that officers have been instructed by superiors to “not sign anything.” Fearing that asking officers to sign informed consent statements will sharply reduce participation, researchers seek some other way to satisfy their university IRB. What should they do?

5. In the example mentioned in the section “Staff Misbehavior” (page xx), the researchers disclosed to public officials that police were not making visits to probationers as

“During the inspection, I went to cell 2 to mess up a bed which the prisoner had made and he grabbed me, screaming that he had just made it. . . . He grabbed my throat, and although he was laughing, I was pretty scared. I lashed out with my stick and hit him in the chin (although not very hard), and when I freed myself I became angry.” “Acting authoritatively can be fun. Power can be a great pleasure.”

How do you feel about this experiment? On the one hand, it provided valuable insights into how otherwise normal people react in a simulated prison environment. Subjects appeared to suffer no long-term harm, in part because of precautions taken by researchers. Paul Reynolds (1979, 139) found a certain irony in the short-term discomforts endured by the college student subjects: “There is evidence that the major burdens were borne by individuals from advantaged social categories and that the major benefactors would be individuals from less advantaged social categories [actual prisoners], an uneven distribution of costs and benefits that many nevertheless consider equitable.” On the other hand, researchers did not anticipate how much and how quickly subjects would accept their roles. In discussing their findings, Haney and associates (1973, 90) note: “Our results are . . . congruent with those of Milgram who most convincingly demonstrated the proposition that evil acts are not necessarily the deeds of evil men, but may be attributable to the operation of powerful social forces.” This quote illustrates the fundamental dilemma: balancing the right to conduct research against the rights of subjects. Is it ethical for researchers to create powerful social forces that lead to evil acts?

Discussion Examples
Research ethics is an important and ambiguous topic. The difficulty of resolving ethical issues cannot be an excuse for ignoring them. You need to keep ethics in mind as you read other


the document carefully. How would the code apply to the Stanford prison simulation?

2. Discuss the general trade-offs between the requirements of sound scientific research methods and the need to protect human subjects. Where do tensions exist? Cite illustrations of tensions from two or more examples of ethical issues presented in this chapter.

3. Review the box “Ethics and Juvenile Gang Members” on page 42. Although it is not shown in the box, Decker and Van Winkle developed an informed consent form for their subjects. Keeping in mind the various ethical principles discussed in this chapter, try your hand at preparing an informed consent statement that Decker and Van Winkle might have used.

✪ Additional Readings

Als-Nielsen, Bodil, Wendong Chen, Christian Gluud, and Lise L. Kjaergard, “Association of Funding and Conclusions in Randomized Drug Trials: A Reflection of Treatment Effect or Adverse Events?” Journal of the American Medical Association 290(7, August 2003): 921–927. A brief, interesting, and nontechnical analysis of research on the effects of new drugs. The authors find that research sponsored by drug companies is much more likely to find that drugs are effective, compared to research sponsored by nonprofit organizations. What do you make of that?

American Association of University Professors, Research on Human Subjects: Academic Freedom and the Institutional Review Board (2006), report from the Committee on Academic Freedom and Tenure (Washington, DC: American Association of University Professors, 2006; www.aaup.org/AAUP/comm/rep/A/humansubs.htm; accessed May 5, 2008). Largely in response to concern about overly restrictive institutional review boards, the AAUP convened a series of meetings with representatives from major social science professional groups; this report summarizes the discussion. The most interesting sections address the expansion of human subjects’ protections in social science and even historical research.

Committee on Science, Engineering, and Public Policy, On Being a Scientist: Responsible Conduct in Research, 2nd ed. (Washington, DC: National Academy Press, 1995). This monograph covers

called for in the program intervention. Published reports describe the problem as “irregularities in program delivery.”

✪ Main Points

• In addition to technical and scientific considerations, criminal justice research projects are shaped by ethical considerations.

• What’s ethically “right” and “wrong” in research is ultimately a matter of what people agree is right and wrong.

• Researchers tend not to be the best judges of whether their own work adequately addresses ethical issues.

• Most ethical questions involve weighing the possible benefits of research against the potential for harm to research subjects.

• Criminal justice research may generate special ethical questions, including the potential for legal liability and physical harm.

• Scientists agree that participation in research should, in general, be voluntary. This norm, however, can conflict with the scientific need for generalizability.

• Most scientists agree that research should not harm subjects unless they willingly and knowingly accept the risks of harm.

• Anonymity and confidentiality are two ways to protect the privacy of research subjects.

• Compliance with ethical principles is promoted by professional associations and by regulations issued by the Department of Health and Human Services (HHS).

• HHS regulations include special provisions for two types of subjects of particular interest to many criminal justice researchers: prisoners and juveniles.

• Institutional review boards (IRBs) play an important role in ensuring that the rights and interests of human subjects are protected. But some social science researchers believe that IRBs are becoming too restrictive.

✪ Key Terms

anonymity, p. 32
confidentiality, p. 32

✪ Review Questions and Exercises

1. Obtain a copy of the Academy of Criminal Justice Sciences’ (2000) code of ethics at this website: www.acjs.org (accessed May 6, 2008). Read

Chapter 2 Ethics and Criminal Justice Research 47

a range of issues, including research fraud and other misconduct by researchers. Although many of the issues are specific to the natural sciences, criminologists will find much valuable material.

Inciardi, James A., “Some Considerations on the Methods, Dangers, and Ethics of Crack-House Research,” Appendix A in James A. Inciardi, Dorothy Lockwood, and Anne E. Pottieger, Women and Crack Cocaine (New York: Macmillan, 1993), pp. 147–157. In this thoughtful essay, Inciardi describes the dangers and depressing realities of field research in a crack house. Should a field researcher intervene when witnessing a gang rape in a crack house? Read this selection for Inciardi’s answer.

Kalichman, Seth C., Mandated Reporting of Suspected Child Abuse: Ethics, Law, and Policy, 2nd ed. (Washington, DC: American Psychological Association, 2000). This report presents an extensive discussion of legal and ethical issues in conducting research on child abuse.


Part Two

Structuring Criminal Justice Inquiry

Posing questions properly is often more difficult than answering them. Indeed, a properly phrased question often seems to answer itself. We sometimes discover the answer to a question in the very process of clarifying the question for someone else.

At its base, scientific research is a process for achieving generalized understanding through observation. Part Three will describe some of the specific methods of observation for criminal justice research. But first, Part Two deals with the posing of proper questions, the structuring of inquiry.

Chapter 3 addresses some of the fundamental issues that must be considered in planning a research project. It examines questions of causation, the units of analysis in a research project, the important role of time, and the kinds of things we must consider in proposing to do research projects.

Chapter 4 deals with the specification of what it is we want to study (a process known as conceptualization) and the measurement of the concepts we specify. We’ll look at some of the terms that we use casually in everyday life, and we’ll see how essential it is to be clear about what we really mean by such terms when we do research. Once we are clear on what we mean when we use certain terms, we are in a position to create measurements of what those terms refer to. The process of devising steps or operations for measuring what we want to study is known as operationalization. By way of illustrating this process, Chapter 4 includes an extended discussion of different approaches to measuring crime.

Chapter 5 concentrates on the general design of a criminal justice research project. A criminal justice research design specifies a strategy for finding out something, for structuring a research project. Chapter 5 describes commonly used strategies for experimental and quasi-experimental research. Each is adapted in some way from the classical scientific experiment.


Chapter 3

General Issues in Research Design

Here we’ll examine some fundamental principles about conducting empirical research: (1) causation and (2) variations on who or what is to be studied and when and how to do the studies. We’ll also take a broad overview of the research process.

Introduction 51

Causation in the Social Sciences 51

Criteria for Causality 52

Necessary and Suffi cient Causes 53

Validity and Causal Inference 53

Statistical Conclusion Validity 53

Internal Validity 55

External Validity 55

Construct Validity 55

Validity and Causal Inference Summarized 57

Does Drug Use Cause Crime? 57

CAUSATION AND DECLINING

CRIME IN NEW YORK CITY 58

Introducing Scientifi c Realism 60

Units of Analysis 61

Individuals 61

Groups 61

Organizations 62

Social Artifacts 62

The Ecological Fallacy 63

Units of Analysis in Review 63

UNITS OF ANALYSIS IN THE

NATIONAL YOUTH GANG

SURVEY 64

The Time Dimension 65

Cross-Sectional Studies 66

Chapter 3 General Issues in Research Design 51

Introduction

Causation, units, and time are key elements in planning a research study.

Science is an enterprise dedicated to “finding out.” No matter what we want to find out, though, there are likely to be a great many ways of going about it. Topics examined in this chapter address how to plan scientific inquiry, that is, how to design a strategy for finding out something. Often criminal justice researchers want to find out something that involves questions of cause and effect. They may want to find out more about things that make crime more likely to occur or about policies that they hope will reduce crime in some way.

In practice, all aspects of research design are interrelated. They are separated here and in subsequent chapters so that we can explore particular topics in detail. We start with a discussion of causation in social science, the foundation of explanatory research. We then examine units of analysis: the what or whom to study. Deciding on units of analysis is an important part of all research, partly because people sometimes inappropriately use data measuring one type of unit to say something about a different type of unit.

Next, we consider alternative ways of handling time in criminal justice research. It is sometimes appropriate to examine a static cross section of social life, but other studies follow social processes over time. In this regard, researchers must consider the time order of events and processes in making statements about cause.

We then provide a brief overview of the whole research process. This serves two purposes: (1) it provides a map to the remainder of this book, and (2) it conveys a sense of how researchers go about designing a study.

The chapter concludes with guidelines for preparing a research proposal. Often the actual conduct of research needs to be preceded by a detailed plan of our intentions, perhaps to obtain funding for a major project or to get an instructor’s approval for a class assignment. We’ll see that preparing a research proposal offers an excellent way to ensure that we have considered all aspects of our research in advance.

Causation in the Social Sciences

Causation is the focus of explanatory research.

Cause and effect are implicit in much of what we have examined so far. One of the chief goals of social scientific researchers is to explain why

Longitudinal Studies 66

Approximating Longitudinal Studies 67

The Time Dimension Summarized 70

How to Design a Research Project 70

The Research Process 71

Getting Started 73

Conceptualization 73

Choice of Research Method 74

Operationalization 74

Population and Sampling 74

Observations 75

Analysis 75

Application 75

Research Design in Review 75

The Research Proposal 76

Elements of a Research Proposal 76

52 Part Two Structuring Criminal Justice Inquiry

scribed by William Shadish, Thomas Cook, and Donald Campbell (2002). The fi rst requirement in a causal relationship between two variables is that the cause precede the effect in time. It makes no sense to imagine something being caused by something else that happened later on. A bul-let leaving the muzzle of a gun does not cause the gunpowder to explode; it works the other way around. As simple and obvious as this crite-rion may seem, criminal justice research suffers many problems in this regard. Often the time or-der connecting two variables is simply unclear. Which comes fi rst: drug use or crime? And even when the time order seems clear, exceptions may be found. For example, we normally assume that obtaining a master’s degree in management is a cause of more rapid advancement in a state department of corrections. Yet corrections ex-ecutives might pursue graduate education after they have been promoted and recognize that ad-vanced training in management skills will help them do their job better.

The second requirement in a causal relationship is that the two variables be empirically correlated with each other: they must occur together. It makes no sense to say that exploding gunpowder causes a bullet to leave the muzzle of a gun if, in observed reality, a bullet does not come out after the gunpowder explodes.

Again, criminal justice research has difficulties with this requirement. In the probabilistic world of nomothetic models of explanation, at least, we encounter few perfect correlations. Most judges sentence repeat drug dealers to prison, but some don’t. We are forced to ask, therefore, how strong the empirical relationship must be for that relationship to be considered causal.

The third requirement for a causal relation-ship is that the observed empirical correlation between two variables cannot be explained away as being due to the infl uence of some third vari-able that causes both of them. For example, we may observe that drug markets are often found near bus stops, but this does not mean that bus stops encourage drug markets. A third variable is at work here: groups of people natu-

things are the way they are. Typically we do that by specifying the causes for the way things are: some things are caused by other things.

Much of our discussion in this section describes issues of causation and validity for social science in general. Recall from Chapter 1 that criminal justice research and theory are most strongly rooted in the social sciences. Furthermore, social scientific research methods are adapted from those used in the physical sciences. Many important and difficult questions about causality and validity occupy researchers in criminal justice, but our basic approach requires stepping back a bit to consider the larger picture of how we can or cannot assert that some cause produces some effect.

At the outset, it’s important to keep in mind that cause in social science is inherently probabilistic, a point we introduced in Chapter 1. We say, for example, that certain factors make delinquency more or less likely. Thus, victims of childhood abuse or neglect are more likely to report alcohol abuse as adults (Schuck and Widom 2001). Recidivism is less likely among offenders who receive more careful assessment and classification at institutional intake (Cullen and Gendreau 2000).
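A few lines of Python can make the probabilistic sense of cause more concrete. This is only a sketch: the counts are hypothetical, invented for illustration, and are not taken from Schuck and Widom (2001) or any other study. The function name `risk` is also just an assumption of this example.

```python
# Hypothetical counts, invented for illustration only.
# Each group: (number reporting alcohol abuse as adults, group size)
abused_or_neglected = (60, 200)
comparison_group = (40, 200)

def risk(cases, total):
    """Proportion of a group in which the outcome occurs."""
    return cases / total

risk_exposed = risk(*abused_or_neglected)
risk_unexposed = risk(*comparison_group)
risk_ratio = risk_exposed / risk_unexposed

print(f"Risk among abused or neglected: {risk_exposed:.2f}")
print(f"Risk in comparison group:       {risk_unexposed:.2f}")
print(f"Risk ratio:                     {risk_ratio:.2f}")
```

In these made-up numbers the outcome is more likely in the first group, yet most people in that group do not experience it and some people in the comparison group do. That pattern, a shift in likelihood rather than a guarantee, is what we mean by a probabilistic cause.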

Criteria for Causality

We begin our consideration of cause by examining what criteria must be satisfied before we can infer that something causes something else. Joseph Maxwell (2005, 106–107) writes that criteria for assessing an idiographic explanation are (1) how credible and believable it is and (2) whether alternative explanations (“rival hypotheses”) were seriously considered and found wanting. The first criterion relates to logic as one of the foundations of science. We demand that our explanations make sense, even if the logic is sometimes complex. The second criterion reminds us of Sherlock Holmes’s dictum that when all other possibilities have been eliminated the remaining explanation, however improbable, must be the truth.

Regarding nomothetic explanation, we ex-amine three specifi c criteria for causality, as de-


with are probabilistic and partial: we are able to partly explain cause and effect in some percentage of cases we observe.

Validity and Causal Inference

Scientists assess the truth of statements about cause by considering threats to validity.

Paying careful attention to cause-and-effect relationships is crucial in criminal justice research. Cause and effect are also key elements of applied studies, in which a researcher may be interested in whether, for example, a new mandatory sentencing law actually causes an increase in the prison population.

When we are concerned with whether we are correct in inferring that a cause produced an effect, we are concerned with the validity of causal inference. In the words of Shadish, Cook, and Campbell (2002, 34), validity is “the approximate truth of an inference. . . . When we say something is valid, we make a judgment about the extent to which relevant evidence supports that inference as being true or correct.” They emphasize that approximate is an important word because one can never be absolutely certain about cause.

Our next concern is a number of different validity threats in causal inference: reasons we might be incorrect in stating that some cause produced some effect. As Maxwell (2005, 106) puts it, “A key concept for validity is thus the validity threat: a way you might be wrong” (emphasis in original). Here we will summarize the threats to four general categories of validity: statistical conclusion validity, internal validity, construct validity, and external validity. Chapter 5 discusses each type in more detail, linking the issue of validity to different ways of designing research.

Statistical Conclusion Validity

Statistical conclusion validity refers to our ability to determine whether a change in the suspected cause is statistically associated with a change in the suspected effect. This corresponds

rally congregate near bus stops, and street drug markets are often found where people naturally congregate.

To sum up, most social researchers consider two variables to be causally related (one causes the other) if (1) the cause precedes the effect in time, (2) there is an empirical correlation between them, and (3) the relationship is not found to result from the effects of some third variable on each of the two initially observed. Any relationship that satisfies all these criteria is causal, and these are the only criteria that need to be satisfied.
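The three criteria can be treated as a simple checklist. The sketch below, written in Python, records a proposed causal claim and reports which criteria it fails. The class name `CausalClaim`, its field names, and the true/false judgments entered for the bus stop example are all assumptions made for illustration; in real research each judgment would rest on evidence, not on a flag we set by hand.

```python
from dataclasses import dataclass

@dataclass
class CausalClaim:
    """A proposed cause-and-effect statement and our judgment on each criterion."""
    cause: str
    effect: str
    cause_precedes_effect: bool     # criterion 1: time order
    empirically_correlated: bool    # criterion 2: empirical correlation
    third_variable_ruled_out: bool  # criterion 3: no third variable explains both

    def unmet_criteria(self):
        """Return the names of the criteria this claim does not satisfy."""
        checks = {
            "time order": self.cause_precedes_effect,
            "empirical correlation": self.empirically_correlated,
            "no third-variable explanation": self.third_variable_ruled_out,
        }
        return [name for name, met in checks.items() if not met]

# The bus stop example: drug markets and bus stops are found together,
# but people congregating in one place accounts for both.
bus_stops = CausalClaim(
    cause="bus stops",
    effect="street drug markets",
    cause_precedes_effect=True,      # assumed here for the sake of illustration
    empirically_correlated=True,
    third_variable_ruled_out=False,
)

print(bus_stops.unmet_criteria())
```

A claim is treated as causal under these criteria only when the list of unmet criteria is empty.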

Necessary and Sufficient Causes

Recognizing that virtually all causal relationships in criminal justice are probabilistic is central to understanding other points about cause. Within the probabilistic model, it is useful to distinguish two types of causes: necessary and sufficient causes. A necessary cause is a condition that, by and large, must be present for the effect to follow. For example, it is necessary for someone to be charged with a criminal offense to be convicted, but being charged is not enough; you must plead guilty or be found guilty by the court. Figure 3.1 illustrates that relationship.

A sufficient cause, in contrast, is a condition that more or less guarantees the effect in question. Pleading guilty to a criminal charge is a sufficient cause for being convicted, although you could be convicted through a trial as well. Figure 3.2 illustrates this state of affairs.
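The charging and plea example can be checked against a handful of case records. The Python sketch below uses four hypothetical cases, invented for illustration, and two helper functions whose names (`is_necessary`, `is_sufficient`) are assumptions of this example. It applies the strict logical definitions; the "by and large" and "more or less" qualifiers in the text are deliberately left out.

```python
# Hypothetical case records, invented for illustration.
cases = [
    {"charged": True,  "pleaded_guilty": True,  "convicted": True},   # guilty plea
    {"charged": True,  "pleaded_guilty": False, "convicted": True},   # convicted at trial
    {"charged": True,  "pleaded_guilty": False, "convicted": False},  # acquitted
    {"charged": False, "pleaded_guilty": False, "convicted": False},  # never charged
]

def is_necessary(condition, effect, records):
    """True if the effect never occurs without the condition."""
    return all(r[condition] for r in records if r[effect])

def is_sufficient(condition, effect, records):
    """True if the effect always occurs when the condition is present."""
    return all(r[effect] for r in records if r[condition])

print("charged is necessary for conviction:      ",
      is_necessary("charged", "convicted", cases))
print("charged is sufficient for conviction:     ",
      is_sufficient("charged", "convicted", cases))
print("guilty plea is sufficient for conviction: ",
      is_sufficient("pleaded_guilty", "convicted", cases))
print("guilty plea is necessary for conviction:  ",
      is_necessary("pleaded_guilty", "convicted", cases))
```

In these records, being charged comes out as necessary but not sufficient, and pleading guilty as sufficient but not necessary, which matches the two relationships described above.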

The discovery of one cause that is both necessary and sufficient is the most satisfying outcome in explanatory research. If we are studying juvenile delinquency, we want to discover a single condition that (1) has to be present for delinquency to develop and (2) always results in delinquency. Then we will surely feel that we know precisely what causes juvenile delinquency. Unfortunately, we seldom discover causes that are both necessary and sufficient, nor, in practice, are causes 100 percent necessary or 100 percent sufficient. Most causal relationships that criminal justice researchers work


Basing conclusions on a small number of cases is a common threat to statistical conclu-sion validity. Suppose a researcher studies 10 drug users and 10 nonusers and compares the numbers of times these subjects are arrested for other crimes over a six-month period. The researcher might fi nd that the 10 users were ar-rested an average of three times, whereas non-users averaged two arrests in six months. There is a difference in arrest rates, but is it a signifi -cant difference? Statistically, the answer is no because so few drug users were included in the study. Researchers cannot have much confi –

with one of the first questions asked by researchers: are two variables related to each other? If we suspect that using illegal drugs causes people to commit crimes, one of the first things we will be interested in is the common variation between drug use and crime. If drug users and nonusers commit equal rates of crime and if about the same proportions of criminals and noncriminals use drugs, there will be no statistical relationship between measures of drug use and criminal offending. That seems to be the end of our investigation of the causal relationship between drugs and crime.

[Figure 3.1 Necessary Cause: diagram relating Charged / Not Charged to Convicted / Not Convicted]

[Figure 3.2 Sufficient Cause: diagram relating Plead Guilty / Plead Innocent to Convicted / Not Convicted]


case, a third variable (prior convictions) may explain some or all of the observed tendency of prison sentences to be associated with recidivism. Prior convictions are associated with both sentence (prison or probation) and subsequent convictions.
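One way to see how a third variable can produce this pattern is to compare rearrest rates first for everyone and then separately for people with and without prior records. The Python sketch below does that with hypothetical counts, invented so that prior record accounts for the whole difference; real data would rarely be this tidy, and the text says only that prior convictions may explain "some or all" of it. The function name `rearrest_rate` is an assumption of this example.

```python
# Hypothetical counts, invented for illustration only.
# Key: (prior record, sentence) -> (number rearrested, group size)
counts = {
    ("prior record", "prison"):       (48, 80),
    ("prior record", "probation"):    (12, 20),
    ("no prior record", "prison"):    (4, 20),
    ("no prior record", "probation"): (16, 80),
}

def rearrest_rate(sentence, prior=None):
    """Rearrest rate for one sentence type, optionally within one prior-record group."""
    rearrested = total = 0
    for (prior_record, sent), (n_rearrested, n) in counts.items():
        if sent == sentence and prior in (None, prior_record):
            rearrested += n_rearrested
            total += n
    return rearrested / total

# Everyone together: prison appears to go with more rearrests.
print("All cases:",
      rearrest_rate("prison"), "vs", rearrest_rate("probation"))

# Holding prior record constant: the difference disappears.
for prior in ("prior record", "no prior record"):
    print(prior + ":",
          rearrest_rate("prison", prior), "vs", rearrest_rate("probation", prior))
```

Because people with prior records are both more likely to be sent to prison and more likely to be rearrested, the pooled comparison shows a difference that is absent within each group.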

External Validity

Are findings about the impact of mandatory arrest for family violence in Minneapolis similar to findings in Milwaukee? Can community crime prevention organizations successfully combat drug use throughout a city, or do they work best in areas with only minor drug problems? Electronic monitoring may be suitable as an alternative sentence for convicted offenders, but can it work as an alternative to jail for defendants awaiting trial? Such questions are examples of issues in external validity: do research findings about cause and effect apply equally to different cities, neighborhoods, and populations?

In a general sense, external validity is concerned with whether research findings from one study can be reproduced in another study, often under different conditions. Because crime problems and criminal justice responses can vary so much from city to city or from state to state, researchers and public officials are often especially interested in external validity. For example, a Kansas City evaluation found sharp reductions in gun-related crimes in hot spots that had been targeted for focused police patrols (Sherman and Rogan 1995). Because these results were promising, similar projects were launched in two other cities, Indianapolis (McGarrell et al. 2001) and Pittsburgh (Cohen and Ludwig 2003). In both cases, researchers found that police actions targeting hot spots for gun violence reduced gun-related crime and increased seizures of illegal firearms. Having similar findings in Indianapolis and Pittsburgh enhanced the external validity of original results from Kansas City.

Construct Validity

This type of validity is concerned with how well an observed relationship between variables

dence in statements about cause if their fi nd-ings are based on a small number of cases.
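To see why 10 subjects per group tell us so little, we can ask how often a difference of one arrest would turn up if the "user" and "nonuser" labels had nothing to do with arrests. The Python sketch below does this with an exact permutation test, a technique chosen here for illustration and not one named in the text. The individual arrest counts are hypothetical, invented so that the group averages match the example (three and two arrests).

```python
from itertools import combinations

# Hypothetical arrest counts over six months, invented for illustration.
users = [0, 0, 1, 1, 2, 3, 4, 5, 6, 8]      # average of 3 arrests
nonusers = [0, 0, 0, 1, 1, 2, 2, 3, 4, 7]   # average of 2 arrests

pooled = users + nonusers
total = sum(pooled)
n = len(users)
observed_diff = sum(users) / n - sum(nonusers) / len(nonusers)

# With equal group sizes, the difference in means is (2 * group_sum - total) / n,
# so comparing |2 * group_sum - total| compares differences without rounding error.
observed_gap = abs(2 * sum(users) - total)

# Relabel every possible set of 10 subjects as "users" and count how often
# the relabeled groups differ at least as much as the observed groups.
extreme = 0
splits = 0
for group in combinations(pooled, n):
    splits += 1
    if abs(2 * sum(group) - total) >= observed_gap:
        extreme += 1

p_value = extreme / splits
print(f"Observed difference in average arrests: {observed_diff:.1f}")
print(f"Possible relabelings examined: {splits}")
print(f"Share with a difference at least this large: {p_value:.2f}")
```

With counts like these, a large share of arbitrary relabelings produce a gap of one arrest or more, so the observed difference is easily explained by chance.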

Threats to statistical conclusion validity might also have the opposite effect of suggesting that covariation is present when, in fact, there is no cause-and-effect relationship. The reasons for this are again somewhat technical and require a basic understanding of statistical inference. But the basic principle is based on chance variation: sometimes what appears to be a relationship simply occurs by chance.
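A little arithmetic shows how easily chance alone can produce an apparent relationship. The Python sketch below assumes the conventional 0.05 significance threshold and assumes the comparisons are independent of one another; neither assumption comes from the text, and the function name is hypothetical.

```python
def chance_of_false_finding(n_comparisons, alpha=0.05):
    """Probability that at least one of several independent comparisons
    looks 'significant' when no real relationship exists in any of them."""
    return 1 - (1 - alpha) ** n_comparisons

for k in (1, 5, 20):
    print(f"{k:2d} comparisons: {chance_of_false_finding(k):.3f}")
```

Under these assumptions, the more comparisons a researcher makes, the more likely it is that at least one will appear to show a relationship that is not really there.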

Internal Validity

Internal validity threats challenge causal statements that are based on an observed relationship. An observed association between two variables has internal validity if the relationship is, in fact, causal and not due to the effects of one or more other variables. Whereas statistical conclusion threats are most often due to random error, internal validity problems result from nonrandom or systematic error. Threats to the internal validity of a proposed causal relationship between two indicators usually arise from the effects of one or more other variables. Notice how this validity threat relates to the third requirement for establishing a causal relationship: eliminating other possible explanations for the observed relationship.

If we observe that convicted drug users sentenced to probation are rearrested less often than drug users sentenced to prison, we might be tempted to infer that prison sentences cause recidivism. Although being in prison might have some impact on whether someone commits more crimes in the future, in this case it is important to look for other causes of recidivism. One likely candidate is prior criminal record. Convicted drug users without prior criminal records are more likely to be sentenced to probation, whereas persons with previous convictions more often receive prison terms. Research on criminal careers has found that the probability of reoffending increases with the number of prior arrests and convictions (Farrington, Jolliffe, Hawkins, et al. 2003). In this


further that withdrawing preventive patrol, as was done in the reactive beats, reduces the visibility of police. But by how much? Larson explored this question in detail and suggested that two other features of police operations during the Kansas City experiment partially compensated for the absence of preventive patrol and produced a visible police presence.

First, the different types of experimental beats were adjacent to one another; one reactive beat shared borders with three control and three proactive beats. This enhanced the visibility of police in reactive beats in two ways: (1) police in adjoining proactive and control beats sometimes drove around the perimeter of reactive beats, and (2) police often drove through reactive beats on their way to some other part of the city.

Second, many Kansas City police officers were skeptical about the experiment and feared that withdrawing preventive patrol in reactive beats would create problems. Partly as a result, police who responded to calls for service in the reactive areas more frequently used lights and sirens when driving to the location of complaints. A related action was that police units not assigned to the calls for service nevertheless drove into the reactive beats to provide backup service.

Each of these actions produced a visible police presence in the reactive beats. People who lived in these areas were unaware of the experiment and, as you might expect, did not know whether a police car happened to be present because it was on routine patrol, was on its way to some other part of the city, or was responding to a call for assistance. And, of course, the use of lights and sirens makes police cars much more visible.

Larson’s point was that the construct of po-lice visibility is only partly represented by rou-tine preventive patrol. A visible police presence was produced in Kansas City through other means. Therefore the researchers’ conclusion that routine preventive patrol does not cause a reduction in crime or an increase in arrests suffers from threats to construct validity. Con-

represents the underlying causal process of interest. Construct validity refers to generalizing from what we observe and measure to the real-world things in which we are interested. The concept of construct validity is thus closely related to issues in measurement, as we will see in Chapter 4.

To illustrate construct validity, let’s consider the supervision of police officers, specifically whether close supervision causes police officers to write more traffic tickets. We might define close supervision in this way: a police sergeant drives his own marked police car in such a way as to always keep a patrol car in view.

This certainly qualifies as close supervision, but you may recognize a couple of problems. First, two marked patrol cars present a highly visible presence to motorists, who might drive more prudently and thus reduce the opportunities for patrol officers to write traffic tickets. Second, and more central to the issue of construct validity, this represents a narrow definition of the construct close supervision. Patrol officers may be closely supervised in other ways that cause them to write more traffic tickets. For example, sergeants might closely supervise their officers by reviewing their ticket production at the end of each shift. Supervising subordinates by keeping them in view is only one way of exercising control over their behavior. It may be appropriate for factory workers, but it is not practical for police, representing a very limited version of the construct supervision.

The well-known Kansas City Preventive Patrol Experiment, discussed in Chapter 1, provides another example of construct validity problems. Recall that the experiment sought to determine whether routine preventive patrol caused reductions in crime and fear of crime and increases in arrests. Richard Larson (1975) discussed several difficulties with the experiment’s design. One important problem relates to the visibility of police presence, a central concept in preventive patrol. It is safe to assume that the ability of routine patrol to prevent crime and enhance feelings of safety depends crucially on the visibility of police. It makes sense to assume


Does Drug Use Cause Crime?

As a way of illustrating issues of validity and causal inference, we will consider the relationship between drug use and crime. Drug addiction is thought to cause people who are desperate for a fix and unable to secure legitimate income to commit crimes to support their habit.

Discussing the validity of causal statements about drug use and crime requires carefully specifying two key concepts, drug use and crime, and considering the different ways these concepts might be related. Jan Chaiken and Marcia Chaiken (1990) provide unusually careful and well-reasoned insights that will guide our consideration of links between drugs and crime.

First is the question of temporal order: which comes first, drug use or crime? Research summarized by Chaiken and Chaiken provides no conclusive answer. In an earlier study of prison inmates, Chaiken and Chaiken (1982) found that 12 percent of their adult subjects committed crimes after using drugs for at least two years, whereas 15 percent committed predatory crimes two or more years before using drugs. Studies of juveniles revealed similar findings: “About 50 percent of delinquent youngsters are delinquent before they start using drugs; about 50 percent start concurrently or after” (Chaiken and Chaiken 1990, 235).

Many studies have found that some drug users commit crimes and that some criminals use drugs, but Chaiken and Chaiken (1990, 234) conclude that “drug use and crime participation are weakly related as contemporaneous products of factors generally antithetical to traditional United States lifestyles.” Stated somewhat differently, drug use and crime (as well as delinquency) are each deviant activities produced by other underlying causes. A statistical association between drug use and crime clearly exists. But the presence of other factors indicates that the relationship is not directly causal, thus bringing into question the internal validity of causal statements about drug use and crime.

struct validity is a frequent problem in applied studies, in which researchers may oversimplify complex policies and policy goals.

Validity and Causal Inference SummarizedThe four types of validity threats can be grouped into two categories: bias and gener-alizability. Internal and statistical conclusion validity threats are related to systematic and nonsystematic bias, respectively. Problems with statistical procedures produce nonsystematic bias, whereas an alternative explanation for an observed relationship is an example of system-atic bias. In either case, bias calls into question the inference that some cause produced some effect. Failing to consider the more general cause-and-effect constructs that operate in an observed cause-and-effect relationship results in research fi ndings that cannot be generalized to real-world behaviors and conditions. And a cause-and-effect relationship observed in one setting or at one time may not operate in the same way in a different setting or at a different time.

Shadish, Cook, and Campbell (2002, 39) summarized their discussion of these four va-lidity threats by linking them to the types of questions that researchers ask in trying to es-tablish cause and effect. Test your understand-ing by writing the name of each validity threat after the appropriate question.

1. How large and reliable is the covariation between the presumed cause and effect? _______________

2. Is the covariation causal, or would the same covariation have been obtained without the treatment? _______________

3. What general constructs are involved in the persons, settings, treatments, and observations used in the experiment? _______________

4. How generalizable is the locally embedded causal relationship over varied persons, treatments, observations, and settings? _______________

58 Part Two Structuring Criminal Justice Inquiry

and Bennett and Holloway (2005) report on varying patterns of use in England and Wales. Because there is no simple way to describe either construct, searching for a single cause-and-effect relationship misrepresents a complex causal process.

Problems with the external validity of research on drugs and crime are similar to those revolving around construct validity. The relationship between occasional marijuana use and delinquency among teenagers is different from that between occasional cocaine use and adult

To assess the construct validity of research on drugs and crime, let’s think for a moment about different patterns of each behavior, rather than assume that drug use and crime are uniform behaviors. Many adolescents in the United States experiment with drugs; just as many, especially males, commit delinquent acts or petty crimes. A large number of adults may be occasional users of illegal drugs as well. Many different patterns of drug use, delinquency, and adult criminality have been found in research in other countries. Pudney (2002)

CAUSATION AND DECLINING CRIME IN NEW YORK CITY

Did changes in police strategy and tactics in New York City cause a decline in crime? That question is central to what became quite a spirited debate, a debate that centers our attention on causation. In fact, former Police Commissioner William Bratton (1999, 17) used language strikingly similar to what you will find in this chapter to argue that crime dropped in New York because of police action:

As a basic tenet of epistemology . . . we cannot conclude that a causal relationship exists between two variables unless . . . three necessary conditions occur: one variable must precede the other in time, an empirically measured relationship must be demonstrated between the variables, and the relationship must not be better explained by any third intervening variable. Although contemporary criminology’s explanations for the decline in New York City meet the first two conditions, they don’t explain it better than a third intervening variable.

With these words, Bratton challenged researchers to propose some empirical measure of a variable that better accounted for the crime reduction. A number of researchers have advanced alternative explanations. Let’s now consider what some of those variables might be.

Changing Drug Markets

After falling from 1980 through 1984, homicide rates in larger U.S. cities rose sharply through the early 1990s. This corresponded with the emergence of crack cocaine, a low-cost drug sold by loosely organized gangs that settled business disputes with guns instead of lawyers. The decline in crack use in the mid-1990s corresponded with the beginning of decreasing homicide rates. Alfred Blumstein and Richard Rosenfeld (1998) point to changes in crack markets as a plausible explanation: gun homicides and crack markets both increased and decreased together. Other researchers claim changes in crack markets had some effect on violence (Johnson, Golub, and Dunlap 2000; Karmen 2000).

Regression

One threat to internal validity is regression to the mean. This refers to a phenomenon whereby social indicators move up and down over time, and abnormally high or low values are eventually followed by a return (regression) to more normal levels. Jeffrey Fagan and associates (Fagan, Zimring, and Kim 1998) present evidence to suggest that rates of gun homicide in New York were relatively stable from 1969 through the mid-1980s, when they began to increase sharply. Around 1991, rates began to decline, returning to approximate previous levels.
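Regression to the mean can be illustrated with a small simulation. The sketch below is a hypothetical illustration, not an analysis of the New York data: it assumes a yearly rate that fluctuates randomly around a fixed long-run level, and every name and number in it (`LONG_RUN_RATE`, `NOISE_SD`, `simulate_rates`, `mean_after_extremes`) is invented for the example. It compares unusually high years with the years that immediately follow them.

```python
import random
import statistics

LONG_RUN_RATE = 20.0   # hypothetical long-run rate per 100,000 (assumption)
NOISE_SD = 4.0         # hypothetical year-to-year random variation (assumption)


def simulate_rates(n_years, seed=0):
    """Return a hypothetical series of yearly rates: a stable level plus noise."""
    rng = random.Random(seed)
    return [rng.gauss(LONG_RUN_RATE, NOISE_SD) for _ in range(n_years)]


def mean_after_extremes(rates, threshold):
    """Average the extreme years (rate > threshold) and the years right after them."""
    extreme, following = [], []
    for this_year, next_year in zip(rates, rates[1:]):
        if this_year > threshold:
            extreme.append(this_year)
            following.append(next_year)
    return statistics.mean(extreme), statistics.mean(following)


rates = simulate_rates(5000)
high_mean, next_mean = mean_after_extremes(rates, LONG_RUN_RATE + NOISE_SD)
print(f"Mean of unusually high years: {high_mean:.1f}")
print(f"Mean of the years that follow: {next_mean:.1f}")
```

Nothing in the simulated process changes from year to year, so the drop that follows a high year is not caused by any intervention. That is why regression counts as a threat to internal validity: a policy introduced at a peak may appear to work even when it has no effect.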

Chapter 3 General Issues in Research Design 59

trates threats to the validity of causal inference. It is often difficult to find a relationship because there is so much variation in drug use and crime participation (statistical conclusion validity threat). A large number of studies have demonstrated that, when statistical relationships are found, both drug use and crime can be attributed to other, often multiple, causes (internal validity threat). Different patterns among different population groups mean there are no readily identifiable cause-and-effect constructs (construct validity). Because of these

crime; each relationship, in turn, varies from that between heroin addiction and persistent criminal behavior among adults.

The issue of external validity comes into focus when we shift from basic research that seeks to uncover fundamental causal relationships to criminal justice policy. Chaiken and Chaiken argue that any uniform policy to reduce the use of all drugs among all population groups will have little effect on serious crime.

Basic and applied research on the relationships among drug use and crime readily illus-

Homicides Declined Everywhere

Fagan and associates (1998) point out that although New York’s decline was considerable, it was not unprecedented. Many large cities saw sharp reductions in homicide during the same period, and a few cities had even greater declines. Blumstein and Rosenfeld (1998) cite similar data, pointing out that sharp reductions in homicide rates occurred in cities where no major changes in policing were evident. This suggests that declining crime in New York was simply part of a national trend.

Incapacitation

The decline of homicide rates in the 1990s followed more than a decade of growth in incarceration rates in state and federal correctional facilities. According to the incapacitation argument, rates of homicide and other violent crimes declined because growing numbers of violent criminals were locked up. Some researchers have presented evidence to challenge that claim (Rosenfeld 2000; Spelman 2000).

Economic Opportunity

The early 1990s marked the beginning of a sustained period of economic growth in the United States. With greater job opportunities, the argument goes, crime naturally declined. A number of authors have cited this example of how change in one of the “root causes” of crime might have caused changes in rates of homicide and other serious crimes (for example, Karmen 2000; Silverman 1999), but none have offered any evidence to support it.

Demographic Change

Because violent and other offenses tend to be committed more by younger people, especially young males, a decline in the number of members in those demographic groups may be responsible for New York’s reduced crime rate. Fagan and associates (1998) present data that show relatively stable numbers of 15- to 19-year-old males in New York, so it seems unlikely that demographic factors account for changes in crime rates. Additional analysis of demographic change has been conducted by Rosenfeld (2000) and James Alan Fox (2000).

Continuation of a Trend

George Kelling and William Bratton (1998) claim that crime declined in New York immediately following the implementation of major changes in policing. But other analysts argue that the beginning of the shift preceded Bratton’s changes. Notice that this brings into question another of the three criteria for inferring cause by claiming that the effect (declining crime) occurred before the cause (changes in policing). Even more possible explanations have been offered, but those presented here are the most plausible and are most often cited by researchers. Andrew Karmen (2000, 263) offers a good summary of different explanations for New York’s crime drop.


of considering nomothetic causation. A scientific realist approach would consider the causal mechanism underlying electronic monitoring to be effective in some contexts but not in others. As another example, we reviewed at some length the cause-and-effect conundrum surrounding drug use and crime. That review was framed by traditional nomothetic research to establish cause and effect. A scientific realism approach to the question would recognize that drug use and crime co-occur in some contexts but not in others.

We say that scientific realism bridges idiographic and nomothetic modes of explanation because it exhibits elements of both. Because it focuses our attention on very specific questions, scientific realism seems idiographic: “Will redesigning the Interstate 78 exit in Newark, New Jersey, cause a reduction in the number of suburban residents seeking to buy heroin in this neighborhood?” But this approach is compatible with more general questions of causation: “Can the design of streets and intersections be modified to make it more difficult for street drug markets to operate?” Changing an expressway exit ramp to reduce drug sales in Newark is a specific example of cause and effect that is rooted in the more general causal relationship between traffic patterns and drug markets. Research by Nicholas Zanin and colleagues (2004) addresses both the idiographic explanation in Newark and the potential for broader applications elsewhere.

These illustrations of the scientific realist approach to cause and effect are examples of research for the purpose of application, a topic treated at length by British researchers Ray Pawson and Nick Tilley (1997). Application is a type of explanatory research, as we indicated in Chapter 1. In later chapters, we call on scientific realism as a strategy for designing explanatory research (Chapter 5) and conducting evaluations (Chapter 10).

Sorting out causes and effects is one of the most difficult challenges of explanatory research. Our attention now turns to two other important considerations that emerge in re-

differences, policies developed to counter drug use among the population as a whole cannot be expected to have much of an impact on serious crime (external validity).

None of the above is to say that there is no cause-and-effect relationship between drug use and crime. However, Chaiken and Chaiken have clearly shown that there is no simple causal connection. For more on how questions of cause and effect emerge, see the box “Causation and Declining Crime in New York City.”

Introducing Scientific Realism

In our final consideration of cause and effect in this chapter, we revisit the distinction between idiographic and nomothetic ways of explanation. Doing research to find what causes what reflects nomothetic concerns more often than not. We wish to find causal explanations that apply generally to situations beyond those we actually study in our research. At the same time, researchers and public officials are often interested in understanding specific causal mechanisms in more narrowly defined situations, what we have described as the idiographic mode of explanation.

Scientific realism bridges idiographic and nomothetic approaches to explanation by seeking to understand how causal mechanisms operate in specific contexts. Traditional approaches to finding cause and effect usually try to isolate causal mechanisms from other possible influences, something you should now recognize as trying to control threats to internal validity. The scientific realist approach views these other possible influences as contexts in which causal mechanisms operate. Rather than try to exclude or otherwise control possible outside influences, scientific realism studies how such influences are involved in cause-and-effect relationships.

For example, earlier in this chapter, we noted that electronic monitoring as a condition of probation might apply to some populations but not others. We framed this as a question of external validity in the traditional way


Individuals

Any variety of individuals may be the units of analysis in criminal justice research. This point is more important than it may initially seem. The norm of generalized understanding in social science should suggest that scientific findings are most valuable when they apply to all kinds of people. In practice, however, researchers seldom study all kinds of people. At the very least, studies are typically limited to people who live in a single country, although some comparative studies stretch across national boundaries.

As the units of analysis, individuals may be considered in the context of their membership in different groups. Examples of groups whose members may be units of analysis at the individual level are police, victims, defendants in criminal court, correctional inmates, gang members, and active burglars. Note that each of these terms implies some population of individual persons. Descriptive studies having individuals as their units of analysis typically aim to describe the population that comprises those individuals.

Groups

Social groups may also be the units of analysis for criminal justice research. This is not the same as studying the individuals within a group. If we study the members of a juvenile gang to learn about teenagers who join gangs, the individual (teen gang member) is the unit of analysis. But if we study all the juvenile gangs in a city to learn the differences between big gangs and small ones, between gangs selling drugs and gangs stealing cars, and so forth, the unit of analysis is the social group (gang).

Police beats or patrol districts might be the units of analysis in a study. A police beat can be described in terms of the total number of people who live within its boundaries, total street mileage, annual crime reports, and whether the beat includes a special facility such as a park or high school. We can then determine, for example, whether beats that include a park report

search for explanation and other purposes: units of analysis and the time dimension.

Units of Analysis

To avoid mistaken inferences, researchers must carefully specify the people or phenomena that will be studied.

In criminal justice research, there is a great deal of variation in what or who is studied: what are technically called units of analysis. Individual people are often units of analysis. Researchers may make observations describing certain characteristics of offenders or crime victims, such as age, gender, or race. The descriptions of many individuals are then combined to provide a picture of the population that comprises those individuals.

For example, we may note the age and gender of persons convicted of drunk driving in Fort Lauderdale over a certain period. Aggregating these observations, we might characterize drunk-driving offenders as 72 percent men and 28 percent women, with an average age of 26.4 years. This is a descriptive analysis of convicted drunk drivers in Fort Lauderdale. Although the description applies to the group of drunk drivers as a whole, it is based on the characteristics of individual people convicted of drunk driving.
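The aggregation step described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions: the four records and the names `offenders` and `describe` are invented for the example, and the sketch assumes every record is coded as either male or female. It does not reproduce the Fort Lauderdale figures.

```python
from statistics import mean

# Hypothetical records: one dict per convicted person (invented data).
offenders = [
    {"gender": "male", "age": 24},
    {"gender": "male", "age": 31},
    {"gender": "female", "age": 22},
    {"gender": "male", "age": 27},
]


def describe(records):
    """Aggregate individual-level observations into a description of the group."""
    n = len(records)
    pct_men = 100 * sum(r["gender"] == "male" for r in records) / n
    return {
        "n": n,
        "pct_men": pct_men,
        "pct_women": 100 - pct_men,  # assumes only two coded categories
        "mean_age": mean(r["age"] for r in records),
    }


summary = describe(offenders)
print(summary)
```

The summary describes the group, but each input row is an individual person, so the individual remains the unit of analysis.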

Units of analysis in a study are typically also the units of observation. Thus, to study what steps people take to protect their homes from burglary, we might observe individual household residents, perhaps through interviews. Sometimes, however, we observe units of analysis indirectly. We might ask individuals about crime prevention measures for the purpose of describing households. We might want to find out whether homes with double-cylinder deadbolt locks are burglarized less often than homes with less substantial protection. In this case, our units of analysis are households, but the units of observation are individual household members who are asked to describe burglaries and home protection to interviewers.


grams in selected neighborhoods of a large city. In such an evaluation, we might be interested in how citizens feel about the program (individuals), whether arrests increased in neighborhoods with the new program compared with those without it (groups), and whether the police department’s budget increased more than the budget in a similar city (organizations). In such cases, it is imperative that researchers anticipate what conclusions they wish to draw with regard to what units of analysis.

Social Artifacts

Yet another potential unit of analysis may be referred to as social artifacts, or the products of social behavior. One class of social artifacts is stories about crime in newspapers and magazines or on television. A newspaper story might be characterized by its length, placement on front or interior pages, size of headlines, and presence of photographs. A researcher could analyze whether television news features or newspaper reports provide the most details about a new police program to increase drug arrests.

Social interactions are also examples of social artifacts suitable for criminal justice research. Police crime reports are an example. We might analyze assault reports to find how many involved three or more people, whether assaults involved strangers or people with some prior acquaintance, or whether they more often occurred in public or private locations.

At first, crime reports may not seem to be social artifacts, but consider for a moment what they represent. When a crime is reported to the police, officers usually record what happened from descriptions by victims or witnesses. For instance, an assault victim may describe how he suffered an unprovoked attack while innocently enjoying a cold beer after work. However, witnesses to the incident might claim that the “victim” started the fight by insulting the “offender.” The responding police officer must interpret who is telling the truth in trying to sort out the circumstances of a violent social interaction. The officer’s report becomes a social

more assaults than beats without such facilities or whether auto thefts are more common in beats with more street mileage. Here the individual police beat is the unit of analysis.

Organizations

Formal political or social organizations may also be the units of analysis in criminal justice research. An example is correctional facilities, which implies, of course, a population of all correctional facilities. Individual facilities might be characterized in terms of their number of employees, status as state or federal prisons, security classification, percentage of inmates who are from racial or ethnic minority groups, types of offenses for which inmates are sentenced to each facility, average length of sentence served, and so forth. We might determine whether federal prisons house a larger or smaller percentage of offenders sentenced for white-collar crimes than do state prisons. Other examples of formal organizations suitable as units of analysis are police departments, courtrooms, probation offices, drug treatment facilities, and victim services agencies.

When social groups or formal organizations are the units of analysis, their characteristics are often derived from the characteristics of their individual members. Thus a correctional facility might be described in terms of the inmates it houses: gender distribution, average sentence length, ethnicity, and so on. In a descriptive study, we might be interested in the percentage of institutions housing only females. Or, in an explanatory study, we might determine whether institutions housing both males and females report, on the average, fewer or more assaults by inmates on staff compared with male-only institutions. In each example, the correctional facility is the unit of analysis. In contrast, if we ask whether male or female inmates are more often involved in assaults on staff, then the individual inmate is the unit of analysis.
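The difference between describing facilities and describing inmates can be made concrete with a short sketch. The data and names below (`inmates`, `facility_profiles`, and the facility labels A, B, and C) are hypothetical and invented for illustration only. The same inmate records yield two different percentages depending on which unit of analysis we choose.

```python
# Hypothetical inmate records (invented data): each row is one individual.
inmates = [
    {"facility": "A", "gender": "female"},
    {"facility": "A", "gender": "female"},
    {"facility": "B", "gender": "male"},
    {"facility": "B", "gender": "female"},
    {"facility": "C", "gender": "male"},
    {"facility": "C", "gender": "male"},
]


def facility_profiles(records):
    """Derive one profile per facility from the genders of its inmates."""
    genders_by_facility = {}
    for r in records:
        genders_by_facility.setdefault(r["facility"], set()).add(r["gender"])
    return {
        name: ("female-only" if genders == {"female"}
               else "male-only" if genders == {"male"}
               else "mixed")
        for name, genders in genders_by_facility.items()
    }


profiles = facility_profiles(inmates)

# Unit of analysis = facility: share of institutions housing only females.
pct_female_only_facilities = (
    100 * sum(p == "female-only" for p in profiles.values()) / len(profiles)
)

# Unit of analysis = inmate: share of individuals who are female.
pct_female_inmates = (
    100 * sum(r["gender"] == "female" for r in inmates) / len(inmates)
)

print(profiles)
print(round(pct_female_only_facilities, 1), round(pct_female_inmates, 1))
```

The facility-level percentage answers a question about institutions and the inmate-level percentage answers a question about people. Neither figure can stand in for the other.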

Some studies involve descriptions or explanations of more than one unit of analysis. Consider an evaluation of community policing pro-


of analysis, but we wish to draw conclusions about individual people.

The same problem will arise if we discover that incarceration rates are higher in states that have a large proportion of elderly residents. We will not know whether older people are actually imprisoned more often. Or, if we find higher suicide rates in cities with large nonwhite populations, we cannot be sure whether more nonwhites than whites committed suicide.
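The gap between group-level data and individual-level conclusions can be sketched with invented numbers. Everything in the example below is hypothetical: the precinct names, incomes, robbery counts, and victim records are made up for illustration and are not Chicago figures. The sketch shows an aggregate pattern (the richest precinct has the most robberies) that does not hold for the individuals who were actually robbed.

```python
# Hypothetical precinct-level data (invented numbers, not Chicago figures).
precincts = {
    "Downtown": {"mean_resident_income": 95_000, "robberies": 6},
    "North": {"mean_resident_income": 55_000, "robberies": 2},
    "South": {"mean_resident_income": 40_000, "robberies": 2},
}

# Hypothetical victim-level data for the six Downtown robberies.
downtown_victims = [
    {"lives_in_precinct": False, "income": 30_000},  # commuter
    {"lives_in_precinct": False, "income": 42_000},  # theater visitor
    {"lives_in_precinct": False, "income": 0},       # homeless person
    {"lives_in_precinct": False, "income": 38_000},  # train passenger
    {"lives_in_precinct": True, "income": 90_000},   # resident
    {"lives_in_precinct": False, "income": 51_000},  # restaurant patron
]

# Group-level finding: the richest precinct also has the most robberies.
richest = max(precincts, key=lambda p: precincts[p]["mean_resident_income"])
most_robberies = max(precincts, key=lambda p: precincts[p]["robberies"])

# Individual-level check: who was actually robbed there?
resident_victims = sum(v["lives_in_precinct"] for v in downtown_victims)
share_residents = resident_victims / len(downtown_victims)

print(f"Richest precinct: {richest}; most robberies: {most_robberies}")
print(f"{share_residents:.0%} of Downtown victims were residents")
```

Only the victim-level records can tell us who was robbed. The precinct-level table alone supports conclusions about precincts, not about the incomes of individual victims.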

Don’t let these warnings against ecological fallacy lead you to commit what is called an individualistic fallacy. Some students approaching criminal justice research for the first time have trouble reconciling general patterns of attitudes and actions with individual exceptions they know of. If you read a newspaper story about a Utah resident visiting New York who is murdered on a subway platform, the fact remains that most visitors to New York and most subway riders are not at risk of murder. Similarly, mass media stories and popular films about drug problems in U.S. cities frequently focus on drug use and dealing among African Americans. But that does not mean that most African Americans are drug users or that drugs are not a problem among whites.

The individualistic fallacy can be especially troublesome for beginning students of criminal justice. Newspapers, local television news, and television police dramas often present unusual or highly dramatized versions of crime problems and criminal justice policy. These messages may distort the way many people initially approach research problems in criminal justice.

Units of Analysis in Review

The purpose of this section has been to specify what is sometimes a confusing topic, in part because criminal justice researchers use a variety of different units of analysis. Although individual people are often the units of analysis, that is not always the case. Many research questions can more appropriately be answered through the examination of other units of analysis.

artifact that represents one among the population of all assaults.

Records of different types of social interactions are common units of analysis in criminal justice research. Criminal history records, meetings of community anticrime groups, presentence investigations, and interactions between police and citizens are examples. Notice that each example requires information about individuals but that social interactions between people are the units of analysis.

The Ecological Fallacy

We now briefly consider one category of problems commonly encountered with respect to units of analysis. The ecological fallacy refers to the danger of making assertions about individuals as the unit of analysis based on the examination of groups or other aggregations.

As an example, suppose we are interested in learning about robbery in different police precincts of a large city. Let’s assume that we have information on how many robberies were committed in 2004 in each police precinct of Chicago. Assume also that we have census data describing some of the characteristics of those precincts. Our analysis of such data might show that a large number of robberies in 2004 occurred in the downtown precinct and that the average family income of persons who live in downtown Chicago (the Loop) was substantially higher than in other precincts in the city. We might be tempted to conclude that high-income downtown residents are more likely to be robbed than are people who live in other parts of the city, that is, that robbers select richer victims. In reaching such a conclusion, we run the risk of committing an ecological fallacy, because lower-income people who did not live in the downtown area were also being robbed there in 2004. Victims might be commuters to jobs in the Loop, people visiting downtown theaters or restaurants, passengers on subway or elevated train platforms, or homeless persons who are not counted by the census. Our problem is that we examined police precincts as our unit


results in part from difficulties in directly measuring the concepts we want to study.

To test your grasp of the concept of units of analysis, here are some examples of real research topics. See if you can determine the unit of analysis in each. (The answers are given later in the chapter, on page 78.)

1. “Taking into account preexisting traffic fatality trends and several other relevant factors, the implementation of the emergency cellular telephone program resulted in a substantial and permanent reduction in the monthly percentage of alcohol-related fatal crashes.” (D’Alessio, Stolzenberg, and Terry 1999, 463–464)

2. “The survey robbery rate was highest in Canada and the Netherlands, and lowest in Scotland. . . . In 1999 the survey robbery

The concept of units of analysis may seem more complicated than it needs to be. Understanding the logic of units of analysis is more important than memorizing a list of the units. It is irrelevant what we call a given unit of analysis: a group, a formal organization, or a social artifact. It is essential, however, that we be able to identify our unit of analysis. We must decide whether we are studying assaults or assault victims, police departments or police officers, courtrooms or judges, and prisons or prison inmates. Without keeping this point in mind, we run the risk of making assertions about one unit of analysis based on the examination of another. The box titled “Units of Analysis in the National Youth Gang Survey” offers examples of using inappropriate units of analysis. It also illustrates that lack of clarity about units of analysis in criminal justice

UNITS OF ANALYSIS IN THE NATIONAL YOUTH GANG SURVEY

In 1997, the third annual National Youth Gang Survey was completed for the federal Office of Juvenile Justice and Delinquency Prevention (OJJDP). This survey reflects keen interest in developing better information about the scope of youth gangs and their activities in different types of communities around the country. As important and useful as this effort is, the National Youth Gang Survey, especially reports of its results, illustrates how some ambiguities can emerge with respect to units of analysis.

A variety of methods, often creative, are used to gather information from or about active offenders. Partly this is because it is difficult to systematically identify offenders for research. Studying youth gangs presents more than the usual share of problems with units of analysis. Are we interested in gangs (groups), gang members (individuals), or offenses (social artifacts) committed by gangs?

Following methods developed in earlier years, the 1997 National Youth Gang Survey was based on a sample of law enforcement agencies. The sample was designed to represent different types of communities: rural areas, suburban counties, small cities, and large cities. Questionnaires were mailed to the police chief for municipalities and to the sheriff for counties (National Youth Gang Center 1999, 3). Questions asked respondents to report on gangs and gang activity in their jurisdiction: municipality for police departments, and unincorporated service area for sheriffs’ departments.

Here are examples of the types of questions included in the survey:

1. How many youth gangs were active in your jurisdiction?

2. How many active youth gang members were in your jurisdiction?

3. In your jurisdiction, what percentage of street sales of drugs were made by youth gang members? (followed by list: powder cocaine, crack cocaine, marijuana, heroin, methamphetamine, other)

4. Does your agency have the following? (list of special youth gang units)

Notice the different units of analysis embedded in these questions. Seven are stated or implied:

1. Gangs: item 1

2. Gang members: items 2, 3


prised the study’s cross-sectional units. From January 1986 through July 1989, a period of 43 months, data were collected monthly from each substation for a total of 344 cases.” (Kessler 1999, 346)

The Time Dimension

Because time order is a requirement for causal inferences, the time dimension of research requires careful planning.

We saw earlier in this chapter how the time sequence of events and situations is a critical element in determining causation. In general, observations may be made more or less at one time point or they may be deliberately stretched over a longer period. Observations made at more than one time point can look forward or backward.

rate was lowest in the United States.” (Farrington et al. 2004, xii)

3. “On average, probationers were 31 years old, African American, male, and convicted of drug or property offenses. Most lived with family, and although they were not married, many were in exclusive relationships (44 percent) and had children (47 percent).” (MacKenzie, Browning, Skroban, and Smith 1999, 433)

4. “Seventy-five percent (n = 158) of the cases were disposed at district courts, and 3 percent (n = 6) remained pending. One percent of the control and 4 percent of the experimental cases were referred to drug treatment court.” (Taxman and Elis 1999, 42)

5. “The department’s eight Field Operations Command substations (encompassing 20 police districts and 100 patrol beats) com-

3. Jurisdiction (city or part of county area): items 1, 2, 3

4. Street sales of drugs: item 3

5. Drug types: item 3

6. Agency: item 4

7. Special unit: item 4

Now consider some quotes from a summary report on the 1997 survey (National Youth Gang Center 1999). Which ones do or do not reasonably reflect the actual units of analysis from the survey?

■ “Fifty-one percent of survey respondents indicated that they had active youth gangs in their jurisdictions in 1997.” (page 7)

■ “Thirty-eight percent of jurisdictions in the Northeast, and 26 percent of jurisdictions in the Middle Atlantic regions reported active youth gangs in 1997.” (extracted from Table 3, page 10)

■ “Results of the 1997 survey revealed that there were an estimated 30,533 youth gangs and 815,986 gang members active in the United States in 1997.” (page 13)

■ “The percentage of street sales of crack cocaine, heroin, and methamphetamine conducted by youth gang members varied substantially by region. . . . Crack cocaine sales involving youth gang members were most prevalent in the Midwest (38 percent), heroin sales were most prevalent in the Northeast (15 percent), and methamphetamine sales were most prevalent in the West (21 percent).” (page 27)

■ “The majority (66 percent) of respondents indicated that they had some type of specialized unit to address the gang problem.” (page 33)

The youth gang survey report includes a number of statements and tables that inaccurately describe units of analysis. You probably detected examples of this in some of the statements shown here. Other statements accurately reflect units of analysis measured in the survey.

If you read the 1997 survey report and keep in mind our discussion of units of analysis, you will find more misleading statements and tables. This will enhance your understanding of units of analysis.

Source: Information drawn from National Youth Gang Center (1999).


extended period. An example is a researcher who observes the activities of a neighborhood anticrime organization from the time of its inception until its demise. Analysis of newspaper stories about crime or numbers of prison inmates over time are other examples. In the latter instances, it is irrelevant whether the researcher’s observations are made over the course of the actual events under study or at one time, for example, examining a year’s worth of newspapers in the library or 10 years of annual reports on correctional populations.

Three types of longitudinal studies are common in criminal justice research: trend, cohort, and panel studies. Trend studies look at changes within some general population over time. An example is a comparison of Uniform Crime Report (UCR; described in Chapter 1) figures over time, showing an increase in reported crime from 1960 through 1993 and then a decline through 2001. Or a researcher might want to know whether changes in sentences for certain offenses were followed by increases in the number of people imprisoned in state institutions. In this case, a trend study might examine annual figures for prison population over time, comparing totals for the years before and after new sentencing laws took effect.

Cohort studies examine more specific populations (cohorts) as they change over time. Typically a cohort is an age group, such as those people born during the 1980s, but it can also be based on some other time grouping. A cohort is often defined as a group of people who enter or leave an institution at the same time, such as persons entering a drug treatment center during July, offenders released from custody in 2002, or high school seniors in March 2003.

In what is probably the best-known cohort study, Marvin Wolfgang and associates (Wolfgang, Figlio, and Sellin 1972) studied all males born in 1945 who lived in the city of Philadelphia from their 10th birthday through age 18 or older. The researchers examined records from police agencies and public schools to determine how many boys in the cohort had been

Cross-Sectional Studies

Many criminal justice research projects are designed to study a phenomenon by taking a cross section of it at one time and analyzing that cross section carefully. Exploratory and descriptive studies are often cross-sectional. A single U.S. census, for instance, is a study aimed at describing the U.S. population at a given time. A single wave of the National Crime Victimization Survey (NCVS) is a descriptive cross-sectional study that estimates how many people have been victims of crime in a given time.

A cross-sectional exploratory study might be conducted by a police department in the form of a survey that examines what residents believe to be the sources of crime problems in their neighborhood. In all likelihood, the study will ask about crime problems in a single time frame, with the findings used to help the department explore various methods of introducing community policing.

Cross-sectional studies for explanatory or evaluation purposes have an inherent problem. Typically their aim is to understand causal processes that occur over time, but their conclusions are based on observations made at only one time. For example, a survey might ask respondents whether their home has been burglarized and whether they have any special locks on their doors, hoping to explain whether special locks prevent burglary. Because the questions about burglary victimization and door locks are asked at only one time, it is not possible to determine whether burglary victims installed locks after a burglary or whether special locks were already in place but did not prevent the crime. Some of the ways we can deal with the difficult problem of determining time order will be discussed in the section on approximating longitudinal studies.

Longitudinal Studies

Research projects known as longitudinal studies are designed to permit observations over an

Chapter 3 General Issues in Research Design 67

study of gun ownership and violence by Swiss researcher Martin Killias (1993). Killias compared rates of gun ownership as reported in an international crime survey to rates of homicide and suicide committed with guns. He was interested in the possible effects of gun availability on violence: do nations with higher rates of gun ownership also have higher rates of gun violence?

Killias reasoned that inferring causation from a cross-sectional comparison of gun ownership and homicides committed with guns would be ambiguous. Gun homicide rates could be high in countries with high gun-ownership rates because the availability of guns was higher. Or people in countries with high gun-ownership rates could have bought guns to protect themselves, in response to rates of homicide. Cross-sectional analysis would not make it possible to sort out the time order of gun ownership and gun homicides.

But does that reasoning hold for gun suicides? Killias argued that the time order in a relationship between gun ownership and gun suicides is less ambiguous. It makes much more sense that suicides involving guns are at least partly a result of gun availability. But it is not reasonable to assume that people might buy guns in response to high gun-suicide rates.

Logical inferences may also be made whenever the time order of variables is clear. If we discover in a cross-sectional study of high school students that males are more likely than females to smoke marijuana, we can conclude that gender affects the propensity to use marijuana, not the other way around. Thus, even though our observations are made at only one time, we are justified in drawing conclusions about processes that take place across time.

Retrospective Studies Research that asks people to recall their pasts, called retrospective research, is a common way of approximating observations over time. In a study of recidivism, for example, we might select a group of prison inmates and analyze their history of

charged with delinquency or arrested, how old they were when first arrested, and what differences there were in school performance between delinquents and nondelinquents.

Panel studies are similar to trend and cohort studies except that observations are made on the same set of people on two or more occasions. The NCVS is a good example of a descriptive panel study. A member of each household selected for inclusion in the survey is interviewed seven times at six-month intervals. The NCVS serves many purposes, but it was developed initially to estimate how many people were victims of various types of crimes each year. It is designed as a panel study so that persons can be asked about crimes that occurred in the previous six months, and two waves of panel data are combined to estimate the nationwide frequency of victimization over a one-year period.

Among longitudinal studies, panel studies face a special problem: panel attrition. Some of the respondents studied in the first wave of a study may not participate in later waves. The danger is that those who drop out of the study may not be typical and may thereby distort the results of the study. Suppose we are interested in evaluating the success of a new drug treatment program by conducting weekly drug tests on a panel of participants for a period of 10 months. Regardless of how successful the program appears to be after 10 months, if a substantial number of people drop out of our study, we can expect that treatment was less effective in keeping them off drugs.
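To see why attrition can distort results, consider a minimal sketch in Python. Every number below is invented for illustration, and the sketch assumes (unrealistically) that we somehow know the drug-use status of the people who dropped out. The names panel and percent_drug_free are our own.

```python
# Hypothetical illustration of panel attrition; all records are invented.
# Each record: (participant_id, completed_study, drug_free_at_end)
panel = [
    (1, True, True),
    (2, True, True),
    (3, True, True),
    (4, True, False),
    (5, False, False),
    (6, False, False),
    (7, False, True),
    (8, False, False),
]


def percent_drug_free(records):
    """Percentage of the given records that are drug free at the end."""
    clean = sum(1 for _, _, drug_free in records if drug_free)
    return 100.0 * clean / len(records)


# Only those who stayed in the study are observed in the final wave.
completers = [record for record in panel if record[1]]

rate_completers = percent_drug_free(completers)  # what the study would report
rate_full_panel = percent_drug_free(panel)       # what is true of everyone who started
attrition_rate = 100.0 * (len(panel) - len(completers)) / len(panel)

print(f"Attrition: {attrition_rate:.0f}% of the original panel")
print(f"Drug free among completers: {rate_completers:.0f}%")
print(f"Drug free among all who started: {rate_full_panel:.0f}%")
```

In this made-up panel, the success rate computed from completers alone is higher than the rate for everyone who began treatment, which is the kind of distortion the text warns about.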

Approximating Longitudinal Studies

It may be possible to draw conclusions about processes that take place over time even when only cross-sectional data are available. It is worth noting some of the ways to do that.

Logical Inferences Cross-sectional data sometimes imply processes that occur over time on the basis of simple logic. Consider a


child victims have a mother or father who was abused as a child. It seems safe to conclude that your hypothesis about the intergenerational transmission of violence is strongly supported, because 90 percent (18 out of 20) of abuse or neglect victims brought before your court come from families with a history of child abuse.

Think for a moment about how you approached the question of whether child abuse breeds child abuse. You began with abuse victims and retrospectively established that many of their parents had been abused. However, this is different from the question of how many victims of childhood abuse later abuse their own children. That question requires a prospective approach, in which you begin with childhood victims and then determine how many of them later abuse their own children.

To clarify this point, let’s shift from the hypothetical study to actual research that illustrates the difference between prospective and retrospective approaches to the same question. Rosemary Hunter and Nancy Kilstrom (1979) conducted a study of 255 infants and their parents. The researchers began by selecting families of premature infants in a newborn intensive care unit. Interviews with the parents of 255 infants revealed that either the mother or the father in 49 of the families had been the victim of abuse or neglect; 206 families revealed no history of abuse. In a prospective follow-up study, Hunter and Kilstrom found that within one year 10 of the 255 infants had been abused. Nine of those 10 infant victims were from the 49 families with a history of abuse, and 1 abused infant was from the 206 families with no background of abuse.

Figure 3.3 illustrates these prospective results graphically. Infants in 18 percent (9 out of 49) of families with a history of abuse showed signs of abuse within one year of birth, whereas less than 1 percent of infants born to parents with no history of abuse were abused within one year. Although that is a sizable difference, notice that the 18 percent figure for continuity

delinquency or crime. Or suppose we are interested in whether college students convicted of drunk driving are more likely to have parents with drinking problems than college students with no drunk-driving record. Such a study is retrospective because it focuses on the histories of college students who have or have not been convicted of drunk driving.

The danger in this technique is evident. Sometimes people have faulty memories; sometimes they lie. Retrospective recall is one way of approximating observations across time, but it must be used with caution. Retrospective studies that analyze records of past arrests or convictions suffer from different problems: records may be unavailable, incomplete, or inaccurate.

A more fundamental issue in retrospective research hinges on how subjects are selected and how subject selection affects the kinds of questions such studies can address.

Imagine that you are a juvenile court judge and you’re troubled by what appears to be a large number of child abuse cases in your court. Talking with a juvenile caseworker, you wonder whether the parents of these children were abused or neglected during their own childhood. Together, you formulate a hypothesis about the intergenerational transmission of violence: victims of childhood abuse later abuse their own children. How might you go about investigating that hypothesis?

Given your position as a judge who regularly sees abuse victims, you will probably consider a retrospective approach that examines the backgrounds of families appearing in your court. Let’s say you and the caseworker plan to investigate the family backgrounds of 20 abuse victims who appear in your court during the next three months. The caseworker consults with a clinical psychologist from the local university and obtains copies of a questionnaire, or protocol, that has been used by researchers to study the families of child abuse victims. After interviewing the families of 20 victims, the caseworker reports to you that 18 of the 20


of abuse is very similar to the 19 percent rate of abuse discovered in the histories of all 255 families.

Now consider what Hunter and Kilstrom would have found if they had begun with the 10 abused infants at time 2 and then checked their family backgrounds. Figure 3.4 illustrates this retrospective approach. A large majority of the 10 infant victims (90 percent) had parents with a history of abuse.

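Figure 3.4 Retrospective Approach to a Subject
[Diagram: starting from the 10 infant victims at Time 2 and looking back to their parents at Time 1, 9 (90%) had parents who were victims and 1 (10%) had parents who were not victims.]
Source: Adapted from Hunter and Kilstrom (1979), as suggested by Widom (1989b).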

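Figure 3.3 Prospective Approach to a Subject
[Diagram: at Time 1, parents in 49 families (19%) were victims and parents in 206 families (81%) were not victims. At Time 2, 9 infants (18%) from the 49 families and 1 infant (0.5%) from the 206 families were victims.]
Source: Adapted from Hunter and Kilstrom (1979), as suggested by Widom (1989b).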


hood victims of abuse or neglect later abuse their own children. A retrospective study can be used, however, to compare whether childhood victims are more likely than nonvictims to have a history of abuse in their family background.

The Time Dimension Summarized

Joel Devine and James Wright (1993, 19) offer a clever metaphor that distinguishes longitudinal studies from cross-sectional ones. Think of a cross-sectional study as a snapshot, a trend study as a slide show, and a panel study as a motion picture. A cross-sectional study, like a snapshot, produces an image at one point in time. This can provide useful information about crime (burglary, for example) at a single time, perhaps in a single place. A trend study is akin to a slide show: a series of snapshots in sequence over time. By viewing a slide show, we can tell how some indicator (change in burglary rates) varies over time. But a trend study is usually based on aggregate information. It can tell us something about aggregations of burglary over time, but not, for instance, whether the same people are committing burglaries at an increasing or decreasing rate or whether there are more or fewer burglars with a relatively constant rate of crime commission. A panel study, like a motion picture, can capture moving images of the same individuals and give us information about individual rates of offending over time.

How to Design a Research Project

Designing research requires planning several stages, but the stages do not always occur in the same sequence.

We’ve now seen some of the options available to criminal justice researchers in designing projects, but what if you were to undertake research? Where would you start? Then where would you go? How would you begin planning your research?

You probably realize by now that the prospective and retrospective approaches address fundamentally different questions, even though the questions may appear similar on the surface:

Prospective: What percentage of abuse victims later abuse their children? (18 percent; Figure 3.3)

Retrospective: What percentage of abuse victims have parents who were abused? (90 percent; Figure 3.4)
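The two percentages differ because they share a numerator (the 9 abused infants from families with a history of abuse) but not a denominator. As a minimal sketch, the arithmetic can be reproduced in Python from the counts reported for Hunter and Kilstrom (1979); the variable names and the pct helper are our own illustrative choices.

```python
# Counts reported in the text for Hunter and Kilstrom (1979).
families_with_history = 49       # a parent was abused or neglected
families_without_history = 206   # no reported history of abuse
abused_from_history = 9          # abused infants from the 49 families
abused_from_no_history = 1       # abused infants from the 206 families


def pct(part, whole):
    """Return part as a percentage of whole."""
    return 100.0 * part / whole


# Prospective: start with parents at time 1 and look forward to infants.
prospective_history = pct(abused_from_history, families_with_history)
prospective_no_history = pct(abused_from_no_history, families_without_history)

# Retrospective: start with abused infants at time 2 and look back at parents.
total_abused = abused_from_history + abused_from_no_history
retrospective = pct(abused_from_history, total_abused)

# Share of all 255 families with a history of abuse, for comparison.
total_families = families_with_history + families_without_history
base_rate = pct(families_with_history, total_families)

print(f"Prospective, history of abuse:    {prospective_history:.1f}%")
print(f"Prospective, no history of abuse: {prospective_no_history:.1f}%")
print(f"Retrospective:                    {retrospective:.1f}%")
print(f"Families with a history of abuse: {base_rate:.1f}%")
```

Dividing 9 by the 49 families with a history of abuse gives the prospective figure; dividing the same 9 by the 10 abused infants gives the retrospective figure.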

In a study of how child abuse and neglect affect drug use, Cathy Spatz Widom and associates (Widom, Weiler, and Cotler 1999) present a similar contrast of prospective and retrospective analysis. Looking backward, 75 percent of subjects with a drug abuse diagnosis in semi-clinical interviews were victims of childhood abuse or neglect. Looking forward, 35 percent of childhood victims and 34 percent of nonvictims had a drug abuse diagnosis.

More generally, Robert Sampson and John Laub (1993, 14) comment on how retrospective and prospective views yield different interpretations about patterns of criminal offending over time:

Looking back over the careers of adult criminals exaggerates the prevalence of stability. Looking forward from youth reveals the success and failures, including adolescent delinquents who go on to be normal functioning adults. (emphasis in original). This is the paradox noted [by Lee Robins] earlier: adult criminality seems to be always preceded by childhood misconduct, but most conduct-disordered children do not become antisocial or criminal adults.

Notice how the time dimension is linked to how research questions are framed. A retrospective approach is limited in its ability to reveal how causal processes unfold over time. A retrospective approach is therefore not well suited to answer questions such as how many child-


the theory may produce new ideas and create new interests. Or your understanding of some theory may encourage you to consider new policies.

To make this discussion more concrete, let’s take a specific research example. Suppose you are concerned about the problem of crime on your campus and you have a special interest in learning more about how other students view the issue and what they think should be done about it. Going a step further, let’s say you have the impression that students are especially concerned about violent crimes such as assault and robbery and that many students feel the university should be doing more to prevent violent crime. The source of this idea might be your own interest after being a student for a couple of years. You might develop the idea while reading about theories of crime in a course you are taking. Perhaps you recently read stories about a crime wave on campus. Or maybe some combination of things makes you want to learn more about campus crime.

Considering the research purposes discussed earlier in this chapter, your research will be mainly exploratory. You probably have descriptive and explanatory interests as well: How much of a problem is violent crime on campus? Are students especially concerned about crime in certain areas? Why are some students more worried about crime than others? What do students think would be effective changes to reduce campus crime problems?

At this point, you should begin to think about units of analysis and the time dimension. Your interest in violent crime might suggest a study of crimes reported to campus police in recent years. In this case, the units of analysis will be social artifacts (crime reports) in a longitudinal study (crime reports in recent years). Or, after thinking a bit more, you may be interested in current student attitudes and opinions about violent crime. Here the units of analysis will be individuals (college students), and a cross-sectional study will suit your purposes nicely.

Every project has a starting point, but it is important to think through later stages even at the beginning. Figure 3.5 presents a schematic view of the social scientific research process. We present this view reluctantly because it may suggest more of a cookbook approach to research than is the case in practice. Nonetheless, it’s important to have an overview of the whole process before we launch into the details of particular components of research. This figure presents another and more detailed picture of the scientific process discussed in Chapter 1.

The Research Process

At the top of the diagram in Figure 3.5 are interests, ideas, theories, and new programs: the possible beginning points for a line of research. The letters (A, B, X, Y, and so forth) represent variables or concepts such as deterrence or child abuse. Thus you might have a general interest in finding out why the threat of punishment deters some but not all people from committing crimes, or you might want to investigate how burglars select their targets. Alternatively, your inquiry might begin with a specific idea about the way things are. You might have the idea that aggressive arrest policies deter drug use, for example. Question marks in the diagram indicate that you aren’t sure things are the way you suspect they are. We have represented a theory as a complex set of relationships among several variables (A, B, E, and F).

The research process might also begin with an idea for a new program. Imagine that you are the director of a probation services department and you want to introduce weekly drug tests for people on probation. Because you have taken a course on criminal justice research methods, you decide to design an evaluation of the new program before trying it out. The research process begins with your idea for the new drug-testing program.

Notice the movement back and forth among these several possible beginnings. An initial interest may lead to the formulation of an idea, which may be fi t into a larger theory, and


Figure 3.5 The Research Process
[Diagram: four possible starting points (INTEREST, IDEA, THEORY, and NEW PROGRAM, the last illustrated by drug tests and probation violations), drawn with variables such as A, B, E, F, X, and Y and with question marks, lead into the following stages.]
CONCEPTUALIZATION: Specify the meaning of the concepts and variables to be studied
CHOICE OF RESEARCH METHOD: Experiments; Survey research; Field research; Content analysis; Existing data research; Comparative research; Evaluation research
OPERATIONALIZATION: How will we actually measure the variables under study?
POPULATION AND SAMPLING: Whom do we want to be able to draw conclusions about? Who will be observed for that purpose?
OBSERVATIONS: Collecting data for analysis and interpretation
DATA PROCESSING: Transforming the data collected into a form appropriate to manipulation and analysis
ANALYSIS: Analyzing data and drawing conclusions
APPLICATION: Reporting results and assessing their implications


ing the planned report will help you make better decisions about research design.

Conceptualization

We often talk casually about criminal justice concepts such as deterrence, recidivism, crime prevention, community policing, and child abuse, but it’s necessary to specify what we mean by these concepts to do research on them. Chapter 4 will examine this process of conceptualization in depth. For now, let’s see what it might involve in our hypothetical example.

If you are going to study student concerns about violent crime, you must first specify what you mean by concern about violent crime. This ambiguous phrase can mean different things to different people. Campus police officers are concerned about violent crime because that is part of their job. On the one hand, students might be concerned about crime in much the same way they are concerned about other social problems, such as homelessness, animal rights, and the global economy. They recognize these issues as problems society must deal with, but they don’t feel that the issues affect them directly; we could specify this concept as general concern about violent crime. On the other hand, students may feel that the threat of violent crime does affect them directly, and they express some fear about the possibility of being a victim; let’s call this fear for personal safety.

Obviously, you need to specify what you mean by the term in your research, but this doesn’t necessarily mean you have to settle for a single definition. In fact, you might want to define the concept of concern about violent crime in more than one way and see how students feel about each.

Of course, you need to specify all the concepts you wish to study. If you want to study the possible effect of concern about crime on student behavior, you’ll have to decide whether you want to limit your focus to specific precautionary behavior such as keeping doors locked or general behavior such as going to classes, parties, and football games.

Getting Started

To begin pursuing your interest in student concerns about violent crime, you undoubtedly will want to read something about the issue. You might begin by finding out what research has been done on fear of crime and on the sorts of crime that concern people most. Newspaper stories should provide information on the violent crimes that occurred recently on campus. Appendix A on the website for this book will give you some assistance in using your college library. In addition, you will probably want to talk to people, such as other students or campus police officers. These activities will prepare you to handle the various research design decisions we are about to examine. As you review the research literature, you should make note of the designs used by other researchers, asking whether the same designs will meet your research objective.

What is your objective, by the way? It’s important that you are clear about that before you design your study. Do you plan to write a paper based on your research to satisfy a course requirement or as an honors thesis? Is your purpose to gain information that will support an argument for more police protection or better lighting on campus? Do you want to write an article for the campus newspaper or an academic journal?

Usually, your objective for undertaking research can be expressed in a report. Appendix C on the website for this book will help you with the organization of research reports, and we recommend that you make an outline of such a report as the first step in the design of any project. You should be clear about the kinds of statements you will want to make when the research is complete. Here are two examples of such statements: “x percentage of State U students believe that sexual assault is a big problem on campus,” and “Female students living off campus are more likely than females living in dorms to feel that emergency phones should be installed near buildings where evening classes are held.” Although your final report may not look much like your initial image of it, outlin-


You might operationalize fear for personal safety with the question “How safe do you feel alone on the campus after dark?” This could be followed by boxes indicating the possible answers “Safe” and “Unsafe.” Student attitudes about ways of improving campus safety could be operationalized with the item “Listed below are different actions that might be taken to reduce violent crime on campus. Beside each description, indicate whether you favor or oppose the actions described.” This could be followed by several different actions, with “Favor” and “Oppose” boxes beside each.

Population and Sampling

In addition to refining concepts and measurements, decisions must be made about whom or what to study. The population for a study is that group (usually of people) about whom we want to be able to draw conclusions. We are almost never able to study all the members of the population that interests us, however. In virtually every case, we must sample subjects for study. Chapter 6 describes methods for selecting samples that adequately reflect the whole population that interests us. Notice in Figure 3.5 that decisions about population and sampling are related to decisions about the research method to be used.

In the study of concern about violent crime, the relevant population is the student population of your college. As you’ll discover in Chapter 6, however, selecting a sample requires you to get more specific than that. Will you include part-time as well as full-time students? Only degree candidates or everyone? Students who live on campus, off campus, or both? There are many such questions, and each must be answered in terms of your research purpose. If your purpose is to study concern about sexual assault, you might consider limiting your population to female students. If hate crimes are of special interest, you will want to be sure that your study population includes minorities and others who are thought to be particularly targeted by hate crimes.
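Decisions about who belongs in the study population can be made explicit before any sample is drawn. The following is a minimal sketch in Python, assuming a hypothetical list of student records; the field names, the 200 invented students, and the draw_sample helper are all illustrative. It shows one simple option, a simple random sample, and is not a substitute for the sampling designs covered in Chapter 6.

```python
import random

# Hypothetical sampling frame; every record here is invented for illustration.
students = [
    {"id": i, "full_time": i % 4 != 0, "on_campus": i % 2 == 0}
    for i in range(1, 201)
]


def draw_sample(frame, size, include, seed=0):
    """Draw a simple random sample of `size` from the records that pass `include`."""
    eligible = [student for student in frame if include(student)]
    rng = random.Random(seed)  # fixed seed so the draw can be repeated
    return rng.sample(eligible, size)


# One possible definition of the study population: full-time students only.
sample = draw_sample(students, 20, include=lambda s: s["full_time"])

print(len(sample), "students sampled")
print(sum(1 for s in sample if s["on_campus"]), "of them live on campus")
```

Changing the include rule (for example, to on-campus residents only) changes the population about which conclusions can be drawn, which is why the rule should follow from your research purpose.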

Choice of Research Method

A variety of methods are available to the criminal justice researcher. Each method has strengths and weaknesses, and certain concepts are more appropriately studied by some methods than by others.

A survey is the most appropriate method for studying both general concern and fear for personal safety. You might interview students directly or ask them to fill out a questionnaire. As we’ll see in Chapter 7, surveys are especially well suited to the study of individuals’ attitudes and opinions. Thus if you wish to examine whether students who are afraid of crime are more likely to believe that campus lighting should be improved than students who are not afraid, a survey is a good method.

Other methods described in Part Three may be appropriate. Through content analysis (discussed in Chapter 9), you might examine letters to the editor in your campus newspaper and analyze what the writers believe should be done to improve campus safety. Field research (see Chapter 8), in which you observe whether students tend to avoid dark areas of the campus, will help you understand student behavior in avoiding certain areas of the campus at night. Or you might study official complaints made to police and college administrators about crime problems on campus. As you read Part Three, you’ll see ways other research methods might be used to study this topic. Usually the best study design is one that uses more than one research method, taking advantage of their different strengths.

Operationalization

Having specified the concepts to be studied and chosen the research method, you now must develop specific measurement procedures. Operationalization, discussed in Chapter 4, refers to the concrete steps, or operations, used to measure specific concepts.

If you decide to use a survey to study concern about violent crime, your operationalization will take the form of questionnaire items.


Application

The final stage of the research process involves using the research you’ve conducted and the conclusions you’ve reached. To start, you will probably want to communicate your findings so that others will know what you’ve learned. It may be appropriate to prepare, and even publish, a written report. Perhaps you will make oral presentations in class or at a professional meeting. Or you might create a web page that presents your results. Other students will be interested in hearing what you have learned about their concerns about violent crime on campus.

Your study might also be used to actually do something about campus safety. If you find that a large proportion of students you interviewed believe that a parking lot near the library is poorly lighted, university administrators could add more lights or campus police might patrol the area more frequently. Crime prevention programs might be launched in dormitories if residents are more afraid of violent crime than students who live in other types of housing. Students in a Rutgers University class on crime prevention focused on car thefts and break-ins surrounding the campus in Newark, New Jersey. Their semester project presented specific recommendations on how university and city officials could reduce the problem.

Finally, you should consider what your research suggests with regard to further research on your subject. What mistakes should be corrected in future studies? What avenues, opened up slightly in your study, should be pursued in later investigations?

Research Design in Review

In designing a research project, you will find it useful to begin by assessing three things: (1) your interests, (2) your abilities, and (3) the resources available to you. Each of these considerations will suggest a number of possible studies.

What are you interested in understanding? Surely you have several questions about crime and possible policy responses. Why do some

Observations

Having decided what to study, among whom, and by what method, you are ready to make observations, that is, to collect empirical data. The chapters of Part Three, which describe various research methods, discuss the different observation methods appropriate to each.

For a survey of concern about violent crime, you might prepare an electronic questionnaire and e-mail it to a sample selected from the student body or you could have a team of interviewers conduct the survey over the telephone. The relative advantages and disadvantages of these and other possibilities are discussed in Chapter 7.

Analysis

Finally, we manipulate the collected data for the purpose of drawing conclusions that reflect on the interests, ideas, and theories that initiated the inquiry. Chapter 11 describes a few of the many options available to you in analyzing data. Notice in Figure 3.5 that the results of your analyses feed back into your initial interests, ideas, and theories. In practice, this feedback may initiate another cycle of inquiry. In the study of student concern about violent crime, the analysis phase will have both descriptive and explanatory purposes. You might begin by calculating the percentage of students who feel afraid to use specific parking facilities after dark and the percentage who favor or oppose each of the different things that might be done to improve campus safety. Together, these percentages will provide a good picture of student opinion on the issue.

Moving beyond simple description, you might examine the opinions of different subsets of the student body: men versus women; freshmen, sophomores, juniors, seniors, and graduate students; and students who live in dorms versus off-campus apartments. You might then conduct some explanatory analysis to make the point that students who are enrolled in classes that meet in the evening hours are most in favor of improved campus lighting.

76 Part Two Structuring Criminal Justice Inquiry

physical sciences, and it is just as important in criminal justice research.

The Research ProposalResearch proposals describe planned activities and include a budget and time line.

If you undertake a research project—an assign-ment for this course, perhaps, or even a major study funded by the government or a research foundation—you will probably have to provide a research proposal describing what you intend to accomplish and how. We’ll conclude this chapter with a discussion of how you might prepare such a proposal.

Elements of a Research ProposalSome funding agencies have specifi c require-ments for a proposal’s elements, structure, or both. For example, in its research solicitation announcements for the 2007 fi scal year, the Na-tional Institute of Justice (NIJ) describes what should be included in research proposals on such topics as terrorism and elder abuse (www.ojp.usdoj.gov/nij/funding/; accessed May 12, 2008). Your instructor may have certain re-quirements for a research proposal you are to prepare in this course. Here are some basic el-ements that should be included in almost any research proposal.

Problem or Objective What exactly do you want to study? Why is it worth studying? Does the proposed study contribute to our general understanding of crime or policy responses to crime? Does it have practical signifi cance? If your proposal describes an evaluation study, then the problem, objective, or research ques-tions may already be specifi ed for you. For ex-ample, in its request for research on elder abuse issued in 2006, the NIJ required that proposals address certain specifi c items in describing the impact of proposed research:

1. Potential for signifi cant advances in sci-entifi c or technical understanding of the problem

juvenile gangs sell drugs whereas others steal cars? Why do particular neighborhoods near campus seem to have higher rates of burglary? Do sentencing policies discriminate against mi-norities? Do cities with gun control laws have lower murder rates? Is burglary more common in areas near pawnshops? Are sentences for rape more severe in some states than in others? Are mandatory jail sentences more effective than license suspension in reducing repeat drunk-driving offenses? Think for a while about the kinds of questions that interest and concern you.

Once you have a few questions you are inter-ested in answering, think about the kind of in-formation you will need to answer them. What research units of analysis will provide the most relevant information: gangs, burglary victims, drunk drivers, households, community groups, police departments, cities, or states? This ques-tion should be inseparable from the question of research topics. Then ask which aspects of the units of analysis will provide the information you need to answer your research question.

Your next consideration is how to obtain that information. Are the relevant data likely to be already available somewhere (say, in a gov-ernment publication), or will you have to col-lect them yourself ? If you think you will have to collect them, how will you do that? Will it be necessary to observe juvenile gangs, inter-view a large number of burglary victims, or at-tend meetings of community crime prevention groups? Or will you have to design an experi-ment to study sentences for drunk driving?

As you answer these questions, you are well into the process of research design. Once you have a general idea of what you want to study and how, carefully review previous research in journals, books, and government reports to see how other researchers have addressed the topic and what they have learned about it. Your re-view of the literature may lead you to revise your research design; perhaps you will decide to use another researcher’s method or even replicate an earlier study. The independent replication of research projects is a standard procedure in the

2. Potential for significant advances in the field

3. Relevance for improving the policy and practice of criminal justice and related agencies and improving public safety, security, and the quality of life (National Institute of Justice 2006, 8)

Literature Review What have others said about this topic? What theories address it, and what do they say? What research has been done? Are the findings consistent, or do past studies disagree? Are there flaws in the body of existing research that you feel you can remedy?

Research Questions What specific questions will your research try to answer? Given what others have found, as stated in your literature review, what new information do you expect to find? It’s useful to view research questions as a more specific version of the problem or objective described earlier. Then, of course, your specific questions should be framed in the context of what other research has found.

Subjects for Study Whom or what will you study in order to collect data? Identify the subjects in general terms, and then specifically identify who (or what) is available for study and how you will reach them. Is it appropriate to select a sample? If so, how will you do that? If there is any possibility that your research will have an impact on those you study, how will you ensure that they are not harmed by the research? Finally, if you will be interacting directly with human subjects, you will probably have to include a consent form (as described in Chapter 2) in an appendix to your proposal.

Measurement What are the key variables in your study? How will you define and measure them? Do your definitions and measurement methods duplicate (that’s okay, incidentally) or differ from those of previous research on this topic? If you have already developed your measurement device (such as a questionnaire) or if you are using something developed by others, you should include a copy in an appendix to your proposal.

Data Collection Methods How will you actually collect the data for your study? Will you observe behavior directly or conduct a survey? Will you undertake field research, or will you focus on the reanalysis of data already collected by others? Criminal justice research often includes more than one such method.

Analysis Briefly describe the kind of analysis you plan to conduct. Spell out the purpose and logic of your analysis. Are you interested in precise description? Do you intend to explain why things are the way they are? Will you analyze the impact of a new program? What possible explanatory variables will your analysis consider, and how will you know whether you’ve explained the program impact adequately?

References Be sure to include a list of all materials you consulted and cited in your proposal. Formats for citations vary. Your instructor may specify certain formats, or refer you to specific style manuals for guidelines on how to cite books, articles, and web-based resources.

Schedule It is often appropriate to provide a schedule for the various stages of research. Even if you don’t do this for the proposal, do it for yourself. If you don’t have a time line for accomplishing the stages of research and keeping track of how you’re doing, you may end up in trouble.

Budget If you are asking someone to give you money to pay the costs of your research, you will need to provide a budget that specifies where the money will go. Large, expensive projects include budgetary categories such as personnel, equipment, supplies, and expenses (such as travel, copying, and printing). Even for a more modest project you will pay for yourself, it’s a good idea to spend some time anticipating any expenses involved: office supplies, photocopying, computer disks, telephone calls, transportation, and so on.

As you can see, if you are interested in conducting a criminal justice research project, it is a good idea to prepare a research proposal for your own purposes, even if you aren’t required to do so by your instructor or a funding agency. If you are going to invest your time and energy in such a project, you should do what you can to ensure a return on that investment.

✪ Answers to the Units-of-Analysis Exercise

1. Social artifacts (alcohol-related fatal crashes)
2. Groups (countries)
3. Individuals (probationers)
4. Social artifacts (court cases)
5. Organizations (police substations)

✪ Main Points

• Explanatory scientific research centers on the notion of cause and effect.

• Most explanatory social research uses a probabilistic model of causation. X may be said to cause Y if it is seen to have some influence on Y.

• X is a necessary cause of Y if Y cannot happen without X having happened. X is a sufficient cause of Y if Y always happens when X happens.

• Three basic requirements determine a causal relationship in scientific research: (1) the independent variable must occur before the dependent variable, (2) the independent and dependent variables must be empirically related to each other, and (3) the observed relationship cannot be explained away as the effect of another variable.

• When scientists consider whether causal statements are true or false, they are concerned with the validity of causal inference.

• Four classes of threats to validity correspond to the types of questions researchers ask in trying to establish cause and effect. Threats to statistical conclusion validity and internal validity arise from bias. Construct and external validity threats may limit our ability to generalize from an observed relationship.

• A scientific realist approach to examining mechanisms in context bridges idiographic and nomothetic approaches to causation.

• Units of analysis are the people or things whose characteristics researchers observe, describe, and explain. The unit of analysis in criminal justice research is often the individual person, but it may also be a group, organization, or social artifact.

• Researchers sometimes confuse units of analysis, resulting in the ecological fallacy or the individualistic fallacy.

• Cross-sectional studies are those based on observations made at one time. Although such studies are limited by this characteristic, inferences can often be made about processes that occur over time.

• Longitudinal studies are those in which observations are made at many times. Such observations may be made of samples drawn from general populations (trend studies), samples drawn from more specific subpopulations (cohort studies), or the same sample of people each time (panel studies).

• Retrospective studies can sometimes approximate longitudinal studies, but retrospective approaches must be used with care.

• The research process is flexible, involving different steps that are best considered together. The process usually begins with some general interest or idea.

• A research proposal provides an overview of why a study will be undertaken and how it will be conducted. It is a useful device for planning and is required in some circumstances.

✪ Key Terms

cohort study, p. 66
conceptualization, p. 73
construct validity, p. 56
cross-sectional study, p. 66
ecological fallacy, p. 63
external validity, p. 55
internal validity, p. 55
longitudinal study, p. 66
operationalization, p. 74
panel study, p. 67
probabilistic, p. 52
prospective, p. 68
retrospective research, p. 67
scientific realism, p. 60
statistical conclusion validity, p. 53
trend study, p. 66
units of analysis, p. 61
validity, p. 53
validity threats, p. 53

✪ Review Questions and Exercises

1. Discuss one of the following statements in terms of what you have learned about the criteria of causation and threats to the validity of causal inference. What cause-and-effect relationships are implied? What are some alternative explanations?

   a. Guns don’t kill people; people kill people.
   b. Capital punishment prevents murder.
   c. Marijuana is a gateway drug that leads to the use of other drugs.

2. Several times, we have discussed the relationship between drug use and crime. Describe the conditions for each of the following that would lead us to conclude that drug use is:

   a. A necessary cause
   b. A sufficient cause
   c. A necessary and sufficient cause

3. In describing different approaches to the time dimension, criminologist Lawrence Sherman (1995) claimed that cross-sectional studies can show differences and that longitudinal studies can show change. How does this statement relate to the three criteria for inferring causation?

4. William Julius Wilson (1996, 167) cites the following example of why it’s important to think carefully about units and time. Imagine a 13-bed hospital, in which 12 beds are occupied by the same 12 people for one year. The other hospital bed is occupied by 52 people, each staying one week. At any given time, 92 percent of beds are occupied by long-term patients (12 out of 13), but over the entire year, 81 percent of patients are short-term patients (52 out of 64). Discuss the implications of a similar example, using jail cells instead of hospital beds.

✪ Additional Readings

Farrington, David P., Lloyd E. Ohlin, and James Q. Wilson, Understanding and Controlling Crime: Toward a New Research Strategy (New York: Springer-Verlag, 1986). Three highly respected criminologists describe the advantages of longitudinal studies and policy experiments for criminal justice research. The book also presents a research agenda for studying the causes of crime and the effectiveness of policy responses.

Gottfredson, Michael R., and Travis Hirschi, “The Methodological Adequacy of Longitudinal Research on Crime,” Criminology 25 (1987), pp. 581–614. Two other highly respected criminologists point to some of the shortcomings of longitudinal studies.

Maxwell, Joseph A., Qualitative Research Design: An Interactive Approach, 2nd ed. (Thousand Oaks, CA: Sage, 2005). Despite the word qualitative in the title, this book offers excellent advice in progressing from general interests or thoughts to more specific plans for actual research. Each chapter concludes with exercises that incrementally help readers develop research plans.

Pawson, Ray, and Nick Tilley, Realistic Evaluation (Thousand Oaks, CA: Sage, 1997). The authors propose an alternative way of thinking about cause, in the context of what they call “scientific realism.” Although they criticize traditional social science approaches to inferring cause, Pawson and Tilley supplement the classic insights of Cook and Campbell.

Sampson, Robert J., and John H. Laub, Crime in the Making: Pathways and Turning Points Through Life (Cambridge, MA: Harvard University Press, 1993). John H. Laub and Robert J. Sampson, Shared Beginnings, Divergent Lives: Delinquent Boys to Age 70 (Cambridge, MA: Harvard University Press, 2003). The highly acclaimed research described in these two volumes illustrates the longitudinal approach to explanatory research, beginning with juveniles and following their lives through age 70. Sampson and Laub are also attentive to possible validity threats to their findings.

Shadish, William R., Thomas D. Cook, and Donald T. Campbell, Experimental and Quasi-Experimental Designs for Generalized Causal Inference (Boston: Houghton Mifflin, 2002). A recent update to a classic, this book is close to a definitive discussion of cause, validity threats, experiments, and generalizing from research. The authors move far beyond the earlier edition, but somehow the book is more accessible. See especially Chapters 1 through 3 and Chapter 11.
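The arithmetic behind the hospital example in Review Question 4 can be checked with a few lines of Python. This is only a sketch: the function and parameter names are assumptions made for illustration, and the figures are the ones given in the question. It shows why the unit of analysis (beds at one moment versus patients over a year) changes the percentage you report.

```python
def long_term_share_of_beds(long_term_beds: int, short_term_beds: int) -> float:
    """Point-in-time view: share of beds held by long-term patients."""
    return long_term_beds / (long_term_beds + short_term_beds)


def short_term_share_of_patients(
    long_term_beds: int, short_term_beds: int, stays_per_bed: int
) -> float:
    """Full-year view: share of all patients who were short-term patients."""
    short_term_patients = short_term_beds * stays_per_bed
    total_patients = long_term_beds + short_term_patients
    return short_term_patients / total_patients


# Figures from the question: 12 year-long stays, 1 bed turning over weekly.
beds_view = long_term_share_of_beds(long_term_beds=12, short_term_beds=1)
patients_view = short_term_share_of_patients(
    long_term_beds=12, short_term_beds=1, stays_per_bed=52
)

print(f"Beds held by long-term patients at any time: {beds_view:.0%}")
print(f"Patients over the year who were short-term:  {patients_view:.0%}")
```

Substituting jail cells and stay lengths for beds and weeks gives a starting point for the discussion the question asks for.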


Chapter 4

Concepts, Operationalization, and Measurement

It’s essential to specify exactly what we mean (and don’t mean) by the terms we use. This is the first step in the measurement process, and we’ll cover it in depth.

Introduction
Conceptions and Concepts
Conceptualization
Indicators and Dimensions
WHAT IS RECIDIVISM?
Creating Conceptual Order
Operationalization Choices
Measurement as Scoring
JAIL STAY
Exhaustive and Exclusive Measurement
Levels of Measurement
Implications of Levels of Measurement
Criteria for Measurement Quality
Reliability
Validity
Measuring Crime
General Issues in Measuring Crime
UNITS OF ANALYSIS AND MEASURING CRIME
Measures Based on Crimes Known to Police
Victim Surveys
Surveys of Offending
Measuring Crime Summary
Composite Measures
Typologies
An Index of Disorder
Measurement Summary

Introduction

Because measurement is difficult and imprecise, researchers try to describe the measurement process explicitly.

This chapter describes the progression from having a vague idea about what we want to study to being able to recognize it and measure it in the real world. We begin with the general issue of conceptualization, which sets up a foundation for our examination of operationalization and measurement. We then turn to different approaches to assessing measurement quality. The chapter concludes with an overview of strategies for combining individual measures into more complex indicators.

As you read this chapter, keep in mind a central theme: communication. Ultimately, criminal justice and social scientific research seek to communicate findings to an audience, such as professors, classmates, journal readers, or coworkers in a probation services agency. Moving from vague ideas and interests to a completed research report, as we described in Chapter 3, involves communication at every step, from general ideas to more precise definitions of critical terms. With more precise definitions, we can begin to develop measures to apply in the real world.

Conceptions and Concepts

Clarifying abstract mental images is an essential first step in measurement.

If you hear the word recidivism, what image comes to mind? You might think of someone who has served time for burglary and who breaks into a house soon after being released from prison. Or, in contrast to that rather specific image, you might have a more general image of a habitual criminal. Someone who works in a criminal justice agency might have a different mental image. Police officers might think of a specific individual they have arrested repeatedly for a variety of offenses, and a judge might think of a defendant who has three prior convictions for theft.

Ultimately, recidivism is simply a term we use in communication: a word representing a collection of related phenomena that we have either observed or heard about somewhere. It’s as though we have file drawers in our minds containing thousands of sheets of paper, and each sheet has a label in the upper right-hand corner. One sheet of paper in your file drawer has the term recidivism on it, and the person who sits next to you in class has one, too.

The technical name for those mental images, those sheets of paper in our file drawers, is conception. Each sheet of paper is a conception, a subjective thought about things that we encounter in daily life. But those mental images

cannot be communicated directly. There is no way we can directly reveal what’s written on our mental images. Therefore we use the terms written in the upper right-hand corners as a way of communicating about our conceptions and the things we observe that are related to those conceptions.

For example, the word crime represents our conception about certain kinds of behavior. But individuals have different conceptions; they may think of different kinds of behavior when they hear the word crime. Police officers in most states would include possession of marijuana among their conceptions of crime, whereas members of the advocacy group National Organization for the Reform of Marijuana Laws (NORML) would not. Recent burglary victims might recall their own experiences in their conceptions of crime, whereas more fortunate neighbors might think about the murder story in yesterday’s newspaper.

Because conceptions are subjective and cannot be communicated directly, we use the words and symbols of language as a way of communicating about our conceptions and the things we observe that are related to those conceptions.

Concepts are the words or symbols in language that we use to represent these mental images. We use concepts to communicate with one another, to share our mental images. Although a common language enables us to communicate, it is important to recognize that the words and phrases we use represent abstractions. Concepts are abstract because they are independent of the labels we assign to them. Crime as a concept is abstract, meaning that in the English language this label represents mental images of illegal acts. Of course, actual crimes are real events, and our mental images of crime may be based on real events (or the stuff of TV drama). However, when we talk about crime, without being more specific, we are talking about an abstraction. Thus, for example, the concept of crime proposed by Michael Gottfredson and Travis Hirschi (1990, 15), using force or fraud in pursuit of self-interest, is abstract. Crime is the symbol, or label, they have assigned to this concept.

Let’s discuss a specific example. What is your conception of serious crime? What mental images come to mind? Most people agree that airplane hijacking, rape, bank robbery, and murder are serious crimes. What about a physical assault that results in a concussion and facial injuries? Many of us would classify it as a serious crime but not if the incident took place in a boxing ring. Is burglary a serious crime? It doesn’t rank up there with drive-by shooting, but we would probably agree that it is more serious than shoplifting. What about drug use or drug dealing?

Our mental images of serious crime may vary depending on our backgrounds and experiences. If your home has ever been burglarized, you might be more inclined than someone who has not suffered that experience to rate it as a serious crime. If you have been both burglarized and robbed at gunpoint, you would probably think the burglary was less serious than the robbery. There is much disagreement over the seriousness of drug use. Younger people, whether or not they have used drugs, may be less inclined to view drug use as a serious crime, whereas police and other public officials might rank drug use as very serious. California and Oregon are among states that have legalized the use of marijuana for medical purposes. However, as of 2006 the U.S. Department of Justice views all marijuana use as a crime, challenging state laws and raiding San Francisco medical marijuana dispensaries (Murphy 2005).

Serious crime is an abstraction, a label we use to represent a concept. However, we must be careful to distinguish the label we use for a concept from the reality that the concept represents. There are real robberies, and robbery is a serious crime, but the concept of crime seriousness is not real.

The concept of serious crime, then, is a construct created from your conception of it, our

conception of it, and the conceptions of all those who have ever used the term. The concept of serious crime cannot be observed directly or indirectly. We can, however, meaningfully discuss the concept, observe examples of serious crime, and measure it indirectly.

Conceptualization

Day-to-day communication is made possible through general but often vague and unspoken agreements about the use of terms. Usually other people do not understand exactly what we wish to communicate, but they get the general drift of our meaning. Although we may not fully agree about the meaning of the term serious crime, it’s safe to assume that the crime of bank robbery is more serious than the crime of bicycle theft. A wide range of misunderstandings is the price we pay for our imprecision, but somehow we muddle through. Science, however, aims at more than muddling, and it cannot operate in a context of such imprecision.

Conceptualization is the process by which we specify precisely what we mean when we use particular terms. Suppose we want to find out whether violent crime is more serious than nonviolent crime. Most of us would probably assume that is true, but it might be interesting to find out whether it’s really so. Notice that we can’t meaningfully study the issue, let alone agree on the answer, without some precise working agreements about the meanings of the terms we are using. They are working agreements in the sense that they allow us to work on the question.

We begin by clearly differentiating violent and nonviolent crime. In violent crimes, an offender uses force or threats of force against a victim. Nonviolent crimes either do not involve any direct contact between a victim and an offender or involve contact but no force. For example, pickpockets have direct contact with their victims but use no force. In contrast, robbery involves at least the threat to use force on victims. Burglary, auto theft, shoplifting, and the theft of unattended personal property such as bicycles are examples of nonviolent crimes. Assault, rape, robbery, and murder are violent crimes.

Indicators and Dimensions

The end product of the conceptualization process is the specification of a set of indicators of what we have in mind, indicating the presence or absence of the concept we are studying. To illustrate this process, let’s discuss the more general concept of crime seriousness. This concept is more general than serious crime because it implies that some crimes are more serious than others.

One good indicator of crime seriousness is harm to the crime victim. Physical injury is an example of harm, and physical injury is certainly more likely to result from violent crime than from nonviolent crime. What about other kinds of harm? Burglary victims suffer economic harm from property loss and perhaps damage to their homes. Is the loss of $800 in a burglary an indicator of more serious crime than a $10 loss in a robbery in which the victim was not injured? Victims of both violent crime and nonviolent crime may suffer psychological harm. Or people might feel a sense of personal violation after discovering that their home has been burglarized. Other types of victim harm can be combined into groups and subgroups as well.

The technical term for such groupings is dimension, some specifiable aspect of a concept. Thus we might speak of the “victim harm dimension” of crime seriousness. This dimension could include indicators of physical injury, economic loss, or psychological consequences. And we can easily think of other indicators and dimensions related to the general concept of crime seriousness. If we consider the theft of $20 from a poor person to be more serious than the theft of $2,000 from a wealthy oil company chief executive officer, victim wealth might be another dimension. Also consider a victim

identity dimension. Killing a burglar in self-defense would not be as serious as threatening to kill the president of the United States.

It is possible to subdivide the concept of crime seriousness into several dimensions. Specifying dimensions and identifying the various indicators for each of those dimensions are both parts of conceptualization.

Specifying the different dimensions of a concept often paves the way for a better understanding of what we are studying. We might observe that fistfights among high school students result in thousands of injuries per year but that the annual costs of auto theft cause direct economic harm to hundreds of insurance companies and millions of auto insurance policyholders. Recognizing the many dimensions of crime seriousness, we cannot say that violent crime is more serious than nonviolent crime in all cases.

Creating Conceptual Order

The design and execution of criminal justice research requires that we clear away the confusion over concepts and reality. To this end, logicians and scientists have found it useful to distinguish three kinds of definitions: real, conceptual, and operational. With respect to the first of these, Carl G. Hempel (1952, 6) has cautioned:

A “real” definition, according to traditional logic, is not a stipulation determining the meaning of some expression but a statement of the “essential nature” or the “essential attributes” of some entity. The notion

of essential nature, however, is so vague as to render this characterization useless for the purposes of rigorous inquiry.

A real or essential nature definition is inherently subjective. The specification of concepts in scientific inquiry depends instead on conceptual and operational definitions. A conceptual definition is a working definition specifically assigned to a term. In the midst of disagreement and confusion over what a term really means, the scientist specifies a working definition for the purposes of the inquiry. Wishing to examine socioeconomic status (SES), we may simply specify that we are going to treat it as a combination of income and educational attainment. With that definitional decision, we rule out many other possible aspects of SES: occupational status, money in the bank, property, lifestyle, and so forth.

The specification of conceptual definitions does two important things. First, it serves as a specific working definition we present so that readers will understand exactly what we mean by a concept. Second, it focuses our observational strategy. Notice that a conceptual definition does not directly produce observations; rather, it channels our efforts to develop actual measures.

As a next step, we must specify exactly what we will observe, how we will do it, and what interpretations we will place on various possible observations. These further specifications make up the operational definition of the concept, a definition that spells out precisely how the concept will be measured. Strictly

WHAT IS RECIDIVISM?

Tony Fabelo

The Senate Criminal Justice Committee will be studying the record of the corrections system and the use of recidivism rates as a measure of performance for the system. The first task for the committee should be to clearly define recidivism, understand how it is measured, and determine the implications of adopting recidivism rates as measures of performance.

Defining Recidivism

Recidivism is the recurrence of criminal behavior. The rate of recidivism refers to the proportion of a specific group of offenders (for example, those released on parole) who engage in criminal behavior within a given period of time. Indicators of criminal behavior are rearrests, reconvictions, or reincarcerations.

Each of these indicators depends on contact with criminal justice officials and will therefore underestimate the recurrence of criminal behavior. However, criminal behavior that is unreported and not otherwise known to officials in justice agencies is difficult to measure in a consistent and economically feasible fashion.

In 1991, the Criminal Justice Policy Council recommended to the Texas legislature and state criminal justice agencies that recidivism be measured in the following way:

Recidivism rates should be calculated by counting the number of prison releases or number of offenders placed under community supervision who are reincarcerated for a technical violation or new offense within a uniform period of at-risk street time. The at-risk street time can be one, two, or three years, but it must be uniform for the group being tracked so that results are not distorted by uneven at-risk periods. Reincarceration should be measured using data from the “rap sheets” collected by the Texas Department of Public Safety in their Computerized Criminal History system. A centralized source of information reduces reporting errors.

Systemwide Recidivism Rates

Recidivism rates can be reported for all offenders in the system—for all offenders released from

prison or for all offenders placed on probation. This I call systemwide recidivism rates. Approximately 48 percent of offenders released from prison on parole or mandatory supervision, or released from county jails on parole, in 1991 were reincarcerated by 1994 for a new offense or a parole violation.

For offenders released from prison in 1991 the reincarceration recidivism rates three years after release from prison by offense of conviction are listed below:

Burglary   56%      Assault          44%
Robbery    54%      Homicide         40%
Theft      52%      Sexual assault   39%
Drugs      43%      Sex offense      34%

For the same group, the reincarceration recidivism rate three years after release by age group is listed below:

17–25         56%
26–30         52%
31–35         48%
36–40         46%
41 or older   35%

The Meaning of Systemwide Recidivism Rates

The systemwide recidivism rate of prison releases should not be used to measure the performance of institutional programs. There are many socioeconomic factors that can affect systemwide recidivism rates.

For example, the systemwide recidivism rate of offenders released from prison in 1995 declined because of changes in the characteristics of the population released from prison. Offenders are receiving and serving longer sentences, which will raise the average age at release. Therefore performance in terms of systemwide recidivism will improve but not necessarily because of improvements in the delivery of services within the prison system.

On the other hand, the systemwide recidivism rate of felons released from state jail facilities should be expected to be relatively high, because state jail felons are property and drug offenders who tend to have high recidivism rates.

To test your understanding of these measurement steps, return to the beginning of the chapter, where we asked you what image comes to mind in connection with the word recidivism. Recall your own mental image, and compare it with Tony Fabelo’s discussion in the box titled “What Is Recidivism?”

Operationalization Choices

Describing how to obtain empirical measures begins with operationalization.

Recall from Chapter 3 that the research process is not usually a set of steps that proceed in order from first to last. This is especially true of operationalization, the process of developing operational definitions. Although we begin by conceptualizing what we wish to study, once we start to consider operationalization, we may revise our conceptual definition. Developing an operational definition also moves us closer to measurement, which requires that we think about selecting a data collection method as well. In other words, operationalization does not proceed according to a systematic checklist.

To illustrate this fluid process, let’s return to the issue of crime seriousness. Suppose we want to conduct a descriptive study that shows which crimes are more serious and which crimes are less serious.

One obvious dimension of crime seriousness is the penalties that are assigned to different crimes by law. Let’s begin with this conceptualization. Our conceptual definition of crime seriousness is therefore the level of punishment that a state criminal code authorizes for different crimes. Notice that this definition has the distinct advantage of being unambiguous, which leads us to an operational definition something like this:

Consult the Texas Criminal Code. (1) Those crimes that may be punished by death will be judged most serious. (2) Next will be crimes that may be punished by a prison sentence of more than one year. (3) The

speaking, an operational definition is a description of the operations undertaken in measuring a concept.

Pursuing the definition of SES, we might decide to ask the people we are studying three questions:

1. What was your total household income during the past 12 months?
2. How many persons are in your household?
3. What is the highest level of school you have completed?

Next, we need to specify a system for categorizing the answers people give us. For income, we might use the categories “under $25,000,” “$25,000–$35,000,” and so on. Educational attainment might be similarly grouped into categories, and we might simply count the number of people in each household. Finally, we need to specify a way to combine each person’s responses to these three questions to create a measure of SES.

The end result is a working and workable definition of SES. Others might disagree with our conceptualization and operationalization, but the definition has one essential scientific virtue: it is absolutely specific and unambiguous. Even if someone disagrees with our definition, that person will have a good idea of how to interpret our research results because what we mean by SES, as reflected in our analyses and conclusions, is clear.
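One way to see what “absolutely specific” means is to write the operational definition as a procedure. The sketch below is hypothetical throughout: the top income category, the education codes, and the rule for combining the three answers are assumptions made for illustration, not a scoring scheme proposed in the text.

```python
def income_category(income):
    """Code total household income into ordered categories."""
    if income < 25_000:
        return 0  # "under $25,000"
    if income <= 35_000:
        return 1  # "$25,000-$35,000"
    return 2      # assumed catch-all category: over $35,000

# Assumed education categories (the text only says answers would be grouped).
EDUCATION_CODES = {
    "less than high school": 0,
    "high school": 1,
    "college": 2,
}

def ses_score(income, household_size, education):
    """Combine the three survey answers into one SES score.

    The combination rule is invented for this example: add the income and
    education codes, then subtract one point for households of five or more,
    never going below zero.
    """
    score = income_category(income) + EDUCATION_CODES[education]
    if household_size >= 5:
        score -= 1
    return max(score, 0)

example_score = ses_score(income=30_000, household_size=2, education="college")
```

A reader who dislikes this scoring rule can still tell exactly how any respondent was scored, which is the virtue the paragraph above describes.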

Here is a diagram showing the progression of measurement steps from our vague sense of what a term means to specific measurements in a scientific study:

Conceptualization
↓
Conceptual definition
↓
Operational definition
↓
Measurements in the real world

resent how much time you will have to spend studying. The American Bar Association rates nominees to the U.S. Supreme Court as qualified, highly qualified, or not qualified. You might rank last night’s date on the proverbial scale of 1 to 10, reflecting whatever conceptual properties are important to you.

Measurement as Scoring

Another way to think of measurement is in terms of scoring. Your instructor scores exams by counting the right answers and assigning some point value to each answer. Referees keep score at basketball games by counting the number of one-point free throws and two- and three-point field goals for each team. Judges or juries score persons charged with crime by pronouncing “guilty” or “not guilty.” City murder rates are scored by counting the number of murder victims and dividing by the number of city residents.
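The murder rate example is simple enough to write out. In this minimal sketch the function name and the city figures are hypothetical, and expressing the rate per 100,000 residents is a common reporting convention assumed here rather than something stated in the text.

```python
def murder_rate(victims, residents, per=100_000):
    """Murder victims divided by city residents, scaled to `per` residents."""
    return victims * per / residents

# Hypothetical city: 120 murder victims among 800,000 residents.
rate = murder_rate(victims=120, residents=800_000)
```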

Many people consider measurement to be the most important and difficult phase of criminal justice research. It is difficult, in part, because so many basic concepts in criminal justice are not easy to define as specifically as we would like. Without being able to settle on a conceptual definition, we find operationalizing and measuring things challenging. This is illustrated by the box titled “Jail Stay.”

In addition to being challenging, different operationalization choices can produce different results. In the box titled “What Is Recidivism?” Tony Fabelo argues that the at-risk period for comparing recidivism for different groups of offenders should be uniform. It’s possible to examine one-, two-, or three-year rates, but comparisons should use standard at-risk periods. Varying the at-risk period produces, as we might expect, differences in recidivism rates. Evaluating a Texas program that provided drug abuse treatment, Michael Eisenberg (1999, 8) reports rates for different at-risk periods:

                   1-Year   2-Year   3-Year
All participants   14%      37%      42%

least serious crimes are those with jail sentences of less than a year, fines, or both.

The operations undertaken to measure crime seriousness are specific. Our data collection strategy is also clear: go to the library, make a list of crimes described in the Texas Code, and classify each crime into one of the three groups.
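Because the operational definition is a fixed rule, it can be written as a small classification function. The sketch below is an assumption-laden illustration: the penalty records are invented and are not entries from the Texas Code. It also places a sentence of exactly one year in the least serious group, a boundary case the three-part rule does not settle.

```python
def seriousness_group(penalty):
    """Return 1 (most serious), 2, or 3 (least serious) for a penalty record.

    `penalty` is a hypothetical dict with optional keys "death" (bool) and
    "max_years" (maximum term of confinement in years).
    """
    if penalty.get("death", False):
        return 1
    if penalty.get("max_years", 0) > 1:
        return 2
    # Jail terms of a year or less, fines, or both (assumed boundary handling).
    return 3

# Invented penalty records, for illustration only.
crimes = {
    "crime A": {"death": True},
    "crime B": {"max_years": 5},
    "crime C": {"max_years": 0.5, "fine": 2_000},
    "crime D": {"fine": 500},
}

groups = {name: seriousness_group(p) for name, p in crimes.items()}
```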

Note that we have produced rather narrow conceptual and operational definitions of crime seriousness. We might presume that penalties in the Texas Code take into account additional dimensions such as victim harm, offender motivation, and other circumstances of individual crimes. However, the three groups of crimes include very different types of incidents and so do not tell us much about crime seriousness.

An alternative conceptualization of crime seriousness might center on what people think of as serious crime. In this view, crime seriousness is based on people’s beliefs, which may reflect their perceptions of harm to victims, offender motivation, or other dimensions. Conceptualizing crime seriousness in this way suggests a different approach to operationalization: you will present descriptions of various crimes to other students in your class and ask them to indicate how serious they believe the crimes are. If crime seriousness is operationalized in this way, a questionnaire is the most appropriate data collection method.

Operationalization involves describing how actual measurements will be made. The next step, of course, is making the measurements. Royce Singleton and associates (Singleton, Straits, and Straits 2005, 100) define measurement as “the process of assigning numbers or labels to units of analysis in order to represent conceptual properties. This process should be quite familiar to the reader even if the definition is not.”

Think of some examples of the process. Your instructor assigns number or letter grades to exams and papers to represent your mastery of course material. You count the number of pages in this week’s history assignment to rep-

Note that the difference between one- and two-year rates is much larger than that between two- and three-year rates. Operationalizing “recidivism” as a one-year failure rate would be much less accurate than operationalizing the concept as a two-year rate, because recidivism rates seem to stabilize at the two-year point.

Exhaustive and Exclusive Measurement

Briefly revisiting terms introduced in Chapter 1, an attribute is a characteristic or quality of something. Female is an example, as are old and student. Variables, in contrast, are logical sets of attributes. Thus, gender is a variable composed of the attributes female and male. The conceptualization and operationalization processes can be seen as the specification of variables and the attributes composing them. Thus, employment status is a variable that has the attributes employed and unemployed, or the list of attributes could be expanded to include other possibilities such as employed part-time, employed full-time, and retired.

Every variable should have two important qualities. First, the attributes composing it should be exhaustive. If the variable is to have any utility in research, researchers must be able to classify every observation in terms of one of the attributes composing the variable. We will run into trouble if we conceptualize the variable sentence in terms of the attributes prison and fine. After all, some convicted persons are assigned to probation, some have a portion of their prison sentence suspended, and others may receive a mix of prison term, probation, suspended sentence, or perhaps community service. Notice that we could make the list of attributes exhaustive by adding other and combination. Whatever approach we take, we must be able to classify every observation.

At the same time, attributes composing a variable must be mutually exclusive. Researchers must be able to classify every observation

JAIL STAY

Recall from Chapter 1 that two of the general purposes of research are description and explanation. The distinction between them has important implications for the process of definition and measurement. If you have formed the opinion that description is a simpler task than explanation, you may be surprised to learn that definitions can be more problematic for descriptive research than for explanatory research. To illustrate this, we present an example based on an attempt by one of the authors to describe what he thought was a simple concept.

In the course of an evaluation project, Maxfield wished to learn the average number of days people stayed in the Marion County (Indiana) jail. This concept was labeled jail stay. People can be in the county jail for three reasons: (1) They are serving a sentence of one year or less. (2) They are awaiting trial. (3) They are being held temporarily while awaiting transfer to another county or state or to prison. The third category includes people who have been sentenced to prison and are waiting for space to open up, or those who have been arrested and are wanted for some reason in another jurisdiction.

Maxfield vaguely knew these things but did not recognize how they complicated the task of defining and ultimately measuring jail stay. So the original question (“What is the average jail stay?”) was revised to “What is the average jail stay for persons serving sentences and for persons awaiting trial?”

Just as people can be in jail for different reasons, an individual can be in jail for more than one reason. Let’s consider a hypothetical jail resident we’ll call Allan. He was convicted of burglary in July 2002 and sentenced to a year in jail. All but 30 days of his sentence were suspended, meaning that he was freed but could be required to serve the remaining 11 months if he got into trouble again. It did not take long. Two months after being released, Allan was arrested for robbery and returned to jail.

tiveness and mutual exclusiveness are nominal measures. Examples are gender, race, city of residence, college major, Social Security number, and marital status. The attributes composing each of these variables (male and female for the variable gender, for example) are distinct from one another and pretty much cover the conventional possibilities among people. Nominal measures merely offer names or labels for characteristics.

Imagine a group of people being characterized in terms of a nominal variable and physically grouped by the appropriate attributes. Suppose we are at a convention attended by hundreds of police chiefs. At a social function, we ask them to stand together in groups according to the states in which they live: all those from Vermont in one group, those from California in another, and so forth. The variable is state of residence; the attributes are live in Vermont, live in California, and so on. All the people standing in a given group have at least one

in terms of one and only one attribute. Thus, for example, we need to define prison and fine in such a way that nobody can possess both attributes at the same time. That means we must be able to handle the variables for a person whose sentence includes both a prison term and a fine. In this case, attributes could be defined more precisely by specifying prison only, fine only, and both prison and fine.
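A coding rule for the variable sentence can be checked mechanically for both qualities. The sketch below is illustrative only: the function name is hypothetical, and the catch-all attribute “other” stands in for sentences such as probation or community service.

```python
from itertools import product

ATTRIBUTES = ["prison only", "fine only", "both prison and fine", "other"]

def classify_sentence(prison, fine):
    """Assign a sentence to exactly one attribute of the variable."""
    if prison and fine:
        return "both prison and fine"
    if prison:
        return "prison only"
    if fine:
        return "fine only"
    return "other"  # catch-all that keeps the attribute list exhaustive

# Every possible combination of the two inputs receives one attribute
# (exhaustive), and a function can return only one value per case
# (mutually exclusive).
coded = {
    (prison, fine): classify_sentence(prison, fine)
    for prison, fine in product([True, False], repeat=2)
}
```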

Levels of Measurement

Attributes composing any variable must be mutually exclusive and exhaustive. Attributes may be related in other ways as well. Of particular interest is that variables may represent different levels of measurement: nominal, ordinal, interval, and ratio. Levels of measurement tell us what sorts of information we can gain from the scores assigned to the values of a variable.

Nominal Measures. Variables whose attributes have only the characteristics of exhaus-

Now it gets complicated. A judge imposes the remaining 11 months of Allan’s suspended sentence. Allan is denied bail and must wait for his trial in jail. It is soon learned that Allan is wanted by police in Illinois for passing bad checks. Many people would be delighted to send Allan to Illinois; they tell officials in that state they can have him, pending resolution of the situation in Marion County.

Allan is now in jail for three reasons: (1) serving his sentence for the original burglary, (2) awaiting trial on a robbery charge, and (3) waiting for transfer to Illinois.

Is this one jail stay or three? In a sense, it is one jail stay because one person, Allan, is occupying a jail cell. But let’s say Allan’s trial on the robbery charge is delayed until after he completes his sentence for the burglary. He stays in jail and begins a new jail stay. When he comes up for trial, the prosecutor asks to waive the robbery charges against Allan in hopes of exporting him to the neighboring state, and a new jail stay begins as Allan awaits his free trip to Illinois.

You may recognize this as a problem with units of analysis. Is the unit the person who stays in jail? Or are the separate reasons Allan is in jail, which are social artifacts, the units of analysis? After some thought, Maxfield decided that the social artifact was the more appropriate unit because he was interested in whether jail cells are more often occupied by people serving sentences or people awaiting trial. But that produced a new question of how to deal with people like Allan. Do we double-count the overlap in Allan’s jail stays, so that he accounts for two jail stays while serving his suspended sentence for burglary and waiting for the robbery trial? This seemed to make sense, but then Allan’s two jail stays would count the same as two other people with one jail stay each. In other words, Allan would appear to occupy two jail beds at the same time. This was neither true nor helpful in describing how long people stay in jail for different reasons.
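The counting problem shows up as soon as the records are tallied both ways. In this small sketch Allan’s three reasons come from the box, while “Person B” and “Person C” are hypothetical inmates added for contrast.

```python
from collections import Counter

# One record per (person, reason for being in jail).
stay_records = [
    ("Allan", "serving sentence"),
    ("Allan", "awaiting trial"),
    ("Allan", "awaiting transfer"),
    ("Person B", "awaiting trial"),    # hypothetical inmate
    ("Person C", "serving sentence"),  # hypothetical inmate
]

# Unit of analysis = person: how many people occupy a jail bed?
people_in_jail = len({person for person, _ in stay_records})

# Unit of analysis = reason (the social artifact): how many stays of each kind?
stays_by_reason = Counter(reason for _, reason in stay_records)
total_stays = sum(stays_by_reason.values())
```

With these records, three people produce five stays, so tallying by reason makes Allan appear to occupy more than one bed at once.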

school group and the college group, or else the rank order is incorrect.

Interval Measures. When the actual distance that separates the attributes composing some variables does have meaning, the variables are interval measures. The logical distance between attributes can then be expressed in meaningful standard intervals.

Interval measures commonly used in social scientific research are constructed measures such as standardized intelligence tests. The interval that separates IQ scores of 100 and 110 is the same as the interval that separates scores of 110 and 120 by virtue of the distribution of the observed scores of the many thousands of people who have taken the test over the years. Criminal justice researchers often combine individual nominal and ordinal measures to produce a composite interval measure.

Ratio Measures. Most of the social scientific variables that meet the minimum requirements for interval measures also meet the requirements for ratio measures. In ratio measures, the attributes that compose a variable, besides having all the structural characteristics mentioned previously, are based on a true zero point. Examples from criminal justice research are age, dollar value of property loss from burglary, number of prior arrests, blood alcohol content, and length of incarceration.

Returning to the example of various ways to classify police chiefs, we might ask the chiefs to group themselves according to years of experience in their present position. All those new to their job would stand together, as would those with one year of experience, those with two years on the job, and so forth. The facts that members of each group share the same years of experience and that each group has a different shared length of time satisfy the minimum requirements for a nominal measure. Arranging the several groups in a line from those with the least to those with the most experience meets the additional requirements for an ordinal measure and permits us to determine whether one

thing in common; the people in any one group differ from the people in all other groups in that same regard. Where the individual groups are formed, how close they are to one another, and how they are arranged in the room is irrelevant. All that matters is that all the members of a given group share the same state of residence and that each group has a different shared state of residence.

Ordinal Measures. Variables whose attributes may be logically rank ordered are ordinal measures. The different attributes represent relatively more or less of the variable. Examples of variables that can be ordered in some way are opinion of police, occupational status, crime seriousness, and fear of crime.

Let’s pursue the earlier example of grouping police chiefs at a social gathering and imagine that we ask all those who have graduated from college to stand in one group, all those with a high school diploma (but who were not also college graduates) to stand in another group, and all those who have not graduated from high school to stand in a third group. This manner of grouping people satisfies the requirements for exhaustiveness and mutual exclusiveness. In addition, however, we might logically arrange the three groups in terms of their amount of formal education (the shared attribute). We might arrange the three groups in a row, ranging from most to least formal education. This arrangement provides a physical representation of an ordinal measure. If we know which groups two individuals are in, we can determine that one has more, less, or the same formal education as the other.

Note that in this example it is irrelevant how close or far apart the educational groups are from one another. They might stand 5 feet apart or 500 feet apart; the college and high school groups could be 5 feet apart, and the less-than-high-school group might be 500 feet farther down the line. These physical distances have no meaning. The high school group, however, should be between the less-than-high-

The fourth column shows the ranking for each of the 17 crimes in the table; the most serious crime, murder, is ranked 1, followed by rape with injury, and so on. The rankings express only the order of seriousness, however, because the difference between murder (ranked 1) and rape (ranked 2) is smaller than the distance between rape and robbery with injury (ranked 3).

Finally, the crime descriptions presented to respondents indicated the value of property loss for each offense. This is a ratio measure with a true zero point, so that 10 burglaries with a loss of $1,000 each have the same property value as one arson offense with a loss of $10,000.

Specific analytic techniques require variables that meet certain minimum levels of measurement. For example, we could compute the average property loss from the crimes listed in Table 4.1 by adding up the individual numbers in the fifth column and dividing by the number of crimes listed (17). However, we would not be able to compute the average victim type because that is a nominal variable. In that case, we could report the modal (most common) victim type, which is society in Table 4.1.
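Both calculations can be reproduced directly from Table 4.1. The sketch below copies the victim type and property loss columns from the table and uses Python’s standard `statistics` module; the variable names are arbitrary.

```python
from statistics import mean, mode

# (crime, victim type, value of property loss in dollars), from Table 4.1.
table_4_1 = [
    ("Accepting a bribe", "Society", 0),
    ("Arson", "Business", 10_000),
    ("Auto theft", "Home", 12_000),
    ("Burglary", "Business", 100_000),
    ("Burglary", "Home", 1_000),
    ("Buying stolen property", "Society", 0),
    ("Heroin sales", "Society", 0),
    ("Heroin use", "Society", 0),
    ("Murder", "Person", 0),
    ("Obstructing justice", "Society", 0),
    ("Public intoxication", "Society", 0),
    ("Rape and injury", "Person", 0),
    ("Robbery and injury", "Person", 1_000),
    ("Robbery attempt", "Person", 0),
    ("Robbery, no injury", "Person", 1_000),
    ("Shoplifting", "Business", 10),
    ("Trespassing", "Home", 0),
]

# Ratio measure: an arithmetic mean is meaningful.
average_loss = mean(loss for _, _, loss in table_4_1)

# Nominal measure: only the most common category (the mode) is meaningful.
modal_victim = mode(victim for _, victim, _ in table_4_1)
```

The mean works out to roughly $7,354 across the 17 crimes, and the mode is Society, matching the paragraph above.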

Researchers may treat some variables as representing different levels of measurement. Ratio measures are the highest level, followed by interval, ordinal, and nominal. A variable that represents a given level of measurement (say, ratio) may also be treated as representing a lower level of measurement (say, ordinal). For example, age is a ratio measure. If we wish to examine only the relationship between age and some ordinal-level variable, such as delinquency involvement (high, medium, or low), we might choose to treat age as an ordinal-level variable as well. We might characterize the subjects of our study as being young, middle age, or old, specifying the age range for each of those groupings. Finally, age might be used as a nominal-level variable for certain research purposes. Thus people might be grouped as baby boomers if they were born between 1945 and 1955.
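Collapsing a ratio measure into lower levels is a matter of recoding. In the sketch below the age cut points (under 30, 30 to 59, 60 and over) are assumptions chosen for illustration, since the text leaves the ranges to the researcher; the birth-year range is the one given above.

```python
def age_as_ordinal(age):
    """Recode age in years (ratio) into three ordered groups (ordinal)."""
    if age < 30:       # assumed cut point
        return "young"
    if age < 60:       # assumed cut point
        return "middle age"
    return "old"

def is_baby_boomer(birth_year):
    """Recode birth year into a nominal yes/no grouping."""
    return 1945 <= birth_year <= 1955

ages = [19, 34, 61, 45]
ordinal_ages = [age_as_ordinal(a) for a in ages]
```

The recoding runs in one direction only: from the label "young" alone there is no way to recover that the person is 19.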

The analytic uses planned for a given variable, then, should determine the level of

person is more experienced, is less experienced, or has the same level of experience as another. If we arrange the groups so that there is the same distance between each pair of adjacent groups, we satisfy the additional requirements of an interval measure and can say how much more experience one chief has than another. Finally, because one of the attributes included (experience) has a true zero point (police chiefs just appointed to their job), the phalanx of hapless convention goers also meets the requirements for a ratio measure, permitting us to say that one person is twice as experienced as another.

Implications of Levels of Measurement

To review this discussion and to illustrate why level of measurement may make a difference, consider Table 4.1. It presents information on crime seriousness adapted from a survey of crime severity conducted for the Bureau of Justice Statistics (Wolfgang, Figlio, Tracy, and Singer 1985). The survey presented brief descriptions of more than 200 different crimes to a sample of 60,000 people. Respondents were asked to assign a score to each crime based on how serious they thought the crime was compared with bicycle theft (scored at 10).

The first column in Table 4.1 lists some of the crimes described. The second column shows a nominal measure that identifies the victim in the crime: home, person, business, or society. Type of victim is an attribute of each crime. The third column lists seriousness scores computed from survey results, ranging from 0.6 for trespassing to 35.7 for murder. These seriousness scores are interval measures because the distance between, for example, auto theft (at 8.0) and accepting a bribe (at 9.0) is the same as that between accepting a bribe (at 9.0) and obstructing justice (at 10.0). Seriousness scores are not ratio measures; there is no absolute zero point, and three instances of obstructing justice (at 10.0) do not equal one rape with injury (at 30.0).

that compose a variable. Saying that a woman is 43 years old is more precise than that she is in her forties. Describing a felony sentence as 18 months is more precise than more than one year.

As a general rule, precise measurements are superior to imprecise ones, as common sense would suggest. Precision is not always necessary or desirable, however. If knowing that a felony sentence is more than one year is sufficient for your research purpose, then any additional effort invested in learning the precise sentence would be wasted. The operationalization of concepts, then, must be guided partly by an understanding of the degree of precision required. If your needs are not clear, be more precise rather than less.

But don’t confuse precision with accuracy. Describing someone as “born in Stowe, Vermont” is more precise than “born in New England,” but suppose the person in question was actually born in Boston? The less precise de-

measurement to be sought, with the realization that some variables are inherently limited to a certain level. If a variable is to be used in a variety of ways that require different levels of measurement, the study should be designed to achieve the highest level possible. Although ratio measures such as number of arrests can later be reduced to ordinal or nominal ones, it is not possible to convert a nominal or ordinal measure to a ratio one. More generally, you cannot convert a lower-level measure to a higher-level one. That is a one-way street worth remembering.

Criteria for Measurement Quality

The key standards for measurement quality are reliability and validity.

Measurements can be made with varying degrees of precision, which refers to the fineness of the distinctions made between the attributes

Table 4.1 Crime Seriousness and Levels of Measurement

Crime                     Victim      Seriousness Score   Rank   Value of Property Loss
Accepting a bribe         Society      9.0                 9     0
Arson                     Business    12.7                 6     $10,000
Auto theft                Home         8.0                10     $12,000
Burglary                  Business    15.5                 5     $100,000
Burglary                  Home         9.6                 8     $1,000
Buying stolen property    Society      5.0                12     0
Heroin sales              Society     20.6                 4     0
Heroin use                Society      6.5                11     0
Murder                    Person      35.7                 1     0
Obstructing justice       Society     10.0                 7     0
Public intoxication       Society      0.8                15     0
Rape and injury           Person      30.0                 2     0
Robbery and injury        Person      21.0                 3     $1,000
Robbery attempt           Person       3.3                13     0
Robbery, no injury        Person       8.0                10     $1,000
Shoplifting               Business     2.2                14     $10
Trespassing               Home         0.6                16     0

Source: Adapted from Wolfgang, Figlio, Tracy, and Singer (1985).

search. For example, forensic DNA evidence is increasingly being used in violent crime cases. A National Research Council (1996) study found a variety of errors in laboratory procedures, including sample mishandling, evidence contamination, and analyst bias. These are measurement reliability problems that can lead to unwarranted exclusion of evidence or to the conviction of innocent people. Irregularities in DNA tests by a Texas crime lab led to the exoneration of at least one previously convicted defendant and prompted reviews of hundreds of additional cases (McVicker and Khanna 2003).

Reliability problems crop up in many forms. Reliability is a concern every time a single observer is the source of data because we have no way to guard against that observer’s subjectivity. We can’t tell for sure how much of what’s reported represents true variation and how much is due to the observer’s unique perceptions.

Reliability can also be an issue when more than one observer makes measurements. Survey researchers have long known that different interviewers get different answers from respondents as a result of their own attitudes and demeanor. Or we may want to classify a few hundred community anticrime groups into a set of categories created by the National Institute of Justice. A police officer and a neighborhood activist are unlikely to classify all those groups into the same categories; such inconsistency would be an example of reliability problems.

How do we create reliable measures? Because the problem of reliability is a basic one in criminal justice measurement, researchers have developed a number of techniques for dealing with it.

The Test–Retest Method. Sometimes it is appropriate to make the same measurement more than once. If there is no reason to expect the information to change, we should expect the same response every time. If answers vary, however, then the measurement method is, to the extent of that variation, unreliable. Here’s an illustration.

scription, in this instance, is more accurate; it’s a better reflection of the real world. This is a point worth keeping in mind. Many criminal justice measures are imprecise, so reporting approximate values is often preferable.

Precision and accuracy are obviously important qualities in research measurement, and they probably need no further explanation. When criminal justice researchers construct and evaluate measurements, they pay special attention to two technical considerations: reliability and validity.

ReliabilityFundamentally, reliability is a matter of whether a particular measurement technique, applied repeatedly to the same thing, will yield the same result each time. In other words, mea-surement reliability is roughly the same as mea-surement consistency or stability. Imagine a po-lice offi cer standing on the street, guessing the speed of cars that pass by and issuing speeding tickets based on that judgment. If you received a ticket from this offi cer and went to court to contest it, you would almost certainly win your case. The judge would no doubt reject this way of measuring speed, regardless of the police of-fi cer’s experience. The reliability or consistency of this method of measuring vehicle speed is questionable at best. If the same police offi -cer used a radar speed detector, however, it is doubtful that you would be able to beat the ticket. The radar device is judged a much more reliable way of measuring speed.

Reliability, though, does not ensure accuracy any more than precision does. The speedometer in your car may be a reliable instrument for measuring speed, but it is common for speedometers to be off by a few miles per hour, especially at higher speeds. If your speedometer shows 55 miles per hour when you are actually traveling at 60, it gives you a consistent but inaccurate reading that might attract the attention of police officers with more accurate radar guns.

Measurement reliability is often a problem with indicators used in criminal justice re-

94 Part Two Structuring Criminal Justice Inquiry

supervisor call a subsample of the respondents on the telephone and verify selected information. West and Farrington (1977, 173) checked interrater reliability in their study of London youths and found few significant differences in results obtained from different interviewers.

Comparing measurements from different raters works in other situations as well. Michael Geerken (1994) presents an important discussion of reliability problems that researchers are likely to encounter in measuring prior arrests through police rap sheets. Duplicate entries, the use of aliases, and the need to transform official crime categories into a smaller number of categories for analysis are among the problems Geerken cites. One way to increase consistency in translating official records into research measures (a process often referred to as coding) is to have more than one person code a sample of records and then compare the consistency of coding decisions made by each person. This approach was used by Michael Maxfield and Cathy Spatz Widom (1996) in their analysis of adult arrests of child abuse victims.

In general, whenever researchers are concerned that measures obtained through coding may not be classified reliably, they should have each record coded independently by different people. A great deal of disagreement among coders would most likely be due to ambiguity in operational definitions.
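Comparing the decisions of two coders can be made concrete with a small calculation. The sketch below is illustrative only: the offense categories and the two lists of codes are invented, and Cohen's kappa is a commonly used chance-corrected agreement statistic that is brought in here as an assumption, not a method the studies cited above are said to have used.

```python
from collections import Counter

def percent_agreement(codes_a, codes_b):
    """Share of records that both coders placed in the same category."""
    if len(codes_a) != len(codes_b) or not codes_a:
        raise ValueError("both coders must code the same non-empty set of records")
    return sum(a == b for a, b in zip(codes_a, codes_b)) / len(codes_a)

def cohens_kappa(codes_a, codes_b):
    """Agreement corrected for the agreement expected by chance alone."""
    n = len(codes_a)
    observed = percent_agreement(codes_a, codes_b)
    freq_a, freq_b = Counter(codes_a), Counter(codes_b)
    # Chance agreement: product of each coder's marginal proportions, summed over categories.
    expected = sum(freq_a[c] * freq_b[c] for c in freq_a.keys() | freq_b.keys()) / (n * n)
    # Undefined (division by zero) when both coders use one identical category throughout.
    return (observed - expected) / (1 - expected)

# Hypothetical codes assigned to the same six arrest records by two coders.
coder_a = ["violent", "property", "property", "drug", "violent", "property"]
coder_b = ["violent", "property", "drug", "drug", "violent", "property"]

agreement = percent_agreement(coder_a, coder_b)
kappa = cohens_kappa(coder_a, coder_b)
print(f"Percent agreement: {agreement:.2f}, kappa: {kappa:.2f}")
```

Low values on either statistic would point back to the operational definitions: the records the coders disagree on are the ones to reread when tightening category descriptions.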

The reliability of measurements is a fundamental issue in criminal justice research, and we’ll return to it in the chapters to come. For now, however, we hasten to point out that even total reliability doesn’t ensure that our measures actually measure what we think they measure. That brings us to the issue of validity.

ValidityIn conventional usage, the term validity means that an empirical measure adequately refl ects the meaning of the concept under consider-ation. Put another way, measurement validity involves whether you are really measuring what you say you are measuring. Recall that an oper-

In their classic research on delinquency in England, Donald West and David Farrington (1977) interviewed a sample of 411 males from a working-class area of London at age 16 and again at age 18. The subjects were asked to describe a variety of aspects of their lives, including educational and work history, leisure pursuits, drinking and smoking habits, delinquent activities, and experience with police and courts.

West and Farrington assessed reliability in several ways. One was to compare responses from the interview at age 18 with those from the interview at age 16. For example, in each interview, the youths were asked at what age they left school. In most cases, there were few discrepancies in stated age from one interview to the next, which led the authors to conclude, “There was therefore no systematic tendency for youths either to increase or lessen their claimed period of school attendance as they grew older, as might have occurred if they had wanted either to exaggerate or to underplay their educational attainments” (1977, 76–77). If West and Farrington had found less consistency in answers to this and other items, they would have had good reason to doubt the truthfulness of responses to more sensitive questions. The test–retest method suggested to the authors that memory lapses were the most common source of minor differences.

Although this method can be a useful reliability check, it is limited in some respects. Faulty memory may produce inconsistent responses if there is a lengthy gap between the initial interview and the retest. A different problem can arise in trying to use the test–retest method to check the reliability of attitude or opinion measures. If the test–retest interval is short, then answers given in the second interview may be affected by earlier responses if subjects try to be consistent.
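A test–retest check of this kind amounts to comparing each respondent's answers across the two interviews. The following is a minimal sketch under stated assumptions: the respondent IDs and ages are invented for illustration and are not taken from West and Farrington's data, and the function name is hypothetical.

```python
# Hypothetical answers to "At what age did you leave school?" from two interviews.
# Respondent IDs and values are made up for illustration.
wave_1 = {"r01": 15, "r02": 16, "r03": 15, "r04": 16, "r05": 15}
wave_2 = {"r01": 15, "r02": 16, "r03": 16, "r04": 16, "r05": 15}

def retest_agreement(first, second):
    """Return the share of respondents whose two answers match."""
    shared = first.keys() & second.keys()  # only people interviewed both times
    if not shared:
        raise ValueError("no respondents appear in both waves")
    matches = sum(1 for rid in shared if first[rid] == second[rid])
    return matches / len(shared)

agreement = retest_agreement(wave_1, wave_2)
print(f"Share of consistent answers: {agreement:.0%}")
```

A simple match rate suits a factual item like school-leaving age. For attitude items, where the text notes that a short interval can inflate consistency, a high match rate would need more cautious interpretation.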

Interrater Reliability It is also possible for measurement unreliability to be generated by research workers, such as interviewers and coders. To guard against interviewer unreliability, it is common practice in surveys to have a

Chapter 4 Concepts, Operationalization, and Measurement 95

as valid; this is sometimes referred to as convergent validity. The validity of College Board exams, for example, is shown in their ability to predict the success of students in college.

Timothy Heeren and associates (Heeren, Smith, Morelock, and Hingson 1985) offer a good example of criterion-related validity in their efforts to validate a measure of alcohol-related auto fatalities. Of course, conducting a blood alcohol laboratory test on everyone killed in auto accidents would be a valid measure. Not all states regularly do this, however, so Heeren and colleagues tested the validity of an alternative measure: single-vehicle fatal accidents involving male drivers occurring between 8:00 p.m. and 3:00 a.m. The validity of this measure was shown by comparing it with the blood alcohol test results for all drivers killed in states that reliably conducted such tests in fatal accidents. Because the two measures agreed closely, Heeren and associates claimed that the proxy, or surrogate, measure would be valid in other states.
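One way to picture how closely a proxy tracks a criterion is to correlate the two across the units where both are available. The sketch below is an assumption-laden illustration: the counts are invented, and the Pearson correlation is used only as one plausible way to express agreement, not as the statistic Heeren and associates reported.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation between two equal-length lists of numbers."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two lists of equal length with at least two values")
    mean_x, mean_y = sum(xs) / len(xs), sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

# Hypothetical yearly counts for five states that test all drivers killed in crashes.
criterion_blood_test = [120, 85, 200, 150, 60]   # fatalities with a positive blood alcohol test
proxy_night_single = [110, 90, 190, 160, 55]     # single-vehicle nighttime fatalities, male drivers

r = pearson_r(criterion_blood_test, proxy_night_single)
print(f"Correlation between proxy and criterion: {r:.2f}")
```

A strong correlation in states with good testing is what would justify using the proxy elsewhere; it remains an inference that the relationship holds in states without such testing.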

Another approach to criterion-related validity is to show that our measure of a concept is different from measures of similar but distinct concepts. This is called discriminant validity, meaning that measures can discriminate between different concepts.

Sometimes it is difficult to find behavioral criteria that can be used to validate measures as directly as described here. In those instances, however, we can often approximate such criteria by considering how the variable in question ought, theoretically, to relate to other variables.

Construct Validity Construct validity is based on the logical relationships among variables. Let’s suppose that we are interested in studying fear of crime, including its sources and consequences. As part of our research, we develop a measure of fear of crime, and we want to assess its validity.

In addition to our measure, we will also develop certain theoretical expectations about the way the variable fear of crime relates to other variables. For instance, it’s reasonable to conclude

ational defi nition specifi es the operations you will perform to measure a concept. Does your operational defi nition accurately refl ect the concept you are interested in? If the answer is yes, you have a valid measure. A radar gun is a valid measure of vehicle speed, but a wind ve-locity indicator is not because it measures total wind speed, not vehicle speed with respect to the ground.

Although methods for assessing reliability are relatively straightforward, it is more difficult to demonstrate that individual measures are valid. Because concepts are not real, but abstract, we cannot directly demonstrate that measures, which are real, are actually measuring an abstract concept. Nevertheless, researchers have some ways of dealing with the issue of validity.

Face Validity First, there’s something called face validity. Particular empirical measures may or may not jibe with our common agreements and our individual mental images about a particular concept. We might debate the adequacy of measuring satisfaction with police services by counting the number of citizen complaints registered by the mayor’s office, but we’d surely agree that the number of citizen complaints has something to do with levels of satisfaction. If someone suggested that we measure satisfaction with police by finding out whether people like to watch police dramas on TV, we would probably agree that the measure has no face validity; it simply does not make sense.

Second, there are many concrete agreements among researchers about how to measure certain basic concepts. The Census Bureau, for example, has created operational definitions of such concepts as family, household, and employment status that seem to have a workable validity in most studies using those concepts.

Criterion-Related Validity A more formal way to assess validity is to compare a measure with some external criterion, known as criterion-related validity. A measure can be validated by showing that it predicts scores on another measure that is generally accepted


measure delinquency and criminality. But how valid are survey questions that ask people how many crimes they have committed?

The approach used by West and Farrington (and by others) is to ask people, for example, how many times they have committed robbery and how many times they have been arrested for that crime. Those who admit to having been arrested for robbery are asked when and where the arrest occurred. Self-reports can then be validated by checking police arrest records. This works two ways: (1) it is possible to validate individual reports of being arrested for robbery, and (2) researchers can check police records for all persons interviewed to see if there are any records of robbery arrests that subjects do not disclose to interviewers.
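The two-way check described here maps naturally onto set comparisons. The sketch below uses invented subject IDs and assumes, purely for illustration, that both sources can be reduced to the set of interviewed subjects with a robbery arrest.

```python
# Hypothetical subject IDs; neither set reflects real data.
self_reported_arrest = {"s01", "s04", "s07"}   # subjects who told interviewers about a robbery arrest
police_record_arrest = {"s01", "s07", "s09"}   # interviewed subjects with a robbery arrest on record

# (1) Validate individual self-reports against police records.
confirmed = self_reported_arrest & police_record_arrest
unconfirmed = self_reported_arrest - police_record_arrest

# (2) Look for recorded arrests that subjects did not disclose.
undisclosed = police_record_arrest - self_reported_arrest

print("Confirmed by records:", sorted(confirmed))
print("Reported but not found in records:", sorted(unconfirmed))
print("On record but not disclosed:", sorted(undisclosed))
```

Neither mismatch category settles which source is wrong. An unconfirmed report could reflect incomplete records, which is the point of treating the two as multiple measures.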

Figure 4.1 illustrates the difference between validity and reliability. Think of measurement as analogous to hitting the bull’s-eye on a target. A reliable measure produces a tight pattern, regardless of where it hits, because reliability is a function of consistency. Validity, in contrast, relates to the arrangement of shots around the bull’s-eye. The failure of reliability in the figure can be seen as a random error; the failure of validity is a systematic error. Notice that neither an unreliable nor an invalid measure is likely to be very useful.
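The distinction between random and systematic error can be illustrated with simulated readings. This is a sketch with assumed numbers: the true speed of 60 and the 5 mph offset echo the speedometer example earlier in the chapter, while the noise levels, sample size, and seed are arbitrary choices.

```python
import random
from statistics import mean, stdev

random.seed(42)          # fixed seed so the illustration is repeatable
TRUE_SPEED = 60.0        # the value we are trying to measure
N = 1000                 # number of simulated readings (arbitrary)

# Reliable but not valid: tight pattern, consistently 5 mph too low (systematic error).
reliable_not_valid = [TRUE_SPEED - 5 + random.gauss(0, 0.5) for _ in range(N)]

# Valid but not reliable: centered on the truth, widely scattered (random error).
valid_not_reliable = [TRUE_SPEED + random.gauss(0, 5) for _ in range(N)]

for label, readings in [("reliable, not valid", reliable_not_valid),
                        ("valid, not reliable", valid_not_reliable)]:
    bias = mean(readings) - TRUE_SPEED
    print(f"{label}: average error {bias:+.2f}, spread {stdev(readings):.2f}")
```

Averaging many readings shrinks random error but leaves the systematic offset untouched, which is why a consistent instrument can still be wrong.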

that people who are afraid of crime are less likely to leave their homes at night for entertainment than people who are not afraid of crime. If our measure of fear of crime relates to how often people go out at night in the expected fashion, that constitutes evidence of our measure’s construct validity. However, if people who are afraid of crime are just as likely to go out at night as people who are not afraid, that challenges the validity of our measure. This and related points about measures of fear are nicely illustrated by Jason Ditton and Stephen Farrall in their analysis of data from England (2007).

Tests of construct validity, then, can offer a weight of evidence that our measure either does or doesn’t tap the quality we want it to measure, without providing definitive proof.

Multiple Measures Another approach to validation of an individual measure is to compare it with alternative measures of the same concept. The use of multiple measures is similar to establishing criterion validity. However, the use of multiple measures does not necessarily assume that the criterion measure is always more accurate. For example, many crimes never result in an arrest, so arrests are not good measures of how many crimes are committed by individuals. Self-report surveys have often been used to

Figure 4.1 Analogy to Validity and Reliability (panels: Reliable but not valid; Valid but not reliable; Valid and reliable)


self-interest, a term that has engaged philosophers and social scientists for centuries.

James Q. Wilson and Richard Herrnstein (1985, 22) propose a different definition that should get us started: “A crime is any act committed in violation of a law that prohibits it and authorizes punishment for its commission.” Although other criminologists (such as Gottfredson and Hirschi) might not agree with this conceptual definition, it has the advantage of being reasonably specific. We could be even more specific by consulting a state or federal code and listing the types of acts for which the law provides punishment.

Our list would be very long. In fact, one of the principal difficulties we encounter when we try to measure crime is that many different types of behaviors and actions are included in our conceptualization of crime as an act committed in violation of a law that prohibits it and authorizes punishment for its commission, but we may be interested in only a small subset of things included under such a broad definition. Different measures tend to focus on different types of crime, primarily because not all crimes can be measured the same way with any degree of reliability or validity. Therefore one important step in selecting a measure is deciding what crimes will be included.

What Units of Analysis? Recall that units of analysis are the specific entities researchers collect information about. Chapter 3 considered individuals, groups, social artifacts, and other units of analysis. Deciding how to measure crime requires that we once again think about these units.

Crimes involve four elements that are often easier to recognize in the abstract than they are to actually measure: offender, victim, offense, and incident. The most basic of these elements is the offender. Without an offender, there’s no crime, so a crime must, at a minimum, involve an offender. The offender is therefore one possible unit of analysis. We might decide to study

Measuring Crime

Different approaches to measuring crime illustrate basic principles in conceptualization and measurement.

By way of illustrating basic principles in measurement, we now focus more narrowly on different ways of measuring crime. Crime is a fundamental dependent variable in criminal justice and criminology. Explanatory studies frequently seek to learn what causes crime, whereas applied studies often focus on what actions might be effective in reducing crime. Descriptive and exploratory studies may simply wish to count how much crime there is in a specific area, a question of obvious concern to criminal justice officials as well as researchers.

Crime can also be an independent variable—for example, in a study of how crime affects fear or other attitudes or of whether people who live in high-crime areas are more likely than others to favor long prison sentences for drug dealers. Sometimes crime can be both an independent and a dependent variable, as in a study about the relationship between drug use and other offenses.

General Issues in Measuring Crime

At the outset, we must consider two general questions that influence whatever approach we might take to measuring crime: (1) How will we conceptualize crime? (2) What units of analysis should be used?

Conceptualization Let’s begin by proposing a conceptual definition of crime, one that will enable us to decide what specific types of crime we’ll measure. Recall a definition from Michael Gottfredson and Travis Hirschi (1990, 15), mentioned earlier: “acts of force or fraud undertaken in pursuit of self-interest.” This is an interesting definition, but it is better suited to an extended discussion of theories of crime than to our purposes in this chapter. For example, we would have to clarify what was meant by


“one or more offenses committed by the same offender, or group of offenders acting in concert, at the same time and place” (Federal Bureau of Investigation 2000, 17; emphasis in original).

Think about the difference between offense and incident for a moment. A single incident can include multiple offenses, but it’s not possible to have one offense and multiple incidents.

To illustrate the different units of analysis (offenders, victims, offenses, and incidents), consider the examples in the box titled “Units of Analysis and Measuring Crime.” These examples help distinguish units from each other and illustrate the links among different units. Notice that we have said nothing about aggregate units of analysis, a topic we examined in Chapter 3. We have considered only individual units, even though measures of crime are often based on aggregate units of analysis: neighborhoods, cities, counties, states, and so on.

We cover units at some length because they play a critical, and often overlooked, role in developing operational definitions, to which our attention now turns.

Measures Based on Crimes Known to Police

The most widely used measures of crime are based on police records and are commonly

burglars, auto thieves, bank robbers, child molesters, drug dealers, or people who have committed many different types of offenses.

Crimes also require some sort of victim, the second possible unit of analysis. We could study victims of burglary, auto theft, bank robbery, or assault. Notice that this list of victims includes different types of units: households or businesses for burglary, car owners for auto theft, banks for bank robbery, and individuals for assault. Some of these units are organizations (banks, businesses), some are individual people, some are abstractions (households), and some are ambiguous (individuals or organizations can own automobiles).

What about so-called victimless crimes like drug use, bookmaking, or prostitution? In a legal sense, victimless crimes do not exist because crimes are acts that injure society, organizations, or individuals. But studying crimes in which only society is the victim (prostitution, for example) presents special challenges, and specialized techniques have been developed to measure certain types of victimless crimes.

The final two elements of crimes, offense and incident, are closely intertwined and so will be discussed together. An offense is defined as an individual act of burglary, auto theft, bank robbery, and so on. The FBI defines incident as

UNITS OF ANALYSIS AND MEASURING CRIME

Figuring out the different units of analysis in counting crimes can be difficult and confusing at first. Much of the problem comes from the possibility of what database designers call one-to-many and many-to-many relationships. The same incident can have multiple offenses, offenders, and victims or just one of each. Fortunately, thinking through some examples usually clarifies the matter. Our two examples are adapted from an FBI publication (2000, 18).

Example 1

Two males entered a bar. The bartender was forced at gunpoint to hand over all money from the cash register. The offenders also took money and jewelry from three customers. One of the offenders used his handgun to beat one of the customers, thereby causing serious injury. Both offenders fled on foot.

One incident
   One robbery offense
      Two offenders
      Four victims (bar owner, three patrons)
   One aggravated assault offense
      Two offenders
      One victim


cle theft (Federal Bureau of Investigation 2007). Other offenses, referred to as Part II crimes, are counted only if a person has been arrested and charged with a crime. The UCR therefore does not include such offenses as shoplifting, drug sale or use, fraud, prostitution, simple as-sault, vandalism, receiving stolen property, and all other nontraffi c offenses unless someone is arrested. This means that a large number of crimes reported to police are not measured in the UCR.

Another source of measurement error in the UCR is produced by the hierarchy rule used by police agencies and the FBI to classify crimes. Under the hierarchy rule, if multiple crimes are committed in a single incident, only the most serious is counted in the UCR. For example, if a burglar breaks into a home, rapes one of the occupants, and flees in the homeowner’s car, at least three crimes are committed: burglary, rape, and vehicle theft. Under the FBI hierarchy rule, however, only the most serious crime, rape, is counted in the UCR, even though the offender could be charged with all three offenses. In the examples described in the box “Units of Analysis and Measuring Crime,” the UCR would count one offense in each incident: a single robbery in the first example and rape in the second.
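The hierarchy rule behaves like a "keep only the top-ranked item" filter applied to each incident. The sketch below is a simplified illustration: the seriousness ranking is assumed to follow the order in which this chapter lists Part I offenses, the offense labels are shortened, and the function name is hypothetical. It is not the FBI's classification procedure.

```python
# Assumed ranking, most serious first, following the order in which the chapter
# lists Part I offenses. Labels are simplified for illustration.
PART_I_RANKING = [
    "murder",
    "rape",
    "robbery",
    "aggravated assault",
    "burglary",
    "larceny-theft",
    "motor vehicle theft",
]

def apply_hierarchy_rule(offenses):
    """Return the single offense a summary count would keep for one incident."""
    ranked = [offense for offense in offenses if offense in PART_I_RANKING]
    if not ranked:
        return None  # no Part I offense in this incident
    return min(ranked, key=PART_I_RANKING.index)

# The home break-in example from the text.
print(apply_hierarchy_rule(["burglary", "rape", "motor vehicle theft"]))

# Example 1 from the box: a robbery and an aggravated assault in one incident.
print(apply_hierarchy_rule(["robbery", "aggravated assault"]))
```

Every offense that is not returned disappears from the summary count, which is the measurement error the paragraph above describes.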

referred to as crimes known to police. This phrase is at the core of police-based operational definitions and has important implications for understanding what police records do and do not measure. The most obvious implication is that crimes not known to police cannot be measured by consulting police records. Other features of measures based on police records can best be understood by considering specific examples.

Uniform Crime Reports Police measures of crime form the basis for the FBI’s Uniform Crime Reports (UCR), a data series that has been collected since 1930 and has been widely used by criminal justice researchers. But certain characteristics and procedures related to the UCR affect its suitability as a measure of crime. Most of our comments highlight shortcomings in this regard, but keep in mind that the UCR is and will continue to be a very useful measure for researchers and public officials.

The UCR does not even try to count all crimes reported to police. What are referred to as Part I offenses are counted if these offenses are re-ported to police (and recorded by police). Part I offenses include murder and non-negligent manslaughter, forcible rape, robbery, aggravated assault, burglary, larceny-theft, and motor vehi-

Even though only one offender actually assaulted the bar patron, the other offender would be charged with assisting in the offense because he prevented others from coming to the aid of the assault victim.

Example 2

Two males entered a bar. The bartender was forced at gunpoint to hand over all the money from the cash register. The offenders also took money and jewelry from two customers. One of the offenders, in searching for more people to rob, found a customer in a back room and raped her there, outside the view of the other offender. When the rapist returned, both offenders fled on foot.

This example includes two incidents because the rape occurred in a different place and the offenders were not acting in concert. And because they were not acting in concert in the same place, only one offender was associated with the rape incident.

Incident 1
   One robbery offense
      Two offenders
      Three victims (bar owner, two patrons)
Incident 2
   One rape offense
      One offender
      One victim
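The one-to-many links among incidents, offenses, offenders, and victims in Example 2 can be sketched as a small data structure. The class names, field names, and person labels below are hypothetical, chosen only for illustration; in particular, which of the two offenders committed the rape is not specified in the example, so "offender A" is an arbitrary label.

```python
from dataclasses import dataclass, field

@dataclass
class Offense:
    offense_type: str
    offenders: list   # labels of people linked to this offense
    victims: list     # labels of victims of this offense

@dataclass
class Incident:
    incident_id: int
    offenses: list = field(default_factory=list)

# Example 2 from the box, with made-up labels for the people involved.
example_2 = [
    Incident(1, [Offense("robbery",
                         ["offender A", "offender B"],
                         ["bar owner", "patron 1", "patron 2"])]),
    Incident(2, [Offense("rape", ["offender A"], ["patron 3"])]),
]

def count_units(incidents):
    """Count each unit of analysis, counting every distinct person once."""
    offenses = [o for incident in incidents for o in incident.offenses]
    offenders = {person for o in offenses for person in o.offenders}
    victims = {person for o in offenses for person in o.victims}
    return {
        "incidents": len(incidents),
        "offenses": len(offenses),
        "offenders": len(offenders),
        "victims": len(victims),
    }

counts = count_units(example_2)
print(counts)
```

The same offender appears in both incidents but is counted once as a person, which shows why the answer to "how many?" depends on the unit of analysis chosen.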


duct descriptive and explanatory studies of individual events. For example, it’s possible to compare the relationship between victim and offender for male victims and female victims or to compare the types of weapons used in kill-ings by strangers and killings by nonstrangers. Such analyses are not possible if we are study-ing homicide using UCR summary data.

Crime measures based on incidents as units of analysis therefore have several advantages over summary measures. It’s important to keep in mind, however, that SHR data still represent crimes known to police and recorded by police.

The National Incident-Based Reporting System The most recent development in police-based measures at the national level is the ongoing effort by the FBI and the Bureau of Justice Statistics (BJS) to convert the UCR to a National Incident-Based Reporting System (NIBRS, pronounced “ny-bers”). Planning for replacement of the UCR began in the mid-1980s, but because NIBRS represents major changes, law enforcement agencies have shifted only gradually to the new system.

Briefly, NIBRS is a Very Big Deal. For example, let’s consider NIBRS and the UCR crime measures for a single state, Idaho. Nationwide, about 17,000 law enforcement agencies report UCR summary data each year; that’s 17,000 annual observations, one for each reporting agency. In 2004, 106 agencies in Idaho reported UCR data, so Idaho submitted a maximum of 106 observations for 2004. Under NIBRS, Idaho reported over 95,000 incidents in 2004 (Idaho State Police 2005). In other words, rather than reporting 106 summary crime counts for eight UCR Part I offenses, Idaho reported detailed information on 95,522 individual incidents. And this is Idaho, which ranked 39th among the states in year-2000 resident population!

In addition, NIBRS guidelines call for gath-ering more detailed information about a much broader array of offenses. Whereas the UCR re-ports information about seven Part I offenses (plus arson), NIBRS is designed to collect de-

Before we move on to other approaches to measuring crime, consider another important way units of analysis figure into UCR data. The UCR system produces what is referred to as a summary-based measure of crime. This means that UCR data include summary, or total, crime counts from reporting agencies, that is, cities or counties. UCR summary data therefore represent groups as units of analysis. Crime reports are available for cities or counties, and these may be aggregated upward to measure crime for states or regions of the United States. But UCR data available from the FBI cannot represent individual crimes, offenders, or victims as units.

Recall that it is possible to aggregate units of analysis to higher levels, but it is not possible to disaggregate grouped data to the individual level. Because UCR data are aggregates, they cannot be used in descriptive or explanatory studies that focus on individual crimes, offenders, or victims. UCR data are therefore restricted to the analysis of such units as cities, counties, states, or regions.
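The one-way nature of aggregation is easy to see in a small example. The incident records below are invented (the city names, offenses, and victim ages are placeholders); the sketch only shows that rolling records up to summary counts discards the individual-level detail.

```python
from collections import Counter

# Hypothetical incident-level records with one victim attribute each.
incidents = [
    {"city": "Springfield", "offense": "burglary", "victim_age": 34},
    {"city": "Springfield", "offense": "burglary", "victim_age": 71},
    {"city": "Springfield", "offense": "robbery", "victim_age": 22},
    {"city": "Riverton", "offense": "burglary", "victim_age": 45},
]

# Aggregating upward: one count per (city, offense), as in a summary-based system.
summary = Counter((record["city"], record["offense"]) for record in incidents)
print(summary)

# Aggregating further, to offense totals across all cities.
offense_totals = Counter()
for (city, offense), count in summary.items():
    offense_totals[offense] += count
print(offense_totals)

# Going the other way is not possible: the summary holds only counts, so questions
# about individual victims (for example, their ages) cannot be answered from it.
```

Starting from `summary` alone, nothing recovers the `victim_age` values, which is the sense in which grouped data cannot be disaggregated.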

Incident-Based Police Records The U.S. Department of Justice sponsors two series of crime measures that are based on incidents as units of analysis. The first of these incident-based measures, Supplementary Homicide Reports (SHR), was begun in 1961 and is part of the UCR program, as implied by supplementary.

Local law enforcement agencies submit detailed information about individual homicide incidents under the SHR program. This includes information about victims and, if known, offenders (age, gender, race); the relationship between victim and offender; the weapon used; the location of the incident; and the circumstances surrounding the killing. Notice how the SHR relates to our discussion of units of analysis. Incidents are the basic unit and can include one or more victims and offenders; because the series is restricted to homicides, offense is held constant.

Because the SHR is an incident-based sys-tem, investigators can use SHR data to con-


a larger number of law enforcement agencies. In fact, many agencies have developed their own incident-based records systems independent of NIBRS, largely because of major advances in computing technology (Maxfield 1999). Furthermore, researchers are beginning to analyze

tailed information on 46 Group A offenses. Table 4.2 shows what kinds of information are collected for offenses, victims, and offenders under NIBRS. Table 4.3 shows NIBRS crime data for Idaho in 2004. Compare the top part of the table, reporting crime counts for UCR index offenses, to the bottom part. Additional NIBRS Group A offenses more than double the number of crimes “known to police” in Idaho (43,611 UCR Part I, plus 51,911 additional Group A). Simple assault and vandalism are by far the most common of these additional offenses, but drug violations accounted for al-most 13,000 offenses in 2004 (drug violations plus drug equipment violations).

Collecting detailed information on each incident for each offense, victim, and offender, and doing so for a large number of offense types, represents the most significant changes in NIBRS compared with the UCR. Dropping the hierarchy rule is also a major change, but that is a consequence of incident-based reporting.

In the future, incident-based police records will become more readily available and will cover

Table 4.2 Selected Information in National Incident-Based Reporting System Records

Administrative Segment: Incident date and time; Reporting agency ID; Other ID numbers

Offense Segment: Offense type; Attempted or completed; Offender drug/alcohol use; Location type; Weapon use

Victim Segment: Victim ID number; Offense type; Victim age, gender, race; Resident of jurisdiction?; Type of injury; Relationship to offender; Victim type (individual person, business, government, society/public)

Offender Segment: Offender ID number; Offender age, gender, race

Source: Adapted from Federal Bureau of Investigation (2000, 6–8, 90).

Table 4.3 Crime in Idaho, 2004

UCR Part I Offenses

Murder, non-negligent manslaughter 35

Rape 577

Robbery 247

Aggravated assault 2,594

Burglary 7,700

Larceny 29,442

Motor vehicle theft 2,696

Arson 320

Subtotal 43,611

Additional NIBRS Group A Offenses

Simple assault 14,192

Intimidation 1,766

Bribery 7

Counterfeit/forgery 1,982

Destruction of property 14,516

Drug violations 6,667

Drug equipment violations 6,329

Embezzlement 295

Extortion/blackmail 12

Fraud 2,426

Gambling 9

Kidnapping/abduction 236

Pornography/obscene material 33

Prostitution 10

Forcible sex offenses 1,200

Nonforcible sex offenses 215

Stolen property 631

Weapons violations 1,385

Subtotal 51,911

Group A Total 95,522

Source: Adapted from Idaho Department of Law Enforcement, “Crime in Idaho, 2004,” www.isp.state.id.us/identification/ucr/2004/crime_in_Idaho_2004.html; accessed May 13, 2008.
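The figures in Table 4.3 can be checked with simple arithmetic. The sketch below copies the counts from the table into two dictionaries (the variable names are arbitrary) and recomputes the subtotals, the Group A total, and the combined drug offense count mentioned in the text.

```python
# Counts copied from Table 4.3 (Crime in Idaho, 2004).
ucr_part_i = {
    "Murder, non-negligent manslaughter": 35, "Rape": 577, "Robbery": 247,
    "Aggravated assault": 2594, "Burglary": 7700, "Larceny": 29442,
    "Motor vehicle theft": 2696, "Arson": 320,
}
additional_group_a = {
    "Simple assault": 14192, "Intimidation": 1766, "Bribery": 7,
    "Counterfeit/forgery": 1982, "Destruction of property": 14516,
    "Drug violations": 6667, "Drug equipment violations": 6329,
    "Embezzlement": 295, "Extortion/blackmail": 12, "Fraud": 2426,
    "Gambling": 9, "Kidnapping/abduction": 236,
    "Pornography/obscene material": 33, "Prostitution": 10,
    "Forcible sex offenses": 1200, "Nonforcible sex offenses": 215,
    "Stolen property": 631, "Weapons violations": 1385,
}

part_i_subtotal = sum(ucr_part_i.values())
additional_subtotal = sum(additional_group_a.values())
group_a_total = part_i_subtotal + additional_subtotal
drug_offenses = (additional_group_a["Drug violations"]
                 + additional_group_a["Drug equipment violations"])

print(f"UCR Part I subtotal: {part_i_subtotal:,}")
print(f"Additional Group A subtotal: {additional_subtotal:,}")
print(f"Group A total: {group_a_total:,}")
print(f"Drug plus drug equipment violations: {drug_offenses:,}")
print(f"Group A total relative to Part I: {group_a_total / part_i_subtotal:.2f}x")
```

The ratio printed on the last line makes the "more than double" comparison in the text explicit.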


ing household members. Samples of banks, gas stations, retail stores, business establishments, or stockbrokers would be needed to measure those crimes. In much the same fashion, crimes directed at homeless victims cannot be counted by surveys of households like the NCVS.

What about victimless crimes? For example, think about how you would respond to a Census Bureau interviewer who asked whether you had been the victim of a drug sale. If you have bought illegal drugs, you might think of yourself as a customer rather than as a victim. Or if you lived near a park where drug sales were common, you might think of yourself as a victim even though you did not participate in a drug transaction. The point is that victim surveys are not good measures of victimless crimes because the respondents can’t easily be conceived as victims.

Measuring certain forms of delinquency through victim surveys presents similar problems. Status offenses such as truancy and curfew violations do not have identifiable victims who can be included in samples based on households. Homicide and manslaughter are other crimes that are not well measured by victim surveys, for obvious reasons.

Since its inception, the NCVS has served as a measure to monitor the volume of crime, including crimes not reported to police. In a regular series of publications, the BJS reports annual victimization data together with analysis of victimization for special topics such as carjackings (Klaus 2004), intimate partner violence (Rand and Rennison 2005), and contacts between individuals and the police (Durose, Schmitt, and Langan 2005). In addition, the NCVS is a valuable tool for researchers who take advantage of detailed information about individual victimizations to examine such topics as victimization in public schools (Dinkes, Cataldi, Lin-Kelly, et al. 2007), identity theft (Baum 2006), victimization at work (Duhart 2001), and why domestic violence victimizations may or may not be reported to police (Felson, Messner, Hoskin, et al. 2002).

NIBRS data, something that is certain to prompt other researchers to do the same. For examples, see studies of child abuse (Finkelhor and Ormrod 2004; Snyder 2000), hate crimes (Nolan, Akiyama, and Berhanu 2002), and domestic violence (Vazquez, Stohr, and Perkiss 2005).

Victim Surveys

Conducting a victim survey that asks people whether they have been the victim of a crime is an alternative approach to operationalization. In principle, measuring crime through surveys has several advantages. Surveys can obtain information on crimes that were not reported to police. Asking people about victimizations can also measure incidents that police may not have officially recorded as crimes. Finally, asking people about crimes that may have happened to them provides data on victims and offenders (individuals) and on the incidents themselves (social artifacts). Like an incident-based reporting system, a survey can therefore provide more disaggregated units of analysis.

The National Crime Victimization Survey

Since 1972, the U.S. Census Bureau has conducted the NCVS. The NCVS is based on a nationally representative sample of households and uses uniform procedures to select and interview respondents, which enhances the reliability of crime measures. Because individual people living in households are interviewed, the NCVS can be used in studies in which individuals or households are the unit of analysis. And the NCVS uses a panel design, interviewing respondents from the same household seven times at six-month intervals.

The NCVS cannot measure all crimes, however, in part because of the procedures used to select victims. Because the survey is based on a sample of households, it cannot count crimes in which businesses or commercial establishments are the victims. Bank robberies, gas station holdups, shoplifting, embezzlement, and securities fraud are examples of crimes that cannot be systematically counted by interview-


Surveys of Offending

Just as survey techniques can measure crime by asking people to describe their experiences as victims, self-report surveys ask people about crimes they may have committed. We might initially be skeptical of this technique: how truthful are people when asked about crimes they may have committed? Many people do not wish to disclose illegal behavior to interviewers even if they are assured of confidentiality. Others might deliberately lie to interviewers and exaggerate the number of offenses they have committed. Our concern would be justified, although researchers have devised various methods to enhance the validity and reliability of self-report data; we will examine these in Chapter 7.

In any event, self-report surveys are the best method for operationalizing certain crimes that are poorly measured by other techniques. Thinking about the other methods we have discussed (crimes known to police and victimization surveys) suggests several examples. Crimes such as prostitution and drug abuse are excluded from victimization surveys and underestimated by police records of people arrested for these offenses. Public order crimes and delinquency are other examples. A third class of offenses that might be better counted by self-report surveys is crimes that are rarely reported to or observed by police; shoplifting and drunk driving are examples.

Think of it this way: As we saw earlier, all crimes require an offender. Not all crimes have clearly identifiable victims who can be interviewed, however, and not all crimes are readily observed by police, victims, or witnesses. If we can’t observe the offense and can’t interview a victim, what’s the next logical step?

There are no nationwide efforts to systematically collect self-report measures on a variety of offenses, as is the case with the UCR and NCVS. Instead, periodic surveys yield information either on specific types of crime or on crimes committed by a specific target population. We will briefly consider two ongoing self-report surveys here.

Community Victimization Surveys

Following the initial development of victim survey methods in the late 1960s, the Census Bureau completed a series of city-level surveys. These were discontinued for a variety of reasons, but researchers and officials in the BJS occasionally conducted city-level victim surveys in specific communities. In 1998, the BJS and the Office of Community Oriented Policing Services (COPS) launched pilot surveys in 12 large and medium-sized cities (Smith, Steadman, Minton, and Townsend 1999).

The city-level initiative underscores one of the chief advantages of measuring crime through victim surveys: obtaining counts of incidents not reported to police. In large part, city-level surveys were promoted by BJS and COPS to enable local law enforcement agencies to better understand the scope of crime, reported and unreported, in their communities. Notice also the title of the first report: “Criminal Victimization and Perceptions of Community Safety in 12 Cities, 1998.” We emphasize perceptions to illustrate that city-level surveys can be valuable tools for implementing community policing, a key component of the 1994 Crime Bill that provided billions of dollars to hire new police officers nationwide. It is significant that the Department of Justice recognized the potential value of survey measures of crime and perceptions of community safety to develop and evaluate community policing.

The initial BJS/COPS effort was a pilot test of new methods for conducting city-level surveys. These bureaus jointly developed a guidebook and software so that local law enforcement agencies and other groups can conduct their own community surveys (Weisel 1999). These tools also promise to be useful for researchers who wish to study local patterns of crime and individual responses. Although the community survey initiative lapsed after George W. Bush became president, the BJS continues to update software and make it available on its website (www.ojp.usdoj.gov/bjs/abstract/cvs.htm; accessed May 13, 2008).


includes several samples of high school students and other groups, totaling about 49,500 respondents in 2004 (Johnston et al. 2005).

Each spring, between 120 and 140 high schools are sampled within particular geographic areas. In larger high schools, samples of up to 350 seniors are selected; in smaller schools, all seniors may participate. Students fill out computer scan sheets containing batteries of questions that include self-reported use of alcohol, tobacco, and illegal drugs. In most cases, students record their answers in classrooms during normal school hours.

The core sample of the MTF, surveys of high school seniors, thus provides a cross section for measuring annual drug use and other illegal acts. Now recall our discussion of the time dimension in Chapter 3. Each year, both the MTF and the NSDUH survey drug use for a cross section of high school seniors and adults in households, thus providing a snapshot of annual rates of self-reported drug use. Examining annual results from the MTF and the NSDUH over time provides a time series, or trend study, that enables researchers and policy makers to detect changes in drug use among high school seniors and adults. Finally, a series of follow-up samples of MTF respondents constitute a series of panel studies whereby changes in drug use among individual respondents can be studied over time. Thomas Mieczkowski (1996) presents an excellent discussion of these two surveys and compares self-reported drug use from each series over time.

Measuring Crime Summary

Table 4.4 summarizes the strengths and weaknesses of different measures of crime. The UCR and SHR provide the best counts for murder and crimes in which the victim is a business or a commercial establishment. Crimes against persons or households that are not reported to police are best counted by the NCVS. Usually these are less serious crimes, many of them UCR Part II incidents that are counted only if a sus-

National Survey on Drug Use and Health

Like the NCVS, the National Survey on Drug Use and Health (NSDUH) is based on a national sample of households. Currently sponsored by the Substance Abuse and Mental Health Services Administration in the U.S. Department of Health and Human Services, the NSDUH samples households and household residents ages 12 and older. In the 2004 sample, about 68,000 individuals responded to questions regarding their use of illegal drugs, alcohol, and tobacco (Substance Abuse and Mental Health Services Administration 2005). Because it has been conducted for more than three decades, the NSDUH provides information on trends and changes in drug use among respondents. The 2004 survey was designed to obtain statistically reliable samples from the eight largest states in addition to the overall national sample.

Think for a moment about what sorts of questions we would ask to learn about people’s experience in using illegal drugs. Among other things, we would probably want to distinguish someone who tried marijuana once from daily users. The drug use survey does this by including questions to distinguish lifetime use (ever used) of different drugs from current use (used within the past month). You may or may not agree that use in the past month represents current use, but it is the standard used in regular reports on NSDUH results. That’s the operational definition of current use.
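To see how an operational definition turns a concept into a concrete rule, the distinction between lifetime use and current use can be sketched in a few lines of Python. This is only an illustration, not the NSDUH's actual procedure: the function name, the idea of recording a date of most recent use, and the choice of 30 days to stand for "the past month" are all assumptions made for this example.

```python
from datetime import date, timedelta
from typing import Optional


def classify_use(last_use: Optional[date], interview: date,
                 window_days: int = 30) -> dict:
    """Classify one respondent's use of one drug.

    last_use    -- date of most recent use, or None if never used (hypothetical field)
    interview   -- date of the interview
    window_days -- assumed length of "the past month" (30 days is our assumption)
    """
    # Lifetime use: the respondent has ever used the drug.
    lifetime = last_use is not None
    # Current use: most recent use falls within the window before the interview.
    current = lifetime and (
        timedelta(0) <= interview - last_use <= timedelta(days=window_days)
    )
    return {"lifetime": lifetime, "current": current}


interview_day = date(2004, 6, 1)
print(classify_use(None, interview_day))               # never used
print(classify_use(date(2004, 5, 20), interview_day))  # used 12 days earlier
print(classify_use(date(2003, 1, 1), interview_day))   # used over a year earlier
```

Changing `window_days` changes who counts as a current user, which is exactly why the operational definition has to be stated explicitly.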

Monitoring the Future

Our second example is different in two respects: (1) it targets a specific population, and (2) it asks sampled respondents a broader variety of questions.

Since 1975, the National Institute on Drug Abuse has sponsored an annual survey of high school seniors, Monitoring the Future: A Continuing Study of the Lifestyles and Values of Youth, or the MTF for short. As its long title implies, the MTF survey is intended to monitor the behaviors, attitudes, and values of young people. The MTF


ever measure of crime best suits their research purpose.

Composite Measures

Combining individual measures often produces more valid and reliable indicators.

Sometimes it is possible to construct a single measure that captures the variable of interest. For example, asking auto owners whether their car has been stolen in the previous six months is a straightforward way to measure auto-theft victimization. But other variables may be better measured by more than one indicator. To begin with a simple and well-known example, the FBI crime index was a composite measure of crime that combined police reports for seven different offenses into one indicator.

Composite measures are frequently used in criminal justice research for three reasons. First, despite carefully designing studies to

pect is arrested. Recent changes in NCVS procedures have improved counts of sexual assault and other violent victimizations. Compared with the UCR, NIBRS potentially provides much greater detail for a broader range of offenses. NIBRS complements the NCVS by including disaggregated incident-based reports for state and local areas and by recording detailed information on crimes against children younger than age 12.

Self-report surveys are best at measuring crimes that do not have readily identifiable victims and that are less often observed by or reported to police. The two self-report surveys listed in Table 4.4 sample different populations and use different interview procedures.

Don’t forget that all crime measures are selective, so it’s important to understand the selection process. Despite their various flaws, the measures of crime available to you can serve many research purposes. Researchers are best advised to be critical and careful users of what-

Table 4.4 Measuring Crime Summary

Measure | Units | Target Population | Crime Coverage | Best Count for

Known to police
UCR | Aggregate: reporting agency | All law enforcement agencies; 98% reporting | Limited number of reported and recorded crimes | Commercial and business victims
SHR | Incident | All law enforcement agencies; 98% reporting | Homicides only | Homicides
NIBRS | Incident | All law enforcement agencies; limited reporting | Extensive | Details on local incidents; victims under age 12

Surveys
NCVS | Victimization, individuals and households | Individuals in households | Household and personal crimes | Household and personal crimes not reported to police
NSDUH | Individual respondent, offender | Individuals in households | Drug use | Drug use by adults in households
MTF | Individual respondent, offender | High school seniors; follow-up on sample | Substance use, delinquency, offending | Drug use by high school seniors


very nearly maintaining the specific details of all the individual indicators.

Typologies

Researchers combine variables in different ways to produce different composite measures. The simplest of these is a typology, sometimes called a taxonomy. Typologies are produced by the intersection of two or more variables to create a set of categories or types. We may, for example, wish to classify people according to the range of their experience in criminal court. Assume we have asked a sample of people whether they have ever served as a juror and whether they have ever testified as a witness in criminal court. Table 4.5 shows how the yes and no responses to these two questions can be combined into a typology of experience in court.

Typologies can be more complex, combining scores on three or more measures or combining scores on two measures that take many different values. For an example of a complex typology, consider research by Rolf Loeber and associates (Loeber, Stouthamer-Loeber, von Kammen, and Farrington 1991) on patterns of delinquency over time. The researchers used a longitudinal design in which a sample of boys was selected from Pittsburgh public schools and interviewed many times. Some questions asked about their involvement in delinquency and criminal offending. This approach made it possible to distinguish boys who reported different types of offending at different times.

Loeber and associates first classified delinquent and criminal acts into the following ordinal seriousness categories (1991, 44):

None: No self-reported delinquency
Minor: Theft of items worth less than $5, vandalism, fare evasion
Moderate: Theft more than $5, gang fighting, carrying weapons
Serious: Car theft, breaking and entering, forced sex, selling drugs

provide valid and reliable measurements of variables, the researcher is often unable to develop single indicators of complex concepts. That is especially true with regard to attitudes and opinions that are measured through surveys. For example, measuring fear of crime through a question that asks about feelings of safety on neighborhood streets measures some dimensions of fear but certainly not all of them. This leads us to question the validity of using that single question to measure fear of crime.

Second, we may wish to use a rather refined ordinal measure of a variable, arranging cases in several ordinal categories from very low to very high according to a variable such as degree of parental supervision. A single data item might not have enough categories to provide the desired range of variation, but an index or scale formed from several items would.

Finally, indexes and scales are efficient devices for data analysis. If a single data item gives only a rough indication of a given variable, considering several data items may give us a more comprehensive and more accurate indication. For example, the results of a single drug test would give us some indication of drug use by a probationer. Examining results from several drug tests would give us a better indication, but the manipulation of several data items simultaneously can be very complicated. In contrast, composite measures are efficient data reduction devices. Several indicators may be summarized in a single numerical score, even while perhaps

Table 4.5 Typology of Court Experience

                            Serve on Jury?
                            No      Yes
Testify as Witness?   No    A       B
                      Yes   C       D

Typology
A: No experience with court
B: Experience as juror only
C: Experience as witness only
D: Experience as juror and witness
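The cross-classification in Table 4.5 can also be written as a small lookup. The Python sketch below is only an illustration of how two yes/no answers intersect to give one of four types; the function and variable names are invented for this example.

```python
# Keys are (served_on_jury, testified_as_witness); values are Table 4.5 types.
COURT_TYPOLOGY = {
    (False, False): "A",  # no experience with court
    (True, False): "B",   # experience as juror only
    (False, True): "C",   # experience as witness only
    (True, True): "D",    # experience as juror and witness
}

TYPE_LABELS = {
    "A": "No experience with court",
    "B": "Experience as juror only",
    "C": "Experience as witness only",
    "D": "Experience as juror and witness",
}


def court_experience_type(served_on_jury: bool, testified_as_witness: bool) -> str:
    """Return the typology category (A-D) for one respondent."""
    return COURT_TYPOLOGY[(served_on_jury, testified_as_witness)]


# A respondent who served on a jury but never testified as a witness:
category = court_experience_type(served_on_jury=True, testified_as_witness=False)
print(category, "-", TYPE_LABELS[category])
```

Because every combination of answers maps to exactly one type, the four categories are mutually exclusive and exhaustive.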


low-up) with four categories each are reduced to a single variable with six categories. Furthermore, the two measures of delinquency are themselves composite measures, produced by summarizing self-reports of a large number of individual offenses. Finally, notice also how this efficiency is reflected in the clear meaning of the new composite measure. This dynamic typology summarizes information about time, offending, and offense seriousness in a single measure.

An Index of Disorder

“What is disorder, and what isn’t?” asks Wesley Skogan (1990a) in his book on the links between crime, fear, and social problems such as public drinking, drug use, litter, prostitution, panhandling, dilapidated buildings, and groups of boisterous youths. In an influential article titled “Broken Windows,” James Q. Wilson and George Kelling (1982) describe disorder as a sign of crime that may contribute independently to fear and crime itself. The argument goes something like this: Disorder is a symbol of urban decay that people associate with crime. Signs of disorder can produce two related problems. First, disorder may contribute to fear of crime, as urban residents believe that physical decay and undesirables are symbols of crime. Second, potential offenders may interpret evidence of disorder as a signal that informal social control mechanisms in a neighborhood have broken down and that the area is fair game for mayhem and predation.

We all have some sort of mental image (conception) of disorder, but, to paraphrase Skogan’s question: how do we measure it? Let’s begin by distinguishing two conceptions of disorder. First, we can focus on the physical presence of disorder: whether litter, public drinking, public drug use, and the like are evident in an urban neighborhood. We might measure the physical presence of disorder through a series of systematic observations. This is the approach used by Robert Sampson and Stephen Raudenbush (1999) in their study of links

Next, to measure changes in delinquency over time, the researchers compared reports of delinquency from the first screening interview with reports from follow-up interviews. These two measures, delinquency at time 1 and delinquency at time 2, formed the typology, which they referred to as a “dynamic classification of offenders” (1991, 44). Table 4.6 summarizes this typology.

The first category in the table, nondelinquent, includes those boys who reported committing no offenses at both the screening and follow-up interviews. Starters reported no offenses at screening and then minor, moderate, or serious delinquency at follow-up, whereas desistors were just the opposite. Those who committed the same types of offenses at both times were labeled stable; deescalators reported committing less serious offenses at follow-up; and escalators moved on to more serious offenses.

Notice the efficiency of this typology. Two variables (delinquency at screening and fol-

Table 4.6 Typology of Change in Juvenile Offending

Typology | Screening (Time 1) | Follow-Up (Time 2)
A. Nondelinquent | 0 | 0
B. Starter | 0 | 1, 2, or 3
C. Desistor | 1, 2, or 3 | 0
D. Stable | 1 | 1
D. Stable | 2 | 2
D. Stable | 3 | 3
E. Deescalator | 3 | 2
E. Deescalator | 2 or 3 | 1
F. Escalator | 1 | 2 or 3
F. Escalator | 2 | 3

Juvenile Offending Typology
0: None
1: Minor
2: Moderate
3: Serious

Source: Adapted from Loeber and associates (1991, 43–46).
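The rules in Table 4.6 can be expressed as a short classification function. The Python sketch below is our own illustration of the table's logic, not code from Loeber and associates; the function name and the use of integer seriousness scores 0 to 3 as inputs are assumptions.

```python
def classify_change(screening: int, follow_up: int) -> str:
    """Map seriousness scores (0=None, 1=Minor, 2=Moderate, 3=Serious)
    at screening (time 1) and follow-up (time 2) to a Table 4.6 category."""
    for score in (screening, follow_up):
        if score not in (0, 1, 2, 3):
            raise ValueError("seriousness scores must be 0, 1, 2, or 3")

    if screening == 0 and follow_up == 0:
        return "Nondelinquent"
    if screening == 0:
        return "Starter"       # no offenses at time 1, some at time 2
    if follow_up == 0:
        return "Desistor"      # some offenses at time 1, none at time 2
    if screening == follow_up:
        return "Stable"        # same seriousness at both times
    if follow_up < screening:
        return "Deescalator"   # less serious offending at follow-up
    return "Escalator"         # more serious offending at follow-up


# Sixteen possible score pairs reduce to six categories.
categories = {classify_change(t1, t2) for t1 in range(4) for t2 in range(4)}
print(sorted(categories))
```

Running the function over all sixteen pairs of scores shows the data reduction at work: two four-category variables collapse into one six-category variable.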


measure different types of disorder and appear to have reasonable face validity. However, examining the relationship between each individual item and respondents’ fear of crime or experience as a crime victim would be unwieldy at best. So Skogan created two indexes, one for social disorder and one for physical disorder, by adding up the scores for each item and dividing by the number of items in each group. Figure 4.2 shows a hypothetical sample questionnaire for these nine items, together with the scores that would be produced for each index.

This example illustrates how several related

between disorder and crime in Chicago. Unfortunately, these authors observed very few examples of disorder and altogether ignored the question of whether such behaviors were perceived as problematic by residents of Chicago neighborhoods.

That brings us to the second conception, one focusing on the perception of disorder. Thus some people might view public drinking as disorderly, whereas others (New Orleans residents, for example) consider public drinking to be perfectly acceptable. Questionnaires and survey methods are best suited for measuring perceived disorder.

Skogan used questions about nine different examples of disorder and classified them into two groups representing what he calls social and physical disorder (Skogan 1990a, 51, 191). Questions corresponding to each of these examples of disorder asked respondents to rate them as big problems (scored 2), some problem (scored 1), or almost no problem (scored 0) in their neighborhood. Together, these nine items

Introduction: Now I’m going to read you a list of crime-related problems that may be found in some parts of the city. For each one, please tell me how much of a problem it is in your neighborhood. Is it a big problem, some problem, or almost no problem?

Item | Big problem | Some problem | No problem
(S) Groups of people loitering | [2] | 1 | 0
(S) People using or selling drugs | 2 | [1] | 0
(P) Abandoned buildings | 2 | 1 | [0]
(S) Vandalism | 2 | [1] | 0
(P) Garbage and litter on street | [2] | 1 | 0
(S) Gangs and gang activity | 2 | 1 | [0]
(S) People drinking in public | [2] | 1 | 0
(P) Junk in vacant lots | 2 | 1 | [0]
(S) People making rude or insulting remarks | 2 | [1] | 0

Brackets mark the response circled for each item.

(S) Social = 2 + 1 + 1 + 0 + 2 + 1 = 7; Index score = 7/6 = 1.17
(P) Physical = 0 + 2 + 0 = 2; Index score = 2/3 = 0.67

Figure 4.2 Index of Disorder

Social Disorder: Groups of loiterers; Drug use and sales; Vandalism; Gang activity; Public drinking; Street harassment
Physical Disorder: Abandoned buildings; Garbage and litter; Junk in vacant lots
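The index arithmetic in Figure 4.2 is simple enough to reproduce in code. The following Python sketch uses the hypothetical responses from the figure; the data structure and function name are our own, chosen only to illustrate how each index is the sum of item scores divided by the number of items in that group.

```python
# Each item: (group, score), where S = social disorder, P = physical disorder,
# and scores are 2 = big problem, 1 = some problem, 0 = almost no problem.
RESPONSES = {
    "Groups of people loitering": ("S", 2),
    "People using or selling drugs": ("S", 1),
    "Abandoned buildings": ("P", 0),
    "Vandalism": ("S", 1),
    "Garbage and litter on street": ("P", 2),
    "Gangs and gang activity": ("S", 0),
    "People drinking in public": ("S", 2),
    "Junk in vacant lots": ("P", 0),
    "People making rude or insulting remarks": ("S", 1),
}


def disorder_index(responses: dict, group: str) -> float:
    """Average the item scores belonging to one group ('S' or 'P')."""
    scores = [score for item_group, score in responses.values()
              if item_group == group]
    return sum(scores) / len(scores)


social = disorder_index(RESPONSES, "S")    # 7 / 6
physical = disorder_index(RESPONSES, "P")  # 2 / 3
print(f"Social disorder index:   {social:.2f}")    # prints 1.17
print(f"Physical disorder index: {physical:.2f}")  # prints 0.67
```

Dividing by the number of items keeps both indexes on the same 0 to 2 scale as the original questions, even though the social group has six items and the physical group has three.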


• Higher levels of measurements specify categories that have ranked order or more complex numerical properties.

• A given variable can sometimes be measured at different levels of measurement. The most appropriate level of measurement used depends on the purpose of the measurement.

• Precision refers to the exactness of the measure used in an observation or description of an attribute.

• Reliability and validity are criteria for measurement quality. Valid measures are truly indicators of underlying concepts. A reliable measure is consistent.

• Crime is a fundamental concept in criminal justice research. Different approaches to measuring crime illustrate general principles of conceptualization, operationalization, and measurement. We have different measures of crime because each measure has its strengths and weaknesses.

• Different measures of crime are based on different units of analysis. Uniform Crime Reports are summary measures that report totals for individual agencies. Other measures use offenders, victims, incidents, or offenses as the units of analysis.

• Crimes known to police have been the most widely used measures. UCR data are available for most of the 20th century; more detailed information about homicides was added to the UCR in 1961. Most recently, the FBI has developed an incident-based reporting system that is gradually being adopted.

• Surveys of victims reveal information about crimes that are not reported to police. The NCVS includes detailed information about personal and household incidents, but does not count crimes against businesses or individual victims under age 12.

• Self-report surveys were developed to measure crimes with unclear victims that are less often detected by police. Two surveys estimate drug use among high school seniors and adults.

• The creation of specific, reliable measures often seems to diminish the richness of meaning our general concepts have. A good solution is to use multiple measures, each of which taps different aspects of the concept.

• Composite measures, formed by combining two or more variables, are often more valid measures of complex criminal justice concepts.

variables can be combined to produce an index that has three desirable properties. First, a composite index is a more valid measure of disorder than is a single question. Second, computing and averaging across all items in a category create more variation in the index than we could obtain in any single item. Finally, two indexes are more parsimonious than nine individual variables; data analysis and interpretation can be more efficient.

Measurement Summary

We have covered substantial ground in this chapter but still have introduced only the important and often complex issue of measurement in criminal justice research. More than a step in the research process, measurement involves continuous thinking about what conceptual properties we wish to study, how we will operationalize those properties, and how we will develop measures that are reliable and valid. Often some type of composite measure better represents underlying concepts and thus enhances validity.

Subsequent chapters will pursue issues of measurement further. Part Three of this book will describe data collection: how we go about making actual measurements. And the next chapter will focus on experimental and quasi-experimental designs.

✪ Main Points

• Concepts are mental images we use as summary devices for bringing together observations and experiences that seem to have something in common.

• Our concepts do not exist in the real world, so they can’t be measured directly.

• In operationalization, we specify concrete empirical procedures that will result in measurements of variables.

• Operationalization begins in study design and continues throughout the research project, including the analysis of data.

• Categories in a measure must be mutually exclusive and exhaustive.


of trying to measure a particular dimension of crime: motive. Other examples are hate crimes, terrorism, and drug-related crimes. Specify conceptual and operational definitions for at least one of these types. Find one newspaper story and one research report that present an example.

✪ Additional Readings

Best, Joel, Damned Lies and Statistics: Untangling Numbers from the Media, Politicians, and Activists (Berkeley: University of California Press, 2001). Despite the title, much of this entertaining and informative book describes problems with measurement. For example, page 45 tells us: “Measuring involves deciding how to go about counting.” Best emphasizes how ambiguity in measures of social problems makes it easy for advocates to exaggerate the frequency of such problems. Mass media often report and perpetuate errorful measures. What results, Best informs us, are mutant statistics.

Bureau of Justice Statistics, Performance Measures for the Criminal Justice System (Washington, DC: U.S. Department of Justice, Office of Justice Programs, Bureau of Justice Statistics, 1993). This collection of essays by prominent criminal justice researchers focuses on developing measures for evaluation uses. The discussion of general measurement issues as encountered in different types of justice agencies is uncommonly thoughtful. You will find this a provocative discussion of how to measure important constructs in corrections, trial courts, and policing. See especially the general essays by John DiIulio and James Q. Wilson.

Gaes, Gerald G., Scott D. Camp, Julianne B. Nelson, and William G. Saylor. Measuring Prison Performance: Government Privatization and Accountability. (Walnut Creek, CA: AltaMira Press, 2004). This book stemmed partly from the BJS report, in an effort to expand how to measure the various dimensions of prisons. Another stated goal of the authors is to devise a system for comparing the performance of public and private correctional facilities. This is an excellent resource for anyone interested in corrections.

Hough, Mike, and Mike Maxfield (eds.) Surveying Crime in the 21st Century: Crime Prevention Studies, vol. 22. (Monsey, NY: Criminal Justice Press, 2007). This collection of essays was produced to

✪ Review Questions and Exercises

1. Review the box titled “What Is Recidivism?” on page 84. From that discussion, write conceptual and operational definitions for recidivism. Summarize how Fabelo proposes to measure the concept. Finally, discuss possible reliability and validity issues associated with Fabelo’s proposed measure.

2. We all have some sort of mental image of the pace of life. In a fascinating book titled A Geography of Time, Robert Levine (1997) operationalized the pace of life in cities around the world with a composite measure of the following:
a. How long it took a single pedestrian to walk 60 feet on an uncrowded sidewalk
b. What percentage of public clocks displayed the correct time
c. How long it took to purchase the equivalent of a first-class postage stamp with the equivalent of a $5 bill

Discuss possible reliability and validity issues with these indicators of the pace of life. Be sure to specify a conceptual definition for pace of life.

3. Los Angeles police consider a murder to be gang-related if either the victim or the offender is known to be a gang member, whereas Chicago police record a murder as gang-related only if the killing is directly related to gang activities (Spergel 1990). Describe how these different operational definitions illustrate general points about measuring crime discussed in this chapter.

4. Measuring gang-related crime is an example

✪ Key Terms

concept, p. 82
conception, p. 81
conceptual definition, p. 85
conceptualization, p. 83
construct validity, p. 95
criterion-related validity, p. 95
dimension, p. 83
face validity, p. 95
incident-based measure, p. 100
interval measures, p. 90
nominal measures, p. 89
operational definition, p. 85
ordinal measures, p. 90
ratio measures, p. 90
reliability, p. 93
self-report survey, p. 103
summary-based measure, p. 100
typology, p. 106
validity, p. 94
victim survey, p. 102


no one knows how to measure other dimensions of police performance. This document presents papers and discussions from a series of meetings in which police, researchers, reporters, and others discussed what matters in policing and how to measure it.

Moore, Mark H., and Anthony Braga. The “Bottom Line” of Policing: What Citizens Should Value (and Measure) in Police Performance. (Washington, DC: Police Executive Research Forum, 2003). This is a spin-off from Langworthy’s anthology. Though somewhat long-winded, the authors offer an exceptionally thoughtful discussion of measuring different dimensions of police performance.

commemorate the 25th anniversary of the British Crime Survey. Contributors describe what they have learned from crime surveys in many research areas. In the concluding essay, the editors (with Pat Mayhew) suggest how crime surveys should be revised.

Langworthy, Robert (ed.), Measuring What Matters: Proceedings from the Policing Research Institute Meetings (Washington, DC: U.S. Department of Justice, Office of Justice Programs, National Institute of Justice, 1999). With the spread of community policing, researchers and officials alike have struggled with the question of how to measure police performance. Most people agree that simply counting crimes is not enough, but


Chapter 5

Experimental and Quasi-Experimental Designs

We’ll learn about the experimental approach to social scientific research. We’ll consider a wide variety of experimental and other designs available to criminal justice researchers.

Introduction 113

The Classical Experiment 113

Independent and Dependent Variables 114

Pretesting and Posttesting 114

Experimental and Control Groups 115

Double-Blind Experiments 116

Selecting Subjects 116

Randomization 117

Experiments and Causal Inference 117

Experiments and Threats to Validity 118

Threats to Internal Validity 118

Ruling Out Threats to Internal Validity 120

Generalizability and Threats to Validity 121

Variations in the Classical Experimental Design 123

Quasi-Experimental Designs 124

Nonequivalent-Groups Designs 125

Cohort Designs 128

Time-Series Designs 128


Variations in Time-Series Designs 132

Variable-Oriented Research and Scientific Realism 133

Experimental and Quasi-Experimental Designs Summarized 135

Introduction

Experimentation is an approach to research best suited for explanation and evaluation.

Research design in the most general sense involves devising a strategy for finding out something. We’ll first discuss the experiment as a mode of scientific observation in criminal justice research. At base, experiments involve (1) taking action and (2) observing the consequences of that action. Social scientific researchers typically select a group of subjects, do something to them, and observe the effect of what was done.

It is worth noting at the outset that experiments are often used in nonscientific human inquiry as well. We experiment copiously in our attempts to develop a more generalized understanding about the world we live in. We learn many skills through experimentation: riding a bicycle, driving a car, swimming, and so forth. Students discover how much studying is required for academic success through experimentation. Professors learn how much preparation is required for successful lectures through experimentation.

Experimentation is especially appropriate for hypothesis testing and evaluation. Suppose we are interested in studying alcohol abuse among college students and in discovering ways to reduce it. We might hypothesize that acquiring an understanding about the health consequences of binge drinking and long-term alcohol use will have the effect of reducing alcohol abuse. We can test this hypothesis experimentally. To begin, we might ask a group of experimental subjects how much beer, wine, or spirits they drank on the previous day and how frequently, in an average week, they consume alcohol for the specific purpose of getting drunk. Next, we might show these subjects a video depicting the various physiological effects of chronic drinking and binge drinking. Finally (say, one month later) we might again ask the subjects about their use of alcohol in the previous week to determine whether watching the video reduced alcohol use.

You might typically think of experiments as being conducted in laboratories under carefully controlled conditions. Although this may be true in the natural sciences, few social scientific experiments take place in laboratory settings. The most notable exception to this occurs in the discipline of psychology, in which laboratory experiments are common. Criminal justice experiments are almost always conducted in field settings outside the laboratory.

The Classical Experiment

Variables, time order, measures, and groups are the central features of the classical experiment.

Like much of the vocabulary of research, the word experiment has acquired both a general and a specialized meaning. So far, we have referred to the general meaning, defined by David Farrington, Lloyd Ohlin, and James Q. Wilson (1986, 65) as “a systematic attempt to test a causal hypothesis about the effect of variations in one factor (the independent variable) on another (the dependent variable). . . . The defining feature of an experiment lies in the control of the independent variable by the experimenter.” In a narrower sense, the term experiment refers to a specific way of structuring research, usually called the classical experiment. In this section,

114 Part Two Structuring Criminal Justice Inquiry

we examine the requirements and components of the classical experiment. Later in the chapter we will consider designs that can be used when some of the requirements for classical experiments cannot be met.

The most conventional type of experiment in the natural and the social sciences involves three major pairs of components: (1) independent and dependent variables, (2) pretesting and posttesting, and (3) experimental and control groups. We will now consider each of those components and the way they are put together in the execution of an experiment.

Independent and Dependent Variables

Essentially, an experiment examines the effect of an independent variable on a dependent variable. Typically the independent variable takes the form of an experimental stimulus that is either present or absent (that is, having two attributes). In the example concerning alcohol abuse, how often subjects used alcohol is the dependent variable and exposure to a video about alcohol’s effects is the independent variable. The researcher’s hypothesis suggests that levels of alcohol use depend, in part, on understanding its physiological and health effects. The purpose of the experiment is to test the validity of this hypothesis.

The independent and dependent variables appropriate to experimentation are nearly limitless. It should be noted, moreover, that a given variable might serve as an independent variable in one experiment and as a dependent variable in another. Alcohol use is the dependent variable in our example, but it might be the independent variable in an experiment that examines the effects of alcohol abuse on academic performance.

In the terms of our discussion of cause and effect in Chapter 3, the independent variable is the cause and the dependent variable is the effect. Thus we might say that watching the video causes a change in alcohol use or that reduced alcohol use is an effect of watching the video.

It is essential that both independent and dependent variables be operationally defined for the purposes of experimentation. Such operational definitions might involve a variety of observation methods. Responses to a questionnaire, for example, might be the basis for defining self-reported alcohol use on the previous day. Alternatively, alcohol use by subjects could be measured with a Breathalyzer® or other blood alcohol test.

Pretesting and Posttesting

In the simplest experimental design, subjects are measured on a dependent variable (pretested), exposed to a stimulus that represents an independent variable, and then remeasured on the dependent variable (posttested). Differences noted between the first and second measurements on the dependent variable are then attributed to the influence of the independent variable.

In our example of alcohol use, we might begin by pretesting the extent of alcohol use among our experimental subjects. Using a questionnaire, we measure the extent of alcohol use reported by each individual and the average level of alcohol use for the whole group. After showing subjects the video on the effects of alcohol, we administer the same questionnaire again. Responses given in this posttest permit us to measure the subsequent extent of alcohol use by each subject and the average level of alcohol use of the group as a whole. If we discover a lower level of alcohol use on the second administration of the questionnaire, we might conclude that the video indeed reduced the use of alcohol among the subjects.

In the experimental examination of behaviors such as alcohol use, we face a special practical problem relating to validity. As you can imagine, the subjects might respond differently to the questionnaires the second time, even if their level of drinking remained unchanged. During the first administration of the questionnaire, the subjects might have been unaware of its purpose. By the time of the second measurement, however, they might


have figured out the purpose of the experiment, become sensitized to the questions about drinking, and changed their answers. Thus the video might seem to have reduced alcohol abuse although, in fact, it did not.

This is an example of a more general problem that plagues many forms of criminal justice research: The very act of studying something may change it. Techniques for dealing with this problem in the context of experimentation are covered throughout the chapter.

Experimental and Control Groups

The traditional way to offset the effects of the experiment itself is to use a control group. Social scientific experiments seldom involve only the observation of an experimental group, to which a stimulus has been administered. Researchers also observe a control group, to which the experimental stimulus has not been administered.

In our example of alcohol abuse, two groups of subjects are examined. To begin, each group is administered a questionnaire designed to measure their alcohol use in general and binge drinking in particular. Then only one of the groups (the experimental group) is shown the video. Later, the researcher administers a posttest of alcohol use to both groups. Figure 5.1 illustrates this basic experimental design.

Using a control group allows the researcher to control for the effects of the experiment itself. If participation in the experiment leads the subjects to report less alcohol use, that should occur in both the experimental and the control groups. If, on the one hand, the overall level of drinking exhibited by the control group decreases between the pretest and posttest as much as for the experimental group, then the apparent reduction in alcohol use must be a function of some external factor, not a function of watching the video specifically. In this situation, we can conclude that the video did not cause any change in alcohol use.

If, on the other hand, drinking decreases only in the experimental group, then we can be more confident in saying that the reduction is a consequence of exposure to the video (because that’s the only difference between the two groups). Or, alternatively, if drinking decreases more in the experimental group than in the control group, then that too is grounds for assuming that watching the video reduced alcohol use.

The need for control groups in experimentation has been most evident in medical research.

Figure 5.1 Basic Experimental Design

[Figure: flowchart. Experimental group: measure dependent variable, administer experimental stimulus (video), remeasure dependent variable. Control group: measure dependent variable, remeasure dependent variable. Comparison labels: “Compare: Different?” and “Compare: Same?”]
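The comparison logic of the basic experimental design in Figure 5.1 can be sketched in a few lines of Python. This is only an illustration: the function names and the drinks-per-week scores are hypothetical and do not come from any study described in the chapter. The idea is to compute the average pretest-to-posttest change in each group and then compare the two changes.

```python
from statistics import mean


def mean_change(pretest, posttest):
    """Average posttest-minus-pretest change for one group of subjects."""
    return mean(post - pre for pre, post in zip(pretest, posttest))


def compare_groups(exp_pre, exp_post, ctl_pre, ctl_post):
    """Compare the change in the experimental group with the change in the control group."""
    exp_change = mean_change(exp_pre, exp_post)
    ctl_change = mean_change(ctl_pre, ctl_post)
    return {
        "experimental_change": exp_change,
        "control_change": ctl_change,
        # Negative here means drinking fell more in the experimental group.
        "difference": exp_change - ctl_change,
    }


# Hypothetical self-reported drinks per week for four subjects in each group.
exp_pre, exp_post = [10, 8, 12, 6], [7, 6, 9, 4]
ctl_pre, ctl_post = [9, 8, 11, 6], [8, 7, 10, 5]

result = compare_groups(exp_pre, exp_post, ctl_pre, ctl_post)
print(result)
```

In this made-up example both groups report less drinking at the posttest, but the drop is larger in the experimental group. That is the pattern the text treats as grounds for attributing some of the reduction to the video rather than to the experiment itself.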


Time and again, patients who participated in medical experiments appeared to improve, but it was unclear how much of the improvement came from the experimental treatment and how much from the experiment. Now, in testing the effects of new drugs, medical researchers frequently administer a placebo (for example, sugar pills) to a control group. Thus the control group patients believe that they, like members of the experimental group, are receiving an experimental drug, and they often improve. If the new drug is effective, however, those who receive that drug will improve more than those who receive the placebo.

In criminal justice experiments, control groups are important as a guard against the effects of not only the experiments themselves but also events that may occur outside the laboratory during the course of experiments. Suppose the alcohol use experiment was being conducted on your campus and at that time a popular athlete was hospitalized for acute alcohol poisoning after he and a chum drank a bottle of rum. This event might shock the experimental subjects and thereby decrease their reported drinking. Because such an effect should happen about equally for members of the control and experimental groups, lower levels of reported alcohol use in the experimental group than in the control group would again demonstrate the impact of the experimental stimulus: watching the video that describes the health effects of alcohol abuse.

Sometimes an experimental design requires more than one experimental or control group. In the case of the alcohol video, we might also want to examine the impact of participating in group discussions about why college students drink alcohol, with the intent of demonstrating that peer pressure may promote drinking by people who would otherwise abstain. We might design our experiment around three experimental groups and one control group. One experimental group would see the video and participate in the group discussions, another would only see the video, and still another would only participate in group discussions; the control group would do neither. With this kind of design, we could determine the impact of each stimulus separately, as well as their combined effect.

Double-Blind Experiments

As we saw with medical experimentation, patients sometimes improve when they think they are receiving a new drug; thus it is often necessary to administer a placebo to a control group.

Sometimes experimenters have this same tendency to prejudge results. In medical research, the experimenters may be more likely to “observe” improvements among patients who receive the experimental drug than among those receiving the placebo. That would be most likely, perhaps, for the researcher who developed the drug. A double-blind experiment eliminates this possibility because neither the subjects nor the experimenters know which is the experimental group and which is the control. In medical experiments, those researchers who are responsible for administering the drug and for noting improvements are not told which subjects receive the drug. Thus both researchers and subjects are blind with respect to who is receiving the experimental drug and who is getting the placebo. Another researcher knows which subjects are in which group, but that person is not responsible for administering the experiment.

Selecting Subjects

Before beginning an experiment, we must make two basic decisions about who will participate. First, we must decide on the target population, the group to which the results of our experiment will apply. If our experiment is designed to determine, for example, whether restitution is more effective than probation in reducing recidivism, our target population is some group of persons convicted of crimes. In our hypothetical experiment about the effects of watching a video on the health consequences


of alcohol abuse, the target population might be college students.

Second, we must decide how particular members of the target population will be selected for the experiment. In most cases, the methods used to select subjects must meet the scientific norm of generalizability; it should be possible to generalize from the sample of subjects studied to the population those subjects represent.

Aside from the question of generalizability, the cardinal rule of subject selection and experimentation is the comparability of the experimental and control groups. Ideally, the control group represents what the experimental group would have been like if it had not been exposed to the experimental stimulus. It is essential, therefore, that the experimental and control groups be as similar as possible.

Randomization

Having recruited, by whatever means, a group of subjects, we randomly assign those subjects to either the experimental or the control group. This might be accomplished by numbering all the subjects serially and selecting numbers by means of a random-number table. Or we might assign the odd-numbered subjects to the experimental group and the even-numbered subjects to the control group.

Randomization is a central feature of the classical experiment. The most important characteristic of randomization is that it produces experimental and control groups that are statistically equivalent. Put another way, randomization reduces possible sources of systematic bias in assigning subjects to groups. The basic principle is simple: if subjects are assigned to experimental and control groups through a random process such as flipping a coin, the assignment process is said to be unbiased and the resultant groups are equivalent.

Although the rationale underlying this principle is a bit complex, understanding how randomization produces equivalent groups is a key point. Farrington and associates (Farrington, Ohlin, and Wilson 1986, 66) compare randomization in criminal justice research to laboratory controls in the natural sciences:

The control of extraneous variables by randomization is similar to the control of extraneous variables in the physical sciences by holding physical conditions (e.g., temperature, pressure) constant. Randomization insures that the average unit in [the] treatment group is approximately equivalent to the average unit in another [group] before the treatment is applied.

You’ve surely heard the expression, “All other things being equal.” Randomization makes it possible to assume that all other things are equal.

Experiments and Causal Inference

Experiments potentially control for many threats to the validity of causal inference, but researchers must remain aware of these threats.

The central features of the classical experiment are independent and dependent variables, pretesting and posttesting, and experimental and control groups created through random assignment. Think of these features as building blocks of a research design to demonstrate a cause-and-effect relationship. This point will become clearer by comparing the criteria for causality, discussed in Chapter 3, to the features of the classical experiment, as shown in Figure 5.2.

The experimental design ensures that the cause precedes the effect in time by taking posttest measurements of the dependent variable after introducing the experimental stimulus. The second criterion for causation (an empirical correlation between the cause-and-effect variables) is determined by comparing the pretest (in which the experimental stimulus is not present) to the posttest for the experimental group (after the experimental stimulus is administered). A change in pretest to posttest measures demonstrates correlation.
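Random assignment, as described under Randomization, can be illustrated with a short Python sketch. Everything here is a hypothetical illustration rather than a procedure taken from the text: the function name `randomly_assign`, the subject labels, the simulated pretest scores, and the fixed seed (used only so the run is repeatable). Shuffling the recruited subjects and splitting the list in half plays the role of the random-number table or coin flip.

```python
import random
from statistics import mean


def randomly_assign(subjects, seed=None):
    """Shuffle the subjects and split them into experimental and control halves."""
    rng = random.Random(seed)
    shuffled = list(subjects)  # copy so the caller's list is left untouched
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]


# Hypothetical pool of 1,000 recruited subjects with simulated pretest scores
# (drinks per week); none of these numbers are real data.
score_rng = random.Random(0)
pretest = {f"S{i:04d}": score_rng.randint(0, 20) for i in range(1000)}

experimental, control = randomly_assign(pretest.keys(), seed=42)

# With random assignment, the two groups should look alike before any stimulus.
exp_mean = mean(pretest[s] for s in experimental)
ctl_mean = mean(pretest[s] for s in control)
print(len(experimental), len(control), round(exp_mean - ctl_mean, 2))
```

The two pretest means will rarely match exactly, but with groups of this size the gap should be small. That is the sense in which randomized groups are statistically equivalent rather than identical.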


Figure 5.2 Another Look at the Classical Experiment

[Figure: diagram along a TIME axis. Experimental group: pretest, stimulus, posttest. Control group: pretest, posttest. The diagram marks COMPARISONS among these measures.]

The final requirement is to show that the observed correlation between cause and effect is not due to the influence of a third variable. The classical experiment makes it possible to satisfy this criterion for cause in two ways. First, the posttest measures for the experimental group (stimulus present) are compared with those for the control group (stimulus not present). If the observed correlation between the stimulus and the dependent variable is due to some other factor, then the two posttest scores will be similar. Second, random assignment produces experimental and control groups that are equivalent and will not differ on some other variable that could account for the empirical correlation between cause and effect.

Experiments and Threats to Validity

The classical experiment is designed to satisfy the three requirements for demonstrating cause-and-effect relationships. But what about threats to the validity of causal inference, discussed in Chapter 3? In this section, we consider certain threats in more detail and describe how the classical experiment reduces many of them. Our discussion draws mostly on the book by William Shadish, Thomas Cook, and Donald Campbell (2002). We present these threats in a slightly different order, beginning with threats to internal validity.

Threats to Internal Validity

The problem of threats to internal validity refers to the possibility that conclusions drawn from experimental results may not accurately reflect what went on in the experiment itself. Put differently, conclusions about cause and effect may be biased in some systematic way. Shadish, Cook, and Campbell (2002, 54–60) pointed to several sources of the problem. As you read about these different threats to internal validity, keep in mind that each is an example of a simple point: possible ways researchers might be wrong in inferring causation.

History. Historical events may occur during the course of the experiment that confound the experimental results. The hospitalization of a popular athlete for acute alcohol poisoning during an experiment on reducing alcohol use is an example.

Maturation. People are continually growing and changing, whether in an experiment or not, and those changes affect the results of the experiment. In a long-term experiment, the fact that the subjects grow older may have an effect. In shorter experiments, they may become tired, sleepy, bored, or hungry, or change in other ways that affect their behavior in the experiment. A long-term study of alcohol abuse might reveal a decline in binge drinking as the subjects mature.

History and maturation are similar in that they represent a correlation between cause and effect that is due to something other than the independent variable. They’re different in that history represents something that’s outside the experiment altogether, whereas


maturation refers to change within the subjects themselves.

Testing. Often the process of testing and retesting influences people’s behavior and thereby confounds the experimental results. Suppose we administer a questionnaire to a group as a way of measuring their alcohol use. Then we administer an experimental stimulus and remeasure their alcohol use. By the time we conduct the posttest, the subjects may have gotten more sensitive to the issue of alcohol use and so provide different answers. In fact, they may believe we are trying to determine whether they drink too much. Because excessive drinking is frowned on by university authorities, our subjects will be on their best behavior and give answers that they think we want or that will make them look good.

Instrumentation. Thus far we haven’t said much about the process of measurement in pretesting and posttesting, and it’s appropriate to keep in mind the problems of conceptualization and operationalization discussed in Chapter 4. If we use different measures of the dependent variable (say, different questionnaires about alcohol use), how can we be sure that they are comparable? Perhaps alcohol use seems to have decreased simply because the pretest measure was more sensitive than the posttest measure.

Or if the measurements are being made by the experimenters, their procedures may change over the course of the experiment. You probably recognize this as a problem with reliability. Instrumentation is always a potential problem in criminal justice research that uses secondary sources of information such as police records about crime or court records about probation violations. There may be changes in how probation violations are defined or changes in the record-keeping practices of police departments.

In general, testing refers to changes in how subjects respond to measurement, whereas instrumentation is concerned with changes in the measurement process itself. If police officers respond differently to pretest and posttest questionnaires about prejudice, for example, that is a testing problem. However, if different questionnaires about prejudice are used in pretest and posttest measurements, instrumentation is a potential threat.

Statistical Regression. Sometimes it’s appropriate to conduct experiments on subjects who start out with extreme scores on the dependent variable. Many sentencing policies, for example, target chronic offenders. Commonly referred to as regression to the mean, this threat to validity can emerge whenever researchers are interested in extreme cases. As a simple example, statisticians often point out that extremely tall people as a group are likely to have children shorter than themselves, and extremely short people as a group are likely to have children taller than themselves. The danger, then, is that changes occurring by virtue of subjects starting out in extreme positions will be attributed erroneously to the effects of the experimental stimulus.

Statistical regression can also be at work in aggregate analysis of changes in crime rates. For example, some researchers initially viewed declines in crime rates throughout U.S. cities in the 1990s as a return to more normal levels of crime after abnormally high rates in the 1980s.

Selection Biases. Randomization eliminates the potential for systematic bias in selecting subjects, but subjects may be chosen in other ways that threaten validity. Volunteers are often solicited for experiments conducted on college campuses. Students who volunteer for an experiment may not be typical of students as a whole, however. Volunteers may be more interested in the subject of the experiment and more likely to respond to a stimulus.

A common type of selection bias in applied criminal justice studies results from the natural caution of public officials. Let’s say you are a bail commissioner in a large city, and the mayor wants to try a new program to increase the number of arrested persons who are released on bail. The mayor asks you to decide what kinds of defendants should be eligible for release and informs you that staff from the city’s criminal


justice services agency will be evaluating the program. In establishing eligibility criteria, you will probably try to select defendants who will not be arrested again while on bail and defendants who will most likely show up for scheduled court appearances. In other words, you will try to select participants who are least likely to fail. This common and understandable caution is sometimes referred to as creaming: skimming the best risks off the top. Creaming is a threat to validity because the low-risk persons selected for release, although most likely to succeed, do not represent the jail population as a whole.

Experimental Mortality. Experimental subjects often drop out of an experiment before it is completed, and that can affect statistical comparisons and conclusions. This is termed experimental mortality, also known as attrition. In the classical experiment involving an experimental and a control group, each with a pretest and a posttest, suppose that the heavy drinkers in the experimental group are so turned off by the video on the health effects of binge drinking that they tell the experimenter to forget it and leave. Those subjects who stick around for the posttest were less heavy drinkers to start with, and the group results will thus reflect a substantial “decrease” in alcohol use.

Mortality may also be a problem in experiments that take place over a long period (people may move away) or in experiments that require a substantial commitment of effort or time by subjects; they may become bored with the study or simply decide it’s not worth the effort.

Ambiguous Causal Time Order. In criminal justice research, there may be ambiguity about the time order of the experimental stimulus and the dependent variable. Whenever this occurs, the research conclusion that the stimulus caused the dependent variable can be challenged with the explanation that the “dependent” variable actually caused changes in the stimulus. Many early studies of the relationship between different types of punishments and rates of offending exhibited this threat to validity by relying on single interviews with subjects who were asked how they viewed alternative punishments and whether they had committed any crimes.

Ruling Out Threats to Internal Validity

The classical experiment, coupled with proper subject selection and assignment, can potentially handle each of these threats to internal validity. Let’s look again at the classical experiment, presented graphically in Figure 5.2.

Pursuing the example of the educational video as an attempt to reduce alcohol abuse, if we use the experimental design shown in Figure 5.2, we should expect two findings. For the experimental group, the frequency of drinking measured in their posttest should be less than in their pretest. In addition, when the two posttests are compared, the experimental group should have less drinking than the control group.

This design guards against the problem of history because anything occurring outside the experiment that might affect the experimental group should also affect the control group. There should still be a difference in the two posttest results. The same comparison guards against problems of maturation as long as the subjects have been randomly assigned to the two groups. Testing and instrumentation should not be problems because both the experimental and the control groups are subject to the same tests and experimenter effects. If the subjects have been assigned to the two groups randomly, statistical regression should affect both equally, even if people with extreme scores on drinking (or whatever the dependent variable is) are being studied. Selection bias is ruled out by the random assignment of subjects.

Experimental mortality can be more complicated to handle because dropout rates may be different between the experimental and control groups. The experimental treatment itself may increase mortality in the group exposed to the video. As a result, the group of experimental


Generalizability and Threats to ValidityPotential threats to internal validity are only some of the complications faced by experiment-ers. They also have the problem of generalizing from experimental fi ndings to the real world. Even if the results of an experiment are an accurate gauge of what happened during that experiment, do they really tell us anything about life in the wilds of society? With our ex-amination of cause and effect in Chapter 3 in mind, we consider two dimensions of gen-eralizability: construct validity and external validity.

Threats to Construct Validity In the lan-guage of experimentation, construct validity is the correspondence between the empirical test of a hypothesis and the underlying causal pro-cess that the experiment is intended to repre-sent. Construct validity is thus concerned with generalizing from our observations in an ex-periment to causal processes in the real world. In our hypothetical example, the educational video is how we operationalize the construct of understanding the health effects of alcohol abuse. Our questionnaire represents the depen-dent construct of actual alcohol use.

Are these reasonable ways to represent the underlying causal process in which under-standing the effects of alcohol use causes peo-ple to reduce excessive or abusive drinking? It’s a reasonable representation but also one that is certainly incomplete. People develop an under-standing of the health effects of alcohol use in many ways. Watching an educational video is one way; having personal experience, talking to friends and parents, taking other courses, and reading books and articles are others. Our video may do a good job of representing the health effects of alcohol use, but it is an incomplete representation of that construct. Alternatively, the video may be poorly produced, too tech-nical, or incomplete. Then the experimental stimulus may not adequately represent the con-struct we are interested in— educating students

subjects that received the posttest will differ from the group that received the pretest. In our example of the alcohol video, it would probably not be possible to handle this problem by ad-ministering a placebo, for instance. In general, however, the potential for mortality can be re-duced by shortening the time between pretest and posttest, by emphasizing to subjects the importance of completing the posttest, or per-haps by offering cash payments for participat-ing in all phases of the experiment.

The remaining problems of internal invalidity can be avoided through the careful administration of a controlled experimental design. We emphasize careful administration. Random assignment, pretest and posttest measures, and use of control and experimental groups do not automatically rule out threats to validity. This caution is especially true in field studies and evaluation research, in which subjects participate in natural settings and uncontrolled variation in the experimental stimulus may be present. Control over experimental conditions is the hallmark of this approach, but conditions in field settings are usually more difficult to control.

For example, Richard Berk and associates (2003) randomly assigned several thousand inmates entering California prisons to an experimental or traditional (control) procedure for classifying inmate risk. That was a straightforward intervention that was easily administered and unlikely to vary much; classification also took place over a short period of time. In contrast, Denise Gottfredson and colleagues (2006) conducted a classical experiment to assess the effects of drug courts in reducing recidivism. Drug courts involve a range of interventions that are more difficult to standardize. The treatment, drug-court participation, can take place over a long period of time. Researchers examined how long individuals stayed in drug-court treatment, but acknowledged that the quality of treatment was more difficult to control and could have varied quite a lot.

122 Part Two Structuring Criminal Justice Inquiry

carefully controlled conditions of the experiment might have had something to do with the video’s effectiveness.

In contrast, criminal justice field experiments are conducted in more natural settings. Real probation officers in different local jurisdictions deliver intensive supervision to real probationers. This is not to say that external validity is never a problem in field experiments. But one of the advantages of field experiments in criminal justice is that, because they take place under real-world conditions, results are more likely to be valid in other real-world settings as well.

You may have detected a fundamental conflict between internal and external validity. Threats to internal validity are reduced by conducting experiments under carefully controlled conditions. But such conditions do not reflect real-world settings, and this restricts our ability to generalize results. Field experiments generally have greater external validity, but their internal validity may suffer because such studies are more difficult to monitor than those taking place in more controlled settings. John Eck describes this trade-off as a diabolical dilemma (Eck 2002, 104).

Shadish, Cook, and Campbell (2002, 98–101) offered some useful advice for resolving the potential for conflict between internal and external validity. Explanatory studies that test cause-and-effect theories should place greater emphasis on internal validity, whereas applied studies should be more concerned with external validity. This is not a hard and fast rule because internal validity must be established before external validity becomes an issue. That is, applied researchers must have confidence in the internal validity of their cause-and-effect relationships before they ask whether similar relationships would be found in other settings.

Threats to Statistical Conclusion Validity

The basic principle of statistical conclusion validity is simple. Virtually all experimental research in criminal justice is based on samples of

about the health effects of alcohol use. There may also be problems with our measure of the dependent variable: questionnaire items on self-reported alcohol use.

By this time, you should recognize a similarity between construct validity and some of the measurement issues discussed in Chapter 4. Almost any empirical example or measure of a construct is incomplete. Part of construct validity involves how completely an empirical measure can represent a construct or how well we can generalize from a measure to a construct.

A related issue in construct validity is whether a given level of treatment is sufficient. Perhaps showing a single video to a group of subjects would have little effect on alcohol use, but administering a series of videos over several weeks would have a greater impact. We could test this experimentally by having more than one experimental group and varying the number of videos seen by different groups.

Threats to External Validity

Will an experimental study, conducted with the kind of control we have emphasized here, produce results that would also be found in more natural settings? Can an intensive probation program shown to be successful in Minneapolis achieve similar results in Miami? External validity represents a slightly different form of generalizability, one in which the question is whether results from experiments in one setting (time and place) will be obtained in other settings or whether a treatment found to be effective for one population will have similar effects on a different group.

Threats to external validity are greater for experiments conducted under carefully controlled conditions. If the alcohol education experiment reveals that drinking decreased among students in the experimental group, then we can be confident that viewing the video led to reduced alcohol use among our experimental subjects. But will the video have the same effect on high school students or adults if it is broadcast on television? We cannot be certain because the

Chapter 5 Experimental and Quasi-Experimental Designs 123

control groups, (2) the number and variation of experimental stimuli, (3) the number of pretest and posttest measurements, and (4) the procedures used to select subjects and assign them to groups. By way of illustrating these building blocks and the ways they are used to produce different designs, we adopt the widely used system of notation introduced by Campbell and Stanley (1966). Figure 5.3 presents this notation and shows how it is used to represent the classical experiment and examples of variations on this design.

In Figure 5.3, the letter O represents observations or measurements, and X represents an experimental stimulus or treatment. Different time points are displayed as t, with a subscript number to represent time order. Thus for the classical experiment shown in Figure 5.3, O at t1 is the pretest, O at t3 is the posttest, and the experimental stimulus, X, is administered to the experimental group at t2, between the pretest and posttest. Measures are taken for the control group at times t1 and t3, but the

subjects that represent a target population. Larger samples of subjects, up to a point, are more representative of the target population than are smaller samples. Statistical conclusion validity most often becomes an issue when findings are based on small samples of cases. Because experiments can be costly and time consuming, they are frequently conducted with relatively small numbers of subjects. In such cases, only large differences between experimental and control groups on posttest measures can be detected with any degree of confidence.

In practice, this means that finding cause-and-effect relationships through experiments depends on two related factors: (1) the number of subjects and (2) the magnitude of posttest differences between the experimental and control groups. Experiments with large numbers of cases may be able to reliably detect small differences, but experiments with smaller numbers can detect only large differences.
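The relationship between sample size and detectable differences can be illustrated with a small simulation. The following is a minimal sketch, not a procedure taken from the studies discussed here: the function name `detection_rate`, the normally distributed posttest scores, and the detection rule (a difference in means larger than 1.96 standard errors) are all assumptions made for illustration.

```python
import random
import statistics


def detection_rate(n_per_group, true_diff, sd=1.0, trials=1000, seed=0):
    """Share of simulated experiments in which a posttest difference is detected.

    Assumed detection rule: the absolute difference between group means
    exceeds 1.96 standard errors (roughly a two-tailed .05 test).
    """
    rng = random.Random(seed)
    detected = 0
    for _ in range(trials):
        # Hypothetical posttest scores for control and experimental groups.
        control = [rng.gauss(0.0, sd) for _ in range(n_per_group)]
        treated = [rng.gauss(true_diff, sd) for _ in range(n_per_group)]
        diff = statistics.mean(treated) - statistics.mean(control)
        se = (statistics.variance(control) / n_per_group
              + statistics.variance(treated) / n_per_group) ** 0.5
        if abs(diff) > 1.96 * se:
            detected += 1
    return detected / trials


if __name__ == "__main__":
    for n in (20, 200):
        for diff in (0.2, 0.8):
            rate = detection_rate(n, diff)
            print(f"n per group = {n:3d}, true difference = {diff}: "
                  f"detected in {rate:.0%} of trials")
```

Under these assumptions, the small difference is detected far more often with 200 subjects per group than with 20, whereas the large difference is detected fairly often even in the small experiment.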

Threats to statistical conclusion validity can be magnified by other difficulties in field experiments. Unreliable measurement is one such problem that is often encountered in criminal justice research. More generally, Weisburd and associates (1993) concluded, after reviewing a large number of criminal justice experiments, that failure to maintain control over experimental conditions reduces statistical conclusion validity even for studies with large numbers of subjects.

Variations in the Classical Experimental Design

The basic experimental design is adapted to meet different research applications.

We now turn to a more systematic consideration of variations on the classical experiment that can be produced by manipulating the building blocks of experiments.

Slightly restating our earlier remarks, four basic building blocks are present in experimental designs: (1) the number of experimental and

Classical Experiment
Experimental group          O    X    O
Control group               O         O
                            t1   t2   t3

Posttest Only
Experimental group          X    O
Control group                    O
                            t1   t2

Factorial
Experimental treatment 1    O    X1   O
Experimental treatment 2    O    X2   O
Control                     O         O
                            t1   t2   t3

Time runs from left to right.
O = observation or measurement
X = experimental stimulus
t = time point

Figure 5.3 Variations in the Experimental Design


gle treatment, and one control group. This design is useful for comparing the effects of different interventions or different amounts of a single treatment. In evaluating a probation program, we might wish to compare how different levels of contact between probation officers and probation clients affect recidivism. In this case, subjects in one experimental group might receive weekly contact (X1), the other experimental group be seen by probation officers twice each week (X2), and control-group subjects have normal contact (say, monthly) with probation officers. Because more contact is more expensive than less contact, we would be interested in seeing how much difference in recidivism was produced by monthly, weekly, and twice-weekly contacts.

Thus an experimental design may have more than one group receiving different versions or levels of experimental treatment. We can also vary the number of measurements made on dependent variables. No hard and fast rules exist for using these building blocks to design a given experiment. A useful rule of thumb, however, is to keep the design as simple as possible to control for potential threats to validity. The specific design for any particular study depends on the research purpose, available resources, and unavoidable constraints in designing and actually carrying out the experiment.

One very common constraint is how subjects or units of analysis are selected and assigned to experimental or control groups. This building block brings us to the subject of quasi-experimental designs.

Quasi-Experimental Designs

When randomization is not possible, researchers can use different types of quasi-experimental designs.

By now, the value of random assignment in controlling threats to validity should be apparent. However, it is often impossible to randomly select subjects for experimental and control groups and satisfy other requirements. Most

experimental stimulus is not administered to the control group.

Now consider the design labeled “Posttest Only.” As implied by its name, no pretest measures are made on either the experimental or the control group. Thinking for a moment about the threats to internal validity, we can imagine situations in which a posttest-only design is appropriate. Testing and retesting might especially influence subjects’ behavior if measurements are made by administering a questionnaire, with subjects’ responses to the posttest potentially affected by their experience in the pretest. A posttest-only design can reduce the possibility of testing being a threat to validity by eliminating the pretest.

Without a pretest, it is obviously not possible to detect change in measures of the dependent variable, but we can still test the effects of the experimental stimulus by comparing posttest measures for the experimental group with posttest measures for the control group. For example, if we are concerned about the possibility of sensitizing subjects in a study of an alcohol education video, we might eliminate the pretest and examine the posttest differences between the experimental and control groups. Randomization is the key to the posttest-only design. If subjects are randomly assigned to experimental and control groups, we expect them to be equivalent. Any posttest differences between the two groups on the dependent variable can then be attributed to the influence of the video.
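The logic of the posttest-only design can be sketched in a short simulation. This is an illustration under stated assumptions rather than an analysis of real data: the function `posttest_only`, the baseline drinking levels, and the size of the video effect are all hypothetical. The unobserved baseline plays the role of the pretest that is never administered.

```python
import random
import statistics


def posttest_only(n_subjects=400, video_effect=-1.0, seed=1):
    """Simulate a posttest-only design with random assignment.

    Baseline drinking (drinks per week) is never measured by the researcher;
    it is tracked here only to show that randomization balances it.
    """
    rng = random.Random(seed)
    baselines = [rng.gauss(10.0, 2.0) for _ in range(n_subjects)]

    # Random assignment: shuffle the subjects, then split them in half.
    rng.shuffle(baselines)
    half = n_subjects // 2
    experimental, control = baselines[:half], baselines[half:]

    # Only the experimental group sees the (hypothetical) video.
    post_experimental = [b + video_effect for b in experimental]
    post_control = control

    return {
        "baseline_gap": statistics.mean(experimental) - statistics.mean(control),
        "posttest_gap": (statistics.mean(post_experimental)
                         - statistics.mean(post_control)),
    }


if __name__ == "__main__":
    result = posttest_only()
    print(f"Unobserved baseline gap between groups: {result['baseline_gap']:.2f}")
    print(f"Observed posttest gap:                  {result['posttest_gap']:.2f}")
```

Because random assignment keeps the unobserved baseline gap close to zero, the observed posttest gap stays close to the assumed effect of the video, even though no pretest was taken.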

In general, posttest-only designs are appropriate when researchers suspect that the process of measurement may bias subjects’ responses to a questionnaire or other instrument. This is more likely when only a short time elapses between pretest and posttest measurements. The number of observations made on subjects is a design building block that can be varied as needed. We emphasize here that random assignment is essential in a posttest-only design.

Figure 5.3 also shows a factorial design, which has two experimental groups that receive different treatments, or different levels of a sin-


approaches to matching and the creative use of experimental design building blocks. Examples include studies of child abuse (Widom 1989a), obscene phone calls (Clarke 1997a), and video cameras for crime prevention (Gill and Spriggs 2005). Figure 5.4 shows a diagram of each design using the X, O, and t notation. The solid line that separates treatment and comparison groups in the figure signifies that subjects have been placed in groups through some nonrandom procedure.

often, there may be practical or administrative obstacles. There may also be legal or ethical reasons randomization cannot be used in criminal justice experiments.

When randomization is not possible, the next-best choice is often a quasi-experiment. The prefix quasi-, meaning “to a certain degree,” is significant: a quasi-experiment is, to a certain degree, an experiment. In most cases, quasi-experiments do not randomly assign subjects and therefore may suffer from the internal validity threats that are so well controlled in true experiments. Without random assignment, the other building blocks of experimental design must be used creatively to reduce validity threats. We group quasi-experimental designs into two categories: (1) nonequivalent-groups designs and (2) time-series designs. Each can be represented with the same O, X, and t notation used to depict experimental designs.

Nonequivalent-Groups Designs

The name for this family of designs is also meaningful. The main strength of random assignment is that it allows us to assume equivalence in experimental and control groups. When it is not possible to create groups through randomization, we must use some other procedure, one that is not random. If we construct groups through a nonrandom procedure, however, we cannot assume that the groups are equivalent, hence the label nonequivalent-groups design.

Whenever experimental and control groups are not equivalent, we should select subjects in a way that makes the two groups as comparable as possible. Often the best way to achieve comparability is through a matching process in which subjects in the experimental group are matched with subjects in a comparison group. The term comparison group is commonly used, rather than control group, to highlight the nonequivalence of groups in quasi-experimental designs. A comparison group does, however, serve the same function as a control group.

Some examples of research that use nonequivalent-groups designs illustrate various

Widom (1989a)
Treatment group       X    O
-------------------------------
Comparison group           O
                      t1   t2

X = official record of child abuse
O = counts of juvenile or adult arrest

Clarke (1997a)
Treatment group       O    X    O
-------------------------------
Comparison group      O         O
                      t1   t2   t3

X = caller identification and call tracing
O = customer complaints of obscene calls

Gill and Spriggs (2005)
Target area 1         O    X1   O
-------------------------------
Comparison area 1     O         O
Target area 2         O    X2   O
-------------------------------
Comparison area 2     O         O
...
Target area 13        O    X13  O
-------------------------------
Comparison area 13    O         O
                      t1   t2   t3

Xi = CCTV installation in area i
O = police crime data, survey data on fear of crime

Figure 5.4 Quasi-Experimental Design Examples


Deterring Obscene Phone Calls

In 1988, the New Jersey Bell telephone company introduced caller identification (ID) and instant call tracing in a small number of telephone exchange areas. Now ubiquitous in mobile phones, caller ID was a new technology in 1988. Instant call tracing allows the recipient of an obscene or threatening call to automatically initiate a procedure to trace the source of the call.

Ronald Clarke (1997a) studied the effects of these new technologies in deterring obscene phone calls. Clarke expected that obscene calls would decrease in areas where the new services were available. To test this, he compared records of formal customer complaints about annoying calls in the New Jersey areas that had the new services to formal complaints in other New Jersey areas where caller ID and call tracing were not available. One year later, the number of formal complaints had dropped sharply in areas serviced by the new technology; no decline was found in other New Jersey Bell areas.

In this study, telephone service areas with new services were the treatment group, and areas without the services were the comparison group. Clarke’s matching criterion was a simple one: telephone service by New Jersey Bell, assuming the volume of obscene phone calls was relatively constant within a single phone service area. Of course, matching on telephone service area cannot eliminate the possibility that the volume of obscene phone calls varies from one part of New Jersey to another, but Clarke’s choice of a comparison group was straightforward and certainly more plausible than comparing New Jersey to, say, New Mexico.

Clarke’s study is a good example of a natural field experiment. The experimental stimulus (caller ID and call tracing) was not specifically introduced by Clarke, but he was able to obtain measures for the dependent variable before and after the experimental stimulus was introduced. This design made it possible for Clarke to infer with reasonable confidence that caller ID and call tracing reduced the number of formal complaints about obscene phone calls.

Child Abuse and Later Arrest

Cathy Spatz Widom studied the long-term effects of child abuse: whether abused children are more likely to be charged with delinquent or adult criminal offenses than children who were not abused. Child abuse was the experimental stimulus, and the number of subsequent arrests was the dependent variable.

Of course, it is not possible to assign children randomly to groups in which some are abused and others are not. Widom’s design called for selecting a sample of children who, according to court records, had been abused. She then matched each abused subject with a comparison subject of the same gender, race, age, and approximate socioeconomic status (SES) who had not been abused. The assumption with these matching criteria was that age at the time of abuse, gender, race, and SES differences might confound any observed relationship between abuse and subsequent arrests.

You may be wondering how a researcher selects important variables to use in matching experimental and comparison subjects. We cannot provide a definitive answer to that question, any more than we can specify what particular variables should be used in a given experiment. The answer ultimately depends on the nature and purpose of the experiment. As a general rule, however, the two groups should be comparable in terms of variables that are likely to be related to the dependent variable under study. Widom matched on gender, race, and SES because these variables are correlated with juvenile and adult arrest rates. Age at the time of reported abuse was also an important variable because children abused at a younger age had a longer “at-risk” period for delinquent arrests.
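Individual matching of this kind can be sketched as a simple procedure. The code below is an illustration only and does not reproduce Widom’s actual method: the function `match_subjects`, the record fields, and the use of exact category matches (rather than approximate SES) are assumptions made for the example.

```python
from collections import defaultdict


def match_subjects(treated, pool, keys=("gender", "race", "age", "ses")):
    """Pair each treated subject with one unused comparison subject.

    A pair is formed only when all matching variables in `keys` are identical.
    Returns the matched pairs and the treated subjects left without a match.
    """
    # Index the comparison pool by its combination of matching variables.
    available = defaultdict(list)
    for person in pool:
        available[tuple(person[k] for k in keys)].append(person)

    pairs, unmatched = [], []
    for person in treated:
        candidates = available[tuple(person[k] for k in keys)]
        if candidates:
            # Each comparison subject is used at most once.
            pairs.append((person, candidates.pop()))
        else:
            unmatched.append(person)
    return pairs, unmatched


if __name__ == "__main__":
    # Hypothetical records, invented for illustration.
    treated = [
        {"id": "A1", "gender": "F", "race": "W", "age": 8, "ses": "low"},
        {"id": "A2", "gender": "M", "race": "B", "age": 10, "ses": "mid"},
    ]
    pool = [
        {"id": "C1", "gender": "F", "race": "W", "age": 8, "ses": "low"},
        {"id": "C2", "gender": "M", "race": "B", "age": 11, "ses": "mid"},
    ]
    pairs, unmatched = match_subjects(treated, pool)
    print([(t["id"], c["id"]) for t, c in pairs])
    print([t["id"] for t in unmatched])
```

In this invented example, subject A2 has no match because the only similar comparison subject differs in age. A real study would have to decide how to handle such cases, for example by loosening a criterion or dropping the unmatched subject.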

Widom produced experimental and comparison groups by matching individual subjects. It is also possible to construct experimental and comparison groups through aggregate matching, in which the average characteristics of each group are comparable. This is illustrated in our next example.


ing), and because CCTV was carefully tailored to each site. Instead, the researchers created two types of comparison areas. First, comparison areas “were selected by similarity on socio-demographic and geographical characteristics and crime problems.” The second type of comparison was “buffer zones,” defined as an area in a one-mile radius from the edge of the target area where CCTV cameras were installed; buffer zones were defined only for CCTV areas.

The rationale for comparison areas is clear. If CCTV is effective in reducing crime, we should expect declines in target areas, but not in comparison areas. Alternatively, if post-treatment measures of crime went down in both treatment and comparison areas, we might expect greater declines in the CCTV sites. But what about buffer areas? After defining buffer areas, researchers then subdivided them into concentric rings around a target area, shown as T in Figure 5.5. The stated purpose was to assess any movement of crime around the target area. If CCTV was effective in reducing crime, any reduction should be greatest in the target area; the size of the reduction should decline moving outward from the target area.

Short-term results found some reduction of some types of crime in some CCTV areas. In other treatment areas, some crimes increased more than in comparison areas. In particular, Gill and Spriggs found that public order offenses such as drunkenness tended to increase more in CCTV target areas. Overall, significant drops in crime were found in just 2 of 13 target areas. Fear and related attitudes declined in all target and comparison areas, but the authors believed this was largely due to declining crime in all areas.

This example illustrates why nonequivalent comparison groups are important. Because crime declined in most areas and fear declined in all, a simple comparison of pre- and post-intervention measures would have been misleading. That strategy would have suggested that CCTV was responsible for reduced crime and fear. Only by adding the comparison and

Cameras and Crime Prevention

U.S. residents have probably become accustomed to seeing closed-circuit television (CCTV) cameras in stores and at ATMs, but this technology is less used in public spaces such as streets and parking lots. With an estimated 4 million cameras deployed, CCTV is widely used as a crime prevention and surveillance tool in the United Kingdom (McCahill and Norris 2003). CCTV enabled the London Metropolitan Police to quickly identify suspects in the Underground bombing attacks that took place in 2005. Cameras are increasingly used to monitor traffic, and even record license plates of cars running traffic lights. But does CCTV have any effect in reducing crime?

Martin Gill and associates (Gill and Spriggs 2005; Gill, Spriggs, Argomaniz, et al. 2005) conducted an evaluation of 13 CCTV projects installed in a variety of residential and commercial settings in England. These were a mix of smaller and large-scale CCTV projects involving multiple cameras. One area on the outskirts of London included more than 500 cameras installed to reduce thefts of and from vehicles in parking facilities. Five projects in London and other urban areas placed 10 to 15 cameras in low-income housing areas, seeking to reduce burglary and robbery. Researchers examined two types of dependent variables before and after cameras were installed: crimes reported to police and fear of crime. Fear was measured through surveys of people living in residential areas, and samples of people on local streets for commercial areas and parking facilities.

Measuring police data and fear of crime before and after cameras were installed made it possible for Gill and associates to satisfy two criteria for cause: time order and covariation between the independent variable (CCTV) and dependent variables. However, they were not able to randomly assign some areas to receive the CCTV intervention, while other areas did not. This was because the intervention was planned for only a small number of locations of each type (residential, commercial, park-


Now think of a cohort that is exposed to some experimental stimulus. The May probation cohort might be required to complete 100 hours of community service in addition to meeting other conditions of probation. If we are interested in whether probationers who receive community service sentences are charged with fewer probation violations, we can compare the performance of the May cohort with that of the April cohort, or the June cohort, or some other cohort not sentenced to community service.

Cohorts that do not receive community service sentences serve as comparison groups. The groups are not equivalent because they were not created by random assignment. But if we assume that a comparison cohort does not systematically differ from a treatment cohort on important variables, we can use this design to determine whether community service sentences reduce probation violations.

That last assumption is very important, but it may not be viable. Perhaps a criminal court docket is organized to schedule certain types of cases at the same time, so a May cohort would be systematically different from a June cohort. But if the assumption of comparability can be met, cohorts may be used to construct nonequivalent comparison and experimental groups by taking advantage of the natural flow of cases through an institutional process.

Time-Series Designs

Time-series designs are common examples of longitudinal studies in criminal justice research. As the name implies, a time-series design involves examining a series of observations on some variable over time. A simple example is examining trends in arrests for drunk driving over time to see whether the number of arrests is increasing, decreasing, or staying constant. A police executive might be interested in keeping track of arrests for drunk driving, or for other offenses, as a way of monitoring the performance of patrol officers. Or state corrections officials might want to study trends in prison admissions as a way of predicting the future need for correctional facilities.

buffer areas to their research were Gill and Spriggs able to learn that CCTV was probably not the cause of declines, since similar patterns were found in many areas where CCTV systems were not installed.
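The value of a comparison area can be shown with simple arithmetic. The sketch below uses invented crime counts and a hypothetical function, `net_change`; it is not the calculation reported by Gill and Spriggs. It subtracts the change observed in a comparison area from the change observed in a target area.

```python
def net_change(target_before, target_after, comparison_before, comparison_after):
    """Change in the target area beyond the change seen in the comparison area."""
    target_change = target_after - target_before
    comparison_change = comparison_after - comparison_before
    return target_change - comparison_change


if __name__ == "__main__":
    # Invented counts of recorded crimes before and after CCTV installation.
    target_before, target_after = 200, 170
    comparison_before, comparison_after = 180, 152

    simple_change = target_after - target_before
    adjusted = net_change(target_before, target_after,
                          comparison_before, comparison_after)
    print(f"Simple pre/post change in target area: {simple_change}")
    print(f"Change net of comparison area:         {adjusted}")
```

With these invented numbers, a simple before-and-after comparison suggests a drop of 30 crimes in the target area, but nearly all of that drop also appears in the comparison area, leaving a net change of only 2.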

Together, these three studies illustrate different approaches to research design when it is not possible to randomly assign subjects to treatment and control groups. Lacking random assignment, researchers must use creative procedures for selecting subjects, constructing treatment and comparison groups, measuring dependent variables, and exercising other controls to reduce possible threats to validity.

Cohort Designs

Chapter 3 mentioned cohort studies as examples of longitudinal designs. We can also view cohort studies as a type of nonequivalent-groups design. Recall from Chapter 3 that a cohort may be defined as a group of subjects who enter or leave an institution at the same time. For example, a class of police officers who graduate from a training academy at the same time could be considered a cohort. Or we might view all persons who were sentenced to probation in May as a cohort.

Figure 5.5 Buffer Zones in CCTV Quasi-Experiment
Source: Adapted from Gill and Spriggs (2005, 40).
[Diagram: target area T surrounded by concentric buffer rings 1, 2, and 3.]


nize this as an example of history as a validity threat to the inference that the new checkpoint program caused a change in auto accidents. The general decline in pattern 1 may be due to reduced drunk driving that has nothing to do with sobriety checkpoints. Pattern 2 illustrates what is referred to as seasonality in a time series: a regular pattern of change over time. In our example, the data might reflect seasonal variation in alcohol-related accidents that occurs around holidays or maybe on football weekends near a college campus.

Patterns 3 and 4 lend more support to the inference that sobriety checkpoints caused a decline in alcohol-related accidents, but the two patterns are different in a subtle way. In pattern 3, accidents decline more sharply from a general downward trend immediately after the checkpoint program was introduced, whereas pattern 4 displays a sharper decline sometime after the new program was established. Which pattern provides stronger support for the inference?

In framing your answer, recall what we have said about construct validity. Think about the underlying causal process these two patterns represent, or consider possible mechanisms that might be at work. Pattern 3 suggests that the program was immediately effective and supports what we might call an incapacitation mechanism: roadside checkpoints enabled police to identify and arrest drunk drivers, thereby getting them off the road and reducing accidents. Pattern 4 suggests a deterrent mechanism: as drivers learned about the checkpoints, they less often drove after drinking, and accidents eventually declined. Either explanation is possible given the evidence presented. This illustrates an important limitation of interrupted time-series designs: they operationalize complex causal constructs in simple ways. Our interpretation depends in large part on how we understand this causal process.

The classic study by Richard McCleary and associates (McCleary, Nienstedt, and Erven 1982) illustrates the need to think carefully about how well time-series results reflect underlying causal patterns. McCleary and colleagues

An interrupted time series is a special type of time-series design that can be used in cause-and-effect studies. A series of observations is compared before and after an intervention is introduced. For example, a researcher might want to know whether roadside sobriety checkpoints cause a decrease in fatal automobile accidents. Trends in accidents could be compared before and after the roadside checkpoints are established.

Interrupted time-series designs can be very useful in criminal justice research, especially in applied studies. They do have some limitations, however, just like other ways of structuring research. Shadish, Cook, and Campbell (2002) described the strengths and limitations of different approaches to time-series designs. We will introduce these approaches with a hypothetical example and then describe some specific criminal justice applications.

Continuing with the example of sobriety checkpoints, Figure 5.6 presents four possible patterns of alcohol-related automobile accidents. The vertical line in each pattern shows the time when the roadside checkpoint program is introduced. Which of these patterns indicates that the new program caused a reduction in car accidents?

If the time-series results looked like pattern 1 in Figure 5.6, we might think initially that the checkpoints caused a reduction in alcohol-related accidents, but there seems to be a general downward trend in accidents that continues after the intervention. It’s safer to conclude that the decline would have continued even without the roadside checkpoints.
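One way to make this reasoning concrete is to project the pre-intervention trend forward and ask how far later observations depart from it. The sketch below is an illustration under simplifying assumptions: it fits a straight line by least squares, the function names `fit_line` and `departures_from_trend` are hypothetical, and the weekly counts are invented rather than read from Figure 5.6.

```python
def fit_line(xs, ys):
    """Ordinary least-squares slope and intercept for a straight line."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def departures_from_trend(series, intervention_index):
    """Observed minus projected values for each post-intervention point.

    The projection extends the linear trend fitted to the observations made
    before `intervention_index`.
    """
    pre = series[:intervention_index]
    slope, intercept = fit_line(list(range(len(pre))), pre)
    return [series[t] - (intercept + slope * t)
            for t in range(intervention_index, len(series))]


if __name__ == "__main__":
    # Invented weekly accident counts; checkpoints begin at week index 4.
    steady_decline = [50, 46, 42, 38, 34, 30, 26, 22]
    sharp_drop = [50, 46, 42, 38, 24, 20, 16, 12]
    print(departures_from_trend(steady_decline, 4))
    print(departures_from_trend(sharp_drop, 4))
```

In the first invented series, the post-intervention values fall exactly on the projected trend, so the continuing decline gives no evidence of a program effect. In the second series, every post-intervention value falls 10 accidents below the projection, which is the kind of departure that would support a causal inference, subject to the other validity threats discussed in this chapter.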

Pattern 2 shows that an increasing trend in auto accidents has been reversed after the intervention, but this appears to be due to a regular pattern in which accidents have been bouncing up and down. The intervention was introduced at the peak of an upward trend, and the later decline may be an artifact of the underlying pattern rather than of the new program.

Patterns 1 and 2 exhibit some outside trend, rather than an intervention, that may account for a pattern observed over time. We may recog-

130 Part Two Structuring Criminal Justice Inquiry

reported a sharp decline in burglaries immediately after a special burglary investigation unit was established in a large city. This finding was at odds with their understanding of how police investigations could reasonably be expected to reduce burglary. A special unit might eventually be able to reduce the number of burglaries, after investigating incidents over a period of time and making arrests. But it is highly unlikely that changing investigative procedures would have an immediate impact. This discrepancy prompted McCleary and associates to look more closely at the policy change and led to their conclusion that the apparent decline

[Figure 5.6, patterns 1 and 2: line graphs of fatal accidents (vertical axis, 0 to 60) by week (horizontal axis, 1 to 8), each marking the point at which sobriety checkpoints are introduced.]

Figure 5.6 Four Patterns of Change in Fatal Automobile Accidents (Hypothetical Data)

Chapter 5 Experimental and Quasi-Experimental Designs 131

in burglaries was produced by changes in record-keeping practices. No evidence existed of any decline in the actual number of burglaries.

This example illustrates our discussion of instrumentation earlier in this chapter. Changes in the way police counted burglaries produced what appeared to be a reduction in burglary. Instrumentation can be a particular problem in time-series designs for two reasons. First, observations are usually made over a relatively long time period, which increases the likelihood of changes in measurement instruments.

[Figure 5.6, patterns 3 and 4: line graphs of fatal accidents (vertical axis, 0 to 60) by week (horizontal axis, 1 to 8), each marking the point at which sobriety checkpoints are introduced.]

Figure 5.6 (continued)


Second, time-series designs often use measures that are produced by an organization such as a police department, criminal court, probation office, or corrections department. There may be changes or irregularities in the way data are collected by these agencies that are not readily apparent to researchers and that are, in any case, not subject to their control.

Variations in Time-Series Designs

If we view the basic interrupted time-series design as an adaptation of basic design building blocks, we can consider how modifications can help control for many validity problems. The simplest time-series design studies one group (the treatment group) over time. Rather than making one pretest and one posttest observation, the interrupted time-series design makes a longer series of observations before and after introducing an experimental treatment.

What if we considered the other building blocks of experimental design? Figure 5.7 presents the basic design and some variations using the familiar O, X, and t notation. In the basic design, shown at the top of Figure 5.7, many pretest and posttest observations are made on a single group that receives some type of treatment.

We could strengthen this design by adding a comparison series of observations on a group that does not receive the treatment. If, for example, roadside sobriety checkpoints were introduced all over the state of Ohio but were not used at all in Michigan, then we could compare auto accidents in Ohio (the treatment series) with auto accidents in Michigan (the comparison series). If checkpoints caused a reduction in alcohol-related accidents, we would expect to see a decline in Ohio following the intervention, but there should be no change or a lesser decline in Michigan over the same time period. The second part of Figure 5.7 shows this design: an interrupted time series with a nonequivalent comparison group. The two series are not equivalent because we did not randomly assign drivers to Ohio or Michigan. So Young Kim and Wesley Skogan (2003) present a good example of this design in their analysis of police problem solving in Chicago. They compared changes in crime in police beats where specific problems were identified and addressed with changes in comparison beats where no crime-specific interventions were developed.

A single-series design may be modified by introducing and then removing the intervention, as shown in the third part of Figure 5.7. We might test sobriety checkpoints by setting them up every weekend for a month and then not setting them up for the next few months. If the checkpoints caused a reduction in alcohol-related accidents, we might expect an increase after they were removed. Or the effects of weekend checkpoints might persist even after we removed them.

Because different states or cities sometimes introduce new drunk-driving programs at different times, we might be able to use what Shadish, Cook, and Campbell (2002, 192) called a “time-series design with switching replications.” The bottom of Figure 5.7 illustrates this design. For example, assume that Ohio begins using checkpoints in May 1998 and Michigan

Simple Interrupted Time Series
O  O  O  O  X  O  O  O  O
t1 t2 t3 t4    t5 t6 t7 t8

Interrupted Time Series with Nonequivalent Comparison Group
O  O  O  O  X  O  O  O  O
O  O  O  O     O  O  O  O
t1 t2 t3 t4    t5 t6 t7 t8

Interrupted Time Series with Removed Treatment
O  O  X  O  O  O  –X O  O  O
t1 t2    t3 t4 t5    t6 t7 t8

Interrupted Time Series with Switching Replications
O  O  O  X  O  O     O  O  O
O  O  O     O  O  X  O  O  O
t1 t2 t3    t4 t5    t6 t7 t8

Figure 5.7 Interrupted Time-Series Designs
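The logic behind two of the designs in Figure 5.7 can be sketched with simple arithmetic. The example below is a hedged illustration rather than an analysis method from the text: the accident counts for Ohio and Michigan are invented, and the function names `pre_post_change` and `comparison_adjusted_change` are our own. It contrasts the change in a treatment series with the change in a comparison series, and then checks whether two series each change at their own intervention point.

```python
from statistics import mean


def pre_post_change(series, intervention_index):
    """Mean of the observations after the intervention minus the mean before it."""
    return mean(series[intervention_index:]) - mean(series[:intervention_index])


def comparison_adjusted_change(treated, comparison, intervention_index):
    """Change in the treatment series beyond the change in the comparison series."""
    return (pre_post_change(treated, intervention_index)
            - pre_post_change(comparison, intervention_index))


# Nonequivalent comparison group (hypothetical data): X falls between t4 and t5.
ohio = [40, 42, 41, 41, 30, 31, 29, 30]      # treatment series
michigan = [38, 39, 40, 39, 38, 40, 39, 39]  # comparison series, no checkpoints
print(comparison_adjusted_change(ohio, michigan, intervention_index=4))

# Switching replications (hypothetical data): each series has its own X.
ohio_sr = [40, 41, 42, 30, 31, 29, 30, 30]      # X after the third observation
michigan_sr = [38, 39, 40, 39, 39, 28, 29, 27]  # X after the fifth observation
print(pre_post_change(ohio_sr, 3), pre_post_change(michigan_sr, 5))
```

With these invented numbers, accidents in the treatment series fall by 11 per period while the comparison series does not change, and in the switching-replications case each series falls by 11 only after its own intervention. Patterns like these would add to our confidence that the program, rather than some outside trend, produced the decline.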


introduces them in July of the same year. A switching-replications design could strengthen our conclusion that checkpoints reduce accidents if we saw that a decline in Ohio began in June and a similar pattern was found in Michigan beginning in August. The fact that similar changes occurred in the dependent variable in different states at different times, corresponding to when the program was introduced, would add to our confidence in stating that sobriety checkpoints reduced auto accidents.

Variable-Oriented Research and Scientific Realism

Another way to think about a time-series design is as a study of one or a few cases with many observations. If we design a time-series study of roadside checkpoints in Ohio, we will be examining one case (Ohio) with many observations of auto accidents. Or a design that compares Ohio and Michigan will examine many observations for two cases. Thinking once again about design building blocks, notice how we have slightly restated one of those building blocks. Instead of considering the number of experimental and control groups, our attention centers on the number of subjects or cases in our study. In Figure 5.7, the first and third time-series designs have one case each, while the second and fourth designs examine two cases each.

Classical experiments and quasi-experiments with large numbers of subjects are examples of what Charles Ragin (2000) terms case-oriented research, in which many cases are examined to understand a small number of variables. Time-series designs and case studies are examples of variable-oriented research, in which a large number of variables are studied for a small number of cases or subjects. Suppose we wish to study inmate-on-inmate assaults in correctional facilities. With a case-oriented approach, we might send a questionnaire to a sample of 500 correctional facilities, asking facility staff to provide information about assaults, facility design, inmate characteristics, and housing conditions. Here, we are gathering information on a few variables from a large number of correctional facilities. Using a variable-oriented approach, we might visit one or a few facilities to conduct in-depth interviews with staff, observe the condition of facilities, and gather information from institutional records. Here, we are collecting information on a wide range of variables from a small number of institutions.

The case-study design is an example of variable-oriented research. Here, the researcher’s attention centers on an in-depth examination of one or a few cases on many dimensions. Robert Yin (2003) points out that the terms case and case study are used broadly. Cases can be individual people, neighborhoods, correctional facilities, courtrooms, or other aggregations.

Robert Yin cautions that the case study design is often misunderstood as representing “qualitative” research or participant observation study. Instead, Yin advises that the case study is a design strategy and that the labels qualitative and quantitative are not useful ways to distinguish design strategies. Case studies might appear qualitative because they focus on one or a small number of units. But many case studies employ sophisticated statistical techniques to examine many variables for those units. An example illustrates how misleading it can be to associate case studies with qualitative research.

In what has come to be known as the “Boston Gun Project,” Anthony Braga and associates (Braga, Kennedy, Waring, and Piehl 2001) studied violence by youth gangs in Boston neighborhoods. Theirs was an applied explanatory study. They worked with local officials to better understand gang violence, develop ways to reduce it, and eventually assess the effects of their interventions. Neither a classical experiment nor a nonequivalent-groups design was possible. Researchers sought to understand and reduce violence by all gangs in the city. Their research centered on gangs, not individuals, though some interventions targeted particular gang members.

Researchers collected a large amount of information about gangs and gang violence from several sources. Researchers used network


analysis to examine relationships between gangs in different neighborhoods and conflicts over turf within neighborhoods. Police records of homicides, assaults, and shootings were studied. Based on extensive data on a small number of gangs, researchers collaborated with public officials, neighborhood organizations, and a coalition of religious leaders (the “faith community”). A variety of interventions were devised, but most were crafted from a detailed understanding of the specific nature of gangs and gang violence as they existed in Boston neighborhoods. David Kennedy (1998) summarizes these using the label “pulling levers,” signifying that key gang members were vulnerable to intensive monitoring via probation or parole. The package of strategies was markedly successful: Youth homicides were reduced from about 35 to 40 each year in the 20 years preceding the program to about 15 per year in the first 5 post-intervention years (Braga 2002, 70).

The Boston research is also a good example of the scientific realist approach of Ray Pawson and Nick Tilley (1997). Researchers examined a small number of subjects (gangs and gang members) in a single city and in the context of specific neighborhoods where gangs were active. Extensive data were gathered on the mechanisms of gang violence. Interventions were tailored to those mechanisms in their context.

Braga and associates (2001) emphasize that the success of the Boston efforts was due to the process by which researchers, public officials, and community members collaboratively studied gang violence and then developed appropriate policy actions based on their analyses. Other jurisdictions mistakenly tried to reproduce Boston’s interventions, with limited or no success, failing to recognize that the interventions were developed specifically for Boston. In case-study language, researchers examined many variables for one site and based policy decisions on that analysis. In the words of scientific realism, researchers studied the gang violence mechanism in the Boston context. In other contexts (Baltimore or Minneapolis, for example), gang violence operated as a different mechanism; the “levers” pulled in Boston did not work elsewhere. Braga and associates emphasize that the problem-solving process is exportable to other settings but that the interventions used in Boston are not (2001, 220).

How do case studies address threats to validity? In the most general sense, case studies attempt to isolate causal mechanisms from possible confounding influences by studying very precisely defined subjects. Donald Campbell (2003, ix–x) likened this to laboratory experiments in the natural sciences, in which researchers try to isolate causal variables from outside influences. Case-study research takes place in natural field settings, not in laboratories. But the logic of trying to isolate causal mechanisms by focusing on one or a few cases is a direct descendant of the rationale for experimental isolation in laboratories.

Figure 5.8 summarizes advice from Yin (2003, 33–39) on how to judge the quality of case-study designs in language that should now be familiar. Construct validity is established through multiple sources of evidence, the establishment of chains of causation that connect independent and dependent variables, and what are termed member checks, in which key informants are asked to review tentative conclusions about causation. Examples of techniques for strengthening internal validity are theory-based pattern matching and time-series analysis. The first criterion follows Shadish, Cook, and Campbell, calling on researchers to make specific theory-based predictions about what pattern of results will support hypothesized causal relationships. Alternative explanations, also termed rival hypotheses, are less persuasive when specific predictions of results are actually obtained. For example, Braga and associates (2001) predicted that gun killings among male Boston residents under age 25 would decline following implementation of the package of interventions in the Boston gun strategy. Although other explanations are possible for the sharp observed declines, the specific focus of


the researchers’ interventions and the concomitant results undermine the credibility of rival hypotheses. Having many measures of variables over time strengthens internal validity if observations support our predicted expectations about cause. We saw earlier how nonequivalent time-series comparisons and switching replications can enhance findings. This is also consistent with pattern matching: we make specific statements about what patterns of results we expect in our observations over time.

Finally, a single case study is vulnerable to external validity threats because it is rooted in the context of a specific site. Conducting multiple case studies in different sites illustrates the principle of replication. By replicating research findings, we accumulate evidence. We may also find that causal relationships are different in different settings, as did researchers who tried to transplant specific interventions from the Boston Gun Project. Although such findings can undermine the generalizability of causality, they also help us understand how causal mechanisms can operate differently in different settings.

Time-series designs and case studies are examples of variable-oriented research. A case study with many observations over time can be an example of a time-series design. Adding one or more other cases offers opportunities to create nonequivalent comparisons. Time-series designs, case studies, and nonequivalent comparisons are quasi-experimental designs; they are conducted in the manner of experiments, using design building blocks in different ways.

Case Study Approach
Construct Validity: Multiple sources of evidence; Establish chain of causation; Member checks
Internal Validity: Pattern-matching; Time-series analysis
External Validity: Replicate through multiple case studies

Figure 5.8 Case Studies and Validity
Source: Adapted from Yin (2003, 34).

Experimental and Quasi-Experimental Designs Summarized

Understanding the building blocks of research design and adapting them accordingly works better than trying to apply the same design to all research questions.

By now it should be clear that there are no simple formulas or recipes for designing an experimental or quasi-experimental study. Researchers have an almost infinite variety of ways of varying the number and composition of groups of subjects, selecting subjects, determining how many observations to make, and deciding what types of experimental stimuli to introduce or study.

Variations on experimental and quasi-experimental designs are constructed for basic and applied explanatory studies. As we stated early in this chapter, experiments are best suited to topics that involve well-defined concepts and propositions. Experiments and quasi-experiments also require that researchers be able to exercise, or at least approximate, some degree of control over an experimental stimulus. Finally, these designs depend on the ability to unambiguously establish the time order of experimental treatments and observations on the dependent variable. Often it is not possible to achieve the necessary degree of control.

In designing research projects, researchers should be alert to opportunities for using experimental designs. Researchers should also be aware of how quasi-experimental designs can be developed when randomization is not possible. Experiments and quasi-experiments lend themselves to a logical rigor that is often much more difficult to achieve in other modes of observation. The building blocks of research design can be used in creative ways to address a variety of criminal justice research questions.


Careful attention to design issues, and to how design elements can reduce validity threats, is essential to the research process.

✪ Main Points

• Experiments are an excellent vehicle for the controlled testing of causal processes. Experiments may also be appropriate for evaluation studies.

• The classical experiment tests the effect of an experimental stimulus on some dependent variable through the pretesting and posttesting of experimental and control groups.

• It is less important that a group of experimental subjects be representative of some larger population than that experimental and control groups be similar to each other.

• Randomization is the best way to achieve comparability in the experimental and control groups.

• The classical experiment with random assignment of subjects guards against most of the threats to internal invalidity.

• Because experiments often take place under controlled conditions, results may not be generalizable to real-world constructs. Or findings from an experiment in one setting may not apply to other settings.

• The classical experiment may be modified to suit specific research purposes by changing the number of experimental and control groups, the number and types of experimental stimuli, and the number of pretest or posttest measurements.

• Quasi-experiments may be conducted when it is not possible or desirable to use an experimental design.

• Nonequivalent-groups and time-series designs are two general types of quasi-experiments.

• Time-series designs and case studies are examples of variable-oriented research, in which a large number of variables are examined for one or a few cases.

• Both experiments and quasi-experiments may be customized by using design building blocks to suit particular research purposes.

• Not all research purposes and questions are amenable to experimental or quasi-experimental designs because researchers may not be able to exercise the required degree of control.

✪ Key Terms

case study, p. 133
case-oriented research, p. 133
classical experiment, p. 113
control group, p. 115
dependent variable, p. 114
experimental group, p. 115
generalizability, p. 121
independent variable, p. 114
quasi-experiment, p. 125
randomization, p. 117
variable-oriented research, p. 133

✪ Review Questions and Exercises

1. If you do not remember participating in D.A.R.E. (Drug Abuse Resistance Education), you have probably heard or read something about it. Describe an experimental design to test the causal hypothesis that D.A.R.E. reduces drug use. Is your experimental design feasible? Why or why not?

2. Experiments are often conducted in public health research where a distinction is made between an efficacy experiment and an effectiveness experiment. Efficacy experiments focus on whether a new health program works under ideal conditions; effectiveness experiments test the program under typical conditions that health professionals encounter in their day-to-day work. Discuss how efficacy experiments and effectiveness experiments reflect concerns about internal validity threats on the one hand and generalizability on the other.

3. Crime hot spots are areas where crime reports, calls for police service, or other measures of crime are especially common. Police in departments with a good analytic capability routinely identify hot spots and launch special tactics to reduce crime in these areas. What kinds of validity threats should researchers be especially attentive to in studying the effects of police interventions on hot spots?

✪ Additional Readings

Campbell, Donald T., and Julian Stanley, Experimental and Quasi-Experimental Designs for Research (Chicago: Rand McNally, 1966). This short book provides an excellent analysis of the logic and methods of experimentation in social research and is widely cited as the classic discussion of validity threats.

Kim, So Young, and Wesley G. Skogan, “Statistical Analysis of Time Series Data on Problem Solving,” Community Policing Working Paper #27 (Center for Policy Research, Northwestern University, 2003; www.northwestern.edu/ipr/publications/policing.html; accessed May 21, 2008). Kim and Skogan present a number of time-series studies to examine the effects of problem solving by Chicago police. This is a good example of switching replications time-series designs by researchers at the university where Campbell and Cook did their pioneering work on quasi-experimental designs.

Pawson, Ray, and Nick Tilley, Realistic Evaluation (Thousand Oaks, CA: Sage, 1997). We mentioned this book in Chapter 3. Pawson and Tilley argue that experiments and quasi-experiments focus too narrowly on threats to internal validity. Instead, they propose a different view of causation and different approaches to assessing cause.

Shadish, William R., Thomas D. Cook, and Donald T. Campbell, Experimental and Quasi-Experimental Designs for Generalized Causal Inference (Boston: Houghton Mifflin, 2002). An update of the definitive guide to quasi-experimentation, this book focuses on basic principles of research design. In addition to numerous pointers on designing research, the authors stress that designing out validity threats is much preferred to trying to control them through later statistical analysis.

Weisburd, David, Cynthia M. Lum, and Anthony Petrosino, “Does Research Design Affect Study Outcomes in Criminal Justice?” The Annals 578 (2001): 50–70. The authors make the intriguing claim that stronger experimental designs are more likely to find no causal relationships, whereas quasi-experimental designs more often find relationships. Read this article carefully (whether or not you complete the exercise described above), and decide whether you agree with the authors’ conclusions.

Yin, Robert K., Case Study Research: Design and Methods, 3rd ed. (Thousand Oaks, CA: Sage, 2003). Many people incorrectly associate case studies with qualitative research. Yin describes a variety of case-study designs as quasi-experiments. In doing so, he is consistent with how Shadish, Cook, and Campbell (2002) describe case studies.


Part Three

Modes of Observation

Having covered the basics of structuring research, from general issues to research design, let’s dive into the various observational techniques available for criminal justice research.

Chapter 6 examines how social scientists go about selecting people or things for observation. Our discussion of sampling addresses the fundamental scientific issue of generalizability. As we’ll see, it is possible for us to select a few people or things for observation and then apply what we observe to a much larger group of people or things than we actually observed. It is possible, for example, to ask a thousand people how they feel about “three strikes and you’re out” laws and then accurately predict how tens of millions of people feel about it.

Chapter 7 describes survey research and other techniques for collecting data by asking people questions. We’ll cover different ways of asking questions and discuss the various uses of surveys and related techniques in criminal justice research.

Chapter 8, on field research, examines what is perhaps the most natural form of data collection: the direct observation of phenomena in natural settings. As we will see, observations can be highly structured and systematic (such as counting pedestrians who walk by a specified point) or less structured and more flexible.

Chapter 9 discusses ways to take advantage of some of the data available all around us. Researchers often examine data collected by criminal justice and other public agencies. Content analysis is a method of collecting data through carefully specifying and counting communications such as news stories, court opinions, or even recorded visual images. Criminal justice researchers may also conduct secondary analysis of data collected by others.


Chapter 6

Sampling

Sampling makes it possible to select a few hundred or thousand people for study and discover things that apply to many more people who are not studied.

Introduction 141

The Logic of Probability Sampling 141

Conscious and Unconscious Sampling Bias 143

Representativeness and Probability of Selection 144

Probability Theory and Sampling Distribution 145

The Sampling Distribution of 10 Cases 145

From Sampling Distribution to Parameter Estimate 149

Estimating Sampling Error 150

Confidence Levels and Confidence Intervals 151

Probability Theory and Sampling Distribution Summed Up 152

Populations and Sampling Frames 153

Types of Sampling Designs 154

Simple Random Sampling 154

Systematic Sampling 154

Stratified Sampling 155

Disproportionate Stratified Sampling 156

Multistage Cluster Sampling 157

Multistage Cluster Sampling with Stratification 158

Illustration: Two National Crime Surveys 160

The National Crime Victimization Survey 160

The British Crime Survey 161


Probability Sampling in Review 162

Nonprobability Sampling 162

Purposive Sampling 162

Quota Sampling 163

Reliance on Available Subjects 164

Snowball Sampling 165

Nonprobability Sampling in Review 166

Introduction

How we collect representative data is fundamental to criminal justice research.

Much of the value of research depends on how data are collected. A critical part of criminal justice research is deciding what will be observed and what won’t. If you want to study drug users, for example, which drug users should you study? This chapter discusses the logic and fundamental principles of sampling, then describes different general approaches for selecting subjects or other units.

Sampling is the process of selecting observations. Sampling is ordinarily used to select observations for one of two related reasons. First, it is often not possible to collect information from all persons or other units we wish to study. We may wish to know what proportion of all persons arrested in U.S. cities have recently used drugs, but collecting all that data would be virtually impossible. Thus, we have to look at a sample of observations.

The second reason for sampling is that it is often not necessary to collect data from all persons or other units. Probability sampling techniques enable us to make relatively few observations and then generalize from those observations to a much wider population. If we are interested in what proportion of high school students have used marijuana, collecting data from a probability sample of a few thousand students will serve just as well as trying to study every high school student in the country.
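A short simulation can illustrate why a few thousand observations may be enough. The sketch below rests entirely on assumptions of ours: a made-up population of 200,000 students, a made-up usage rate of 30 percent, and a simple random sample of 2,000 drawn with Python's standard `random` module. It is not data from any actual survey.

```python
import random

rng = random.Random(42)  # fixed seed so the sketch is repeatable

# Hypothetical population: 200,000 students, 60,000 of whom (30%) have used marijuana.
population = [1] * 60_000 + [0] * 140_000
true_rate = sum(population) / len(population)

# Simple random sample of 2,000 students, drawn without replacement.
sample = rng.sample(population, 2_000)
estimate = sum(sample) / len(sample)

print(f"population rate: {true_rate:.3f}")
print(f"sample estimate: {estimate:.3f}")
```

The sample covers only 1 percent of this invented population, yet its estimate should land close to the population rate. How close we can expect it to be is the subject of sampling error, which is taken up later in the chapter.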

Although probability sampling is central to criminal justice research, it cannot be used in many situations of interest. A variety of nonprobability sampling techniques are available in such cases. Nonprobability sampling has its own logic and can provide useful samples for criminal justice inquiry. In this chapter, we examine both the advantages and the shortcomings of such methods, and we discuss where they fit in the larger picture of sampling and collecting data. Keep in mind one important goal of all sampling: to reduce, or at least understand, potential biases that may be at work in selecting subjects.

The Logic of Probability Sampling

Probability sampling helps researchers generalize from observed cases to unobserved ones.

In selecting a group of subjects for study, social science researchers often use some type of sampling. Sampling in general refers to selecting part of a population. In selecting samples, we want to do two related things. First, we select samples to represent some larger population of people or other things. If we are interested in attitudes about a community correctional facility, we might draw a sample of neighborhood residents, ask them some questions, and use their responses to represent the attitudes of all neighborhood residents. Or, in studying cases in a criminal court, we may not be able to examine all cases, so we select a sample to represent that population of all cases processed through some court.

Second, we may want to generalize from a sample to an unobserved population the sample is intended to represent. If we interview a sample of community residents, we may want to generalize our findings to all community residents, both those we interviewed and those we did not. We might similarly expect that our sample of criminal court cases can be generalized to the population of all criminal court cases.

A special type of sampling that enables us to generalize to a larger population is known as probability sampling, a method of selection in which each member of a population has a known chance or probability of being selected. Knowing the probability that any individual member of a population could be selected makes it possible for us to make predictions that our sample accurately represents the larger population.
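What a "known chance" of selection means can be checked with a quick experiment. In a simple random sample of n members drawn without replacement from a population of N, each member's chance of selection is n/N. The sketch below is an assumed illustration: the population of 100, the sample of 10, the member labeled 37, and the function name `inclusion_frequency` are all our own choices.

```python
import random


def inclusion_frequency(pop_size, sample_size, member, draws, seed=0):
    """Share of repeated simple random samples that include one given member."""
    rng = random.Random(seed)
    hits = sum(
        member in rng.sample(range(pop_size), sample_size)
        for _ in range(draws)
    )
    return hits / draws


expected = 10 / 100  # n / N for a sample of 10 from a population of 100
observed = inclusion_frequency(pop_size=100, sample_size=10, member=37, draws=20_000)
print(f"expected {expected:.3f}, observed {observed:.3f}")
```

Over many repeated samples, the observed share should settle near the expected value of 0.10, and the same would hold for any other member we chose to track.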

If all members of a population are identical in all respects (demographic characteristics, attitudes, experiences, behaviors, and so on), there is no need for careful sampling procedures. Any sample will be sufficient. In this extreme case of homogeneity, in fact, a single case will be sufficient as a sample to study characteristics of the whole population.

In reality, of course, the human beings who make up any real population are heterogeneous, varying in many ways. Figure 6.1 offers a simplified illustration of a heterogeneous population: the 100 members of this small population differ by gender and race. We’ll use this hypothetical micropopulation to illustrate various aspects of sampling.

A sample of individuals from a population, if it is to provide useful descriptions of the total population, must contain essentially the same variations that exist in the population. This is not as simple as it might seem. Let’s look at some of the possible biases in selection or ways researchers might go astray. Then we will see how probability sampling provides an efficient method for selecting a sample that should adequately reflect variations that exist in the population.

[Figure legend: 44 white women; 44 white men; 6 African American women; 6 African American men]

Figure 6.1 A Population of 100 People

Conscious and Unconscious Sampling Bias

At first glance, it may seem as if sampling is a rather straightforward matter. To select a sample of 100 lawyers, a researcher might simply go to a courthouse and interview the first 100 lawyers who walk through the door. This kind of sampling method is often used by untrained researchers, but it is subject to serious biases. In connection with sampling, bias simply means that those selected are not “typical” or “representative” of the larger populations they have been chosen from. This kind of bias is virtually inevitable when a researcher picks subjects casually.

Figure 6.2 illustrates what can happen when we simply select people who are convenient for study. Although women make up only 50 percent of our micropopulation, those closest to the researcher (people in the upper right-hand corner of Figure 6.2) happen to be 70 percent women. Although the population is 12 percent African American, none were selected into this sample of people who happened to be conveniently situated near the researcher.

Moving beyond the risks inherent in simply studying people who are convenient, we need to consider other potential problems as well. To begin, our own personal leanings or biases may affect the sample selected in this manner; hence, the sample will not truly represent the population of lawyers. Suppose a researcher is a little intimidated by lawyers who look particularly prosperous, believing that they might ridicule his research effort. He might consciously or unconsciously avoid interviewing them. Or he might believe that the attitudes of “establishment” lawyers are irrelevant to his research purposes and avoid interviewing them.

Even if the researcher seeks to interview a “balanced” group of lawyers, he won’t know the exact proportions of different types of lawyers who make up such a balance and won’t always be able to identify the different types merely by watching them walk by.

Figure 6.2 A Sample of Convenience: Easy, but Not Representative

The researcher might make a conscious effort to interview, say, every 10th lawyer who enters the courthouse, but he still cannot be sure of a representative sample because different types of lawyers visit the courthouse with different frequencies, and some never go to the courthouse at all. Thus, the resulting sample will overrepresent lawyers who visit the courthouse more often.

Similarly, “call-in polls” (in which radio stations ask people to call specified telephone numbers to register their opinions) cannot be trusted to represent the general population. At the very least, not everyone in the population is even aware of the poll. Those who are aware of it have some things in common simply because they listen to the same radio station. As market researchers understand very well, a classical music station has a different audience than a hard rock station. Adding even more bias to the sample, those who are motivated to take part in the poll are probably different from others who are not so motivated.

A similar problem affects polls linked to weblogs or mass e-mail. Blogs tend to be selective; people regularly visit blogs that present views on personal and political issues they endorse (Hewitt 2005). As a result, the population of people who respond to weblog polls can only represent the population of people who regularly visit individual blogs. As a general principle, the more self-selection is involved, the more bias will be introduced into the sample.

The possibilities for inadvertent sampling bias are endless and not always obvious. Fortunately, some techniques can help us avoid bias.

Representativeness and Probability of Selection

Although the term representativeness has no precise, scientific meaning, it carries a commonsense meaning that makes it useful in the discussion of sampling. As we’ll use the term here, a sample is representative of the population from which it is selected if the aggregate characteristics of the sample closely approximate those same aggregate characteristics in the population. If the population, for example, contains 50 percent women, a representative sample will also contain “close to” 50 percent women. Later in this chapter, we’ll discuss “how close” in detail. Notice that samples need not be representative in all respects; representativeness is limited to those characteristics that are relevant to the substantive interests of the study.

A basic principle of probability sampling is that a sample will be representative of the population from which it is selected if all members of the population have an equal chance of being selected in the sample. Samples that have this quality are often labeled equal probability of selection method (EPSEM) samples. This principle forms the basis of probability sampling.

Even carefully selected EPSEM samples are seldom, if ever, perfectly representative of the populations from which they are drawn. Nevertheless, probability sampling offers two special advantages. First, probability samples, though never perfectly representative, are typically more representative than other types of samples because they avoid the biases discussed in the preceding section. In practice, there is a greater likelihood that a probability sample will be representative of the population from which it is drawn than that a nonprobability sample will be.

Second, and more importantly, probability sampling permits us to estimate the accuracy or representativeness of the sample. Conceivably, a researcher might wholly by chance select a sample that closely represents the larger population. The odds are against doing so, however, and we cannot estimate the likelihood that a haphazard sample will achieve representativeness. The probability sample can provide an accurate estimate of success or failure, because probability samples enable us to draw on probability theory.

Probability Theory and Sampling Distribution

Probability theory permits inferences about how sampled data are distributed around the value found in a larger population.

With a basic understanding of the logic of probability sampling in hand, we can examine how probability sampling works in practice. We will then be able to devise specific sampling techniques and assess the results of those techniques. To do so, we first need to understand four important concepts.

A sample element is that unit about which information is collected and that provides the basis of analysis. Typically, in survey research, elements are people or certain types of people. However, other kinds of units can be the elements for criminal justice research: correctional facilities, police beats, or court cases, for example. Elements and units of analysis are often the same in a given study, although the former refers to sample selection and the latter to data analysis.

A population is the theoretically specified grouping of study elements. Whereas the vague term delinquents might describe the target for a study, a more precise description of the population includes the definition of the element delinquents (for example, a person charged with a delinquent offense) and the time referent for the study (charged with a delinquent offense in the previous six months). Translating the abstract adult drug addicts into a workable population requires specifying the age that defines adult and the level of drug use that constitutes an addict. Specifying college student includes a consideration of full- and part-time students, degree and nondegree candidates, undergraduate and graduate students, and so on.

A population parameter is the value for a given variable in a population. The average income of all families in a city and the age distribution of the city’s population are parameters. An important portion of criminal justice research involves estimating population parameters on the basis of sample observations.

The summary description of a given variable in the sample is called a sample statistic. Sample statistics are used to make estimates of population parameters. Thus, the average income computed from a sample and the age distribution of that sample are statistics, and those statistics are used to estimate income and age parameters in a population.

The ultimate purpose of sampling is to select a set of elements from a population in such a way that descriptions of those elements (sample statistics) accurately portray the parameters of the total population from which the elements are selected. Probability sampling enhances the likelihood of accomplishing this aim and also provides methods for estimating the degree of probable success.

The key to this process is random selection. In random selection, each element has an equal chance of being selected independent of any other event in the selection process. Flipping a coin is the most frequently cited example: the “selection” of a head or a tail is independent of previous selections of heads or tails.

There are two reasons for using random selection methods. First, this procedure serves as a check on conscious or unconscious bias on the part of the researcher. The researcher who selects cases on an intuitive basis might choose cases that will support his or her research expectations or hypotheses. Random selection erases this danger. Second, and more importantly, with random selection we can draw on probability theory, which allows us to estimate population parameters and to estimate how accurate our statistics are likely to be.
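As a rough illustration of random selection, the sketch below rebuilds the 100-person micropopulation of Figure 6.1 and lets a random number generator, rather than the researcher, choose who is selected. The function name `draw_random_sample`, the sample size of 20, and the seed are assumptions made for this example only.

```python
import random

# Hypothetical reconstruction of the Figure 6.1 micropopulation:
# 44 white women, 44 white men, 6 African American women, 6 African American men.
population = (
    [("white", "woman")] * 44
    + [("white", "man")] * 44
    + [("African American", "woman")] * 6
    + [("African American", "man")] * 6
)

def draw_random_sample(pop, size, seed=None):
    """Select `size` members without replacement; every member has the same chance."""
    rng = random.Random(seed)
    return rng.sample(pop, size)

sample = draw_random_sample(population, 20, seed=1)
pct_women = 100 * sum(1 for race, gender in sample if gender == "woman") / len(sample)
print(f"{len(sample)} people selected, {pct_women:.0f}% women")
```

Because the generator has no preferences, the researcher's leanings cannot steer the selection, although any single sample may still differ from the population by chance.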

The Sampling Distribution of 10 Cases

Suppose there are 10 people in a group, and each has a certain amount of money in his or her pocket. To simplify, let’s assume that one person has no money, another has $1, another has $2, and so forth up to the person who has $9. Figure 6.3 illustrates this population of 10 people.

Our task is to determine the average amount of money one person has, specifically the mean number of dollars. If you simply add up the money shown in Figure 6.3, the total is $45, so the mean is $4.50 (45 ÷ 10). Our purpose in the rest of this example is to estimate that mean without actually observing all 10 individuals. We’ll do that by selecting random samples from the population and using the means of those samples to estimate the mean for the whole population.

To start, suppose we select, at random, a sample of only 1 person from the 10. Depending on which person we select, we will estimate the group’s mean as anywhere from $0 to $9. Figure 6.4 shows a display of those 10 possible samples. The 10 dots shown on the graph represent the 10 “sample” means we will get as estimates of the population. The range of the dots on the graph is the sampling distribution, defined as the range of sample statistics we will obtain if we select many samples. Figure 6.4 shows how all of our possible samples of 1 are distributed. Obviously, it is not a good idea to select a sample of only 1 because we stand a good chance of missing the true mean of $4.50 by quite a bit.

What if we take samples of 2 each? As you can see from Figure 6.5, increasing the sample size improves our estimations. Once again, each dot represents a possible sample. There are 45 possible samples of two elements: $0/$1, $0/$2, . . . , $7/$8, $8/$9. Moreover, some of these samples produce the same means. For example, $0/$6, $1/$5, and $2/$4 all produce means of $3. In Figure 6.5, the three dots shown above the $3 mean represent those 3 samples.

Notice that the means we get from the 45 samples are not evenly distributed. Rather, they are somewhat clustered around the true value of $4.50. Only 2 samples deviate by as much as $4 from the true value ($0/$1 and $8/$9), whereas 5 of the samples give the true estimate of $4.50, and another 8 samples miss the mark by only $.50 (plus or minus).
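The counts quoted for samples of 2 can be checked by listing every possible pair. The following is a minimal sketch; the variable names are chosen for this example and are not part of the text.

```python
from itertools import combinations
from collections import Counter

dollars = range(10)            # the 10 people hold $0, $1, ..., $9
true_mean = sum(dollars) / 10  # 4.5

# Every possible sample of 2 distinct people and the mean it produces.
pair_means = [(a + b) / 2 for a, b in combinations(dollars, 2)]
counts = Counter(pair_means)

exact = counts[true_mean]                                        # samples with mean $4.50
off_by_half = counts[true_mean - 0.5] + counts[true_mean + 0.5]  # samples off by $.50
off_by_four = sum(1 for m in pair_means if abs(m - true_mean) >= 4)

print(len(pair_means), exact, off_by_half, off_by_four)  # prints: 45 5 8 2
```

The enumeration reproduces the figures in the text: 45 samples in all, 5 that hit the true mean, 8 that miss by $.50, and 2 that miss by $4.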

Figure 6.3 A Population of 10 People with $0 to $9

Figure 6.4 The Sampling Distribution of Samples of 1 (horizontal axis: estimate of mean, $0 to $9; vertical axis: number of samples, total = 10; true mean = $4.50)

Figure 6.5 The Sampling Distribution of Samples of 2 (horizontal axis: estimate of mean, $0 to $9; vertical axis: number of samples, total = 45; true mean = $4.50)

Figure 6.6 The Sampling Distribution of Samples of 3, 4, 5, and 6 (panels: A. Samples of 3, total = 120; B. Samples of 4, total = 210; C. Samples of 5, total = 252; D. Samples of 6, total = 210; horizontal axis in each panel: estimate of mean, $0 to $9; vertical axis: number of samples; true mean = $4.50)

Now suppose we select even larger samples. What will that do to our estimates of the mean? Figure 6.6 presents the sampling distributions of samples of 3, 4, 5, and 6. The progression of the sampling distributions is clear. Every increase in sample size improves the distribution of estimates of the mean in two related ways. First, in the distribution for samples of 5, for example, no sample means are at the extreme ends of the distribution. Why? Because it is not possible to select five elements from our population and obtain an average of less than $2 or greater than $7. The second way sampling distributions improve with larger samples is that sample means cluster more and more around the true population mean of $4.50. Figure 6.6 clearly shows this tendency.
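Both improvements can be seen by enumerating every possible sample of each size from the 10-person population. This sketch uses the standard deviation of the sample means as one convenient measure of clustering; that choice of measure, and the helper name `sampling_distribution`, are assumptions of the example rather than something the text prescribes.

```python
from itertools import combinations
from statistics import pstdev

dollars = list(range(10))  # $0 through $9

def sampling_distribution(values, size):
    """Return the mean of every possible sample of `size` distinct elements."""
    return [sum(combo) / size for combo in combinations(values, size)]

summary = {}
for size in range(1, 7):
    means = sampling_distribution(dollars, size)
    # (number of samples, lowest mean, highest mean, spread of the means)
    summary[size] = (len(means), min(means), max(means), pstdev(means))
    count, low, high, spread = summary[size]
    print(f"n={size}: {count:3d} samples, means from ${low:.2f} to ${high:.2f}, spread {spread:.2f}")
```

The range of possible means narrows and the spread shrinks with every increase in sample size; for samples of 5 the means run only from $2.00 to $7.00.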

From Sampling Distribution to Parameter Estimate

Let’s turn now to a more realistic sampling situation and see how the notion of sampling distribution applies. Assume that we wish to study the population of Placid Coast, California, to assess the levels of approval or disapproval of a proposed law to ban possession of handguns within the city limits.

Our target population is all adult residents. In order to draw an actual sample, we need some sort of list of elements in our population; such a list is called a sampling frame. Assume our sampling frame is a voter registration list of, say, 20,000 registered voters in Placid Coast. The elements are the individual registered voters.

The variable under consideration is attitudes toward the proposed law: approve or disapprove. Measured in this way, attitude toward the law is a binomial variable; it can have only two values. We’ll select a random sample of, say, 100 persons for the purpose of estimating the population parameter for approval of the proposed law.

Figure 6.7 presents all the possible values of this parameter in the population, from 0 percent approval to 100 percent approval. The midpoint of the line, 50 percent, represents half the voters approving of the handgun ban and the other half disapproving.

To choose our sample, we assign each person on the voter registration list a number and use a computer program to generate 100 random numbers. Then we interview the 100 people whose numbers have been selected and ask for their attitudes toward the handgun ban: whether they approve or disapprove. Suppose this operation gives us 48 people who approve of the law and 52 who disapprove. We present this statistic by placing a dot at the point representing 48 percent, as shown in Figure 6.8.
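The selection step might look like the sketch below. It assumes voters are numbered 1 through 20,000 in list order and that no voter should be chosen twice; the seed value is arbitrary and is used only so the illustration can be repeated.

```python
import random

FRAME_SIZE = 20_000   # registered voters on the list, numbered 1..20000 (assumed numbering)
SAMPLE_SIZE = 100

rng = random.Random(2024)  # arbitrary seed for a repeatable illustration
# random.sample draws without replacement, so no voter number appears twice.
selected_numbers = sorted(rng.sample(range(1, FRAME_SIZE + 1), SAMPLE_SIZE))
print(selected_numbers[:5])  # the first few voter numbers to interview
```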

Figure 6.7 The Range of Possible Sample Study Results (a line from 0 to 100 showing the percentage of voters approving of the proposed law)

Figure 6.8 Results Produced by Three Hypothetical Samples (Sample 1 at 48%, Sample 2 at 51%, and Sample 3 at 52% on the same 0 to 100 line)


Now suppose we select another sample of 100 people in exactly the same fashion and measure their approval or disapproval of the proposed law. Perhaps 51 people in the second sample approve of the law. We place another dot in the appropriate place on the line in Figure 6.8. Repeating this process once more, we may discover that 52 people in the third sample approve of the handgun ban; we add a third dot to Figure 6.8.

Figure 6.8 now presents the three different sample statistics that represent the percentages of people in each of the three random samples who approved of the proposed law. Each of the random samples, then, gives us an estimate of the percentage of people in the total population of registered voters who approve of the handgun law. Unfortunately, we now have three separate estimates.

To rescue ourselves from this dilemma, let’s draw more and more samples of 100 registered voters each, question each of the samples concerning their approval or disapproval, and plot the new sample statistics on our summary graph. In drawing many such samples, we discover that some of the new samples provide duplicate estimates, as in our earlier illustration with 10 cases. Figure 6.9 shows the sampling distribution of hundreds of samples. This is often referred to as a normal or bell-shaped curve.

Notice that by increasing the number of samples selected and interviewed we have also increased the range of estimates provided by the sampling operation. In one sense, we have increased our dilemma in attempting to find the parameter in the population. Fortunately, probability theory provides certain important rules about the sampling distribution shown in Figure 6.9.
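A simulation can stand in for the hundreds of samples behind a figure like 6.9. The sketch below makes two assumptions that are not given at this point in the text: the true approval share is 50 percent, and each respondent is an independent draw (as if the population were very large). The function name and the choice of 1,000 samples are also illustrative.

```python
import random
from collections import Counter

def simulate_sample_percentages(true_share, sample_size, num_samples, seed=0):
    """Percent approving in each of `num_samples` independent random samples."""
    rng = random.Random(seed)
    results = []
    for _ in range(num_samples):
        approvals = sum(1 for _voter in range(sample_size) if rng.random() < true_share)
        results.append(100 * approvals / sample_size)
    return results

estimates = simulate_sample_percentages(true_share=0.5, sample_size=100, num_samples=1000)

# Crude text histogram: one row per percentage, one star per sample.
histogram = Counter(round(e) for e in estimates)
for pct in sorted(histogram):
    print(f"{pct:3d}% {'*' * histogram[pct]}")
```

The rows near 50 percent are the longest and the rows thin out on either side, which is the bell shape the text describes.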

Estimating Sampling Error

Probability theory can help resolve our dilemma with some basic statistical concepts. First, if many independent random samples are selected from a population, then the sample statistics provided by those samples will be distributed around the population parameter in a known way. Thus, although Figure 6.9 shows a wide range of estimates, more of them are in the vicinity of 50 percent than elsewhere in the graph. Probability theory tells us, then, that the true value is in the vicinity of 50 percent.

Second, probability theory gives us a formula for estimating how closely the sample statistics are clustered around the true value:

$$ s = \sqrt{\frac{p \times q}{n}} $$

where s is the standard error (defined as a measure of sampling error), n is the number of cases in each sample, and p and q are the population parameters for the binomial. If 60 percent of registered voters approve of the ban on handguns and 40 percent disapprove, then p and q are 60 percent and 40 percent, or .6 and .4, respectively.

Figure 6.9 The Sampling Distribution (horizontal axis: percentage of voters approving of the proposed law, 0 to 100; vertical axis: number of samples)

To see how probability theory makes it possible for us to estimate sampling error, suppose that in reality 50 percent of the people approve of the proposed law and 50 percent disapprove. These are the population parameters we are trying to estimate with our samples. Recall that we have been selecting samples of 100 cases each. When these numbers are plugged into the formula, we get:

$$ s = \sqrt{\frac{.5 \times .5}{100}} = .05 $$

The standard error equals .05, or 5 percent.

In probability theory, the standard error is a valuable piece of information because it indicates how closely the sample estimates will be distributed around the population parameter. A larger standard error indicates sample estimates are widely dispersed, while a smaller standard error means that estimates are more clustered around a population parameter. Probability theory tells us that approximately 34 percent (.3413) of the sample estimates will fall within one standard error increment above the population parameter, and another 34 percent will fall within one standard error increment below the parameter. In our example, the standard error increment is 5 percent, so we know that 34 percent of our samples will give estimates of approval between 50 percent (the parameter) and 55 percent (one standard error above); another 34 percent of the samples will give estimates between 50 and 45 percent (one standard error below the parameter). Taken together, then, we know that roughly two-thirds (68 percent) of the samples will give estimates between 45 and 55 percent, which is within 5 percent of the parameter.
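The arithmetic in this example can be reproduced in a few lines. This is a minimal sketch of the binomial standard error formula as stated in the text; the function name `standard_error` is an assumption of the example.

```python
import math

def standard_error(p, n):
    """Standard error of a binomial proportion: sqrt(p * q / n), with q = 1 - p."""
    q = 1 - p
    return math.sqrt(p * q / n)

p, n = 0.5, 100                 # population parameter and sample size from the example
s = standard_error(p, n)
low, high = p - s, p + s        # one standard error below and above the parameter
print(f"s = {s:.2f}; about 68% of sample estimates fall between {low:.0%} and {high:.0%}")
```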

The standard error is also a function of the sample size, specifically an inverse function. This means that as the sample size increases, the standard error decreases. And as the sample size increases, the several samples will be clustered nearer to the true value. Figure 6.6 illustrates this clustering. Another rule of thumb is evident in the formula for the standard error: Because of the square root operation, the standard error is reduced by half if the sample size is quadrupled. In our example, samples of 100 produce a standard error of 5 percent; to reduce the standard error to 2.5 percent, we would have to increase the sample size to 400.
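The quadrupling rule can be checked directly. The sketch below holds p at .5, as in the running example, and evaluates the formula at sample sizes of 100, 400, and 1,600; the third size is added here only to show the pattern continuing.

```python
import math

def standard_error(p, n):
    """Binomial standard error, sqrt(p * (1 - p) / n)."""
    return math.sqrt(p * (1 - p) / n)

# Each sample size is four times the previous one.
errors = {n: standard_error(0.5, n) for n in (100, 400, 1600)}
for n, s in errors.items():
    print(f"n = {n:4d}: standard error = {s:.4f}")
```

Each fourfold increase in sample size cuts the standard error in half, so precision becomes progressively more expensive to buy.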

All of this information is provided by established probability theory in reference to the selection of large numbers of random samples. If the population parameter is known and many random samples are selected, probability theory allows us to predict how many of the samples will fall within specified intervals from the parameter.

Of course, this discussion illustrates only the logic of probability sampling. It does not describe the way research is actually conducted. Usually, we do not know the parameter; we conduct a sample survey precisely because we want to estimate that value. Moreover, we don’t actually select large numbers of samples; we select only one sample. What probability theory does is provide the basis for making inferences about the typical research situation. Knowing what it would be like to select thousands of samples allows us to make assumptions about the one sample we do select and study.

Confidence Levels and Confidence Intervals

Probability theory specifies that 68 percent of that fictitious large number of samples will produce estimates that fall within one standard error of the parameter. As researchers, we can turn the logic around and infer that any single random sample has a 68 percent chance of falling within that range. In this regard, we speak of confidence levels: we are 68 percent confident that our sample estimate is within one standard error of the parameter. Or we may say that we are 95 percent confident that the sample statistic is within two standard errors of the parameter, and so forth. Quite reasonably, our confidence level increases as the margin for error is extended. We are virtually positive (99.7 percent) that our statistic is within three standard errors of the true value.

Although we may be confident (at some level) of being within a certain range of the parameter, we seldom know what the parameter is. To resolve this dilemma, we substitute our sample estimate for the parameter in the formula; lacking the true value, we substitute the best available guess.

The result of these inferences and estimations is that we are able to estimate a population parameter and also the expected degree of error on the basis of one sample drawn from a population. We begin with this question: what percentage of the registered voters in Placid Coast approve of the proposed handgun ban? We select a random sample of 100 registered voters and interview them. We might then report that our best estimate is that 50 percent of registered voters approve of the gun ban and that we are 95 percent confident that between 40 and 60 percent (plus or minus two standard errors) approve. The range from 40 to 60 percent is called the confidence interval. At the 68 percent confidence level, the confidence interval is 45 to 55 percent.
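The two intervals reported above follow from substituting the sample estimate into the standard error formula. The sketch below uses the text's rule of thumb that one standard error corresponds to roughly 68 percent confidence and two standard errors to roughly 95 percent; the function name `confidence_interval` is illustrative.

```python
import math

def confidence_interval(p_hat, n, num_standard_errors):
    """Interval of plus or minus `num_standard_errors` around the sample estimate."""
    s = math.sqrt(p_hat * (1 - p_hat) / n)   # sample estimate stands in for the parameter
    margin = num_standard_errors * s
    return p_hat - margin, p_hat + margin

intervals = {}
for level, k in (("68%", 1), ("95%", 2)):
    intervals[level] = confidence_interval(0.5, 100, k)
    low, high = intervals[level]
    print(f"{level} confidence: {low:.0%} to {high:.0%}")
```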

The logic of confidence levels and confidence intervals also provides the basis for determining the appropriate sample size for a study. Once we decide on the sampling error we can tolerate, we can calculate the number of cases needed in our sample.
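One way to carry out that calculation is to rearrange the standard error formula for n. The text does not give this rearrangement, so the sketch below is an assumption of the example: it requires that k standard errors equal the tolerated error E, which gives n = p × q × (k / E)², and it uses p = .5 as a cautious default because that value yields the largest standard error.

```python
import math

def required_sample_size(tolerated_error, num_standard_errors=2, p=0.5):
    """Smallest n for which num_standard_errors * sqrt(p*q/n) <= tolerated_error."""
    q = 1 - p
    n = p * q * (num_standard_errors / tolerated_error) ** 2
    return math.ceil(round(n, 6))  # round first to absorb floating-point noise

print(required_sample_size(0.10))  # plus or minus 10 points at about 95 percent confidence
print(required_sample_size(0.05))  # plus or minus 5 points at about 95 percent confidence
```

With the default of two standard errors, a tolerance of 10 percentage points calls for 100 cases and a tolerance of 5 points calls for 400, consistent with the handgun example and with the quadrupling rule.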

Probability Theory and Sampling Distribution Summed Up

This, then, is the basic logic of probability sampling. Random selection permits the researcher to link findings from a sample to the body of probability theory so as to estimate the accuracy of those findings. All statements of accuracy in sampling must specify both a confidence level and a confidence interval. The researcher must report that he or she is x percent confident that the population parameter is between two specific values.

In this example, we have demonstrated the logic of sampling error using a binomial variable, that is, a variable analyzed in percentages. A different statistical procedure would be required to calculate the standard error for a mean, but the overall logic is the same.

Notice that nowhere in this discussion did we consider the size of the population being studied. This is because the population size is almost always irrelevant. A sample of 2,000 respondents drawn properly to represent residents of Vermont will be no more accurate than a sample of 2,000 drawn properly to represent residents in the United States, even though the Vermont sample would be a substantially larger proportion of that small state’s residents than would the same number chosen to represent the nation’s residents. The reason for this counterintuitive fact is that the equations for calculating sampling error assume that the populations being sampled are infinitely large, so all samples would equal zero percent of the whole.

Two cautions are in order before we conclude this discussion of the basic logic of probability sampling. First, the survey uses of probability theory as discussed here are technically not wholly justified. The theory of sampling distribution makes assumptions that almost never apply in survey conditions. The exact proportion of samples contained within specified increments of standard errors mathematically assumes an infinitely large population, an infinite number of samples, and sampling with replacement; that is, every sampling unit selected is “thrown back into the pot” and could be selected again. Second, our discussion has greatly oversimplified the inferential jump from the distribution of several samples to the probable characteristics of one sample.

We offer these cautions to provide perspective on the uses of probability theory in sampling. Researchers in criminal justice and other social sciences often appear to overestimate the precision of estimates produced by the use of probability theory. Variations in sampling techniques and nonsampling factors may further reduce the legitimacy of such estimates. For example, those selected in a sample who fail or refuse to participate further detract from the representativeness of the sample.

Nevertheless, the calculations discussed in this section can be extremely valuable to you in understanding and evaluating your data. Although the calculations do not provide as precise estimates as some researchers might assume, they can be quite valid for practical purposes. They are unquestionably more valid than less rigorously derived estimates based on less-rigorous sampling methods. Most important, being familiar with the basic logic underlying the calculations can help you react sensibly both to your own data and to those reported by others.

Populations and Sampling Frames

The correspondence between a target population and sampling frames affects the generalizability of samples.

Although as researchers and as consumers of research we need to understand the theoretical foundations of sampling, it is no less important to appreciate the less-than-perfect conditions that exist in the field. One aspect of field conditions that requires a compromise in terms of theoretical conditions and assumptions is the relationship between populations and sampling frames.

A sampling frame is the list or quasi-list of elements from which a probability sample is selected. We say quasi-list because, even though an actual list might not exist, we can draw samples as if there were a list. Properly drawn samples provide information appropriate for describing the population of elements that compose the sampling frame, and nothing more. This point is important in view of the common tendency for researchers to select samples from a particular sampling frame and then make assertions about a population that is similar, but not identical, to the study population defined by the sampling frame.

For example, if we want to study the attitudes of corrections administrators toward determinate sentencing policies, we might select a sample by consulting the membership roster of the American Correctional Association. In this case, the membership roster is our sampling frame, and corrections administrators are the population we wish to describe. However, unless all corrections administrators are members of the American Correctional Association and all members are listed in the roster, it would be incorrect to generalize results to all corrections administrators.

Studies of organizations are often the simplest from a sampling standpoint because organizations typically have membership lists. In such cases, the list of members may be an acceptable sampling frame. If a random sample is selected from a membership list, then the data collected from that sample may be taken as representative of all members, provided all members are included in the list. It is, however, imperative that researchers learn how complete or incomplete such lists might be and limit their generalizations to listed sample elements rather than to an entire population.

Other lists of individuals may be especially relevant to the research needs of a particu-lar study. Lists of licensed drivers, automobile owners, welfare recipients, taxpayers, holders of weapons permits, and licensed professionals are just a few examples. Although it may be dif-fi cult to gain access to some of these lists, they provide excellent sampling frames for special-ized research purposes.

Telephone directories are frequently used for “quick and dirty” public opinion polls. Undeniably, they are easy and inexpensive to use, and that is no doubt the reason for their popularity. Still, they have several limitations. A given directory will not include new subscribers or those who have requested unlisted numbers. Sampling is further complicated by the inclusion of nonresidential listings in directories. Moreover, telephone directories are sometimes taken to be a listing of a city’s population, which is simply not the case. Lower-income people are less likely to have telephones, and higher-income people may have more than one line. A growing number of households are served only by wireless phone service and so are not listed in directories. A recent national study reported that 7 percent of households had only wireless phones and 2 percent had no telephone service (Blumberg, Luke, and Cynamon 2006). Telephone companies may not publish listings for temporary residents such as students. And persons who live in institutions or group quarters (dormitories, nursing homes, rooming houses, and the like) are not listed in phone directories.

Street directories and tax maps are often used for easy samples of households, but they also may suffer from incompleteness and possible bias. For example, in strictly zoned urban regions, illegal housing units are unlikely to appear on official records. As a result, such units have no chance for selection, and sample findings will not be representative of those units, which are often substandard and overcrowded.

In a more general sense, it’s worth viewing sampling frames as operational definitions of a study population. Just as operational definitions of variables describe how abstract concepts will be measured, sampling frames serve as a real-world version of an abstract study population. For example, we may want to study how criminologists deal with ethical issues in their research. We don’t know how many criminologists exist out there, but we can develop a general idea about the population of criminologists. We could also operationalize the concept by using the membership directory for the American Society of Criminology; that list is our operational definition of criminologist.

Types of Sampling Designs

Different types of sampling designs can be used alone or in combination for different research purposes.

The illustrations we have considered so far have been based on simple random sampling. However, researchers have a number of options in choosing their sampling method, each with its own advantages and disadvantages.

Simple Random Sampling

Simple random sampling forms the basis of probability theory and the statistical tools we use to estimate population parameters, standard error, and confidence intervals. More accurately, such statistics assume unbiased sampling, and simple random sampling is the foundation of unbiased sampling.

Once a sampling frame has been established in keeping with the guidelines we presented, the researcher using simple random sampling assigns a single number to each element in the list, not skipping any number in the process. A table of random numbers, or a computer program for generating them, is then used to select elements for the sample.

If the sampling frame is a computerized database or some other form of electronic data, a simple random sample can be selected by computer. In effect, the computer program numbers the elements in the sampling frame, generates its own series of random numbers, and prints out the list of elements selected.
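The computer-based procedure just described can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the frame of 10,000 numbered elements, the function name `simple_random_sample`, and the seed value are all hypothetical, and the standard library's `random.sample` stands in for whatever program a researcher might actually use.

```python
import random

def simple_random_sample(frame, n, seed=None):
    """Select n distinct elements from a sampling frame, each equally likely."""
    rng = random.Random(seed)      # a seed only makes the draw repeatable
    return rng.sample(frame, n)    # sampling without replacement

# Hypothetical frame: 10,000 elements numbered consecutively, none skipped.
frame = list(range(1, 10_001))
sample = simple_random_sample(frame, 1_000, seed=2008)
print(len(sample), sorted(sample)[:5])
```

Because the draw is made without replacement, no element can appear in the sample twice.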

Systematic Sampling

Simple random sampling is seldom used in practice, primarily because it is not usually the most efficient method, and it can be tedious if done manually. It typically requires a list of elements. And when such a list is available, researchers usually use systematic sampling rather than simple random sampling.

In systematic sampling, the researcher chooses every kth element in the list for inclusion in the sample. If a list contains 10,000 elements and we want a sample of 1,000, we select every 10th element for our sample. To ensure against any possible human bias, we should select the first element at random. Thus, to systematically select 1,000 from a list of 10,000 elements, we begin by selecting a random number between 1 and 10. The element having that number, plus every 10th element following it, is included in the sample. This method technically is referred to as a systematic sample with a random start.
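A minimal sketch of a systematic sample with a random start follows. The function name `systematic_sample`, the numbered frame, and the seed are assumptions made for illustration; the interval of 10 and the random start between 1 and 10 come from the example in the text.

```python
import random

def systematic_sample(frame, n, seed=None):
    """Systematic sample with a random start."""
    interval = len(frame) // n                         # 10,000 // 1,000 = 10
    start = random.Random(seed).randint(1, interval)   # random number from 1 to interval
    # Take the element at position `start`, then every `interval`-th one after it.
    return frame[start - 1::interval][:n]

# Hypothetical frame of 10,000 numbered elements.
frame = list(range(1, 10_001))
sample = systematic_sample(frame, 1_000, seed=7)
print(sample[:3], len(sample))
```

Only one random number is needed for the whole sample, which is a large part of the method's convenience.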

In practice, systematic sampling is virtually identical to simple random sampling. If the list of elements is indeed randomized before sampling, one might argue that a systematic sample drawn from that list is, in fact, a simple random sample.

Systematic sampling has one danger. A periodic arrangement of elements in the list can make systematic sampling unwise; this arrangement is usually called periodicity. If the list of elements is arranged in a cyclical pattern that coincides with the sampling interval, a biased sample may be drawn. Suppose we select a sample of apartments in an apartment building. If the sample is drawn from a list of apartments arranged in numerical order (for example, 101, 102, 103, 104, 201, 202, and so on), there is a danger of the sampling interval coinciding with the number of apartments on a floor or some multiple of it. Then the samples might include only northwest-corner apartments or only apartments near the elevator. If these types of apartments have some other particular characteristic in common (for example, higher rent), the sample will be biased. The same potential danger would apply in a systematic sample of houses in a subdivision arranged with the same number of houses on a block.
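The apartment example can be made concrete with a short sketch. The building below is hypothetical: we assume 10 floors with 4 apartments per floor, numbered as in the text, and a sampling interval of 4 that happens to match the number of apartments on a floor. The start is fixed at 1 for illustration; in practice it would be chosen at random.

```python
# Hypothetical building: floors 1-10, apartments 01-04 on each floor,
# listed in numerical order (101, 102, 103, 104, 201, ...).
apartments = [floor * 100 + unit for floor in range(1, 11) for unit in range(1, 5)]

interval = 4   # coincides with the number of apartments per floor
start = 1      # fixed here for illustration; normally a random start

sample = apartments[start - 1::interval]

# Which positions on the floor ended up in the sample?
units = {apt % 100 for apt in sample}
print(sample)
print(units)
```

Whatever start is chosen, every selected apartment occupies the same position on its floor, which is exactly the periodicity problem described above.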

In considering a systematic sample from a list, then, we need to carefully examine the nature of that list. If the elements are arranged in any particular order, we have to figure out whether that order will bias the sample to be selected and take steps to counteract any possible bias.

In summary, systematic sampling is usually superior to simple random sampling, in terms of convenience if nothing else. Problems in the ordering of elements in the sampling frame can usually be remedied quite easily.

Stratified Sampling

We have discussed two methods of selecting a sample from a list: random and systematic. Stratification is not an alternative to these methods, but it represents a possible modification in their use. Simple random sampling and systematic sampling both ensure a degree of representativeness and permit an estimate of the sampling error present. Stratified sampling is a method for obtaining a greater degree of representativeness, thereby decreasing the probable sampling error. To understand why that is the case, we must return briefly to the basic theory of sampling distribution.

Recall that sampling error is reduced by two factors in the sample design: (1) a large sample produces a smaller sampling error than a small sample does, and (2) a homogeneous population produces samples with smaller sampling errors than a heterogeneous population does. If 99 percent of the population agrees with a certain statement, it is extremely unlikely that any probability sample will greatly misrepresent the extent of agreement. If the population is split 50–50 on the statement, then the sampling error will be much greater.
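Both factors can be checked numerically. The sketch below assumes the usual binomial formula for the standard error of a proportion under simple random sampling, sqrt(p(1 - p)/n); the sample sizes of 400 and 1,600 are hypothetical values chosen for illustration.

```python
import math

def standard_error(p, n):
    """Standard error of a sample proportion under simple random sampling."""
    return math.sqrt(p * (1 - p) / n)

# Factor 2: homogeneity. Same sample size, different population splits.
se_99 = standard_error(0.99, 400)    # 99 percent agree
se_50 = standard_error(0.50, 400)    # population split 50-50

# Factor 1: sample size. Same 50-50 split, four times the sample.
se_50_large = standard_error(0.50, 1_600)

print(round(se_99, 4), round(se_50, 4), round(se_50_large, 4))
```

With these assumed values the 50–50 population yields a standard error about five times that of the 99–1 population, and quadrupling the sample size cuts the standard error in half.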

Stratified sampling is based on this second factor in sampling theory. Rather than selecting our sample from the total population at large, we select appropriate numbers of elements from homogeneous subsets of that population. To get a stratified sample of university students, for example, we first organize our population by college class and then draw appropriate numbers of freshmen, sophomores, juniors, and seniors. In a nonstratified sample, representation by class is subject to the same sampling error as other variables. In a sample stratified by college class, the sampling error on that variable is reduced to zero.
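The university example might be sketched as follows. The enrollment figures, the 10 percent sampling fraction, and the function name `stratified_sample` are hypothetical; the point is only that each class contributes exactly its proportionate share, so class representation carries no sampling error.

```python
import random
from collections import defaultdict

def stratified_sample(frame, stratum_of, fraction, seed=None):
    """Proportionate stratified sample: the same fraction from every stratum."""
    rng = random.Random(seed)
    strata = defaultdict(list)
    for element in frame:                      # organize the frame by stratum
        strata[stratum_of(element)].append(element)
    sample = []
    for name in sorted(strata):                # simple random sample within each
        members = strata[name]
        sample.extend(rng.sample(members, round(len(members) * fraction)))
    return sample

# Hypothetical enrollment by college class.
class_sizes = {"freshman": 400, "sophomore": 300, "junior": 200, "senior": 100}
students = [(cls, i) for cls, size in class_sizes.items() for i in range(size)]

sample = stratified_sample(students, stratum_of=lambda s: s[0], fraction=0.10, seed=1)
counts = {cls: sum(1 for s in sample if s[0] == cls) for cls in class_sizes}
print(counts)
```

A systematic sample within each stratum would serve equally well; simple random selection is used here only to keep the sketch short.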

Even more complex stratification methods are possible. In addition to stratifying by class, we might also stratify by gender, grade point average, and so forth. In this fashion, we could ensure that our sample contains the proper numbers of freshman men with a 4.0 average, freshman women with a 4.0 average, and so forth.

The ultimate function of stratification, then, is to organize the population into homogeneous subsets (with heterogeneity between subsets) and to select the appropriate number of elements from each. To the extent that the subsets are homogeneous on the stratification variables, they may also be homogeneous on other variables. Because age is usually related to college class, a sample stratified by class will be more representative in terms of age as well.

The choice of stratification variables typically depends on what variables are available and what variables might help reduce sampling error for a particular study. Gender can often be determined in a list of names. Many local government sources of information on housing units are arranged geographically. Age, race, education, occupation, and other variables are often included on lists of persons who have had contact with criminal justice officials.

In selecting stratification variables, however, we should be concerned primarily with those that are presumably related to the variables we want to represent accurately. Because gender is related to many variables and is often available for stratification, it is frequently used. Age and race are related to many variables of interest in criminal justice research. Income is also related to many variables, but it is often not available for stratification. Geographic location within a city, state, or nation is related to many things. Within a city, stratification by geographic location usually increases representativeness in social class and ethnicity.

Stratified sampling ensures the proper representation of the stratification variables to enhance representation of other variables related to them. Taken as a whole, then, a stratified sample is likely to be more representative on a number of variables than a simple random sample is.

Disproportionate Stratified Sampling

Another use of stratification is to purposively produce samples that are not representative of a population on some variable, referred to as disproportionate stratified sampling. Because the purpose of sampling, as we have been discussing, is to represent a larger population, you may wonder why anyone would want to intentionally produce a sample that was not representative.

To understand the logic of disproportionate stratification, consider again the role of population homogeneity in determining sample size. If members of a population vary widely on some variable of interest, then larger samples must be drawn to adequately represent that population, because sampling error is larger. Similarly, if only a small number of people in a population exhibit some attribute or characteristic of interest, then a large sample must be drawn to produce adequate numbers of elements that exhibit the uncommon condition. Disproportionate stratification is a way of obtaining sufficient numbers of these rare cases by selecting a number disproportionate to their representation in the population.

The best example of disproportionate sampling in criminal justice is a national crime survey in which one goal is to obtain some minimum number of crime victims in a sample. Because crime victimization for certain offenses, such as robbery or aggravated assault, is relatively rare on a national scale, persons who live in large urban areas, where serious crime is more common, are disproportionately sampled.
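A disproportionate design of this kind, together with the weighting needed to restore population proportions, might look like the sketch below. Everything numeric here is hypothetical: we assume a population of 20,000 people in large urban areas and 80,000 elsewhere, and we draw 1,000 from each stratum. Weighting each respondent by the number of people he or she stands for (stratum size divided by stratum sample size) is a common approach, not necessarily the one used by any particular survey.

```python
import random

# Hypothetical population sizes by stratum.
population = {"large urban": 20_000, "elsewhere": 80_000}
# Disproportionate design: equal sample sizes despite unequal strata.
sample_sizes = {"large urban": 1_000, "elsewhere": 1_000}

rng = random.Random(3)
sample = {s: rng.sample(range(population[s]), sample_sizes[s]) for s in population}

# Each respondent stands for N_h / n_h people in his or her stratum.
weights = {s: population[s] / sample_sizes[s] for s in population}

unweighted_share = sample_sizes["large urban"] / sum(sample_sizes.values())
weighted_total = sum(weights[s] * len(sample[s]) for s in population)
weighted_share = weights["large urban"] * len(sample["large urban"]) / weighted_total

print(unweighted_share, weighted_share)
```

Urban residents make up half of the raw sample but only one-fifth of the weighted sample, matching their share of the assumed population.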

The British Crime Survey (BCS) is a nationwide survey of people aged 16 and over in England and Wales. Over its first 20 years (since 1982) the BCS selectively oversampled people or areas to yield larger numbers of designated subjects than would result from proportionate random samples of the population of England and Wales. The BCS conducted in 2000 included special questions to better understand contacts between ethnic minorities and police, and ethnic minorities were disproportionately oversampled to produce a large enough number of ethnic minority respondents for later analysis (Kershaw, Budd, Kinshott, et al. 2000).

Multistage Cluster Sampling

The preceding sections have described reasonably simple procedures for sampling from lists of elements. Unfortunately, however, many interesting research problems require the selection of samples from populations that cannot easily be listed for sampling purposes; that is, sampling frames are not readily available. Examples are the population of a city, state, or nation and all police officers in the United States. In such cases, the sample design must be much more complex. Such a design typically involves the initial sampling of groups of elements (clusters), followed by the selection of elements within each of the selected clusters. This procedure yields multistage cluster samples.

Cluster sampling may be used when it is either impossible or impractical to compile an exhaustive list of the elements that compose the target population, such as all law enforcement officers in the United States. Often, however, population elements are already grouped into subpopulations, and a list of those subpopulations either exists or can be created.

Population elements, or aggregations of those elements, are referred to as sampling units. In the simplest forms of sampling, elements and units are the same thing, usually people. But in cases in which a listing of elements is not available, we can often use some other unit that includes a grouping of elements.

Because U.S. law enforcement officers are employed by individual cities, counties, or states, it is possible to create lists of those political units. For cluster sampling, then, we could sample the list of cities, counties, and states in some manner as discussed previously (for example, a systematic sample stratified by population). Next, we could obtain lists of law enforcement officers from agencies in each of the selected jurisdictions. We could then sample each of the lists to provide samples of police officers for study.

Another typical situation concerns sampling among population areas such as a city. Although there is no single list of a city’s population, citizens reside on discrete city blocks or census blocks. It is possible, therefore, to select a sample of blocks initially, create a list of persons who live on each of the selected blocks, and then sample persons from that list. In this case, blocks are treated as the primary sampling unit.

In a more complex design, we might sample blocks, list the households on each selected block, sample the households, list the persons who reside in each household, and, finally, sample persons within each selected household. This multistage sample design will lead to the ultimate selection of a sample of individuals without requiring the initial listing of all individuals in the city’s population.

Multistage cluster sampling, then, involves the repetition of two basic steps: listing and sampling. The list of primary sampling units (city blocks) is compiled and perhaps stratified for sampling. Next, a sample of those units is selected. The list of secondary sampling units is then sampled, and the process continues.

Cluster sampling is highly recommended for its efficiency, but the price of that efficiency is a less accurate sample. A simple random sample drawn from a population list is subject to a single sampling error, but a two-stage cluster sample is subject to two sampling errors. First, the initial sample of clusters represents the population of clusters only within a range of sampling error. Second, the sample of elements selected within a given cluster represents all the elements in that cluster only within a range of sampling error. Thus, for example, we run a certain risk of selecting a sample of disproportionately wealthy city blocks, plus a sample of disproportionately wealthy households within those blocks. The best solution to this problem involves the number of clusters selected initially and the number of elements selected within each cluster.

Recall that sampling error is reduced by two factors: (1) an increase in the sample size and (2) increased homogeneity of the elements being sampled. These factors operate at each level of a multistage sample design. A sample of clusters will best represent all clusters if a large number are selected and if all clusters are very much alike. A sample of elements will best represent all elements in a given cluster if a large number are selected from the cluster and if all the elements in the cluster are very much alike.

A good general guideline for cluster design is to maximize the number of clusters selected while decreasing the number of elements within each cluster. But this scientific guideline must be balanced against an administrative constraint. The efficiency of cluster sampling is based on the ability to minimize the list of population elements. By initially selecting clusters, we need only list the elements that make up the selected clusters, not all elements in the entire population. Increasing the number of clusters, however, reduces this efficiency in cluster sampling. A small number of clusters may be listed more quickly and more cheaply than a large number. Remember that all the elements in a selected cluster must be listed, even if only a few are to be chosen in the sample.

The final sample design will reflect these two constraints. In effect, we will probably select as many clusters as we can afford. So as not to leave this issue too open-ended, here is a rule of thumb: population researchers conventionally aim for the selection of 5 households per census block. If a total of 2,000 households are to be interviewed, researchers select 400 blocks and interview 5 households on each. Figure 6.10 presents a graphic overview of this process.
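The rule of thumb can be sketched as a two-stage selection. The city below is hypothetical: we assume 5,000 blocks, each with between 20 and 60 households, and the dictionary of household listings stands in for lists that would in practice be compiled in the field only for the selected blocks. The function name `two_stage_sample` is likewise an assumption.

```python
import random

def two_stage_sample(blocks, n_blocks, per_block, seed=None):
    """Stage 1: sample blocks. Stage 2: sample households within each selected block."""
    rng = random.Random(seed)
    selected_blocks = rng.sample(sorted(blocks), n_blocks)
    households = []
    for block_id in selected_blocks:
        listing = blocks[block_id]        # only selected blocks need to be listed
        households.extend(rng.sample(listing, per_block))
    return selected_blocks, households

# Hypothetical city: 5,000 blocks with 20 to 60 households each.
setup = random.Random(0)
blocks = {b: [(b, h) for h in range(setup.randint(20, 60))] for b in range(5_000)}

selected_blocks, households = two_stage_sample(blocks, n_blocks=400, per_block=5, seed=6)
print(len(selected_blocks), len(households))
```

Selecting 400 blocks and 5 households on each yields the 2,000 households called for in the text, while only 400 of the 5,000 blocks ever have to be listed.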

As we turn to more detailed procedures in cluster sampling, keep in mind that this method almost inevitably involves a loss of accuracy. First, as noted earlier, a multistage sample design is subject to a sampling error at each stage. Because the sample size is necessarily smaller at each stage than the total sample size, the sampling error at each stage will be greater than would be the case for a single-stage random sample of elements. Second, sampling error is estimated on the basis of observed variance among the sample elements. When those elements are drawn from relatively homogeneous clusters, the estimated sampling error will be too optimistic and so must be corrected in light of the cluster sample design.

Multistage Cluster Sampling with Stratification

Thus far we have looked at cluster sampling as though a simple random sample were selected at each stage of the design. In fact, we can use stratification techniques to refine and improve the sample being selected. The basic options available are essentially the same as those possible in single-stage sampling from a list. In selecting a national sample of law enforcement officers, we might initially stratify our list of agencies by type (state, county, municipal), geographic region, size, and rural or urban location.

Once the primary sampling units (law enforcement agencies) have been grouped according to the relevant, available stratification variables, either simple random or systematic sampling techniques can be used to select the sample. We might select a specified number of units from each group or stratum, or we might arrange the stratified clusters in a continuous list and systematically sample that list.

To the extent that clusters are combined into homogeneous strata, the sampling error at this stage will be reduced. The primary goal of stratification, as before, is homogeneity.

In principle, stratification can take place at each level of sampling. The elements listed within a selected cluster might be stratified before the next stage of sampling. Typically, however, that is not done because we strive for relative homogeneity within clusters. If clusters are sufficiently similar, it is not necessary to stratify again.

Figure 6.10 Multistage Cluster Sampling

Stage One: Identify blocks and select a sample. (Selected blocks are shaded.) [Street map: 1st St. through 5th St., crossed by Parsley, Sage, Rosemary, Thyme, Robinson, Boxer, and Bridge Aves.]

Stage Two: Go to each selected block and list all households in order. (Example of one listed block.)

Stage Three: For each list, select a sample of households. (In this example, every sixth household has been selected starting with #5, which was selected at random.)

1. 491 Rosemary Ave.
2. 487 Rosemary Ave.
3. 473 Rosemary Ave.
4. 455 Rosemary Ave.
5. 437 Rosemary Ave. (selected)
6. 423 Rosemary Ave.
7. 411 Rosemary Ave.
8. 403 Rosemary Ave.
9. 1101 4th St.
10. 1123 4th St.
11. 1137 4th St. (selected)
12. 1157 4th St.
13. 1169 4th St.
14. 1187 4th St.
15. 402 Thyme Ave.
16. 408 Thyme Ave.
17. 424 Thyme Ave. (selected)
18. 446 Thyme Ave.
19. 458 Thyme Ave.
20. 480 Thyme Ave.
21. 498 Thyme Ave.
22. 1186 5th St.
23. 1174 5th St. (selected)
24. 1160 5th St.
25. 1140 5th St.
26. 1122 5th St.
27. 1118 5th St.
28. 1116 5th St.
29. 1104 5th St. (selected)
30. 1102 5th St.

Illustration: Two National Crime Surveys

Two national crime surveys show different ways of designing samples to achieve desired results.

Our discussion of sampling designs suggests that researchers can combine many different techniques of sampling and their various components in different ways to suit various needs. In fact, the different components of sampling can be tailored to specific purposes in much the same way that research design principles can be modified to suit various needs. Because sample frames suitable for simple random sampling are often unavailable, researchers use multistage cluster sampling to move from aggregate sample units to actual sample elements. We can add stratification to ensure that samples are representative of important variables. And we can design samples to produce elements that are proportionate or disproportionate to the population.

Two national crime surveys illustrate how these various building blocks may be combined in complex ways: (1) the National Crime Victimization Survey (NCVS), conducted by the Census Bureau, and (2) the British Crime Survey (BCS). Each is a multistage cluster sample, but the two surveys use different strategies for sampling to produce sufficient numbers of respondents in different categories. Our summary description is adapted from BJS (2006) for the NCVS and from Sian Nicholas and associates (2007) for the BCS. Essays in the volume edited by Hough and Maxfield (2007) trace the history and development of the BCS.

The National Crime Victimization Survey

Although various parts of the NCVS have been modified since the surveys were begun in 1972, the basic sampling strategies have remained relatively constant. The most significant changes have been fluctuations in sample size and a shift to telephone interviewing, with samples of telephone number listings eventually leading to households. In 2006, the NCVS changed interviewing procedures and revised samples somewhat to account for population movement from central cities (Rand and Catalano 2007).

The NCVS seeks to represent the nationwide population of persons aged 12 and over who are living in households. We noted in Chapter 6 that the phrase “living in households” is significant; this is especially true in our current discussion of sampling. NCVS procedures are not designed to sample homeless persons or people who live in institutional settings such as military group housing, temporary housing, or correctional facilities. Also, because the sample targets persons who live in households, it cannot provide estimates of crimes in which a commercial establishment or business is the victim.

Because there is no national list of households in the United States, multistage cluster sampling must be used to proceed from larger units to households and their residents. The national sampling frame used in the first stage defines primary sampling units (PSUs) as large metropolitan areas, nonmetropolitan counties, or groups of contiguous counties (to represent rural areas).

The largest 93 PSUs are specified as self-representing and are automatically included in the first stage of sampling. The remaining PSUs are stratified by size, population density, reported crimes, and other variables. An additional 110 non-self-representing PSUs are then selected with a probability proportionate to the population of the PSU. Thus, if one stratum includes Bugtussle, Texas (population 7,000), Punkinseed, Indiana (5,000), and Rancid, Missouri (3,000), the probability that each PSU will be selected is 7 in 15 for Bugtussle, 5 in 15 for Punkinseed, and 3 in 15 for Rancid.
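The selection probabilities in this example can be verified directly. The sketch below uses the three fictional PSUs and populations from the text; drawing a single PSU from the stratum with `random.choices` and the seed value are assumptions made for illustration, not a description of the Census Bureau's actual procedure.

```python
from fractions import Fraction
import random

# Stratum from the text: fictional PSUs and their populations.
stratum = {
    "Bugtussle, Texas": 7_000,
    "Punkinseed, Indiana": 5_000,
    "Rancid, Missouri": 3_000,
}

total = sum(stratum.values())   # 15,000
selection_prob = {psu: Fraction(pop, total) for psu, pop in stratum.items()}

# Draw one PSU with probability proportionate to its population.
rng = random.Random(93)
chosen = rng.choices(list(stratum), weights=list(stratum.values()), k=1)[0]

for psu, prob in selection_prob.items():
    print(psu, prob)
print("selected:", chosen)
```

`Fraction` prints each probability in lowest terms, so 5 in 15 appears as 1/3 and 3 in 15 as 1/5; the values are the same as those given in the text.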

The second stage of sampling involves designating four different sampling frames within each PSU. Each of these frames is used to select different types of subsequent units. First, the housing unit frame lists addresses of housing units from census records. Second, a group quarters frame lists group quarters such as dormitories and rooming houses from census records. Third, a building permit frame lists newly constructed housing units from local government sources. Finally, an area frame lists census blocks (physical geographic units), from which independent address lists are generated and sampled. Notice that these four frames are necessary because comprehensive, up-to-date lists of residential addresses are not available in this country.

For the 2005 NCVS, these procedures yielded a sample of approximately 39,000 housing units. Completed interviews were obtained from about 67,000 individuals living in households. The sample design for the NCVS is an excellent illustration of the relationship between sample size and variation in the target population. Because serious crime is a relatively rare event when averaged across the entire U.S. population, very large samples must be drawn. And because no single list of the target population exists, samples are drawn in several stages.

For further information, consult the NCVS documentation maintained by the Bureau of Justice Statistics (www.ojp.usdoj.gov/bjs/cvictgen.htm; accessed May 8, 2008). Also see the “National Crime Victimization Survey Resource Guide,” maintained at the National Archive of Criminal Justice Data (www.icpsr.umich.edu/NACJD/NCVS; accessed May 8, 2008).

The British Crime Survey

We have seen that NCVS sampling procedures begin with demographic units and work down to selection of housing units. BCS sampling is simplified by the existence of a national list of something close to addresses. The Postcode Address File (PAF) lists postal delivery points nationwide and is further subdivided to distinguish “small users,” those addresses receiving fewer than 50 items per day. Even though 50 pieces of mail might still seem like quite a bit, this classification makes it easier to distinguish household addresses from commercial ones.

Postcode sectors, roughly corresponding to U.S. five-digit zip codes, are easily defined clusters of addresses from the PAF. Samples of addresses are then selected from within these sectors. In most cases, 32 addresses are selected from within each postcode sector.

In addition, BCS researchers devised “booster samples” to increase the number of respondents who were ethnic minorities or aged 16 to 24. Victimization experiences of ethnic minorities were of special interest to police and other public officials. Young people were oversampled to complete a special questionnaire of self-report behavior items.

The ethnic minority booster was accomplished by first selecting respondents using formal sampling procedures. Interviewers then sought information about four housing units adjacent to the selected unit in an effort to determine if any residents were nonwhite. If adjacent units housed minority families, one was selected to be interviewed for the ethnic minority booster sample. This is an example of what Steven Thompson (1997) calls “adaptive sampling.” Probability samples are selected, and then those respondents are used to identify other individuals who meet some criterion. Increasing the number of respondents aged 16 to 24 was simpler: interviewers sought additional respondents in that age group within sampled households.

One final sampling dimension reflects the regional organization of police in England and Wales into 43 police areas. The BCS was further stratified to produce 600 to 700 interviews in each police area to support analysis within those areas.

Apart from the young-person booster, once individual households were selected, one person age 16 or over was randomly chosen to provide information for all household members. Sampling procedures initially produced about 54,700 addresses for the 2004 BCS. About 8 percent of these were eliminated because they were vacant, had been demolished, or contained a business, not a private household. Of the remaining 50,000 or so addresses, interviews were completed with 37,213 individuals for a response rate of about 74 percent.
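The arithmetic behind the response rate can be worked through as a quick check. The figures are the rounded ones reported above; the variable names are ours, and treating the 8 percent as an exact rate is an assumption, so the results are approximate.

```python
issued_addresses = 54_700     # addresses initially produced for the 2004 BCS
ineligible_rate = 0.08        # vacant, demolished, or business addresses
completed_interviews = 37_213

# Addresses remaining after ineligible ones are removed (roughly 50,000).
eligible_addresses = issued_addresses * (1 - ineligible_rate)

# Completed interviews as a share of eligible addresses (roughly 74 percent).
response_rate = completed_interviews / eligible_addresses

print(round(eligible_addresses), round(response_rate, 2))
```

Note that ineligible addresses are removed from the denominator before the response rate is computed; dividing by all 54,700 issued addresses would understate it.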

Although sampling designs for both the BCS and the NCVS are more complex than we have represented in this discussion, the important point is how multistage cluster sampling is used in each. Notice two principal differences between the samples. First, the NCVS uses proportionate sampling to select a large number of respondents who may then represent the relatively rare attribute of victimization. The BCS samples a disproportionate number of minority and young residents, who are more likely to be victims of crime. Second, sampling procedures for the BCS are somewhat simpler than those for the NCVS, largely due to the existence of a suitable sampling frame at the national level. Stratification and later-stage sampling are conducted to more efficiently represent each police area and to oversample minority respondents.

Probability Sampling in ReviewDepending on the fi eld situation, probability sampling can be very simple or extremely com-plex, time consuming, and expensive. Whatever the situation, however, it is usually the pre-ferred method for selecting study elements. It’s worth restating the two main reasons for this.

First, probability sampling avoids conscious or unconscious biases in element selection on the part of the researcher. If all elements in the population have an equal (or unequal and sub-sequently weighted) chance of selection, there is an excellent chance that the sample so se-lected will closely represent the population of all elements.

Second, probability sampling permits esti-mates of sampling error. Although no probabil-ity sample will be perfectly representative in all respects, controlled selection methods permit the researcher to estimate the degree of expected error.

Despite these advantages, it is sometimes impossible to use standard probability sampling methods. Sometimes, it isn’t even appropriate to do so. In those cases, researchers turn to nonprobability sampling.

Nonprobability Sampling

In many research applications, nonprobability samples are necessary or advantageous.

You can no doubt envision situations in which it would be either impossible or unfeasible to select the kinds of probability samples we have described. Suppose we want to study auto thieves. There is no list of all auto thieves, nor are we likely to be able to create anything other than a partial and highly selective list. Moreover, probability sampling is sometimes inappropriate even if it is possible. In many such situations, nonprobability sampling procedures are called for. Recall that probability samples are defined as those in which the probability that any given sampling element will be selected is known. Conversely, in nonprobability sampling, the likelihood that any given element will be selected is not known.

We’ll examine four types of nonprobability samples in this section: (1) purposive or judgmental sampling, (2) quota sampling, (3) the reliance on available subjects, and (4) snowball sampling.

Purposive Sampling

Occasionally, it may be appropriate to select a sample on the basis of our own knowledge of the population, its elements, and the nature of our research aims; in short, based on our judgment and the purpose of the study. Such a sample is called a purposive sample.

We may wish to study a small subset of a larger population in which many members of the subset are easily identified, but the enumeration of all of them would be nearly impossible. For example, we might want to study members of community crime prevention groups; many members are easily visible, but it is not feasible to define and sample all members of community crime prevention organizations. In studying a sample of the most visible members, however, we may collect data sufficient for our purposes.

Criminal justice research often compares practices in different jurisdictions, such as cities or states. In such cases, study elements may be selected because they exhibit some particular attribute. For instance, Cassia Spohn and Julie Horney (1991) were interested in how differences among states in rape shield laws affected the use of evidence in sexual assault cases. Strong rape shield laws restricted the use of evidence or testimony about a rape victim’s sexual behavior, whereas weak laws routinely permitted such testimony. Spohn and Horney selected a purposive sample of six states for analysis based on the strength of their rape shield laws. Similarly, Michael Leiber and Jayne Stairs (1999) were interested in how economic inequality combined with race to affect sentencing practices in Iowa juvenile courts. After controlling for economic status, they found that African American defendants received more restrictive sentences than white defendants. Leiber and Stairs selected three jurisdictions purposively to obtain sample elements with adequate racial diversity in the state of Iowa. The researchers then selected more than 5,000 juvenile cases processed in those three courts.

Researchers may also use purposive or judgmental sampling to represent patterns of complex variation. In their study of closed-circuit television (CCTV) systems, Martin Gill and Angela Spriggs (2005) describe how sites were sampled to reflect variation in type of area (residential, commercial, city center, large parking facilities). Some individual CCTV projects were selected because of certain specific features: they were installed in a high-crime area, or the CCTV setup was notably expensive. One element of this study involved interviews to assess changes in fear of crime following CCTV installation. Spriggs and associates (2005) sampled passers-by on city center streets. They first selected purposive samples of areas and spread their interviews across four day/time periods. This was done to reflect variation in the types of people encountered on different streets at different times. Sampling strategies were thus adapted because of expected heterogeneity that would have been difficult to capture with random selection.

Pretesting a questionnaire is another situation in which purposive sampling is common. If we plan to study people’s attitudes about court-ordered restitution for crime victims, we might want to test the questionnaire on a sample of crime victims. Instead of selecting a probability sample of the general population, we might select some number of known crime victims, perhaps from court records.

Quota Sampling

Like probability sampling, quota sampling addresses the issue of representativeness, although the two methods approach the issue quite differently. Obtaining a quota sample begins with a matrix or table describing the characteristics of the target population we wish to represent. To do this, we need to know, for example, what proportion of the population is male or female and what proportions fall into various age categories, education levels, ethnic groups, and so forth. In establishing a national quota sample, we need to know what proportion of the national population is, say, urban, eastern, male, under 25, white, working-class, and all the combinations of these attributes.

Once we have created such a matrix and assigned a relative proportion to each cell in the matrix, we can collect data from people who have all the characteristics of a given cell. We then assign all the persons in a given cell a weight appropriate to their portion of the total population. When all the sample elements are weighted in this way, the overall data should provide a reasonable representation of the total population.
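The weighting step can be sketched in a few lines of Python. The two-by-two quota matrix (sex by age group), the population proportions, and the respondent counts below are all hypothetical. Each cell's weight is taken here as its population share divided by its sample share, so that overrepresented cells are weighted down and underrepresented cells are weighted up.

```python
# Hypothetical quota matrix: share of the target population in each cell.
population_share = {
    ("female", "under 25"): 0.10,
    ("female", "25 and over"): 0.40,
    ("male", "under 25"): 0.10,
    ("male", "25 and over"): 0.40,
}

# Hypothetical number of respondents actually interviewed in each cell.
respondents = {
    ("female", "under 25"): 50,
    ("female", "25 and over"): 50,
    ("male", "under 25"): 50,
    ("male", "25 and over"): 50,
}

def cell_weights(population_share, respondents):
    """Weight for each cell = population share / sample share."""
    total = sum(respondents.values())
    return {
        cell: population_share[cell] / (respondents[cell] / total)
        for cell in population_share
    }

weights = cell_weights(population_share, respondents)
for cell, w in weights.items():
    print(cell, round(w, 2))

# The weighted sample size equals the actual sample size.
weighted_n = sum(weights[c] * respondents[c] for c in weights)
```

With these invented numbers, respondents under 25 make up half the sample but only a fifth of the population, so each counts for 0.4 of a case, while older respondents count for 1.6.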

Although quota sampling may resemble probability sampling, it has two inherent problems. First, the quota frame (the proportions that different cells represent) must be accurate, and it is often difficult to get up-to-date information for this purpose. A quota sample of auto thieves or teenage vandals would obviously suffer from this difficulty. Second, biases may exist in the selection of sample elements within a given cell, even though its proportion of the population is accurately estimated. An interviewer instructed to interview five persons who meet a given complex set of characteristics may still avoid people who live at the top of seven-story walk-ups, have particularly run-down homes, or own vicious dogs.

Quota and purposive sampling may be combined to produce samples that are intuitively, if not statistically, representative. For example, Kate Painter and David Farrington (1998) designed a survey to study marital and partner violence. They wanted to represent several variables: marital status, age, an occupational measure of social status, and each of 10 standard regions in the United Kingdom. A probability sample was rejected because the authors wished to get adequate numbers of respondents in each of several categories, and some of the categories were thought to be relatively uncommon. Instead, the authors selected quota samples of 100 women in each of 10 regions and sought equal numbers of respondents in each of five occupational status categories.

Reliance on Available Subjects

Relying on available subjects (that is, stopping people at a street corner or some other location) is sometimes misleadingly called “convenience sampling.” University researchers frequently conduct surveys among the students enrolled in large lecture classes. The ease and economy of such a method explain its popularity; however, it seldom produces data of any general value. It may be useful to pretest a questionnaire, but it should not be used for a study purportedly describing students as a whole.

Reliance on available subjects can be an appropriate sampling method in some situations. It is generally best justified if the researcher wants to study the characteristics of people who are passing the sampling point at some specified time. For example, in her study of street lighting as a crime prevention strategy, Painter (1996) interviewed samples of pedestrians as they walked through specified areas of London just before and six weeks after improvements were made in lighting conditions. Painter clearly understood the scope and limits of this sampling technique. Her findings are described as applying to people who actually use area streets after dark, while recognizing that this population may be quite different from the population of area residents. Interviewing a sample of available evening pedestrians is an appropriate sampling technique for generalizing to the population of evening pedestrians, and the population of pedestrians will not be the same as the population of residents.

In a more general sense, samples like Painter’s select elements of a process (the process that generates evening pedestrians) rather than elements of a population. If we can safely assume that no systematic pattern generates elements of a process, then a sample of available elements as they happen to pass by can be considered to be representative. If you are interested in studying crimes reported to police, a sample of, say, every seventh crime report over a two-month period will be representative of the general population of crime reports over that two-month period.

Sometimes nonprobability and probability sampling techniques can be combined. For example, most attempts to sample homeless or street people rely on available subjects found in shelters, parks, or other locations. Semaan and associates (2002) suggest that once areas are found where homeless people congregate, individuals there can be enumerated and then sampled. Here’s a semi-hypothetical example.

In recent years, Maxfield has observed that many people who appear to be homeless congregate at the corner of 9th Avenue and 41st Street in Manhattan. An efficient strategy for interviewing samples of homeless people would be a time-space sample where, for example, each hour individuals would be counted and some fraction sampled. Let’s say we wished to interview 30 people and spread those interviews over a six-hour period; we would try to interview five people per hour. So each hour we would count the number of people within some specific area (say, 20 at 1:00 p.m.), then divide that number by five to obtain a sampling interval (4 in this case, equivalent to a sampling fraction of one in four). Recalling our earlier discussion of systematic probability sampling, we would then select a random starting point to identify the first person to interview, then select the fourth person after that, and so on. This approach would yield an unbiased sample that represented the population of street people on one Manhattan corner over a six-hour period.

As it happens, 41st Street and 9th Avenue in Manhattan is the rear entrance to the Port Authority bus terminal. Marcus Felson and associates (1996) described efforts to reduce crime and disorder in the Port Authority terminal, a place they claim is the world’s busiest bus station. Among the most important objectives were to reduce perceptions of crime problems and to improve how travelers felt about the Port Authority terminal. These are research questions appropriate to some sort of survey. Because more than 170,000 passengers pass through the bus station on an average spring day, obtaining a sufficiently large sample of users presents no difficulty. The problem was how to select a sample. Felson and associates point out that stopping passengers on their way to or from a bus was out of the question. Most passengers are commuters whose journey to and from work is timed to the minute, with none to spare for interviewers’ questions. Here’s how Felson and associates describe the solution and the sampling strategy it embodied (1996, 90–91):

Response rates would have been low if the Port Authority had tried to interview rushing customers or to hand out questionnaires to be returned later. Their solution was ingenious. The Port Authority drew a sample of outgoing buses . . . and placed representatives aboard. After the bus had departed, he or she would hand out a questionnaire to be completed during the trip . . . [and] collect these questionnaires as each customer arrived at the destination. This procedure produced a very high response rate and high completion rate for each item.

Snowball Sampling

Another type of nonprobability sampling that closely resembles the available-subjects approach is called snowball sampling. Commonly used in field observation studies or specialized interviewing, snowball sampling begins by identifying a single subject or small number of subjects and then asking the subject(s) to identify others like him or her who might be willing to participate in a study.

Criminal justice research on active criminals or deviants frequently uses snowball sampling techniques. The researcher often makes an initial contact by consulting criminal justice agency records to identify, say, someone convicted of auto theft and placed on probation. That person is interviewed and asked to suggest other auto thieves whom researchers could contact. Stephen Baron and Timothy Hartnagel (1998) studied violence among homeless youths in Edmonton, Canada, identifying their sample through snowball techniques. Similarly, snowball sampling is often used to study drug users and dealers. Leon Pettiway (1995) describes crack cocaine markets in Philadelphia through the eyes of his snowball sample. Bruce Jacobs and Jody Miller (1998) accumulated a sample of 25 female crack dealers in St. Louis to study specific techniques to avoid arrest.

Contacting an initial subject or informant who will then refer the researcher to other subjects can be especially difficult in studies of active offenders. As in most aspects of criminal justice research, the various approaches to initiating contacts for snowball sampling have advantages and disadvantages. Beginning with subjects who have a previous arrest or conviction is usually the easiest method for researchers, but it suffers from potential bias by depending on offenders who are known to police or other officials (McCall 1978).

Because snowball samples are used most commonly in field research, we’ll return to this method of selecting subjects in Chapter 10 on field methods and observation. In the meantime, recent studies by researchers at the University of Missouri–St. Louis offer good examples of snowball samples of offenders that are not dependent on contacts with criminal justice officials. Beginning with a street-savvy ex-offender, these researchers identified samples of burglars (Wright and Decker 1994), members of youth gangs (Decker and Van Winkle 1996), and armed robbers (Wright and Decker 1997). It’s especially difficult to identify active offenders as research subjects, but these examples illustrate notably clever uses of snowball sampling techniques.

Nonprobability Sampling in Review

Snowball samples are essentially variations on purposive samples (we want to sample juvenile gang members) and on samples of available subjects (sample elements identify other sample elements that are available to us). Each of these is a nonprobability sampling technique. And, like other types of nonprobability samples, snowball samples are most appropriate when it is impossible to determine the probability that any given element will be selected in a sample. Furthermore, snowball sampling and related techniques may be necessary when the target population is difficult to locate or even identify. Selecting pedestrians who happen to pass by, for example, is not an efficient way to select a sample of prostitutes or juvenile gang members. In contrast, approaching a pedestrian is an appropriate sampling method for studying pedestrians, whereas drawing a probability sample of urban residents to identify people who walk in specific areas of the city would be costly and inefficient.

Like other elements of criminal justice research, sampling plans must be adapted to specific research applications. When it’s important to make estimates of the accuracy of our samples, and when suitable sampling frames are possible, we use probability sampling techniques. When no reasonable sampling frame is available, and we cannot draw a probability sample, we cannot make estimates about sample accuracy. Fortunately, in such situations, we can make use of a variety of approaches for drawing nonprobability samples.

✪ Main Points

• The logic of probability sampling forms the foundation for representing large populations with small subsets of those populations.
• The chief criterion of a sample’s quality is the degree to which it is representative: the extent to which the characteristics of the sample are the same as those of the population from which it was selected.
• The most carefully selected sample is almost never a perfect representation of the population from which it was selected. Some degree of sampling error always exists.
• Probability sampling methods provide one excellent way of selecting samples that will be quite representative. They make it possible to estimate the amount of sampling error that should be expected in a given sample.
• The chief principle of probability sampling is that every member of the total population must have some known nonzero probability of being selected in the sample.
• Our ability to estimate population parameters with sample statistics is rooted in the sampling distribution and probability theory. If we draw a large number of samples of a given size, sample statistics will cluster around the true population parameter. As sample size increases, the cluster becomes tighter.
• A variety of sampling designs can be used and combined to suit different populations and research purposes. Each type of sampling has its own advantages and disadvantages.
• Simple random sampling is logically the most fundamental technique in probability sampling, although it is seldom used in practice.
• Systematic sampling involves using a sampling frame to select units that appear at some specified interval (for example, every 8th, 15th, or 1,023rd unit). This method is functionally equivalent to simple random sampling.
• Stratification improves the representativeness of a sample by reducing the sampling error.
• Disproportionate stratified sampling is especially useful when we want to select adequate numbers of certain types of subjects who are relatively rare in the population we are studying.
• Multistage cluster sampling is frequently used when there is no list of all the members of a population.
• The NCVS and the BCS are national crime surveys based on multistage cluster samples. Sampling methods for each survey illustrate different approaches to representing relatively rare events.
• Nonprobability sampling methods are less statistically representative and less reliable than probability sampling methods. However, they are often easier and cheaper to use.
• Purposive sampling is used when researchers wish to select specific elements of a population. This may be because the elements are believed to be representative of extreme cases or because they represent the range of variation expected in a population.
• In quota sampling, researchers begin with a detailed description of the characteristics of the total population and then select sample members in a way that includes the different composite profiles that exist in the population.
• In cases in which it’s not possible to draw nonprobability samples through other means, researchers often rely on available subjects. Professors sometimes do this; students in their classes are available subjects.
• Snowball samples accumulate subjects through chains of referrals and are most commonly used in field research.

✪ Key Terms

binomial variable, p. 149
cluster sample, p. 157
confidence interval, p. 152
confidence level, p. 151
disproportionate stratified sampling, p. 156
equal probability of selection method (EPSEM), p. 144
nonprobability sample, p. 162
population, p. 145
population parameter, p. 145
probability sample, p. 142
purposive sample, p. 162
quota sample, p. 163
sample element, p. 145
sample statistic, p. 145
sampling distribution, p. 146
sampling frame, p. 149
sampling units, p. 157
simple random sample, p. 154
snowball sampling, p. 165
standard error, p. 150
stratification, p. 155
systematic sampling, p. 154

✪ Review Questions and Exercises

1. Discuss possible study populations, elements, sampling units, and sampling frames for drawing a sample to represent the populations listed here. You may wish to limit your discussion to populations in a specific state or other jurisdiction.
   a. Municipal police officers
   b. Felony court judges
   c. Auto thieves
   d. Licensed automobile drivers
   e. State police superintendents
   f. Persons incarcerated in county jails
2. What steps would be involved in selecting a multistage cluster sample of undergraduate students taking criminal justice research methods courses in U.S. colleges and universities?
3. Briefly discuss some potential problems in drawing a sample of visitors to a popular website.

✪ Additional Readings

Kish, Leslie, Survey Sampling (New York: Wiley, 1965). Unquestionably the definitive work on sampling in social research. Kish’s coverage ranges from the simplest matters to the most complex and mathematical. He is both highly theoretical and downright practical. Easily readable and difficult passages intermingle as Kish dissects everything you could want or need to know about each aspect of sampling.

Patton, Michael Quinn, Qualitative Research and Evaluation Methods, 3rd ed. (Thousand Oaks, CA: Sage, 2001). Though its focus is evaluation, this book presents one of the best discussions of nonprobability sampling available. Patton covers a wide range of variations on purposive sampling.

Semaan, Salaam, Jennifer Lauby, and Jon Liebman, “Street and Network Sampling in Evaluation Studies of HIV Risk-Reduction Interventions,” AIDS Reviews 4(2002): 213–223. Many techniques used in public health research can cross over nicely for criminal justice studies. This is a good example of creative sampling techniques for finding hard-to-find people.

Weisel, Deborah, Conducting Community Surveys: A Practical Guide for Law Enforcement Agencies (Washington, DC: U.S. Department of Justice, Office of Justice Programs, Bureau of Justice Statistics, 1999). This short handbook offers good, basic advice on drawing samples for community-level victimization surveys. The information on estimating sample size is especially good.

169

Chapter 7

Survey Research and Other Ways of Asking QuestionsWe’ll examine how mail, interview, and telephone surveys can be used in criminal justice research. We’ll also consider other ways of collecting data by asking people questions.

Introduction 170

Topics Appropriate to Survey Research 171

Counting Crime 171

Self-Reports 171

Perceptions and Attitudes 172

Targeted Victim Surveys 172

Other Evaluation Uses 172

Guidelines for Asking Questions 173

Open-Ended and Closed-Ended Questions 173

Questions and Statements 174

Make Items Clear 174

Short Items Are Best 174

Avoid Negative Items 174

Biased Items and Terms 175

Designing Self-Report Items 175

Questionnaire Construction 177

General Questionnaire Format 177

Contingency Questions 177

Matrix Questions 178

Ordering Items in a Questionnaire 180

(continued)

170 Part Three Modes of Observation

IntroductionAsking people questions is the most common data collection method in social science.

A little-known survey was attempted among French workers in 1880. A German political so-ciologist mailed some 25,000 questionnaires to workers to determine the extent of their exploi-tation by employers. The rather lengthy ques-tionnaire included items such as these:

Does your employer or his representa-tive resort to trickery in order to defraud you of a part of your earnings? If you are paid piece rates, is the quality of the article made a pretext for fraudulent deductions from your wages?

The survey researcher in this case was not George Gallup but Karl Marx (1880, 208). Although 25,000 questionnaires were mailed out, there is

no record of any being returned. And you need not know much about survey methods to recog-nize the loaded questions posed by Marx.

Survey research is perhaps the most fre-quently used mode of observation in sociology and political science, and surveys are often used in criminal justice research as well. You have no doubt been a respondent in a survey, and you may have conducted surveys yourself.

We begin this chapter by discussing the crim-inal justice topics that are most appropriate for survey methods. Next, we cover the basic princi-ples of how to ask people questions for research purposes, including some of the details of ques-tionnaire construction. We describe the three basic ways of administering questionnaires—self-administration, face-to-face interviews, and telephone interviews—and summarize the strengths and weaknesses of each method. Af-ter discussing more specialized interviewing

DON’T START FROM

SCRATCH! 181

Self-AdministeredQuestionnaires 181

Mail Distribution and Return 182

Warning Mailings and Cover Letters 182

Follow-Up Mailings 183

Acceptable Response Rates 183

Computer-Based Self-Administration 184

In-Person Interview Surveys 185

The Role of the Interviewer 185

Coordination and Control 186

Computer-Assisted In-Person Interviews 187

Telephone Surveys 189

Computer-Assisted Telephone Interviewing 190

Comparison of the Three Methods 191

Strengths and Weaknesses of Survey Research 192

Other Ways of Asking Questions 194

Specialized Interviewing 194

Focus Groups 195

Should You Do It Yourself ? 196

Chapter 7 Survey Research and Other Ways of Asking Questions 171

of data collected by police. Of course, survey measures have their own shortcomings. Most of these diffi culties, such as recall error and reluctance to discuss victimization with in-terviewers, are inherent in survey methods. Nevertheless, victim surveys have become im-portant sources of data about the volume of crime in the United States and in other countries.

Self-ReportsSurveys that ask people about crimes they may have committed were also discussed in Chapter 4. For research that seeks to explore or explain why people commit criminal, delinquent, or de-viant acts, asking questions is the best method available.

Within the general category of self-report surveys, two different applications are dis-tinguished by their target population and sampling methods. Studies of offenders select samples of respondents known to have com-mitted crimes, often prisoners. Typically the fo-cus is on the frequency of offending—how many crimes of various types are committed by active offenders over a period of time.

The other type of self-report survey focuses on the prevalence of offending—how many peo-ple commit crimes, in contrast to how many crimes are committed by a target population of offenders. Such surveys typically use samples that represent a broader population, such as U.S. households, adult males, or high school se-niors. The Monitoring the Future survey, briefl y described in Chapter 4, is a self-report survey that centers on measuring the prevalence of of-fending among high school seniors.

General-population surveys and surveys of offenders tend to present different types of diffi culties in connection with the validity and reliability of self-reports. Recall error and the r eporting of fabricated offenses may be prob-lems in a survey of high-rate offenders (Roberts, Mulvey, Horney, et al. 2005), whereas respon-dents in general-population self-report surveys may be reluctant to disclose illegal behavior. When we discuss questionnaire construction

t echniques, such as focus groups, we conclude the chapter with some advice on the benefi ts and pitfalls of conducting your own surveys.

Topics Appropriate to Survey ResearchSurveys have a wide variety of uses in basic and ap-plied criminal justice research.

Surveys may be used for descriptive, explana-tory, exploratory, and applied research. They are best suited for studies that have individual people as the units of analysis. They are often used for other units of analysis as well, such as households or organizations. Even in these cases, however, one or more individual people act as respondents or informants.

For example, researchers sometimes use vic-timization incidents as units of analysis in ex-amining data from crime surveys. The fact that some people may be victimized more than once and others not at all means that victimization incidents are not the same units as individuals. However, a survey questionnaire must still be administered to people who provide informa-tion about victimization incidents. In a similar fashion, the National Jail Census, conducted every fi ve or so years by the Census Bureau, col-lects information about local detention facili-ties. Jails are the units of analysis, but informa-tion about each jail is provided by individuals. Quite a lot of research on police practices was conducted following passage of the 1994 Crime Bill, and in most cases, law enforcement agen-cies were the units of analysis; individual peo-ple, however, provided information for the sur-veys of police departments.

We now consider some broad categories of research applications in which survey methods are especially appropriate.

Counting CrimeWe touched on this use of surveys in Chapter 4. Asking people about victimizations is a measure of crime that adjusts for some of the l imitations

172 Part Three Modes of Observation

propriate for evaluating any policy that may in-crease crime reporting as a side effect.

Consider also that large-scale surveys such as the NCVS cannot be used to evaluate local crime prevention programs. This is because the NCVS is designed to represent the national population of persons who live in households. Although NCVS data for the 11 largest states can be analyzed separately (Lauritsen and Schaum 2005), the NCVS is not representa-tive of any particular local jurisdiction. It is not possible to identify the specifi c location of vic-timizations from NCVS data.

The community victim surveys designed by the BJS and the COPS offi ce help with each of these needs. Local surveys can be launched specifi cally to evaluate local crime prevention efforts. Or innovative programs can be timed to correspond to regular cycles of local surveys. In each case, the BJS-COPS guide (Weisel 1999) presents advice on drawing samples to repre-sent local jurisdictions.

Another type of targeted victim survey is one that focuses on particular types of incidents that might target more narrowly defi ned pop-ulation segments. A good example is the Na-tional Violence Against Women Survey, a joint effort of the National Institute of Justice and a violence prevention bureau in the National Institutes of Health (Tjaden and Thoennes 2000). Screening questions presented explicit descriptions of sexual and other violence with the specifi c purpose of providing better infor-mation about these incidents that have proved diffi cult to measure through general-purpose crime surveys.

Other Evaluation UsesOther types of surveys may be appropriate for applied studies. A good illustration of this is a continuing series of neighborhood surveys to evaluate community policing in Chicago. Here’s an example of how the researchers link their information needs to surveys (Chicago Community Policing Evaluation Consortium 2004, 2):

later in this chapter, we will present examples and suggestions for creating self-report items.

Perceptions and Attitudes

Another application of surveys in criminal justice is to learn how people feel about crime and criminal justice policy. Public views about sentencing policies, gun control, police performance, and drug abuse are often solicited in opinion polls. Begun in 1972, the General Social Survey is an ongoing survey of social indicators in the United States. Questions about fear of crime and other perceptions of crime problems are regularly included. Since the 1970s, a growing number of explanatory studies have been conducted on public perceptions of crime and crime problems. A large body of research on fear of crime has grown, in part, from the realization that fear and its behavioral consequences are much more widespread among the population than is actual criminal victimization (Ditton and Farrall 2007).

Targeted Victim Surveys

Victim surveys that target individual cities or neighborhoods are important tools for evaluating policy innovations. Many criminal justice programs seek to prevent or reduce crime in a specific area, but crimes reported to police cannot be used to evaluate many types of programs.

To see why this is so, consider a hypothetical community policing program that encourages neighborhood residents to report all suspected crimes to the police. Results from the National Crime Victimization Survey (NCVS) have consistently shown that many minor incidents are not reported because victims believe that police will not want to be bothered. But if a new program stresses that police actually want to be bothered, the proportion of crimes reported may increase, resulting in what appears to be an increase in crime.

The solution is to conduct targeted victim surveys before and after introducing a policy change. Such victim surveys are especially ap-

Chapter 7 Survey Research and Other Ways of Asking Questions 173

police in your city today?” and be provided with a space to write in the answer (or be asked to report it orally to an interviewer). The other option is closed-ended questions, in which the respondent is asked to select an answer from among a list provided by the researcher.

Closed-ended questions are especially useful because they provide more uniform responses and are more easily processed. They often can be transferred directly into a data file. Open-ended responses, in contrast, must be coded before they can be processed for analysis. This coding process often requires that the researcher interpret the meaning of responses, which opens up the possibility of misunderstanding and researcher bias. Also, some respondents may give answers that are essentially irrelevant to the researcher’s intent.

The chief shortcoming of closed-ended questions lies in the researcher’s structuring of responses. When the relevant answers to a given question are relatively clear, there should be no problem. In some cases, however, the researcher’s list of responses may fail to include some important answers. When we ask about “the most important crime problem facing the police in your city today,” for example, our checklist might omit certain crime problems that respondents consider important.

In constructing closed-ended questions, we are best guided by two of the requirements for operationalizing variables stated in Chapter 4. First, the response categories provided should be exhaustive: they should include all the possible responses that might be expected. Often, researchers ensure this by adding a category labeled something like “Other (Please specify: ______________).” Second, the answer categories must be mutually exclusive: the respondent should not feel compelled to select more than one. In some cases, researchers solicit multiple answers, but doing so can create difficulties in subsequent data processing and analysis. To ensure that categories are mutually exclusive, we should carefully consider each combination of categories, asking whether a person could

Because it is a participatory program, CAPS [Chicago’s Alternative Policing Strategy] depends on the effectiveness of campaigns to bring it to the public’s attention and on the success of efforts to get the public involved in beat meetings and other district projects. The surveys enable us to track the public’s awareness and involvement in community policing in Chicago.

In general, surveys can be used to evaluate policy that seeks to change attitudes, beliefs, or perceptions. Consider a program designed to promote victim and witness cooperation in criminal court by reducing case-processing time. At first, we might consider direct measures of case-processing time as indicators of program success. If the program goal is to increase cooperation, however, a survey that asks how victims and witnesses perceive case-processing time will be more appropriate.

Guidelines for Asking Questions

How questions are asked is the single most important feature of survey research.

A defining feature of survey methods is that research concepts are operationalized by asking people questions. Several general guidelines can assist in framing and asking questions that serve as excellent operationalizations of variables. It is important also to be aware of pitfalls that can result in useless and even misleading information. We’ll begin with some of the options available for creating questionnaires.

Open-Ended and Closed-Ended Questions

In asking questions, researchers have two basic options, and each can accommodate certain variations. The first is open-ended questions, in which the respondent is asked to provide his or her own answers. For example, the respondent may be asked, “What do you feel is the most important crime problem facing the


cide?” Questionnaire items should be precise so that the respondent knows exactly what the researcher wants an answer to.

Frequently, researchers ask respondents for a single answer to a combination question. Such double-barreled questions seem to occur most often when the researcher has personally identified with a complex question. For example, the researcher might ask respondents to agree or disagree with the statement “The Department of Corrections should stop releasing inmates for weekend furloughs and concentrate on rehabilitating criminals.” Although many people will unequivocally agree with the statement and others will unequivocally disagree, still others will be unable to answer. Some might want to terminate the furlough program and punish, not rehabilitate, prisoners. Others might want to expand rehabilitation efforts while maintaining weekend furloughs; they can neither agree nor disagree without misleading the researcher.

Short Items Are Best

In the interest of being unambiguous and precise and pointing to the relevance of an issue, researchers often create long, complicated items. That should be avoided. In the case of questionnaires respondents complete themselves, they are often unwilling to study an item to understand it. The respondent should be able to read an item quickly, understand its intent, and select or provide an answer without difficulty. In general, it’s safe to assume that respondents will read items quickly and give quick answers; therefore short, clear items that will not be misinterpreted under those conditions are best. Questions read to respondents in person or over the phone should be similarly brief.

Avoid Negative Items

A negation in a questionnaire item paves the way for easy misinterpretation. Asked to agree or disagree with the statement “Drugs such as marijuana should not be legalized,” many

reasonably choose more than one answer. In addition, it is useful to add an instruction that respondents should select the one best answer. However, this is still not a satisfactory substitute for a carefully constructed set of responses.
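For count-type response categories, the two requirements discussed here (exhaustive and mutually exclusive) can be checked mechanically before a questionnaire is printed. The following is a minimal sketch, not a standard tool: the function name `check_categories` and the representation of each category as a `(label, low, high)` tuple are assumptions made for illustration, and the labels borrow the “Never,” “Once,” “2 to 5 times” wording used elsewhere in this chapter.

```python
def check_categories(categories, minimum=0):
    """Return a list of problems found in numeric response categories.

    Each category is a (label, low, high) tuple; high=None marks an
    open-ended top category such as "More than 20 times".
    An empty list means the categories cover every count from `minimum`
    upward (exhaustive) without overlapping (mutually exclusive).
    """
    problems = []
    next_uncovered = minimum  # smallest count no category covers yet
    for label, low, high in sorted(categories, key=lambda c: c[1]):
        if next_uncovered is None or low < next_uncovered:
            problems.append(f"overlap at '{label}'")
        elif low > next_uncovered:
            problems.append(
                f"gap before '{label}': {next_uncovered} to {low - 1}"
            )
        if next_uncovered is not None:
            # An open-ended category covers everything above it.
            next_uncovered = None if high is None else max(next_uncovered, high + 1)
    if next_uncovered is not None:
        problems.append(f"not exhaustive: no category for {next_uncovered} or more")
    return problems


# Hypothetical category sets for a "how many times" item.
good = [
    ("Never", 0, 0),
    ("Once", 1, 1),
    ("2 to 5 times", 2, 5),
    ("6 to 10 times", 6, 10),
    ("11 to 20 times", 11, 20),
    ("More than 20 times", 21, None),
]
bad = [
    ("Never", 0, 0),
    ("1 to 5 times", 1, 5),
    ("5 to 10 times", 5, 10),   # 5 appears in two categories
    ("More than 20 times", 21, None),  # 11 to 20 has no category
]

print(check_categories(good))
print(check_categories(bad))
```

A check like this only catches numeric gaps and overlaps. Whether a list of attitude or crime-problem categories is exhaustive still requires the judgment, and usually the “Other (Please specify)” option, described above.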

Questions and Statements

The term questionnaire suggests a collection of questions, but a typical questionnaire probably has as many statements as questions. This is because researchers often are interested in determining the extent to which respondents hold a particular attitude or perspective. Researchers try to summarize the attitude in a fairly brief statement; then they present that statement and ask respondents whether they agree or disagree with it. Rensis Likert formalized this procedure through the creation of the Likert scale, a format in which respondents are asked whether they strongly agree, agree, disagree, or strongly disagree, or perhaps strongly approve, approve, and so forth.

Both questions and statements may be used profitably. Using both in a questionnaire adds flexibility in the design of items and can make the questionnaire more interesting as well.

Make Items Clear

It should go without saying that questionnaire items must be clear and unambiguous, but the broad proliferation of unclear and ambiguous questions in surveys makes the point worth stressing here. Researchers commonly become so deeply involved in the topic that opinions and perspectives that are clear to them will not be at all clear to respondents, many of whom have given little or no thought to the topic. Or researchers may have only a superficial understanding of the topic and so may fail to specify the intent of a question sufficiently. The question “What do you think about the governor’s decision concerning prison furloughs?” may evoke in the respondent some counter-questions: “Which governor’s decision?” “What are prison furloughs?” “What did the governor de-


found that the way programs were identified had an impact on the amount of public support they received. Here are some comparisons:

More Support                       Less Support
“Assistance to the poor”           “Welfare”
“Halting the rising crime rate”    “Law enforcement”
“Dealing with drug addiction”      “Drug rehabilitation”

In 1986, for example, 63 percent of respondents said too little money was being spent on “assistance to the poor,” while in a matched survey that year, only 23 percent said we were spending too little on “welfare.”

The main guidance we offer for avoiding bias is that researchers imagine how they would feel giving each of the answers they offer to respondents. If they would feel embarrassed, perverted, inhumane, stupid, irresponsible, or anything like that, then they should give some serious thought to whether others will be willing to give those answers. Researchers must carefully examine the purpose of their inquiry and construct items that will be most useful to it.

We also need to be generally wary of what researchers call the social desirability of questions and answers. Whenever we ask people for information, they answer through a filter of what will make them look good. That is especially true if they are being interviewed in a face-to-face situation.

Designing Self-Report Items

Social desirability is one of the problems that plagues self-report crime questions in general population surveys. Adhering to the ethical principles of confidentiality and anonymity, as well as convincing respondents that we are doing so, is one way of getting more truthful responses to self-report items. Other techniques can help us avoid or reduce problems with self-report items.

respondents will overlook the word not and answer on that basis. Thus some will agree with the statement when they are in favor of legalizing marijuana and others will agree when they oppose it. And we may never know which is which.

Biased Items and Terms

Recall from the earlier discussion of conceptualization and operationalization that there are no ultimately true meanings for any of the concepts we typically study in social science. This same general principle applies to the responses we get from persons in a survey.

The meaning of a given response to a question depends in large part on the wording of the question. That is true of every question and answer. Some questions seem to encourage particular responses more than other questions. Questions that encourage respondents to answer in a particular way are biased. Most researchers recognize the likely effect of a question such as “Do you support the president’s initiatives to promote the safety and security of all Americans?” and no reputable researcher would use such an item. The biasing effect of items and terms is far subtler than this example suggests, however.

The mere identification of an attitude or position with a prestigious (or unpopular) person or agency can bias responses. For example, an item that starts with “Do you agree or disagree with the recent Supreme Court decision that …” might have this effect. We are not suggesting that such wording will necessarily produce consensus or even a majority in support of the position identified with the prestigious person or agency. Rather, support will likely be greater than what would have been obtained without such identification.

Sometimes, the impact of different forms of question wording is relatively subtle. For example, Kenneth Rasinski (1989) analyzed the results of several General Social Survey studies of attitudes toward government spending. He


lustrate how thoughtful wording and introductions can be incorporated into sensitive questions.

Self-report surveys of known offenders encounter different problems. Incarcerated persons may be reluctant to admit committing crimes because of the legal consequences. High-rate offenders may have difficulty distinguishing among a number of different crimes or remembering even approximate dates. Sorting out dates and details of individual crimes among high-rate offenders requires different strategies.

One technique that is useful in surveys of active offenders is to interview subjects several times at regular intervals. For example, Lisa Maher (1997) interviewed her sample of heroin- or cocaine-addicted women repeatedly (sometimes daily) over the course of three years. Each subject was asked about her background, intimate relationships with men, income-generating activities, and drug-use habits. Having regular interviews helped respondents recall offending.

One method, used in earlier versions of the BCS, is to introduce a group of self-report items with a disclaimer and to sanitize the presentation of offenses. The self-report section of the 1984 BCS began with this introduction:

There are lots of things which are actually crimes, but which are done by lots of people, and which many people do not think of as crimes. On this card [printed card handed to respondents] are a list of eight of them. For each one can you tell me how many people you think do it—most people, a lot of people, or no one.

Respondents then read a card, shown in Figure 7.1, that presented descriptions of various offenses. Interviewers first asked respondents how many people they thought ever did X, where X corresponded to the letter for an offense shown in Figure 7.1. Next, respondents were asked whether they had ever done X. Interviewers then moved on down the list of letters for each offense on the card.

This procedure incorporates three techniques to guard against the socially desirable response of not admitting to having committed a crime. First, the disclaimer seeks to reassure respondents that “many people” do not really think of various acts as crimes. Second, respondents are asked how many people they think commit each offense before being asked whether they have done so themselves. This takes advantage of a common human justification for engaging in certain kinds of behavior: other people do it. Third, asking whether they “have ever done X” is less confrontational than asking whether they “have ever cheated on an expense account.” Again, the foibles of human behavior are at work here, in much the same way that people use euphemisms such as restroom for “toilet” and sleep together for “have sexual intercourse.” It is, of course, not realistic to expect that such ploys will reassure all respondents. Furthermore, disclaimers about serious offenses such as rape or bank robbery would be ludicrous. But such techniques il-

Figure 7.1 Showcard for Self-Report Items, 1984 British Crime Survey
Source: Adapted from the 1984 British Crime Survey (NOP Market Research Limited 1985).

A. Taken office supplies from work (such as stationery, envelopes, and pens) when not supposed to.

B. Taken things other than office supplies from work (such as tools, money, or other goods) when not supposed to.

C. Fiddled expenses [fiddled is the Queen’s English equivalent of fudged].

D. Deliberately traveled [on a train] without a ticket or paid too low a fare.

E. Failed to declare something at customs on which duty was payable.

F. Cheated on tax.

G. Used cannabis (hashish, marijuana, ganga, grass).

H. Regularly driven a car when they know they have drunk enough to be well above the legal limit.


tions. An improperly laid-out questionnaire can cause respondents to miss questions, confuse them about the nature of the data desired, and, in the extreme, lead them to throw the questionnaire away.

As a general rule, the questionnaire should be uncluttered. Inexperienced researchers tend to fear that their questionnaire will look too long, so they squeeze several questions onto a single line, abbreviate questions, and try to use as few pages as possible. Such efforts are ill-advised and even counterproductive. Putting more than one question on a line will cause some respondents to miss the second question altogether. Some respondents will misinterpret abbreviated questions. And, more generally, respondents who have spent considerable time on the first page of what seemed a short questionnaire will be more demoralized than respondents who quickly completed the first several pages of what initially seemed a long form. Moreover, the latter will have made fewer errors and will not have been forced to reread confusing, abbreviated questions. Nor will they have been forced to write a long answer in a tiny space.

Contingency Questions

Quite often in questionnaires, certain questions are clearly relevant to only some of the respondents and irrelevant to others. A victim survey, for example, presents batteries of questions about victimization incidents that are meaningful only to crime victims.

Frequently, this situation (realizing that the topic is relevant only to some respondents) arises when we wish to ask a series of questions about a certain topic. We may want to ask whether respondents belong to a particular organization and, if so, how often they attend meetings, whether they have held office in the organization, and so forth. Or we might want to ask whether respondents have heard anything about a certain policy proposal, such as opening a youth shelter in the neighborhood, and then investigate the attitudes of those who have heard of it.

Other research asks offenders to complete “crime calendars” on which they make records of weekly or monthly offenses committed. Jennifer Roberts and associates (Roberts, Mulvey, Horney, et al. 2005) found that more frequent interviews were necessary for high-rate offenders, and that crime calendars were best suited for tracking more serious offenses.

Obtaining valid and reliable results from self-report items is challenging, but self-report survey techniques are important tools for addressing certain types of criminal justice research questions. Because of this, researchers are constantly striving to improve self-report items. See the collection of essays by Kennet and Gfroerer (2005) for a detailed discussion of issues involved in measuring self-reported drug use through the National Household Survey on Drug Abuse. A National Research Council report (2001) discusses self-report survey measures more generally.

Computer technology has made it possible to significantly improve self-reported items. David Matz (2007) describes advances in self-report items from recent surveys that supplement the British Crime Survey. We present examples later in the chapter when we focus on different modes of survey administration.

Questionnaire Construction

After settling on question content, researchers must consider the format and organization of all items in a questionnaire.

Because questionnaires are the fundamental instruments of survey research, we now turn our attention to some of the established techniques for constructing them. The following sections are best considered as a continuation of our theoretical discussions in Chapter 4 of conceptualization and measurement.

General Questionnaire Format

The format of a questionnaire is just as important as the nature and wording of the ques-


tingency questions is long enough to extend over several pages. Victim surveys typically include many contingency questions. Figure 7.3 presents a few questions from the NCVS questionnaire used in 2004. All respondents are asked a series of screening questions to reveal possible victimizations. Persons who answer yes to any of the screening questions then complete a crime incident report that presents a large number of items designed to measure details of the victimization incident.

As Figure 7.3 shows, the crime incident report itself also contains contingency questions. You might notice that even this brief adaptation from the NCVS screening and crime incident report questionnaires is rather complex. NCVS questionnaires are administered primarily through computer-assisted telephone interviews in which the flow of contingency questions is more or less automated. It would be difficult to construct a self-administered victimization questionnaire with such complicated contingency questions.

Matrix Questions

Often researchers want to ask several questions that have the same set of answer categories. This happens whenever the Likert response categories are used. Then it is often possible to construct a matrix of items and answers, as illustrated in Figure 7.4.

This format has three advantages. First, it uses space efficiently. Second, respondents

The subsequent questions in series such as these are called contingency questions; whether they are to be asked and answered is contingent on the response to the first question in the series. The proper use of contingency questions can make it easier for respondents to complete the questionnaire because they do not have to answer questions that are irrelevant to them.

Contingency questions can be presented in several formats on printed questionnaires. The one shown in Figure 7.2 is probably the clearest and most effective. Note that the questions shown in the figure could have been dealt with in a single question: “How many times, if any, have you smoked marijuana?” The response categories then would be: “Never,” “Once,” “2 to 5 times,” and so forth. This single question would apply to all respondents, and each would find an appropriate answer category. Such a question, however, might put pressure on some respondents to report having smoked marijuana, because the main question asks how many times they have smoked it. The contingency question format illustrated in Figure 7.2 reduces the subtle pressure on respondents to report having smoked marijuana. This discussion shows how seemingly theoretical issues of validity and reliability are involved in so mundane a matter as how to format questions on a piece of paper.

Used properly, even complex sets of contingency questions can be constructed without confusing respondents. Sometimes a set of con-

23. Have you ever smoked marijuana?
    [ ] Yes
    [ ] No

        If yes: About how many times have you smoked marijuana?
        [ ] Once
        [ ] 2 to 5 times
        [ ] 6 to 10 times
        [ ] 11 to 20 times
        [ ] More than 20 times

Figure 7.2 Contingency Question Format


Screening Question:

36a. I’m going to read you some examples that will give you an idea of the kinds of crimes this study covers. As I go through them, tell me if any of these happened to you in the last 6 months, that is, since 2024.

Was something belonging to YOU stolen, such as-

(a) Things that you carry, like luggage, a wallet, purse, briefcase, book-
(b) Clothing, jewelry or calculator-
(c) Bicycle or sports equipment-
(d) Things in your home, like a TV, stereo, or tools-
(e) Things outside your home, such as a garden hose or lawn furniture-
(f) Things belonging to children in the household-
(g) Things from a vehicle, such as a package, groceries, camera, or tapes-
OR
(h) Did anyone ATTEMPT to steal anything belonging to you?

Crime Incident Report:

20a. Were you or any other member of this household present when this incident occurred?

___ Yes [ask item 20b]

___ No [skip to 56, page 8]

20b. Which household members were present?

___ Respondent only [ask item 21]

___ Respondent and other household member(s) [ask item 21]

___ Only other household member(s) [skip to 59, page 8]

21. Did you personally see an offender?

___ Yes

___ No

. . . . . . . . .

56. Do you know or have you learned anything about the offender(s)—for instance, whether there was one or more than one offender involved, whether it was someone young or old, or male or female?

___ Yes [ask 57]

___ No [skip to 88, page 11]

Figure 7.3 NCVS Screening Questions and Crime Incident Report
Source: Adapted from National Crime Victimization Survey, NCVS-1 Basic Screen Questionnaire, 9/16/2004 version, www.ojp.usdoj.gov/bjs/pub/pdf/ncvs104.pdf, accessed May 9, 2008; National Crime Victimization Survey, NCVS-2 Crime Incident Report, 9/16/2004 version, www.ojp.usdoj.gov/bjs/pub/pdf/ncvs204.pdf, accessed May 9, 2008.
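The skip instructions in the crime incident report portion of Figure 7.3 amount to a small routing table, which is roughly what a computer-assisted interview automates. The sketch below is only an illustration of that idea, not the software actually used for the NCVS: the names `SKIPS` and `route` are hypothetical, and the table includes only the items and skip targets printed in the figure. Items the figure elides are left out, so routing simply stops when it reaches one.

```python
# Routing table: item number -> {answer: next item to ask}.
# Only the items and skip targets shown in Figure 7.3 are included.
SKIPS = {
    "20a": {"Yes": "20b", "No": "56"},
    "20b": {
        "Respondent only": "21",
        "Respondent and other household member(s)": "21",
        "Only other household member(s)": "59",
    },
    "56": {"Yes": "57", "No": "88"},
}


def route(answers, start="20a"):
    """Follow the skip pattern and return the items reached, in order.

    `answers` maps item numbers to the response given. Routing stops at
    the first item that has no entry in SKIPS or no recorded answer;
    that item is included as the last element of the returned list.
    """
    path = []
    item = start
    while item in SKIPS and item in answers:
        path.append(item)
        item = SKIPS[item][answers[item]]
    path.append(item)
    return path


# No household member was present, and nothing is known about the offender.
print(route({"20a": "No", "56": "No"}))

# Only another household member was present.
print(route({"20a": "Yes", "20b": "Only other household member(s)"}))
```

Writing the skip pattern as data rather than as printed instructions is one reason complicated contingency questions are easier to manage in computer-assisted interviews than on paper: the respondent or interviewer never sees the items that do not apply.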

Figure 7.4 Matrix Question Format

17. Beside each of the statements presented below, please indicate whether you Strongly Agree (SA), Agree (A), Disagree (D), Strongly Disagree (SD), or are Undecided (U).

SA A D SD U

a. What this country needs is more law and order [ ] [ ] [ ] [ ] [ ]

b. Police in America should not carry guns [ ] [ ] [ ] [ ] [ ]

c. Repeat drug dealers should receive life sentences [ ] [ ] [ ] [ ] [ ]


ones. If several questions ask about the dangers of illegal drug use and then a question (open-ended) asks respondents to volunteer what they believe to be the most serious crime problems in U.S. cities, drug use will receive more mentions than would otherwise be the case. In this situation, it is preferable to ask the open-ended question first.

If respondents are asked to rate the overall effectiveness of corrections policy, they will answer subsequent questions about specific aspects of correctional institutions in a way that is consistent with their initial assessment. The converse is true as well: if respondents are first asked specific questions about prisons and other correctional facilities, their subsequent overall assessment will be influenced by the earlier question.

The best solution is sensitivity to the problem. Although we cannot avoid the effect of question order, we should attempt to estimate what that effect will be. Then we will be able to interpret results in a meaningful fashion. If the order of questions seems an especially important issue in a given study, we could construct several versions of the questionnaire that contain the different possible orderings of questions. We could then determine the effects of ordering. At the very least, different versions of the questionnaire should be pretested.
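Constructing several versions of a questionnaire with different orderings, and spreading them evenly across respondents, can be sketched in a few lines. This is an illustrative outline under stated assumptions: the function names `build_versions` and `assign_versions`, the block labels, and the respondent IDs are all hypothetical, and a real study would usually limit the number of blocks, since the number of possible orderings grows very quickly.

```python
import itertools
import random


def build_versions(blocks):
    """Return every possible ordering of the question blocks.

    Each ordering becomes one version of the questionnaire.
    """
    return [list(order) for order in itertools.permutations(blocks)]


def assign_versions(respondent_ids, versions, seed=0):
    """Randomly assign versions so each is used about equally often.

    Respondents are shuffled, then versions are dealt out in rotation.
    The fixed seed only makes the illustration repeatable.
    """
    rng = random.Random(seed)
    ids = list(respondent_ids)
    rng.shuffle(ids)
    return {rid: versions[i % len(versions)] for i, rid in enumerate(ids)}


# Hypothetical blocks based on the corrections example above.
blocks = ["overall corrections rating", "specific prison items"]
versions = build_versions(blocks)
assignment = assign_versions(range(1, 11), versions)

for version in versions:
    count = sum(1 for v in assignment.values() if v == version)
    print(version, "assigned to", count, "respondents")
```

Because assignment to versions is random, differences in answers between versions can reasonably be attributed to question order rather than to differences among the respondents who received each version.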

The desired ordering of questions differs somewhat between self-administered questionnaires and interviews. In the former, it is usually best to begin the questionnaire with the most interesting questions. Potential respondents who glance casually at the first few questions should want to answer them. Perhaps the questions involve opinions that they are aching to express. At the same time, however, the initial questions should be neither threatening nor sensitive. It might be a bad idea to begin with questions about sexual behavior or drug use. Requests for demographic data (age, gender, and the like) should generally be placed at the end of a self-administered questionnaire. Placing these questions at the beginning, as many

probably find it easier to complete a set of questions presented in this fashion. Third, this format may increase the comparability of responses given to different questions for the respondent, as well as for the researcher. Because respondents can quickly review their answers to earlier items in the set, they might choose between, say, “strongly agree” and “agree” on a given statement by comparing their strength of agreement with their earlier responses in the set.

Some dangers are inherent in using this format, as well. Its advantages may promote structuring an item so that the responses fit into the matrix format when a different, more idiosyncratic, set of responses might be more appropriate. Also, the matrix question format can generate a response set among some respondents. This means that respondents may develop a pattern of, say, agreeing with all the statements, without really thinking about what the statements mean. That is especially likely if the set of statements begins with several that indicate a particular orientation (for example, a conservative political perspective) and then offers only a few subsequent ones that represent the opposite orientation. Respondents might assume that all the statements represent the same orientation and, reading quickly, misread some of them, thereby giving the wrong answers. This problem can be reduced somewhat by alternating statements that represent different orientations and by making all statements short and clear.
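One rough way to look for a response set after the data are collected is to flag respondents who gave the identical answer to every item in a matrix. The sketch below assumes responses are stored as lists of the codes used in Figure 7.4 (SA, A, D, SD, U); the function name `straight_liners`, the `min_items` threshold, and the respondent data are hypothetical. A flag is only a prompt for closer review, since a respondent may sincerely hold the same level of agreement on every item.

```python
def straight_liners(responses, min_items=3):
    """Return IDs of respondents who gave one identical answer throughout.

    `responses` maps a respondent ID to the list of answer codes given
    to the matrix items, in order. Respondents with fewer than
    `min_items` answers are not flagged.
    """
    flagged = []
    for rid, answers in responses.items():
        if len(answers) >= min_items and len(set(answers)) == 1:
            flagged.append(rid)
    return flagged


# Hypothetical answers to items a, b, and c of a matrix like Figure 7.4.
responses = {
    "r1": ["SA", "SA", "SA"],  # same answer to every statement
    "r2": ["A", "SD", "A"],    # answers vary across statements
    "r3": ["U", "U", "U"],     # undecided throughout
}

print(straight_liners(responses))
```

The check is more informative when the matrix alternates statements of opposite orientation, as recommended above, because identical answers to opposing statements are harder to explain as a considered opinion.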

A more difficult problem is when responses are generated through respondent boredom or fatigue. This can be avoided by keeping matrix questions and the entire questionnaire as short as possible. Later in this chapter, in the section on comparing different methods of questionnaire administration, we will describe a useful technique for avoiding response sets generated by respondent fatigue.

Ordering Items in a Questionnaire

The order in which questions are asked can also affect the answers given. The content of one question can affect the answers given to later


Finally, it’s common for less experienced researchers to assume that questionnaires must be newly constructed for each application. In contrast, it’s almost always possible, and usually preferable, to use an existing questionnaire as a point of departure. See the box “Don’t Start from Scratch!” for more on this.

Self-Administered Questionnaires

Self-administered questionnaires are generally the least expensive and easiest to complete.

Although the mail survey is the typical method used in self-administered studies, several other methods are also possible. In some cases, it may be appropriate to administer the questionnaire to a group of respondents gathered at the same

inexperienced researchers are tempted to do, might make the questionnaire appear overly intrusive, so the person who receives it may not want to complete it.

Just the opposite is generally true for in-person interview and telephone surveys. When the potential respondent’s door first opens, the interviewer must begin to establish rapport quickly. After a short introduction to the study, the interviewer can best begin by enumerating the members of the household, obtaining demographic data about each. Such questions are easily answered and generally nonthreatening. Once the initial rapport has been established, the interviewer can move into more sensitive areas. An interview that begins with the question “Do you ever worry about strangers appearing at your doorstep?” will probably end rather quickly.

DON’T START FROM SCRATCH!

It’s always easier to modify an existing questionnaire for a particular research application than it is to start from scratch. It’s also difficult to imagine asking questions that nobody has asked before. Here are examples of websites that present complete questionnaires or batteries of questionnaire items.

■ Bureau of Justice Statistics (BJS). In addition to administering the NCVS, the BJS collects information from a variety of justice organizations. Copies of recent questionnaires for all BJS-sponsored surveys are available: www.ojp.usdoj.gov/bjs/quest.htm

■ California Healthy Kids Survey. This set of questionnaires is useful for assessing behavior routines. Most include items on alcohol, tobacco, and other drug use; fighting; and other behaviors of potential interest for school-based interventions. English and Spanish versions are available for elementary, middle, and high school: www.wested.org/hks/

■ Centers for Disease Control and Prevention (CDC). Various centers within the CDC regularly collect a variety of health-related data through questionnaires and other data collection systems. Copies of instruments are available: www.cdc.gov/nchs/express.htm

■ The Measurement Group. This website provides links to questionnaires designed for use in public health studies, but many of these include items of potential interest to treatment-related initiatives: www.themeasurementgroup.com/evalbttn.htm

■ University of Surrey Question Bank. Maintained by a university in England, the Question Bank includes links to complete questionnaires for a wide variety of surveys conducted in the United Kingdom and other countries. You can find a master list of surveys or browse questionnaires by topic. An excellent resource: http://qb.soc.surrey.ac.uk/

Source: Adapted from Maxfield (2001). All websites accessed May 9, 2008.

182 Part Three Modes of Observation

recall your reasons for not returning it—and keep those in mind any time you plan to send questionnaires to others.

One big reason people do not return questionnaires is that it seems like too much trouble. To overcome this problem, researchers have developed ways to make the return of questionnaires easier. One method involves a self-mailing questionnaire that requires no return envelope. The questionnaire is designed so that when it is folded in a particular fashion, the return address appears on the outside. That way, the respondent doesn’t have to worry about losing the envelope.

Warning Mailings and Cover Letters

Warning mailings are used to verify who lives at sampled addresses, and to increase response rates. Warning mailings work like this: After researchers generate a sample, they send a postcard to each selected respondent, with the notation “Address correction requested” printed on the postcard. If the addressee has moved and left a forwarding address, the questionnaire is sent to the new address. In cases in which someone has moved and not left a forwarding address, or more than a year has elapsed and the post office no longer has information about a new address, the postcard is returned marked something like “Addressee unknown.” Selected persons who still reside at the original listed address are warned in suitable language to expect a questionnaire in the mail. In such cases, postcards should briefly describe the purpose of the survey for which the respondent has been selected.

Warning letters can be more effective than postcards in increasing response rates, and they can also serve the purpose of cleaning addresses. Letters printed on letterhead stationery can present a longer description of the survey’s purpose and a more reasoned explanation of why it is important for everyone to respond.

Cover letters accompanying the questionnaire offer a similar opportunity to increase response rates. Two features of cover letters warrant some attention. First, the content of

place at the same time, such as police officers at roll call or prison inmates at some specially arranged assembly. Or probationers might complete a questionnaire when they report for a meeting with their probation supervisor. The Monitoring the Future survey (see Chapter 4) has high school seniors complete self-administered questionnaires in class.

Some experimentation has been conducted on the home delivery of questionnaires. A research worker delivers the questionnaire to the home of sample respondents and explains the study. Then the questionnaire is left for the respondent to complete, and the researcher picks it up later.

Home delivery and the mail can be used in combination as well. Questionnaires can be mailed to families, and then research workers may visit the homes to pick up the questionnaires and check them for completeness. In the opposite approach, survey packets are hand-delivered by research workers with a request that the respondents mail the completed questionnaires to the research office. In general, when a research worker delivers the questionnaire, picks it up, or both, the completion rate is higher than for straightforward mail surveys.

More recently, the Internet has made it possible to have respondents complete self-administered questionnaires online. Before discussing web-based questionnaires, let us turn our attention to the fundamentals of mail surveys, which might still be used for people without Internet access.

Mail Distribution and Return

The basic method for collecting data through the mail is transmittal of a questionnaire accompanied by a letter of explanation and a self-addressed, stamped envelope for returning the questionnaire. You have probably received a few. As a respondent, you are expected to complete the questionnaire, put it in the envelope, and mail it back. If, by any chance, you have received such a questionnaire and failed to return it, it would be a valuable exercise for you to

do so at all. Properly timed follow-up mailings provide additional stimuli to respond.

The effects of follow-up mailings may be seen by monitoring the number of questionnaires received over time. Initial mailings will be followed by a rise in and subsequent subsiding of returns, and follow-up mailings will spur a resurgence of returns. In practice, three mailings (an original and two follow-ups) are most effective.

Acceptable Response Rates

A question frequently asked about mail surveys concerns the percentage return rate that should be achieved. Note that the body of inferential statistics used in connection with survey analysis assumes that all members of the initial sample complete and return their questionnaires. Because this almost never happens, response bias becomes a concern. Researchers must test (and hope for) the possibility that respondents look essentially like a random sample of the initial sample and thus a somewhat smaller random sample of the total population. For example, if the gender of all people in the sample is known, a researcher can compare the percentages of males and females indicated on returned questionnaires with the percentages for the entire sample.
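The gender comparison just described can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the counts are hypothetical, the function names are invented for this example, and a chi-square goodness-of-fit statistic is one reasonable way to summarize the comparison, not a procedure prescribed by the text.

```python
def percentages(counts):
    """Convert a dict of category counts to percentages of the total."""
    total = sum(counts.values())
    return {k: 100.0 * v / total for k, v in counts.items()}


def chi_square_gof(sample_counts, returned_counts):
    """Chi-square goodness-of-fit statistic comparing respondents with the sample.

    Expected counts assume respondents mirror the sample's proportions.
    """
    n_returned = sum(returned_counts.values())
    n_sample = sum(sample_counts.values())
    stat = 0.0
    for category, sample_n in sample_counts.items():
        expected = n_returned * sample_n / n_sample
        observed = returned_counts.get(category, 0)
        stat += (observed - expected) ** 2 / expected
    return stat


# Hypothetical counts, for illustration only.
sample = {"male": 500, "female": 500}
returned = {"male": 240, "female": 310}

print(percentages(sample))
print(percentages(returned))

stat = chi_square_gof(sample, returned)
# With two categories there is 1 degree of freedom; 3.841 is the
# conventional .05 critical value of chi-square for df = 1.
print(round(stat, 2), stat > 3.841)
```

A statistic above the critical value suggests that respondents differ from the initial sample on that characteristic. A small value on one characteristic does not by itself demonstrate the absence of response bias on others.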

Nevertheless, overall response rate is one guide to the representativeness of the sample respondents. If the response rate is high, there is less chance of significant response bias than if the rate is low. As a rule of thumb, a response rate of at least 50 percent is adequate for analysis and reporting. A response rate of at least 60 percent is good, and a response rate of 70 percent is very good. Bear in mind that these are only rough guides; they have no statistical basis, and a demonstrated lack of response bias is far more important than a high response rate. Response rates tend to be higher for surveys that target a narrowly defined population, whereas general population surveys yield lower response rates.
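As a small illustration, the rule of thumb can be written as a function. The function name and the label for rates under 50 percent are assumptions made for this sketch; the thresholds are the rough guides given above and carry no statistical weight.

```python
def describe_response_rate(returned, sample_size):
    """Return the response rate (percent) and its rule-of-thumb label."""
    rate = 100.0 * returned / sample_size
    if rate >= 70:
        label = "very good"
    elif rate >= 60:
        label = "good"
    elif rate >= 50:
        label = "adequate"
    else:
        # The text gives no label for rates under 50 percent;
        # this wording is an assumption for the example.
        label = "below the rule-of-thumb minimum"
    return rate, label


# Hypothetical mail survey: 1,000 questionnaires sent, 550 returned.
print(describe_response_rate(550, 1000))
```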

Don Dillman (2006) has undertaken an extensive review of the various techniques survey

the letter is obviously important. The message should communicate why a survey is being conducted, how and why the respondent was selected, and why it is important for the respondent to complete the questionnaire. In line with our discussion of the protection of human subjects in Chapter 2, the cover letter should also assure respondents that their answers will be confidential.

Second, the cover letter should identify the institutional affiliation or sponsorship of the survey. The two alternatives are (1) an institution that the respondent respects or can identify with, or (2) a neutral but impressive-sounding affiliation. For example, if we are conducting a mail survey of police chiefs, printing our cover letter on International Association of Chiefs of Police (IACP) stationery and having the letter signed by an official in the IACP might increase the response rate. Of course, we cannot adopt such a procedure unless the survey is endorsed by the IACP.

By the same token, it is important to avoid controversial affiliations or those inappropriate for the target population. The National Organization for the Reform of Marijuana Laws, for instance, is not suitable for most target populations. A university affiliation is appropriate in many cases, unless the university is on bad terms with the target population.

Follow-Up Mailings

Follow-up mailings may be administered in a number of ways. In the simplest, nonrespondents are sent a letter of additional encouragement to participate. A better method, however, is to send a new copy of the survey questionnaire with the follow-up letter. If potential respondents have not returned their questionnaires after two or three weeks, the questionnaires probably have been lost or misplaced.

The methodological literature on follow-up mailings strongly suggests that they are effective in increasing return rates in mail surveys. In general, the longer a potential respondent delays replying, the less likely he or she is to

The advantages of this method are obvious. Responses are automatically recorded in computer files, saving time and money. Web-page design tools make it possible to create attractive questionnaires that include contingency questions, matrixes, and other complex tools for presenting items to respondents. Dillman, long recognized for his total design approach to conducting mail surveys, has written a comprehensive guide to conducting mail, web-based, and other self-administered surveys (Dillman 2006).

All electronic versions of self-administered questionnaires face a couple of problems. The first concerns representativeness: will the people who can be surveyed online be representative of meaningful populations, such as all U.S. adults, all registered voters, or all residents of particular urban neighborhoods? This criticism has also been raised with regard to surveys via fax and, in the mid-20th century, with regard to telephone surveys. Put in terms that should be familiar from the previous chapter, how closely do available sampling frames for electronic surveys match possible target populations? If, for example, our target population is university students, how can we obtain a list of e-mail addresses or other identifiers that will enable us to survey a representative sample? It’s easy to think of other target populations of interest to criminal justice researchers that might be difficult to reach via e-mail or web-based questionnaires.

The second problem is an unfortunate consequence of the rapid growth of e-mail and related technologies. Just as junk mail clutters our physical mailboxes with all sorts of advertising, “spam” and other kinds of unwanted messages pop up all too often in our virtual mailboxes. The proliferation of junk e-mail has led to the development of anti-spam filters that screen out unwanted correspondence. Unfortunately, such programs can also screen out unfamiliar but well-meaning mail such as e-mail questionnaires. Similar problems with telemarketing have made it increasingly difficult to conduct surveys by telephone.

researchers use to increase return rates on mail surveys, and he evaluates the impact of each. More importantly, Dillman stresses the necessity of paying attention to all aspects of the study—what he calls the “total design method”—rather than one or two special gimmicks.

Computer-Based Self-Administration

Advances in computer and telecommunications technology over the past several decades have produced additional options for distributing and collecting self-administered questionnaires. Jeffrey Walker (1994) describes variations on conducting surveys by fax machine. Questionnaires are faxed to respondents, who are asked to fax their answers back. As the Internet and Web have permeated work and leisure activities, different types of computer-assisted self-administered surveys have become more common.

David Shannon and associates (Shannon, Johnson, Scarcy, and Lott 2002) describe three general types of electronic surveys. The first is a disk-based survey. Respondents load a questionnaire from a disk or CD into their own computer, key in responses to survey items, and then either mail the disk back to researchers or transmit the information electronically. As the earliest form of electronic survey, the disk-based survey is a relic of stand-alone personal computers. Disk-based surveys are now virtually obsolete as personal computers are routinely connected to the Web in one way or another.

The second type, e-mail surveys, has a few variations. Researchers can include a few simple questions in an e-mail message and ask respondents to reply by e-mail. More elaborate versions can embed complex formatted questionnaires in e-mail messages. Respondents might be asked to open an attached file that contains a questionnaire, or they might be directed to another web page that contains a formatted questionnaire. That brings us to the third type of electronic survey described by Shannon and associates—a questionnaire posted on a web page.

tions have replaced letters, check-writing, and other correspondence, self-administered surveys will increasingly be conducted on web-based computers. At the end of this chapter, we list a small sample of resources for conducting web-based surveys.

In-Person Interview Surveys

Face-to-face interviews are best for complex questionnaires and other specialized needs.

The in-person interview is an alternative method of collecting survey data. Rather than asking respondents to read questionnaires and enter their own answers, researchers send interviewers to ask the questions orally and record respondents’ answers. Most interview surveys require more than one interviewer, although a researcher might undertake a small-scale interview survey alone.

The Role of the Interviewer

Not surprisingly, in-person interview surveys typically attain higher response rates than mail surveys. Respondents seem more reluctant to turn down an interviewer who is standing on their doorstep than to throw away a mail questionnaire. A properly designed and executed interview survey ought to achieve a completion rate of at least 80 to 85 percent.

The presence of an interviewer generally decreases the number of “don’t know” and “no answer” responses. If minimizing such responses is important to the study, the interviewer can be instructed to probe for answers (“If you had to pick one of the answers, which do you think would come closest to your feelings?”).

The interviewer can also help respondents with confusing questionnaire items. If the respondent clearly misunderstands the intent of a question, the interviewer can clarify matters and thereby obtain a relevant response. Such clarifications must be strictly controlled, however, through formal specifications. Finally, the interviewer can observe as well as ask questions. For example, the interviewer can make

In yet another example of technology advances being accompanied by new threats, the spread of computer viruses has made people cautious about opening e-mail or attachments from unfamiliar sources. This problem, and the electronic version of junk mail, can be addressed in a manner similar to warning mailings for printed questionnaires. Before sending an e-mail with an embedded or linked questionnaire, researchers can distribute e-mail messages from trusted sources that warn recipients to expect to receive a questionnaire, and urge them to complete it.

We should keep one basic principle in mind when considering whether a self-administered questionnaire can be distributed electronically: web-based surveys depend on access to the Web, which, of course, implies having a computer. The use of computers and the Web continues to increase rapidly. Although access to this technology still is unequally distributed across socioeconomic classes, web-based surveys can be readily conducted for many target populations of interest to criminal justice researchers.

In their recommendations for rethinking crime surveys, Maxfield and associates argue that web-based surveys are well-suited for learning more about victims of computer-facilitated fraud. Since only people with Internet access are possible victims, an Internet-based sample is ideal (Maxfield, Hough, and Mayhew 2007). Mike Sutton (2007) describes other examples of nontraditional crimes where Internet samples of computer users are appropriate.

Recalling our discussion in Chapter 6, the correspondence between a sampling frame and target population is a crucial feature of sampling. Most justice professionals and criminal justice organizations routinely use the Web and e-mail. Lower-cost but generalizable victim surveys can use web-based samples of university students to distribute questionnaires. Or printed warning letters can be mailed, inviting respondents to complete either traditional or e-mail self-administered questionnaires. Just as e-mail, electronic bill-paying, and other transac-

neighborhood crime problems, the respondent might simply reply, “Pretty bad.” The interviewer could obtain an elaboration on this response through a variety of probes. Sometimes, the best probe is silence; if the interviewer sits quietly with pencil poised, the respondent will probably fill the pause with additional comments. Appropriate verbal probes are “How is that?” and “In what ways?” Perhaps the most generally useful probe is “Anything else?”

In every case, however, it is imperative that the probe be completely neutral. The probe must not in any way affect the nature of the subsequent response. If we anticipate that a given question may require probing for appropriate responses, we should write one or more useful probes next to the item in the questionnaire. This practice has two important advantages. First, it allows for more time to devise the best, most neutral probes. Second, it ensures that all interviewers will use the same probes as needed. Thus even if the probe is not perfectly neutral, the same stimulus is presented to all respondents. This is the same logical guideline as for question wording. Although a question should not be loaded or biased, it is essential that every respondent be presented with the same question, even if a biased one.

Coordination and Control

Whenever more than one interviewer will administer a survey, it is essential that the efforts be carefully coordinated and controlled. Two ways to ensure this control are by (1) training interviewers and (2) supervising them after they begin work.

Whether the researchers will be administering a survey themselves or paying a professional firm to do it for them, they should be attentive to the importance of training interviewers. The interviewers usually should know what the study is all about. Even though the interviewers may be involved only in the data collection phase of the project, they should understand what will be done with the information they gather and what purpose will be served.

observations about the quality of the dwelling, the presence of various possessions, the respondent’s ability to speak English, the respondent’s general reactions to the study, and so forth.

Survey research is, of necessity, based on an unrealistic stimulus–response theory of cognition and behavior. That is, it is based on the assumption that a questionnaire item will mean the same thing to every respondent, and every given response must mean the same thing when given by different respondents. Although this is an impossible goal, survey questions are drafted to approximate the ideal as closely as possible. The interviewer also plays a role in this ideal situation. The interviewer’s presence should not affect a respondent’s perception of a question or the answer given. The interviewer, then, should be a neutral medium through which questions and answers are transmitted. If this goal is met, different interviewers will obtain the same responses from a given respondent, an example of reliability in measurement (see Chapter 4).

Familiarity with the Questionnaire

The interviewer must be able to read the questionnaire items to respondents without stumbling over words and phrases. A good model for interviewers is the actor reading lines in a play or film. The interviewer must read the questions as though they are part of a natural conversation, but that “conversation” must precisely follow the language set down in the question.

By the same token, the interviewer must be familiar with the specifications for administering the questionnaire. Inevitably, some questions will not exactly fit a given respondent’s situation, and the interviewer must determine how those questions should be interpreted in that situation. The specifications provided to the interviewer should include adequate guidelines in such cases, but the interviewer must know the organization and content of the specifications well enough to refer to them efficiently.

Probing for Responses

Probes are frequently required to elicit responses to open-ended questions. For example, to a question about

Computer-Assisted Interviewing in the BCS

Early waves of the BCS, a face-to-face interview survey, asked respondents to complete a self-administered questionnaire about drug use, printed as a small booklet that was prominently marked “Confidential.” Beginning with the 1994 survey, respondents answered self-report questions on laptop computers. The BCS includes two related versions of CAI. In computer-assisted personal interviewing (CAPI), interviewers read questions from computer screens, instead of printed questionnaires, and then key in respondents’ answers. For self-report items, interviewers hand the computers to subjects, who then key in the responses themselves. This approach is known as computer-assisted self-interviewing (CASI). In addition, CASI as used in the BCS is supplemented with audio instructions—respondents listen to interview prompts on headphones connected to the computer. After subjects key in their responses to self-report items, the answers are scrambled so the interviewer cannot access them. Notice how this feature of CASI enhances the researcher’s ethical obligation to keep responses confidential.

Malcolm Ramsay and Andrew Percy (1996) report that CASI had at least two benefits. First, respondents seemed to sense a greater degree of confidentiality when they responded to questions on a computer screen as opposed to questions on a written form. Second, the laptop computers were something of a novelty that stimulated respondents’ interest; this was especially true for younger respondents.

Examining results from the BCS reveals that CASI techniques produced higher estimates of illegal drug use than those revealed in previous surveys. Table 7.1 compares self-reported drug use from the 1998 BCS (Ramsay and Partridge 1999) with results from the 1992 BCS (Mott and Mirrlees-Black 1995), in which respondents answered questions in printed booklets. We present results for only three drugs here, together with tabulations about the use of any drug. For each drug, the survey measured

There may be some exceptions to this, however. In their follow-up study of child abuse victims and controls, Cathy Spatz Widom and associates (Widom, Weiler, and Cotler 1999) did not inform the professional interviewers who gathered data that their interest was in the long-term effects of child abuse. This safeguard was used to avoid even the slightest chance that interviewers’ knowledge of the study focus would affect how they conducted interviews.

Obviously, training should ensure that interviewers understand the questionnaire. Interviewers should also understand procedures to select respondents from among household members. And interviewers should recognize circumstances in which substitute sample elements may be used in place of addresses that no longer exist, families who have moved, or persons who simply refuse to be interviewed.

Training should include practice sessions in which interviewers administer the questionnaire to one another. The final stage of the training should involve some real interviews conducted under conditions like those in the survey.

While interviews are being conducted, it is a good idea to review questionnaires as they are completed. This may reveal questions or groups of questions that respondents do not understand. By reviewing completed questionnaires it is also possible to determine if interviewers are completing items accurately.

Computer-Assisted In-Person Interviews

Just as e-mail and web-based surveys apply new technology to the gathering of survey data through self-administration, laptop and handheld computers are increasingly being used to conduct in-person interviews. Different forms of computer-assisted interviewing (CAI) offer major advantages in the collection of survey data. At the same time, CAI has certain disadvantages that must be considered. We’ll begin by describing an example of how this technology was adopted in the BCS, one of the earliest uses of CAI in a general-purpose crime survey.

• Questionnaires for self-interviewing can be programmed in different languages, readily switching to the language appropriate for a particular respondent.

• Audio-supplemented CASI produces a standardized interview, avoiding any bias that might emerge from interviewer effects.

• Audio supplements, in different languages, facilitate self-interviews of respondents who cannot read.

At the same time, CAI has certain disadvantages that preclude its use in many survey applications:

• Although computers become more of a bargain each day, doing a large-scale in-person interview survey requires providing computers for each interviewer. Costs can quickly add up. CAI also requires specialized software to format and present on-screen questionnaires.

• Although CAI reduces costs in data processing, it requires more up-front investment in programming questionnaires, skip sequences, and the like.

lifetime use (“Ever used?”) and use in the past 12 months.

Notice that rates of self-reported use were substantially higher in 1998 than in 1992, with the exception of “semeron” use, reported by very few respondents in 1992 and none in 1998. If you’ve never heard of semeron, you’re not alone. It’s a fictitious drug, included in the list of real drugs to detect untruthful or exaggerated responses. If someone confessed to using semeron, his or her responses to other self-reported items would be suspect. Notice in Table 7.1 that CASI use in 1998 reduced the number of respondents who admitted using a drug that doesn’t exist.
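The logic of the fictitious-drug check can be sketched as a simple screening step. The respondent IDs, the data layout, and the function name below are hypothetical; this is not the BCS processing procedure, only an illustration of flagging cases whose other self-reports deserve a second look.

```python
# Items included only to detect untruthful or exaggerated answers.
FICTITIOUS_DRUGS = {"semeron"}


def flag_suspect(responses):
    """Return IDs of respondents who report ever using a fictitious drug."""
    suspect = []
    for respondent_id, ever_used in responses.items():
        if FICTITIOUS_DRUGS & set(ever_used):
            suspect.append(respondent_id)
    return suspect


# Hypothetical self-report data: respondent ID -> drugs reported as ever used.
responses = {
    "r001": ["cannabis"],
    "r002": ["cannabis", "semeron"],
    "r003": [],
}

print(flag_suspect(responses))
```

Whether flagged cases are dropped or simply examined more closely is an analytic decision for the researcher.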

CASI has also been used in the BCS since 1996 to measure domestic violence committed by partners and ex-partners of males and females aged 16 to 59. Catriona Mirrlees-Black (1999) reports that CASI techniques reveal higher estimates of domestic violence victimization among both females and males.

Advantages and Disadvantages

Different types of CAI offer a number of advantages for in-person interviews. The BCS and other surveys that include self-report items indicate that CAI is more productive in that self-reports of drug use and other offending tend to be higher. In 1999, the National Household Survey on Drug Abuse shifted completely to CAI (Wright, Barker, Gfroerer, and Piper 2002). Other advantages include the following:

• Responses can be quickly keyed in and automatically reformatted into data files for analysis.

• Complex sequences of contingency questions can be automated. Instead of printing many examples of “If answer is yes, go to question 43a; if no … ,” computer-based questionnaires automatically jump to the next appropriate question contingent on responses to earlier ones.

• CAI offers a way to break up the monotony of a long interview by shifting from verbal interviewer prompts to self-interviewing, with or without an audio supplement.

Table 7.1 Self-Reported Drug Use, 1992 and 1998 British Crime Survey

Percentage of Respondents Ages 16–29 Who Report Use

                                   1992    1998
Marijuana or cannabis
  Ever used?                         24      42
  Used in previous 12 months?        12      23
Amphetamines
  Ever used?                          9      20
  Used in previous 12 months?         4       8
Semeron
  Ever used?                        0.3     0.0
  Used in previous 12 months?       0.1     0.0
Any drug
  Ever used?                         28      49
  Used in previous 12 months?        14      25

Source: 1992 data adapted from Mott and Mirrlees-Black (1995, 41–42); 1998 data adapted from Ramsay and Partridge (1999, 68–71).
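To make the comparison in Table 7.1 concrete, the percentage-point differences between the two surveys can be computed directly. The values below are copied from the table; the dictionary layout and function name are assumptions made for this sketch, which shows only the size of each difference.

```python
# (drug, measure) -> (1992 percent, 1998 percent), copied from Table 7.1.
TABLE_7_1 = {
    ("Marijuana or cannabis", "ever"): (24, 42),
    ("Marijuana or cannabis", "past 12 months"): (12, 23),
    ("Amphetamines", "ever"): (9, 20),
    ("Amphetamines", "past 12 months"): (4, 8),
    ("Semeron", "ever"): (0.3, 0.0),
    ("Semeron", "past 12 months"): (0.1, 0.0),
    ("Any drug", "ever"): (28, 49),
    ("Any drug", "past 12 months"): (14, 25),
}


def point_changes(table):
    """Percentage-point change from 1992 to 1998 for each row of the table."""
    return {key: round(y1998 - y1992, 1) for key, (y1992, y1998) in table.items()}


changes = point_changes(TABLE_7_1)
for (drug, measure), change in changes.items():
    print(f"{drug:<22} {measure:<15} {change:+.1f}")
```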

respondents may be more honest in giving socially disapproved answers if they don’t have to look the questioner in the eye. Similarly, it may be possible to probe into more sensitive areas, although that is not necessarily the case. People are, to some extent, more suspicious when they can’t see the person asking them questions—perhaps a consequence of telemarketing and salespeople conducting bogus surveys before making sales pitches.

Telephone surveys can give a researcher greater control over data collection if several interviewers are engaged in the project. If all the interviewers are calling from the research office, they can get clarification from the supervisor whenever problems occur, as they inevitably do. Alone in the field, an interviewer may have to wing it between weekly visits with the interviewing supervisor.

A related advantage is rooted in the growing diversity of U.S. cities. Because many major cities have growing immigrant populations, interviews may need to be conducted in different languages. Telephone interviews are usually conducted from a central site, so that one or more multilingual interviewers can be quickly summoned if an English-speaking interviewer makes contact with, say, a Spanish-speaking respondent. In-person interview surveys present much more difficult logistical problems in handling multiple languages. And mail surveys require printing and distributing questionnaires in different languages.

Telephone interviewing has its problems, however. Telephone surveys are limited by definition to people who own telephones. Years ago, this method produced a substantial social class bias by excluding poor people. Over time, however, the telephone has become a standard fixture in almost all American homes. The U.S. Census Bureau estimates that 95.5 percent of all households now have telephones, so the earlier class bias has been substantially reduced (U.S. Bureau of the Census 2006, Table 1117). The NCVS, traditionally an in-person interview, has increased its use of telephone interviews as part of the crime survey’s redesign.

• Automated skip sequences for contingency questions are great, but if something goes wrong with the programmed questionnaire, all sorts of subsequent problems are possible. As Emma Forster and Alison McCleery (1999) point out, such question-routing mistakes might mean whole portions of a questionnaire are skipped. Whereas occasional random errors are possible with pen-and-paper interviews, large-scale systematic error can happen with CAI technology.

• Proofreading printed questionnaires is straightforward, but it can be diffi cult to au-dit a computerized questionnaire. Doing so might require special technical skills.

• It can be diffi cult to print and archive a com-plex questionnaire used in CAI. This was a problem with early applications of CAI tech-nology, but improvements in software are helping to solve it.

• Batteries of laptops run down, and comput-ers and software are more vulnerable to mal-functions and random weirdness than are stacks of printed questionnaires.

In sum, CAI can be costly and requires some specialized skills. As a result, these and related technologies are best suited for use by profes-sional survey researchers or research centers that regularly conduct large-scale in-person in-terviews. We will return to this issue in the con-cluding section of this chapter.

Telephone Surveys

Telephone surveys are fast and relatively low cost.

Telephone surveys have many advantages that make them a popular method. Probably the greatest advantages involve money and time. In a face-to-face household interview, a researcher may drive several miles to a respondent’s home, find no one there, return to the research office, and drive back the next day, possibly finding no one there again.

Interviewing by telephone, researchers can dress any way they please, and it will have no effect on the answers respondents give. And


should plan on making five calls to reach two households.
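The five-calls-for-two-households figure follows from Weisel's rule of thumb that roughly 60 percent of RDD-generated numbers are ineligible, which leaves about 40 percent eligible. The arithmetic can be sketched as below. The function name `calls_needed` is hypothetical, and the result is an expected planning figure, not a guarantee for any particular sample.

```python
def calls_needed(target_households, ineligible_percent):
    """Expected number of RDD numbers to dial to reach the target.

    Assumes every dialed number is independently ineligible with the
    given percentage (a simplification used only for planning).
    """
    eligible_percent = 100 - ineligible_percent
    if not 0 < eligible_percent <= 100:
        raise ValueError("ineligible_percent must be in the range [0, 100)")
    # Integer ceiling division avoids floating-point rounding surprises.
    return -(-target_households * 100 // eligible_percent)


# With 60 percent ineligible numbers, two households take about five calls.
print(calls_needed(2, 60))
```

Under the same assumption, a target of 400 eligible households would call for generating about 1,000 numbers.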

Another developing problem is the increasing number of households that have mobile phone service only. Stephen Blumberg (Blumberg, Luke, and Cynamon 2006) and associates report that about 7 percent of households in a United States national sample have only mobile phones. Jan van Dijk (2007) describes how this is especially troublesome in some European countries where mobile-only households are more common.

The most difficult challenge with telephone surveys involves the explosion of telemarketing. The volume of junk phone calls rivals that of junk mail, and salespeople often begin their pitch by describing a “survey.” The viability of legitimate surveys is now hampered by the proliferation of bogus surveys, which are actually sales campaigns disguised as research. As if that weren’t bad enough, telemarketing has become so annoying that many people simply hang up whenever they hear a strange voice announcing some institutional affiliation.

Following action in several states, the Federal Trade Commission approved new regulations restricting telemarketing by enabling individuals to place their phone numbers on a national do-not-call registry (Telemarketing Sales Rule 2003). The rule went into effect in October 2003 after withstanding vigorous legal challenges by the telemarketing industry. The impact of this regulation remains to be seen. On the one hand, it has the potential to make life easier for legitimate telephone survey researchers. On the other hand, since conducting research is identified as a legitimate activity under the Telemarketing Sales Rule, it may induce more unscrupulous firms to disguise their sales calls as surveys.

Computer-Assisted Telephone Interviewing

Much of the growth in telemarketing has been fueled by advances in computer and telecommunications technology. Beginning in the 1980s, much of the same technology came to

At the same time, phone surveys are much less suitable for individuals not living in households. Homeless people are obvious examples; those who live in institutions are also difficult to reach out and touch via telephone. Patricia Tjaden and Nancy Thoennes (2000) cite single-adult households and people living in rural or inner-city areas as targets least likely to have telephone coverage.

A related sampling problem involves unlisted numbers. If the survey sample is selected from the pages of a local telephone directory, it totally omits all those people who have requested that their numbers not be published. Similarly, those who recently moved and transient residents are not well represented in published telephone directories. This potential bias has been eliminated through random-digit dialing (RDD), a technique that has advanced telephone sampling substantially.

RDD samples use computer algorithms to generate lists of random telephone numbers (usually the last four digits). This procedure gets around the sampling problem of unlisted telephone numbers but may substitute an administrative problem. Randomly generating phone numbers produces numbers that are not in operation or that serve a business establishment or pay phone. In most cases, businesses and pay phones are not included in the target population; dialing these numbers and learning that they’re out of scope will take time away from producing completed interviews with the target population.
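A minimal sketch of the idea, assuming the researcher already has a list of working area-code and exchange prefixes for the target area: random four-digit suffixes are appended to those prefixes. The function name `rdd_sample` and the prefixes shown are hypothetical, and production RDD designs are typically more elaborate than this.

```python
import random


def rdd_sample(prefixes, n, seed=None):
    """Return n distinct phone numbers built from known prefixes.

    Each number is a prefix such as "555-201" plus four random digits,
    so listed and unlisted numbers are equally likely to be generated.
    """
    prefixes = sorted(set(prefixes))
    if n > len(prefixes) * 10_000:
        raise ValueError("not enough distinct numbers for this sample size")
    rng = random.Random(seed)  # a seed makes the sample reproducible
    numbers = set()
    while len(numbers) < n:
        prefix = rng.choice(prefixes)
        suffix = rng.randrange(10_000)  # 0000 through 9999
        numbers.add(f"{prefix}-{suffix:04d}")
    return sorted(numbers)


# Hypothetical prefixes; many generated numbers will turn out to be
# nonworking, business, or otherwise out of scope when dialed.
for number in rdd_sample(["555-201", "555-202"], 5, seed=1):
    print(number)
```

Nothing in the sketch screens out ineligible numbers; that screening happens only when the numbers are dialed, which is the administrative cost described above.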

Weisel (1999) offers an excellent description of RDD samples for use in community crime surveys. Among the increasingly important issues in RDD samples is the growing number of telephone numbers in use for cell phones, pagers, and home access to the Internet. This means that the rate of ineligible telephone numbers generated through RDD is increasing. Weisel’s rule of thumb is that ineligible numbers account for an estimated 60 percent of phone numbers produced by typical RDD procedures. Thus, if RDD is used, researchers


rapidly gets to the next appropriate question. This can be especially handy in a victim survey, in which affirmative answers to screening questions automatically bring up detailed questions about each crime incident.
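The two accuracy features described for CATI systems, accepting only valid responses and automating skip sequences, can be sketched in a few lines. This is an illustration only: the question names, wording, and the `run_interview` function are hypothetical and do not come from any actual CATI package.

```python
# Each item lists its valid keyed responses (None means open-ended) and a
# routing rule that picks the next item from the answer just accepted.
QUESTIONS = {
    "gender": {
        "text": "Respondent gender (f/m)",
        "valid": {"f", "m"},
        "next": lambda answer: "victim_screen",
    },
    "victim_screen": {
        "text": "Victim of a crime in the past 6 months? (y/n)",
        "valid": {"y", "n"},
        # Contingency question: only victims get the incident items.
        "next": lambda answer: "incident_detail" if answer == "y" else "end",
    },
    "incident_detail": {
        "text": "Describe the incident",
        "valid": None,
        "next": lambda answer: "end",
    },
}


def run_interview(keyed_entries):
    """Replay keyed entries; return accepted answers and rejected entries."""
    answers, rejected = {}, []
    current = "gender"
    for entry in keyed_entries:
        if current == "end":
            break
        item = QUESTIONS[current]
        if item["valid"] is not None and entry not in item["valid"]:
            rejected.append((current, entry))  # refuse to proceed
            continue
        answers[current] = entry
        current = item["next"](entry)
    return answers, rejected


# The invalid "x" is refused, and a "n" on the screener skips the detail item.
print(run_interview(["x", "f", "n"]))
```

Because routing lives in one table, a single mistake in a `next` rule would misroute every interview, which is the systematic-error risk noted for computer-assisted questionnaires.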

Comparison of the Three Methods

Cost, speed, and question content are issues to consider in selecting a survey method.

We’ve now examined three ways of collecting survey data: self-administered questionnaires, in-person interviews, and telephone surveys. We have also considered some recent advances in each mode of administration. Although we’ve touched on some of the relative advantages and disadvantages of each, let’s take a minute to compare them more directly.

Self-administered questionnaires are generally cheaper to use than interview surveys. Moreover, for self-administered e-mail or web-based surveys, it costs no more to conduct a national survey than a local one. Obviously, the cost difference between a local and a national in-person interview survey is considerable. Telephone surveys are somewhere in between. Although national surveys can inflate costs somewhat through long-distance telephone charges, flat-rate long-distance service can be negotiated or Internet-based phone service might be used. Mail surveys typically require a small staff. One person can conduct a reasonable mail survey, although it is important not to underestimate the work involved.

Up to a point, cost and speed are related. In-person interview surveys can be completed very quickly if a large pool of interviewers is available and funding is adequate to pay them. In contrast, if a small number of people are conducting a larger number of face-to-face interviews, costs are generally lower, but the survey takes much longer to complete. Telephone surveys that use CATI technology are the fastest.

be widely used in telephone surveys, referred to as computer-assisted telephone interviewing (CATI). Perhaps you’ve occasionally marveled at news stories that report the results of a nationwide opinion poll the day after some major speech or event. The speed of CATI technology, coupled with RDD, makes these instant poll reports possible.

Interviewers wearing telephone headsets sit at computer workstations. Computer programs dial phone numbers, which can be either generated through RDD or extracted from a database of phone numbers compiled from a source. As soon as phone contact is made, the computer screen displays an introduction (“Hello, my name is . . . calling from the Survey Research Center at Ivory Tower University”) and the first question to be asked, often a query about the number of residents who live in the household. As interviewers key in answers to each question, the computer program displays a new screen that presents the next question, until the end of the interview is reached.

CATI systems offer several advantages over procedures in which an interviewer works through a printed interview schedule. Speed is one obvious plus. Forms on computer screens can be filled in more quickly than can paper forms. Typing answers to open-ended questions is much faster than writing them by hand. And CATI software immediately formats responses into a data file as they are keyed in, which eliminates the step of manually transferring answers from paper to computer.

Accuracy is also enhanced by CATI systems in several ways. First, CATI programs can be designed to accept only valid responses to a given questionnaire item. For example, if valid responses to respondent “gender” are f for female and m for male, the computer will accept only those two letters, emitting a disagreeable noise and refusing to proceed if something else is keyed in. Second, the software can be programmed to automate contingency questions and skip sequences, thus ensuring that the interviewer skips over inappropriate items and


hood, the dwelling unit, and so forth. They may also note characteristics of the respondents or the quality of their interaction with the respondents: whether the respondent had difficulty communicating, was hostile, seemed to be lying, and so on. Finally, when the safety of interviewers is an issue, a mail or phone survey may be the best option.

Ultimately, researchers must weigh all these advantages and disadvantages of the three methods against research needs and available resources.

Strengths and Weaknesses of Survey Research

Surveys tend to be high on reliability and generalizability, but validity can often be a weak point.

Like other modes of collecting data in criminal justice research, surveys have strengths and weaknesses. It is important to consider these in deciding whether the survey format is appropriate for a specific research purpose.

Surveys are particularly useful in describing the characteristics of a large population. The NCVS has become an important tool for researchers and public officials because of its ability to describe levels of crime. A carefully selected probability sample, in combination with a standardized questionnaire, allows researchers to make refined descriptive statements about a neighborhood, a city, a nation, or some other large population.

Standardized questionnaires have an important advantage in regard to measurement. Earlier chapters discussed the ambiguous nature of concepts: they ultimately have no real meanings. One person’s view about, say, crime seriousness or punishment severity is quite different from another’s. Although we must be able to define concepts in ways that are most relevant to research goals, it’s not always easy to apply the same definitions uniformly to all subjects. Nevertheless, the survey researcher is bound to the requirement of having to ask exactly the same questions of all subjects and

Self-administered surveys may be more appropriate to use with especially sensitive issues if the surveys offer complete anonymity. Respondents are sometimes reluctant to report controversial or deviant attitudes or behaviors in interviews, but they may be willing to respond to an anonymous self-administered questionnaire. However, the successful use of computers for self-reported items in the BCS and the National Household Survey on Drug Use and Health indicates that interacting with a machine can promote more candid responses. This is supported by experimental research comparing different modes of questionnaire administration (Tourangeau and Smith 1996).

Interview surveys have many advantages, too. For example, in-person or telephone surveys are more appropriate when respondent literacy may be a problem. Interview surveys also result in fewer incomplete questionnaires. Respondents may skip questions in a self-administered questionnaire, but interviewers are trained not to do so. CAI offers a further check on this in telephone and in-person surveys.

Although self-administered questionnaires may be more effective in dealing with sensitive issues, interview surveys are definitely more effective in dealing with complicated ones. Interviewers can explain complex questions to respondents and use visual aids that are not possible in mail or phone surveys.

In-person interviews, especially with computer technology, can also help reduce response sets. Respondents (like students?) eventually become bored listening to a lengthy series of similar types of questions. It’s easier to maintain individuals’ interest by changing the kind of stimulation they are exposed to. A mix of questions verbalized by a person, presented on a computer screen, and heard privately through earphones is more interesting for respondents and reduces fatigue.

Interviewers who question respondents face to face are also able to make important observations aside from responses to questions asked in the interview. In a household interview, they may summarize characteristics of the neighbor-


tic violence is measured in the context of a crime survey, and some women may not see what happened to them as “crime,” or be reluctant to do so. Also, there is little time to approach the topic “gently.” A specially designed questionnaire with carefully selected interviewers may well have the edge here.

In recent years, both the NCVS and the BCS have been revised to produce better measures of domestic and intimate violence. Estimates of domestic violence increased in the 1996 wave of the BCS, and researchers think that the increase reflects a greater willingness by respondents to discuss domestic violence with interviewers (Mirrlees-Black, Mayhew, and Percy 1996). The use of CASI in the BCS has produced higher estimates of victimization prevalence among women, as well as the first measurable rates of domestic violence victimization for males (Mirrlees-Black 1999).

Survey research is generally weaker on validity and stronger on reliability. In comparison with field research, for instance, the artificiality of the survey format puts a strain on validity. As an illustration, most researchers agree that fear of crime is not well measured by the standard question “How safe do you feel, or would you feel, out alone in your neighborhood at night?” Survey responses to that question are, at best, approximate indicators of what we have in mind when we conceptualize fear of crime.

Reliability is a different matter. By presenting all subjects with a standardized stimulus, survey research goes a long way toward eliminating unreliability in observations made by the researcher.

However, even this statement is subject to qualification. Critics of survey methods argue that questionnaires for standard crime surveys and many specialized studies embody a narrow, legalistic conception of crime that cannot reflect the perceptions and experiences of minorities and women. Survey questions typically are based on male views and do not adequately tap victimization or fear of crime among women

having to impute the same intent to all respondents giving a particular response.

At the same time, survey research has its weaknesses. First, the requirement for standardization might mean that we are trying to fit round pegs into square holes. Standardized questionnaire items often represent the least common denominator in assessing people’s attitudes, orientations, circumstances, and experiences. By designing questions that are at least minimally appropriate to all respondents, we may miss what is most appropriate to many respondents. In this sense, surveys often appear superficial in their coverage of complex topics.

Using surveys to study crime and criminal justice policy presents special challenges. The target population frequently includes lower-income, transient persons who are difficult to contact through customary sampling methods. For example, homeless persons are excluded from any survey that samples households, but people who live on the street no doubt figure prominently as victims and offenders. Maxfield (1999) describes how new data from the National Incident-Based Reporting System suggest that a number of “non-household-associated” persons are systematically undercounted by sampling procedures used in the NCVS. Crime surveys such as the NCVS and the BCS have been deficient in getting information about crimes of violence when the victim and offender have a prior relationship. This is particularly true for domestic violence.

Underreporting of domestic violence appears to be due, in part, to the very general nature of large-scale crime surveys. Catriona Mirrlees-Black (1995, 8) of the British Home Office summarizes the trade-offs of using survey techniques to learn about domestic violence:

Measuring domestic violence is difficult territory. The advantage of the BCS is that it is based on a large nationally representative sample, has a relatively high response rate, and collects information on enough incidents to provide reliable details of their nature. One disadvantage is that domes-


Another approach is to study one or two (or some other small number) correctional institutions intensively. We might interview a psychologist in each institution and present questions about various approaches to drug treatment therapy. In all likelihood, we will use, not a highly structured questionnaire, but rather a list of questions or topics we wish to discuss with each subject. And we will treat the interview as more of a directed conversation than a formal interview. Of course, we cannot generalize from interviews with one or two prison psychologists to any larger population. However, we will gain an understanding (and probably a more detailed one) of how staff psychologists in specific institutions feel about different drug treatment programs.

Specialized interviewing asks questions of a small number of subjects, typically using an interview schedule that is much less structured than that in sample surveys. Michael Quinn Patton (2001) distinguishes two variations of specialized interviews. The less structured alternative is to prepare a general interview guide that includes the issues, topics, or questions the researcher wishes to cover. Issues and items are not presented to respondents in any standardized order. The interview guide is more like a checklist than an interview schedule, ensuring that planned topics are addressed at some point in the interview. The standardized open-ended interview, in contrast, is more structured, using specific questions arranged in a particular order. The researcher presents each respondent with the same questions in the same sequence (subject to any contingency questions). The questions are open-ended, but their format and presentation are standardized.

To underscore the flexibility of specialized interviewing, Patton describes how the two approaches can be used in combination (2001, 347):

A conversational strategy can be used within an interview guide approach, or you can combine a guide approach. . . . This

(Straus 1999; Tjaden and Thoennes 2000). Concern that survey questions might mean different things to different respondents raises important questions about reliability and about the generalizability of survey results across subgroups of a population.

As with all methods of observation, a full awareness of the inherent or probable weaknesses of survey research may partially resolve them. Ultimately, though, we are on the safest ground when we can use several different research methods to study a given topic.

Other Ways of Asking Questions

Specialized interviews and focus groups are alternative ways of gathering question-based data.

Sample surveys are perhaps the best-known application of asking questions as a data-gathering strategy for criminal justice research. Often, however, more specialized interviewing techniques are appropriate.

Specialized Interviewing

No precise definition of the term survey enables us to distinguish a survey from other types of interview situations. As a rule of thumb, a sample survey (even one that uses nonprobability sampling methods) is an interview-based technique for generalizing to a larger population using a standardized questionnaire. In contrast, specialized interviewing focuses on the views and opinions of only those individuals who are interviewed.

Let’s say we are interested in how mental health professionals view different drug treatment programs for prison inmates. One approach is to conduct a sample survey of psychologists who work in state correctional facilities in which each sampled psychologist completes a structured questionnaire concerning drug treatment programs. This approach will enable us to generalize to the population of state prison psychologists.


cus groups have proved to be more suitable for many market research applications. In recent years, focus groups have commonly been used as substitutes for surveys in criminal justice and other social scientific research.

In a focus group, 8 to 15 people are brought together in a room to engage in a guided group discussion of some topic. Although focus groups cannot be used to make statistical estimates about a population, members are nevertheless selected to represent a target population. Richard Krueger and Mary Anne Casey (2000) describe focus groups, their applications, and their advantages and disadvantages in detail.

For example, the location of community correctional facilities such as work-release centers and halfway houses often prompts a classic “Not in my backyard!” (NIMBY) response from people who live in neighborhoods where proposed facilities will be built. Recognizing this, a mayor who wants to find a suitable site without annoying neighborhood residents (voters) is well advised to convene a focus group that includes people who live in areas near possible facility locations. A focus group can test the “market acceptability” of a work-release center, which might include the best way to package and sell the product. Such an exercise might reveal that an appeal to altruism (“We all have to make sacrifices in the fight against crime”) is much less effective in gaining support than an alternative sales pitch that stresses potential economic benefits (“This new facility will provide jobs for neighborhood residents”).

Generalizations from focus groups to target populations cannot be precise; however, a study by V. M. Ward and associates (Ward, Bertrand, and Brown 1991) found that focus group and survey results can be quite consistent under certain conditions. They conclude that focus groups are most useful in two cases: (1) when precise generalization to a larger population is not necessary, and (2) when focus group participants and the larger population they are intended to represent are relatively homogeneous. So, for example, a focus group is not

combined strategy offers the interviewer flexibility in probing and in determining when it is appropriate to explore certain subjects in greater depth, or even to pose questions about new areas of inquiry that were not originally anticipated in the interview instrument’s development.

Open-ended questions are ordinarily used because they capture rich detail better. The primary disadvantage of open-ended questions (having to categorize responses) is not a problem in specialized interviewing because of the small number of subjects and because researchers are more interested in describing than in generalizing.

Specialized interviewing can be incorporated into any research project as a supplementary source of information. If, for example, we are interested in the effects of determinate sentencing on prison populations, we can analyze data from the Census of State Adult Correctional Facilities, conducted by the BJS. We might also interview a small number of corrections administrators, perhaps asking them to react to our data analysis. Evaluation studies and other applied research projects frequently use specialized interviewing techniques, alone or in combination with other sources of data.

Focus Groups

Like sample surveys, focus group techniques were refined by market research firms in the years following World War II. As the name implies, market research explores questions about the potential for sales of consumer products. Because a firm may spend millions of dollars developing, advertising, and distributing some new item, market research is an important tool to test consumer reactions before large sums of money are invested in a product.

Surveys have two disadvantages in market research. First, a nationwide or large-scale probability survey can be expensive. Second, it may be difficult to present advertising messages or other product images in a survey format. Fo-


select participants from a specific target population that relates to our research questions. If we’re interested in how residents of a specific neighborhood will feel about opening a work-release center, we should select group participants who live in the target neighborhood or one very much like it.

Should You Do It Yourself?

Anyone can do a mail or simple telephone survey, but many times it’s better to use professional survey researchers.

The final issue we address in this chapter is who should conduct surveys. Drawing a sample, constructing a questionnaire, and either conducting interviews or distributing self-administered instruments are not especially difficult. Equipped with the basic principles we have discussed so far in this book, you could complete a modest in-person or telephone survey yourself. Mail and web-based surveys of large numbers of subjects are entirely possible, especially with present-day computer capabilities.

At the same time, the various tasks involved in completing a survey require a lot of work and attention to detail. We have presented many tips for constructing questionnaires, but our guidelines barely scratch the surface. Many books describe survey techniques in more detail, and a growing number focus specifically on telephone or mail techniques (see “Additional Readings” at the end of this chapter). In many respects, however, designing and executing a survey of even modest size can be challenging.

Consider the start-up costs involved in in-person or telephone interview surveys of any size. Finding, training, and paying interviewers are time consuming, potentially costly, and require some degree of expertise. The price of computer equipment continues downward, but a CATI setup or supply of laptops and associated software for interviewers still represents a substantial investment that cannot easily be justified for a single survey.

appropriate to predict how all city residents will react to a ban on handgun ownership. But a focus group of registered handgun owners could help evaluate a proposed city campaign to buy back handguns.

Focus groups may also be used in combination with survey research in one of two ways. First, a focus group can be valuable in questionnaire development. When researchers are uncertain how to present items to respondents, a focus group discussion about the topic can generate possible item formats. For instance, James Nolan and Yoshio Akiyama (1999) studied police routines for making records of hate crimes. In a general sense, they knew what concepts they wanted to measure but were unsure how to operationalize them. They convened five focus groups in different cities, including police administrators, mid-level managers, patrol officers, and civilian employees, to learn about different perspectives on hate-crime recording. Analyzing focus group results, Nolan and Akiyama prepared a self-administered questionnaire that was sent to a large number of individuals in four police departments.

Second, after a survey has been completed and preliminary results tabulated, focus groups may be used to guide the interpretation of some results. After a citywide survey in which we find, for example, that recent immigrants from Southeast Asian countries are least supportive of community policing, we might conduct a focus group of Asian residents to delve more deeply into their concerns.

Focus groups are flexible and can be adapted to many uses in basic and applied research. Keep in mind, however, two key elements expressed in the name of this data collection technique. Focus means that researchers present specific questions or issues for directed discussion. Having a free-for-all discussion about hate crime, for example, would not have yielded much useful insight for Nolan and Akiyama to develop a survey instrument. Group calls our attention to potential participants in the focused discussions. Like market researchers, we should


The alternative to doing it yourself is to contract with a professional survey research firm or a company that routinely conducts surveys. Most universities have a survey research center or institute, often affiliated with a sociology or political science department. Such institutes are usually available to conduct surveys for government organizations as well as university researchers, and they can often do so very economically. Private research firms are another possibility. Most have the capability to conduct all types of surveys as well as focus groups.

Using a professional survey firm or institute has several advantages. Chapter 6 described the basic principles of sampling, but actually drawing a probability sample can be complex. Even the BJS-COPS do-it-yourself guide book for community crime surveys (Weisel 1999) counsels police departments and others to consult with experts in drawing RDD samples. Professional firms regularly use sampling frames that can represent city, state, and national samples or whatever combination is appropriate.

We have emphasized the importance of measurement throughout this book. Researchers should develop conceptual and operational definitions and be attentive to all phases of the measurement process. However, constructing a questionnaire requires attention to details that may not always be obvious to researchers. Survey firms are experienced in preparing standard demographic items, batteries of matrix questions, and complex contingency questions with appropriate skip sequences.

Although it is often best for researchers to discuss specific concepts and even to draft questions, professional firms offer the considerable benefit of experience in pulling it all together. This is not to say that a researcher should simply propose some ideas for questions and then leave the details to the pros. Working together with a survey institute or market research firm to propose questionnaire items, review draft instruments, evaluate pretests, and make final modifications is usually the best approach.

If interview surveys are beyond a researcher’s means, he or she might fall back on a mail or web-based survey. Few capital costs are involved; most expenses are in consumables such as envelopes, stamps, and stationery. One or two persons can orchestrate a mail survey reasonably well at minimal expense. Perhaps a consultant could be hired to design a web-based survey at modest cost. But consider two issues.

First, the business of completing a survey involves a great deal of tedious work. In mail surveys, questionnaires and cover letters must be printed, folded or stuffed into envelopes, stamped, and delivered (finally!) to the post office. None of this is much fun. The enjoyment starts when completed questionnaires start to trickle in. It’s rewarding to begin the actual empirical research, and the excitement may get the researcher past the next stretch of tedium: going from paper questionnaires to actual data. So it’s possible for one individual to do a mail survey, but the researcher must be prepared for lots of work; even then, it will be more work than expected. Web-based surveys have their own trade-offs. Economizing on up-front programming costs entails the risk of being swamped by unusable electronic questionnaires.

The second issue is more difficult to deal with and is often overlooked by researchers. We have examined at some length the advantages and disadvantages of the three methods of questionnaire administration. Some methods are more or less appropriate than others for different kinds of research questions. If a telephone or an in-person interview survey is best for the particular research needs, conducting a mail or web-based survey would be a compromise, perhaps an unacceptable one. But the researcher’s excitement at actually beginning the research may lead him or her to overlook or minimize problems with doing a mail survey on the cheap, in much the same way that researchers often are not in a position to recognize ethical problems with their own work. Doing a mail survey because it’s all you can afford does not necessarily make the mail survey worth doing.

Perhaps the chief benefit of contracting for a survey is that survey research centers and other professional organizations have the latest specialized equipment, software, and know-how to take advantage of advances in all forms of CAI. Furthermore, such companies can more readily handle such administrative details as training interviewers, arranging travel for in-person surveys, coordinating mail surveys, and providing general supervision. This frees researchers from much of the tedium of survey research, enabling them to focus on more substantive issues.

Researchers must ultimately decide whether to conduct a survey themselves or contract with a professional firm. And the decision is best made after carefully considering the pros and cons of each approach. Too often, university faculty assume that students can get the job done while overlooking the important issues of how to maintain quality control and whether a survey is a worthwhile investment of students’ time. Similarly, criminal justice practitioners may believe that agency staff can handle a mail survey or conduct phone interviews from the office. Again, compromises in the quality of results, together with the opportunity costs of diverting staff from other tasks, must be considered. The do-it-yourself strategy may seem cheaper in the short run, but it often becomes a false economy when attention turns to data analysis and interpretation.

We’ll close this section with an apocryphal story about a consultant’s business card; the card reads, “Fast! Low cost! High quality! Pick any two.” It’s best to make an informed choice that best suits your needs.

✪ Main Points

• Survey research, a popular social research method, involves the administration of questionnaires to a sample of respondents selected from some population.

• Survey research is especially appropriate for descriptive or exploratory studies of large populations, but surveys have many other uses in criminal justice research.

• Surveys are the method of choice for obtaining self-reported offending data. Continuing efforts to improve self-report surveys include using confidential computer-assisted personal interviews.

• Questions may be open-ended or closed-ended. Each technique for formulating questions has advantages and disadvantages.

• Short items in a questionnaire are usually better than long ones.

• Bias in questionnaire items encourages respondents to answer in a particular way or to support a particular point of view. It should be avoided.

• Questionnaires are administered in three basic ways: self-administered questionnaires, face-to-face interviews, and telephone interviews. Each mode of administration can be varied in a number of ways.

• Computers can be used to enhance each type of survey. Computer-assisted surveys have many advantages, but they often require special skills and equipment.

• It is generally advisable to plan follow-up mailings for self-administered questionnaires, sending new questionnaires to respondents who fail to respond to the initial appeal.

• The essential characteristic of interviewers is that they be neutral; their presence in the data collection process must not have any effect on the responses given to questionnaire items.

• Surveys conducted over the telephone are fast and flexible.

• Each method of survey administration has a variety of advantages and disadvantages.

• Survey research has the weaknesses of being somewhat artificial and potentially superficial. It is difficult to gain a full sense of social processes in their natural settings through the use of surveys.

• Specialized interviews with a small number of people and focus groups are additional ways of collecting data by asking questions.

• Although the particular tasks required to complete a survey are not especially difficult, researchers must carefully consider whether to conduct surveys themselves or contract with a professional organization.

✪ Key Terms

closed-ended questions, p. 173
computer-assisted interviewing, p. 187
focus group, p. 195
open-ended questions, p. 173
questionnaire, p. 174

✪ Review Questions and Exercises

1. Find a questionnaire on the Internet. Bring the questionnaire to class and critique it. Critique other aspects of the survey design as well.

2. For each of the open-ended questions listed, construct a closed-ended question that could be used in a questionnaire.

a. What was your family’s total income last year?

b. How do you feel about shock incarceration or “boot camp” programs?

c. How do people in your neighborhood feel about the police?

d. What do you feel is the biggest problem facing this community?

e. How do you protect your home from burglary?

3. Prepare a brief questionnaire to study perceptions of crime near your college or university. Include questions asking respondents to describe a nearby area where they either are afraid to go after dark or think crime is a problem. Then use your questionnaire to interview at least 10 students.

4. A recent evaluation of a federal program to support community policing included sending questionnaires to a sample of about 1,200 police chiefs. Each questionnaire included a number of items asking about specific features of community policing and whether they were being used in the department. Almost all the police chiefs had someone else complete the questionnaire. What’s the unit of analysis in this survey? What problems might result from having an individual complete such a questionnaire?

✪ Additional Readings

Babbie, Earl, Survey Research Methods, 2nd ed. (Belmont, CA: Wadsworth, 1990). A comprehensive overview of survey methods, this textbook covers many aspects of survey techniques that are omitted here.

Dillman, Don A., Mail and Internet Surveys: The Tailored Design Method 2007 Update, 2nd ed. (New York: Wiley, 2006). This update of a classic reference on self-administered surveys includes a variety of web-based techniques. Dillman makes many good suggestions for improving response rates.

General Accounting Office, Using Structured Interviewing Techniques (Washington, DC: General Accounting Office, 1991). This is another useful handbook in the GAO series on evaluation methods. In contrast to Patton (below), the GAO emphasizes getting comparable information from respondents through structured interviews. This is a very useful step-by-step guide.

Krueger, Richard A., and Mary Anne Casey, Focus Groups: A Practical Guide for Applied Research, 3rd ed. (Thousand Oaks, CA: Sage, 2000). A clear and comprehensive introduction to focus groups, this book really lives up to its title, describing basic principles of focus groups and giving numerous practical tips.

Patton, Michael Quinn, Qualitative Research and Evaluation Methods, 3rd ed. (Thousand Oaks, CA: Sage, 2001). This is a thorough discussion of specialized interviewing. Patton’s advice will also be useful in constructing questionnaires for surveys in general.

Weisel, Deborah, Conducting Community Surveys: A Practical Guide for Law Enforcement Agencies (Washington, DC: U.S. Department of Justice, Office of Justice Programs, Bureau of Justice Statistics, and Office of Community Oriented Police Services, 1999). Another practical guide, this brief publication is prepared for use by public officials, not researchers. As such, it’s a very good description of the nuts and bolts of doing telephone surveys.

Chapter 8

Field Research

The techniques described in this chapter focus on observing life in its natural habitat—going where the action is and watching. We’ll consider how to prepare for the field, how to observe, how to make records of what is observed, and how to recognize the relative strengths and weaknesses of field research.

Introduction
Topics Appropriate to Field Research
The Various Roles of the Observer
Asking Questions
Gaining Access to Subjects
Gaining Access to Formal Organizations
Gaining Access to Subcultures
Selecting Cases for Observation
Purposive Sampling in Field Research
Recording Observations
Cameras and Voice Recorders
Field Notes
Structured Observations
Linking Field Observations and Other Data
Illustrations of Field Research
Field Research on Speeding and Traffic Enforcement
Conducting a Safety Audit
Bars and Violence
Strengths and Weaknesses of Field Research
Validity
Reliability
Generalizability

Introduction

Field research is often associated with qualitative techniques, though many other applications are possible.

We turn now to what may seem like the most obvious method of making observations: field research. If researchers want to know about something, why not simply go where it’s happening and watch it happen?

Field research encompasses two different methods of obtaining data: (1) making direct observation and (2) asking questions. This chapter concentrates primarily on observation, although we briefly describe techniques for specialized interviewing in field studies.

Most of the observation methods discussed in this book are designed to produce data appropriate for quantitative analysis. Surveys provide data to calculate things like the percentage of crime victims in a population or the mean value of property lost in burglaries. Field research may yield qualitative data—observations not easily reduced to numbers—in addition to quantitative data. For example, a field researcher who is studying burglars may note how many times subjects have been arrested (quantitative), as well as whether individual burglars tend to select certain types of targets (qualitative).
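To make the quantitative side of this distinction concrete, here is a minimal Python sketch of the two survey summaries just mentioned: the percentage of crime victims and the mean value of property lost in burglaries. The records, field names, and dollar amounts are invented for illustration; they are assumptions, not data from any study discussed in this book.

```python
from statistics import mean

# Hypothetical survey responses (invented for illustration only).
# Each record notes whether the respondent was a crime victim and,
# for burglary victims, the dollar value of property lost.
responses = [
    {"victim": True, "burglary_loss": 400.0},
    {"victim": False, "burglary_loss": None},
    {"victim": True, "burglary_loss": 1200.0},
    {"victim": False, "burglary_loss": None},
    {"victim": True, "burglary_loss": None},  # victim of a crime other than burglary
]


def percent_victims(records):
    """Percentage of respondents who reported being a crime victim."""
    return 100.0 * sum(r["victim"] for r in records) / len(records)


def mean_burglary_loss(records):
    """Mean value of property lost among respondents reporting a burglary loss."""
    losses = [r["burglary_loss"] for r in records if r["burglary_loss"] is not None]
    return mean(losses)


print(f"Percent victimized: {percent_victims(responses):.1f}")
print(f"Mean burglary loss: {mean_burglary_loss(responses):.2f}")
```

A count of prior arrests could be summarized in the same way. A field note on how a burglar selects targets could not, at least not without first being coded into categories.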

Qualitative field research is often a theory or hypothesis-generating activity, as well. In many types of field studies, researchers do not have precisely defined hypotheses to be tested. Field observation may be used to make sense out of an ongoing process that cannot be predicted in advance. This process involves making initial observations, developing tentative general conclusions that suggest further observations, making those observations, revising the prior conclusions, and so forth.

For example, Ross Homel and associates (Homel, Tomsen, and Thommeny 1992) conducted a field study of violence in bars in Sydney, Australia, and found that certain situations tended to trigger violent incidents. Subsequent studies tested a series of hypotheses about the links between certain situations and violence (Homel and Clark 1994), and how interior design was related to aggression in dance clubs (Macintyre and Homel 1996). Later research by James Roberts (2002) expanded these findings by examining management and serving practices in New Jersey bars and clubs. Barney Glaser and Anselm Strauss (1967) refer to this process as “grounded theory.” Rather than following the deductive approach to theory building described in Chapter 2, grounded theory is based on (or grounded in) experience, usually through observations made in the field.

Field studies in criminal justice may also produce quantitative data that can be used to test hypotheses or evaluate policy innovations. Typically, qualitative exploratory observations help define the nature of some crime problem and suggest possible policy responses. Following the policy response, further observations are made to assess the policy’s impact. For example, the situational crime prevention approach proposes five steps to analyze specific crime problems. The first and last of those steps illustrate the dual uses of observation for problem definition and hypothesis testing (Clarke 1997b, 5). The first step is to collect data about the nature and dimensions of the specific crime problem. The last step is to monitor results and disseminate experience.

By now, especially if you have experience as a criminal justice professional, you may be thinking that field research is not much different from what police officers and many other people do every day—make observations in the field and ask people questions. Police may also collect data about particular crime problems, take action, and monitor results. So what’s new here?

Compared with criminal justice professionals, researchers tend to be more concerned with making generalizations and then using systematic field research techniques to support those generalizations. Consider the different goals and approaches used by two people who might observe shoplifters: a retail store security guard and a criminal justice researcher. The security guard wishes to capture a thief and prevent the loss of shop merchandise. Toward those ends, he or she adapts surveillance techniques to the behavior of a particular suspected shoplifter. The researcher’s interests are different; perhaps she or he estimates the frequency of shoplifting, describes characteristics of shoplifters, or evaluates some specific measure to prevent shoplifting. In all likelihood, researchers use more standardized methods of observation aimed toward a generalized understanding.

This chapter examines field research methods in some detail, providing a logical overview and suggesting some specific skills and techniques that make scientific field research more useful than the casual observation we all engage in. As we cover the various applications and techniques of field research, it’s useful to recall the distinction we made, way back in Chapter 1, between ordinary human inquiry and social scientific research. Field methods illustrate how the common techniques of observation that we all use in ordinary inquiry can be deployed in systematic ways.

Topics Appropriate to Field Research

When conditions or behavior must be studied in natural settings, field research is usually the best approach.

One of the key strengths of field research is the comprehensive perspective it gives the researcher. This aspect of field research enhances its validity. By going directly to the phenomenon under study and observing it as completely as possible, we can develop a deeper and fuller understanding of it. This mode of observation, then, is especially (though not exclusively) appropriate to research topics that appear to defy simple quantification. The field researcher may recognize nuances of attitude, behavior, and setting that escape researchers using other methods.

For example, Clifford Shearing and Phillip Stenning (1992, 251) describe how Disney World employs subtle but pervasive mechanisms of informal social control that are largely invisible to millions of theme park visitors. It is difficult to imagine any technique other than direct observation that could produce these insights:

Control strategies are embedded in both environmental features and structural relations. In both cases control structures and activities have other functions which are highlighted so that the control function is overshadowed. For example, virtually every pool, fountain, and flower garden serves both as an aesthetic object and to direct visitors away from, or towards, particular locations. Similarly, every Disney employee, while visibly and primarily engaged in other functions, is also engaged in the maintenance of order.

Many of the different uses of field observation in criminal justice research are nicely summarized by George McCall. Comparing the three principal ways of collecting data—observing, asking questions, and consulting written records—McCall (1978, 8–9) states that observation is most appropriate for obtaining information about physical or social settings, behaviors, and events.

Field research is especially appropriate for topics that can best be understood within their natural settings. Surveys may be able to measure behaviors and attitudes in somewhat artificial settings, but not all behavior is best measured this way. For example, field research is a superior method for studying how street-level drug dealers interpret behavioral and situational cues to distinguish potential customers, normal street traffic, and undercover police officers. It would be difficult to study these skills through a survey.

Field research on actual crimes involves obtaining information about events. McCall
(1978) points out that observational studies of vice—such as prostitution and drug use—are much more common than observational studies of other crimes, largely because these behaviors depend at least in part on being visible and attracting customers. One notable exception is research on shoplifting. A classic study by Terry Baumer and Dennis Rosenbaum (1982) had two goals: (1) to estimate the incidence of shoplifting in a large department store and (2) to assess the effectiveness of different store security measures. Each objective required devising some measure of shoplifting, which Baumer and Rosenbaum obtained through direct observation. Samples of persons were followed by research staff from the time they entered the store until they left. Observers, posing as fellow shoppers, watched for any theft by the person they had been assigned to follow.

Many aspects of physical settings are probably best studied through direct observation. The prevalence and patterns of gang graffiti in public places could not be reliably measured through surveys, unless the goal was to measure perceptions of graffiti. The work of Oscar Newman (1972, 1996), Ray Jeffery (1977), and Patricia and Paul Brantingham (Brantingham and Brantingham 1991) on the relationship between crime and environmental design depends crucially on field observation of settings. If opportunities for crime vary by physical setting, then observation of the physical characteristics of a setting is required.

An evaluation of street lighting as a crime prevention tool in two areas of London illustrates how observation can be used to measure both physical settings and behavior. Kate Painter (1996) was interested in the relationships between street lighting, certain crime rates (measured by victim surveys), fear of crime, and nighttime mobility. Improvements in street lighting were made in selected streets; surveys of pedestrians and households in the affected areas were conducted before and after the lighting improvements. Survey questions included items about victimization, perceptions of crime problems and lighting quality, and reports about routine nighttime behavior in areas affected by the lighting.

Although the pretest and posttest survey items could have been used to assess changes in attitudes and behavior associated with improved lighting, field observations provided better measures of behavior. Painter conducted systematic counts of pedestrians in areas both before and after street lighting was enhanced. Observations like these are better measures of such behavior than are survey items because people often have difficulty recalling actions such as how often they walk through some area after dark.

The Various Roles of the Observer

Field observer roles range from full participation to fully detached observation.

The term field research is broader and more inclusive than the common term participant observation. Field researchers need not always participate in what they are studying, although they usually will study it directly at the scene of the action. As Catherine Marshall and Gretchen Rossman (1995, 60) point out:

The researcher may plan a role that entails varying degrees of “participantness”—that is, the degree of actual participation in daily life. At one extreme is the full participant, who goes about ordinary life in a role or set of roles constructed in the setting. At the other extreme is the complete observer, who engages not at all in social interaction and may even shun involvement in the world being studied. And, of course, all possible complementary mixes along the continuum are available to the researcher.

The full participant, in this sense, may be a genuine participant in what he or she is studying
(for example, a participant in a demonstration against capital punishment)—or at least pretend to be a genuine participant. In any event, if you are acting as a full participant, you let people see you only as a participant, not as a researcher.

That raises an ethical question: is it ethical to deceive the people we are studying in the hope that they will confide in us as they would not confide in an identified researcher? Do the interests of science—the scientific values of the research—offset any ethical concerns?

Related to this ethical consideration is a scientific one. No researcher deceives his or her subjects solely for the purpose of deception. Rather, it is done in the belief that the data will be more valid and reliable, that the subjects will be more natural and honest if they do not know the researcher is doing a research project. If the people being studied know they are being studied, they might reject the researcher or modify their speech and behavior to appear more respectable than they otherwise would. In either case, the process being observed might radically change.

On the other side of the coin, if we assume the role of complete participant, we may affect what we are studying. To play the role of participant, we must participate, yet our participation may affect the social process we are studying. Additional problems may emerge in any participant observation study of active criminals. Legal and physical risks, mentioned in Chapter 2, present obstacles to the complete participant in field research among active offenders or delinquents.

Finally, complete participation in field studies of criminal justice institutions is seldom possible. Although it is common for police officers to become criminal justice researchers, practical constraints on the official duties of police present major obstacles to simultaneously acting as researcher and police officer. Similarly, the responsibilities of judges, prosecutors, probation officers, and corrections workers are not normally compatible with collecting data for research.

Because of these considerations—ethical, scientific, practical, and safety—field researchers most often choose a different role. The researcher taking the role participant-as-observer participates with the group under study but makes it clear that he or she is also undertaking research. If someone has been convicted of some offense and been placed on probation, for example, that might present an opportunity to launch a study of probation officers.

McCall (1978) suggests that field researchers who study active offenders may comfortably occupy positions around the periphery of criminal activity. Acting as a participant in certain types of leisure activities, such as frequenting selected bars or dance clubs, may be appropriate roles. This approach was used by Dina Perrone in her research on drug use in New York dance clubs (2006; also mentioned in Chapter 2). Furthermore, McCall describes how making one’s role as a researcher known to criminals and becoming known as a “right square” is more acceptable to subjects than an unsuccessful attempt to masquerade as a colleague. There are dangers in this role also, however. The people being studied may shift their attention to the research project, and the process being observed may no longer be typical. Conversely, a researcher may come to identify too much with the interests and viewpoints of the participants. This is referred to as going native and results in loss of the detachment necessary for social science.

The observer-as-participant identifies himself or herself as a researcher and interacts with the participants in the course of their routine activities but makes no pretense of actually being a participant. Many observational studies of police patrol are examples of this approach. Researchers typically accompany police officers on patrol, observing routine activities and interactions between police and citizens. Spending several hours in the company of a police
officer also affords opportunities for unstructured interviewing.

The complete observer, at the other extreme, observes a location or process without becoming a part of it in any way. The subjects of study might not even realize they are being studied because of the researcher’s unobtrusiveness. An individual making observations while sitting in a courtroom is an example. Although the complete observer is less likely to affect what is being studied and less likely to go native than the complete participant, he or she may also be less able to develop a full appreciation of what is being studied. A courtroom observer, for example, witnesses only the public acts that take place in the courtroom, not private conferences between judges and attorneys.

McCall (1978, 45) points out an interesting and often unnoticed trade-off between the role observers adopt and their ability to learn from what they see. If their role is covert (complete participation) or detached (complete observation), they are less able to ask questions to clarify what they observe. As complete participants, they take pains to conceal their observations and must exercise care in querying subjects. Similarly, complete observation means that it is generally not possible to interact with the persons or things being observed.

Researchers have to think carefully about the trade-off. If it is most important that subjects not be affected by their role as observer, then complete participation or observation is preferred. If being able to ask questions about what they observe is important, then some role that combines participation and observation is better.

More generally, the appropriate role for observers hinges on what they want to learn and how their inquiry is affected by opportunities and constraints. Different situations require different roles for researchers. Unfortunately, there are no clear guidelines for making this choice; field researchers rely on their understanding of the situation, their judgment, and their experience. In making a decision, researchers must be guided by both methodological and ethical considerations. Because these often conflict, deciding on the appropriate role may be difficult. Often, researchers find that their role limits the scope of their study.

Asking Questions

Field researchers frequently supplement observations by interviewing subjects.

Field research often involves going where the action is and simply watching and listening. Researchers can learn a lot merely by being attentive to what’s going on. Field research can also involve more active inquiry. Sometimes it’s appropriate to ask people questions and record their answers.

We examined interviewing during our discussion on survey research in Chapter 7. Field research interviews are usually much less structured than survey interviews. At one extreme, an unstructured interview is essentially a conversation in which the interviewer establishes a general direction for the conversation and pursues specific topics raised by the respondent. Ideally, the respondent does most of the talking. Michael Quinn Patton (2001) refers to this type as an “informal conversational interview,” which is especially well suited to in-depth probing.

Unstructured interviews are most appropriate when researchers have little knowledge about a topic and when it’s reasonable for them to have a casual conversation with a subject. This is a good strategy for interviewing active criminals. Unstructured interviews are also appropriate when researchers and subjects are together for an extended time, such as a researcher accompanying police on patrol.

In other field research situations, interviews will be somewhat more structured. The conversational approach may be difficult to use with officials in criminal justice or other agencies,
who will respond best (at least initially) to a specific set of open-ended questions. This is because it is usually necessary to arrange appointments to conduct field research interviews with judges, prosecutors, bail commissioners, and other officials. Having arranged such an appointment, it would be awkward to initiate a casual conversation in hopes of eliciting the desired information.

At the same time, one of the special strengths of field research is its flexibility in the field. Even during structured interviews with public officials, the answers evoked by initial questions should shape subsequent ones.

For example, Michael Maxfield and Carsten Andresen (2002) studied the use by the New Jersey State Police of video recording equipment to document traffic stops. Very little published research existed on the use of video equipment by police or even the ways they go about doing traffic enforcement. Part of the research involved semistructured field interviews with commanders, supervisors, and troopers. Figure 8.1 shows the interview guide Maxfield and Andresen used with supervisors. This guide was just that—a guide. Some subjects were friendly and wanted to talk, so the interview became more of a conversation that eventually yielded answers to the queries in the guide. Others were wary, probably because the agency had been subject to criticism over racial profiling and other discriminatory practices. Interviews with such persons were brief, and they followed the guide very closely.

At its best, a field research interview is much like normal conversation. Because of this, it is essential to keep reminding ourselves that we are not having a normal conversation. In

1. What do you feel are the main uses of the mobile video system (MVS) for supervisors? [probe contingent on responses]

2. Please describe how you use tapes from the MVS system to periodically review officer performance. [probes]

• Compare tapes with incident reports.

• Review tapes from certain types of incidents. [probe: Please describe types and why these types.]

• Keep an eye on individual officers. [probe: Please describe what might prompt you to select individual officers. I do not want you to name individuals. Instead, can you describe the reasons why you might want to keep an eye on specific individuals?]

3. For routine review, how do you decide which tapes to select for review?

• Does it vary by shift?

• Do you try to review a certain number of incidents?

• Or do you scan through tapes sort of at random?

4. Think back to when the system first became operational in cars at this station. What were some questions, concerns, or problems that officers might have had at the beginning?

5. How do officers under your command feel about the system now?

6. What, if any, technical problems seem to come up regularly? Occasionally?

7. Have you encountered any problems, or do you have any concerns about the MVS? [probe: operational issues; tape custody issues; other]

8. Please describe any specific ways you think the system can or should be improved.

9. If you want to find the tape for an individual officer on a specific day, and for a specific incident, would you say that’s pretty easy, not difficult, or somewhat difficult? Do you have any suggestions for changing the way tapes are filed and controlled?

10. I would like to view some sample tapes. Can we go to the tape cabinet and find a tape for [select date from list]?

Figure 8.1 Interview Guide for Field Study of State Police
Source: Adapted from Maxfield and Andresen (2002).

Chapter 8 Field Research 207

riety of informal organizational cultures. Criminal courts are highly structured organizations in which a presiding judge may oversee court assignments and case scheduling for judges and many support personnel. At the same time, courts are chaotic organizations in which three constellations of professionals—prosecutors, defense attorneys, and judges—episodically interact to process large numbers of cases.

Continuing with the example of your research on community corrections, the best strategy in gaining access to virtually any other formal criminal justice organization is to use a four-step procedure: sponsor, letter, phone call, and meeting. Our discussion of these steps assumes that you will begin your field research by interviewing the agency executive director and gaining that person’s approval for subsequent interviews and observations.

Sponsor The first step is to find a sponsor—a person who is personally known to and respected by the executive director. Ideally, a sponsor will be able to advise you on a person to contact, that person’s formal position in the organization, and her or his informal status, including her or his relationships with other key officials. Such advice can be important in helping you initiate contact with the right person while avoiding people who have bad reputations.

For example, you may initially think that a particular judge who is often mentioned in newspaper stories about community corrections would be a useful source of information. However, your sponsor might advise you that the judge is not held in high regard by prosecutors and community corrections staff. Your association with this judge would generate suspicion on the part of other officials whom you might eventually want to contact and may frustrate your attempts to obtain information.

Finding the right sponsor is often the most important step in gaining access. It may require a couple of extra steps, because you might first need to ask a professor whether she or he knows

normal conversations, each of us wants to come across as an interesting, worthwhile person. Often we don’t really hear each other because we’re too busy thinking of what we’ll say next. As an interviewer, the desire to appear interesting is counterproductive to the task. We need to make the other person seem interesting by being interested ourselves.

Gaining Access to Subjects

Arranging access to subjects in formal organizations or subcultures begins with an initial contact.

Suppose you decide to undertake field research on a community corrections agency in a large city. Let’s assume that you do not know a great deal about the agency and that you will identify yourself as a researcher to staff and other people you encounter. Your research interests are primarily descriptive: you want to observe the routine operations of the agency in the office and elsewhere. In addition, you want to interview agency staff and persons who are serving community corrections sentences. This section will discuss some of the ways you might prepare before you conduct your interviews and direct observations.

As usual, you are well advised to begin with a search of the relevant literature, filling in your knowledge of the subject and learning what others have said about it.

Gaining Access to Formal Organizations

Any research on a criminal justice institution, or on persons who work either in or under the supervision of an institution, normally requires a formal request and approval. One of the first steps in preparing for the field, then, is to arrange access to the community corrections agency.

Obtaining initial approval can be confusing and frustrating. Many criminal justice agencies in large cities have a complex organization, combining a formal hierarchy with a bewildering va-

208 Part Three Modes of Observation

introduction, brief statement of your research purpose, and action request. See Figure 8.2 for an example. The introduction begins by naming your sponsor, thus immediately establishing your mutual acquaintance. This is a key part of the process; if you do not name a sponsor, or if you name the wrong sponsor, you might get no further.

Next, you describe your research purpose succinctly. This is not the place to give a detailed description as you would in a proposal.

someone. You could then contact that person (with the sponsorship of your professor) and ask for further assistance. For purposes of illustration, we will assume that your professor is knowledgeable, well connected, and happy to act as your sponsor. Your professor confirms your view that it is best to begin with the executive director of community corrections.

Letter Next, write a letter to the executive director. Your letter should have three parts:

Jane Adams
Executive Director
Chaos County Community Corrections
Anxiety Falls, Colorado

1 May 2009

Dear Ms. Adams:

My colleague, Professor Marcus Nelson, suggested I contact you for assistance in my research on community corrections. I will be conducting a study of community corrections programs and wish to include the Chaos County agency in my research.

Briefly, I am interested in learning more about the different types of sentences that judges impose in jurisdictions with community corrections agencies. As you know, Colorado’s community corrections statute grants considerable discretion to individual counties in arranging locally administered corrections programs. Because of this, it is generally believed that a wide variety of corrections programs and sentences have been developed throughout the state. My research seeks to learn more about these programs as a first step toward developing recommendations that may guide the development of programs in other states. I also wish to learn more about the routine administration of a community corrections program such as yours.

I would like to meet with you to discuss what programs Chaos County has developed, including current programs and those that were considered but not implemented. In addition, any information about different types of community corrections sentences that Chaos County judges impose would be very useful. Finally, I would appreciate your suggestions on further sources of information about community corrections programs in Chaos County and other areas.

I will call your office at about 10:00 a.m. on Monday, May 8, to arrange a meeting. If that time will not be convenient, or if you have any questions about my research, please contact me at the number below.

Thanks in advance for your help.

Sincerely,

Alfred Nobel
Research Assistant
Institute for Advanced Studies
(201) 555-1212

Figure 8.2 Sample Letter of Introduction


with the executive director—and established your legitimacy by naming a sponsor.

Meeting The final step is meeting with or interviewing the contact person. Because you have used the letter–phone call–meeting procedure, the contact person may have already taken preliminary steps to help you. For example, because the letter in Figure 8.2 indicates that you wish to interview the executive director about different types of community corrections sentences, she may have assembled some procedures manuals or reports in preparation for your meeting.

This procedure generally works well in gaining initial access to public officials or other people who work in formal organizations. Once initial access is gained, it is up to the researcher to use interviewing skills and other techniques to elicit the desired information. This is not as difficult as it might seem to novice (or apprentice) researchers for a couple of reasons.

First, most people are at least a bit flattered that their work, ideas, and knowledge are of interest to a researcher. And researchers can take advantage of this with the right words of encouragement. Second, criminal justice professionals are often happy to talk about their work with a knowledgeable outsider. Police, probation officers, and corrections workers usually encounter only their colleagues and their clients on the job. Interactions with colleagues become routine and suffused with office politics, and interactions with clients are common sources of stress. Talking with an interested and knowledgeable researcher is often seen as a pleasant diversion.

By the same token, Richard Wright and Scott Decker (1994) report that most members of their sample of active burglars were both happy and flattered to discuss the craft of burglary. Because they were engaged in illegal activities, burglars had to be more circumspect about sharing their experiences, unlike the way

If possible, keep the description to one or two paragraphs, as in Figure 8.2. If a longer description is necessary to explain what you will be doing, you should still include only a brief description in your introductory letter, referring the reader to a separate attachment in which you summarize your research.

The action request describes what immediate role you are asking the contact person to play in your research. You may simply be requesting an interview, or you may want the person to help you gain access to other officials. Notice how the sample in Figure 8.2 mentions both an interview and “suggestions on further sources of information about community corrections.” In any case, you will usually want to arrange to meet or at least talk with the contact person. That leads to the third step.

Phone Call You probably already know that it can be difficult to arrange meetings with public officials (and often professors) or even to reach people by telephone. You can simplify this task by concluding your letter with a proposal for this step: arranging a phone call. The example in Figure 8.2 specifies a date and approximate time when you will call. To be safe, specify a date about a week from the date of your letter. Notice also the request that the executive director call you if some other time will be more convenient.

When you make the call, the executive director will have some idea of who you are and what you want. She will also have had the opportunity to contact your sponsor if she wants to verify any information in your introductory letter.

The actual phone call should go smoothly. Even if you are not able to talk with the executive director personally, you will probably be able to talk to an assistant and make an appointment for a meeting (the next step). Again, this will be possible because your letter described what you eventually want—a meeting


criminals hang out. Wright and Decker rejected that strategy as a time-consuming and uncertain way to find burglars, in part because they were not sure where burglars hung out. In contrast, Bruce Jacobs (1999) initiated contact with street-level drug dealers by hanging around and being noticed in locations known for crack availability. Consider how this tactic might make sense for finding drug dealers, whose illegal work requires customers. In contrast, the offense of burglary is more secretive, and it’s more difficult to imagine how one would find an area known for the presence of burglars.

Whatever techniques are used to identify subjects among subcultures, it is generally not possible to produce a probability sample, and so the sample cannot be assumed to represent some larger population within specified confidence intervals. It is also important to think about potential selection biases in whatever procedures are used to recruit subjects. Although we can’t make probability statements about samples of active offenders, such samples may be representative of a subculture target population.

Selecting Cases for Observation

This brings us to the more general question of how to select cases for observation in field research.

The techniques used by Wright and Decker, as well as by many other researchers who have studied active criminals, combine the use of informants and what is called snowball sampling. As we mentioned in Chapter 6, with snowball sampling, initial research subjects (or informants) identify other persons who might also become subjects, who in turn suggest more potential subjects, and so on. In this way, a group of subjects is accumulated through a series of referrals.
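The referral logic of snowball sampling can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `snowball_sample`, the referral dictionary, and the subject labels are hypothetical and do not come from any study discussed here. The sketch simply follows each subject's referrals in turn until no new names appear or a target sample size is reached.

```python
from collections import deque


def snowball_sample(seeds, referrals, max_size):
    """Accumulate subjects by following chains of referrals.

    seeds     -- initial subjects or informants (hypothetical labels)
    referrals -- dict mapping a person to the people he or she names
    max_size  -- stop once this many subjects have been accumulated
    """
    sample = []
    seen = set()
    queue = deque(seeds)
    while queue and len(sample) < max_size:
        person = queue.popleft()
        if person in seen:
            # Already in the sample: this person was named by more than one source.
            continue
        seen.add(person)
        sample.append(person)
        for referred in referrals.get(person, []):
            if referred not in seen:
                queue.append(referred)
    return sample


# Hypothetical referral data, invented for illustration only.
referrals = {
    "seed_1": ["a", "b"],
    "seed_2": ["b", "c"],
    "a": ["d"],
    "c": ["d", "e"],
}

sample = snowball_sample(["seed_1", "seed_2"], referrals, max_size=10)
print(sample)
```

Note that the order in which subjects enter the sample depends entirely on who refers whom, which is one reason such a sample cannot be treated as a probability sample.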

Wright and Decker’s (1994) study provides a good example. The ex-offender contacted a few active burglars and a few streetwise noncriminals, who referred researchers to additional subjects, and so on. This process is illustrated in Figure 8.3, which shows the chain of referrals

many people talk about events at work. As a result, burglars enjoyed the chance to describe their work to interested researchers who both promised confidentiality and treated them “as having expert knowledge normally unavailable to outsiders” (1994, 26).

Gaining Access to Subcultures

Research by Wright and Decker illustrates how gaining access to subcultures in criminal justice—such as active criminals, deviants, juvenile gangs, and inmates—requires tactics that differ in some respects from those used to meet with public officials. Letters, phone calls, and formal meetings are usually not appropriate for initiating field research among active offenders. However, the basic principle of using a sponsor to gain initial access operates in much the same way, although the word informant is normally used to refer to someone who helps make contact with subcultures.

Informants may be people whose job involves working with criminals—such as police, juvenile caseworkers, probation officers, attorneys, and counselors at drug clinics. Lawyers who specialize in criminal defense work can be especially useful sources of information about potential subjects. Frances Gant and Peter Grabosky (2001) contacted a private investigator to help locate car thieves and people working in auto-related businesses who were reputed to deal in stolen parts.

Wright and Decker (1994) were fortunate to encounter a former offender who was well connected with active criminals. Playing the role of sponsor, the ex-offender helped researchers in two related ways. First, he referred them to other people, who in turn found active burglars willing to participate in the study. Second, he was well known and respected among burglars, and his sponsorship of the researchers made it possible for them to study a group of naturally suspicious subjects.

A different approach for gaining access to subcultures is to hang around places where


and 029. Notice also that some subjects were themselves “nominated” by more than one source. In the middle of the bottom row in Figure 8.3, for example, subject 064 was mentioned by subjects 060 and 061.

There are, of course, other ways of selecting subjects for observation. Chapter 6 discussed the more conventional techniques involved in probability sampling and the accompanying logic. Although the general principles of representativeness should be remembered in field research, controlled sampling techniques are often not possible.

that accumulated a snowball sample of 105 individuals.

Starting at the top of Figure 8.3, the ex-offender put researchers in contact with two subjects directly (001 and 003), a small-time criminal, three streetwise noncriminals, a crack addict, a youth worker, and someone serving probation. Continuing downward, the small-time criminal was especially helpful, identifying 12 subjects who participated in the study (005, 006, 008, 009, 010, 021, 022, 023, 025, 026, 030, 032). Notice how the snowball effect continues, with subject 026 identifying subjects 028

Figure 8.3 Snowball Sample Referral Chart
Source: Reprinted from Richard T. Wright and Scott H. Decker, Burglars on the Job: Streetlife and Residential Break-Ins (Boston: Northeastern University Press, 1994), p. 19. Reprinted by permission of Northeastern University Press.

[Figure: referral tree that starts with the EX-OFFENDER and branches through intermediaries (three streetwise noncriminals, a probationer, a youth worker, a “crack” addict, a small-time criminal, a heroin addict/retired small-time criminal, a low-level “fence,” and a retired high-level “fence”) to subjects numbered 001 through 105. Key: Females; Whites; Juveniles (under 18) *.]


who had not been caught. After accumulating their sample, the researchers were in a position to test this assumption by examining arrest records for their subjects. Only about one-fourth of the active burglars had ever been convicted of burglary; an additional one-third had been arrested for burglary but not convicted. More than 40 percent had no burglary arrests, and 8 percent had never been arrested for any offense (1994, 12).

Putting all this together, Wright and Decker concluded that about three-fourths of their subjects would not have been eligible for inclusion if the researchers had based their sample on persons convicted of burglary. Thus little overlap exists between the population of active burglars and the population of convicted burglars.
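The arithmetic behind the three-fourths figure can be checked with a short Python sketch. The shares below are the rounded figures quoted above; treating "more than 40 percent" as exactly 40 percent is an assumption made only for illustration, and the variable names are hypothetical.

```python
from fractions import Fraction

# Rounded shares of active burglars, as quoted in the text.
convicted = Fraction(1, 4)               # ever convicted of burglary
arrested_not_convicted = Fraction(1, 3)  # arrested for burglary, not convicted
no_burglary_arrest = Fraction(40, 100)   # "more than 40 percent": assumed lower bound

# Everyone who was never convicted would be missed by a sample of convicted burglars.
not_convicted = 1 - convicted
print(not_convicted)  # share ineligible for a conviction-based sample

# The three groups should cover nearly the whole sample; the small
# remainder reflects rounding ("about", "more than") in the quoted figures.
accounted_for = convicted + arrested_not_convicted + no_burglary_arrest
print(accounted_for)
```

The three groups do not sum to exactly one because the quoted shares are approximations, not because any subjects are missing from the tally.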

Purposive Sampling in Field Research

Sampling in field research tends to be more complicated than in other kinds of research. In many types of field studies, researchers attempt to observe everything within their field of study; thus, in a sense, they do not sample at all. In reality, of course, it is impossible to observe everything. To the extent that field researchers observe only a portion of what happens, what they do observe is a de facto sample of all the possible observations that might have been made. We can seldom select a controlled sample of such observations. But we can keep in mind the general principles of representativeness and interpret our observations accordingly.

The ability to systematically sample cases for observation depends on the degree of structure and predictability of the phenomenon being observed. This is more of a general guideline than a hard-and-fast rule. The actions of youth gangs, burglars, and auto thieves are less structured and predictable than those of police officers. It is possible to select a probability sample of police officers for observation because the behavior of police officers in a given city is structured and predictable in many dimensions. Because

As an illustration, consider the potential selection biases involved in a field study of deviants. Let’s say we want to study a small number of drug dealers. We have a friend who works in the probation department of a large city and is willing to introduce us to people convicted of drug dealing and sentenced to probation. What selection problems might result from studying subjects identified in this way? How might our subjects not be representative of the general population of drug dealers? If we work our way backward from the chain of events that begins with a crime and ends with a criminal sentence, the answers should become clear.

First, drug dealers sentenced to probation may be first-time offenders or persons convicted of dealing small amounts of “softer” drugs. Repeat offenders and kingpin cocaine dealers will not be in this group. Second, it is possible that people initially charged with drug dealing were convicted of simple possession through a plea bargain; because of our focus on people convicted of dealing, our selection procedure will miss this group as well. Finally, by selecting dealers who have been arrested and convicted, we may be gaining access only to those less skilled dealers who got caught. More skilled or experienced dealers may be less likely to be arrested in the first place; they may be different in important ways from the dealers we wish to study. Also, if dealers in street drug markets are more likely to be arrested than dealers who work through social networks of friends and acquaintances, a sample based on arrested dealers could be biased in more subtle ways.

To see why this raises an important issue in selecting cases for field research, let’s return again to the sample of burglars studied by Wright and Decker. Notice that their snowball sample began with an ex-offender and that they sought out active burglars. An alternative approach would be to select a probability or other sample of convicted burglars, perhaps in prison or on probation. But Wright and Decker rejected this strategy for sampling because of the possibility that they would overlook burglars


wide streets and narrow ones, busy streets and quiet ones, or samples from different times of day. In a study of pedestrian traffic, we might also observe people in different types of urban neighborhoods—comparing residential and commercial areas, for example.

Table 8.1 summarizes different sampling dimensions that might be considered in planning field research. The behavior of people, together with the characteristics of people and places, can vary by population group, location, time, and weather. We have already touched on the first two in this chapter; now we will briefly discuss how sampling plans might consider time and weather dimensions.

People tend to engage in more out-of-door activities in fair weather than in wet or snowy conditions. In northern cities, people are outside more when the weather is warm. Any study of outdoor activity should therefore consider the potential effects of variation in the weather. For example, in Painter’s study of pedestrian traffic before and after improvements in street lighting, it was important to consider weather conditions during the times observations were made.

Behavior also varies by time, presented as micro and macro dimensions in Table 8.1. City streets in a central business district are busiest during working hours, whereas more people are in residential areas at other times. And, of course, people do different things on weekends than during the work week. Seasonal variation, the macro time dimension, may also be important in criminal justice research. Daylight lasts longer in summer months, which affects the amount of time people spend outdoors. Shopping peaks from Thanksgiving to Christmas, increasing the number of shoppers, who along with their automobiles may become targets for thieves.

In practice, controlled probability sampling is seldom used in field research. Different types of purposive samples are much more common. Patton (2001, 230; emphasis in original) describes a broad range of approaches to purposive

the population of active criminals is unknown, it is not possible to select a probability sample for observation.

This example should call to mind our discussion of sampling frames in Chapter 6. A roster of police officers and their assignments to patrol sectors and shifts could serve as a sampling frame for selecting subjects to observe. No such roster of gang members, burglars, and auto thieves is available. Criminal history records could serve as a sampling frame for selecting persons with previous arrests or convictions, subject to the problems of selectivity we have mentioned.
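To make the role of a sampling frame concrete, here is a minimal Python sketch of drawing a simple random sample from a roster. The roster, its sector and shift labels, the sample size, and the function name `draw_probability_sample` are all hypothetical, invented only to illustrate the idea; a real study would work from an agency's actual assignment records.

```python
import random


def draw_probability_sample(sampling_frame, n, seed=None):
    """Simple random sample without replacement.

    Every unit listed in the frame has the same chance of selection,
    which is possible only because the frame enumerates the population.
    """
    rng = random.Random(seed)  # a seed makes the draw reproducible
    return rng.sample(sampling_frame, n)


# Hypothetical roster: 3 patrol sectors x 3 shifts x 4 officers per cell.
roster = [
    (f"{sector}-{shift}-{k}", sector, shift)
    for sector in ("north", "central", "south")
    for shift in ("day", "evening", "night")
    for k in range(1, 5)
]

selected = draw_probability_sample(roster, n=6, seed=42)
for officer_id, sector, shift in selected:
    print(officer_id, sector, shift)
```

Nothing comparable can be written for active burglars or gang members, because there is no list to pass in as `sampling_frame`.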

Now consider the case in which a sampling frame is less important than the regularity of a process. The regular, predictable passage of people on city sidewalks makes it possible to systematically select a sample of cases for observation. There is no sampling frame of pedestrians, but studies such as Painter’s research (1996) on the effects of street lighting can depend on the reliable flow of passersby who may be observed.

In an observational study such as Painter’s, we might also make observations at a number of different locations on different streets. We could pick the sample of locations through standard probability methods, or more likely, we could use a rough quota system, observing

Table 8.1 Sampling Dimensions in Field Research

Sampling Dimension: Variation in
Population: Behavior and characteristics
Space: Behavior; Physical features of locations
Time, micro: Behavior by time of day, day of week; Lighting by time of day; Business, store, entertainment activities by time of day, day of week
Time, macro: Behavior by season, holiday; Entertainment by season, holiday
Weather: Behavior by weather


types of automated and remote measurement, such as videotapes, devices that count automobile traffic, or computer tabulations of mass transit users. In between is a host of methods that have many potential applications in criminal justice research.

Of course, the methods selected for recording observations are directly related to issues of measurement—especially how key concepts are operationalized. Thinking back to our discussion of measurement in Chapter 4, you should recognize why this is so. If we are interested in policies to increase nighttime pedestrian traffic in some city, we might want to know why people do or do not go out at night and how many people stroll around different neighborhoods. Interviews—perhaps in connection with a survey—can determine people’s reasons for going out or not, whereas video recordings of passersby can provide simple counts. By the same token, a traffic-counting device can produce information about the number of automobiles that pass a particular point on the road, but it cannot measure what the blood alcohol levels of drivers are, whether riders are wearing seat belts, or how fast a vehicle is traveling.

Cameras and Voice Recorders

Video cameras may be used in public places to record relatively simple phenomena, such as the passage of people or automobiles, or more complex social processes. For several years, London police have monitored traffic conditions at dozens of key intersections using video cameras mounted on building rooftops. In fact, the 2007 Road Atlas for Britain includes the locations of stationary video cameras on its maps. Since 2003, video cameras have monitored all traffic entering central London as part of an effort to reduce traffic. The license plates of vehicles that do not register paying a toll are recorded, and violation notices sent to owners. Ronald Clarke (1996) studied speeding in Illinois, drawing on observations automatically recorded by cameras placed at several locations throughout the state.

sampling and offers a useful comparison of probability and purposive samples:

The logic and power of probability sampling derive from statistical probability theory. A random and statistically representative sample permits confident generalization from a sample to a larger population. . . . The logic and power of purposeful sampling lies in selecting information-rich cases for study in depth.

Nonetheless, if researchers understand the principles and logic of more formal sampling methods, they are likely to produce more effective purposive sampling in field research.

Recording Observations

Many different methods are available for collecting and recording field observations.

Just as there is great variety in the types of field studies we might conduct, we have many options for making records of field observations. In conducting field interviews, researchers almost certainly write notes of some kind, but they might also tape-record interviews. Videotaping may be useful in field interviews to capture visual images of dress and body language. Photographs or videotapes can be used to make records of visual images such as a block of apartment buildings before and after some physical design change or to serve as a pretest for an experimental neighborhood cleanup campaign. This technique was used by Robert Sampson and Stephen Raudenbush (1999) in connection with probability samples of city blocks in Chicago. Videotapes were made of sampled blocks, and the recordings were then viewed to assess physical and social conditions in those areas.

We can think of a continuum of methods for recording observations. At one extreme is traditional field observation and note taking with pen and paper, such as we might use in field interviews. The opposite extreme includes various


observations as written notes, perhaps in a field journal. Field notes should include both empirical observations and interpretations of them. They should record what we “know” we have observed and what we “think” we have observed. It is important, however, that these different kinds of notes be identified for what they are. For example, we might note that person X approached and handed something to person Y—a known drug dealer—that we think this was a drug transaction, and that we think person X was a new customer.

Every student is familiar with the process of taking notes. Good note taking in field research requires more careful and deliberate attention and involves some specific skills. Three guidelines are particularly important.

First, don’t trust your memory any more than you have to; it’s untrustworthy. Even if you pride yourself on having a photographic memory, it’s a good idea to take notes, either during the observation or as soon afterward as possible. If you are taking notes during the observation, do it unobtrusively because people are likely to behave differently if they see you writing down everything they say or do.

Second, it’s usually a good idea to take notes in stages. In the first stage, you may need to take sketchy notes (words and phrases) to keep abreast of what’s happening. Then remove yourself and rewrite your notes in more detail. If you do this soon after the events you’ve observed, the sketchy notes will help you recall most of the details. The longer you delay, the less likely you are to recall things accurately and fully. James Roberts (2002), in his study of aggression in New Jersey nightclubs, was reluctant to take any notes while inside clubs, so he retired to his car to make sketchy notes about observations, then wrote them up in more detail later.

Third, you will inevitably wonder how much you should record. Is it really worth the effort to write out all the details you can recall right after the observation session? The basic answer is yes. In field research, you can’t really be sure

Still photographs may be appropriate to re-cord some types of observations, such as graf-fi ti or litter. Photos have the added benefi t of preserving visual images that can later be viewed and coded by more than one person, thus facilitating interrater reliability checks. If we are interested in studying pedestrian traf-fi c on city streets, we might gather data about what types of people we see and how many there are. As the number and complexity of our observations increase, it becomes more diffi cult to reliably record how many males and females we see, how many adults and juveniles, and so on. Taking photographs of sampled areas will enable us to be more confi dent in our measure-ments and will also make it possible for an-other person to check on our interpretation of the photographs.

This approach was used by James Lange and associates (Lange, Johnson, and Voas 2005) in their study of speeding on the New Jersey Turnpike. The researchers deployed radar devices and digital cameras to measure the speed of vehicles and to take photos of drivers. Equipment was housed in an unmarked van parked at sample locations on the turnpike. The race of drivers was later coded by teams of researchers who studied the digital images. Agreement by at least two of three coders was required to accept the photos for further analysis.
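The two-of-three agreement rule can be made concrete with a short sketch. The Python below is an illustration only: the function name `majority_code`, the photo identifiers, and the category labels "A", "B", and "C" are hypothetical and are not taken from the study by Lange and associates.

```python
from collections import Counter
from typing import Optional, Sequence


def majority_code(codes: Sequence[str], min_agree: int = 2) -> Optional[str]:
    """Return the code chosen by at least `min_agree` coders, else None."""
    if not codes:
        return None
    label, count = Counter(codes).most_common(1)[0]
    return label if count >= min_agree else None


# Hypothetical codes assigned independently by three coders to each photo.
photos = {
    "photo_001": ["A", "A", "B"],  # two coders agree: accepted as "A"
    "photo_002": ["A", "B", "C"],  # no two coders agree: dropped
    "photo_003": ["B", "B", "B"],  # all three agree: accepted as "B"
}

accepted = {}
for photo_id, codes in photos.items():
    result = majority_code(codes)
    if result is not None:
        accepted[photo_id] = result

print(accepted)
```

Photos on which no two coders agree are simply excluded from analysis, which is one plausible reading of the acceptance rule described above.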

In addition to their use in interviews, audiotape recorders are useful for dictating observations. For example, a researcher interested in patterns of activity on urban streets can dictate observations while riding through selected areas in an automobile. It is possible to dictate observations in an unstructured manner, describing each street scene as it unfolds. Or a tape recorder can be used more like an audio checklist, with observers noting specified items seen in preselected areas.

Field Notes

Even tape recorders and cameras cannot capture all the relevant aspects of social processes. Most field researchers make some records of

216 Part Three Modes of Observation

Because structured field observation forms often resemble survey questionnaires, the use of such forms has the benefit of enabling researchers to generate numeric measures of conditions observed in the field. The Bureau of Justice Assistance (1993, 43) has produced a handbook containing guidelines for conducting structured field observations, called environmental surveys. The name is significant because observers record information about the conditions of a specified environment:

[Environmental] surveys seek to assess, as systematically and objectively as possible, the overall physical environment of an area. That physical environment comprises the buildings, parks, streets, transportation facilities, and overall landscaping of an area as well as the functions and conditions of those entities.

Environmental surveys have come to be an important component of problem-oriented policing and situational crime prevention. Figure 8.4 is adapted from an environmental survey form used by the Philadelphia Police Department in drug enforcement initiatives. Environmental surveys are conducted to plan police strategy in drug enforcement in small areas and to assess changes in conditions following targeted enforcement. Notice that the form can be used to record both information about physical conditions (street width, traffic volume, streetlights) and counts of people and their activities.

Like interview surveys, environmental surveys require that observers be carefully trained in what to observe and how to interpret it. For example, the instructions that accompany the environmental survey excerpted in Figure 8.4 include guidance on coding abandoned automobiles:

Count as abandoned if it appears non-drivable (i.e., has shattered windows, dismantled body parts, missing tires, missing license plates). Consider it abandoned if it appears that it has not been driven for

what’s important and unimportant until you’ve had a chance to review and analyze a great volume of information, so you should record even things that don’t seem important at the time. They may turn out to be significant after all. In addition, the act of recording the details of something unimportant may jog your memory on something that is important.

Structured Observations

Field notes may be recorded on highly structured forms in which observers mark items in much the same way a survey interviewer marks a closed-ended questionnaire. For example, Steve Mastrofski and associates (1998, 11) describe how police performance can be recorded on field observation questionnaires:

Unlike ethnographic research, which relies heavily on the field researcher to make choices about what to observe and how to interpret it, the observer using [structured observation] is guided by . . . instruments designed by presumably experienced and knowledgeable police researchers.

Training for such efforts is extensive and time consuming. But Mastrofski and associates compared structured observation to closed-ended questions on a questionnaire. If researchers can anticipate in advance that observers will encounter a limited number of situations in the field, those situations can be recorded on structured observation forms. And, like closed-ended survey questions, structured observations have higher reliability.

In a long-term study, Ralph Taylor (1999) developed forms to code a range of physical characteristics in a sample of Baltimore neighborhoods. Observers recorded information on closed-ended items about housing layout, street length and width, traffic volume, type of nonresidential land use, graffiti, persons hanging out, and so forth. Observations were completed in the same neighborhoods in 1981 and 1994.

Chapter 8 Field Research 217

Linking Field Observations and Other Data

Although criminal justice research may use field methods or sample surveys exclusively, a given project will often collect data from several sources. This is consistent with general advice

some time and that it is not going to be for some time to come.

Other instructions provide details on how to count drivable lanes, what sorts of activities constitute “playing” and “working,” how to estimate the ages of people observed, and so on.

Figure 8.4 Example of Environmental Survey
Source: Adapted from Bureau of Justice Assistance (1993, Appendix B).

Date:______________ Day of week:______________ Time:______________

Observer:________________________________________________________

Street name:______________________________________________________

Cross streets:_____________________________________________________

1. Street width

Number of drivable lanes ____

Number of parking lanes ____

Median present? (yes = 1, no = 2) ____

2. Volume of traffic flow: (check one)

a. very light ____

b. light ____

c. moderate ____

d. heavy ____

e. very heavy ____

3. Number of street lights ____

4. Number of broken street lights ____

5. Number of abandoned automobiles ____

6. List all the people on the block and their activities:

Males                   Hanging out   Playing   Working   Walking   Other
  Young (up to age 12)     ____        ____      ____      ____     ____
  Teens (13–19)            ____        ____      ____      ____     ____
  Adult (20–60)            ____        ____      ____      ____     ____
  Seniors (61+)            ____        ____      ____      ____     ____

Females
  Young (up to age 12)     ____        ____      ____      ____     ____
  Teens (13–19)            ____        ____      ____      ____     ____
  Adult (20–60)            ____        ____      ____      ____     ____
  Seniors (61+)            ____        ____      ____      ____     ____
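Because a structured form like the one in Figure 8.4 yields numeric measures, each completed form maps naturally onto a data record. The following Python sketch shows one way such a record might be represented. The class name `EnvironmentalSurvey`, its field names, and the sample values are assumptions made for illustration; they are not part of the Bureau of Justice Assistance form or of any software used in the original research.

```python
from dataclasses import dataclass, field
from typing import Dict, Tuple

# Categories transcribed from the form; the lowercase spellings are our own.
TRAFFIC_LEVELS = ("very light", "light", "moderate", "heavy", "very heavy")
SEXES = ("male", "female")
AGE_GROUPS = ("young", "teens", "adult", "seniors")
ACTIVITIES = ("hanging out", "playing", "working", "walking", "other")


@dataclass
class EnvironmentalSurvey:
    """One completed observation form for a single block (hypothetical layout)."""
    street_name: str
    drivable_lanes: int
    parking_lanes: int
    median_present: bool
    traffic_volume: str
    street_lights: int
    broken_street_lights: int
    abandoned_autos: int
    # Counts of people keyed by (sex, age group, activity).
    people: Dict[Tuple[str, str, str], int] = field(default_factory=dict)

    def __post_init__(self) -> None:
        # Closed-ended items accept only the categories printed on the form.
        if self.traffic_volume not in TRAFFIC_LEVELS:
            raise ValueError(f"unknown traffic volume: {self.traffic_volume!r}")
        if self.broken_street_lights > self.street_lights:
            raise ValueError("broken street lights cannot exceed total street lights")

    def tally(self, sex: str, age_group: str, activity: str, n: int = 1) -> None:
        """Add n observed people to one cell of the people-and-activities grid."""
        if sex not in SEXES or age_group not in AGE_GROUPS or activity not in ACTIVITIES:
            raise ValueError("category not on the form")
        key = (sex, age_group, activity)
        self.people[key] = self.people.get(key, 0) + n

    def total_people(self) -> int:
        return sum(self.people.values())


# Hypothetical observation of one block.
block = EnvironmentalSurvey(
    street_name="Example St", drivable_lanes=2, parking_lanes=1,
    median_present=False, traffic_volume="light",
    street_lights=6, broken_street_lights=2, abandoned_autos=1,
)
block.tally("male", "teens", "hanging out", 3)
block.tally("female", "adult", "walking")
print(block.total_people())
```

Rejecting values that do not appear on the form mirrors the closed-ended design of the instrument: observers choose among fixed categories rather than writing free text.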


about using appropriate measures and data collection methods. Simply saying “I am going to conduct an observational study of youth gangs” restricts the research focus at the outset to the kinds of information that can be gathered through observation. Such a study may be useful and interesting, but a researcher is better advised to consider what data collection methods are necessary in any particular study.

A long-term research project on community policing in Chicago draws on data from surveys, field observation, and police records (Chicago Community Policing Evaluation Consortium 2003). As just one example, researchers studied what sorts of activities and discussions emerged at community meetings in 130 of the city’s 270 police beats that covered residential areas. Observers attended meetings, making detailed notes and completing structured observation forms. One section of the form, shown in Figure 8.5, instructed observers to make notes of specific types of neighborhood problems that were discussed at the meeting (Bennis, Skogan, and Steiner 2003). Here is an excerpt from the narrative notes that supplemented this section of the form:

They . . . had very serious concerns in regard to a dilapidated building in their block that was being used for drug sales. The drug seller’s people were also squatting in the basement of the building. The main concern was that the four adults who were squatting also had three children under the age of four with them. (Chicago Community Policing Evaluation Consortium 2003, 37)

5. Location code (circle one)

1. Police station

2. Park building

3. Library

4. Church

5. Bank

6. Other government

7. Hospital

8. Public housing facility

9. Private facility

10. Restaurant

11. Other not-for-profit

Count the house 30 minutes after the meeting. Exclude police in street clothes . . . and others that you can identify as non-residents.

8. ______ Total number residents attending

Note Problems Discussed

1. Drugs (include possibles)

____ ____ (big / small)
Street sales or use
Building used for drugs
Drug-related violence

9. Physical decay

____ ____ (big / small)
Abandoned buildings
Run-down buildings
Abandoned cars
Graffiti and vandalism
Illegal dumping

Figure 8.5 Excerpts from Chicago Beat Meeting Observation Form
Source: Adapted from Bennis, Skogan, and Steiner (2003, Appendix 1).


Field Research on Speeding and Traffic Enforcement

Field research has been an important element in studies of racial profiling for two reasons. First, field research has provided measures of driver behavior that are not dependent on police records. As we have seen in earlier chapters, it is important to compare police records of stops to some other source of information. Second, field research has provided insights into traffic enforcement, an area of policing not much studied by researchers. Field research has also covered the wide range of applications from highly structured counting to less structured field observation and interviews.

Field Measures of Speed

Studies of racial profiling in three states used highly structured techniques to measure the speed of vehicles. The most sophisticated equipment was used by Lange and associates in New Jersey. Here’s how the authors described their setup:

The digital photographs were captured by a TC-2000 camera system, integrated with an AutoPatrol PR-100 radar system, provided by Transcore, Inc. The equipment, other than two large strobe lights, was mounted inside an unmarked van, parked behind preexisting guide rails along the turnpike. The camera and radar sensor pointed out of the van’s back window toward oncoming traffic. The two strobe lights were mounted on tripods behind the van and directed toward oncoming traffic. Transcore’s employees operated the equipment. (2005, 202)

Equipment was programmed to photograph every vehicle exceeding the speed limit by 15 or more miles per hour. Operators also photographed and timed samples of 25 to 50 other vehicles per hour. Elsewhere we have described the other element of observation: coding the appearance of driver race from photographs.
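The photographing rule amounts to a simple threshold test on each radar reading. The sketch below is a hedged illustration in Python, not a description of the Transcore equipment: the function names, the 65 mph posted limit, and the sample readings are all assumptions.

```python
from typing import List, Sequence, Tuple


def exceeds_threshold(speed_mph: float, limit_mph: float,
                      margin_mph: float = 15.0) -> bool:
    """True if the vehicle is at least `margin_mph` over the posted limit."""
    return speed_mph - limit_mph >= margin_mph


def split_readings(readings: Sequence[float], limit_mph: float,
                   margin_mph: float = 15.0) -> Tuple[List[float], List[float]]:
    """Separate readings that always trigger a photo from all other readings."""
    flagged = [s for s in readings if exceeds_threshold(s, limit_mph, margin_mph)]
    others = [s for s in readings if not exceeds_threshold(s, limit_mph, margin_mph)]
    return flagged, others


# Hypothetical radar readings on a road with an assumed 65 mph limit.
readings = [62.0, 71.5, 80.0, 83.0, 66.0]
flagged, others = split_readings(readings, limit_mph=65.0)
print(flagged)  # vehicles photographed automatically
print(others)   # pool from which operators would draw the hourly sample
```

Vehicles in the second list correspond to the "other vehicles" that operators photographed and timed in samples of 25 to 50 per hour.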

Pennsylvania researchers also used radar to measure the speed of vehicles in selected

In addition, observers distributed questionnaires to community residents and police officers attending each meeting. Items asked how often people attended beat meetings, what sorts of other civic activities they pursued, and whether they thought various other issues were problems in their neighborhood. As an example, the combination of field observation data and survey questionnaires enabled researchers to assess the degree of general social activism among those who attended beat meetings.

Field research can also be conducted after a survey. For example, a survey intended to measure fear of crime and related concepts might ask respondents to specify any area near their residence that they believe is particularly dangerous. Follow-up field visits to the named areas could then be conducted, during which observers record information about physical characteristics, land use, numbers of people present, and so forth.

The box titled “Conducting a Safety Audit” describes how structured field observations were combined with a focus group discussion to assess the scope of environmental design changes in Toronto, Canada.

The flexibility of field methods is one reason observation and field interviews can readily be incorporated into many research projects. And field observation often provides a much richer understanding of a phenomenon that is imperfectly revealed through a survey questionnaire.

Illustrations of Field Research

Examples illustrate different applications of field research to study speeding, traffic enforcement, and violence in bars.

Before concluding this chapter on field research, let’s examine some illustrations of the method in action. These descriptions will provide a clearer sense of how researchers use field observations and interviews in criminal justice research.


locations throughout the state. Their procedures were less automated, relying on teams of two observers in a car parked on the side of sampled roadways. Undergraduate students at Pennsylvania State University served as observers. They were trained by Pennsylvania state police in the use of radar equipment, completing the same classroom training that was required of troopers. Additional training for observers was conducted on samples of roadways by the project director. State police were trained to operate radar equipment, but not to combine it with systematic field observation of driver characteristics. That was an important research task, however. Engel et al. (2004) describe training and field procedures in detail. Their simple field observation form is included in an appendix to their report (2004, 312).

William Smith and associates (Smith, Tomaskovic-Devey, Zingraff, et al. 2003) tried but rejected stationary observation as a technique for recording speed and observing drivers. They cited the high speed of passing vehicles and glare from windows as problems they encountered. Instead, a research team used mobile observation techniques, observing

CONDUCTING A SAFETY AUDIT

Gisela Bichler-Robertson
California State University at San Bernardino

A safety audit involves a careful inventory of specific environmental and situational factors that may contribute to feelings of discomfort, fear of victimization, or crime itself. The goal of a safety audit is to devise recommendations that will improve a specific area by reducing fear and crime.

Safety audits combine features of focus groups and structured field observations. To begin, the researcher assembles a small group of individuals (10 or fewer) considered to be vulnerable. Examples include: senior citizens, physically challenged individuals, young women who travel alone, students, youth, and parents with young children. Assembling diverse groups helps to identify a greater variety of environmental and situational factors for the particular area.

After explaining safety audit procedures, an audit leader then takes the group on a tour of the audit site. Since perceptions differ by time of day, at least two audits are conducted for each site—one during daylight and one after dark.

When touring audit sites, individuals do not speak to one another. The audit leader instructs group members to imagine that they are walking through the area alone. Each person is equipped with a structured form for documenting their observations and perceptions. Forms vary, depending on the group and site. In general, however, safety audit participants are instructed to document the following items:

1. Before walking through the area, briefly describe the type of space you are reviewing (e.g., a parking deck, park, shopping district). Record the number of entrances, general volume of users, design of structures, materials used in design, and type of lighting.

2. Complete the following while walking through the area.

General feelings of safety:
■ Identify the places in which you feel unsafe and uncomfortable.
■ What is it about each place that makes you feel this way?
■ Identify the places in which you feel safe.
■ What is it about each place that makes you feel this way?

General visibility:
■ Can you see very far in front of you?
■ Can you see behind you?
■ Are there any structures or vegetation that restrict your sightlines?
■ How dense are the trees/bushes?
■ Are there any hiding spots or entrapment zones?
■ Is the lighting adequate? Can you see the face of someone 15 meters in front of you?
■ Are the paths/hallways open or are they very narrow?
■ Are there any sharp corners (90° angles)?


Observing New Jersey State Police

Other research in New Jersey used less structured field observation techniques. This was because the research purpose was less structured: learning about the general nature of traffic enforcement on New Jersey highways. In a series of studies, researchers from Rutgers University (Maxfield and Kelling 2005; Maxfield and Andresen 2002; Andresen 2005) were interested in the mechanics of making traffic stops and what kinds of things troopers considered in deciding which vehicles to stop. Researchers have long accompanied municipal police on patrol, and a

drivers and timing cars that passed them. Radar was also considered and rejected because it was feared vehicles having radar detectors, said to be common in North Carolina, would slow down when nearing the research vehicle. Worse, Smith and associates report that truck drivers quickly broadcast word of detected radar, thus eroding the planned unobtrusive measure.

As you can see, the observational component of research in these three states varied quite a lot. Reading the detailed reports from each study offers valuable insights into the kinds of things field researchers must consider.

Perceived control over the space:
■ Could you see danger approaching with enough time to choose an alternative route?
■ Are you visible to others on the street or in other buildings?
■ Can you see any evidence of a security system?

Presence of others:
■ Does the area seem to be deserted?
■ Are there many women around?
■ Are you alone in the presence of men?
■ What do the other people seem to be doing?
■ Are there any undesirables: vagrants (homeless or beggars), drunks, etc.?
■ Do you see people you know?
■ Are there any police or security officers present?

General safety:
■ Do you have access to a phone or other way of summoning help?
■ What is your general perception of criminal behavior?
■ Are there any places where you feel you could be attacked or confronted in an uncomfortable way?

Past experience in this space:
■ Have you been harassed in this space?
■ Have you heard of anyone who had a bad experience in this place (any legends or real experiences)?
■ Is it likely that you may be harassed here (e.g., drunk young men coming out of the pub)?
■ Have you noticed any social incivilities (minor deviant behavior, i.e., public drinking, vandalism, roughhousing, or skateboarding)?
■ Is there much in the way of physical incivilities (broken windows, litter, broken bottles, vandalism)?

Following the site visit, the group finds a secure setting for a focused discussion of the various elements they identified. Harvesting observations about good and bad spaces helps to develop recommendations for physical improvement. Group members may also share perceptions and ideas about personal safety. This process should begin with a brainstorming discussion and finish with identifying the key issues of concern and most reasonable recommendations for addressing those issues.

This method of structured observation has proven to be invaluable. Much of the public space in Toronto, including university campuses, public parks, transportation centers, and garages, has been improved through such endeavors.

Source: Adapted from materials developed by the Metro Action Committee on Public Violence Against Women and Children (METRAC) (Toronto, Canada: METRAC, 1987). Used by permission.


number of studies have documented their efforts. But, as Andresen points out, only a handful of studies have examined traffic enforcement, and even fewer considered state police.

To study video recording cameras in state police cars, Maxfield and Andresen (2002) rode with state police and watched the equipment in use. They learned that sound quality of recordings was often poor, for a variety of reasons associated with microphones and wireless transmittal. It was initially hoped that video records might make it possible to classify the race of drivers, but after watching in-car video monitors the researchers confirmed that poor image quality undermined the potential reliability of that approach. The Rutgers researchers expected that troopers would be on their best behavior. But they did witness actions by troopers to avoid recording sound and/or images on a few occasions. Even though people behave differently when accompanied by researchers, it’s not uncommon for police to let their guard down a little.

Andresen accompanied troopers on 57 patrols overall, conducting unstructured interviews during the several hours he spent with individual troopers. He adopted the common practice of using an interview guide, a list of simple questions he planned to ask in the field. He took extensive notes while riding, and repeatedly told troopers they could examine his notes. Andresen observed more than 150 traffic stops, writing field notes to document who was involved, reasons for the stop, what actions troopers took, and post-stop comments from troopers. He reports that most troopers seemed to enjoy describing their work. And, as you might imagine, troopers’ commentary about traffic enforcement was very interesting.

Bars and Violence

Researchers in the first example conducted systematic observations for specific purposes and produced quantitative estimates of speeding traffic stops. Field research is commonly used in more qualitative studies as well, in which precise quantitative estimates are neither available nor needed. A fascinating series of studies of violence in Australian bars by Ross Homel and associates provides an example (Homel, Tomsen, and Thommeny 1992; Homel and Clark 1994).

Homel and associates set out to learn how various situational factors related to public drinking might promote or inhibit violence in Australian bars and nightclubs. Think for a moment about how you might approach their research question: “whether alcohol consumption itself contributes in some way to the likelihood of violence, or whether aspects of the drinkers or of the drinking settings are the critical factors” (1992, 681). Examining police records might reveal that assault reports are more likely to come from bars than from, say, public libraries. Or a survey might find that self-reported bar patrons were more likely to have witnessed or participated in violence than respondents who did not frequent bars or nightclubs. But neither of these approaches will yield measures of the setting or situational factors that might provoke violence. Field research can produce direct observation of barroom settings and is well suited to addressing the question framed by Homel and associates.

Researchers began by selecting 4 high-risk and 2 low-risk sites on the basis of Sydney police records and preliminary scouting visits. These 6 sites were visited five or more times, and an additional 16 sites were visited once in the course of scouting.

Visits to bars were conducted by pairs of observers who stayed two to six hours at each site. Their role is best described as complete participant because they were permitted one alcoholic drink per hour and otherwise played the role of bar patron. Observers made no notes while on-site. As soon as possible after leaving a bar, they wrote up separate narrative accounts. Later, at group meetings of observers and research staff, the narrative accounts were discussed and any discrepancies resolved. Narratives were later coded by research staff to identify categories of situations, people, and activities that seemed to be associated with the likelihood of violence.


working-class males. However, these personal characteristics were deemed less important than the flow of people in and out of bars. Violent incidents were often triggered when groups of males entered a club and encountered other groups of males they did not know.

Physical features mattered little unless they contributed to crowding or other adverse characteristics of the social atmosphere. Chief among social features associated with violence were discomfort and boredom. A crowded, uncomfortable bar with no entertainment spelled trouble.

Drinking patterns made a difference; violent incidents were most common when bar patrons were very drunk. More importantly, certain management practices seemed to produce more drunk patrons. Fewer customers were drunk in bars that had either a restaurant or a snack table. Bars with high cover charges and cheap drinks produced a high density of drunk patrons and violence. The economics of this situation are clear: if you must pay to enter a bar that serves cheap drinks, you’ll get more for your money by drinking a lot.

The final ingredient found to contribute to violence was aggressive bouncers. “Many bouncers seem poorly trained, obsessed with their own machismo (relating badly to groups of male strangers), and some of them appear to regard their employment as a license to assault people” (1992, 688). Rather than reducing violence by rejecting unruly patrons, bouncers sometimes escalated violence by starting fights.

Field observation was necessary to identify what situations produce violence in bars. No other way of collecting data could have yielded the rich and detailed information that enabled Homel and associates to diagnose the complex relationships that produce violence in bars:

Violent incidents in public drinking locations do not occur simply because of the presence of young or rough patrons or because of rock bands, or any other single variable. Violent occasions are characterized by subtle interactions of several

These eventually included physical and social atmosphere, drinking patterns, characteristics of patrons, and characteristics of staff.

The researchers began their study by assuming that some thing or things distinguished bars in which violence was common from those in which it was less common. After beginning their fieldwork, however, Homel and associates (1992, 684) realized that circumstances and situations were the more important factors:

During field research it soon became apparent that the violent premises are for most of the time not violent. Violent occasions in these places seemed to have characteristics that clearly marked them out from nonviolent times. . . . This unexpectedly helped us refine our ideas about the relevant situational variables, and to some extent reduced the importance of comparisons with the premises selected as controls.

In other words, the research question was partly restated. What began as a study to determine why some bars in Sydney were violent was revised to determine what situations seemed to contribute to violence.

This illustrates one of the strengths of field research: the ability to make adjustments while in the field. You may recognize this as an example of inductive reasoning. Learning that even violent clubs were peaceful most of the time, Homel and associates were able to focus observers’ attention on looking for more specific features of the bar environment and staff and patron characteristics. Such adjustments on the fly would be difficult, if not impossible, if you were doing a survey.

Altogether, field observers made 55 visits to 23 sites, for a total of about 300 hours of field observation. During these visits, observers witnessed 32 incidents of physical violence. Examining detailed field notes, researchers attributed violent incidents to a variety of interrelated factors.
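A little arithmetic on these totals helps show why the researchers concluded that even violent premises are peaceful most of the time. The Python sketch below uses only the figures reported above; because the 300 hours is itself described as approximate, the derived rates are rough as well, and the variable names are our own.

```python
# Totals reported for the Sydney bar study (hours are approximate).
visits = 55
hours = 300
incidents = 32

hours_per_visit = hours / visits        # average length of a visit
incidents_per_hour = incidents / hours  # violent incidents per observation hour
hours_per_incident = hours / incidents  # observation hours per violent incident

print(f"Average visit length: {hours_per_visit:.2f} hours")        # 5.45
print(f"Incidents per hour observed: {incidents_per_hour:.3f}")     # 0.107
print(f"Hours of observation per incident: {hours_per_incident:.3f}")  # 9.375
```

On these figures, observers sat through roughly nine hours of ordinary bar activity for every violent incident they witnessed, which is consistent with the shift in focus from violent places to violent situations.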

With respect to patrons, violence was most likely to break out in bars frequented by young,


estimates for a large population of behaviors beyond those actually observed. However, because it is difficult to know the total population of given phenomena (shoppers or drivers, for example), precise probability samples cannot normally be drawn. In designing a quantitative field study or assessing the representativeness of some other study, researchers must think carefully about the density and predictability of what will be observed. Then they must decide whether sampling procedures are likely to tap representative instances of cases they will observe.

More generally, the advantages and disadvantages of different types of field studies can be considered in terms of their validity, reliability, and generalizability. As we have seen, validity and reliability are both qualities of measurements. Validity concerns whether measurements actually measure what they are supposed to, not something else. Reliability is a matter of dependability: if researchers make the same measurement again and again, will they get the same result? Note that some examples we described in this chapter included special steps to improve reliability. Finally, generalizability refers to whether specific research findings apply to people, places, and things not actually observed. Let’s see how field research stacks up in these respects.

Validity

Recall our discussion in Chapter 7 of some of the limitations of using survey methods to study domestic violence. An alternative is a field study in which the researcher interacts at length with victims of domestic violence. The relative strengths of each approach are nicely illustrated in a pair of articles that examine domestic violence in England. Chapter 7 quoted from Catriona Mirrlees-Black’s (1995) article on domestic violence as measured in the British Crime Survey. John Hood-Williams and Tracey Bush (1995) provide a different perspective through their study published in the same issue of the Home Office Research Bulletin.

variables. Chief among these are groups of male strangers, low comfort, high boredom, high drunkenness, as well as aggressive and unreasonable bouncers and floor staff (1992, 688).

Strengths and Weaknesses of Field Research

Validity is usually a strength of field research, but reliability and generalizability are sometimes weaknesses.

As we have seen, field research is especially effective for studying the subtle nuances of behavior and for examining processes over time. For these reasons, the chief strength of this method is the depth of understanding it permits.

Flexibility is another advantage of field research. Researchers can modify their research design at any time. Moreover, they are always prepared to engage in qualitative field research if the occasion arises, whereas launching a survey requires considerable advance work.

Field research can be relatively inexpensive. Other research methods may require costly equipment or a large research staff, but field research often can be undertaken by one researcher with a notebook and a pen. This is not to say that field research is never expensive. The studies of speeding and race profiling required many trained observers. Expensive recording equipment may be needed, or the researcher may wish to travel to Australia to replicate the study by Homel and associates.

Field research has its weaknesses, too. First, qualitative studies seldom yield precise descriptive statements about a large population. Observing casual discussions among corrections officers in a cafeteria, for example, does not yield trustworthy estimates about prison conditions. Nevertheless, it could provide important insights into some of the problems facing staff and inmates in a specific institution.

Second, field observation can produce systematic counts of behaviors and reasonable


As the writing proceeded, we read various parts of the manuscript to selected members of our sample. This allowed us to check our interpretations against those of insiders and to enlist their help in reformulating passages they regarded as misleading or inaccurate. . . . The result of using this procedure, we believe, is a book that faithfully conveys the offender’s perspective on the process of committing residential burglaries. (1994, 33–34)

This approach is possible only if subjects are aware of the researcher’s role as a researcher. In that case, having informants review draft field notes or interview transcripts can be an excellent strategy for improving validity.

Reliability

Qualitative field research can have a potential problem with reliability. Suppose you characterize your best friend’s political orientations based on everything you know about him or her. There’s certainly no question that your assessment of that person’s politics is at least somewhat idiosyncratic. The measurement you arrive at will appear to have considerable validity. We can’t be sure, however, that someone else will characterize your friend’s politics the same way you do, even with the same amount of observation.

Field research measurements, even in-depth ones, are also often very personal. If, for example, you wished to conduct a field study of bars and honky-tonks near your campus, you might judge levels of disorder on a Friday night to be low or moderate. In contrast, older adults might observe the same levels of noise and commotion and rate levels of disorder as intolerably high. How we interpret the phenomena we observe depends very much on our own experiences and preferences.

The reliability of quantitative field studies can be enhanced by careful attention to the details of observation. Environmental surveys in particular can promote reliable observations

Tracey Bush lived in a London public housing project (termed housing estate in England) for about five years. This enabled her to study domestic violence in a natural setting: “The views of men and women on the estate about relationships and domestic violence have been gathered through the researcher’s network of friends, neighbours, acquaintances, and contacts” (Hood-Williams and Bush 1995, 11). Through long and patient fieldwork, Bush learned that women sometimes normalize low levels of violence, seeing it as an unfortunate but unavoidable consequence of their relationship with a male partner. When violence escalates, victims may blame themselves. Victims may also remain in an abusive relationship in hopes that things will get better:

She reported that she wanted the companionship and respect that she had received at the beginning of the relationship. It was the earlier, nonviolent man, whom she had met and fallen in love with, that she wanted back. (1995, 13)

Mirrlees-Black (1995) notes that measuring domestic violence is “difficult territory” in part because women may not recognize assault by a partner as a crime. Field research such as that by Hood-Williams and Bush offers an example of this phenomenon and helps us understand why it exists.

In field research, validity often refers to whether the intended meaning of the things observed or people interviewed has been accurately captured. In the case of interviews, Joseph Maxwell (2005) suggests getting feedback on the measures from the people being studied. For example, Wright and Decker (1994) conducted lengthy semistructured interviews with their sample of burglars. The researchers recognized that their limited understanding of the social context of burglary may have produced some errors in interpreting what they learned from subjects. To guard against this, Wright and Decker had some of their subjects review what they thought they had learned:


several forms. One experience involved learning about radar speed enforcement on a 50-mile segment of the New Jersey Turnpike. Maxfield accompanied troopers on a thorough tour of this segment, identifying where radar units were routinely stationed (termed fishing holes by troopers). In his fieldwork, he also examined physical characteristics of the roadway; patterns of in- and out-of-state travel; and areas where entrance ramps, slight upward grades, and other features affected vehicle speed. Finally, he gained extensive information on priorities and patterns in speed enforcement, learning what affects troopers’ decisions to stop certain vehicles.

As a result, Maxfield has detailed knowledge about that 50-mile segment of the New Jersey Turnpike. How generalizable is that knowledge? In one sense, learning about fishing holes in very specific terms can help identify such sites on other roads. And learning how slight upward grades can slow traffic in one situation may help us understand traffic on other upward grades. But a detailed, idiosyncratic understanding of 50 miles of highway is just that: idiosyncratic. Knowing all there is to know about a straight, largely level stretch of limited-access toll road with few exits is not generalizable to other roadways (winding roads in mountainous areas with many exits, for example).

At the same time, some field studies are less rooted in the local context of the subject under study. Wright and Decker (1994) studied burglars in St. Louis, and it’s certainly reasonable to wonder whether their findings apply to residential burglars in St. Petersburg, Florida. The actions and routines of burglars might be affected by local police strategies, differences in the age or style of dwelling units, or even the type and amount of vegetation screening buildings from the street. However, Wright and Decker draw general conclusions about how burglars search for targets, what features of dwellings signal vulnerability, how opportunistic knowledge can trigger an offense, and what strategies exist for fencing stolen goods. It’s likely that their findings about the technology and incentives

by including detailed instructions on how to classify what is observed. Reliability can be strengthened by reviewing the products of field observations. Homel and associates sought to increase the reliability of observers’ narrative descriptions by having group discussions about discrepancies in reports by different observers.

In a more general sense, reliability will increase as the degree of interpretation required in making actual observations decreases. Participant observation or unstructured interviews may require a considerable degree of interpretation on the part of the observer, and most of us draw on our own experiences and backgrounds in interpreting what we observe. At another extreme, electronic devices and machines can produce very reliable counts of persons who enter a store or of cars that pass some particular point. Somewhere in the middle are field-workers who observe motorists or pedestrians and tabulate a specific behavior.
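One way to see how much two observers differ is to compare their classifications case by case. The following Python sketch is offered only as an illustration, not as the procedure Homel and associates used. It assumes two hypothetical observers have each rated the same eight settings as low, moderate, or high in disorder; the function names and the ratings are invented for this example. Cohen's kappa, a common chance-corrected agreement statistic, is brought in here as an assumption and is not part of the discussion above.

```python
from collections import Counter


def percent_agreement(ratings_a, ratings_b):
    """Share of cases on which two observers assigned the same category."""
    if len(ratings_a) != len(ratings_b) or not ratings_a:
        raise ValueError("ratings must be non-empty and of equal length")
    matches = sum(a == b for a, b in zip(ratings_a, ratings_b))
    return matches / len(ratings_a)


def cohens_kappa(ratings_a, ratings_b):
    """Observed agreement corrected for the agreement expected by chance."""
    n = len(ratings_a)
    observed = percent_agreement(ratings_a, ratings_b)
    counts_a = Counter(ratings_a)
    counts_b = Counter(ratings_b)
    # Chance agreement: product of each observer's category proportions, summed.
    expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both observers used one identical category throughout
    return (observed - expected) / (1 - expected)


# Hypothetical disorder ratings of the same eight settings by two observers.
observer_1 = ["low", "low", "moderate", "high", "low", "moderate", "high", "low"]
observer_2 = ["low", "moderate", "moderate", "high", "low", "high", "high", "low"]

agreement = percent_agreement(observer_1, observer_2)
kappa = cohens_kappa(observer_1, observer_2)
print(f"percent agreement = {agreement:.2f}, kappa = {kappa:.2f}")
```

With these made-up ratings the observers agree on six of the eight settings, and kappa is about 0.62 once chance agreement is taken into account. The less interpretation a classification scheme requires, the closer both figures should come to 1.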

Generalizability

One of the chief goals of social science is generalization. We study particular situations and events to learn about life in general. Generalizability can be a problem for qualitative field research. It crops up in two forms.

First, the personal nature of the observations and measurements made by the researcher can produce results that will not necessarily be replicated by another independent researcher. If the observation depends in part on the individual observers, it is more valuable as a source of particular insight than as a general truth. You may recognize the similarity between this and the more general issue of reliability.

Second, because field researchers get a full and in-depth view of their subject matter, they can reach an unusually comprehensive understanding. By its very comprehensiveness, however, this understanding is less generalizable than results based on rigorous sampling and standardized measurements.

For example, Maxfield’s observational research with the New Jersey State Police took


• If field observations will be made on a phenomenon that occurs with some degree of regularity, purposive sampling techniques can be used to select cases for observation.

• Alternatives for recording field observations range from video, audio, and other equipment to unstructured field notes. In between are observations recorded on structured forms; environmental surveys are examples.

• Field notes should be planned in advance to the greatest extent possible. However, note taking should be flexible enough to make records of unexpected observations.

• Compared with surveys, field research measurements generally have more validity but less reliability, and field research results cannot be generalized as safely as those based on rigorous sampling and standardized questionnaires.

✪ Key Terms

environmental survey, p. 216
snowball sampling, p. 210

✪ Review Questions and Exercises

1. Think of some group or activity you participate in or are familiar with. In two or three paragraphs, describe how an outsider might effectively go about studying that group or activity. What should he or she read, what contacts should be made, and so on?

2. Review the box titled “Conducting a Safety Audit” by Gisela Bichler-Robertson on page 220. Try conducting a safety audit on your campus or in an area near your campus.

3. Many police departments encourage citizen ride-alongs as a component of community policing. If this is the case for a police or sheriff’s department near you, take advantage of this excellent opportunity to test your observation and unstructured interviewing skills.

✪ Additional Readings

Bureau of Justice Assistance, A Police Guide to Surveying Citizens and Their Environment (Washington, DC: U.S. Department of Justice, Office of Justice Programs, Bureau of Justice Assistance, 1993). Intended for use in community policing initiatives, this publication is a useful source of ideas about conducting structured observations. Appendixes include detailed examples

that affect St. Louis burglars apply generally to residential burglars in other cities.

In reviewing reports of field research projects, it’s important to determine where and to what extent the researcher is generalizing beyond her or his specific observations to other settings. Such generalizations may be in order, but it is necessary to judge that. Nothing in this research method guarantees it.

As we’ve seen, field research is a potentially powerful tool for criminal justice research, one that provides a useful balance to surveys.

✪ Main Points

• Field research is a data collection method that involves the direct observation of phenomena in their natural settings.

• Field observation is usually the preferred data collection method for obtaining information about physical or social settings, behavior, and events.

• Field research in criminal justice may produce either qualitative or quantitative data. Grounded theory is typically built from qualitative field observations. Observations that can be quantified may produce measures for hypothesis testing.

• Observations made through field research can often be integrated with data collected from other sources. In this way, field observations can help researchers interpret other data.

• Asking questions through a form of specialized interviewing is often integrated with field observation.

• Field researchers may or may not identify themselves as researchers to the people they are observing. Being identified as a researcher may have some effect on what is observed.

• Preparing for the field involves negotiating or arranging access to subjects. Specific strategies depend on whether formal organizations, subcultures, or something in between are being studied.

• Controlled probability sampling techniques are not usually possible in field research.

• Snowball sampling is a method for acquiring an ever-increasing number of sample observations. One participant is asked to recommend others for interviewing, and each of these other participants is asked for more recommendations.
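The referral process behind snowball sampling can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `referrals` dictionary is a hypothetical record of whom each participant would recommend, the participant labels are invented, and the `max_size` cutoff stands in for whatever stopping rule a researcher actually adopts. Interviewing people in the order they are named (breadth-first) is a choice made for this sketch, not something the method requires.

```python
from collections import deque


def snowball_sample(referrals, seeds, max_size):
    """Interview the seeds, then the people they name, then the people those name."""
    sampled = []
    seen = set(seeds)
    queue = deque(dict.fromkeys(seeds))  # drop duplicate seeds, keep their order
    while queue and len(sampled) < max_size:
        person = queue.popleft()
        sampled.append(person)
        # Ask this participant for recommendations; skip anyone already named.
        for referred in referrals.get(person, []):
            if referred not in seen:
                seen.add(referred)
                queue.append(referred)
    return sampled


# Hypothetical referral network: each participant names others to interview.
referrals = {
    "A": ["B", "C"],
    "B": ["D"],
    "C": ["D", "E"],
    "D": [],
    "E": ["A", "F"],
}

sample = snowball_sample(referrals, seeds=["A"], max_size=5)
print(sample)
```

Starting from participant A, the sample grows one wave of recommendations at a time until the cutoff is reached. Because everyone enters through someone else's referral, the result depends heavily on the starting participants, which is one reason such samples cannot be treated as probability samples.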


Patton, Michael Quinn, Qualitative Research and Evaluation Methods, 3rd ed. (Thousand Oaks, CA: Sage, 2001). We mentioned this book in Chapter 7 as a good source of guidance on questionnaire construction. Patton also offers in-depth information on observation techniques, along with tips on conducting unstructured and semistructured field interviews. In addition, Patton describes a variety of purposive sampling techniques for qualitative interviewing and field research.

Smith, Steven K., and Carolyn C. DeFrances, Assessing Measurement Techniques for Identifying Race, Ethnicity, and Gender: Observation-Based Data Collection in Airports and at Immigration Checkpoints (Washington, DC: Bureau of Justice Statistics, 2003). Racial profiling and the September 11, 2001, attacks on New York and Washington prompted researchers and public officials to consider observational studies of drivers and others. This report by the Bureau of Justice Statistics describes experiments to test observation methods.

of environmental surveys. You can also download this publication in text form (no drawings) from the Web (www.ncjrs.org/txtfiles/polc.txt; accessed May 12, 2008).

Felson, Marcus, Crime and Everyday Life, 3rd ed. (Thousand Oaks, CA: Sage, 2002). We mentioned this book in Chapter 2 as an example of criminal justice theory. Many of Felson’s explanations of how everyday life is linked to crime describe physical features of cities and other land use patterns. This entertaining book suggests many opportunities for conducting field research.

Miller, Joel, Profiling Populations Available for Stops and Searches (London: Home Office Policing and Reducing Crime Unit, 2000). Race-biased policing has been a concern in England for many years. This report presents a thorough description of observation to produce baseline measures of populations eligible to be stopped by police. Similar efforts have been underway in many U.S. states and cities (www.homeoffice.gov.uk/rds/policerspubs1.html; accessed May 12, 2008).


Chapter 9

Agency Records, Content Analysis, and Secondary Data

We’ll examine three sources of existing data: agency records, content analysis, and data collected by other researchers. Data from these sources have many applications in criminal justice research.

Introduction 230

Topics Appropriate for Agency Records and Content Analysis 230

Types of Agency Records 232

Published Statistics 232

Nonpublic Agency Records 234

New Data Collected by Agency Staff 236

IMPROVING POLICE RECORDS OF DOMESTIC VIOLENCE 238

Reliability and Validity 239

Sources of Reliability and Validity Problems 240

HOW MANY PAROLE VIOLATORS WERE THERE LAST MONTH? 242

Content Analysis 244

Coding in Content Analysis 244

Illustrations of Content Analysis 246

Secondary Analysis 247

Sources of Secondary Data 248

Advantages and Disadvantages of Secondary Data 249


Introduction

Agency records, secondary data, and content analysis do not require direct interaction with research subjects.

Except for the complete observer role in field research, the modes of observation discussed so far require the researcher to intrude to some degree into whatever he or she is studying. This is most obvious with survey research. Even the field researcher, as we’ve seen, can change things in the process of studying them.

Other ways of collecting data do not involve intrusion by observers. In this chapter, we’ll consider three different approaches to using information that has been collected by others, often as a routine practice.

First, a great deal of criminal justice research uses data collected by state and local agencies such as police, criminal courts, probation offices, juvenile authorities, and corrections departments. Government agencies gather a vast amount of crime and criminal justice data, probably rivaled only by efforts to produce economic and public health indicators. We refer to such information as “data from agency records.” In this chapter, we will describe different types of such data that are available for criminal justice research, together with the promise and potential pitfalls of using information from agency records.

Second, in content analysis, researchers examine a class of social artifacts: written documents or other types of messages. Suppose you want to contrast the importance of criminal justice policy and health care policy for Americans in 1992 and 2004. One option is to examine public-opinion polls from these years. Another method is to analyze articles from newspapers published in each year. The latter is an example of content analysis: the analysis of communications.

Finally, information collected by others is frequently used in criminal justice research, which in this case involves secondary analysis of existing data. Investigators who conduct research funded by federal agencies such as the National Institute of Justice are usually obliged to release their data for public use. Thus if you were interested in developing sentence reform proposals for your state, you might analyze data collected by Nancy Merritt, Terry Fain, and Susan Turner in the study of Oregon’s efforts to increase sentence length for certain types of offenders (Merritt, Fain, and Turner 2006).

Keep in mind that most data we might obtain from agency records or research projects conducted by others are secondary data. Someone else gathered the original data, usually for purposes that differ from ours.

Topics Appropriate for Agency Records and Content Analysis

Agency records support a wide variety of research applications.

Data from agency records or archives may have originally been gathered in any number of ways, from sample surveys to direct observation. Because of this, such data may, in principle, be appropriate for just about any criminal justice research topic.

Published statistics and agency records are most commonly used in descriptive or exploratory studies. This is consistent with the fact that many of the criminal justice data published by government agencies are intended to describe something. For instance, the Bureau of Justice Statistics (BJS) publishes annual figures on prison populations. If we are interested in describing differences in prison populations between states or changes in prison populations from 1990 through 2005, a good place to begin is with the annual data published by BJS. Similarly, published figures on crimes reported to police, criminal victimization, felony court caseloads, drug use by high school seniors, and a host of other measures are available over time, for 25 years or longer in many cases.


Agency records may also be used in explanatory studies. Nancy Sinauer and colleagues (1999) examined medical examiner records for more than 1,000 female homicide victims in North Carolina to understand the relationship between female homicide and residence in urban versus rural counties. They found that counties on the outskirts of cities had higher female homicide rates than either urban or rural counties.

Agency records are frequently used in applied studies as well. Evaluations of new policies that seek to reduce recidivism might draw on arrest or conviction data for measures of recidivism. A study of drug courts as an alternative way to process defendants with substance abuse problems traced arrest records for experimental and control subjects (Gottfredson, Najaka, and Kearly 2003). In another study, Lawrence Winner and associates (Winner, Lanza-Kaduce, Bishop, and Frazier 1997) compared re-arrest records for juveniles transferred to the adult court system with those who had been kept in the juvenile system. Sometimes, records obtained from private firms can be used in applied studies; Gisela Bichler and Ronald Clarke (1996) examined records of international telephone calls in their evaluation of efforts to reduce telephone fraud in New York City.

In a different type of applied study, Pamela Lattimore and Joanna Baker (1992) combined data on prison releases and capacities, reincarcerations, and general-population forecasts to develop a mathematical model that predicts future prison populations in North Carolina. This is an example of forecasting, in which past relationships among arrest rates and prison sentences for different age groups are compared with estimates of future population by age group. Assuming that past associations between age, arrest, and prison sentence will remain constant in future years, demographic models of future population can be used to predict future admissions to prison.
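The logic of this kind of forecast can be shown with a deliberately simplified Python sketch. It is not the model Lattimore and Baker developed, which combined several more data sources. It assumes only that we have a past prison admission rate per 1,000 residents for each age group and a projected population for each group; all of the numbers and the function name are hypothetical.

```python
def forecast_admissions(admission_rates, projected_population):
    """Expected admissions per age group: rate per 1,000 times projected residents."""
    return {
        age_group: admission_rates[age_group] * projected_population[age_group] / 1000
        for age_group in admission_rates
    }


# Hypothetical past admission rates per 1,000 residents, by age group.
admission_rates = {"18-24": 6.0, "25-34": 4.0, "35-44": 2.0, "45+": 0.5}

# Hypothetical demographic forecast of residents in each age group.
projected_population = {
    "18-24": 900_000,
    "25-34": 1_200_000,
    "35-44": 1_100_000,
    "45+": 3_000_000,
}

by_age = forecast_admissions(admission_rates, projected_population)
total_admissions = sum(by_age.values())
print(by_age)
print(f"total projected admissions: {total_admissions:,.0f}")
```

The sketch makes the key assumption visible: the rates are held constant, so any change in projected admissions comes entirely from change in the size of each age group. If sentencing policy or arrest practices shift, the past rates no longer apply and the forecast will be off.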

Topics appropriate to research using content analysis center on the important links between communication, perceptions of crime problems, individual behavior, and criminal justice policy. The prevalence of violence on fictional television dramas has long been a concern of researchers and public officials (Anderson and Bushman 2002). Numerous attempts have been made to explore relationships between exposure to pornography and sexual assault (see, for example, Pollard 1995). Mass media also play an important role in affecting policy action by public officials. Many studies have examined the influence of media in setting the agenda for criminal justice policy (see, for example, Chermak and Weiss 1997).

Research data collected by other investigators, through surveys or field observation, may be used for a broad variety of later studies. National Crime Victimization Survey (NCVS) data have been used by a large number of researchers in countless descriptive and explanatory studies since the 1970s. In a notably ambitious use of secondary data, Robert Sampson and John Laub (1993) recovered life history data on 500 delinquents and 500 nondelinquents originally collected by Sheldon and Eleanor Glueck in the 1940s. Taking advantage of theoretical and empirical advances in criminological research over the ensuing 40 years, Sampson and Laub produced a major contribution to knowledge of criminal career development in childhood.

Existing data may also be considered as a supplemental source of data. For example, a researcher planning to survey correctional facility administrators about their views on the need for drug treatment programs will do well to examine existing data on the number of drug users sentenced to prison terms. Or, if we are evaluating an experimental morale-building program in a probation services department, statistics on absenteeism will be useful in connection with the data our own research will generate.

This is not to say that agency records and secondary data can always provide answers to research questions. If this were true, much of this book would be unnecessary. The key to distinguishing appropriate and inappropriate uses of agency records, content analysis, and secondary data is understanding how these written records are produced. We cannot emphasize this point too strongly. Much of this chapter will underscore the importance of learning where data come from and how they are gathered.

Types of Agency Records

Researchers use a variety of published statistics and nonpublic agency records.

Information collected by or for public agencies usually falls into one of three general categories: (1) published statistics, (2) nonpublic agency records routinely collected for internal use, and (3) new data collected by agency staff for specific research purposes. Each category varies in the extent to which data are readily available to the researcher and in the researcher’s degree of control over the data collection process.

Published Statistics

Most government organizations routinely collect and publish compilations of data. Examples are the Census Bureau, the FBI, the Administrative Office of U.S. Courts, the Federal Bureau of Prisons, and the BJS. Two of these organizations merit special mention. First, the Census Bureau conducts enumerations and sample surveys for several other federal organizations. Notable examples are the NCVS, Census of Children in Custody, Survey of Inmates in Local Jails, Correctional Populations in the United States, and Survey of Justice Expenditure and Employment.

Second, the BJS compiles data from several sources and publishes annual and special reports on most data series. For example, Criminal Victimization in the United States reports summary data from the NCVS each year. Table 9.1 presents a sample breakdown of victimization rates by race and home ownership from the report for 2004. The BJS also issues reports on people under correctional supervision. These are based on sample surveys and enumerations of jail, prison, and juvenile facility populations. Sample tabulations of females serving sentences in state and federal prison in 2004 are shown in Table 9.2. And the series Federal Criminal Case Processing reports detailed data on federal court activity.
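Because the rates in Table 9.1 are expressed per 1,000 households, they can be converted back into approximate numbers of incidents. The Python sketch below shows that arithmetic for the home ownership rows. It assumes the first rate column in the table is burglary; the function and variable names are ours, and the resulting counts are rough implied figures, not numbers published by the BJS.

```python
def implied_incidents(rate_per_1000, n_households):
    """Approximate number of incidents implied by a rate per 1,000 households."""
    return rate_per_1000 * n_households / 1000


# Burglary rates per 1,000 households and household counts from Table 9.1 (2004).
burglary_rate = {"Owned": 24.9, "Rented": 39.9}
households = {"Owned": 79_511_410, "Rented": 36_264_170}

implied = {
    tenure: implied_incidents(burglary_rate[tenure], households[tenure])
    for tenure in burglary_rate
}
# How many times higher the burglary rate is for renters than for owners.
rate_ratio = burglary_rate["Rented"] / burglary_rate["Owned"]

for tenure, count in implied.items():
    print(f"{tenure}: about {count:,.0f} burglaries")
print(f"renter-to-owner rate ratio: {rate_ratio:.2f}")
```

The comparison illustrates why rates are reported in the first place. Owner households experience more burglaries in absolute terms simply because there are more of them, yet the rate for renters is about 1.6 times higher.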

The most comprehensive BJS publication on criminal justice data is the annual Sourcebook of Criminal Justice Statistics. Since 1972, this report has summarized hundreds of criminal justice data series, ranging from public perceptions about crime to characteristics of criminal justice agencies, to tables on how states execute capital offenders. Data from private sources such as the Gallup Poll are included with statistics collected by government agencies. Most importantly, each year’s Sourcebook concludes with notes on data sources, appendixes summarizing data collection procedures for major series, and addresses of organizations that either collect or archive original data.

Table 9.1 Victimization Rates by Type of Crime, Form of Tenure, and Race of Head of Household, 2004 (rates per 1,000 households)

Household Crimes

                            Burglary   Motor Vehicle Theft   Theft   Total Number of Households
Race of Head of Household
  White                         27.6                   7.6   121.6                   95,605,550
  Black                         44.3                  15.6   130.6                   13,376,960
Home Ownership
  Owned                         24.9                   7.1   110.8                   79,511,410
  Rented                        39.9                  12.5   148.9                   36,264,170

Source: Adapted from Bureau of Justice Statistics (2006: Tables 16 and 56).

Compilations of published data on crime and criminal justice are readily available from many sources. For example, the Sourcebook is available on the Web (www.albany.edu/sourcebook; accessed May 14, 2008). The companion website to this book presents a more comprehensive list, together with some guidelines on how to get more information from the BJS and other major sources. At this point, however, we want to suggest some possible uses, and limits, of what Herbert Jacob (1984, 9) refers to as being “like the apple in the Garden of Eden: tempting but full of danger . . . [for] the unwary researcher.”

Referring to Tables 9.1 and 9.2, you may recognize that published data from series such as the NCVS or Correctional Populations in the United States are summary data, as discussed in Chapter 4. This means that data are presented in highly aggregated form and cannot be used to analyze the individuals from or about whom information was originally collected. For example, Table 9.2 shows that in 2004 a total of 2,789 women were serving sentences of one year or more in state and federal correctional institutions in New York. By comparing that figure with figures from other states, we can make some descriptive statements about prison populations in different states. We can also consult earlier editions of Correctional Populations to examine trends in prison populations over time or to compare rates of growth from state to state.

Summary data cannot, however, be used to reveal anything about individual correctional facilities, let alone facility inmates. Individual-level data about inmates and institutions are available from the Census Bureau in electronic form, but published tabulations present only summaries.
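Descriptive comparisons of the kind just mentioned need nothing more than the published totals. As a rough illustration, the Python sketch below uses three state rows and the Northeast total from Table 9.2. It takes the percent change for New York and New Jersey as negative (-4.3 and -3.1), and it backs out an implied 2003 total from the 2004 total and the percent change. The function name is ours, and the implied 2003 figures are approximations, since the published percentages are rounded.

```python
def implied_prior_total(current_total, percent_change):
    """Prior-year total implied by the current total and the percent change."""
    return current_total / (1 + percent_change / 100)


# (total in 2004, percent change 2003-2004), transcribed from Table 9.2.
female_prisoners = {
    "New York": (2789, -4.3),
    "New Jersey": (1470, -3.1),
    "Massachusetts": (741, 4.7),
}
northeast_total_2004 = 8910

# Approximate 2003 totals, recovered from the summary figures alone.
implied_2003 = {
    state: implied_prior_total(total, change)
    for state, (total, change) in female_prisoners.items()
}
# New York's share of female prisoners in the Northeast region in 2004.
ny_share = female_prisoners["New York"][0] / northeast_total_2004

for state, prior in implied_2003.items():
    print(f"{state}: about {prior:,.0f} in 2003")
print(f"New York share of Northeast total: {ny_share:.1%}")
```

Everything computed here describes states and regions. Nothing in the table, and so nothing in the sketch, can tell us about a particular facility or a particular inmate.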

This is not to say that published data are useless to criminal justice researchers. Highly aggregated summary data from published statistical series are frequently used in descriptive, explanatory, and applied studies. For example, Eric Baumer and colleagues (Baumer, Lauritsen, Rosenfeld, and Wright 1998) examined Uniform Crime Report (UCR), Drug Abuse Warning Network (DAWN), and Drug Use Forecasting (DUF) data for 142 U.S. cities from 1984 to 1992. Interested in the relationship between crack cocaine use and robbery, burglary, and homicide rates, the researchers found that cities with high levels of crack cocaine use in the population also had high rates of robbery. In contrast, burglary rates were lower in cities with high levels of crack cocaine use.

Ted Robert Gurr used published statistics on violent crime dating back to 13th-century England to examine how social and political events affected patterns of homicide through 1984. A long-term decline in homicide rates has been punctuated by spikes during periods of social dislocation, when:

significant segments of a population have been separated from the regulating institutions that instill and reinforce the basic Western injunctions against interpersonal violence. They may be migrants, demobilized veterans, a growing population of resentful young people for whom there is no social or economic niche, or badly educated young black men trapped in the decaying ghettos of an affluent society. (1989, 48–49)

Table 9.2 Female Prisoners Under Jurisdiction of State and Federal Correctional Authorities

                    Total 2004   Percent Change 2003–2004
Northeast                8,910                       -2.2
  Connecticut            1,488                       -3.9
  Maine                    125                        0.8
  Massachusetts            741                        4.7
  New Hampshire            119                        1.7
  New Jersey             1,470                       -3.1
  New York               2,789                       -4.3
  Pennsylvania           1,827                        0.2
  Rhode Island             208                       -6.3
  Vermont                  143                        5.9
Midwest                 16,545                        5.5
South                   44,666                        3.7
West                    22,563                        5.1

Source: Adapted from Harrison and Beck (2005: Table 5).

Published data can therefore address questions about highly aggregated patterns or trends—drug use and crime, the covariation in two estimates of crime, or epochal change in fatal violence. Published data also have the distinct advantage of being readily available; a web search or a trip to the library can quickly place several data series at your disposal. You can obtain copies of most publications by the BJS and other Justice Department offi ces on the In-ternet; most documents published since about 1994 are available electronically. Some formerly printed data reports are now available only in electronic format.

Electronic formats have many advantages. Data fi elds may be read directly into statisti-cal or graphics computer programs for analy-sis. Many basic crime data can be downloaded from the BJS website in spreadsheet formats or read directly into presentation software. More importantly, complete data series are available in electronic formats. Although printed reports of the NCVS limit you to summary tabulations such as those in Table 9.1, electronic and opti-cal media include the original survey data from 80,000 or more respondents.

Of course, before using either original data or tabulations from published sources, re-searchers must consider the issues of validity and reliability and the more general question of how well these data meet specifi c research purposes. It would make little sense to use only FBI data on homicides in a descriptive study of domestic violence because murder records mea-sure only incidents of fatal violence. Or data from an annual prison census would not be ap-

propriate for research on changes in sentences to community corrections programs.

Nonpublic Agency Records

Despite the large volume of published statistics in criminal justice, those data represent only the tip of the proverbial iceberg. The FBI publishes the summary UCR, but each of the nation’s several thousand law enforcement agencies produces an incredible volume of data not routinely released for public distribution. The BJS series Correctional Populations in the United States presents statistics on prison inmates collected from annual surveys of correctional facilities, but any given facility also maintains detailed case files on individual inmates. Court Caseload Statistics, published by the National Center for State Courts, contains summary data on cases filed and disposed in state courts, but any courthouse in any large city houses files on thousands of individual defendants.

Although we have labeled this data source “nonpublic agency records,” criminal justice organizations quite often make such data available to criminal justice researchers. But obtaining agency records is not as simple as scrolling through a list of publications on the BJS website, or clicking a download button to get data from the Census Bureau. Obtaining access to nonpublic records involves many of the same steps we outlined in Chapter 8 for gaining entry for field research.

At the outset, we want to emphasize that the potential promise of agency records is not without cost. Jacob’s caution about the hidden perils to unwary researchers who uncritically accept published data applies to nonpublic agency records as well. On the one hand, we could devote an entire book to describing the potential applications of agency records in criminal justice research, together with advice on how to use and interpret such data. On the other hand, we can summarize that unwritten book with one piece of advice: understand how agency records are produced. Restated slightly, researchers can find the road to happiness in using agency records by following the paper trail.

By way of illustrating this humble maxim, we present two types of examples. First, we describe a study in which nonpublic agency records were used to reveal important findings about the etiology of crime. Second, we briefly review two studies in which the authors recognized validity and reliability problems but still were able to draw conclusions about the behavior of criminal justice organizations. At the end of this section, we summarize the promise of agency records for research and note cautions that must be exercised if such records are to be used effectively.

Child Abuse, Delinquency, and Adult Arrests

In earlier chapters, we described Cathy Spatz Widom’s research on child abuse as an example of a quasi-experimental design. For present purposes, her research illustrates the use of several different types of agency records.

Widom (1989a) identified cases of child abuse and neglect by consulting records from juvenile and adult criminal courts in a large midwestern city. Unlike adult courts, juvenile court proceedings are not open to the public, and juvenile records may not be released. However, after obtaining institutional review board approval, Widom was granted access to these files for research purposes by court authorities. From juvenile court records, she selected 774 cases of neglect, physical abuse, or sexual abuse. Cases of extreme abuse were processed in adult criminal court, where charges were filed against the offender. Criminal court records yielded an additional 134 cases.

As described in Chapter 5, Widom constructed a comparison group of nonabused children through individual matching. Comparison subjects were found from two different sources of agency records. First, abused children who were age 6 to 11 at the time of the abuse were matched to comparison subjects by consulting public school records. An abused child was matched with a comparison child of the same sex, race, and age (within six months) who attended the same school. Second, children who were younger than 6 years old at the time of abuse were matched on similar criteria by consulting birth records and selecting a comparison subject born in the same hospital.

These two types of public records and matching criteria—attending the same public school and being born in the same hospital—were used in an attempt to control for socioeconomic status. Although such criteria may be viewed with skepticism today, Widom (1989a, 360) points out that during this time period (1967 to 1971) school busing was not used in the area where her study was conducted, and “elementary schools represented very homogeneous neighborhoods.” Similarly, in the late 1960s, hospitals tended to serve local communities, unlike contemporary medical-industrial complexes that advertise their services far and wide. Widom assumed that children born in the same hospital were more likely to be from similar socioeconomic backgrounds than children born in different hospitals.

Though far from perfect, school and birth records enabled Widom to construct approximate matches on socioeconomic status, which illustrates our earlier advice to be creative while being careful.

Widom’s research purpose was explanatory—to examine the link between early child abuse and later delinquency or adult criminal behavior. These two dependent variables were measured by consulting additional agency records for information on arrests of abused subjects and comparison subjects for the years 1971 through 1986. Juvenile court files yielded information on delinquency. Adult arrests were measured from criminal history files maintained by local, state, and national law enforcement agencies. In an effort to locate as many subjects as possible, Widom searched state Department of Motor Vehicles files for current addresses and Social Security numbers. Finally, “marriage license bureau records were searched to find married names for the females” (1989a, 371).


Findings revealed modest but statistically significant differences between abused and comparison subjects. As a group, abused subjects were more likely to have records of delinquency or adult arrests (Maxfield and Widom 1996). However, Widom (1992) also found differences in these dependent variables by race and gender.

Now let’s consider two potential validity and reliability issues that might be raised by Widom’s use of nonpublic agency records. First, data from juvenile and adult criminal courts reveal only cases of abuse that come to the attention of public officials. Unreported and unsubstantiated cases of abuse or neglect are excluded, and this raises a question about the validity of Widom’s measure of the independent variable. Second, dependent variable measures are similarly flawed because juvenile and adult arrests do not reflect all delinquent or criminal behavior.

Recognizing these problems, Widom is careful to point out that her measure of abuse probably reflects only the most severe cases, those that were brought to the attention of public officials. She also notes that cases in her study were processed before officials and the general public had become more aware of the problem of child abuse. Widom understood the limits of official records and qualified her conclusions accordingly: “These findings, then, are not generalizable to unreported cases of abuse or neglect. Ours are also the cases in which agencies have intervened, and in which there is little doubt of abuse or neglect. Thus, these findings are confounded with the processing factor” (1989a, 365–366).

Agency Records as Measures of Decision Making

In Chapters 4 and 5, we mentioned research by Richard McCleary and associates (McCleary, Nienstedt, and Erven 1982) as an illustration of measurement problems and the ways in which those problems threaten certain quasi-experimental designs. Recall that McCleary and associates discovered that an apparent reduction in burglary rates was, in fact, due to changes in record-keeping practices. Assigning officers from a special unit to investigate burglaries revealed that earlier investigative procedures sometimes resulted in double counts of a single incident; the special unit also reduced misclassification of larcenies as burglaries.

McCleary and colleagues became suspicious of police records when they detected an immediate decline in burglary rates following the introduction of the special unit, a pattern that was not reasonable given the technology of burglary. Following the paper trail, they were able to discover how changes in procedures affected these measures. In the same article, they describe additional examples—how replacement of a police chief and changes in patrol dispatch procedures produced apparent increases in crime and calls for service. However, after carefully investigating how these records were produced, they were able to link changes in the indicators to changes in agency behavior rather than changes in the frequency of crime.

In a similar type of study, Hugh Whitt (2006) examined mortality data for New York City over 16 years. While deaths by homicide and accidents varied little from year to year between 1976 and 1992, the number of suicides dropped dramatically from 1984 through 1988, then rose sharply in 1989 to match the numbers in earlier years. Reasoning that such sharp changes in a short period are unlikely to reflect natural variation in suicide, Whitt traced the deviation to a series of personnel and policy changes in the city medical examiner’s office. So the sharp, short-term variation was due to changes in how deaths were recorded, not changes in the manner of death.
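The kind of screening that prompts a researcher to follow the paper trail can be sketched in a few lines of Python. This is a minimal illustration, not the method used by Whitt or by McCleary and colleagues: the function name, the 25 percent threshold, and the yearly counts are all assumptions invented for the example. A flagged year is only a signal to investigate how the records were produced; it says nothing by itself about whether behavior or record keeping changed.

```python
def flag_abrupt_changes(series, threshold=0.25):
    """Return (year, proportional change) pairs where the year-over-year
    change in a count exceeds `threshold` in absolute value.

    `threshold` is an arbitrary, hypothetical cutoff chosen for illustration.
    """
    years = sorted(series)
    flagged = []
    for prev, curr in zip(years, years[1:]):
        if series[prev] == 0:
            continue  # proportional change is undefined when the base is zero
        change = (series[curr] - series[prev]) / series[prev]
        if abs(change) > threshold:
            flagged.append((curr, round(change, 2)))
    return flagged


# Invented counts for illustration only; these are NOT data from any study.
counts = {
    1982: 100, 1983: 102, 1984: 70, 1985: 68, 1986: 71,
    1987: 69, 1988: 72, 1989: 101, 1990: 99,
}

flags = flag_abrupt_changes(counts)
flagged_years = [year for year, _ in flags]
print(flagged_years)
```

With these invented numbers, the sketch flags the years in which the series drops and then recovers. The next step would be the qualitative one described above: asking agency staff what changed in procedures, personnel, or definitions around those dates.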

New Data Collected by Agency Staff

Thus far we have concentrated on the research uses of information routinely collected by or for public agencies. Such data are readily available, but researchers have little control over the actual data collection process. Furthermore, agency procedures and definitions may not correspond with the needs of researchers.

It is sometimes possible to use a hybrid source of data in which criminal justice agency staff collect information for specific research purposes. We refer to this as a “hybrid” source because it combines the collection of new data—through observation or interviews—with day-to-day criminal justice agency activities. Virtually all criminal justice organizations routinely document their actions, from investigating crime reports to housing convicted felons. By slightly modifying forms normally used for recording information, researchers may be able to have agency staff collect original data for them.

For example, let’s say we are interested in the general question of how many crimes reported to police involve nonresident victims. Reading about violent crime against international tourists in south Florida, we wonder how common such incidents are and decide to systematically investigate the problem. It doesn’t take us long to learn that no published data are available on the resident or nonresident status of Dade County crime victims. We next try the Miami and Miami-Dade Police Departments, suspecting that such information might be recorded on crime report forms. No luck here, either. Incident report forms include victim name and address, but we are told that police routinely record the local address for tourists, typically a hotel, motel, or guest house. Staff in the police crime records office inform us that officers sometimes write down something like “tourist, resident of Montreal” in the comments section of the crime report form, but they are neither required nor asked to do this. Assuming we can gain approval to conduct our study from police and other relevant parties, we may be able to supplement the standard police report form. Adding an entry such as the following will do the trick:

Is complainant a resident of Dade County? ______ yes ______ no
If “no,” record permanent address here: _________________________________

A seemingly simple modification of crime report forms may not happen quite so easily. Approval from the department is, of course, one of the first requirements. And recall our discussion in Chapter 8 on the need to gain approval for research from criminal justice agency staff at all levels. Individual officers who complete crime report forms must be made aware of the change and told why the new item was added. We might distribute a memorandum that explains the reasons for adopting a new crime report form. It might also be a good idea to have supervisors describe the new form at roll call before each shift of officers heads out on patrol. Finally, we could review samples of the new forms over the first few days they are used to determine the extent to which officers are completing the new item.

Incorporating new data collection procedures into agency routine has two major advantages. First, and most obvious, having agency staff collect data for us is much less costly than fielding a team of research assistants. It is difficult to imagine how original data on the resident status of Dade County victims could be collected in any other way.

Second, we have more control over the measurement process than we would by relying on agency definitions. Some Dade County officers might note information on victim residence, but most probably would not. Adding a specific question enhances the reliability of data collection. We might consider using an existing crime report item for “victim address,” but the tendency of officers to record local addresses for tourists would undermine measurement validity. A residence-specific item is a more valid indicator.

The box by Marie Mele, titled “Improving Police Records of Domestic Violence,” gives an example of this approach to gathering new data from agency records. You should recognize the importance of collaboration in this example. Less obvious, but equally important, is a lesson we will return to later in this chapter: agency records are not usually intended for research purposes and, as a consequence, are not always well suited to researchers’ needs. Sometimes, as was the case for Marie Mele, some subtle tweaking can transform unusable mounds of paper files into computer-based record systems. In any event, researchers are well advised to be careful not to assume that existing data will meet their needs, but also to be creative in seeking ways to improve the quality of agency data.

This approach to data collection has many potential applications. Probation officers or other court staff in many jurisdictions complete some type of presentence investigation on convicted offenders. A researcher might be able to supplement standard interview forms with additional items appropriate for some specific research interest.

At the same time, having agencies collect original research data has some disadvantages. An obvious one is the need to obtain the cooperation of organizations and staff. The difficulty of this varies in direct proportion to the intrusiveness of data collection. Cooperation is less likely if some major additional effort is required of agency personnel or if data collection activities disrupt routine operations. The potential benefit to participating agencies is a related point. If a research project or an experimental program is likely to economize agency operations or improve staff performance, as was the case for Marie Mele’s research, it will be easier to enlist their assistance.

Researchers have less control over the data collection process when they rely on agency staff. Staff in criminal justice agencies have competing demands on their time and naturally place a lower priority on data collection than on their primary duties. If you were a probation officer serving a heavy caseload and were asked to complete detailed 6- and 12-month reports on services provided to individual clients, would you devote more attention to keeping up with your clients or filling out data collection forms?

IMPROVING POLICE RECORDS OF DOMESTIC VIOLENCE

Marie Mele
Monmouth University

An interest in domestic violence led me to learn more about repeat victimization. Research conducted in England has indicated that repeated domestic violence involving the same victim and offender is quite common (Hanmer, Griffiths, and Jerwood 1999; Farrell, Edmunds, Hobbs, and Laycock 2000). For example, combining data from four British Crime Surveys, Pease (1998, 3) reports that about 1 percent of respondents had four or more personal victimizations and that these victims accounted for 59 percent of all personal victimization incidents disclosed to BCS interviewers. Other studies in England have shown that many other types of offenses are disproportionately concentrated among small groups of victims. Identifying repeat victims and directing prevention and enforcement resources to them has the potential to produce a large decrease in crime rates.

My interest centered on determining the distributions of incidents, victims, and offenders in cases of domestic violence reported to police in a large city in the northeastern United States. Partly this was exploratory research, but I also wished to make policy recommendations to reduce repeat victimization. Working with Michael Maxfield, my initial inquiries found that, although data on assaults and other offenses were available in paper files, it was not possible to efficiently count the number of repeat incidents for victims and offenders. However, seeing the potential value of gathering more systematic information on repeat offenders and victims, senior department staff collaborated with Maxfield and me to produce a database that reorganized existing data on domestic violence incidents. A pilot data system was developed in November 2001. Plans called for the database to be maintained by the department’s domestic violence and sexual assault unit (DVSAU); staff would begin entering new incidents in January 2002. In the meantime, I proposed to test the database and its entry procedures by personally entering incidents from August through December 2001. After ironing out a few kinks, the final system was established, and I continued entering new incident reports for a period of time.

This is an admittedly brief description of a process that unfolded over several months. The process began as an attempt to obtain what we thought were existing data. Finding that suitable data were not available, Maxfield had sufficient access to the department to begin discussions of how to supplement existing record-keeping practices to tabulate repeat victimization. Many public agencies will accommodate researchers if such accommodation does not unduly burden the agency. Collecting new data for researchers does qualify as unduly burdensome for most public organizations. The key here was collaboration in a way that helped both the researchers and the host organization.

It’s obvious that my research benefited from having the department design and ultimately assume responsibility for tabulating repeat victimization. The department benefited in three ways. First, DVSAU staff recognized how a data file that identified repeat offenders and victims could produce information that would aid their investigations of incidents. I emphasize offenders here because, all other things being equal, police are more interested in offenders than they are in victims. So the database was designed to track people (offenders) of special interest to police and people (victims) of special interest to my research.

Second, recognizing that setting up a new data system is especially difficult in a tradition-bound organization, I was able to ease the transition somewhat by initially entering data myself. One intentional by-product of this was to establish quality-control procedures during a shakedown period. As a criminal justice researcher, I knew that the reliability of data-gathering procedures was important and was able to establish reliable data entry routines for the department.

The DVSAU staff benefited in yet another way that was probably most important of all. This department had incorporated its version of Compstat for about three years. Two important components of Compstat are timely data and accountability. Each week, the DVSAU commander was required to present a summary of the unit’s activity and was held accountable for the performance of unit staff. Before the database was developed, the DVSAU commander spent several hours compiling data to prepare for each week’s Compstat meeting. After the database was operational, preparation time was reduced to minutes. In addition, the DVSAU commander was able to introduce new performance measures—repeat offenders and victims—that came to be valued by the department’s chief.

Reliability and Validity

Understanding the details of how agency records are produced is the best guard against reliability and validity problems.

The key to evaluating the reliability and validity of agency records, as well as the general suitability of those data for a research project, is to understand as completely as possible how the data were originally collected. Doing so can help researchers identify potential new uses of data and be better able to detect potential reliability or validity problems in agency records.

Any researcher who considers using agency records will benefit from a careful reading of Jacob’s invaluable little guide Using Published Data: Errors and Remedies (1984). In addition to warning readers about general problems with reliability and validity, such as those we discussed in Chapter 4, Jacob cautions users of these data to be aware of other potential errors that can be revealed by scrutinizing source notes. Clerical errors, for example, are unavoidable in such large-scale reporting systems as the UCR. These errors may be detected and reported in correction notices appended to later reports.

Users of data series collected over time must be especially attentive to changes in data collection procedures or changes in the operational definitions of key indicators. As you might expect, such changes are more likely to occur in data series that extend over several years. David Cantor and James Lynch (2005) describe how changes in the NCVS, especially the redesign elements introduced in 1992, should be kept in mind for any long-term analysis of NCVS data. In earlier chapters, we described the NCVS redesign that was completed in 1994. If we plan to conduct research on victimization over time, we will have to consider how changes in sample size and design, increased use of telephone interviews, and questionnaire revisions might affect our findings.

Longitudinal researchers must therefore diligently search for modifications of procedures or definitions over time to avoid attributing some substantive meaning to a change in a particular measure. Furthermore, as the time interval under investigation increases, so does the potential for change in measurement. Ted Robert Gurr (1989, 24) cites a good example:

In the first two decades of the 20th century many American police forces treated the fatalities of the auto age as homicides. The sharp increase in “homicide” rates that followed has led to some dubious conclusions. Careful study of the sources and their historical and institutional context is necessary to identify and screen out the potentially misleading effects of these factors on long-term trends.

The central point is that researchers who analyze criminal justice data produced by different cities or states or other jurisdictions must be alert to variations in the definitions and measurement of key variables. Even in cases in which definitions and measurement seem straightforward, researchers may run into problems. For example, Craig Perkins and Darrell Gilliard (1992, 4)—statisticians at the Bureau of Justice Statistics—caution potential users of corrections data:

Care should be exercised when comparing groups of inmates on sentence length and time served. Differences may be the result of factors not described in the tables, including variations in the criminal histories of each group, variations in the offense composition of each group, and variations among participating jurisdictions in their sentencing and correctional practices.

Fortunately, most published reports on regular data series present basic information on definitions and collection procedures. The BJS website includes copies of questionnaires used in surveys and enumerations. Researchers should, however, view summary descriptions in printed reports as no more than a starting point in their search for information on how data were collected.

Sources of Reliability and Validity Problems

We conclude this section on agency records by discussing some general characteristics of record keeping by public agencies. Think carefully about each of the features we mention, considering how each might apply to specific types of criminal justice research. It will also be extremely useful for you to think of some examples in addition to those we mention, perhaps discussing them with your instructor or others in your class.

Social Production of Data

Virtually all criminal justice record keeping is a social process. By this we mean that indicators of, say, arrests, juvenile probation violations, court convictions, or rule infractions by prison inmates reflect decisions made by criminal justice officials in addition to the actual behavior of juvenile or adult offenders. As Baumer and associates state, “Researchers must realize that performance measures are composites of offenders’ behavior, organizational capacity to detect behavior, and decisions about how to respond to offenders’ misbehavior” (1993, 139; emphasis added). Richard McCleary (1992) describes the social production of data by parole officers. They may fail to record minor infractions to avoid paperwork or, alternatively, may keep careful records of such incidents in an effort to punish troublesome parolees by returning them to prison.

Discretionary actions by criminal justice officials and others affect the production of virtually all agency records. Police neither learn about all crimes nor arrest all offenders who come to their attention. Similarly, prosecutors, probation officers, and corrections staff are selectively attentive to charges filed or to rule violations by probationers and inmates. At a more general level, the degree of attention state legislatures and criminal justice officials devote to various crime problems varies over time. Levels of tolerance of such behaviors as child abuse, drug use, acquaintance rape, and even alcohol consumption have changed over the years.

Agency Data Are Not Designed for Research

In many cases, criminal justice officials collect data because the law requires them to do so. More generally, agencies tend to collect data for their own use, not for the use of researchers. Court disposition records are maintained in part because of legal mandates, and such records are designed for the use of judges, prosecutors, and other officials. Record-keeping procedures reflect internal needs and directives from higher authorities. As a consequence, researchers sometimes find it difficult to adapt agency records for a specific research purpose.

Even when agencies have advanced record keeping and data management systems, researchers may still encounter problems. The Chicago Police Department, for example, has developed an advanced system for collecting and analyzing crime and calls-for-service data. Richard and Carolyn Block (1995) sought to take advantage of this in an analysis relating the density of taverns and liquor stores to police crime reports from taverns. The researchers wanted to learn whether areas with high concentrations of taverns and liquor stores generated a disproportionate number of crime reports on which police had indicated “tavern or liquor store.” Because the Chicago Police Department recorded addresses and type of location and the city Department of Revenue recorded addresses for liquor license holders, the task seemed straightforward enough: match the addresses from the two data sources, and analyze the correspondence between crime and liquor establishments.

It wasn’t so simple. The Blocks discovered inconsistent recording of addresses in crime reports. Sometimes police approximated a location by recording the nearest intersection—for example, “Clark and Division.” Some large establishments spanned several street addresses, and police recorded one address when the liquor license data had a different address. Other times, police recorded the name of the tavern—for example, “Red Rooster on Wilson Avenue.” In sum, police recorded addresses to meet their needs—to locate a tavern or liquor store so that officers could find it—while the Department of Revenue recorded a precise address on a license application. Each address was accurate for each agency’s purpose, but addresses did not match in about 40 percent of the cases analyzed by the Blocks.
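A small Python sketch can make the matching problem concrete. It is an illustration under stated assumptions, not a reconstruction of the Blocks’ procedure: the abbreviation table, the function names, and every street number below are invented, and only the two informal location styles (“Clark and Division” and “Red Rooster on Wilson Avenue”) come from the examples above. The sketch standardizes spelling and then checks which police-recorded locations appear in the license file.

```python
import re

# Hypothetical abbreviation table; a real project would need a much fuller one.
ABBREVIATIONS = {
    "street": "st", "avenue": "ave",
    "north": "n", "south": "s", "east": "e", "west": "w",
}


def normalize(address):
    """Lowercase, strip punctuation, and standardize common abbreviations."""
    text = re.sub(r"[^a-z0-9\s]", " ", address.lower())
    return " ".join(ABBREVIATIONS.get(token, token) for token in text.split())


def match_addresses(crime_addresses, license_addresses):
    """Return (share of crime addresses matched, list of unmatched addresses)."""
    licensed = {normalize(a) for a in license_addresses}
    unmatched = [a for a in crime_addresses if normalize(a) not in licensed]
    if not crime_addresses:
        return 0.0, unmatched
    matched = len(crime_addresses) - len(unmatched)
    return matched / len(crime_addresses), unmatched


# Invented records for illustration only.
crime_reports = [
    "1200 N. Clark Street",
    "Clark and Division",            # nearest intersection, no street number
    "Red Rooster on Wilson Avenue",  # tavern name instead of an address
    "4500 North Broadway",
    "901 W Belmont Ave.",
]
licenses = [
    "1200 North Clark St",
    "4500 N Broadway",
    "901 West Belmont Avenue",
    "1150 W Wilson Ave",
]

rate, unmatched = match_addresses(crime_reports, licenses)
print(rate, unmatched)
```

Simple standardization resolves differences in spelling and abbreviation, but not the intersection or the tavern name. Those records can be linked only with knowledge of how and why each agency recorded locations, which is the point of the example.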

For another example of how definitional differences can be traced to different agency needs, see the box titled “How Many Parole Violators Were There Last Month?”

Tracking People, Not Patterns

At the operational level, officials in criminal justice organizations are generally more interested in keeping track of individual cases than in examining patterns. Police patrol officers and investigators deal with individual calls for service, arrests, or case files. Prosecutors and judges are most attentive to court dockets and the clearing of individual cases, and corrections officials maintain records on individual inmates. Although each organization produces summary reports on weekly, monthly, or annual activity, officials tend to be much more interested in individual cases. Michael Geerken (1994) makes this point clearly in his discussion of problems researchers are likely to encounter in analyzing police arrest records. Few rap-sheet databases are regularly reviewed for accuracy; rather, they simply accumulate arrest records submitted by individual officers. Joel Best (2001) offers additional examples of how small errors in case-by-case record keeping can accumulate to produce compound errors in summary data.

With continued advances in computer and telecommunications technology, more criminal justice agencies are developing the capability to analyze data in addition to tracing individual cases. Crime analysis by police tracks spatial patterns of recent incidents; prosecutors are attentive to their scorecards; state correctional intake facilities consider prison capacity, security classification, and program availability in deciding where to send new admissions.

As problem-oriented and community approaches to policing become more widely adopted, many law enforcement agencies have improved their record-keeping and crime analysis practices. New York City’s reduction in reported crime has been attributed to the use of timely, accurate crime data by police managers to plan and evaluate specific anticrime tactics (Bratton 1999). This illustrates an important general principle about the accuracy of agency-produced data: when agency managers routinely use data to make decisions, they will be more attentive to data quality. The box describing Marie Mele’s research offers another example—when domestic violence detectives realized how a database would help them prepare for weekly command staff meetings, they endorsed the effort to improve their record-keeping procedures.

However, record-keeping systems used in most cities today are still designed more to track individual cases for individual departments than to produce data for management or research purposes. Individual agencies maintain what are sometimes called silo databases, a colorful label that refers to stacks of data that are isolated from each other.

HOW MANY PAROLE VIOLATORS WERE THERE LAST MONTH?

John J. Poklemba
New York State Division of Criminal Justice Services

Question: How many parole violators were there last month? Answer: It depends. More accurately, it depends on which agency is asked. Each of the three answers below is right in its own way:

New York State Commission of Correction 611

New York Department of Correctional Services 670

New York Division of Parole 356

The State Commission of Correction (SCOC) maintains daily aggregate information on the local under-custody population. Data are gathered from local sheriffs, using a set of common definitions. The SCOC defines a parole violator as follows: an alleged parole violator being held as a result of allegedly having violated a condition of parole (for example, a new arrest). This makes sense for local jails; a special category is devoted to counting alleged parole violators with new arrests. However, New York City does not distinguish between parole violators with and without new arrests, so the SCOC figure includes violators from upstate New York only.

The Department of Correctional Services (DOCS) is less interested in why people are in jail; their concern centers on the backlog of inmates whom they will soon need to accommodate. Furthermore, as far as DOCS is concerned, the only true parole violator is a technical parole violator. This makes sense for DOCS, because a parole violator convicted of a new crime will enter DOCS as a new admission, who, from an administrative standpoint, will be treated differently than a parolee returned to prison for a technical violation.

The Division of Parole classifies parole violators into one of four categories: (1) those who have violated a condition of parole, (2) those who have absconded, (3) those who have been arrested for a new crime, and (4) those who have been convicted of a new crime. Once again, this makes sense, because the Division of Parole is responsible for monitoring parolee performance and wishes to distinguish different types of parole violations. The Division also classifies a parole violation as either alleged (yet to be confirmed by a parole board) or actual (the violation has been confirmed and entered into the parolee’s file). Further differences in the fluid status of parolees and their violations, together with differences between New York City and other areas, add to the confusion.

Taking the varying perspectives and roles of these three organizations into account, answers to the “How many” question can be made more specific:

SCOC: Last month, there were 611 alleged parole violators who were believed to have violated a condition of their parole by being arrested for a new offense and are being held in upstate New York jails.

DOCS: Last month, there were 670 actual parole violators who were judged to have violated a condition of their parole and are counted among the backlog of persons ready for admission to state correctional facilities.

Parole Division: Last month, 356 parolees from the Division’s aggregate population were actually removed from the Division’s caseload and were en route to DOCS.

One of the major reasons that agency counts do not match is that agency information systems have been developed to meet internal operational needs. A systemwide perspective is lacking. Questions that depend on data from more than one agency are often impossible to answer with confidence. Recognize also that the availability and quality of state data depend on data from local agencies.

As stated above, the best answer to the question is: It depends.

Source: Adapted from Poklemba (1988, 11–13).

This does not mean that computerized criminal history, court disposition, or prison intake records are of little use to researchers. However, researchers must be aware that even state-of-the-art systems may not be readily adaptable for research purposes.

Error Increases with Volume. The potential for clerical errors increases as the number of clerical entries increases. This seemingly obvious point is nonetheless important to keep in mind when analyzing criminal justice records. Lawrence Sherman and Ellen Cohn (1989, 34) describe the “mirror effect” of duplicate calls-for-service (CFS) records. Handling a large volume of CFS, phone operators in Minneapolis, or any large city for that matter, are not always able to distinguish duplicate reports of the same incident. An updated report about a CFS may be treated as a new incident. In either case, duplicate data result. Richard McCleary and associates (McCleary, Nienstedt, and Erven 1982) describe a similar problem with burglary reports.

The relationship between volume of data entry and the potential for error can be especially troublesome for studies of relatively rare crimes or incidents. Although murder is rare compared with other crimes, information about individual homicides might be keyed into a computer by the same clerk who inputs data on parking violations. If a murder record is just one case among hundreds of parking tickets and petty thefts awaiting a clerk, there is no guarantee that the rare event will be treated any differently than the everyday ones.

While preparing a briefing for an Indianapolis commission on violence, Maxfield discovered that a single incident in which four people had been murdered in a rural area appeared twice in computerized FBI homicide records. This was traced to the fact that officers from two agencies (sheriff’s deputies and state police) investigated the crime, and each agency filed a report with the FBI. But the thousands of murders entered into FBI computer files for that year obscured the fact that duplicate records had been keyed in for one multiple murder in a rural area of Indiana.
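A simple screening step can help surface duplicates of this kind before analysis. The following is only a minimal sketch in Python: the records, field names, and matching rule (same date, county, and victim count reported by more than one agency) are hypothetical illustrations, not the layout of actual FBI homicide files.

```python
from collections import defaultdict

# Hypothetical records for illustration only; the fields and values are
# assumptions, not the structure of real FBI homicide data.
records = [
    {"agency": "Sheriff", "date": "1990-03-04", "county": "County A", "victims": 4},
    {"agency": "State Police", "date": "1990-03-04", "county": "County A", "victims": 4},
    {"agency": "City Police", "date": "1990-03-09", "county": "County B", "victims": 1},
]


def find_possible_duplicates(rows):
    """Return groups of records that share date, county, and victim count
    but were filed by more than one agency."""
    groups = defaultdict(list)
    for row in rows:
        key = (row["date"], row["county"], row["victims"])
        groups[key].append(row)
    return [
        group for group in groups.values()
        if len({row["agency"] for row in group}) > 1
    ]


possible_duplicates = find_possible_duplicates(records)
for group in possible_duplicates:
    print("Check by hand:", [row["agency"] for row in group])
```

A match flagged this way is only a candidate duplicate; two agencies could in principle report distinct incidents with the same characteristics, so each flagged group still needs to be checked against the case files.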

In concluding our discussion of agency records, we do not mean to leave you with the impression that data produced by and for criminal justice organizations are fatally flawed. Thousands of studies making appropriate use of such data are produced each year. It is, however, essential that researchers understand potential sources of reliability and validity problems, as well as ways they can be overcome. Public agencies do not normally collect information for research purposes. The data they do collect often reflect discretionary decisions by numerous individuals. And, like any large-scale human activity, making observations on large numbers of people and processes inevitably produces some error.

Content Analysis

Content analysis involves the systematic study of messages.

The Office of Community Oriented Policing Services (COPS) was established by the 1994 Crime Bill to promote community policing by providing funds to local law enforcement agencies. In addition to being concerned about the effectiveness of these efforts, COPS staff wanted to know something about the public image of community policing as presented in local newspapers. Stephen Mastrofski and Richard Ritti conducted a content analysis of stories about community policing in newspapers serving 26 cities. The researchers found more than 7,500 stories from 1993 through 1997, with most focusing on a small number of themes: “community, resources, and producing real results for the community. Stories that offer a viewpoint on community policing are nearly always overwhelmingly positive” (1999, 10–11).

This is an example of content analysis, the systematic study of messages and the meaning those messages convey. For the COPS office, the study by Mastrofski and Ritti was satisfying: many stories about community policing were published in urban newspapers, and most stories presented positive images.

Content analysis methods may be applied to virtually any form of communication. Among the possible artifacts for study are books, magazines, films, songs, speeches, television programs, letters, laws, and constitutions, as well as any components or collections of these. Content analysis is particularly well suited to answering the classic questions of communications research: who says what, to whom, why, how, and with what effect? As a mode of observation, content analysis requires a considered handling of the what, and the analysis of data collected in this mode, as in others, addresses the why and with what effect.

Coding in Content Analysis

Content analysis is essentially a coding operation, and of course, coding represents the measurement process in content analysis. Communications (oral, written, or other) are coded or classified according to some conceptual framework. Thus, for example, newspaper editorials might be coded as liberal or conservative. Radio talk shows might be coded as bombastic or not. Novels might be coded as detective fiction or not. Political speeches might be coded as containing unsupported rhetoric about crime or not. Recall that terms such as these are subject to many interpretations, and the researcher must specify definitions clearly.

Coding in content analysis involves the logic of conceptualization and operationalization we considered in Chapter 4. In content analysis, as in other research methods, researchers must refine their conceptual framework and develop specific methods for observing in relation to that framework.

For all research methods, conceptualization and operationalization typically involve the interaction of theoretical concerns and empirical observations. If, for example, you believe some newspaper editorials support liberal crime policies and others support conservative ones, ask yourself why you think so. Read some editorials, asking which ones are liberal and which ones are conservative. Is the political orientation of a particular editorial most clearly indicated by its manifest content or by its overall tone? Is your decision based on the use of certain terms (such as moral decay or need for rehabilitation) or on the support or opposition given to a particular issue, such as mandatory prison sentences versus treatment programs for drug users?

As in other decisions relating to measurement, the researcher faces a fundamental choice between depth and specificity of understanding. The survey researcher must decide whether specific closed-ended questions or more general open-ended questions will better suit her or his needs. By the same token, the content analyst has a choice between searching for manifest or for latent content. Coding the manifest content (the visible, surface content) of a communication more closely approximates the use of closed-ended items in a survey questionnaire. Alternatively, coding the latent content of the communication (its underlying meaning) is an option. In the most general sense, manifest and latent content can be distinguished by the degree of interpretation required in measurement.
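Because manifest content is visible on the surface of a message, coding it can be close to mechanical. The sketch below counts how often listed phrases appear in an editorial. The phrase lists, and the decision to treat each phrase as an indicator of a liberal or conservative orientation, are assumptions made purely for illustration; an actual study would define and pretest its own indicators.

```python
# Hypothetical indicator phrases. Assigning a phrase to "liberal" or
# "conservative" is an illustrative assumption, not an established coding rule.
TERMS = {
    "conservative": ["moral decay", "mandatory prison sentences"],
    "liberal": ["need for rehabilitation", "treatment programs"],
}


def count_manifest_terms(text, terms=TERMS):
    """Count occurrences of each category's phrases in the text (case-insensitive)."""
    lowered = text.lower()
    return {
        label: sum(lowered.count(phrase) for phrase in phrases)
        for label, phrases in terms.items()
    }


# A made-up sentence standing in for an editorial.
editorial = "Moral decay demands mandatory prison sentences, not treatment programs."
counts = count_manifest_terms(editorial)
print(counts)
```

Notice what the counts miss: this sentence mentions treatment programs only to reject them, yet the phrase is still tallied. Recognizing that rejection requires interpretation, which is exactly the difference between manifest and latent coding.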

Throughout the process of conceptualizing manifest- and latent-content coding procedures, remember that the operational definition of any variable is composed of the attributes included in it. Such attributes, moreover, should be mutually exclusive and exhaustive. A newspaper editorial, for example, should not be described as both liberal and conservative, although we should probably allow for some to be middle-of-the-road. It may be sufficient to code TV programs as being violent or not violent, but it is also possible that some programs could be antiviolence.

No coding scheme should be used in content analysis unless it has been carefully pretested. We must decide what manifest or latent contents of communications will be regarded as indicators of the different attributes that make up our research variables, write down these operational definitions, and use them in the actual coding of several units of observation. If we plan to use more than one coder in the final project, each of them should independently code the same set of observations so that we can determine the extent of agreement. In any event, we’ll want to take special note of any difficult cases: observations that were not easily classified using the operational definition. Finally, we should review the overall results of the pretest to ensure that they are appropriate to our analytic concerns. If, for example, all of the pretest newspaper editorials have been coded as liberal, we should certainly reconsider our definition of that attribute.

Before beginning to code newspapers, crime dramas on TV, or detective fiction, we need to make plans to assess the reliability of coding. Fortunately, reliability in content analysis can readily be tested, if not guaranteed, in two related ways. First, interrater reliability can be determined by having two different people code the same message and then computing the proportion of items coded the same. If 20 attributes of newspaper stories about crime are being coded and two coders score 18 attributes identically, their reliability is 90 percent.
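The arithmetic behind the 90 percent figure is simple percent agreement: matching codes divided by total codes. A minimal sketch follows; the function name and the 0/1 codes are hypothetical and chosen only to reproduce the 18-of-20 example.

```python
def percent_agreement(coder_a, coder_b):
    """Percentage of items that two coders scored identically."""
    if len(coder_a) != len(coder_b):
        raise ValueError("Both coders must score the same set of items.")
    if not coder_a:
        raise ValueError("At least one coded item is required.")
    matches = sum(a == b for a, b in zip(coder_a, coder_b))
    return 100.0 * matches / len(coder_a)


# Hypothetical codes for 20 attributes; the coders differ on the last two.
coder_a = [1] * 20
coder_b = [1] * 18 + [0] * 2

agreement = percent_agreement(coder_a, coder_b)
print(agreement)  # 90.0
```

The same function serves for a test–retest check: pass in one person's first and second codings of the same message instead of codes from two different people.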

The second way to assess coding reliability is the test–retest method, in which one person codes the same message twice. Of course, some time should elapse between the two coding operations. Test–retest procedures can be used when only one person is doing the coding; reliability can be computed in the same way as if the interrater method were being used.


Illustrations of Content Analysis

We now turn to examples of content analysis in action. The first illustration describes content analysis of violence in video games. The second demonstrates how extracting information from police records is a form of content analysis.

Violence in Video Games. It seems that whenever some new technology or musical idiom becomes popular, someone becomes interested in linking it to behavior. Examples include: television and violence; pornography and sexual assault; suggestive lyrics in popular music and sexual behavior. It is always difficult to establish causality in such cases, and we will say nothing more about that. But content analysis is the appropriate research tool for classifying content as violent or sexually explicit.

Kimberly Thompson and Kevin Haninger examined the contents of video games rated in categories “E” (suitable for everyone) and “T” (teens, aged 13 and up) by the Entertainment Software Rating Board (ESRB). Their first study (Thompson and Haninger 2001) sampled 55 of more than 600 E-rated games available at the time. An undergraduate college student “with considerable video gaming experience” was assigned to play all games for 90 minutes or until the game reached a natural conclusion. The game-player was videotaped, which formed the basis for content analysis. One researcher (also described as being an experienced player) and the game-player reviewed the video, coding several dimensions of what was depicted.

Coders counted the number of violent incidents depicted while the game was being played, and timed the duration of each violent incident. Violence was defined as “acts in which the aggressor causes or attempts to cause physical injury or death to another character.” This is an example of latent content. The duration of violent acts was manifest content, though researchers had to distinguish short pauses between violent acts. Additional variables coded included the number of deaths; the presence of drugs, alcohol, or tobacco; profanity and sexual behavior; weapon use; and whether any music was included that itself was rated as explicit. Comparing the duration of violent acts and the number of deaths to how long each game was played yielded two standardized measures: violent minutes as a percent of all minutes, and the number of deaths per minute.
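Standardizing by play time matters because games were played for different lengths of time, so raw counts are not comparable. The sketch below shows how the two measures could be computed; the function name and input figures are hypothetical and do not come from the Thompson and Haninger data.

```python
def standardized_measures(violent_seconds, deaths, minutes_played):
    """Return (violent time as a percent of play time, deaths per minute)."""
    if minutes_played <= 0:
        raise ValueError("minutes_played must be positive.")
    violent_minutes = violent_seconds / 60
    percent_violent = 100 * violent_minutes / minutes_played
    deaths_per_minute = deaths / minutes_played
    return percent_violent, deaths_per_minute


# Hypothetical coding results for one 90-minute session:
# 1,350 seconds of violence and 45 character deaths.
percent_violent, deaths_per_minute = standardized_measures(1350, 45, 90)
print(percent_violent, deaths_per_minute)  # 25.0 0.5
```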

Results showed quite a lot of violence. Action games ranged from 3.3 percent (Sonic Adventure) to 91 percent (Nuclear Strike) violence as a portion of total time. Paperboy depicted no deaths, but Rat Attack averaged 8.4 deaths per minute. Games classified as “sports” rarely showed violence.

Later research used similar methods to examine violence in a larger number of games rated as suitable for teens (Haninger and Thompson 2004). These games displayed a wider variety of behaviors in the general domains of violence, obscenity, substance use, and sexual behavior. Again, the authors did not attempt to link such content with behavior. Their content analysis centered on systematically classifying what sorts of things were depicted in video games, thus providing information independent of industry ratings. Of their findings, the authors highlight that ESRB ratings did not mention several examples of violence in almost half of the games reviewed.

Classifying Gang-Related Homicides. When is a homicide gang-related? Are there different types of gang-related homicides? These two questions guided research by Rosenfeld and associates (Rosenfeld, Bray, and Egley 1999) to understand how gang membership might facilitate homicide in different ways. To address these questions, the researchers conducted a content analysis of police case files for homicides in St. Louis over a 10-year period.

By now, you should recognize the importance of conceptualization in most criminal justice research. Rosenfeld and associates began by further specifying the ambiguous term gang-related. They distinguished gang-motivated and gang-affiliated homicides. Gang-motivated killings “resulted from gang behavior or relationships, such as an initiation ritual, the ‘throwing’ of gang signs, or a gang fight” (1999, 500). Gang-affiliated homicides involved a gang member as victim or offender, but with no indication of specific gang activity; a gang member killing a nongang person during a robbery is an example. A third category, nongang youth homicide, included all other murders in which no evidence of gang activity was available, and the suspected offender was between ages 10 and 24.

Because St. Louis police did not apply the labels gang-affiliated or gang-motivated, it was necessary for researchers to code each case into one of the three categories using information from case files. This was a form of content analysis: systematically classifying the messages contained in homicide case files. Homicide case files are good examples of police records that are not maintained for research purposes. Recognizing this, Rosenfeld and associates coded the files in a two-stage process, building reliability checks into each stage.

First, one person coded each case as either gang-related or not gang-related. This might seem a step backwards, but it focused researchers’ measurement on the separate dimensions of homicide of interest to them by simplifying the coding process. It was relatively easy to determine whether any evidence of gang activity or membership was present; if not, the case was classified as a nongang youth killing and set aside. Cases that had some evidence of gang involvement were retained for the second coding stage. During this stage, a second researcher randomly selected a 10 percent sample of cases and coded them again, without knowing how the first coder had classified the sampled cases. You will recognize this as an example of interrater reliability.

The second coding stage involved the finer and more difficult classification of cases as either gang-motivated or gang-affiliated. Interrater reliability checks were again conducted, this time on a 25 percent sample of cases. More cases were selected because reliability was lower in this stage; the two coders exhibited less agreement on how to classify gang homicides. Cases in which independent coding produced discrepancies were reviewed and discussed by the two coders until they agreed on how the homicide should be classified.
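Drawing the cases for a blind second coding is a straightforward random-sampling task. The sketch below illustrates the general idea only; it is not the procedure Rosenfeld and associates actually used, and the case identifiers, the total of 200 cases, and the fixed seed are all hypothetical.

```python
import random


def draw_reliability_sample(case_ids, fraction, seed=0):
    """Randomly select a fraction of cases for independent re-coding.

    The fixed seed is only there so the illustration is repeatable.
    """
    ids = list(case_ids)
    rng = random.Random(seed)
    n = max(1, round(len(ids) * fraction))
    return sorted(rng.sample(ids, n))


# Hypothetical case file numbers 1 through 200.
case_ids = range(1, 201)

stage_one_sample = draw_reliability_sample(case_ids, 0.10)  # 10 percent check
stage_two_sample = draw_reliability_sample(case_ids, 0.25)  # 25 percent check
print(len(stage_one_sample), len(stage_two_sample))  # 20 50
```

The second coder would then classify only the sampled cases without seeing the first coder's decisions, and any discrepancies would be set aside for discussion.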

From these very different examples, we expect that you can think of many additional applications of content analysis in criminal justice research. You might wish to consult Ray Surette’s (2006) excellent book Media, Crime, and Justice: Images, Realities, and Policies to learn more about the scope of topics for which content analysis can be used. The General Accounting Office (renamed the Government Accountability Office in 2004) has an excellent guide to content analysis generally (1996).

Secondary Analysis

Data collected by other researchers are often used to address new research questions.

Our final topic encompasses all sources of criminal justice data we have described in this and preceding chapters: content analysis, agency records, field observation, and surveys. We begin with an example of an unusually ambitious use of secondary data by a prolific criminal justice scholar.

For almost three decades, Wesley Skogan has examined the influence of crime on the lives of urban residents. In most cases, his research has relied on sample surveys to investigate questions about fear of crime (Skogan and Maxfield 1981), community crime prevention (Skogan 1988), and the relationships between urban residents and police (Skogan 1990b), among others. He has long recognized the importance of incivilities (symbols of social disorder) as indicators of neighborhood crime problems and as sources of fear for urban residents.

In 1990, Skogan published a comprehensive study of incivilities, drawing on his own research as well as studies by others (Skogan 1990a). However, instead of conducting new surveys to collect original data, Skogan based his findings on secondary analysis of 40 surveys conducted in six cities from 1977 through 1983. He aggregated responses from about 13,000 individuals and examined questions about the sources of disorder, its impact, and the scope of action by individuals and police.

Secondary analysis of data collected by other researchers has become an increasingly important tool. Like Skogan, numerous criminal justice researchers have reanalyzed data collected by others. Several factors contribute to this, including the high cost of collecting original data through surveys or other means. More important, however, is that data for secondary analysis are readily available, largely because of efforts by the National Institute of Justice (NIJ), the BJS, and the Interuniversity Consortium for Political and Social Research (ICPSR).

Suppose you are interested in the relationships between delinquency, drug use, and school performance among adolescents. The National Youth Survey (NYS), which includes responses from 1,725 youths interviewed nine times from 1975 through 2004, might suit your needs nicely. NYS data were originally collected by Delbert Elliott and associates (for example, Elliott, Huizinga, and Ageton 1985). However, like Janet Lauritsen and associates (Lauritsen, Sampson, and Laub 1991), who used the NYS to examine links between delinquency and victimization, you may be able to reanalyze the survey data to address your own research questions.

Or perhaps you wish to learn whether there are differences in the sentencing decisions of black and white judges. Cassia Spohn (1990) addressed this question using data originally collected by Milton Heumann and Colin Loftin (1979), who were interested in the effect of a new Michigan law on plea bargaining. Spohn was able to conduct a secondary analysis of the same data to answer a different research question. Let’s examine these examples more closely to see how they illustrate the uses and advantages of secondary analysis.

Original NYS data were collected by Elliott and associates (1985, 91) for three related research purposes: (1) to estimate the prevalence and incidence of delinquency and drug use among U.S. adolescents, (2) to assess causal relationships between drug use and delinquency, and (3) to test a comprehensive theory of delinquency. The NYS was designed as a panel survey, in which a nationally representative sample of youths aged 11 to 17 in 1976 was interviewed once each year from 1976 through 1989. As we described in Chapter 3, this is an example of a longitudinal study, and it is especially well suited to disentangling the time ordering of such behaviors as drug use and delinquency.

Lauritsen and associates (1991) were interested in the time order of somewhat different behaviors (delinquency and victimization) that were not directly addressed by the original researchers. A longitudinal design was equally important in this secondary analysis because Lauritsen and associates sought to determine whether youths experienced violent victimization after committing delinquent acts or vice versa. Given this research interest, they faced two choices: collect original data by conducting a new panel survey or reanalyze existing data from a panel survey that included questions on victimization and self-reported delinquency. Because the NYS included questions appropriate for their research purpose, Lauritsen and colleagues were spared the need and (considerable) expense of conducting a new panel study.

Sources of Secondary Data

As a college student, you probably would not be able to launch a nine-wave panel study of a national sample of adolescents or even gather records from some 2,600 felony cases in Michigan. You do, however, have access to the same data used in those studies, together with data from thousands of other research projects, through the ICPSR at the University of Michigan. For more than 40 years, the ICPSR has served as a central repository of machine-readable data collected by social scientific researchers. Current holdings include data from thousands of studies conducted by researchers all over the world.


Of particular interest to criminal justice researchers is the National Archive of Criminal Justice Data (NACJD), established by the BJS in cooperation with the ICPSR. Here you will find the NYS, Heumann and Loftin’s sentencing data, and each of the 40 surveys analyzed by Skogan for the book we mentioned earlier. There’s more, including surveys on criminal justice topics by national polling firms, the NCVS from 1972 to the present, periodic censuses of juvenile detention and correctional facilities, a study of thefts from commercial trucks in New York City, and data from Marvin Wolfgang’s classic study of a Philadelphia birth cohort. Data from the growing National Incident-Based Reporting System (NIBRS) are now available, with an expanding number of participating agencies dating from 1996. Data from selected studies are even available for online data analysis. The possibilities are almost endless and grow each year as new data are added to the archives.

One of the most useful websites for aggregate secondary data is maintained by the BJS. Summary tabulations for many published data series are presented as graphs or tables. In addition, it’s possible to download summary data in spreadsheet format to easily conduct additional analysis, to display graphs in different forms, and even to prepare transparencies for presentations.

Other sites on the Internet offer a virtually unlimited source of secondary data. You can obtain documentation for most data archived by the ICPSR and the NACJD, as well as health statistics, census data, and other sources limited only by your imagination. Find the NACJD website at www.icpsr.umich.edu/NACJD/ (accessed May 14, 2008).

Advantages and Disadvantages of Secondary Data

The advantages of secondary analysis are obvious and enormous: it is cheaper and faster than collecting original data, and depending on who did the original study, you may benefit from the work of topflight professionals and esteemed academics.

Potential disadvantages must be kept in mind, however. The key problem involves the recurrent question of validity. When one researcher collects data for one particular purpose, you have no assurance that those data will be appropriate to your research interests. Typically you’ll find that the original researcher collected data that come close to measuring what you are interested in, but you may wish key variables had been operationalized just a little differently.

The question, then, is whether secondary data provide valid measures of the variables you want to analyze. This closely resembles one of the key problems in the use of agency records. Perhaps a particular set of data does not provide a totally satisfactory measure of what interests you, but other sets of data are available. Even if no one set of data provides totally valid measures, you can build up a weight of evidence by analyzing all the possibilities. If each of the imperfect measures points to the same research conclusion, you will have developed considerable support for its accuracy. The use of replication lessens the problem.

In general, secondary data are least useful for evaluation studies. This is because evaluations are designed to answer specific questions about specific programs. It is always possible to reanalyze data from evaluation studies, but secondary data cannot be used to evaluate an entirely different program. Thus, for example, a number of researchers have reexamined data collected for a series of domestic violence experiments conducted by Lawrence Sherman and others in several cities (see Sherman 1992b for a summary). In most cases, these secondary researchers (such as Maxwell, Garner, and Fagan 2001) wished to verify or reassess findings from the original studies. But it is not possible to use those data to answer questions about domestic violence interventions other than arrest or to evaluate arrest policies in new cities where the experiments did not take place.

In this book, the discussion of secondary analysis has a special purpose. As we conclude our examination of modes of observation in criminal justice research, you should have developed a full appreciation for the range of possibilities available in finding the answers to questions about crime and criminal justice policy. No single method of getting information unlocks all puzzles, yet there is no limit to the ways you can find out about things. And, more powerfully, you can zero in on an issue from several independent directions, gaining an even greater mastery of it.

✪ Main Points

• Data and records produced by formal organizations may be the most common source of data in criminal justice research.
• Many public organizations produce statistics and data for the public record, and these data are often useful for criminal justice researchers.
• All organizations keep nonpublic records for internal operational purposes, and these records are valuable sources of data for criminal justice research.
• Public organizations can sometimes be enlisted to collect new data, through observation or interviews, for use by researchers.
• Although agency records have many potential research uses, because they are produced for purposes other than research they may be unsuitable for a specific study.
• Researchers must be especially attentive to possible reliability and validity problems when they use data from agency records.
• “Follow the paper trail” and “expect the expected” are two general maxims for researchers to keep in mind when using agency records in their research.
• Content analysis is a research method appropriate for studying human communications. Because communication takes many forms, content analysis can study many other aspects of behavior.
• Units of communication, such as words, paragraphs, and books, are the usual units of analysis in content analysis.
• Coding is the process of transforming raw data (either manifest or latent content) into a standardized, quantitative form.
• Secondary analysis refers to the analysis of data collected earlier by another researcher for some purpose other than the topic of the current study.
• Archives of criminal justice and other social data are maintained by the ICPSR and the NACJD for use by other researchers.
• The advantages and disadvantages of using secondary data are similar to those for agency records: data previously collected by a researcher may not match our own needs.

✪ Key Terms

content analysis, p. 244
latent content, p. 245
manifest content, p. 245
secondary analysis, p. 230

✪ Review Questions and Exercises

1. Each year, the BJS publishes the Sourcebook of Criminal Justice Statistics, a compendium of data from many different sources. Consult the online edition of the sourcebook (www.albany.edu/sourcebook; accessed May 14, 2008), select a table of interest to you, and describe how the data presented in that table were originally collected.

2. In New York City, police officers assigned to a specialized gang squad pay special attention to graffiti, or tagging. In doing so, they conduct a type of content analysis to study actions, threats, and other messages presented in this form of communication. Describe how you would plan a formal content analysis of graffiti. Be sure to distinguish manifest and latent content, units of analysis, and coding rules for your study.

✪ Additional Readings

Bureau of Justice Statistics, Data Quality Guidelines (Washington, DC: U.S. Department of Justice, Bureau of Justice Statistics, 2002). In 2001, the U.S. Office of Management and Budget directed that all federal agencies develop guidelines to maximize the quality of information they collect and disseminate. This publication describes how the BJS complied with that directive. It provides an excellent overview of validity and reliability issues in series of data often used by criminal justice researchers.

Geerken, Michael R., “Rap Sheets in Criminological Research: Considerations and Caveats,” Journal of Quantitative Criminology 10(1994): 3–21. You won’t find a more thorough or interesting discussion of how police arrest records are produced and what that means for researchers. Anyone who uses arrest data should read this very carefully.

General Accounting Office, Content Analysis: A Methodology for Structuring and Analyzing Written Material, Transfer Paper 10.3.1. (Washington, DC: U.S. General Accounting Office, 1996). As an agency conducting evaluation studies for the U.S. Congress, the GAO uses a variety of research methods. One of a series of “transfer papers” that describe GAO methods, this book presents an excellent overview of content analysis applications and methods.

Jacob, Herbert, Using Published Data: Errors and Remedies (Thousand Oaks, CA: Sage, 1984). We have often referred to this small book. It is an extremely valuable source of insight into the promise and pitfalls of using agency records.


Part Four

Application and Analysis

This final section of the book draws on concepts and ideas from earlier chapters to bring you closer to the actual process of criminal justice research. Having examined the role of theory, cause and effect, measurement, experiments, and different ways of collecting data, we are now ready to see how these pieces come together.

Criminal justice research can be conducted in many ways to answer many different types of questions. We have touched on various research purposes throughout the text, but the first chapter in this section examines a specific research purpose more closely. Because crime is an important and seemingly intractable social problem, applied research is attracting growing interest from researchers and public officials alike.

Chapter 10 describes evaluation research and problem analysis. As we will see, carefully specifying concepts and being attentive to measures are as important for applied research as they are for other research purposes.

Chapter 11 takes up the question of analysis. After we have designed a research project, specified measures, and collected data, our attention will turn to a search for patterns and relationships for description, explanation, or evaluation, depending on the research purpose. In Chapter 11, we take a preliminary look at descriptive and inferential statistics. Our goal is to establish a familiarity with the principles of basic statistical analysis.


Chapter 10

Evaluation Research and Problem Analysis

In this chapter, our attention centers on applied criminal justice research. Evaluation studies are conducted to learn whether (and why) programs have succeeded or failed. Problem analysis helps officials plan their actions and anticipate the possible effects of new programs.

Introduction
Topics Appropriate for Evaluation Research and Problem Analysis
The Policy Process
Linking the Process to Evaluation
Getting Started
Evaluability Assessment
Problem Formulation
Measurement
Designs for Program Evaluation
Randomized Evaluation Designs
Home Detention: Two Randomized Studies
Quasi-Experimental Designs
Other Types of Evaluation Studies
Problem Analysis and Scientific Realism
Problem-Oriented Policing
Auto Theft in Chula Vista
Other Applications of Problem Analysis
Space- and Time-Based Analysis
Scientific Realism and Applied Research
The Political Context of Applied Research
Evaluation and Stakeholders
WHEN POLITICS ACCOMMODATES FACTS
Politics and Objectivity

Introduction

Evaluation research and problem analysis are increasingly important activities for researchers and public officials alike.

Evaluation research, sometimes called program evaluation, refers to a research purpose rather than a specific research method. Its special purpose is to evaluate the effects of policies such as mandatory arrest for domestic violence, innovations in probation, and new sentencing laws. Another type of evaluation study, problem analysis, helps public officials plan and select alternative actions. Virtually all types of designs, measures, and data collection techniques can be used in evaluation research and problem analysis.

Growth of evaluation research over the last several years no doubt reflects desire on the part of criminal justice researchers to actually make a difference in the world. At the same time, we cannot discount the influence of two additional factors: (1) increased federal requirements for program evaluations to accompany the implementation of new programs and (2) the availability of research funds to meet that requirement.

By the same token, increased interest in program evaluation and problem analysis has followed heightened concern for the accountability of public officials and public policy. Criminal justice agencies are expected to justify the effectiveness and cost of their actions. If traditional approaches to probation supervision, for example, do not deter future lawbreaking, new approaches should be developed and their effectiveness assessed. Or if using temporary detention facilities fabricated from recycled semi-trailers is less costly than constructing new jails, public officials should consider whether the lower-cost alternative will meet their needs for pretrial detention and short-term incarceration.

Justice agencies have come to rely more on evidence-based policy, in which the actions of justice agencies are linked to evidence used for planning and evaluation. Traditional practices are being reevaluated against evidence provided by social science research. The Problem-Oriented Guides series summarizes evidence concerning responses by police and others to problems ranging from Acquaintance Rape of College Students (Sampson 2002) to Witness Intimidation (Johnson 2006). CompStat and its variations base police actions on evidence about the location and circumstances of crime problems. Corrections policies are increasingly evaluated to sort out those that do in fact reduce reoffending (Cullen and Sundt 2003). This trend represents an expansion of applied research that moves beyond collaborations between justice professionals and professional researchers.

Topics Appropriate for Evaluation Research and Problem Analysis

Problem analysis and evaluation are used to develop justice policy and determine its impact.

Evaluation research is appropriate whenever some policy intervention occurs or is planned. A policy intervention is an action taken for the purpose of producing some intended result. In its simplest sense, evaluation research is a process of determining whether the intended result was produced. Problem analysis focuses more on deciding what intervention should be pursued. Given alternative courses of action, which is likely to be least costly, most effective, or least difficult to implement? Our focus, of course, is on the analysis and evaluation of criminal justice policy and criminal justice agencies. However, it will be useful to first consider a simple general model of the policy-making process in order to understand various topics appropriate to evaluation and problem analysis.

The Policy Process

Figure 10.1 presents our model, adapted from Robert Lineberry’s (1977, 42–43) classic summary of a policy system. A similar type of input–output model is described in a National Institute of Justice publication on evaluation guidelines (McDonald and Smith 1989). Although we will examine each step in turn, recognize that the policy process, like the research process generally (see Chapter 3), is fluid and does not always start at the beginning and conclude at the end.

The policy process begins with a demand that normally appears as support for a new course of action or opposition to existing policy. Such demands can emerge from within a public organization or from outside sources. Newspaper stories alleging racial discrimination in drug sentencing can generate demand for revised sentencing policies, or a prosecutor may independently decide to review all sentence recommendations made by deputies who prosecute drug cases. Before any action can be taken, demands must find a place on the policy agenda.

The next step, as shown in Figure 10.1, actually encompasses several steps. Policy makers consider ultimate goals they wish to accomplish and different means of achieving those goals. Does our prosecutor seek absolute equality in sentences recommended for all white and African American drug defendants, or should there be ranges of permissible variation based on criminal history, severity of charges, and so on? Resources must be allocated from available inputs, including personnel, equipment, supplies, and even time. Who will review sentence recommendations? How much time will that take, and will additional staff be required? Because the word policy implies some standard course of action about how to respond to a recurring problem or issue, routine practices and decision rules must be formulated. Will sentence recommendations for each case be reviewed as they are prepared, or is it sufficient to review all cases on a weekly basis?

Policy outputs refer to what is actually produced, in much the same manner that a manufacturer of office supplies produces paper clips and staples. In our hypothetical example, the prosecutor’s policy produces the routine review of sentence recommendations in drug cases. Or, to consider a different example, a selective traffic enforcement program intended to reduce auto accidents on a particular roadway may produce a visible police presence, together with traffic citations for speeding.

In the final stage, we consider the impact of policy outputs. Does the prosecutor’s review process actually eliminate disparities in sentences? Are auto accidents reduced in the targeted enforcement area?

The distinction between policy outputs and their impacts is important for understanding applications of evaluation to different stages of the policy process. Unfortunately, this difference is often confusing to both public officials and researchers. Impacts are fundamentally related to policy goals; they refer to the basic question of what a policy action is trying to achieve. Outputs embody the means to achieve desired policy goals. The prosecutor seeks to achieve equality in sentence recommendations (impact), so a review process is produced as a means to achieve that goal (output). Or a police executive allocates officers, patrol cars, and overtime pay to produce traffic citations (outputs) in the expectation that citations will achieve the goal of reducing auto accidents (impact).

Now consider the left side of Figure 10.1. Our policy model can be expressed as a simple cause-and-effect process such as we considered in earlier chapters. A cause has produced the variation in sentences for African American and white defendants, or a cause has produced a concentration of auto accidents. Policies are formulated to produce an effect, or impact. In this sense, a policy can be viewed as a hypothesis in which an independent variable is expected to produce change in a dependent variable. Sentence review procedures are expected to produce a reduction in sentence disparities; targeted enforcement is expected to produce a reduction in auto accidents. Goal-directed public policies may therefore be viewed as if-then statements: if some policy action is taken, then we expect some result to be produced.

Linking the Process to Evaluation

By comparing this simple model with a general definition of program evaluation given in one of the most widely used texts on the subject (Rossi, Freeman, and Lipsey 1999), the topics appropriate to applied research will become clearer. Peter Rossi and associates (1999, 4; emphasis in original) define program evaluation as

the use of social science research procedures to systematically assess the effectiveness of social intervention programs. More specifically, evaluation researchers (evaluators) use social research methods to study, appraise, and help improve social programs in all their aspects, including the diagnosis of the social problems they address, their conceptualization and design, their implementation and administration, their outcomes, and their efficiency.

We have been discussing systematic social scientific research procedures throughout this book. Now let’s substitute criminal justice for social programs and see how this definition and Figure 10.1 help us understand program evaluation applications.

Figure 10.1 The Policy Process
[Figure not reproduced. Labels in the figure: Policy Demands, Support, Opposition; Policy Agenda; Inputs (Budget, Staff, Equipment, Supplies); Institutional Processes (Choosing goals, Choosing actions, Allocating resources, Formulating routine, Administering routine); Policy Outputs; Policy Impacts; Cause; Policy; Effects.]
Source: Adapted from Lineberry (1977, 42–43).


Problem Analysis

Activities listed under “Institutional Processes” in Figure 10.1 refer to conceptualization and design. For example, faced with a court order to maintain prison populations within established capacity, corrections officials might begin by conceiving and designing different ways to achieve this demand. Problem analysis is an example of a social scientific research procedure that can help corrections officials evaluate alternative actions, choose among them, and formulate routine practices for implementing policy to comply with a court order.

One approach might be to increase rated capacity through new construction or conversion of existing facilities. Another might be to devise a program to immediately reduce the existing population. Still another might be to cut back on the admission of newly sentenced offenders. A more general goal that would certainly be considered is the need to protect public safety. Each goal implies different types of actions, together with different types and levels of resources, that would be considered within constraints implied by the need to protect public safety. If officials from other organizations (prosecutors, judges, or state legislators) were involved in conceptualization and design, then additional goals, constraints, and policies might be considered.

Increasing capacity by building more prisons would be the most costly approach in financial terms, but it might also be viewed as the most certain way to protect public safety. Early release of current inmates would be cheaper and faster than building new facilities, but this goal implies other decisions, such as how persons would be selected and whether they would be released to parole or to halfway houses. Each of these alternatives requires some organizational capacity to choose inmates for release, place them in halfway houses, or supervise compliance with parole. Refusing new admissions would be least costly. Political support must be considered for each possible approach. Each alternative (spending money on new construction, accepting responsibility for early release, or tacitly passing the problem on to jails that must house inmates refused admission to state facilities) requires different types of political influence or courage.

Many other topics in criminal justice research are appropriate for problem analysis. Police departments use such techniques to help determine the boundaries of patrol beats and the allocation of other resources. In most large cities, analysts examine the concentration of calls for service in terms of space and time and consider how street layout and obstacles might facilitate or impede patrol car mobility.

A growing number of law enforcement agencies are using computerized crime maps to detect emerging patterns in crime and develop appropriate responses. Producing computer-generated maps that display reported crimes within days after they have occurred is one of the most important policy planning tools for the New York City Police Department (Silverman 1999). Other departments have taken advantage of federal funding and technical assistance to enhance mapping and other crime analysis capabilities (Boba 2005).

Program Evaluation

Problem analysis takes place in the earlier stages of the policy process. In contrast, program evaluation studies are conducted in later stages and seek answers to two types of questions: (1) Are policies being implemented as planned? (2) Are policies achieving their intended goals? Evaluation, therefore, seeks to link the intended actions and goals of criminal justice policy to empirical evidence that policies are being carried out as planned and are having the desired effects. These two types of questions correspond to two related types of program evaluations: process evaluation and impact assessment. Returning to our example of policies to reduce prison population, we will first consider impact assessment and then process evaluation.

Let’s assume that corrections department policy analysts select an early-release program to reduce the population of one large institution. Inmates who have less than 120 days remaining on their sentence and who were committed for nonviolent offenses will be considered for early release. Further assume that of those inmates selected for early release, some will be assigned to parole officers, and some will serve their remaining sentence in halfway houses, working at jobs during the week but spending evenings and weekends in a community-based facility.

The program has two general goals: (1) to reduce prison population to the court-imposed ceiling and (2) to protect public safety. Whereas the first goal is fairly straightforward, the second is uncomfortably vague. What do we mean by “protecting public safety”? For now, let’s say we will conclude that the program is successful in this regard if, after six months, persons in the two early-release conditions have aggregate rates of arrest for new offenses equal to or less than a comparison group of inmates released after completing their sentences.

Our impact assessment would examine data on the prison population before and after the new program was implemented, together with arrest records for the two types of early releases and a comparison group. We might obtain something like the hypothetical results shown in Table 10.1.

Did the program meet its two goals? Your initial reaction might be that it did not, but Table 10.1 presents some interesting findings. The prison population certainly was reduced, but it did not reach the court-imposed cap of 1,350. Those released to halfway houses had lower arrest rates than others, but persons placed on early parole had higher arrest rates. Averaging arrest rates for all three groups shows that the total figure is about the same as that for persons released early. Notice also that almost twice as many people were released to early parole as were placed in halfway houses.

The impact assessment results in Table 10.1 would have been easier to interpret if we had conducted a process evaluation. A process evaluation focuses on program outputs, as represented in Figure 10.1, seeking answers to the question of whether the program was implemented as intended. If we had conducted a process evaluation of this early-release program, we might have discovered that something was amiss in the selection process. Two pieces of evidence in Table 10.1 suggest that one of the selection biases we considered in Chapter 5, “creaming,” might be at work in this program. Recall that creaming is the natural tendency of public officials to choose experimental subjects least likely to fail. In this case, selectivity is indicated by the failure of the early-release program to meet its target number, the relatively small number of persons placed in halfway houses, and the lower re-arrest rates for these persons. A process evaluation would have monitored selection procedures and probably revealed evidence of excessive caution on the part of corrections officials in releasing offenders to halfway houses.

Ideally, impact assessments and process evaluations are conducted together. Our example illustrates the important general point that process evaluations make impact assessments more interpretable. In other cases, process evaluations may be conducted when an impact assessment is not possible. To better understand how process evaluations and impact assessments complement each other, let’s now look more closely at how evaluations are conducted.

Table 10.1 Hypothetical Results of Early-Prison-Release Impact Assessment

                                   Percent New Arrests    Number of
                                   After 6 Months         Persons
Normal release                     26%                    142
Early release to halfway houses    17                     25
Early parole                       33                     46
Subtotal early release             27                     71
Total                              26                     213

Note: Preprogram population = 1,578; actual population after implementation = 1,402; court-imposed population cap = 1,350.
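The subtotal and total rows of Table 10.1, and the two program goals discussed in the text, can be checked with a few lines of arithmetic. The following is only a minimal sketch in Python: the variable and function names are illustrative assumptions, and the arrest counts are approximated from the rounded percentages in the table, since the table does not report raw counts.

```python
# Hypothetical figures copied from Table 10.1:
# group -> (percent with new arrests after 6 months, number of persons)
TABLE_10_1 = {
    "normal release": (26, 142),
    "early release to halfway houses": (17, 25),
    "early parole": (33, 46),
}
EARLY_GROUPS = ["early release to halfway houses", "early parole"]

# Population figures from the note to Table 10.1.
PREPROGRAM_POPULATION = 1578
POST_PROGRAM_POPULATION = 1402
COURT_CAP = 1350


def pooled_arrest_rate(group_names):
    """Person-weighted percent arrested across the named groups.

    Arrest counts are approximated as percent x persons, because the
    table reports only rounded percentages.
    """
    persons = sum(TABLE_10_1[name][1] for name in group_names)
    arrests = sum(TABLE_10_1[name][0] / 100 * TABLE_10_1[name][1]
                  for name in group_names)
    return 100 * arrests / persons


early_rate = pooled_arrest_rate(EARLY_GROUPS)
total_rate = pooled_arrest_rate(list(TABLE_10_1))
comparison_rate = TABLE_10_1["normal release"][0]

# Goal 1: bring the population down to the court-imposed cap.
reduction = PREPROGRAM_POPULATION - POST_PROGRAM_POPULATION
shortfall = POST_PROGRAM_POPULATION - COURT_CAP
goal_1_met = POST_PROGRAM_POPULATION <= COURT_CAP

# Goal 2 (the text's working definition of protecting public safety):
# early releasees are arrested at a rate equal to or less than that of
# the comparison group of normal releases.
goal_2_met = early_rate <= comparison_rate

print(f"Early-release arrest rate: {early_rate:.1f}%")
print(f"Overall arrest rate:       {total_rate:.1f}%")
print(f"Population reduced by {reduction}; still {shortfall} above the cap")
print(f"Goal 1 met: {goal_1_met}; Goal 2 met: {goal_2_met}")
```

Under these assumptions the pooled early-release rate rounds to 27 percent and the overall rate to 26 percent, matching the subtotal and total rows. The population falls by 176 but remains 52 above the cap, so neither goal is met as defined, even though the early-release and comparison rates are about the same.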


Getting Started

Learning policy goals is a key first step in doing evaluation research.

Several steps are involved in planning any type of research project. This is especially true in applied studies, for which even more planning may be required. In evaluating a prison early release program, we need to think about design, measurement, sampling, data collection procedures, analysis, and so on. We also have to address such practical problems as obtaining access to people, information, and data needed in an evaluation.

In one sense, however, evaluation research differs slightly in the way research questions are developed and specified. Recall that we equated program evaluation with hypothesis testing; policies are equivalent to if-then statements postulating that an intervention will have the desired impact. Preliminary versions of research questions, therefore, will already have been formulated for many types of evaluations. Problem analysis usually considers a limited range of alternative choices; process evaluations focus on whether programs are carried out according to plans; and impact assessments evaluate whether specified goals are attained.

This is not to say that evaluation research is a straightforward business of using social scientific methods to answer specific questions that are clearly stated by criminal justice officials. It is often difficult to express policy goals in the form of if-then statements that are empirically testable. Another problem is the presence of conflicting goals. Many issues in criminal justice are complex, involving different organizations and people. And different organizations and people may have different goals that make it difficult to define specific evaluation questions. Perhaps most common and problematic are vague goals. Language describing criminal justice programs may optimistically state goals of enhancing public safety by reducing recidivism without clearly specifying what is meant by that objective.

In most cases, researchers have to help criminal justice officials formulate testable goals, something that is not always possible. Other obstacles may interfere with researchers’ access to important information. Because of these and similar problems, evaluation researchers must first address the question of whether to evaluate at all.

Evaluability Assessment

An evaluability assessment is described by Rossi and associates (1999, 157) as sort of a “pre-evaluation,” in which a researcher determines whether conditions necessary for conducting an evaluation are present. One obvious condition is support for the study from organizations delivering program components that will be evaluated. The word evaluation may be threatening to public officials, who fear that their own job performance is being rated. Even if officials do not feel personally threatened by an impact assessment or other applied study, evaluation research can disrupt routine agency operations. Ensuring agency cooperation and support is therefore an important part of evaluability assessment. Even if no overt opposition exists, officials may be ambivalent about evaluation.

This might be the case, for example, if an evaluation is required as a condition of launching a new program. This and other steps in evaluability assessment may be accomplished by scouting a program and interviewing key personnel (Rossi, Freeman, and Lipsey 1999, 135). The focus in scouting and interviewing should be on obtaining preliminary answers to questions that eventually will have to be answered in more detail as part of an evaluation. What are general program goals and more specific objectives? How are these goals translated into program components? What kinds of records and data are readily available? Who will be the primary consumers of evaluation results? Do other persons or organizations have a direct or indirect stake in the program? Figure 10.2 presents a partial menu of questions that can guide information gathering for the evaluability assessment and later stages.

The answers to these and similar questions should be used to prepare a program description. Although official program descriptions may be available, evaluation researchers should always prepare their own description, one that reflects their own understanding of program goals, elements, and operations. Official documents may present incomplete descriptions or ones intended for use by program staff, not evaluators. Even more importantly, official program documents often do not contain usable statements about program goals. As we will see, formulating goal statements that are empirically testable is one of the most important components of evaluation research.

Douglas McDonald and Christine Smith (1989, 1) describe slightly different types of questions to be addressed by criminal justice officials and evaluators in deciding whether to evaluate state-level drug control programs:

How central is the project to the state’s strategy?

How costly is it relative to others?

Are the project’s objectives such that progress toward meeting them is difficult to estimate accurately with existing monitoring procedures?

Such questions are related to setting both program and evaluation priorities. On the one hand, if a project is not central to drug control strategies or if existing information can help determine project effectiveness, then an evaluation should probably not be conducted. On the other hand, costly projects that are key elements in antidrug efforts should be evaluated so that resources can be devoted to new programs if existing approaches are found to be ineffective.
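The priority-setting reasoning in the preceding paragraph can be read as a rough screening rule. The sketch below is one hypothetical way to write that rule in Python; the function name, parameter names, and return labels are assumptions made for illustration and do not come from McDonald and Smith.

```python
def evaluation_priority(central_to_strategy: bool,
                        costly_relative_to_others: bool,
                        existing_monitoring_sufficient: bool) -> str:
    """Rough screening rule paraphrasing the paragraph above.

    existing_monitoring_sufficient is True when existing information can
    already help determine whether the project is effective.
    """
    # Not central, or already measurable with existing information:
    # the text says an evaluation should probably not be conducted.
    if not central_to_strategy or existing_monitoring_sufficient:
        return "probably do not evaluate"
    # Costly projects that are key elements of the strategy should be evaluated.
    if costly_relative_to_others:
        return "evaluate"
    # Central, inexpensive, and hard to monitor: the text gives no explicit rule.
    return "no explicit guidance"


# A costly, central project whose progress is hard to monitor.
print(evaluation_priority(True, True, False))
# A peripheral project, regardless of its cost.
print(evaluation_priority(False, True, False))
```

In practice these judgments are matters of degree rather than yes-or-no answers, so a rule like this is best treated as a checklist for discussion between officials and evaluators, not as a decision procedure.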

Figure 10.2 Evaluation Questions

1. Goals
   a. What is the program intended to accomplish?
   b. How do staff determine how well they have attained their goals?
   c. What formal goals and objectives have been identified?
   d. Which goals or objectives are most important?
   e. What measures of performance are currently used?
   f. Are adequate measures available, or must they be developed as part of the evaluation?

2. Clients
   a. Who is served by the program?
   b. How do they come to participate?
   c. Do they differ in systematic ways from nonparticipants?

3. Organization and Operation
   a. Where are the services provided?
   b. Are there important differences among sites?
   c. Who provides the services?
   d. What individuals or groups oppose the program or have been critical of it in the past?

4. History
   a. How long has the program been operating?
   b. How did the program come about?
   c. Has the program grown or diminished in size and influence?
   d. Have any significant changes occurred in the program recently?

Source: Adapted from Stecher and Davis (1987, 58–59).

Problem Formulation

We mentioned that evaluation research questions may be defined for you. This is true in a general sense, but formulating applied research problems that can be empirically evaluated is an important and often difficult step. Evaluation research is a matter of finding out whether something is or is not there, whether something did or did not happen. To conduct evaluation research, we must be able to operationalize, observe, and recognize the presence or absence of what is under study.

This process normally begins by identifying and specifying program goals. The difficulty of this task, according to Rossi and associates (1999, 167), revolves around the fact that formal statements of goals are often abstract statements about ideal outcomes. Here are some examples of goal statements paraphrased from actual program descriptions:

• Equip individuals with life skills to succeed (a state-level shock incarceration program; MacKenzie, Shaw, and Gowdy 1993).

• Provide a safe school environment conducive to learning (a school resource officer program; Johnson 1999).

• Encourage participants to accept the philosophy and principles of drug-free living (an urban drug court; Finn and Newlyn 1993).

• Provide a mechanism that engages local citizens and community resources in the problem-solving process (a probation-police community corrections program; Wooten and Hoelter 1998).

Each statement expresses a general program objective that must be clarifi ed before we can formulate research questions to be tested em-pirically. We can get some idea of what the fi rst example means, but this goal statement raises several questions. The objective is for individu-als to succeed, but succeed at what? What is meant by “life skills”—literacy, job training, time management, self-discipline? We might also ask whether the program focuses on outputs (equipping people with skills) or on impacts (promoting success among people who are equipped with the skills). On the one hand, an evaluation of program outputs might assess in-dividual learning of skills, without considering whether the skills enhance chances for success. On the other hand, an evaluation of program impacts might obtain measures of success such as stable employment or not being arrested within some specifi ed time period.

In all fairness, these goal statements are taken somewhat out of context; source documents expand on program goals in more detail. They are, however, typical of stated goals or initial responses we might get to the question, “What are the goals of this program?” Researchers require more specific statements of program objectives.

Wesley Skogan (1985) cautions that official goal statements frequently oversell what a program might realistically be expected to accomplish. It’s natural for public officials to be positive or optimistic in stating goals, and overselling may be evident in goal statements. Another reason officials and researchers embrace overly optimistic goals is that they fail to develop a micromodel of the program production process (Weiss 1995). That is, they do not adequately consider just how some specified intervention will work. Referring to Figure 10.1, we can see that developing a micromodel can be an important tool for laying out program goals and understanding how institutional processes are structured to achieve those goals. Skogan (1985, 38; emphasis in original) describes a micromodel as

part of what is meant by a “theory-driven” evaluation. Researchers and program personnel should together consider just how each element of a program should affect its targets. If there is not a good reason why “X” should cause “Y” the evaluation is probably not going to find that it did! Micromodeling is another good reason for monitoring the actual implementation of programs.

A micromodel can also reveal another problem that sometimes emerges in applied studies: inconsistent goals. Michael Maxfield and Terry Baumer (1992) evaluated a pretrial home detention program in which persons awaiting trial for certain types of offenses were released from jail and placed on home detention with electronic monitoring. Five different criminal justice organizations played roles in implementation or had stakes in the program. The county sheriff’s department (1) faced pressure to reduce its jail population. Under encouragement from the county prosecutor (2), the pretrial release program was established. Criminal court judges (3) had the ultimate authority to release defendants to home detention, following recommendations by bail commissioners in a county criminal justice services agency (4). Finally, a community corrections department (5) was responsible for actually monitoring persons released to home detention.

Maxfield and Baumer interviewed persons in each of these organizations and discovered that different agencies had different goals. The sheriff’s department was eager to release as many people as possible to free up jail space for convicted offenders and pretrial defendants who faced more serious charges. Community corrections staff, charged with the task of monitoring pretrial clients, were more cautious and sought only persons who presented a lower risk of absconding or committing more offenses while on home detention. The county prosecutor viewed home detention as a way to exercise more control over some individuals who would otherwise be released under less restrictive conditions. Some judges refused to release people on home detention, whereas others followed prosecutors’ recommendations. Finally, bail commissioners viewed pretrial home detention as a form of jail resource management, adding to the menu of existing pretrial dispositions (jail, bail, or release on recognizance).

The different organizations involved in the pretrial release program were multiple stakeholders: persons and organizations with a direct interest in the program. Each stakeholder had different goals for and different views on how the program should actually operate: who should be considered for pretrial home detention, how they should be monitored, and what actions should be taken against those who violated various program rules. After laying out these goals and considering different measures of program performance, Maxfield and Baumer (1992, 331) developed a micromodel of home detention indicating that electronic monitoring is suitable for only a small fraction of defendants awaiting trial.

Clearly specifying program goals, then, is a fundamental first step in conducting evaluation studies. If officials are not certain about what a program is expected to achieve, it is not possible to determine whether goals are reached. If multiple stakeholders embrace different goals, evaluators must specify different ways to assess those goals. Maxfield (2001) describes a number of different approaches to specifying clear goals.

Measurement

After we identify program goals, our attention turns to measurement, considering first how to measure a program’s success in meeting goals.

Obtaining evaluable statements of program goals is conceptually similar to the measurement process, in which program objectives represent conceptual definitions of what a program is trying to accomplish.

Specifying Outcomes. If a criminal justice program is intended to accomplish something, we must be able to measure that something. If we want to reduce fear of crime, we need to be able to measure fear of crime. If we want to increase consistency in sentences for drug offenses, we need to be able to measure that. Notice, however, that although outcome measures are derived from goals, they are not the same as goals. Program goals represent desired outcomes, whereas outcome measures are empirical indicators of whether those desired outcomes are achieved. Furthermore, if a program pursues multiple goals, then researchers may have to either devise multiple outcome measures or select a subset of possible measures to correspond with a subset of goals.

Keeping in mind our program-as-hypothesis simile, outcome measures correspond to dependent variables: the Y in a simple X → Y causal hypothesis. Because we have already considered what’s involved in developing measures for dependent variables, we can describe how to formulate outcome measures. Pinning down program goals and objectives results in a conceptual definition. We then specify an operational definition by describing empirical indicators of program outcomes.

In our earlier example, Maxfield and Baumer (1992) translated the disparate interests of organizations involved in pretrial home detention into three more specific objectives: (1) ensure appearance at trial, (2) protect public safety, and (3) relieve jail crowding. These objectives led to corresponding outcome measures: (1) failure-to-appear rates for persons released to pretrial home detention, (2) arrests while on home detention, and (3) estimates of the number of jail beds made available, computed by multiplying the number of persons on pretrial home detention by the number of days each person served on the program. Table 10.2 summarizes the goals, objectives, and measures defined by Maxfield and Baumer.
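The third outcome measure is simple arithmetic, and a short sketch may make it concrete. The function name and the sample figures below are hypothetical and are not data from Maxfield and Baumer; the sketch assumes each day a defendant spends on home detention frees one jail bed for one day.

```python
def jail_bed_days(days_served):
    """Total jail bed-days made available.

    Each entry in days_served is the number of days one defendant spent on
    pretrial home detention instead of in jail, so the total is the sum.
    If every defendant served the same number of days, this reduces to
    N defendants x days served.
    """
    return sum(days_served)


# Hypothetical records for four defendants (not data from the study).
example_days = [30, 45, 12, 60]
print(jail_bed_days(example_days))  # 147
```

Summing person by person, rather than multiplying a head count by an average, keeps the measure accurate when defendants spend very different lengths of time on the program.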

Measuring Program Contexts. Measuring the dependent variables directly involved in an impact assessment is only a beginning. As Ray Pawson and Nick Tilley (1997, 69) point out, it is usually necessary to measure the context within which the program is conducted. These variables may appear to be external to the experiment itself, yet they still affect it.

Consider, for example, an evaluation of a job-skills training program coupled with early prison release to a halfway house. The primary outcome measure might be participants’ success at gaining employment after completing the program. We will, of course, observe and calculate the subjects’ employment rates. We should also be attentive to what has happened to the employment and unemployment rates of the community and state where the program is located. A general slump in the job market should be taken into account in assessing what might otherwise seem to be a low employment rate for subjects. Or if all the experimental subjects get jobs following the program, that might result more from a general increase in available jobs than from the value of the program itself.

There is no magic formula or set of guidelines for selecting measures of program context, any more than there is for choosing control variables in some other type of research. Just as we read what other researchers have found about a topic before doing, say, explanatory research, we should learn about the production process for a criminal justice program before conducting an evaluation.

Measuring Program Delivery. In addition to making measurements relevant to the outcomes of a program, it is necessary to measure the program intervention, that is, the experimental stimulus or independent variable. In some cases, this measurement will be handled by assigning subjects to experimental and control groups, if that’s the research design. Assigning a person to the experimental group is the same as scoring that person “yes” on the intervention, and assignment to the control group represents a score of “no.” In practice, however, it’s seldom that simple.

Let’s continue with the job-training example. Some inmates will participate in the program through early release; others will not. But imagine for a moment what job-training programs are actually like. Some subjects will participate fully; others might miss sessions or fool around when they are present. So we may need measures of the extent or quality of participation in the program. And if the program is effective, we should find that those who participated fully have higher employment rates than those who participated less.
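One way to act on this is to record attendance for each subject and compare employment rates by level of participation. The sketch below is only an illustration: the 80 percent cutoff for “full” participation, the record layout, and the sample records are all assumptions made for the example.

```python
def employment_by_participation(records, full_threshold=0.8):
    """Compare employment rates for full and partial participants.

    records: iterable of (sessions_attended, sessions_offered, employed).
    full_threshold: hypothetical cutoff; attending at least this share of
    sessions counts as full participation.
    """
    counts = {"full": [0, 0], "partial": [0, 0]}  # [employed, total]
    for attended, offered, employed in records:
        level = "full" if attended / offered >= full_threshold else "partial"
        counts[level][0] += int(employed)
        counts[level][1] += 1
    return {
        level: (employed / total if total else None)
        for level, (employed, total) in counts.items()
    }


# Hypothetical subjects: (sessions attended, sessions offered, employed?)
subjects = [
    (10, 10, True), (9, 10, True), (8, 10, False),
    (4, 10, False), (5, 10, True), (2, 10, False),
]
print(employment_by_participation(subjects))
```

In these made-up records, two of three full participants found work compared with one of three partial participants. A pattern like that is consistent with an effective program, although it does not rule out the possibility that more motivated subjects both attend more and find jobs more easily.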

Other factors may further confound the administration of the experimental stimulus. Suppose we are evaluating a new form of counseling designed to cure drug addiction. Several counselors administer it to subjects composing an experimental group. We can compare the recovery rate of the experimental group with that of a control group (a group that received some other type of counseling or none at all). It might be useful to include the names of the counselors who treat specific subjects in the experimental group, because some may be more effective than others. If that turns out to be the case, we must find out why the treatment works better for some counselors than for others. What we learn will further elaborate our understanding of the therapy itself.

Table 10.2 Pretrial Home Detention with Electronic Monitoring: Goals, Objectives, and Measures

Actor/Organization       Goals
Sheriff                  Release jail inmates
Prosecutor               Increase supervision of pretrial defendants
Judges                   Protect public safety
Bail commission          Provide better jail resource management
Community corrections    Monitor defendant compliance
                         Return violators to jail

Objectives               Measures
Ensure court appearance  Failure-to-appear counts
Protect public safety    Arrests while on program
Relieve jail crowding    N defendants × days served

Source: Adapted from Farrington and associates (1993, 108).

Obtaining measures that reflect actual delivery of the experimental intervention is very important for many types of evaluation designs. Variation in the levels of treatment delivered by a program can be a major threat to the validity of even randomized evaluation studies. Put another way, uncontrolled variation in treatment is equivalent to unreliable measurement of the independent variable.

Specifying Other Variables. It is usually necessary to measure the population of subjects involved in the program being evaluated. In particular, it is important to define those for whom the program is appropriate. In evaluation studies, such persons are referred to as the program’s target population. If we are evaluating a program that combines more intensive probation supervision with periodic urine testing for drug use, it’s probably appropriate for convicted persons who are chronic users of illegal drugs, but how should we define and measure chronic drug use more specifically? The job-skills training program mentioned previously is probably appropriate for inmates who have poor employment histories, but a more specific definition of employment history is needed.

This process of definition and measurement has two aspects. First, the program target population must be specified. This is usually done in a manner similar to the process of defining program goals. Drawing on questions like those in Figure 10.2, evaluators consult program officials to identify the intended targets or beneficiaries of a particular program. Because the hypothetical urine-testing program is combined with probation, its target population will include persons who might receive suspended sentences with probation. However, offenders convicted of crimes that carry nonsuspendable sentences will not be in the target population. Prosecutors and other participants may specify additional limits to the target population, such as employment or no previous record of probation violations.

Most evaluation studies that use individual people as units of analysis also measure such background variables as age, gender, educational attainment, employment history, and prior criminal record. Such measures are made to determine whether experimental programs work best for males, those older than 25, high school graduates, persons with fewer prior arrests, and so forth.

Second, in providing for the measurement of these different kinds of variables, we need to choose whether to create new measures or use ones already collected in the course of normal program operation. If our study addresses something that’s not routinely measured, the choice is easy. More commonly, at least some of the measures we are interested in will be represented in agency records in some form or other. We then have to decide whether agency measures are adequate for our evaluation purposes.

Because we are talking about measurement here, our decision to use our own measures or those produced by agencies should, of course, be based on an assessment of measurement reliability and validity. If we are evaluating the program that combined intensive probation with urinalysis, we will have more confidence in the reliability and validity of basic demographic information recorded by court personnel than in court records of drug use. In this case, we might want to obtain self-report measures of drug use and crime commission from subjects themselves, rather than relying on official records.


By now, it should be abundantly clear that measurement must be taken very seriously in evaluation research. Evaluation researchers must carefully determine all the variables to be measured and obtain appropriate measures for each. However, such decisions are typically not purely scientific ones. Evaluation researchers often must work out their measurement strategy with the people responsible for the program being evaluated.

Designs for Program Evaluation

Designs used in basic research are readily adapted for use in evaluation research.

Chapter 5 introduced a variety of experimental and other designs that researchers use in studying criminal justice. Recall that randomly assigning research subjects to experimental or control groups controls for many threats to internal validity. Here our attention turns specifically to the use of different designs in program evaluation.

Randomized Evaluation Designs

To illustrate the advantages of random assignment, consider this dialogue from Lawrence Sherman’s book Policing Domestic Violence: Experiments and Dilemmas (1992b, 67):

When the Minneapolis domestic violence experiment was in its final planning stage, some police officers asked: “Why does it have to be a randomized experiment? Why can’t you just follow up the people we arrest anyway, and compare their future violence risks to the people we don’t arrest?”

Since this question reveals the heart of the logic of controlled experiments, I said, “I’m glad you asked. What kind of people do you arrest now?”

“Assholes,” they replied. “People who commit aggravated POPO.”

“What is aggravated POPO?” I asked.

“Pissing off a police officer,” they answered. “Contempt of cop. But we also arrest people who look like they’re going to be violent, or who have caused more serious injuries.”

“What kind of people do you not arrest for misdemeanor domestic assault?” I continued.

“People who act calm and polite, who lost their temper but managed to get control of themselves,” came the answer.

“And which kinds of people do you think would have higher risks of repeat violence in the future?” I returned.

“The ones we arrest,” they said, the light dawning.

“But does that mean arrest caused them to become more violent?” I pressed.

“Of course not; we arrested them because they were more trouble in the first place,” they agreed.

“So just following up the ones you arrest anyway wouldn’t tell us anything about the effects of arrest, would it?” was my final question.

“Guess not,” they agreed. And they went on to perform the experiment.

Sherman’s dialogue portrays the obvious problems of selection bias in routine police procedures for handling domestic violence. In fact, one of the most important benefits of randomization is to avoid the selectivity that is such a fundamental part of criminal justice decision making. Police selectively arrest people, prosecutors selectively file charges, judges and juries selectively convict defendants, and offenders are selectively punished. In a more general sense, randomization is the great equalizer: through probability theory, we can assume that groups created by random assignment will be statistically equivalent.
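The logic of the dialogue can be illustrated with a small simulation. Everything in this sketch is an assumption made for illustration: each person has an underlying risk of repeat violence, arrest has no effect at all on that risk, and officers using discretion arrest the higher-risk half. None of the numbers come from the Minneapolis experiment.

```python
import random


def simulate_arrest_comparisons(n=20000, seed=1):
    """Return (discretionary_gap, randomized_gap) in repeat-violence rates.

    Each gap is the repeat-violence rate of arrested people minus that of
    people not arrested. Arrest has NO effect by construction, so any gap
    reflects who was selected for arrest, not what arrest did.
    """
    rng = random.Random(seed)
    discretionary = {True: [0, 0], False: [0, 0]}  # arrested -> [repeat, total]
    randomized = {True: [0, 0], False: [0, 0]}
    for _ in range(n):
        risk = rng.random()                  # underlying risk, 0 to 1
        repeats = rng.random() < risk        # outcome ignores arrest entirely
        arrested_by_choice = risk > 0.5      # officers arrest higher-risk people
        arrested_by_coin = rng.random() < 0.5  # random assignment
        for table, arrested in ((discretionary, arrested_by_choice),
                                (randomized, arrested_by_coin)):
            table[arrested][0] += repeats
            table[arrested][1] += 1

    def gap(table):
        return (table[True][0] / table[True][1]
                - table[False][0] / table[False][1])

    return gap(discretionary), gap(randomized)


discretionary_gap, randomized_gap = simulate_arrest_comparisons()
print(f"discretionary arrests: gap = {discretionary_gap:.3f}")
print(f"random assignment:     gap = {randomized_gap:.3f}")
```

Under these assumptions the discretionary comparison shows arrested people repeating far more often, even though arrest did nothing, while the randomized comparison shows a gap near zero. That is the sense in which random assignment produces statistically equivalent groups.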

Randomized designs are not suitable for evaluating all experimental criminal justice programs. Certain requirements of randomized studies mean that this design cannot be used in many situations. A review of those requirements illustrates many of the limits of randomized designs for applied studies.

Program and Agency Acceptance. Random assignment of people to receive some especially desirable or punitive treatment may not be possible for legal, ethical, and practical reasons. We discussed ethics and legal issues in Chapter 2. Sometimes practical obstacles may also be traced to a misunderstanding of the meaning of random assignment. It is crucial that public officials understand why randomization is desirable and that they fully endorse the procedure.

Richard Berk and associates (2003) describe how researchers obtained cooperation for an evaluation of a new inmate classification system in the California Department of Corrections (CDC) by appealing to the needs of agency managers. Preliminary research suggested that the experimental classification system would increase inmate and staff safety at lower cost than classification procedures then in use. In addition:

Plans for the study were thoroughly reviewed by stakeholders, including CDC administrators, representatives of prison employee bargaining unions, . . . California State legislative offices, and a wide variety of other interested parties. There was widespread agreement that the study was worth doing. (Berk et al. 2003, 211)

At the same time, justice agencies have expanding needs for evaluations of smaller-scale programs. John Eck (2002) explains how designs that are less elaborate are more likely to be accepted by public agencies.

Minimization of Exceptions to Random Assignment. In any real-world delivery of alternative programs or treatments to victims, offenders, or criminal justice agency staff, exceptions to random assignment are all but inevitable. In a series of experiments on police responses to domestic violence, officers responded to incidents in one of three ways, according to a random assignment procedure (Sherman et al. 1992). The experimental treatment was arrest; control treatments included simply separating parties to the dispute or attempting to advise and mediate. Although patrol officers and police administrators accepted the random procedure, exceptions were made as warranted in individual cases, subject to an officer’s discretionary judgment.

As the number of exceptions to random assignment increases, however, the statistical equivalence of experimental and control groups is threatened. When police (or others) make exceptions to random assignment, they are introducing bias into the selection of experimental and control groups. Randomized experiments are best suited for programs in which such exceptions can be minimized. The prison classification study by Berk and associates offers a good example. Random assignment was automatic: inmates having odd identification numbers at intake were assigned to the treatment group, while those having even numbers were in the control group. This procedure produced treatment and control groups that were virtually identical in size: 9,662 in treatment and 9,656 controls (2003, 224-225).
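An automatic rule of this kind leaves no room for discretionary exceptions. A minimal sketch follows; it assumes identification numbers are integers issued without regard to anything related to outcomes, and the sample numbers are invented.

```python
def assign_by_id_parity(inmate_id):
    """Odd identification numbers go to treatment, even numbers to control."""
    return "treatment" if inmate_id % 2 == 1 else "control"


# Hypothetical intake numbers, not real identifiers.
intake_ids = range(1001, 1011)
groups = [assign_by_id_parity(i) for i in intake_ids]
print(groups.count("treatment"), groups.count("control"))  # 5 5
```

Because the rule depends only on the number already assigned at intake, staff cannot steer a particular inmate into one group or the other, which is what keeps the two groups equivalent.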

Adequate Case Flow for Sample Size. In Chapter 6, we examined the relationship between sample size and accuracy in estimating population characteristics. As sample size increases (up to a point), estimates of population means and standard errors become more precise. By the same token, the number of subjects in groups created through random assignment is related to the researcher’s ability to detect significant differences in outcome measures between groups. If each group has only a small number of subjects, statistical tests can detect only very large program effects or differences in outcome measures between the two groups. This is a problem with statistical conclusion validity and sample size, as we discussed in Chapters 5 and 6.
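A rough calculation shows how strongly group size limits what a study can detect. The sketch below uses a standard normal approximation for comparing two proportions with equal group sizes, a two-sided 5 percent significance level (z = 1.96), and 80 percent power (z = 0.84). The 30 percent base rate and the group sizes are hypothetical, and the formula is a planning approximation rather than an exact test.

```python
import math


def min_detectable_difference(n_per_group, base_rate,
                              z_alpha=1.96, z_power=0.84):
    """Approximate smallest difference in proportions a study can detect.

    Assumes two equal-sized groups and an outcome (say, rearrest) that
    occurs at roughly base_rate in both groups.
    """
    standard_error = math.sqrt(2 * base_rate * (1 - base_rate) / n_per_group)
    return (z_alpha + z_power) * standard_error


for n in (50, 500, 5000):
    mdd = min_detectable_difference(n, base_rate=0.30)
    print(f"{n:>5} per group: detectable difference of about {mdd:.3f}")
```

With 50 subjects per group and these assumptions, only a difference of roughly 26 percentage points would be reliably detected; with 5,000 per group, a difference under 3 points would be.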


Case flow represents the process through which subjects are accumulated in experimental and control groups. In Sherman’s domestic violence evaluations, cases flowed into experimental and control groups as domestic violence incidents were reported to police. Evaluations of other types of programs will generate cases through other processes, for example, offenders sentenced by a court or inmates released from a correctional facility.

If relatively few cases flow through some process and thereby become eligible for random assignment, it will take longer to obtain sufficient numbers of cases. The longer it takes to accumulate cases, the longer it will take to conduct an experiment and the longer experimental conditions must be maintained. Imagine filling the gas tank of your car with a small cup: it would take a long time, it would test your patience, and you would probably tarnish the paint with spilled gasoline as the ordeal dragged on. In a similar fashion, an inadequate flow of cases into experimental groups risks contaminating the experiment through other problems. Getting information about case flow in the planning stages of an evaluation is a good way to diagnose possible problems with numbers of subjects.
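A back-of-the-envelope estimate of this kind is easy to make during planning. In the sketch below, the function, its parameters, and the figures are all hypothetical; `usable_share` stands for the fraction of eligible cases that can actually be randomized.

```python
import math


def months_to_fill(target_per_group, eligible_cases_per_month,
                   usable_share=1.0, groups=2):
    """Estimate how many months of case flow an experiment needs."""
    usable_per_month = eligible_cases_per_month * usable_share
    if usable_per_month <= 0:
        raise ValueError("case flow must be positive")
    return math.ceil(groups * target_per_group / usable_per_month)


# Hypothetical planning figures: 75 subjects wanted in each of two groups.
print(months_to_fill(75, eligible_cases_per_month=10, usable_share=0.8))  # 19
print(months_to_fill(75, eligible_cases_per_month=40, usable_share=0.8))  # 5
```

Under these invented figures, a court producing 10 eligible cases a month would need more than a year and a half of stable experimental conditions, while one producing 40 a month would need about five months.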

Maintaining Treatment Integrity. Treatment integrity refers to whether an experimental intervention is delivered as intended. Sometimes called treatment consistency, treatment integrity is therefore roughly equivalent to measurement reliability. Experimental designs in applied studies often suffer from problems related to treatment inconsistencies. If serving time in jail is the experimental treatment in a program designed to test different approaches to sentencing drunk drivers, treatment integrity will be threatened if some defendants are sentenced to a weekend in jail while others serve 30 days or longer.

Criminal justice programs can vary considerably in the amount of treatment applied to different subjects in experimental groups. For example, Gottfredson and associates (2003) acknowledge that the drug-court treatment in Baltimore County was unevenly implemented. Only about half of those assigned to the experimental group received certified drug treatment. In contrast, the classification system tested by Berk and associates was a relatively simple treatment that was readily standardized. There was no danger of treatment dilution as was the case in the drug-court experiment.

Midstream changes in experimental programs can also threaten treatment integrity. Rossi and associates (1999, 297) point out that the possibility of midstream changes means that randomized designs are usually not appropriate for evaluating programs in early stages of development, when such changes are more likely. For example, assume we are evaluating an intensive supervision probation program with randomized experimental and control groups. Midway through the experiment, program staff decide to require weekly urinalysis for everyone in the experimental group (those assigned to intensive supervision). If we detect differences in outcome measures between the experimental and control groups (say, arrests within a year after release), we will not know how much of the difference is due to intensive supervision and how much might be due to the midstream change of adding urine tests.

Summing Up the Limits of Randomized Designs. Randomized experiments therefore require that certain conditions be met. Staff responsible for program delivery must accept random assignment and further agree to minimize exceptions to randomization. Case flow must be adequate to produce enough subjects in each group so that statistical tests will be able to detect significant differences in outcome measures. Finally, experimental interventions must be consistently applied to treatment groups and withheld from control groups.

These conditions, and the problems that may result if they are not met, can be summarized as two overriding concerns in field experiments: (1) equivalence between experimental and control groups before an intervention, and (2) the ability to detect differences in outcome measures after an intervention is introduced. If there are too many exceptions to random assignment, experimental and control groups may not be equivalent. If there are too few cases, or inconsistencies in administering a treatment, or treatment spillovers to control subjects, outcome measures may be affected in such a way that researchers cannot detect the effects of an intervention.

Let’s now look at an example that illustrates both the strengths of randomized experiments and constraints on their use in criminal justice program evaluations.

Home Detention: Two Randomized Studies

Terry Baumer and Robert Mendelsohn conducted two randomized experiments to evaluate programs that combine home detention with electronic monitoring (ELMO). In earlier chapters, we examined how different features of these studies illustrated measurement principles; here our focus is on the mechanics of random assignment and program delivery.

In their first study, Baumer and Mendelsohn evaluated a program that targeted adult offenders convicted of nonviolent misdemeanor and minor felony offenses (Baumer and Mendelsohn 1990; also summarized in Baumer, Maxfield, and Mendelsohn 1993). The goal of the program was to provide supervision of offenders that was more intensive than traditional probation but less restrictive and less costly than incarceration. Several measures of outcomes and program delivery were examined, as we have described in earlier chapters.

Baumer and Mendelsohn selected a randomized posttest-only design, in which the target population was offenders sentenced to probation. Subjects were randomly assigned to an experimental group in which the treatment was electronically monitored home detention or to a control group sentenced to home detention without electronic monitoring. Figure 10.3 summarizes case flow into the evaluation experiment. After a guilty plea or trial conviction, probation office staff reviewed offenders’ backgrounds and criminal records for the purpose of recommending an appropriate sentence. The next step was a hearing, at which sentences were imposed by a criminal court judge.

Persons sentenced to probation were eligible for inclusion in the experiment. Their case files were forwarded to staff in the community corrections agency responsible for administering the home detention programs. On receiving an eligible case file, community corrections staff telephoned the evaluation researchers, who, having prepared a random list of case numbers, assigned subjects to either the treatment or control group. Subject to two constraints, this process produced 78 treatment subjects and 76 control subjects.
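Preparing the list before any cases arrive keeps the decision out of the hands of program staff. The source says only that researchers had prepared a random list of case numbers; the approach below, an evenly split list of labels shuffled once in advance, is an assumption made for illustration, as are the function name and the seed.

```python
import random


def prepare_assignment_list(n_cases, seed=None):
    """Build a shuffled list of group labels before the study begins.

    The list holds equal numbers of each label (an odd final slot is
    filled at random), so the k-th eligible case simply receives the
    k-th label when community corrections staff call.
    """
    rng = random.Random(seed)
    labels = ["treatment", "control"] * (n_cases // 2)
    if n_cases % 2:
        labels.append(rng.choice(["treatment", "control"]))
    rng.shuffle(labels)
    return labels


assignments = prepare_assignment_list(154, seed=2024)
print(assignments.count("treatment"), assignments.count("control"))  # 77 77
print(assignments[0])  # label given to the first eligible case
```

A list like this would yield equal groups if every case stayed in the study. The constraints described next explain why the actual groups ended up slightly unequal.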

Thinking back on our consideration of ethics in Chapter 2, you should be able to think of one constraint: informed consent. Researchers and program staff explained the evaluation project to subjects and obtained their consent to participate in the experiment. Those who declined to participate in the evaluation study could nevertheless be assigned to home detention as a condition of their probation. The second constraint was made necessary by the technology of ELMO: subjects could not be kept in the treatment group if they did not have a telephone that could be connected to the ELMO equipment.

Notice that random assignment was made after sentencing. Baumer and Mendelsohn began their evaluation by randomizing subjects between stages 2 and 3 in Figure 10.3. This produced problems because judges occasionally overruled presentence investigation recommendations to probation, thus overriding random assignment. After detecting this problem, Baumer and Mendelsohn (1990, 27-29) moved randomization downstream, so that judicial decisions could not contaminate the selection process.

Baumer and Mendelsohn (1990, 26) obtained agreement from community corrections staff, prosecutors, and judges to use random assignment by getting all parties to accept an assumption of “no difference”:

That is, in the absence of convincing evidence to the contrary, they were willing to assume that there was no difference between the . . . methods of monitoring. This allowed the prosecutor to negotiate and judges to assign home detention as a condition of probation only, while permitting the community corrections agency to make the monitoring decision.

Convinced of the importance of random assignment, the community corrections agency delegated to researchers the responsibility for making the monitoring decision, a “decision” that was randomized.

In this example, the experimental condition (ELMO) was readily distinguished from the control condition, home detention without ELMO. There was no possibility of treatment spillover; control subjects could not unintentionally receive ELMO because they had neither the bracelet nor the home-base unit that embodied the treatment. ELMO could therefore be readily delivered to subjects in the experimental group and withheld from control subjects.

The second ELMO evaluation conducted by Baumer and Mendelsohn reveals how program delivery problems can undermine the strengths of random assignment (Baumer, Maxfield, and Mendelsohn 1993). In their study of juvenile burglars, they used similar procedures for randomization, but eligible subjects were placed in one of four groups, as illustrated in this table:

                  Electronic Monitoring?
Police Visits?    No      Yes
No                C       E1
Yes               E2      E3

Figure 10.3 Home Detention for Convicted Adults: Case Flow and Random Assignment

1. Conviction or Guilty Plea
2. Presentence Investigation
3. Sentence Hearing
4. Probation Sentences (eligible population); Other Sentences fall outside the eligible population
5. Random Assignment
6. Program Intake
   Experimental Group: home detention, electronic monitoring (78)
   Control Group: home detention, manual monitoring (76)


Juvenile burglars could be randomly assigned to three possible treatments: ELMO only (E1), police visits to their home after school only (E2), or ELMO and police visits (E3). Subjects in the control group (C) were sentenced to home detention only. As in the adult study, outcome measures included arrests after release.
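To make the four-group design concrete, here is a minimal sketch of how subjects might be randomly assigned to the cells C, E1, E2, and E3. The source does not describe the actual randomization procedure, so the shuffle-and-deal approach, the function name `assign_groups`, and the subject identifiers are all assumptions made for illustration.

```python
import random
from collections import Counter

# The four cells of the 2x2 design described above.
GROUPS = ["C", "E1", "E2", "E3"]

def assign_groups(subject_ids, seed=None):
    """Randomly assign each subject to one of the four groups.

    Subjects are shuffled, then dealt to the groups in rotation, so group
    sizes differ by at most one. This balancing step is an assumption of
    the sketch, not a documented feature of the study.
    """
    rng = random.Random(seed)
    shuffled = list(subject_ids)
    rng.shuffle(shuffled)
    return {sid: GROUPS[i % len(GROUPS)] for i, sid in enumerate(shuffled)}

# Hypothetical subject identifiers, used only for illustration.
assignment = assign_groups(range(1, 21), seed=42)
group_sizes = Counter(assignment.values())
print(group_sizes)
```

Because assignment depends only on the shuffle, no characteristic of a subject can influence which treatment that subject receives. As the juvenile study shows, however, this guarantees nothing about how faithfully each treatment is then delivered.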

Although there were no problems with random assignment, inconsistencies in the delivery of each of the two experimental treatments produced uninterpretable results (Maxfield and Baumer 1991, 5):

Observations of day-to-day program operations revealed that, compared with the adult program, the juvenile court and cooperating agencies paid less attention to delivering program elements and using information from . . . the electronic monitoring equipment. Staff were less well trained in operating the electronic monitoring equipment, and police visits were inconsistent.

The box titled “Home Detention” in Chapter 1 elaborates on differences in the operation of these two programs and a third ELMO program for pretrial defendants. However, the lesson from these studies bears repeating here: randomization does not control for variation in treatment integrity and program delivery.

Randomized experiments can be powerful tools in criminal justice program evaluations. However, it is often impossible to maintain the desired level of control over experimental conditions. This is especially true for complex interventions that may change while an evaluation is underway. Experimental conditions are also difficult to maintain when different organizations work together in delivering some service, such as a community-based drug treatment provider coupled with intensive probation.

Largely because of such problems, evaluation researchers are increasingly turning to other types of designs that are less fragile, that is, less subject to problems if rigorous experimental conditions cannot be maintained.

Quasi-Experimental Designs

Quasi-experiments are distinguished from true experiments by the lack of random assignment of subjects to an experimental and a control group. Random assignment of subjects is often impossible in criminal justice evaluations. Rather than forgo evaluation altogether in such instances, it is usually possible to create and execute research designs that will permit evaluation of the program in question.

Quasi-experiments may also be nested into experimental designs as backups should one or more of the requisites for a true experiment break down. For example, Shadish, Cook, and Campbell (2002) describe how time-series designs can be nested into a series of random experiments. In the event that case flow is inadequate or random assignment to enhanced or standard counseling regimens breaks down, the nested time-series design will salvage a quasi-experiment.

We considered different classes of quasi-experimental designs (nonequivalent groups, cohorts, and time series) in Chapter 5, together with examples of each type. Each of these designs has been used extensively in criminal justice evaluation research.

Nonequivalent-Groups Designs

As we saw in Chapter 5, quasi-experimental designs lack the built-in controls for selection bias and other threats to internal validity. Nonequivalent-groups designs, by definition, cannot be assumed to include treatment and comparison subjects who are statistically equivalent. For this reason, quasi-experimental program evaluations must be carefully designed and analyzed to rule out possible validity problems.

For evaluation designs that use nonequivalent groups, attention should be devoted to constructing experimental and comparison groups that are as similar as possible on important variables that might account for differences in outcome measures. Rossi and associates (1999) caution that procedures for constructing such groups should be grounded in a theoretical understanding of what individual and group characteristics might confound evaluation results. In a study of recidivism by participants in shock incarceration programs, we certainly want to ensure that equal numbers of men and women are included in groups assigned to shock incarceration and groups that received another sentence. Alternatively, we can restrict our analysis of program effects to only men or only women.

One common reason for using nonequivalent-groups designs is that some experimental interventions are intended to affect all persons in a larger unit, as in a neighborhood crime prevention program. It may not be possible to randomly assign some neighborhoods to receive the intervention while withholding it from others. And we are usually unable to control which individuals in a neighborhood are exposed to the intervention.

Different types of quasi-experimental designs can be used in such cases. In a Kansas City program to reduce gun violence, police targeted extra patrols at gun crime hot spots (Sherman, Shaw, and Rogan 1995). Some beats were assigned to receive the extra patrols, while comparison beats, selected for their similar frequency of gun crimes, did not get the special patrols. Several outcome measures were compared for the two types of areas. After 29 weeks, gun seizures in the target area increased by more than 65 percent and gun crimes dropped by 49 percent. There were no significant changes in either gun crimes or gun seizures in the comparison beat. Drive-by shootings dropped from 7 to 1 in the target area and increased from 6 to 12 in the comparison area. Homicides declined in the target area but not in the comparison area. Citizen surveys showed less fear of crime and more positive feelings about the neighborhood in the target area than in the comparison area.
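The drive-by shooting counts reported above can be put on a common scale by converting them to percent changes. The short sketch below does that arithmetic; the helper name `percent_change` is our own, and the calculation is offered only as an illustration of how target and comparison areas might be compared, not as the analysis the original authors performed.

```python
def percent_change(before, after):
    """Return the percent change from `before` to `after`."""
    if before == 0:
        raise ValueError("percent change is undefined when the baseline is zero")
    return (after - before) / before * 100.0

# Drive-by shooting counts reported in the text above.
target_change = percent_change(7, 1)        # target area
comparison_change = percent_change(6, 12)   # comparison area

print(f"Target area: {target_change:.1f}%")
print(f"Comparison area: {comparison_change:.1f}%")
```

Drive-by shootings fell by roughly 86 percent in the target area while doubling in the comparison area. With counts this small, percent changes should be read cautiously, because a shift of a few incidents produces a large swing.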

Time-Series Designs

Interrupted time-series designs require attention to certain validity threats because researchers cannot normally control how reliably the experimental treatment is implemented. Foremost among these issues are instrumentation, history, and construct validity. In many interrupted time-series designs, conclusions about whether an intervention produced change in an outcome measure rely on simple indicators that represent complex causal processes.

In their evaluation of legislation to provide for mandatory minimum sentences in Oregon, Nancy Merritt and associates (Merritt, Fain, and Turner 2006) examined changes in sentences for different types of offenses. They found that sentences for offenses clearly covered by the law did in fact increase in the first five years after its passage. However, they also found declines in the number of cases filed that were clearly included in the mandatory provisions. Meanwhile, more charges were filed for offenses covered by discretionary provisions. Of course, criminal case prosecution and sentencing are complex processes. The authors could not directly control for different circumstances surrounding cases processed before and after the law took effect. However, their time-series analysis does clearly show changes in case filings, suggesting that prosecutors exercised discretion to evade the mandatory provisions of Oregon’s legislation.

Understanding the causal process that produces measures used in time-series analysis is crucial for interpreting results. Such understanding can come in two related ways. First, we should have a sound conceptual grasp of the underlying causal forces at work in the process we are interested in. Second, we should understand how the indicators used in any time-series analysis are produced.

Patricia Mayhew and associates (Mayhew, Clarke, and Elliott 1989) concluded that laws requiring motorcycle riders to wear helmets produced a reduction in motorcycle theft. This might seem puzzling until we consider the causal constructs involved in stealing motorcycles. Assuming that most motorcycle thefts are crimes of opportunity, Mayhew and associates argue that few impulsive thieves stroll about carrying helmets. Even thieves are sufficiently rational to recognize that a helmetless motorcycle rider will be unacceptably conspicuous, an insight that deters them from stealing motorcycles. Mayhew and colleagues considered displacement as an alternative explanation for the decline in motorcycle theft, but they found no evidence that declines in motorcycle theft were accompanied by increases in either stolen cars or bicycles. By systematically thinking through the causal process of motorcycle theft, Mayhew and associates were able to conclude that helmet laws were unintentionally effective in reducing theft.

Other Types of Evaluation Studies

Earlier in this chapter, we noted how process evaluations are distinct from impact assessments. Whereas the latter seek answers to questions about program effects, process evaluations monitor program implementation, asking whether programs are being delivered as intended.

Process evaluations can be invaluable aids in interpreting results from an impact assessment. We described how Baumer and Mendelsohn were better able to understand outcome measures in their evaluation of ELMO for juvenile burglars because they had monitored program delivery. Similarly, process evaluations were key elements of CCTV evaluations reported by Gill and Spriggs (2005). They were able to describe whether cameras were placed and monitored as intended. In many cases camera placement was modified, something the authors suggest was related to the relative success of different CCTV installations. Without a process evaluation, information about program implementation cannot be linked to outcome measures.

Process evaluations can also be useful for criminal justice officials whose responsibility centers more on the performance of particular tasks than on the overall success of some program. For example, police patrol officers are collectively responsible for public safety in their beat, but their routine actions focus more on performing specific tasks such as responding to a call for service or, in community policing, diagnosing the concerns of neighborhood residents. Police supervisors are attentive to traffic tickets written, arrests made, and complaints against individual officers. Probation and parole officers are, of course, interested in the ultimate performance of their clients, but they are also task-oriented in their use of records to keep track of client contacts, attendance at substance abuse sessions, or job performance. Process evaluations center on measures of task performance, on the assumption that tasks are linked to program outcomes. So process evaluations can be valuable in their own right, as well as important for diagnosing measures of program effects.

Problem Analysis and Scientific Realism

Problem analysis, coupled with scientific realism, helps public officials use research to select and assess alternative courses of action.

Program evaluation differs from problem analysis with respect to the time dimension and where each activity takes place in the policy process. Problem analysis is used to help design alternative courses of action and choose among them.

In reality, there is not much of a difference between these two types of applied research. Similar types of research methods are used to address problem analysis questions (What would happen? What should we do?) as are brought to bear on program evaluation questions (What did happen? What have we done?). Consider, for example, a definition of a similar approach, policy analysis, from a prominent text: “Attempting to bring modern science and technology to bear on society’s problems, policy analysis searches for feasible courses of action, generating information and marshalling evidence of the benefits and other consequences that would follow their adoption and implementation” (Quade 1989, 4). Except for the form of the verb (“would follow”), this is not too different from the way we defined program evaluation.

Results from program evaluations are frequently considered in choosing among future courses of action. Problem analysis and policy analysis depend just as much on clearly specifying goals and objectives as does program evaluation. And the achievement of goals and objectives worked out through problem analysis can be tested through program evaluation. Measurement is also a fundamental concern in both types of applied studies.

Problem-Oriented Policing

More than an alternative approach to law enforcement, the core of problem-oriented policing is applying problem analysis methods to public safety problems. Problem-oriented policing depends on identifying problems, planning and taking appropriate action, then assessing whether those actions achieved intended results.

This approach centers on problems, not individual incidents. For example, traditional policing responds to reports of auto thefts, writing up details about the incident to support an insurance claim, then moving on to the next incident. Let’s consider this incident-oriented policing. In contrast, problem-oriented policing would begin by analyzing a number of auto theft reports. Reports would be examined for similarities, such as where and when they occurred, types of autos stolen, whether stolen cars were eventually recovered, and if so in what condition. Such analysis would define a more general problem of auto theft. Subsequent steps would consider what kinds of actions might be taken to address the problem.

Problem solving is a fundamental tool in problem-oriented policing. As initially defined by Ronald Clarke and John Eck, problem solving involves four analytic steps:

(1) carefully define specific problems . . . ; (2) conduct in-depth analysis to understand their causes; (3) undertake broad searches for solutions to remove these causes and bring about lasting reductions in problems; (4) evaluate how successful these activities have been. (2005, step 7-1)

You can easily see how problem solving merges the application of problem analysis and evaluation (assessment) of the effects of interventions.

Problem-oriented policing is an especially useful example of applied research because a large number of resources are available. We’ll briefly describe three types of such resources: how-to-do-it guides, problem and response guides, and case studies. Most of the first two categories have been prepared with support from the Community Oriented Policing Services (COPS) office in the U.S. Department of Justice. Resources are available at the Center for Problem-Oriented Policing website: www.popcenter.org (accessed May 16, 2008).

How-to-Do-It Guides

Ronald Clarke and John Eck (2005) have prepared a general guide to crime analysis to support problem-oriented policing. Adapted from a document originally prepared for the Jill Dando Institute of Crime Science in London, this publication offers succinct guidance on analysis and reporting results. The COPS office has also sponsored guides that provide more detail on different problem analysis tools: assessment and evaluation (Eck 2003a); understanding the process of repeat victimization (Weisel 2005); conducting background research on problems (Clarke and Schultze 2005); interviewing offenders (Decker 2005); and collaborating with private-sector interests to solve problems (Chamard 2006).

Crime mapping and other methods of space-based analysis are important tools in problem-oriented policing. GIS and Crime Mapping by Spencer Chainey and Jerry Ratcliffe (2005) is an excellent general guide. John Eck and associates (2005) focus on the use of mapping to identify crime hot spots.

Problem and Response Guides

In an earlier chapter, we mentioned that justice agencies frequently adopt programs that appear to have been successful in other jurisdictions. While this can sometimes be advisable, a key principle of problem-oriented policing is to base local actions on an understanding of local problems. Instead of trying an off-the-shelf program or so-called “best practice,” appropriate interventions should be considered only after analyzing the data.

This principle is evident in two series of guides that describe what is known about effective responses based on past experience. Problem guides describe how to analyze very specific types of problems (for example, “Financial Crimes Against the Elderly”), and what are known to be effective or ineffective responses. Response guides describe very general kinds of actions that might be undertaken to address different types of problems (for example, “Video Surveillance of Public Places”).

Case Studies and Other Research

One of the hallmarks of applied research is to use research to change practice. The two groups of guides discussed so far were prepared for use by criminal justice professionals, but they were developed following many years of research. Many examples of research that contributed to changes in justice policy have been published in the series Crime Prevention Studies. We now turn to an example that illustrates the application of problem analysis, as well as other research principles presented in this and earlier chapters.

Auto Theft in Chula Vista

Chula Vista is a medium-sized city of just under 200,000 residents, bordered by the Pacific Ocean on the west, and sandwiched by San Diego on the north and southwest; the city is about seven miles north of the U.S.-Mexico border. Nanci Plouffe and Rana Sampson (2004) began their analysis of vehicle theft by comparing Chula Vista to other southern California cities. After noting that theft rates tended to increase for cities closer to the border, they began to disaggregate the problem by searching for areas where vehicle thefts and break-ins were concentrated. Deborah Weisel (2003) refers to this as “parsing,” or breaking down a large-area measure to examine smaller areas.

Plouffe and Sampson first determined that 10 parking lots accounted for 25 percent of thefts and 20 percent of break-ins in the city. Furthermore, 6 of those 10 lots were also among the top 10 calls-for-service locations in Chula Vista. This meant that auto-theft hot spots also tended to be hot spots for other kinds of incidents. Continuing their analysis, the analysts found some notable patterns:

• Recovery rates for stolen cars and trucks were lower in Chula Vista than in areas to the north.

• Recovery rates in 4 of the 10 hot parking lots were especially low, under 40 percent.

• Smaller pick-up trucks and older Toyota Camrys had even lower recovery rates.

• High-risk lots were close to roads that led to the Mexico border.

Together these findings suggested that many cars stolen from the high-risk areas were being driven into Mexico.
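The kind of concentration finding reported above (a handful of locations accounting for a large share of incidents) comes from a simple tabulation. The sketch below shows one way such a tabulation might be done. The function name `top_location_share` and the incident records are invented for illustration; they are not Chula Vista data, and we do not know what software Plouffe and Sampson used.

```python
from collections import Counter

def top_location_share(incident_locations, n=10):
    """Return the n busiest locations and their share of all incidents.

    `incident_locations` holds one location label per recorded incident.
    """
    counts = Counter(incident_locations)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no incidents supplied")
    top = counts.most_common(n)
    share = sum(count for _, count in top) / total
    return top, share

# Made-up incident records for illustration only (not Chula Vista data).
thefts = ["Lot A"] * 6 + ["Lot B"] * 4 + [f"Street {i}" for i in range(1, 11)]

top, share = top_location_share(thefts, n=2)
print(top, f"{share:.0%}")
```

In this toy example two lots account for half of the 20 recorded thefts. The same tabulation, run separately for thefts, break-ins, and calls for service, would show whether the same places dominate each list.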

Plouffe and Sampson next moved beyond using existing data from police records. This is again consistent with the methods of problem analysis: use existing data to identify problems and their general features, then collect additional data to better understand the mechanisms of problems. For Plouffe and Sampson that meant conducting environmental surveys of high-risk parking lots, observing operations and interviewing officials at U.S.-Mexico border crossings, and interviewing a small number of individuals arrested for auto theft from target lots. They sought to understand why particular lots were targeted and whether stolen cars could be easily driven into Mexico.

We described environmental surveys in Chapter 8. In conducting theirs, Plouffe and Sampson discovered that the highest-risk lot was a two-minute drive from vehicle entry points into Mexico. The lot served a midrange general shopping mall with typical open parking. Access was easy and thieves could expect that vehicles would be unguarded for some time. Information gathered from the border crossing confirmed that few cars entering Mexico were stopped, and vehicle identification documents were rarely requested.

In-person interviews with auto thieves used a 93-item questionnaire, asking about target selection, techniques, and other routines. Thieves preferred older cars because they could be easily stolen: steering column locks wear out and can be broken with simple tools. They watched people entering stores, judged that their vehicle would be unguarded for a time, then drove the few minutes into Mexico. Cars were rarely stolen from parking garages because thieves would have to produce a ticket in order to exit.

With this and other information, Plouffe and Sampson discussed strategies with Chula Vista police and security staff at parking lots and shopping malls. More diligent screening at the border was rejected, largely because most vehicles had been driven into Mexico before the theft was even discovered. They recommended that high-risk shopping malls install gates at entrance and exit points for parking lots. Drivers would take a ticket upon entering and would have to produce it when leaving. This, it was argued, would substantially increase the effort required to steal vehicles from parking lots near the border.

Other Applications of Problem Analysis

Partly because it has proved helpful in law enforcement applications, problem analysis is being adopted by other criminal justice agencies. Veronica Coleman and associates describe how local and federal prosecutors in several U.S. cities have formed planning teams to identify crime problems and develop appropriate interventions. Teams include U.S. attorneys, researchers, and other criminal justice professionals who pursue a form of problem analysis labeled Strategic Approaches to Community Safety Initiatives (SACSI). SACSI involves five steps, four of which should look familiar (Coleman et al., 1999, 18):

1. Form an interagency working group.
2. Gather information and data about a local crime problem.
3. Design a strategic intervention to tackle the problem.
4. Implement the intervention.
5. Assess and modify the strategy as the data reveal effects.

We have only scratched the surface of problem analysis applications in criminal justice. This is an area of applied research that is growing daily. Other examples draw on methods of systems analysis, operations research, and economics for such purposes as cost-benefit studies, police patrol allocation, and decisions about hiring probation officers. Cost-benefit analysis, in particular, is used to assess the relative value and expense of alternative policies. Although the mathematical tools that form the basis of problem analysis can be sophisticated, the underlying logic is relatively simple. For example, police departments traditionally used pin maps to represent the spatial and temporal concentrations of reported crime.

Space- and Time-Based Analysis

Pin maps are examples of “low-tech” problem analysis that are nonetheless conceptually identical to computer models of hot spots used in many departments to plan police deployment. Growing numbers of justice agencies, especially police and sheriff’s departments, have taken advantage of rapid advances in computing and telecommunications. Computerized mapping systems now permit police to monitor changes in crime patterns on a daily or hourly basis and to develop responses accordingly. Furthermore, simultaneous advances in computing power and declines in the cost of that power make it possible for even small agencies to use mapping tools (Harries 1999). The ongoing technological advances in mapping have fueled the application of statistical models to geographic clusters of crime problems. Thomas Rich (1999) describes this as analytic mapping, whereby statistical tools supplement the “eyeballing” approach to locating concentrations of crime.

Crime maps usually represent at least four different things: (1) one or more crime types, (2) a space or area, (3) some time period, and (4) some dimension of land use, usually streets. The most useful crime maps will show patterns that can help analysts and police decide what sort of action to take. That’s part of applied research. An example will illustrate some basic features of crime maps.

Figure 10.4 shows four crime maps prepared by Shuryo Fujita, a graduate student at the Rutgers University School of Criminal Justice, for a midsized city in the northeast United States. All four maps show completed auto theft, but for different areas and time periods. The map in panel A shows auto thefts for the year 2005 in one of four police precincts in the city. About 1,750 completed thefts are represented, about 33 percent of all thefts in the city. You will probably notice two things about panel A. First, car theft seems to be everywhere in this area, except for blank spots in the center and on the right side of the map (a large park and a river, respectively). Second, because car theft seems to be everywhere, the map is not especially useful. Much of the district appears to be a hot spot. Panel B changes the time reference, showing the 30 car thefts that occurred in the first week of August 2005. You might think this is somewhat more useful, showing more theft in the southern part of the district. But while panel A shows too much, panel B shows smaller numbers that don’t seem to cluster very much.

Panel C shifts the geographic focus to one sector within the district, to the left of the park. This sector happens to have the highest volume of car theft, 464 completed thefts in 2005; it’s the hottest sector in the hottest precinct in the city. Again, car theft seems to be all over the sector. A closer look shows more dots on the longer north-south streets than on cross streets. This is clearer in panel D, which shows a crime density map of the sector. Crime density is a numerical value showing how close some dots are to each other, and how distant those clusters are from outlying dots. These values are mapped, showing patterns much more clearly than simple dots. The darker areas of panel D represent more dense concentrations of car theft. There seem to be two corridors of car theft, running north-south below the diagonal street that bisects the map. These corridors are sort of connected in the middle, showing a rough H-shape. This shape happens to correspond with some major thoroughfares in the area. You might be able to imagine cruising up, across, and down, looking for cars to steal. That’s useful information a crime analyst can provide for police managers. During the summer months of 2006, police in this city deployed special patrols on the streets within the H-shaped area depicted in panel D.
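The text does not say how the density values in panel D were computed. One common way to turn dots into a density surface is kernel density estimation, and the sketch below uses a Gaussian kernel as an assumed stand-in for whatever method the mapping software applied. The function name `kernel_density`, the bandwidth value, and the coordinates are all invented for illustration.

```python
import math

def kernel_density(points, x, y, bandwidth=1.0):
    """Gaussian kernel density estimate at (x, y) from a list of (px, py) points.

    Each incident contributes more the closer it lies to (x, y); the
    bandwidth (in map units) controls how quickly that contribution fades.
    """
    if not points:
        return 0.0
    norm = 1.0 / (2.0 * math.pi * bandwidth ** 2 * len(points))
    total = 0.0
    for px, py in points:
        dist_sq = (x - px) ** 2 + (y - py) ** 2
        total += math.exp(-dist_sq / (2.0 * bandwidth ** 2))
    return norm * total

# Made-up coordinates for illustration: a tight cluster plus one outlier.
incidents = [(1.0, 1.0), (1.1, 0.9), (0.9, 1.2), (1.2, 1.1), (5.0, 5.0)]

density_in_cluster = kernel_density(incidents, 1.0, 1.0, bandwidth=0.5)
density_at_outlier = kernel_density(incidents, 5.0, 5.0, bandwidth=0.5)
print(density_in_cluster > density_at_outlier)
```

Evaluating the function over a grid of map cells and shading each cell by its value would produce a surface like the one described for panel D, with darker shading where incidents cluster. The choice of bandwidth matters: a large value smooths away corridors like the H-shape, while a very small value simply reproduces the dots.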

Tools for mapping crime and other problems are similar to the tools of statistical analysis, a topic we consider in the final chapter. Maps and statistics are most useful when we seek to understand patterns in a large number of observations. Very small police departments that report very few incidents need neither statistical nor geographic analysis to understand crime problems. But departments serving cities like the one in Figure 10.4 can really benefit from space-based analytic tools like crime mapping and density analysis.

Computerized crime mapping has been used for many years in a small number of departments and is spreading to many large and midsized cities. Software is more powerful, and web-based mapping programs have been used to make crime maps generally available. More published guides are appearing that describe how to combine maps with other analysis programs and sources of data. Jerry Ratcliffe (2004) describes how to classify crime concentrations across space and time dimensions to produce a hot spot matrix.

Figure 10.4 Mapping Auto Theft
Panels: A. District 2, 2005; B. District 2, August 1–7, 2005; C. Sector 212, 2005; D. Sector 212, 2005 Density
Source: Maps prepared by Shuryo Fujita.

Crime mapping and other types of problem analysis illustrate another advantage of incident-based data: the potential for use in the kind of problem analysis we have described. Most crime mapping and similar tools are developed and used by individual departments, reflecting the fact that crime analysis is based on locally generated data. With incident-based reporting, crime analysis can be conducted on larger units. For example, Donald Faggiani and Colleen McLaughlin (1999) describe how National Incident Based Reporting System (NIBRS) data can show state or regional patterns in drug arrests and offenses. Using NIBRS data for Virginia, the authors demonstrate differences in types of arrests and drugs for different areas of the state.

Scientific Realism and Applied Research

Traditional research and evaluation are based on the model of cause and effect we considered in Chapters 3 and 5. An independent variable (cause) produces some change in a dependent variable (effect). Experimental and quasi-experimental designs seek to isolate this causal process from the possible effects of intervening variables. Designs thus try to control for the possible effects of intervening variables.

Problem analysis as we have described it represents a bridge between traditional research approaches and applied research that is the foundation of scientific realism. Ray Pawson and Nick Tilley (1997) propose that, instead of trying to explain cause in a traditional sense, evaluators should search for mechanisms acting in context to explain outcomes. As we have seen, experiments do this by producing pretest statistical equivalence between groups of subjects who receive an intervention and groups of subjects who do not. Quasi-experiments using nonequivalent groups might seek to control intervening variables by holding possible intervening variables constant. So, for example, if we believe that employment status might be an intervening variable in the relationship between arrest and subsequent domestic violence, we will try to structure an evaluation to hold employment status constant between treatment and control groups.

Scientific realism treats employment status as the context in which an arrest mechanism operates on the outcome of repeat domestic violence. Rather than try to control for employment status, a scientific realist will study the mechanism in context and conclude, for example, that arrest is effective in reducing subsequent violence in situations in which an offender is employed but is not effective when the offender is unemployed. This finding will be no different from what Sherman and associates (1992) conclude in their assessment of a series of randomized experiments.
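One way to see what “mechanism in context” means in practice is to compare outcomes separately within each context rather than averaging across contexts. The sketch below does this with counts that are entirely invented for illustration; they are not results from Sherman and associates or any other study, and the function names are our own.

```python
def repeat_rate(repeat_cases, total_cases):
    """Proportion of cases with a repeat incident."""
    return repeat_cases / total_cases

# Invented counts, for illustration only: (repeat incidents, total cases).
cases = {
    ("employed", "arrest"): (10, 100),
    ("employed", "no arrest"): (20, 100),
    ("unemployed", "arrest"): (20, 100),
    ("unemployed", "no arrest"): (20, 100),
}

def effect_by_context(cases):
    """Difference in repeat rates (arrest minus no arrest) within each context."""
    effects = {}
    for context in sorted({ctx for ctx, _ in cases}):
        arrested = repeat_rate(*cases[(context, "arrest")])
        not_arrested = repeat_rate(*cases[(context, "no arrest")])
        effects[context] = arrested - not_arrested
    return effects

effects = effect_by_context(cases)
for context, diff in effects.items():
    print(f"{context}: {diff:+.2f}")
```

In these made-up numbers, arrest lowers the repeat rate among employed offenders and makes no difference among unemployed offenders. Pooling the two contexts would report a single, smaller effect and hide the pattern a realist evaluation is designed to find.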

What is different is that the scientific realist approach is rooted in the principle that similar interventions can naturally be expected to have different outcomes in different contexts. Most notably, this approach is more compatible with the realities of evaluation than is the experimental approach. Pawson and Tilley (1997, 81) put it this way: “Ultimately, realist evaluation would be mechanism- and context-driven rather than program-led” (emphasis in original). This means that interventions should be designed not so much as comprehensive programs that apply equally in all situations. Instead, interventions should be developed for specific contexts, and evaluations of those interventions must consider context as a key factor in whether the intervention achieves the desired outcome.

Situational crime prevention (Clarke 1997b) is an example of the scientific realist approach that bridges problem analysis and evaluation because it focuses on what mechanisms operate for highly specific types of crime in specific situations. So, for example, rather than develop and evaluate large-scale programs intended to reduce auto theft generally, situational crime prevention seeks specific interventions that will be effective in reducing particular types of auto theft. Ronald Clarke and Patricia Harris (1992) distinguish several types of auto theft by their purposes: joyriding, temporary transportation, resale or stripping, or insurance fraud. Theft of certain models for joyriding may be reduced by modest increases in security, while theft of expensive cars for resale or export requires different approaches. Many types of auto theft can be reduced by placing attendants at the exits of parking garages, but car break-ins may not be affected by that intervention.

As we mentioned in Chapter 5, the realist approach resembles a case-study approach. Both are variable-oriented strategies for research: they depend on measures of many variables to understand and assess a small number of cases. Detailed data and information are gathered about specific interventions, often in very small areas. Whereas an experimental evaluation uses probability theory to control for intervening variables, the case-study approach depends on detailed knowledge to understand the context in which mechanisms operate.

In his discussion of applied research tools for problem solving, Eck (2002, 2003b) makes the case even more strongly. Public officials, he argues, are more interested in solving local problems than in identifying robust cause-and-effect relationships. Both problem solving and evaluation are concerned with answering the question “Did the problem decline?” But eliminating alternative explanations for a decline, which is the central concern of internal validity and the rationale for stronger evaluation designs, is important only if officials wish to use the same intervention elsewhere.

In what Eck terms “small-claim, small-area problem solving,” analysts develop appropriate interventions for problems in context. This is the essence of the problem-solving process. Like Eck, we emphasize process: systematically studying a problem, developing appropriate interventions, and seeing if those interventions have the intended effect. This is quite different from what Eck terms “large-claim interventions,” such as Drug Abuse Resistance Education (D.A.R.E.) or corrections boot camps, that are developed to apply in a wide variety of settings. Because small-claim, small-scale interventions are tailored to highly specific settings, they cannot easily be transferred intact to different settings. However, the process of diagnosing local problems, selecting appropriate interventions, and then assessing the effects of those interventions can be generally applied. Anthony Braga (2002) offers more examples of this reasoning. Gloria Laycock presents an even stronger case for scientific realism in applied criminal justice research generally (2002) and in making specific plans for crime prevention (Tilley and Laycock 2002).

Randomized or quasi-experimental evaluations should be conducted when such designs are appropriate. But it is important to recognize the formidable requirements for deploying these designs. The scientific realist approach to evaluation is flexible and may be appropriate in many situations. A scientific realist evaluation or case study can be especially useful in smaller-scale evaluations in which interest centers on solving some particular problem in a specific context more than on finding generalizable scientific truths. In any case, a variety of approaches can satisfy the definition of program evaluation we discussed earlier in this chapter, by systematically applying social science research procedures to an individual program or agency.

Our general advice in this regard is simple: do the best you can. This requires two things: (1) understanding the strengths and limits of social science research procedures, and (2) carefully diagnosing what is needed and what is possible in a particular application. Only by understanding possible methods and program constraints can we properly judge whether any kind of evaluation study is worth undertaking with an experimental, quasi-experimental, or nonexperimental design, or whether an evaluation should not be undertaken at all.


The Political Context of Applied Research

Public policy involves making choices, and that involves politics.

Applied researchers bridge the gap between the body of research knowledge about crime and the practical needs of criminal justice professionals, a process that has potential political, ideological, and ethical problems. In the final section of this chapter, we turn our attention to the context of applied research, describing some of the special problems that can emerge in such studies.

Some similarities are evident between this material and our discussion of ethics in Chapter 2. Although ethics and politics are often closely intertwined, the ethics of criminal justice research focuses on the methods used, whereas political issues are more concerned with the substance and use of research findings. Ethical and political aspects of applied research also differ in that there are no formal codes of accepted political conduct comparable to the codes of ethical conduct we examined earlier. Although some ethical norms have political aspects (for example, not harming subjects relates to protection of civil liberties), no one has developed a set of political norms that can be agreed on by all criminal justice researchers.

Evaluation and Stakeholders

Most applied studies involve multiple stakeholders, people who have a direct or indirect interest in the program or evaluation results (Rossi, Freeman, and Lipsey 1999, 204–205). Some stakeholders may be enthusiastic supporters of an experimental program, others may oppose it, and still others may be neutral. Different stakeholder interests in programs can produce conflicting perspectives on evaluations of those programs.

Emil Posavec and Raymond Carey (2002) describe such problems as dysfunctional attitudes toward program evaluation. Program supporters may have unrealistic expectations that evaluation results will document dramatic success. Conversely, they may worry that negative results will lead to program termination. Agency staff may feel that day-to-day experience in delivering a program imparts a qualitative understanding of its success that cannot be documented by a controlled experiment. Staff and other stakeholders may object that an evaluation consumes scarce resources better spent on actually delivering a program.

We have two bits of advice in dealing with such problems. First, identify program stakeholders, their perspectives on the program, and their likely perspectives on the evaluation. In addition to agency decision makers and staff, stakeholders include program beneficiaries and competitors. For example, store owners in a downtown shopping district might benefit from an experimental program to deploy additional police on foot patrol, whereas people who live in a nearby residential area might argue that additional police should be assigned to their neighborhood.

Second, educate stakeholders about why an evaluation should be conducted. This is best done by explaining that applied research is conducted to determine what works and what does not. The National Institute of Justice (NIJ) and the Office of Community Oriented Policing Services (COPS) have issued brief documents that describe how evaluation can benefit criminal justice agencies by rationalizing their actions (Maxfield 2001; Eck 2003a). Such publications, together with examples of completed evaluations, can be valuable tools for winning the support of stakeholders.

Also keep in mind that applied research is very much a cooperative venture. Accordingly, researchers and program staff are mutual stakeholders in designing and executing evaluations. Evaluators’ interest in a strong design that will meet scientific standards must be balanced against the main concern of program sponsors: obtaining information that is useful for developing public policy.

The flip side of being cautious about getting caught in stakeholder conflict is the benefit of applied research in influencing public policy. Evaluation studies can provide support for continuing or expanding successful criminal justice programs, or evidence that ineffective programs should be modified or terminated. And problem analysis results can sometimes be used to influence actions by public officials. For an example, see the box titled “When Politics Accommodates Facts,” in which Tony Fabelo describes how problem analysis dissuaded Texas legislators from costly, ineffective lawmaking.

WHEN POLITICS ACCOMMODATES FACTS

Tony Fabelo

The 1994 federal anticrime bill, and related politics emanating from this initiative, put pressure on the states to adopt certain sentencing policies as a condition for receiving federal funds. Among these policies is the adoption of a “three strikes and you’re out” provision establishing a no-parole sentence for repeat violent offenders. Facts have prevented a criminal justice operational gridlock in Texas by delineating to policy makers the operational and fiscal impact of broadly drafted policies in this area. Facts established through policy analysis by the Criminal Justice Policy Council (CJPC) have clearly stated that a broad application of the “three strikes and you’re out” policy will have a tremendous fiscal impact.

Therefore, state policy makers have carefully drafted policies in this area. For example, during the last legislative session, the adoption of life with no parole for repeat sex offenders was considered. State policy makers, after considering facts presented by the CJPC, adopted a policy that narrowly defined the group of offenders for whom the law is to apply. They also adopted a 35-year minimum sentence that must be served before parole eligibility, rather than a life sentence with no parole. The careful drafting of this policy limited its fiscal impact while still accomplishing the goal of severely punishing the selected group of sex offenders.

Unlike Texas, politics did not accommodate facts in California, where lawmakers adopted a fiscally unsustainable “three strikes and you’re out” policy.

For my part, I need to maintain personal integrity and the integrity of the CJPC in defining the facts for policy makers. I have to be judged not only by “objectivity,” which is an elusive concept, but by my judgment in synthesizing complex information for policy makers. To do this, I follow and ask my staff to follow these rules:

1. Consider as many perspectives as possible in synthesizing the meaning of information, including the perspectives of those stakeholders who will be affected.

2. State the limits of the facts and identify areas where drawing conclusions is clearly not possible.

3. Consult with your peers to verify methodological assumptions and meet accepted criteria to pass the scrutiny of the scientific community.

4. Provide potential alternative assumptions behind the facts.

5. Set clear expectations for reviewing reports and releasing information so that facts are not perceived as giving advantage to any particular interest group.

6. Judge the bottom-line meaning of the information for policy action based on a frame of reference broader than that of any particular party or constituency.

7. Finally, if the above are followed, never succumb to political pressure to change your judgment. Integrity cannot be compromised even once. In the modern crowded marketplace of information, your audience will judge you first for your motives and then for your technical expertise.

Source: Adapted from Fabelo (1997, 2, 4).

Politics and Objectivity

Politics and ideology can color research in ways even more subtle than those described by Fabelo. You may consider yourself an open-minded and unbiased person who aspires to be an objective criminal justice researcher. However, you may have strong views about different sentencing policies, believing that probation and restitution are to be preferred over long prison sentences. Because there is no conclusive evidence to favor one approach over the other, your beliefs would be perfectly reasonable.

Now, assume that one of the requirements for the course you are taking is to write a proposal for an evaluation project on corrections policy. In all likelihood, you will prepare a proposal to study a probation program rather than, say, a program on the use of portable jails to provide increased detention capacity. That is natural, and certainly legitimate, but your own policy preferences will affect the topic you choose.

Ronald Clarke (1997b, 28) describes political objections to applied studies of situational crime prevention: “Conservative politicians regard it as an irrelevant response to the breakdown in morality that has fueled the postwar rise in crime. Those on the left criticize it for neglecting issues of social justice and for being too accepting of the definitions of crime of those in power.” By the same token, ELMO is distrusted for being simultaneously too lenient by allowing offenders to do time at home and too close to a technological nightmare by enabling the government to spy on individuals. Evaluations of situational crime prevention or ELMO programs may be criticized for tacitly supporting either soft-on-crime or heavy-handed police state ideologies (Lilly 2006; Nellis 2006).

It is difficult to claim that criminal justice research, either applied or basic, is value-free. Our own beliefs and preferences affect the topics we choose to investigate. Political preferences and ideology may also influence criminal justice research agendas by making funds available for some projects but not others. For example, in 2004, the National Institute of Justice awarded money for projects to study these topics: “Chinese Connection: Changing Patterns of Drug Trafficking in the Golden Triangle” and “Assessment of Risk Factors Associated with Sexual Violence in the Texas Prison System.” No funds were awarded, however, for such projects as “The Scope of Institutionalized Racism in the War on Drugs” or “Exploratory Research on Torture in Federal Detention Camps.” It is, of course, possible for researchers, consciously or unconsciously, to become instruments for achieving political or policy objectives in applied research.

It may sometimes seem difficult to maintain an acceptable level of objectivity about or distance from evaluation results in criminal justice research. This task can be further complicated if you have strong views one way or another about a particular program or policy. Researchers who evaluate, say, an experimental program to prevent offenders from repeating probably sincerely hope that the program will work. However, substantially less consensus exists about other criminal justice problems and policies. For example, how do you feel about a project to test the effects of restrictive handgun laws or mandatory jail sentences for abortion protesters? We conclude this chapter with one final example that we expect will make you think about some of the political issues involved in applied research.

In 1990, the elected prosecutor of Marion County, Indiana (in which Indianapolis is located) was sharply criticized in a series of newspaper stories that claimed to present evidence of racial disparity in drug sentences handed down in the county. Convicted minority offenders, it was asserted, received longer prison terms than white offenders. The prosecutor immediately responded, criticizing the data collected and methods used by the investigative reporter. He also contacted Maxfield and asked him to conduct an independent analysis of drug cases accepted for prosecution.

In the first place, the prosecutor claimed, he had had previous feuds with the author of the newspaper stories. Second, he categorically denied any discriminatory policies in making sentence requests in drug cases. Third, he said he knew that the data and methods reported in the newspaper stories were deficient even though the reporter would not reveal details about his sources and information. Finally, if any pattern of racial disparity existed, it was certainly inadvertent, and the prosecutor wanted to know about it so that the problem could be fixed. Maxfield accepted the project and was paid to produce a report.

How do you feel about this example? Did Maxfield sell out? How would you feel if Maxfield turned up clear evidence of disparity in sentences? Or no evidence of disparity? What about political party affiliation: would it make a difference if the prosecutor and Maxfield identified with the same party? With different parties?

✪ Main Points

• Evaluation research and problem analysis are examples of applied research in criminal justice.

• Different types of evaluation activities correspond to different stages in the policy process: policy planning, process evaluation, and impact evaluation.

• An evaluability assessment may be undertaken as a scouting operation or a preevaluation to determine whether it is possible to evaluate a particular program.

• A careful formulation of the problem, including relevant measurements and criteria of success or failure, is essential in evaluation research.

• Organizations may not have clear statements or ideas about program goals. In such cases, researchers must work with agency staff to formulate mutually acceptable statements of goals before proceeding.

• Evaluation research may use experimental, quasi-experimental, or nonexperimental designs. As in studies with other research purposes, designs that offer the greatest control over experimental conditions are usually preferred.

• The use of randomized field experiments requires careful attention to random assignment, case flow, and treatment integrity.

• Randomized designs cannot be used for evaluations that begin after a new program has been implemented or for full-coverage programs in which it is not possible to withhold an experimental treatment from a control group.

• Process evaluations can be undertaken independently or in connection with an impact assessment. Process evaluations are all but essential for interpreting results from an impact assessment.

• Problem analysis is more of a planning technique. However, problem analysis draws on the same social science research methods used in program evaluation. Many variations on problem analysis are used in applied criminal justice research.

• The scientific realist approach to applied research focuses on mechanisms in context, rather than generalizable causal processes.

• Criminal justice agencies are increasingly using problem analysis tools for tactical and strategic planning. Crime mapping and other space-based procedures are especially useful applied techniques.

• Problem solving, evaluation, and scientific realism have many common elements.

• Evaluation research entails special logistical, ethical, and political problems because it is embedded in the day-to-day events of public policy and real life.

✪ Key Terms

evaluation research, p. 255

evidence-based policy, p. 255

impact assessment, p. 259

problem analysis, p. 255

problem solving, p. 274

process evaluation, p. 259

stakeholders, p. 263

✪ Review Questions and Exercises

1. In presentations to justice practitioners, Maxfield describes evaluation as answering two questions: “Did you get what you expected?” and “Compared to what?” Discuss how particular sections of this chapter relate to those two questions.

2. When programs do not achieve their expected results, it’s due to one of two things: the program was not a good idea to begin with, or it was a good idea but was not implemented properly. Discuss why it is necessary to conduct both a process and an impact evaluation to learn why a program failed.

3. What are the principal advantages and disadvantages of randomized designs for field experiments? Are such designs used in problem analysis? Explain your answer.

✪ Additional Readings

Clarke, Ronald V., and John Eck, Crime Analysis for Problem Solvers in 60 Small Steps (Washington, DC: U.S. Department of Justice, Office of Community Oriented Policing, 2005; www.popcenter.org/learning/60steps/; accessed May 16, 2008). This guide assumes some knowledge of crime mapping and some experience in doing crime analysis. But it is still a source of countless (well, maybe just 60) tips about doing applied research in crime prevention.

Eck, John E., Assessing Responses to Problems: An Introductory Guide for Police Problem-Solvers (Washington, DC: U.S. Department of Justice, Office of Community Oriented Policing Services, 2003; www.popcenter.org/tools/assessing_responses/; accessed May 16, 2008). This highly recommended guide was written to accompany a series of problem-solving guides that aid police in addressing a wide range of problems. Eck summarizes key elements of user-oriented evaluation.

Pawson, Ray, and Nick Tilley, Realistic Evaluation (Thousand Oaks, CA: Sage, 1997). The authors describe scientific realism and apply it to the evaluation of crime prevention and other criminal justice policy. This book also presents an interesting critique on the inappropriate use of experimental and quasi-experimental designs in criminal justice evaluation.

Rossi, Peter H., Howard E. Freeman, and Mark W. Lipsey, Evaluation: A Systematic Approach, 6th ed. (Thousand Oaks, CA: Sage, 1999). Of the many available “handbooks” on evaluation methods, this is the most widely read. Although the book is uneven in its coverage of recent developments, the authors provide a good general foundation in evaluation methods.

Tilley, Nick (ed.), Analysis for Crime Prevention: Crime Prevention Studies, vol. 13 (Monsey, NY: Criminal Justice Press, 2002); Evaluation for Crime Prevention: Crime Prevention Studies, vol. 14 (Monsey, NY: Criminal Justice Press, 2002). These companion volumes present innovative thinking about how problem analysis and program evaluation can be used by public officials in preventing crime. Some of the articles will be controversial. All are interesting and mostly fun to read.


Chapter 11

Interpreting Data

We’ll examine a few simple statistics frequently used in criminal justice research. We’ll also cover the fundamental logic of multivariate analysis. You’ll come away from this chapter able to perform simple, though powerful, analyses to describe data and reach research conclusions.

Introduction

Univariate Description

Distributions

Measures of Central Tendency

Measures of Dispersion

Comparing Measures of Dispersion and Central Tendency

Computing Rates

Describing Two or More Variables

Bivariate Analysis

MURDER ON THE JOB

Multivariate Analysis

Inferential Statistics

Univariate Inferences

Tests of Statistical Significance

Visualizing Statistical Significance

Chi Square

Cautions in Interpreting Statistical Significance


Introduction

Empirical research usually uses some type of statistical analysis.

Many people are intimidated by empirical research because they feel uncomfortable with mathematics and statistics. And, indeed, many research reports are filled with otherwise unspecified computations. The role of statistics in criminal justice research is very important, but it is equally important for that role to be seen in its proper perspective.

Empirical research is, first and foremost, a logical rather than a mathematical operation. Mathematics is not much more than a convenient and efficient language for accomplishing the logical operations inherent in good data analysis. Statistics is the applied branch of mathematics especially appropriate to a variety of research analyses.

We’ll be looking at two types of statistics: descriptive and inferential. Descriptive statistics are used to summarize and otherwise describe data in manageable forms. Inferential statistics help researchers form conclusions from their observations; typically that involves forming conclusions about a population from the study of a sample drawn from it.

Before considering any numbers, we want to assure you that the level of statistics used in this chapter has been proven safe for humans. The underlying logic and fundamental techniques of statistics are not at all complicated. It’s mostly counting and comparing.

We assume that you are taking a course in research methods for criminology and criminal justice because you are interested in the subjects of crime and criminal justice policy. We suggest that you approach this chapter by thinking about statistics as tools for describing and explaining crime and criminal justice policy. Learning how to use these tools will help you better understand this fascinating subject. And learning how to summarize and interpret data about a subject you find inherently interesting is the least painful and most rewarding way to become acquainted with statistics.

Univariate Description

The simplest statistics describe some type of average and dispersion for a single variable.

Descriptive statistics represent a method for presenting quantitative descriptions in a manageable form. Sometimes we want to describe single variables; this procedure is known as univariate analysis. Other times we want to describe the associations that connect one variable with another. Bivariate analysis refers to descriptions of two variables, and multivariate analysis examines relationships among three or more variables.

Univariate analysis examines the distribution of cases on only one variable at a time. We’ll begin with the logic and formats for the analysis of univariate data.

Distributions

The most basic way to present univariate data is to report all individual cases; that is, to list the attribute for each case under study in terms of the variable in question. Suppose we are interested in the ages of criminal court judges; our data might come from a directory of judges prepared by a state bar association. The most direct manner of reporting the ages of judges is to simply list them: 63, 57, 49, 62, 80, 72, 55, and so forth. Such a report will provide readers with complete details of the data, but it is too cumbersome for most purposes. We could arrange our data in a somewhat more manageable form without losing any of the detail by reporting that 5 judges are 38 years old, 7 are 39, 18 are 40, and so forth. Such a format avoids duplicating data on this variable.
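The step from a raw list to a count per age can be sketched in a few lines of Python. This is only an illustration: the first seven ages are the ones quoted above, the remaining values are invented, and the helper name `expand` is hypothetical.

```python
from collections import Counter

# Ages of a few judges. The first seven values are those listed in the text;
# the last four are made up so that some ages repeat.
ages = [63, 57, 49, 62, 80, 72, 55, 57, 63, 49, 57]

# Ungrouped frequency distribution: how many judges share each age.
frequency = Counter(ages)

for age in sorted(frequency):
    print(f"{frequency[age]} judge(s) aged {age}")


def expand(freq):
    """Rebuild the sorted list of ages from the frequency distribution."""
    return sorted(age for age, count in freq.items() for _ in range(count))
```

Because `expand(frequency)` returns exactly the sorted original list, the sketch shows why this format loses no detail: every individual age can still be recovered from the counts.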

For an even more manageable format, with a certain loss of detail, we might report judges’ ages as marginals, which are frequency distributions of grouped data: 24 judges under 45 years of age, 51 between 45 and 50 years of age, and so forth. Our readers will have less data to examine and interpret, but they will not be able to reproduce fully the original ages of all the judges. Thus, for example, readers will have no way of knowing how many judges are 41 years old.

The preceding example presented marginals in the form of raw numbers. An alternative form is the use of percentages. We might report that x percent of the judges are younger than 45, y percent are between 45 and 50, and so forth. Table 11.1 shows an example.

In computing percentages, it is necessary to determine the base from which to compute: the number that represents 100 percent. In the most straightforward situation, the base is the total number of cases under study. A problem arises, however, whenever some cases have missing data. Let’s consider a survey in which respondents are asked to report their ages. If some respondents fail to answer that question, we have two alternatives. First, we might still base our percentages on the total number of respondents, reporting those who fail to give their ages as a percentage of the total. Second, we might use the number of persons who give an answer as the base from which to compute the percentages; this approach is illustrated in Table 11.1. We will still report the number who do not answer, but they will not figure in the percentages.

The choice of a base depends entirely on the purposes of the analysis. If we wish to compare the age distribution of a survey sample with comparable data on the population from which the sample was drawn, we will probably want to omit the “no answers” from the computation. Our best estimate of the age distribution of all respondents is the distribution for those who answered the question. Because “no answer” is not a meaningful age category, its presence among the base categories will only confuse the comparison of sample and population figures.

Table 11.1 Ages of Criminal Court Judges (Hypothetical)

Age             Percent
Under 35        9%
36–45           21
46–55           45
56–65           19
65 and older    6
Total           100% = 433
No data         18

Measures of Central Tendency

Beyond simply reporting marginals, researchers often present data in the form of summary averages, or measures of central tendency. Options in this regard include the mode (the most frequent attribute, either grouped or ungrouped), the arithmetic mean (the sum of values for all observations, divided by the number of observations), and the median (the middle attribute in the ranked distribution of observed attributes). Here’s how the three averages are calculated from a set of data.

Suppose we are conducting an experiment that involves teenagers as subjects. They range in age from 13 to 19, as indicated in this frequency distribution:

Age    Number
13     3
14     4
15     6
16     8
17     4
18     3
19     3

Now that we know the ages of the 31 subjects, how old are these subjects in general, or on average? Let’s look at three different ways we might answer that question.

The easiest average to calculate is the mode, the most frequent value. The distribution of our 31 subjects shows there are more 16-year-olds (eight of them) than any other age, so the modal age is 16, as indicated in Figure 11.1.
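Finding the mode amounts to a single lookup, as the following Python sketch suggests. The counts are those of the frequency distribution in the text; the variable names are illustrative only.

```python
# Frequency distribution from the text: age -> number of subjects.
age_counts = {13: 3, 14: 4, 15: 6, 16: 8, 17: 4, 18: 3, 19: 3}

# Total number of subjects in the experiment.
n_subjects = sum(age_counts.values())

# Mode: the age with the largest count. If two ages tied for the largest
# count, max() would return whichever of them appears first.
modal_age = max(age_counts, key=age_counts.get)

print(f"{n_subjects} subjects; modal age is {modal_age} "
      f"({age_counts[modal_age]} subjects)")
```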

Figure 11.1 Three “Averages”

[Figure: the age distribution of the 31 subjects, shown once for each average.]

Mode = 16 (most frequent): more subjects are 16 than any other age.

Mean = 15.87 (arithmetic average): 13 × 3 = 39, 14 × 4 = 56, 15 × 6 = 90, 16 × 8 = 128, 17 × 4 = 68, 18 × 3 = 54, and 19 × 3 = 57, for a total of 492; 492 ÷ 31 cases = 15.87.

Median = 16.31 (midpoint): arranged by age, subjects 1–3 are 13, subjects 4–7 are 14, subjects 8–13 are 15, subjects 14–21 are 16, subjects 22–25 are 17, subjects 26–28 are 18, and subjects 29–31 are 19. In the enlarged view of the 16-year-olds, subjects 14 through 21 are placed at 16.06, 16.19, 16.31, 16.44, 16.56, 16.69, 16.81, and 16.94; subject 16, the middle case, falls at 16.31.

average value of all observations in a group, then the standard deviation represents the average amount each individual observation varies from the mean. Table 11.2 presents some hypothetical data on the ages of persons in juvenile and adult court that will help illustrate the concepts of deviation and average deviation. Let’s first consider the top of Table 11.2.

The first column shows the age for each of 10 juvenile court defendants. The mean age for these 10 juveniles is 14. The second column shows how much each individual’s age deviates from the mean. Thus the first juvenile is two years younger than the mean, the second is one year older, and the third is the same age as the mean.

You might first think that the average deviation is calculated in the same way as the mean: add up all individual deviations for each case and divide by the number of cases. We did that in Table 11.2, but notice that the total deviation is zero; therefore the average deviation is zero. In fact, the sum of deviations from the mean will always be zero. This is because some individual deviations will be negative and some will be positive, and the positive and negative values will always cancel each other out.

Largely for this reason, the standard deviation measure of dispersion is based on the squared deviations from the mean. Squaring any nonzero number always produces a positive value, so when we add all the squared deviations together, we will not get zero for the total unless every case equals the mean. Summing these squared deviations in the top of Table 11.2 produces a total of 20, and dividing by the number of observations (10) produces an average squared deviation of 2. This quantity, the sum of squared deviations from the mean divided by the number of cases, is called the variance. Taking the square root of the variance produces the standard deviation, which is 1.41 for juveniles in Table 11.2.
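The arithmetic in the top of Table 11.2 can be traced step by step. The following is a minimal Python sketch, not part of the original text; the variable names are our own. Like the text, it divides by the number of cases, whereas many statistical packages divide by the number of cases minus one when working with samples.

```python
import math

# Ages of the 10 hypothetical juvenile court defendants in Table 11.2.
juvenile_ages = [12, 15, 14, 13, 15, 14, 16, 16, 12, 13]

mean_age = sum(juvenile_ages) / len(juvenile_ages)

# Deviations from the mean always sum to zero.
deviations = [age - mean_age for age in juvenile_ages]
total_deviation = sum(deviations)

# Squared deviations do not cancel; their average is the variance.
squared_deviations = [d ** 2 for d in deviations]
variance = sum(squared_deviations) / len(juvenile_ages)  # divide by N, as in the text

# The standard deviation is the square root of the variance.
std_dev = math.sqrt(variance)

print(mean_age, total_deviation, sum(squared_deviations), variance, round(std_dev, 2))
```

Python's standard `statistics.pstdev` function uses the same divide-by-N definition and should give the same standard deviation for these ages.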

How should we interpret a standard deviation of 1.41, or any other such value, for that matter? By itself, any particular value for the standard deviation has no intuitive meaning. This measure of dispersion is most useful in a

Figure 11.1 also demonstrates the calculation of the mean. There are three steps: (1) multiply each age by the number of subjects who are that age, (2) total the results of all those multiplications, and (3) divide that total by the number of subjects. As indicated in Figure 11.1, the mean age in this illustration is 15.87.
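The mode and the three-step mean can be checked with a few lines of code. This is an illustrative Python sketch rather than anything from the original text; the dictionary `age_counts` simply restates the distribution shown in Figure 11.1, and the other names are our own.

```python
# Age distribution from Figure 11.1: age -> number of subjects.
age_counts = {13: 3, 14: 4, 15: 6, 16: 8, 17: 4, 18: 3, 19: 3}

# Mode: the age with the largest number of subjects.
mode_age = max(age_counts, key=age_counts.get)

# Mean: (1) multiply each age by its count, (2) total the products,
# (3) divide that total by the number of subjects.
total_years = sum(age * n for age, n in age_counts.items())
n_subjects = sum(age_counts.values())
mean_age = total_years / n_subjects

print(mode_age, total_years, n_subjects, round(mean_age, 2))
```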

The median represents the middle value; half are above it and half below. If we had the precise age of each subject (for instance, 17 years and 124 days), we could arrange all 31 subjects in order by age, and the median for the whole group would be the age of the middle subject.

We do not, however, know precise ages; our data constitute grouped data in this regard. Three people who are not precisely the same age have been grouped in the category “13 years old,” for example.

Figure 11.1 illustrates the logic of calculating a median for grouped data. Because there are 31 subjects altogether, the middle subject is number 16 when they are arranged by age: 15 are younger and 15 are older. The bottom portion of Figure 11.1 shows that the middle person is one of the eight 16-year-olds. In the enlarged view of that group, we see that number 16 is the third from the left.
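One way to express this grouped-data logic in code is sketched below. It is our own Python illustration, and it assumes, as the enlarged view in Figure 11.1 appears to, that the subjects in an age group are spread evenly across the year, with each one placed at the midpoint of an equal share. It also assumes an odd number of subjects, so that there is a single middle subject.

```python
def grouped_median(counts):
    """Median for whole-year age groups, spreading each group's subjects
    evenly across the year (assumed reading of Figure 11.1)."""
    n = sum(counts.values())
    middle = (n + 1) / 2  # position of the middle subject; assumes n is odd
    seen = 0
    for age in sorted(counts):
        if seen + counts[age] >= middle:
            rank_in_group = middle - seen  # e.g., third of the eight 16-year-olds
            return age + (rank_in_group - 0.5) / counts[age]
        seen += counts[age]


age_counts = {13: 3, 14: 4, 15: 6, 16: 8, 17: 4, 18: 3, 19: 3}
median_age = grouped_median(age_counts)
print(median_age)
```

For these data the function places subject 16 at 16.3125, which Figure 11.1 reports as 16.31.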

Measures of Dispersion

In the research literature, we find both means and medians presented. Whenever means are presented, we must be aware that they are susceptible to extreme values: a few very large or very small numbers can change the mean dramatically. Because of this, it is usually important to examine measures of dispersion about the mean.

The simplest measure of dispersion is the range: the distance separating the highest from the lowest value. Thus besides reporting that our subjects have a mean age of 15.87, we might also indicate that their ages range from 13 to 19. A somewhat more sophisticated measure of dispersion is the standard deviation, which can be described as the average amount of variation about the mean. If the mean is the

In our example of juvenile court cases, the standard deviation of 1.41 is rather low relative to the mean of 14. Now compare the data for juvenile court with the bottom half of Table 11.2, which presents ages for a hypothetical group of adult court defendants. The mean is higher, of course, because adults are older than ju-

comparative sense. Comparing the relative values for the standard deviation and the mean indicates how much variation there is in a group of cases, relative to the average. Similarly, comparing standard deviations for different groups of cases indicates relative amounts of dispersion within each group.

Table 11.2 Standard Deviation for Two Hypothetical Distributions

Juvenile Court

Age                  Deviation from Mean   Squared Deviation from Mean
12                   −2                    4
15                   1                     1
14                   0                     0
13                   −1                    1
15                   1                     1
14                   0                     0
16                   2                     4
16                   2                     4
12                   −2                    4
13                   −1                    1
Sum 140              0                     20
Average 14           (0)                   (2)
Standard deviation   1.41

Adult Court

Age                  Deviation from Mean   Squared Deviation from Mean
18                   −10                   100
37                   9                     81
23                   −5                    25
22                   −6                    36
25                   −3                    9
43                   15                    225
19                   −9                    81
50                   22                    484
21                   −7                    49
22                   −6                    36
Sum 280              0                     1,126
Average 28           (0)                   (112.6)
Standard deviation   10.61

sure of dispersion, the standard deviation plays a role in the calculation of other descriptive statistics, some of which we will touch on later in this chapter. The standard deviation is also a central component of many inferential statistics used to make generalizations from a sample of observations to the population from which the sample was drawn.

Comparing Measures of Dispersion and Central Tendency

Other measures of dispersion can help us interpret measures of central tendency. One useful indicator that expresses both dispersion and grouping of cases is the percentile, which indicates what percentage of cases fall at or below some value. For example, scores on achievement tests such as the SAT are usually reported in both percentiles and raw scores. Thus a raw

veniles. More important for illustrating the standard deviation, there is greater variation in the distribution of adult court defendants, as illustrated by the standard deviation and the columns that show raw deviations and squared deviations from the mean of 28. The standard deviation for adult cases (10.61) is much higher relative to the mean of 28 than are the relative values of the standard deviation and mean for juvenile cases. In this hypothetical example, the substantive reason for this is obvious: there is much greater age variation in adult court than in juvenile court because the range for ages of adults is potentially greater (18 to whatever) than the range for ages in juveniles (1 to 17). As a result, the standard deviation for adult defendants indicates greater variation than the same measure for juvenile defendants.
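To make the comparison concrete, the sketch below computes the mean, the standard deviation, and the ratio of the two for both groups in Table 11.2. It is our own Python illustration, not part of the original text. The ratio of standard deviation to mean is only one possible way to express "relative to the mean," and `summarize` is a hypothetical helper name.

```python
import statistics

# Hypothetical ages from Table 11.2.
juvenile = [12, 15, 14, 13, 15, 14, 16, 16, 12, 13]
adult = [18, 37, 23, 22, 25, 43, 19, 50, 21, 22]


def summarize(ages):
    """Return the mean, the standard deviation, and their ratio."""
    mean = statistics.mean(ages)
    sd = statistics.pstdev(ages)  # divides by N, matching Table 11.2
    return mean, sd, sd / mean


juv_mean, juv_sd, juv_ratio = summarize(juvenile)
adult_mean, adult_sd, adult_ratio = summarize(adult)

print(juv_mean, round(juv_sd, 2), round(juv_ratio, 2))
print(adult_mean, round(adult_sd, 2), round(adult_ratio, 2))
```

The adult standard deviation is a much larger fraction of its mean than the juvenile one, which is the pattern the paragraph above describes.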

In addition to providing a summary mea-

Table 11.3 Hypothetical Data on Distribution of Prior Arrests

Number of        Number      Percentage    Percentile/
Prior Arrests    of Cases    of Cases      Quartile
0                1           0.56
1                16          8.89
2                31          17.22         25th/1st
3                23          12.78
4                20          11.11         50th/2nd
5                16          8.89
6                19          10.56
7                18          10.00         75th/3rd
8                11          6.11
9                14          7.78
10               5           2.78
30               3           1.67
40               2           1.11
55               1           0.56
Total            180         100%

Mode                 2
Median               4
Mean                 5.76
Range                0–55
Standard deviation   6.64

and second quartile. Only one-fourth of the cases have 8 or more prior arrests.

Notice also the different values for our three measures of central tendency. The mode for prior arrests is 2, and the mean or average number is 5.76. Whenever the mean is much higher than the mode, it indicates that the mean is distorted by a small number of persons with many prior arrests. The standard deviation of 6.64 further indicates that our small population has quite a bit of variability. Figure 11.2 presents a graphic representation of the dispersion of cases and the different values for the three measures of central tendency.
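The summary statistics in Table 11.3 can be reproduced from the frequency distribution. The Python sketch below is our own illustration. It expands the table into one value per case and uses the standard `statistics` module. The helper `value_at_percentile` is a hypothetical name that applies the "at or below" definition of a percentile used in this chapter.

```python
import statistics

# Table 11.3: number of prior arrests -> number of cases.
prior_arrests = {0: 1, 1: 16, 2: 31, 3: 23, 4: 20, 5: 16, 6: 19,
                 7: 18, 8: 11, 9: 14, 10: 5, 30: 3, 40: 2, 55: 1}

# One entry per case, so ordinary list-based functions can be used.
cases = [arrests for arrests, n in prior_arrests.items() for _ in range(n)]


def value_at_percentile(counts, p):
    """Smallest value with at least p percent of cases at or below it."""
    total = sum(counts.values())
    running = 0
    for value in sorted(counts):
        running += counts[value]
        if 100 * running / total >= p:
            return value


mode = statistics.mode(cases)
median = statistics.median(cases)
mean = statistics.mean(cases)
std_dev = statistics.pstdev(cases)  # divides by N, as elsewhere in the chapter
quartiles = [value_at_percentile(prior_arrests, p) for p in (25, 50, 75)]

print(len(cases), mode, median, round(mean, 2), round(std_dev, 2), quartiles)
```

Dropping the six cases with 30 or more prior arrests from `prior_arrests` and rerunning the sketch is a quick way to see how strongly a few extreme values pull on the mean and standard deviation while leaving the mode and median unchanged.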

Distributions such as those shown in Table 11.3 and Figure 11.2 are known as skewed distributions. Although most cases cluster near the low end, a few are spread out over very high

score of 630 might fall in the 80th percentile, indicating that 80 percent of persons who take the SAT achieve scores of 630 or less; alternatively, the 80th percentile means that 20 percent of scores were higher than 630. Percentiles may also be grouped into quartiles, which give the cases that fall in the first (lowest), second, third, and fourth (highest) quarters of a distribution.

Table 11.3 presents a distribution of prior arrests for a hypothetical population of, say, probationers to illustrate different measures of central tendency and dispersion. Notice that, although the number of prior arrests ranges from 0 to 55, cases cluster in the lower end of this distribution. Half the cases have 4 or fewer prior arrests, as indicated by three descriptive statistics in Table 11.3: median, 50th percentile,

Figure 11.2 Graphic Representation of a Distribution of Prior Arrests (Hypothetical Data)

[Figure: bar chart. Horizontal axis: number of prior arrests (0 through 10, then 30, 40, and 55). Vertical axis: number of cases (0 to 35). Marked values: Mode = 2, Median = 4, Mean = 5.76.]

UCR figures on total murders for 2004 in four states.

Obviously, California had far more murders than the other three states, but these figures are difficult to interpret because of large differences in the states’ total populations. Computing rates enables us to standardize by population size and make more meaningful comparisons, as Table 11.5 shows.

We can see that Louisiana, even with the fewest murders in 2004 (among the states reported here), had the highest murder rate. Notice also that the murder rate is expressed as the number of murders per 100,000 population. This is a common convention in reporting rates of crime and other rare events. To get the rate of murder per person, move the decimal point five places to the left for each figure in Table 11.5. You can clearly see which version is easier to interpret.
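The rates in Table 11.5 follow directly from the counts and populations in Table 11.4. A brief Python sketch of the calculation is shown below; it is our own illustration, and `rate_per_100k` is a hypothetical helper name.

```python
# Table 11.4: state -> (total murders in 2004, 2004 population).
states = {
    "California": (2_407, 35_894_000),
    "Florida": (946, 17_397_000),
    "Louisiana": (574, 4_516_000),
    "Pennsylvania": (650, 12_406_000),
}


def rate_per_100k(events, population):
    """Events per 100,000 population."""
    return events / population * 100_000


rates = {state: round(rate_per_100k(murders, population), 1)
         for state, (murders, population) in states.items()}

highest = max(rates, key=rates.get)

# The same rate per person: five decimal places to the left.
louisiana_per_person = rates["Louisiana"] / 100_000

print(rates, highest, louisiana_per_person)
```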

The arithmetic of calculating rates could not be much easier. What is not so simple, and in any event requires careful consideration, is deciding on the two basic components of rates: numerator and denominator. The numerator represents the central concept we are interested

values for prior arrests. Many variables of interest to criminal justice researchers are skewed in similar ways, especially when examined for a general population. Most people have no prior arrests, but a small number of persons have many. Similarly, most people suffer no victimization from serious crime in any given year, but a small number of persons are repeatedly victimized.

In an appropriately titled article, “Deviating from the Mean,” Michael Maltz (1994) cautions that criminologists, failing to recognize high levels of variation, sometimes report means for populations that exhibit a great deal of skewness. When reading reports of criminal justice research, researchers are advised to look closely at measures of both dispersion and central tendency. When the numerical value of the standard deviation is high and that for the mean is low, the mean is not a good measure of central tendency.

The preceding calculations are not appropriate for all variables. To understand this, we must examine two types of variables: continuous and discrete. Age and number of prior arrests are continuous ratio variables; they increase steadily in tiny fractions instead of jumping from category to category as does a discrete variable such as gender or marital status. If discrete variables are being analyzed (a nominal or ordinal variable, for example), then some of the techniques discussed previously are not applicable.

Strictly speaking, medians and means should be calculated for only interval and ratio data, respectively. If the variable in question is gender, for instance, raw numbers or percentage marginals are appropriate and useful measures. Calculating the mode is a legitimate tool of analysis, but reports of mean, median, or dispersion summaries would be inappropriate.

Computing Rates

Rates are fundamental descriptive statistics in criminal justice research. In most cases, rates are used to standardize some measure for comparative purposes. For example, Table 11.4 shows

Table 11.4 Total Murders in Four States, 2004

State          Total Murders, 2004   2004 Population
California     2,407                 35,894,000
Florida        946                   17,397,000
Louisiana      574                   4,516,000
Pennsylvania   650                   12,406,000

Source: Federal Bureau of Investigation 2005.

Table 11.5 Murder Rates per 100,000 Population, 2004

California     6.7
Florida        5.4
Louisiana      12.7
Pennsylvania   5.2

an example of confusion about the meaning of rates.

Describing Two or More Variables

Descriptive statistics applied to two or more variables are tools to understand relationships among those variables.

Univariate analyses describe the units of analysis of a study and, if they are a sample drawn from some larger population, allow us to make descriptive inferences about the larger population. Bivariate and multivariate analyses are aimed primarily at explanation.

Often it’s appropriate to describe subsets of cases, subjects, or respondents. Table 11.6, for example, presents hypothetical data on sentence length for offenders grouped by prior felony record. In some situations, the researcher presents subgroup comparisons purely for descriptive purposes. More often, the purpose of subgroup descriptions is comparative. In this case, comparing sentences for subgroups of convicted offenders implies some causal connection between prior felony record and sentence length. Similarly, if we compare sentence lengths for men and women, it implies that something about gender has a causal effect on sentence length.

Bivariate Analysis

In contrast to univariate analysis, subgroup comparisons constitute a kind of bivariate analysis in that two variables are involved. In such situations, we are usually interested in relationships among the variables. Thus univariate analysis and subgroup comparisons focus on describing the people (or other units of analysis) under study, whereas bivariate analysis focuses more on the variables themselves.

Notice, then, that Table 11.7 can be regarded as a subgroup comparison: it independently describes gun ownership among male and female respondents in the 2000 General Social Survey. It shows, comparatively and descriptively, that fewer females than males report owning a gun.

in measuring, so selecting the numerator involves all the considerations of measurement we have discussed elsewhere. Murder rates, arrest rates, conviction rates, and incarceration rates are common examples in which the numerator is a relatively straightforward count.

Choosing the right denominator is important. In most cases, we should compute rates to standardize according to some population eligible to be included in the numerator. Sometimes, the choice is fairly obvious, as in our use of each state’s total population to compute murder rates. To compute rates of rape or sexual assault, we should probably use the population of adult women in the denominator, although we should also consider how to handle rapes with male victims. Because households are at risk of residential burglary, burglary rates should be computed using some count of households. Similarly, commercial burglaries should be based on a count of commercial establishments, and auto theft on an indicator of registered autos.

More difficult problems can arise in computing rates to express a characteristic of a mobile population. For example, residents of Miami are at risk of criminal victimization in that city, but so are tourists and other visitors. Because many nonresidents visit or pass through the city in any given year, a measure of Miami’s crime rate based on only the city’s resident population (such as a U.S. Census count) will tend to overestimate the number of crimes standardized by the population at risk; many people at risk will not be counted in the denominator. Or what about estimating the crime rate on a subway system? The population at risk here is users, who, in New York City, amount to millions of persons per day.

Rates are very useful descriptive statistics that may be easily computed. It is important, however, to be careful in selecting numerators and denominators. Recognize that this caution applies as much to questions of making measurements as it does to questions of computing descriptive statistics. The box titled “Murder on the Job,” presented later in this chapter, gives

Another, related problem complicates the lives of novice data analysts: how do you read a percentage table? There is a temptation to read Table 11.7 as “Among females, only 25 percent owned a gun, and 75 percent did not; therefore, being female makes you less likely to own a gun.” That is not the correct way to read the table, however. The conclusion that gender, as a variable, has an effect on gun ownership must hinge on a comparison between males and females. Specifically, we compare the 25 percent of females with the 42 percent of males and note that women are less likely than men to own a gun. The appropriate comparison of subgroups, then, is essential in reading an explanatory bivariate table.

Percentaging a Table

In constructing and presenting Table 11.7, we have used a convention called percentage down. This means that we can add the percentages down each column to total 100 percent. We read this form of table across a row. For the row labeled “Yes,” what percentage of the males own a gun? What percentage of the females?

The percentage-down convention is just that, a conventional practice; some researchers prefer to percentage across. They would organize Table 11.7 with “Male” and “Female” on the left side of the table, identifying the two rows, and “Yes” and “No” at the top, identifying the columns. The actual numbers in the table would be moved around accordingly, and each row of percentages would total 100 percent. In that case, we would make our comparisons between males and females by reading down,

The same table viewed as an explanatory bivariate analysis tells a somewhat different story. It suggests that the variable gender has an effect on the variable gun ownership. The behavior is seen as a dependent variable that is partially determined by the independent variable, gender. Explanatory bivariate analyses, then, involve the variable language we introduced in Chapter 1. In a subtle shift of focus, we are no longer talking about male and female as different subgroups but about gender as a variable, one that has an influence on other variables.

The logic of causal relationships among variables has an important implication for the construction and reading of percentage tables. Novice data analysts often have difficulty in deciding on the appropriate direction of “percentaging” for any given table. In Table 11.7, for example, we divided the group of subjects into two subgroups, male and female, and then described the behavior of each subgroup. That is the correct way to construct this table.

Notice, however, that it would have been possible, though inappropriate, to construct the table differently. We could have first divided the subjects into different categories of gun ownership and then described each of those subgroups by the percentage of male and female subjects in each. This method would make no sense in terms of explanation, however; owning a gun does not make someone male or female.

Table 11.7 suggests that gender affects gun ownership. Had we used the other method of construction, the table would have suggested that gun ownership affects whether someone is male or female—which is nonsense.

Table 11.6 Hypothetical Illustration of Subgroup Comparisons: Length of Prison Sentence by Felony Criminal History

Felony Criminal History     Median Sentence Length
No arrests or convictions   6 months
Prior arrests only          11 months
Prior convictions           23 months

Table 11.7 Gun Ownership Among Male and Female Respondents, 2000

Own a Gun?   Male    Female
Yes          42%     25%
No           58      75
100% =       (817)   (1,040)

Source: 2000 General Social Survey (available at http://sda.berkeley.edu/archive.htm; accessed May 18, 2008).

been percentaged across. Follow these rules of thumb:

• If the table is percentaged down, read across.

• If the table is percentaged across, read down.
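The check described above, adding the percentages in each column and each row, is easy to automate. The following Python sketch is our own illustration. The function name `percentaging_direction` and the one-point `tolerance` for rounding error are assumptions, not conventions from the text.

```python
def percentaging_direction(table, tolerance=1.0):
    """Report whether a table of percentages totals 100 down its columns
    or across its rows. `table` is a list of rows of percentages."""
    row_totals = [sum(row) for row in table]
    column_totals = [sum(column) for column in zip(*table)]
    down = all(abs(total - 100) <= tolerance for total in column_totals)
    across = all(abs(total - 100) <= tolerance for total in row_totals)
    if down and across:
        return "ambiguous"
    if down:
        return "percentaged down: read across"
    if across:
        return "percentaged across: read down"
    return "unclear"


# Table 11.7 as printed: rows are Yes/No, columns are Male/Female.
table_11_7 = [[42, 25],
              [58, 75]]

# The same percentages arranged with Male/Female as rows.
rearranged = [list(row) for row in zip(*table_11_7)]

print(percentaging_direction(table_11_7))
print(percentaging_direction(rearranged))
```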

Here’s another example: Suppose we are interested in investigating newspaper editorial policies regarding the legalization of marijuana. We undertake a content analysis of editorials on this subject that have appeared during a given year in a sample of daily newspapers across the nation. Each editorial has been classified as favorable, neutral, or unfavorable with regard to the legalization of marijuana. Perhaps we wish to examine the relationship between edi-

within table columns, still asking what percentage of males and females owned guns. The logic and the conclusion would be the same in either case; only the form would be different.

In reading a table that someone else has constructed, it’s therefore necessary to find out in which direction it has been percentaged. Usually that will be apparent from the labeling of the table or the logic of the variables being analyzed. Sometimes, however, tables are not clearly labeled. In such cases, the reader should add the percentages in each column and each row. If each of the columns totals 100 percent, the table has been percentaged down. If the rows total 100 percent each, the table has

MURDER ON THE JOB

“High Murder Rate for Women on the Job,” read the headline for a brief story in the New York Times, reporting on a study released by the U.S. Department of Labor. The subhead was equally alarming, and misleading, to casual readers: “40 percent of women killed at work are murdered, but the figure for men is only 15 percent.” Think about this statement, in light of our discussion of how to percentage a table. You should be able to imagine something like the following:

Cause of Death at Work

          Women   Men
Murder    40%     15%
Other     60      85
Total     100%    100%

This table indicates that of those women who die while on the job at work, 40 percent are murdered, and the table is consistent with the opening paragraphs of the story. Notice that so far nothing has been said about how many women and men are murdered, or how many women and men die on the job from all causes. Later on, the story provides more details:

Vehicle accidents caused the most job-related deaths, 18 percent or 1,121 of the 6,083 work-related deaths in 1992. . . . Homicides, including shootings and stabbings, were a close second with 17 percent, or 1,004 deaths, said the study by the department’s Bureau of Labor Statistics.

This information enables us to supplement the table by adding row totals: 6,083 people died on the job in 1992, 1,004 of them were murdered, and 5,079 (6,083 – 1,004) died from other causes.

One more piece of information is needed to construct a contingency table: the total number of men and women killed on the job. The story does not tell us that directly, but it provides enough information to approximate the answer: “Although men are 55 percent of the workforce, they comprise 93 percent of all job-related deaths.” Men must therefore be 93 percent of the 6,083 total workplace deaths, or approximately 5,657; this leaves approximately 426 deaths of women on the job. “Approximately” is an important qualifier here, because computing numbers of cases from percentages creates some inconsistencies due to rounding off percentages reported in the newspaper story. Let’s now construct a contingency table to look at the numbers of workplace deaths. The next table shows computed numbers in parentheses. Notice that with the exception of “Total” all

is for simplicity of illustration and does not mean that rural refers to a community of less than 100,000 in any absolute sense.) Of these,

torial policies and the types of communities in which the newspapers are published, thinking that rural newspapers might be more conservative than urban ones. Thus each newspaper (and so, each editorial) is classified in terms of the population of the community in which it is published.

Table 11.8 presents some hypothetical data describing the editorial policies of rural and urban newspapers. Note that the unit of analysis in this example is the individual editorial. Table 11.8 tells us that there were 127 editorials about marijuana in our sample of newspapers published in communities with populations less than 100,000. (Note: This choice of 100,000

numbers have been estimated from the newspaper story’s percentages for men and women.

Cause of Death at Work

          Women (est.)   Men (est.)   Total   Total (est.)
Murder    (170)          (849)        1,004   (1,019)
Other     (256)          (4,808)      5,079   (5,064)
Total     (426)          (5,657)      6,083

The results are interesting. Although a greater percentage of women than men are murdered, a much larger number of men than women are murdered.

Now, recall the story’s headline, “High Murder Rate.” This implies that the number of women murdered on the job, divided by the total number of women at risk of murder on the job, is higher than the same computed rate for men. We need more information than the story provides to verify this claim, but there is a clue. Women are about 45 percent of the workforce, so there are about 1.2 men in the workforce for every woman (55 percent ÷ 45 percent). But about five times as many men as women are murdered on the job (849 ÷ 170).

This should tip you off that the headline is misleading. If the ratio of male-to-female murders is 5 to 1, but the ratio of male-to-female workers is 1.2 to 1, how could the murder rate for women be higher? You could compute actual rates of murder on the job by finding a suitable denominator; in this case, the number of men and women in the workforce would be appropriate. Consulting the Census Bureau publication Statistical Abstract of the United States would provide this information and enable you to compute rates, as in our final table:

                                              Women    Men
Civilian workforce (1,000s)                   53,284   63,593
Murdered at work                              170      849
On-the-job murder rate per 100,000 workers    0.319    1.335

So the New York Times got it wrong; there is a higher murder rate for men on the job. Women are less often killed on the job by any cause, including murder. But women who die on the job (in much smaller numbers than men) are more likely to die from murder than are men who die on the job. A murder rate expresses the number of people murdered divided by the population at risk.
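The chain of estimates in this box can be retraced in a few lines. The Python sketch below is our own illustration, using only figures quoted in the box. As the box cautions, the counts are approximations derived from rounded percentages, and the variable names are hypothetical.

```python
total_deaths = 6_083                        # all workplace deaths, 1992
men_share_of_deaths = 0.93                  # "93 percent of all job-related deaths"
murder_share = {"women": 0.40, "men": 0.15} # share of each group's deaths that were murders
workforce = {"women": 53_284_000, "men": 63_593_000}  # civilian workforce

# Estimated deaths by sex: men are 93 percent of the total, women the rest.
deaths = {"men": round(men_share_of_deaths * total_deaths)}
deaths["women"] = total_deaths - deaths["men"]

# Estimated murders: each group's murder percentage times its deaths.
murders = {sex: round(murder_share[sex] * deaths[sex]) for sex in deaths}

# A murder rate divides murders by the population at risk (the workforce).
rates = {sex: murders[sex] / workforce[sex] * 100_000 for sex in murders}

print(deaths, murders, {sex: round(rate, 3) for sex, rate in rates.items()})
```

The percentage of deaths that are murders is higher for women, but the rate per 100,000 workers is higher for men, because the two figures use different denominators.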

Rates are often computed with inappropriate denominators. But it is less common to find the term rate used so inaccurately.

Source: “High Murder Rate for Women on the Job” (1993); U.S. Bureau of the Census (1992).

Table 11.8 Newspaper Editorials on the Legalization of Marijuana

Editorial Policy Toward     Community Size
Legalizing Marijuana        Under 100,000   Over 100,000
Favorable                   11%             32%
Neutral                     29              40
Unfavorable                 60              28
100% =                      (127)           (438)

2. Describe each subgroup of editorials in terms of the percentages favorable, neutral, or unfavorable toward the legalization of marijuana.

3. Compare the two subgroups in terms of the percentages favorable toward the legalization of marijuana.

Bivariate Table Formats

Tables such as those we’ve been examining are commonly called contingency tables: values of the dependent variable are contingent on values of the independent variable. Although contingency tables are commonly used in criminal justice research, their format has never been standardized. As a result, a variety of formats will be found in the research literature. As long as a table is easy to read and interpret, there is probably no reason to strive for standardization; however, the following guidelines should be followed in the presentation of most tabular data:

• Provide a heading or a title that succinctly describes what is contained in the table.

• Present the original content of the variables clearly—in the table itself if at all possible or in the text with a paraphrase in the table. This information is especially critical when a variable is derived from responses to an attitudinal question because the meaning of the responses will depend largely on the wording of the question.

• Clearly indicate the attributes of each variable. Complex categories need to be abbreviated, but the meaning should be clear in the table, and of course, the full description should be reported in the text.

• When percentages are reported in the table, identify the base on which they are computed. It is redundant to present all the raw numbers for each category, because these could be reconstructed from the percentages and the bases. Moreover, the presentation of both numbers and percentages often makes a table more difficult to read.

• If any cases are omitted from the table because of missing data (“no answer,” for

11 percent (14 editorials) were favorable toward the legalization of marijuana, 29 percent were neutral, and 60 percent were unfavorable. Of the 438 editorials that appeared in our sample of newspapers published in communities with more than 100,000 residents, 32 percent (140 editorials) were favorable toward legalizing marijuana, 40 percent were neutral, and 28 percent were unfavorable.

When we compare the editorial policies of rural and urban newspapers in our imaginary study, we find, as expected, that rural newspapers are less favorable toward the legalization of marijuana than are urban newspapers. That is determined by noting that a larger percentage (32 percent) of the urban editorials than the rural ones (11 percent) were favorable. We might note, as well, that more rural than urban editorials were unfavorable (60 percent versus 28 percent). Note, too, that this table assumes that the size of a community might affect its newspaper’s editorial policies on this issue, rather than that editorial policy might affect the size of communities.

Constructing and Reading Tables

Before introducing multivariate analysis, let’s review the steps involved in the construction of explanatory bivariate tables:

1. Divide the cases into groups according to attributes of the independent variable.

2. Describe each of these subgroups in terms of attributes of the dependent variable.

3. Read the table by comparing the independent variable subgroups with one another in terms of a given attribute of the dependent variable.
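The three steps can be sketched in code using the marijuana editorial example. This Python illustration is our own. Only the two "favorable" counts (14 and 140) are stated in the text; the other four counts are reconstructed from the rounded percentages and bases in Table 11.8 and should be treated as an assumption.

```python
from collections import Counter

# Editorials by (community size, editorial policy). The neutral and
# unfavorable counts are reconstructed from Table 11.8 (an assumption).
editorials = Counter({
    ("under 100,000", "favorable"): 14,
    ("under 100,000", "neutral"): 37,
    ("under 100,000", "unfavorable"): 76,
    ("over 100,000", "favorable"): 140,
    ("over 100,000", "neutral"): 175,
    ("over 100,000", "unfavorable"): 123,
})

# Step 1: divide the cases into groups by the independent variable.
bases = Counter()
for (size, _policy), n in editorials.items():
    bases[size] += n

# Step 2: describe each group in terms of the dependent variable.
percentages = {
    (size, policy): round(100 * n / bases[size])
    for (size, policy), n in editorials.items()
}

# Step 3: compare the groups on one attribute of the dependent variable.
favorable_gap = (percentages[("over 100,000", "favorable")]
                 - percentages[("under 100,000", "favorable")])

print(dict(bases), percentages, favorable_gap)
```

Percentaging within each community-size group, rather than within each editorial policy, reflects the assumption that community size is the independent variable.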

In the example of editorial policies regarding the legalization of marijuana, the size of a community is the independent variable, and a newspaper’s editorial policy is the dependent variable. The table is constructed as follows:

1. Divide the editorials into subgroups according to the sizes of the communities in which the newspapers are published.

victimization; younger people are more often victims of assault and robbery, for example. A classic book by Michael Hindelang and associates (Hindelang, Gottfredson, and Garofalo 1978) suggested a lifestyle explanation for this relationship. The lifestyle of many younger people (visiting bars and clubs for evening entertainment, for example) exposes them to street crime and potential predators more than does the less active lifestyle of older people. This is certainly a sensible hypothesis, and Hindelang and associates found general support for the lifestyle explanation in their analysis of data from early versions of the National Crime Victimization Survey (NCVS). But the U.S. crime survey data did not include direct measures of lifestyle concepts.

Questionnaire items in the British Crime Survey (BCS) provided better measures of individual behaviors. Using these data, Ronald Clarke and associates (Clarke, Ekblom, Hough, and Mayhew 1985) examined the link between exposure to risk and victimization, while holding age and gender constant. Specifically, Clarke and colleagues hypothesized that older persons are less often victims of street crime because they spend less time on the streets. The 1982 BCS asked respondents whether they had left their homes in the previous week (that is, the week before they were interviewed) for any evening leisure or social activities. Those who responded yes were asked which nights they had gone out and what they had done.

Hypothesizing that some types of evening activities are more risky than others, Clarke and associates restricted their analysis to leisure pursuits away from respondents’ homes, such as visiting a pub, nightclub, or theater. Their dependent variable, street crime victimization, was also carefully defined to include only crimes against persons (actual and attempted assault, robbery, rape, and theft) that occurred away from victims’ homes or workplaces or the homes of friends. Furthermore, because the leisure behavior questions asked about evening activities, only street crime victimizations that

example), indicate their numbers in the table.

By following these guidelines and think-ing carefully about the kinds of causal and de-scriptive relationships they want to examine, researchers address many policy and research questions in criminal justice. We want to em-phasize, however, the importance of thinking through the logic of contingency tables. De-scriptive statistics— contingency tables, mea-sures of central tendency, or rates—are some-times misrepresented or misinterpreted. See the box titled “Murder on the Job” for an example of this.

Multivariate AnalysisA great deal of criminal justice research uses multivariate techniques to examine relation-ships among several variables. Like much statis-tical analysis, the logic of multivariate analysis is straightforward, but the actual use of many multivariate statistical techniques can be com-plex. A full understanding requires a solid back-ground in statistics and is beyond the scope of this book. In this section, we briefl y discuss the construction of multivariate tables—those con-structed from three or more variables—and the comparison of multiple subgroups.

Multivariate tables can be constructed by following essentially the same steps outlined previously for bivariate tables. Instead of one independent variable and one dependent vari-able, however, we will have more than one in-dependent variable. And instead of explaining the dependent variable on the basis of a single independent variable, we’ll seek an explanation through the use of more than one independent variable. Let’s consider an example from re-search on victimization.

Multivariate Tables: Lifestyle and Street Crime If we consult any source of published statistics on victimization (see Chapter 9), we will fi nd several tables that document a re-lationship between age and personal crime

302 Part Four Application and Analysis

took place between 6:00 p.m. and midnight were included.

Clarke and associates therefore proposed a very specific hypothesis that involves three carefully defined concepts and variables: older persons are less often victims of street crime because they less often engage in behavior that exposes them to risk of street crime. Tables 11.9A through 11.9C present cross-tabulations for the three possible bivariate relationships among these variables: evening street crime victimization by age and by evening leisure pursuits, and evening leisure pursuits by age.

The relationships illustrated in these tables are consistent with the lifestyle hypothesis of personal crime victimization. First, victimization was more common for younger people (ages 16 to 30) and for those who pursued leisure activities outside their home three or more evenings in the previous week (Tables 11.9A and 11.9B). Second, as shown in Table 11.9C, the attributes of young age and frequent exposure to risk were positively related: about 41 percent of the youngest group had gone out three or more nights, compared with 20 percent of those ages 31 to 60 and only 10 percent of those over age 60.

However, because we are interested in the effects of two independent variables—lifestyle and age—on victimization, we must construct a table that includes all three variables.

Several of the tables we have presented in this chapter are somewhat inefficient. When the dependent variable—street crime victimization—is dichotomous (two attributes), knowing one attribute permits us to easily reconstruct the other. Thus if we know from Table 11.9A that 1 percent of respondents ages 31 to 60 were victims of street crime, then we know automatically that 99 percent were not victims. So reporting the percentages for both values of a dichotomy is unnecessary. On the basis of this recognition, Table 11.9D presents the relationship between victimization and two independent variables in a more efficient format.

Table 11.9A Evening Street Crime Victimization by Age

Street Crime Victim    16–30      31–60      61+
Yes                    4.8%       1.0%       0.3%
No                     95.2       99.0       99.7
100% =                 (2,738)    (4,460)    (1,952)

Table 11.9B Evening Street Crime Victimization by Evenings Out During Previous Week

Street Crime Victim    None       1 or 2     3+
Yes                    1.2%       1.6%       3.8%
No                     98.9       98.4       96.2
100% =                 (3,252)    (3,695)    (2,203)

Table 11.9C Evenings Out During Previous Week by Age

Evenings Out           16–30      31–60      61+
None                   20.6%      35.2%      57.2%
1 or 2                 38.6       44.9       32.5
3+                     40.9       19.8       10.2
100% =                 (2,738)    (4,460)    (1,952)

Table 11.9D Evening Street Crime Victimization by Age and Evenings Out

Percentage of Victims

Evenings Out           16–30      31–60      61+
None                   3.9        1.0        0.2
                       (563)      (1,572)    (1,117)
1 or 2                 3.8        0.9        0.2
                       (1,056)    (2,004)    (635)
3+                     6.2        1.4        1.1
                       (1,119)    (884)      (200)
Total N =              (2,738)    (4,460)    (1,952)

Note: Percentages and numbers of cases computed from published tabulations.

Source: Adapted from Clarke, Ekblom, Hough, and Mayhew (1985, Tables 1, 2, and 3).
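The arithmetic behind these tables can be checked with a short sketch. The Python below is only an illustration: the cell sizes and percentages are copied from Table 11.9D, while the function and variable names are our own assumptions and are not part of the original study. It percentages the cell sizes down each age column, which should reproduce Table 11.9C, and converts each cell's percentage of victims back into an approximate count.

```python
# Cell sizes (N) from Table 11.9D, by evenings out; columns are ages 16-30, 31-60, 61+.
cell_n = {
    "None": [563, 1572, 1117],
    "1 or 2": [1056, 2004, 635],
    "3+": [1119, 884, 200],
}

# Percentage of street crime victims in each cell (Table 11.9D).
pct_victims = {
    "None": [3.9, 1.0, 0.2],
    "1 or 2": [3.8, 0.9, 0.2],
    "3+": [6.2, 1.4, 1.1],
}


def column_totals(cells):
    """Sum the cell sizes down each age column."""
    return [sum(column) for column in zip(*cells.values())]


def column_percentages(cells):
    """Percentage each cell down its age column, rounded to one decimal."""
    totals = column_totals(cells)
    return {
        row: [round(100 * n / total, 1) for n, total in zip(sizes, totals)]
        for row, sizes in cells.items()
    }


def victim_counts(cells, percentages):
    """Approximate number of victims per cell: cell size times percentage, rounded."""
    return {
        row: [round(n * pct / 100) for n, pct in zip(cells[row], percentages[row])]
        for row in cells
    }


if __name__ == "__main__":
    print(column_totals(cell_n))
    print(column_percentages(cell_n))
    print(victim_counts(cell_n, pct_victims))
```

Because the published percentages are rounded, the reconstructed victim counts are approximations rather than exact tallies.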


In Table 11.9D, the percentages of respondents who reported a street crime victimization are shown in the cells at the intersections of the two independent variables. The numbers presented in parentheses below each percentage are the numbers of cases on which the percentages are based. Thus we know that 563 people ages 16 to 30 did not go out for evening leisure in the week before their interview and that 3.9 percent of them were victims of street crime in the previous year. We can calculate from this that 22 of those 563 people were victims and the other 541 people were not victims.

Let’s now interpret the results presented in this table:

• Within each age group, persons who pursue outside evening leisure activities three or more times per week are more often victimized.

• There is not much difference between those who go out once or twice per week and those who stayed home.

• Within each category for evening leisure activities, street crime victimization declines as age increases.

• Exposure to risk through evenings out is less strongly related to street crime victimization than is age.

• Age and exposure to risk have independent effects on street crime victimization. Within a given attribute of one independent variable, different attributes of the second are still related to victimization.

• Similarly, the two independent variables have a cumulative effect on victimization. Younger people who go out three or more times per week are most often victimized.

Returning to the lifestyle hypothesis, what can we conclude from Table 11.9D? First, this measure of exposure to risk is, in fact, related to street crime victimization. People who go out more frequently are more often exposed to risk and more often victims of street crime. As Clarke and associates point out, however, differences in exposure to risk do not account for lower rates of victimization among older persons: within categories of exposure to risk, victimization still declines with age. So lifestyle is related to victimization, but this measure of behavior—exposure to risk of street crime—does not account for all age-related differences in victimization. Furthermore, what we might call lifestyle intensity plays a role here. Going out once or twice a week does not have as much impact on victimization as going out more often. The most intensely active night people ages 30 or younger are most often victims of street crime.

Multivariate contingency tables are powerful tools for examining relationships between a dependent variable and multiple independent variables measured at the nominal or categorical level. Contingency tables can, however, become cumbersome and difficult to interpret if independent variables have several categories or if more than two independent variables are included. In practice, criminal justice researchers often employ more sophisticated techniques for multivariate analysis of discrete or nominal variables. Although the logic of such analysis is not especially difficult, most people learn these techniques in advanced courses in statistics.

Inferential Statistics

When we generalize from samples to larger populations, we use inferential statistics to test the significance of an observed relationship.

Many criminal justice research projects examine data collected from a sample drawn from a larger population. A sample of people may be interviewed in a survey; a sample of court records may be coded and analyzed; a sample of newspapers may be examined through content analysis. Researchers seldom, if ever, study samples merely to describe the samples per se; in most instances, their ultimate purpose is to make assertions about the larger population from which the sample has been selected. Frequently, then, we will want to interpret our univariate and multivariate sample findings as the basis for inferences about some population.

This section examines the statistical measures used for making such inferences and their logical bases. We’ll begin with univariate data and then move to bivariate.

Univariate Inferences

The opening sections of this chapter dealt with methods of presenting univariate data. Each summary measure was intended to describe the sample studied. Now we will use those measures to make broader assertions about the population. This section will focus on two univariate measures: percentages and means.

If 50 percent of a sample of people say they received traffic tickets during the past year, then 50 percent is also our best estimate of the proportion of people who received traffic tickets in the total population from which the sample was drawn. Our estimate assumes a simple random sample, of course. It is rather unlikely, however, that precisely 50 percent of the population got tickets during the year. If a rigorous sampling design for random selection has been followed, we will be able to estimate the expected range of error when the sample finding is applied to the population.

The section in Chapter 6 on sampling theory covered the procedures for making such estimates, so they will be only reviewed here. The quantity

$$ s = \sqrt{\frac{p \times q}{n}} $$

where p is a percentage, q equals 1 – p, and n is the sample size, is called the standard error. As noted in Chapter 6, this quantity is very important in the estimation of sampling error. We may be 68 percent confident that the population figure falls within plus or minus 1 standard error of the sample figure, 95 percent confident that it falls within plus or minus 2 standard errors, and 99.7 percent confident that it falls within plus or minus 3 standard errors.

Any statement of sampling error, then, must contain two essential components: the confidence level (for example, 95 percent) and the confidence interval (for example, 2.5 percent). If 50 percent of a sample of 1,600 people say they have received traffic tickets during the year, we might say we are 95 percent confident that the population figure is between 47.5 and 52.5 percent.

Recognize in this example that we have moved beyond simply describing the sample into the realm of making estimates (inferences) about the larger population. In doing that, we must be wary of three assumptions.

First, the sample must be drawn from the population about which inferences are being made. A sample taken from a telephone directory cannot legitimately be the basis for statistical inferences about the population of a city.

Second, the inferential statistics assume simple random sampling, which is virtually never the case in actual sample surveys. The statistics assume sampling with replacement, which is almost never done, but that is probably not a serious problem. Although systematic sampling is used more frequently than random sampling, that, too, probably presents no serious problem if done correctly. Stratified sampling, because it improves representativeness, clearly presents no problem. Cluster sampling does present a problem, however, because the estimates of sampling error may be too small. Clearly, street-corner sampling does not warrant the use of inferential statistics. The standard error calculation also assumes a 100 percent completion rate. This problem increases in seriousness as the completion rate decreases.

Third, inferential statistics apply to sampling error only; they do not take account of nonsampling errors. Thus, although we might correctly state that between 47.5 and 52.5 percent of the population (95 percent confidence) will report getting a traffic ticket during the previous year, we cannot so confidently guess the percentage that had actually received them. Because nonsampling errors are probably larger than sampling errors in a respectable sample design, we need to be especially cautious in generalizing from our sample findings to the population.

Tests of Statistical Significance

There is no scientific answer to the question of whether a given association between two variables is significant, strong, important, interesting, or worth reporting. Perhaps the ultimate test of significance rests with our ability to persuade readers (present and future) of the association’s significance. At the same time, a body of inferential statistics—known as parametric tests of significance—can assist in this regard. As the name suggests, parametric statistics make certain assumptions about the parameters that describe the population from which the sample is selected.

Although tests of statistical significance are widely reported in criminal justice literature, the logic underlying them is subtle and often misunderstood. A test of statistical significance is based on the same sampling logic that has been discussed elsewhere in this book. To understand that logic, let’s return to the concept of sampling error with regard to univariate data.

Recall that a sample statistic normally provides the best single estimate of the corresponding population parameter, but the statistic and the parameter are seldom identical. Thus we report the probability that the parameter falls within a certain range (confidence interval). The degree of uncertainty within that range is due to normal sampling error. The corollary of such a statement is, of course, that it is improbable that the parameter will fall outside the specified range only as a result of sampling error. Thus, if we estimate that a parameter (99.9 percent confidence) lies between 45 and 55 percent, we say by implication that it is extremely improbable that the parameter is actually, say, 70 percent if our only error of estimation is due to normal sampling.

The fundamental logic of tests of statistical significance, then, is this: faced with any discrepancy between the assumed independence of variables in a population and the observed distribution of sample elements, we may explain that discrepancy in either of two ways. (1) We attribute it to an unrepresentative sample, or (2) we reject the assumption of independence. The logic and statistics associated with probability sampling methods offer guidance about the varying probabilities of different degrees of unrepresentativeness (expressed as sampling error). Most simply put, there is a high probability of a small degree of unrepresentativeness and a low probability of a large degree of unrepresentativeness.

The statistical significance of a relationship observed in a set of sample data, then, is always expressed in terms of probabilities. Significant at the .05 level (p ≤ .05) simply means that the probability of a relationship as strong as the observed one being attributable to sampling error alone is no more than 5 in 100. Put somewhat differently, if two variables are independent of each other in the population, and if 100 probability samples were selected from that population, then no more than 5 of those samples should provide a relationship as strong as the one that has been observed.

There is a corollary to confidence intervals in tests of significance, which represents the probability of the measured associations being due to only sampling error. This is called the level of significance. Like confidence intervals, levels of significance are derived from a logical model in which several samples are drawn from a given population. In the present case, we assume that no association exists between the variables in the population, and then we ask what proportion of the samples drawn from that population would produce associations at least as great as those measured in the empirical data. Three levels of significance are frequently used in research reports: .05, .01, and .001. These mean, respectively, that the chances of obtaining the measured association as a result of sampling error are no more than 5 in 100, 1 in 100, and 1 in 1,000.


Researchers who use tests of significance normally follow one of two patterns. Some specify in advance the level of significance they will regard as sufficient. If any measured association is statistically significant at that level, they will regard it as representing a genuine association between the two variables. In other words, they are willing to discount the possibility of its resulting from sampling error only.

Other researchers prefer to report the specific level of significance for each association, disregarding the conventions of .05, .01, and .001. Rather than reporting that a given association is significant at, say, the .05 level, they might report significance at the .023 level, indicating that the chances of its having resulted from sampling error are no more than 23 in 1,000.

Visualizing Statistical Significance

In a Bureau of Justice Statistics publication describing the NCVS for a nontechnical audience, Michael Maltz and Marianne Zawitz (1998) present a very informative graphical display to show statistical significance. Recall that the NCVS is a national sample designed to estimate nationwide rates of victimization. Maltz and Zawitz use visual displays of estimates and their confidence intervals to demonstrate the relative precision of victimization rates disclosed by the survey.

Figure 11.3, reproduced from Maltz and Zawitz, presents an example of this approach. The figure shows annual rates of change in all violent victimizations from 1973 through 1996. Notice the vertical line representing no change in violent victimization rates for each year. Estimates of annual rates of change are shown in horizontal bars arrayed along the vertical line. The horizontal bars for each year present parameter estimates for annual change, signified by a dot or square, bracketed by the confidence intervals for each parameter estimate at three confidence levels: 68 percent (1 standard error), 90 percent (1.6 standard errors), and 95 percent (2 standard errors).

[Figure 11.3 Point Estimates and Confidence Intervals. Year-to-year changes in victimization rates and their precision: annual percent change in violent victimization, 1973–1996, with one horizontal bar for each pair of years from 1973–74 through 1995–96. The horizontal axis runs from –25% (decrease) through 0% (no change) to 25% (increase). Each bar marks the best estimate and the 68%, 90%, and 95% ranges for the true percent change; the key distinguishes a probability that a change occurred of greater than 95%, greater than 90%, and less than 90%. Source: Maltz and Zawitz (1998, 4).]

Consider first the topmost bar, showing estimates for change from 1973 to 1974. The small dot signifying the point estimate (an increase of 1.24 percent) is just to the right of the no-change vertical line. But notice also that the confidence intervals of 1, 1.6, and 2 standard errors cross over the no-change line; the estimate of a 1.24 percent increase is less than 2 standard errors above zero. This means that, using the .05 (2 standard errors) criterion, the estimated increase is not statistically significant. To further emphasize this point, notice the key in Figure 11.3, which shows different probabilities that a change occurred each year; the small dot representing the point estimate for 1973–74 change indicates the probability that a change occurred is less than 90 percent.

Now consider the most recent estimate shown in this figure—change from 1995 to 1996. The point estimate of –9.9 percent is well below the no-change line, and the confidence intervals are well to the left of this line. Bracketing the point estimate by 2 standard errors produces an interval estimate of between –15.7 percent (point estimate minus 2 standard errors) and –4.05 percent (point estimate plus 2 standard errors). This means we can be 95 percent certain that the violent victimization rate declined from 1995 to 1996 by somewhere between 4.05 and 15.7 percent. Because the point estimate of –9.9 percent is more than 2 standard errors from zero, we can confidently say there was a statistically significant decline in violent victimization.

Displaying point and interval estimates in this way accurately represents the concepts of statistical inference and statistical significance. Sample-based estimates of victimization, or any other variable, are just that—estimates of population values, bounded by estimates of standard error. Statistical significance, in this example, means that our estimates of change are above or below zero, according to some specified criterion for significance.

Studying Figure 11.3 will enhance your understanding of statistical inference. We suggest that you consider the point and interval estimates for each year’s change. Pay particular attention to the confidence intervals and their position relative to the no-change line. Then classify the statistical significance (at the .05 level) for change each year into one of three categories: (1) no change, (2) significant increase, and (3) significant decrease. You’ll find our tabulation in the exercises at the end of this chapter.
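The same "point estimate plus or minus 2 standard errors" logic runs through the traffic ticket example in the Univariate Inferences section and the classification exercise for Figure 11.3. The following Python is a minimal sketch of that logic, not a procedure from Maltz and Zawitz. The function names are our own. The standard error for 1995–96 is not reported in the text, so it is inferred from the stated interval of –15.7 to –4.05 percent. The standard error used for 1973–74 is purely hypothetical, chosen only to show a bar that crosses the no-change line.

```python
import math


def standard_error_of_proportion(p, n):
    """Standard error s = sqrt(p * q / n), where q = 1 - p."""
    return math.sqrt(p * (1 - p) / n)


def interval_estimate(point, se, z=2):
    """Bracket a point estimate by z standard errors (z = 2 is about 95 percent)."""
    return point - z * se, point + z * se


def classify_change(point, se, z=2):
    """Classify a change by whether its interval lies wholly above or below zero."""
    low, high = interval_estimate(point, se, z)
    if low > 0:
        return "significant increase"
    if high < 0:
        return "significant decrease"
    return "no change"


# Traffic ticket example: 50 percent of a sample of 1,600.
se_tickets = standard_error_of_proportion(0.50, 1600)
ticket_interval = interval_estimate(0.50, se_tickets)

# 1995-96 change: the standard error is inferred from the interval given in the
# text, which spans 4 standard errors in total.
se_1995_96 = (-4.05 - (-15.7)) / 4
change_1995_96 = classify_change(-9.9, se_1995_96)

# 1973-74 change: the standard error of 1.0 is a HYPOTHETICAL value for illustration.
change_1973_74 = classify_change(1.24, 1.0)

if __name__ == "__main__":
    print(se_tickets, ticket_interval)
    print(change_1995_96, change_1973_74)
```

Passing z=1 or z=1.6 to the same functions gives the narrower 68 and 90 percent brackets described for the figure.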

Chi Square

Chi square (χ²) is a different type of significance test that is widely used in criminal justice research. It is based on the null hypothesis: the assumption that there is no relationship between two variables in a population. Given the observed distribution of values on two variables in a contingency table, we compute the joint distribution that would be expected if there were no relationship between the two variables. The result of this operation is a set of expected frequencies for all the cells in the contingency table. We then compare this expected distribution with the empirical distribution—cases actually found in the data—and determine the probability that the difference between expected and empirical distributions could have resulted from sampling error alone. Stated simply, chi square compares what you get (empirical) with what you expect given a null hypothesis of no relationship. An example will illustrate this procedure.

Let’s assume we are interested in the possible relationship between gender and whether people avoid areas near their home because of crime, which we will refer to as avoidance behavior. To test this relationship, we select a sample of 100 people at random. Our sample is made up of 40 men and 60 women; 70 percent of our sample report avoidance behavior, and the remaining 30 percent do not.

If there is no relationship between gender and avoidance behavior, then 70 percent of the men in the sample should report avoiding areas near their home, and 30 percent should report no avoidance behavior. Moreover, women should describe avoidance behavior in the same proportion. Table 11.10 (part I) shows that, based on this model, we would expect 28 men and 42 women to say they avoid areas at night, with 12 men and 18 women reporting no avoidance.
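The expected frequencies follow directly from the row and column totals: under the null hypothesis, each cell is its row total times its column total, divided by the sample size. The Python below is a small sketch of that calculation using the totals from the example (70 and 30 for avoidance, 40 and 60 for gender). The function name is our own and is only illustrative.

```python
def expected_frequencies(row_totals, col_totals):
    """Expected cell counts under no relationship: row total x column total / N."""
    n = sum(row_totals)
    return [[r * c / n for c in col_totals] for r in row_totals]


# Rows: avoid areas, do not avoid areas. Columns: men, women.
expected = expected_frequencies([70, 30], [40, 60])

if __name__ == "__main__":
    for row in expected:
        print(row)
```

The same function works for tables with more than two rows or columns, since it only needs the marginal totals.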

Part II of Table 11.10 presents the observed avoidance behavior for the hypothetical sample of 100 people. Note that 20 of the men say they avoid areas at night, and the remaining 20 say they do not. Among the women in the sample, 50 avoid areas and 10 do not. Comparing the expected and observed frequencies (parts I and II), we note that somewhat fewer men report avoidance behavior than expected, whereas somewhat more women than expected avoid areas near their home at night.

Chi square is computed as follows: for each cell in the tables, we (1) subtract the expected frequency for that cell from the observed frequency, (2) square this quantity, and (3) divide the squared difference by the expected frequency. This procedure is carried out for each cell in the tables, and the results are added. Part III of Table 11.10 presents the cell-by-cell computations. The final sum is the value of chi square—12.70 in this example.

This value is the overall discrepancy between the observed distribution in the sample and the distribution we would expect if the two variables were unrelated. Of course, the mere discovery of a discrepancy does not prove that the two variables are related, because normal sampling error might produce discrepancies even when there is no relationship in the total population. The magnitude of the value of chi square, however, permits us to estimate the probability of that having happened.

Table 11.10 Hypothetical Illustration of Chi Square

I. Expected Cell Frequencies               Men     Women    Total
Avoid areas*                               28      42       70
Do not avoid areas                         12      18       30
Total                                      40      60       100

II. Observed Cell Frequencies              Men     Women    Total
Avoid areas                                20      50       70
Do not avoid areas                         20      10       30
Total                                      40      60       100

III. (Observed – Expected)² ÷ Expected     Men     Women
Avoid areas                                2.29    1.52     Chi sq. = 12.70
Do not avoid areas                         5.33    3.56     p < .001

* “Is there any area around here—that is, within a city block—that you avoid at night because of crime?”
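The cell-by-cell computation in Part III of Table 11.10 can be sketched in a few lines of Python. This is an illustrative sketch rather than a statistical package routine: the function name is our own, the observed counts come from Part II of the table, and the expected counts are derived from the table's own margins.

```python
def chi_square(observed):
    """Sum of (observed - expected)^2 / expected over every cell of a table."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)
    total = 0.0
    for i, row in enumerate(observed):
        for j, obs in enumerate(row):
            # Expected count if the row and column variables are unrelated.
            exp = row_totals[i] * col_totals[j] / n
            total += (obs - exp) ** 2 / exp
    return total


# Observed frequencies from Part II of Table 11.10.
# Rows: avoid areas, do not avoid areas. Columns: men, women.
observed = [[20, 50], [20, 10]]
chi_sq = chi_square(observed)

if __name__ == "__main__":
    print(round(chi_sq, 2))
```

Rounded to two decimals, the result should agree with the 12.70 reported in the table.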

To determine the statistical significance of the observed relationship, we must use a standard set of chi-square values. That will require the computation of the degrees of freedom. For chi square, the degrees of freedom are computed as follows: the number of rows in the table of observed frequencies, minus one, is multiplied by the number of columns, minus one. This may be written as (r – 1)(c – 1). In the present example, we have two rows and two columns (discounting the totals), so there is 1 degree of freedom.

Turning to a table of chi-square values (see the inside back cover), we find that, for 1 degree of freedom and random sampling from a population in which there is no relationship between two variables, 10 percent of the time we should expect a chi square of at least 2.7. Thus, if we select 100 samples from such a population, we should expect about 10 of those samples to produce chi squares equal to or greater than 2.7. Moreover, we should expect chi-square values of at least 6.6 in only 1 percent of the samples and chi-square values of 10.8 in only 0.1 percent of the samples. The higher the chi-square value, the less probable it is that the value can be attributed to sampling error alone.

In our example, the computed value of chi square is 12.70. If there is no relationship between gender and avoidance behavior and a large number of samples were selected and studied, we can expect a chi square of this magnitude in fewer than 0.1 percent of those samples. Thus the probability of obtaining a chi square of this magnitude is less than 0.001 if random sampling has been used and there is no relationship in the population. We report this finding by saying that the relationship is statistically significant at the .001 level. Because it is so improbable that the observed relationship could have resulted from sampling error alone, we are likely to reject the null hypothesis and assume that a relationship does, in fact, exist between the two variables.
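For readers without the printed table at hand, the probabilities for 1 degree of freedom can be approximated directly. The sketch below relies on a standard identity that is not part of this chapter: with 1 degree of freedom, chi square is the square of a standard normal deviate, so the probability of a value at least as large as x equals erfc(√(x/2)). The function name is our own, and the shortcut applies only to 1 degree of freedom. For other degrees of freedom, a printed table or a statistical library is needed.

```python
import math


def chi_square_p_value_df1(x):
    """Probability of a chi square >= x with 1 degree of freedom (null hypothesis true)."""
    return math.erfc(math.sqrt(x / 2))


# Degrees of freedom for the 2 x 2 example: (rows - 1) * (columns - 1).
df = (2 - 1) * (2 - 1)

# Probability for the computed chi square of 12.70.
p_example = chi_square_p_value_df1(12.70)

if __name__ == "__main__":
    print(df, p_example)
    # Unrounded critical values for the .10, .01, and .001 columns at 1 df.
    for critical in (2.706, 6.635, 10.828):
        print(critical, round(chi_square_p_value_df1(critical), 4))
```

The probability for 12.70 falls below .001, which is consistent with reporting the relationship as significant at the .001 level.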

Many measures of association can be tested for statistical significance in a similar manner. Standard tables of values permit us to determine whether a given association is statistically significant and at what level.

Cautions in Interpreting Statistical Significance

Tests of significance provide an objective yardstick against which to estimate the significance of associations between variables. They assist us in ruling out associations that may not represent genuine relationships in the population under study. However, the researcher who uses or reads reports of significance tests should be aware of certain cautions in their interpretation.

First, we have been discussing tests of statistical significance; there are no objective tests of substantive significance. We may be legitimately convinced that a given association is not due to sampling error but still assert, without fear of contradiction, that two variables are only slightly related to each other. Recall that sampling error is an inverse function of sample size: the larger the sample, the smaller the expected error. Thus a correlation of, say, .1 might well be significant (at a given level) if discovered in a large sample, whereas the same correlation between the same two variables would not be significant if found in a smaller sample. Of course, that makes perfect sense if one understands the basic logic of tests of significance: in the larger sample, there is less chance that the correlation is simply the product of sampling error.
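The effect of sample size on significance can be seen by holding the proportions in a table fixed and shrinking the number of cases. The Python below is a minimal sketch under that assumption: it takes the observed counts from Table 11.10, divides every cell by five, and recomputes chi square. The helper names are our own and are only illustrative.

```python
def chi_square(observed):
    """Chi square for a contingency table, using expected counts from its margins."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)
    return sum(
        (obs - row_totals[i] * col_totals[j] / n) ** 2
        / (row_totals[i] * col_totals[j] / n)
        for i, row in enumerate(observed)
        for j, obs in enumerate(row)
    )


def scale_table(observed, divisor):
    """Shrink every cell by the same whole-number divisor, keeping proportions fixed."""
    return [[cell // divisor for cell in row] for row in observed]


large_sample = [[20, 50], [20, 10]]            # 100 cases (Table 11.10, Part II)
small_sample = scale_table(large_sample, 5)    # 20 cases, same proportions

chi_large = chi_square(large_sample)
chi_small = chi_square(small_sample)

if __name__ == "__main__":
    print(round(chi_large, 2), round(chi_small, 2))
```

Because every term in the sum scales with the number of cases, one-fifth of the cases yields one-fifth of the chi-square value even though the pattern in the data is unchanged.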

Consider Table 11.11, in which 20 cases are distributed in the same proportions across row and column categories as in Table 11.10. In each table, 83 percent of women report avoidance behavior (10 out of 12 in Table 11.11, and 50 out of 60 in Table 11.10). But with one-fifth the number of cases in Table 11.11, the computed value of chi square is only one-fifth that obtained in Table 11.10. Consulting the distribution of chi-square values (inside back cover), we see that the probability of obtaining a chi square of 2.54 with 1 degree of freedom lies between .1 and .2. Thus, if there is no relationship between these two variables, we can expect to obtain a chi square of this size in 10 to 20 percent of samples drawn. Most researchers would not reject the null hypothesis of no relationship in this case.

Table 11.11 Hypothetical Illustration of Chi-Square Sensitivity to Sample Size

I. Expected Cell Frequencies               Men     Women    Total
Avoid areas*                               5.6     8.4      14
Do not avoid areas                         2.4     3.6      6
Total                                      8       12       20

II. Observed Cell Frequencies              Men     Women    Total
Avoid areas                                4       10       14
Do not avoid areas                         4       2        6
Total                                      8       12       20

III. (Observed – Expected)² ÷ Expected     Men     Women
Avoid areas                                0.46    0.30     Chi sq. = 2.54
Do not avoid areas                         1.07    0.71     .10 < p < .20

* “Is there any area around here—that is, within a city block—that you avoid at night because of crime?”

The distinction between statistical and substantive significance is perhaps best illustrated by those cases in which there is absolute certainty that observed differences cannot be a result of sampling error. That is the case when we observe an entire population. Suppose we are able to learn the age and gender of every murder victim in the United States for 1996. For argument’s sake, let’s assume that the average age of male murder victims is 25, as compared with, say, 26 for female victims. Because we have the ages of all murder victims, there is no question of sampling error. We know with certainty that the female victims are older than their male counterparts. At the same time, we can say that the difference is of no substantive significance. We conclude, in fact, that they are essentially the same age.

Second, lest you be misled by this hypothetical example, statistical significance should not be calculated on relationships observed in data collected from whole populations. Remember, tests of statistical significance measure the likelihood of relationships between variables being a product of only sampling error, which, of course, assumes that data come from a sample. If there’s no sampling, there’s no sampling error.

Third, tests of significance are based on the same sampling assumptions we use to compute confidence intervals. To the extent that these assumptions are not met by the actual sampling design, the tests of significance are not strictly legitimate.

In practice, tests of statistical significance are frequently used inappropriately. Michael Maltz (2006) presents several examples of questionable interpretations of significance. If you were to review any given issue of an academic journal in criminal justice, we’d be willing to bet that you would find one or more of these technically improper uses:

• Tests of significance computed for data representing entire populations

• Tests based on samples that do not meet the required assumptions of probability sampling

• Interpretation of statistical significance as a measure of association (a “relationship” of p ≤ .001 is “stronger” than one of p ≤ .05)

We do not mean to suggest a purist approach by these comments. We encourage you to use any statistical technique—any measure of association or any test of significance—on any set of data if it will help you understand your data. In doing so, however, you should recognize what measures of association and statistical significance can and cannot tell you, as well as the assumptions required for various measures. Any individual statistic or measure tells only part of the story, and you should try to learn as much of the story as you can.
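The chi-square value reported in Table 11.11 can be checked with a few lines of code. The following is a minimal sketch, not part of the original text: the function names `chi_square` and `p_value_1df` are illustrative, and the p-value shortcut relies on the fact that, for 1 degree of freedom only, the upper-tail probability equals `erfc(sqrt(chi2 / 2))`. The last step multiplies every observed frequency by 10 (a hypothetical larger sample with the same percentages) to show how sensitive chi square is to sample size.

```python
import math


def chi_square(observed):
    """Chi square for a table of observed frequencies (a list of rows)."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)
    total = 0.0
    for i, row in enumerate(observed):
        for j, obs in enumerate(row):
            # Expected frequency if the two variables were unrelated.
            expected = row_totals[i] * col_totals[j] / n
            total += (obs - expected) ** 2 / expected
    return total


def p_value_1df(chi2):
    """Upper-tail probability of chi square; valid for 1 degree of freedom only."""
    return math.erfc(math.sqrt(chi2 / 2))


# Observed frequencies from Table 11.11: rows are avoid / do not avoid,
# columns are men / women.
observed = [[4, 10], [4, 2]]
chi2 = chi_square(observed)
p = p_value_1df(chi2)

# Hypothetical sample ten times larger with identical percentages.
scaled = [[10 * cell for cell in row] for row in observed]
chi2_scaled = chi_square(scaled)

print(f"chi square = {chi2:.2f}, p = {p:.3f}")
print(f"chi square with 10x the cases = {chi2_scaled:.2f}")
```

With the same percentages in every cell, chi square grows in direct proportion to the number of cases, so a relationship that is not significant in a sample of 20 can become highly significant in a sample of 200 without being any stronger.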

✪ Main Points

• Descriptive statistics are used to summarize data under study.

• A frequency distribution shows the number of cases that have each of the attributes of a given variable.

• Measures of central tendency reduce data to an easily manageable form, but they do not convey the detail of the original data.

• Measures of dispersion give a summary indication of the distribution of cases around an average value.

• Rates are descriptive statistics that standardize some measure for comparative purposes.

• Bivariate analysis and subgroup comparisons examine some type of relationship between two variables.

• The rules of thumb in interpreting bivariate percentage tables to make the subgroup comparisons are (1) if percentaged down, then read across or (2) if percentaged across, then read down.

• Multivariate analysis is a method of analyzing the simultaneous relationships among several variables and may be used to more fully understand the relationship between two variables.

• Inferential statistics are used to estimate the generalizability of findings arrived at in the analysis of a sample to the larger population from which the sample has been selected.

• Inferences about a characteristic of a population—such as the percentage that favors gun control laws—must contain an indication of a confidence interval (the range within which the value is expected to be—for example, between 45 and 55 percent favor gun control) and an indication of the confidence level (the likelihood that the value does fall within that range—for example, 95 percent confidence).

• Tests of statistical significance estimate the likelihood that an association as large as the observed one could result from normal sampling error if no such association exists between the variables in the larger population.

• Statistical significance must not be confused with substantive significance, which means that an observed association is strong, important, or meaningful.

• Tests of statistical significance, strictly speaking, make assumptions about data and methods that are almost never satisfied completely by real social research.

✪ Key Terms

average, p. 289
bivariate analysis, p. 288
contingency table, p. 300
descriptive statistics, p. 288
dispersion, p. 291
frequency distributions, p. 288
inferential statistics, p. 288
level of significance, p. 305
mean, p. 289
median, p. 289
mode, p. 289
multivariate analysis, p. 288
nonsampling errors, p. 304
null hypothesis, p. 307
range, p. 291
standard deviation, p. 291
statistical significance, p. 305
test of statistical significance, p. 305
univariate analysis, p. 288

✪ Review Questions and Exercises

1. Using the data in the accompanying table, construct and interpret tables showing:

a. The bivariate relationship between age and attitude toward capital punishment

b. The bivariate relationship between political orientation and attitude toward capital punishment

c. The multivariate relationship linking age, political orientation, and attitude toward capital punishment

Age     Political Orientation   Attitude Toward Capital Punishment   Frequency
Young   Conservative            Favor                                90
Young   Conservative            Oppose                               10
Young   Liberal                 Favor                                60
Young   Liberal                 Oppose                               40
Old     Conservative            Favor                                60
Old     Conservative            Oppose                               10
Old     Liberal                 Favor                                15
Old     Liberal                 Oppose                               15

2. Here are our answers to the question about statistical significance relating to Figure 11.3: Fifteen years show no significant change in violent victimization; significant increases are shown for 3 years; violent victimization declines significantly in 5 years. The increase from 1976 to 1977 and the decrease from 1979 to 1980 are close; notice the edge of the 95 percent confidence interval borders the no-change line. We recommend that you read the Maltz and Zawitz publication. It’s listed in the references and can be downloaded from the Bureau of Justice Statistics website at www.ojp.usdoj.gov/bjs/abstract/dvctue.htm (accessed May 18, 2008).

✪ Additional Readings

Babbie, Earl, Fred Halley, and Jeanne Zaino, Adventures in Social Research, 6th ed. (Newbury Park, CA: Pine Forge Press, 2007). This book introduces you to data analysis through SPSS, a widely used computer program for statistical analysis. Several of the basic techniques described in this chapter are illustrated and discussed further.

Finkelstein, Michael O., and Bruce Levin, Statistics for Lawyers, 2nd ed. (New York: Springer-Verlag, 2001). Law school trains people to be analytic, but few lawyers know much about statistics. This book provides a straightforward explanation of many basic statistical concepts. Examples are drawn from actual cases to illustrate how to calculate and interpret statistics. In addition, readers gain insight into how to reason with statistics.

Miller, Jane E., The Chicago Guide to Writing about Numbers (Chicago: University of Chicago Press, 2004). This is an excellent guide to thinking about and interpreting results of data analysis. The book is helpful as an introduction for beginning researchers and a reference for more experienced analysts.

Weisburd, David, and Chester Britt, Statistics in Criminal Justice, 2nd ed. (Belmont, CA: Wadsworth, 2002). This text presents an excellent basic introduction to statistics using examples from criminal justice.

Glossary

aggregate Groups of units—people, prisons, courtrooms, or stolen autos, for example. Although criminal justice professionals are usually most concerned with individual units, social science searches for patterns that are reflected in aggregations of units. For example, a probation officer focuses on probation clients as individuals, whereas a social scientist focuses on groups of probation clients, or aggregates. See Chapter 1.

anonymity The identity of a research subject is not known, and it is therefore impossible to link data about a subject to an individual’s name. Anonymity is one tool for addressing the ethical issue of privacy. Compare with confidentiality. See Chapter 2.

attributes Characteristics of persons or things. See Chapter 1. See also variables.

average An ambiguous term that generally suggests typical or normal. The mean, median, and mode are specific mathematical averages. See Chapter 11.

binomial variable A variable that has only two attributes is binomial. Gender is an example; it has the attributes male and female. See Chapter 6.

bivariate analysis The analysis of two variables simultaneously for the purpose of determining the empirical relationship between them. The construction of a simple percentage table and the computation of a simple correlation coefficient are examples of bivariate analyses. See Chapter 11.

case-oriented research A research strategy in which many cases are examined to understand a comparatively small number of variables. Examples include experiments (Chapter 5) and surveys (Chapter 7).

case study A research strategy in which the researcher’s attention centers on an in-depth examination of one or a few cases on many dimensions. Case studies can be exploratory, descriptive, or explanatory. Case studies can also be used in evaluation research. See Chapter 5.

classical experiment A research design well suited to inferring cause, the classical experiment involves three major pairs of components: (1) independent and dependent variables, (2) pretesting and posttesting, and (3) experimental and control groups, with subjects randomly assigned to one group or the other. See Chapter 5.

closed-ended questions Survey questions in which the respondent is asked to select an answer from a list provided by the researcher. Closed-ended questions are especially popular in surveys because they provide a greater uniformity of responses and are more easily analyzed than open-ended questions. See Chapter 7.

cluster sample A multistage sample in which natural groups (clusters) are sampled initially, with the members of each selected group being subsampled afterward. For example, you might select a sample of municipal police departments from a directory, get lists of the police officers at all the selected departments, then draw samples of officers from each. See Chapter 6.

cohort study A study in which some specific group is studied over time, although data may be collected from different members in each set of observations. See Chapter 3.

computer-assisted interviewing Survey research by computer, in which questionnaires are presented on computer screens instead of paper. In computer-assisted personal interviewing, an interviewer reads items from the computer screen and keys in responses. In computer-assisted self-interviewing, respondents read items (silently) on the screen of a laptop computer and key in their answers. Another variation is audio-assisted self-interviewing, whereby respondents hear questions through headphones, then key in their answers. Both types of self-interviewing are especially useful for sensitive questions such as those in self-reports. See Chapter 7.

concept The words or symbols in language that we use to represent mental images. Crime, for example, is a concept that represents our mental images of violence and other acts that are prohibited and punished by government. We use conceptual definitions to specify the meaning of concepts. Compare with conception. See Chapter 4.

conception The mental images we have that represent our thoughts about things we routinely encounter. We use the word speeding (a concept) to represent our mental image (conception) of traveling above the posted speed limit. See Chapter 4.

conceptual definition Defining concepts by using other concepts. Concepts are abstract—the words and symbols that are used to represent mental images of things and ideas. This means that a conceptual definition uses words and symbols to define concepts. In practice, conceptual definitions represent explicit statements of what a researcher means by a concept. A conceptual definition of prior record might be “recorded evidence of one or more convictions for a criminal offense.” See Chapter 4. See also operational definition.


conceptualization The mental process whereby fuzzy and imprecise notions (concepts) are made more specific and precise. So you want to study fear of crime? What do you mean by fear of crime? Are there different kinds of fear? What are they? See Chapters 3 and 4.

confidence interval The range of values within which a population parameter is estimated to lie. A survey, for instance, may show that 40 percent of a sample favor a ban on handguns. Although the best estimate of the support that exists among all people is also 40 percent, we do not expect it to be exactly that. We might, therefore, compute a confidence interval (for example, from 35 to 45 percent) within which the actual percentage of the population probably lies. Note that it is necessary to specify a confidence level in connection with every confidence interval. See Chapters 6 and 11.

confidence level The estimated probability that a population parameter lies within a given confidence interval. Thus we might be 95 percent confident that between 35 and 45 percent of all residents of California favor an absolute ban on handguns. See Chapters 6 and 11.
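The handgun example in the two entries above can be worked through numerically. The sketch below is illustrative only: it assumes a simple random sample and the normal approximation for a proportion, the sample size of 400 is a hypothetical value (the example does not state one), and `proportion_ci` is a made-up helper name.

```python
import math


def proportion_ci(p_hat, n, z=1.96):
    """Normal-approximation confidence interval for a sample proportion.

    z = 1.96 corresponds to a 95 percent confidence level.
    """
    standard_error = math.sqrt(p_hat * (1 - p_hat) / n)
    margin = z * standard_error
    return p_hat - margin, p_hat + margin


# 40 percent of a hypothetical sample of 400 favor a ban on handguns.
low, high = proportion_ci(0.40, 400)
print(f"95% confidence interval: {low:.3f} to {high:.3f}")
```

Under these assumptions the interval runs from roughly 35 to 45 percent, which matches the range used in the entries. A larger sample would narrow the interval, and a higher confidence level would widen it.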

confidentiality Researchers know the identity of a research subject but promise not to reveal any information that can be attributed to an individual subject. Anonymity is similar, but sometimes researchers need to know subjects’ names to link information from different sources. Assuring confidentiality is one way of meeting our ethical obligation to not harm subjects. See Chapter 2.

construct validity (1) The degree to which a measure relates to other variables as expected within a system of theoretical relationships. See Chapter 4. (2) How well an observed cause-and-effect relationship represents the underlying causal process a researcher is interested in. See Chapters 3, 4, and 5. See also validity threats.

content analysis The systematic study of messages and their meaning. Researchers use content analysis to study all forms of communication, including text, pictures, and video recordings. See Chapter 9.

contingency table A format for presenting the relationship among variables in the form of percentage distributions. See Chapter 11.

control group In experimentation, a group of subjects to whom no experimental stimulus is administered and who should resemble the experimental group in all other respects. The comparison of the control group and the experimental group at the end of the experiment indicates the effect of the experimental stimulus. See Chapter 5.

criterion-related validity The degree to which a measure relates to some external criterion. For example, the validity of self-report surveys of drug use can be shown by comparing survey responses to laboratory tests for drug use. See Chapter 4.

cross-sectional study A study based on observations that represent a single point in time. Compare with longitudinal study. See Chapter 3.

deductive reasoning A mode of inquiry using the logical model in which specific expectations of hypotheses are developed on the basis of general principles. Starting from the general principle that all deans are meanies, you might anticipate that your current one won’t let you change courses. That anticipation would be the result of deduction. See Chapters 1 and 2. See also inductive reasoning.

dependent variable The variable assumed to depend on or be caused by another variable (called the independent variable). If you find that sentence length is partly a function of the number of prior arrests, then sentence length is being treated as a dependent variable. See Chapters 1 and 5.

descriptive statistics Statistical computations that describe either the characteristics of a sample or the relationship among variables in a sample. Descriptive statistics summarize a set of sample observations, whereas inferential statistics move beyond the description of specific observations to make inferences about the larger population from which the sample observations were drawn. See Chapter 11.

dimension A specifiable aspect, or feature, of a concept. See Chapter 4.

dispersion The distribution of values around some central value, such as an average. The range is a simple measure of dispersion. Thus we may report that the mean age of a group is 37.9 and the range is from 12 to 89. See Chapter 11.
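To make the idea of dispersion concrete, the sketch below summarizes a small set of ages with Python's standard `statistics` module. The five ages are the ones used in this glossary's entry for median. Treating them as a complete population, and therefore using `pstdev` rather than the sample formula `stdev`, is an assumption made for illustration.

```python
import statistics

# Ages of five people, taken from the glossary's median example.
ages = [16, 17, 20, 54, 88]

mean_age = statistics.mean(ages)      # sum of values divided by number of cases
median_age = statistics.median(ages)  # middle case in rank order
age_range = max(ages) - min(ages)     # simplest measure of dispersion
std_dev = statistics.pstdev(ages)     # population standard deviation

print(f"mean = {mean_age}, median = {median_age}")
print(f"range = {min(ages)} to {max(ages)} (width {age_range})")
print(f"standard deviation = {std_dev:.2f}")
```

The mean (39) sits well above the median (20) because two of the five ages are much higher than the rest. The range and standard deviation describe that spread, which neither average conveys on its own.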

disproportionate stratified sampling Deliberately drawing a sample that overrepresents or underrepresents some characteristic of a population. We may do this to ensure that we obtain a sufficient number of uncommon cases in our sample. For example, believing violent crime to be more common in large cities, we might oversample urban residents to obtain a specific number of crime victims. See Chapter 6.

ecological fallacy Erroneously drawing conclusions about individuals based solely on the observation of groups. See Chapter 3.

empirical From experience. Social science is said to be empirical when knowledge is based on what we experience. See Chapter 1.

environmental survey Structured observations undertaken in the field and recorded on specially designed forms. Note that interview surveys record a respondent’s answers to questions, whereas environmental surveys record what an observer sees in the field. For example, a community organization may conduct periodic environmental surveys to monitor neighborhood parks—whether facilities are in good condition, how much litter is present, and what kinds of people use the park. See Chapter 8.

equal probability of selection method (EPSEM) A sample design in which each member of a population has the same chance of being selected in the sample. See Chapter 6.

evaluation research An example of applied research, evaluation involves assessing the effects of some program or policy action, usually in connection with the goals of that action. Determining whether a sex offender treatment program attained its goal of reducing recidivism by participants would be an example. Compare with problem analysis. See Chapter 10.

evidence-based policy Using data and other sources of information to formulate and evaluate justice policy. This usually means planning justice actions based on evidence of need, such as deploying police patrols to crime hot spots. It also includes assessing the results of justice policy, such as measuring any change in recidivism among a group of offenders processed through drug court. See Chapter 10.

experimental group In experimentation, a group of subjects who are exposed to an experimental stimulus. Subjects in the experimental group are normally compared with subjects in a control group to test the effects of the experimental stimulus. See Chapter 5.

external validity Whether a relationship observed in a specific population, at a specific time, in a specific place would also be observed in other populations, at other times, in other places. External validity is concerned with generalizability from a relationship observed in one setting to the same relationship in other settings. Replication enhances external validity. See Chapters 3 and 5.

face validity The quality of an indicator that makes it seem a reasonable measure of some variable. That sentence length prescribed by law is some indication of crime seriousness seems to make sense without a lot of explanation; it has face validity. See Chapter 4.

focus group Small groups (of 12 to 15) engaged in guided discussions of some topic. Participants selected are from a homogeneous population. Although focus groups cannot be used to make statistical estimates about a population, members are nevertheless selected to represent a target population. Focus groups are most useful in two situations: (1) when precise generalization to a larger population is not necessary and (2) when focus group participants and the larger population they are intended to represent are relatively homogeneous. See Chapter 7.

frequency distribution A description of the number of times the various attributes of a variable are observed in a sample. The report that 53 percent of a sample were men and 47 percent were women is a simple example of a frequency distribution. Another example is the report that 15 of the cities studied had populations of less than 10,000, 23 had populations between 10,000 and 25,000, and so forth. See Chapter 11.

generalizability That quality of a research finding that justifies the inference that it represents something more than the specific observations on which it was based. Sometimes this involves the generalization of findings from a sample to a population. Other times it is a matter of concepts: if you are able to discover why people commit burglaries, can you generalize that discovery to other crimes as well? See Chapter 5.

grounded theory A type of inductive theory that is based on (grounded in) field observation. The researcher makes observations in natural settings, then formulates a tentative theory that explains those observations. See Chapter 2.

hypothesis An expectation about the nature of things derived from a theory. It is a statement of something that will be observed in the real world if the theory is correct. See Chapter 5. See also deductive reasoning.

hypothesis testing The determination of whether the expectations that a hypothesis represents are indeed found in the real world. See Chapter 5.

idiographic A mode of causal reasoning that seeks detailed understanding of all factors that contribute to a particular phenomenon. Police detectives trying to solve a particular case use the idiographic mode of explanation. Compare with nomothetic. See Chapters 1 and 3.

impact assessment A type of applied research that seeks to answer the question: “Did a public program have the intended effect on the problem it was meant to address?” If, for example, a new burglary prevention program has the goal of reducing burglary in a particular neighborhood, an impact assessment would try to determine whether burglary was, in fact, reduced as a result of the new program. Compare with process evaluation. See Chapter 10.

incident-based measure Refers to crime measures that express characteristics of individual crime incidents. The FBI Supplementary Homicide Reports are well-known examples, reporting details on each homicide incident. Compare with summary-based measure. See Chapter 4.

independent variable An independent variable is presumed to cause or determine a dependent variable. If we discover that police cynicism is partly a function of years of experience, then experience is the independent variable and cynicism is the dependent variable. Note that any given variable might be treated as independent in one part of an analysis and dependent in another part of the analysis. Cynicism might become an independent variable in the explanation of job satisfaction. See Chapters 1 and 5.

inductive reasoning Uses the logical model in which general principles are developed from specific observations. Having noted that teenagers and crime victims are less supportive of police than older people and nonvictims are, you might conclude that people with more direct police contact are less supportive of police and explain why. That would be an example of induction. See Chapters 1 and 2. See also deductive reasoning.

inferential statistics The body of statistical computations relevant to making inferences from findings on the basis of sample observations to some larger population. See Chapter 11. See also descriptive statistics.

internal validity Whether observed associations between two (or more) variables are, in fact, causal associations or are due to the effects of some other variable. The internal validity of causal statements may be threatened by an inability to control experimental conditions. See Chapters 3 and 5. See also validity threats.

interval measure A level of measurement that describes a variable whose attributes are rank ordered and have equal distances between adjacent attributes. The Fahrenheit temperature scale is an example of this because the distance between 17 and 18 is the same as that between 89 and 90. See Chapter 4. See also nominal measure, ordinal measure, and ratio measure.

latent content As used in connection with content analysis, this term describes the underlying meaning of communications as distinguished from their manifest content. See Chapter 9.

level of significance In the context of tests of statistical significance, the degree of likelihood that an observed, empirical relationship could be attributable to sampling error. A relationship is significant at the .05 level if the likelihood of its being only a function of sampling error is no greater than 5 out of 100. See Chapter 11.

longitudinal study A study design that involves the collection of data at different points in time, as contrasted to a cross-sectional study. See Chapter 3. See also trend study, cohort study, and panel study.

manifest content In connection with content analysis, the concrete terms contained in a communication, as distinguished from latent content. See Chapter 9.

mean An average, computed by summing the values of several observations and dividing by the number of observations. If you now have a grade-point average of 4.0 based on 10 courses, and you get an F in this course, then your new grade-point (mean) average will be 3.6. See Chapter 11.

median Another average, representing the value of the middle case in a rank-ordered set of observations. If the ages of five people are 16, 17, 20, 54, and 88, then the median is 20. (The mean is 39.) See Chapter 11.

mode Still another average, representing the most frequently observed value or attribute. If a sample contains 1,000 residents of California, 275 from New Jersey, and 33 from Minnesota, then California is the modal category for residence. See Chapter 11.

multivariate analysis The analysis of the simultaneous relationships among several variables. Examining simultaneously the effects of age, gender, and city of residence on robbery victimization is an example of multivariate analysis. See Chapter 11.

nominal measure A level of measurement that describes a variable whose different attributes are only different, as distinguished from ordinal, interval, and ratio measures. Gender is an example of a nominal measure. See Chapter 4.

nomothetic A mode of causal reasoning that tries to explain a number of similar phenomena or situations. Police crime analysts trying to explain patterns of auto thefts, burglaries, or some other offense use nomothetic reasoning. Compare with idiographic. See Chapters 1 and 3.

nonprobability sample A sample selected in some fashion other than those suggested by probability theory. Examples are purposive, quota, and snowball samples. See Chapter 6.

nonsampling error Imperfections of data quality that are a result of factors other than sampling error. Examples are misunderstandings of questions by respondents, erroneous recordings by interviewers and coders, and data entry errors. See Chapter 11.

null hypothesis In connection with hypothesis testing and tests of statistical significance, the hypothesis that suggests there is no relationship between the variables under study. You may conclude that the two variables are related after having statistically rejected the null hypothesis. See Chapter 11.

open-ended questions Questions for which the respondent is asked to provide his or her own answers. See Chapter 7.

operational definition Specifying what operations should be performed to measure a concept. The operational definition of prior record might be “Consult the county (or state or FBI) criminal history records information system. Count the number of times a person has been convicted of committing a crime.” See Chapter 4.

operationalization One step beyond conceptualization. Operationalization is the process of developing operational definitions by describing how actual measurements will be made. See Chapters 3 and 4.

ordinal measure A level of measurement that describes a variable whose attributes may be rank ordered along some dimension. An example is socioeconomic status as composed of the attributes high, medium, and low. See Chapter 4. See also nominal measure, interval measure, and ratio measure.

panel study A type of longitudinal study in which data are collected from the same subjects (the panel) at several points in time. See Chapter 3.

population All people, things, or other elements we wish to represent. Researchers often study only a subset or sample of a population, then generalize from the people, things, or other elements actually observed to the larger population of all people, things, or elements. See Chapter 6.

population parameter The summary description of a particular variable in the population. For example, if the mean age of all professors at your college is 43.7, then 43.7 is the population parameter for professors’ mean age. Compare with sample statistic and sampling distribution. See Chapter 6.

probabilistic A type of causal reasoning that certain factors make outcomes more or less likely to happen. Having been arrested as a juvenile makes it more likely that someone will be arrested as an adult. See Chapter 3.

probability sample The general term for a sample selected in accord with probability theory, typically involving some random selection mechanism. Specific types of probability samples include area probability sample, equal probability of selection method (EPSEM), simple random sample, and systematic sample. See Chapter 6.

problem analysis Using social science research methods to assess the scope and nature of a problem, to plan and select actions to address the problem. For example, examining patterns of auto theft to decide what preventive and enforcement strategies should be pursued is an example of problem analysis. Compare with evaluation research. See Chapter 10.

problem solving An example of applied research that combines elements of evaluation and policy analysis. The most widely known approach to problem solving in policing is the SARA model, which stands for scanning, analysis, response, and assessment. See Chapter 10.

process evaluation A type of applied research that seeks to determine whether a public program was implemented as intended. For example, a burglary prevention program might seek to reduce burglaries by having crime prevention officers meet with all residents of some target neighborhood. A process evaluation would determine whether meetings with neighborhood residents were taking place as planned. Compare with impact assessment. See Chapter 10.

prospective approach A type of longitudinal study that follows subjects forward in time. “How many people who were sexually abused as children are convicted of a sexual offense as an adult?” is an example of a prospective question. Compare with retrospective research. See Chapter 3.

purposive sample A type of nonprobability sample in which you select the units to be observed on the basis of your own judgment about which ones will be best suited to your research purpose. For example, if you were interested in studying community crime prevention groups affiliated with public schools and groups affiliated with religious organizations, you would probably want to select a purposive sample of school- and church-affiliated groups. Most television networks use purposive samples of voting precincts to project winners on election night; precincts that always vote for winners are sampled. See Chapter 6.

quasi-experiment A research design that includes most, but not all, elements of an experimental design. Quasi means “sort of,” and a quasi-experiment is sort of an experiment. Two general classes of quasi-experiments are nonequivalent groups and time-series designs. Compare with classical experiment. See Chapter 5.

questionnaire A document that contains questions and other types of items designed to solicit information appropriate to analysis. Questionnaires are used primarily in survey research and also in field research. See Chapter 7.

quota sample A type of nonprobability sample in which units are selected in the sample on the basis of prespecified characteristics, so that the total sample will have the same distribution of characteristics as are assumed to exist in the population being studied. See Chapter 6.


number of samples from a single population. With random sampling, we expect that the sampling distribution for a particular statistic (mean age, for example) will cluster around the population parameter for mean age. Furthermore, sampling distributions for larger sample sizes will cluster more tightly around the population parameter. See Chapter 6.
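
The clustering of a sampling distribution around the population parameter can be seen in a small simulation. The sketch below is illustrative only: the population of professor ages is made up, and the function name `sampling_distribution` and the sample sizes of 25 and 400 are assumptions chosen for the example, not values from the text.

```python
import random
import statistics


def sampling_distribution(population, sample_size, n_samples, seed=0):
    """Return the means of n_samples random samples of the given size."""
    rng = random.Random(seed)
    return [
        statistics.mean(rng.sample(population, sample_size))
        for _ in range(n_samples)
    ]


# Hypothetical population: ages of 5,000 professors (made-up values).
rng = random.Random(42)
population = [rng.randint(28, 70) for _ in range(5000)]
parameter = statistics.mean(population)  # the population parameter

# Two sampling distributions of the mean, one per sample size.
small = sampling_distribution(population, sample_size=25, n_samples=1000)
large = sampling_distribution(population, sample_size=400, n_samples=1000)

# The spread of the sample means shows how tightly they cluster.
spread_small = statistics.stdev(small)
spread_large = statistics.stdev(large)

print(f"population mean (parameter):   {parameter:.2f}")
print(f"spread of sample means, n=25:  {spread_small:.2f}")
print(f"spread of sample means, n=400: {spread_large:.2f}")
```

Both sets of sample means should center near the population mean, and the means from the larger samples should show the smaller spread.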

sampling frame That list or quasi-list of units composing a population from which a sample is selected. If the sample is to be representative of the population, it is essential that the sampling frame include all (or nearly all) members of the population. See Chapter 6.

sampling units Like sampling elements, these are things that may be selected in the process of sampling; often sampling units are people. In some types of sampling, however, we often begin by selecting large groupings of the eventual elements we will analyze. Sampling units is a generic term for things that are selected in some stage of sampling but are not necessarily the objects of our ultimate interest. See Chapter 6.

scientific realism An approach to evaluation that studies what’s called “local causality.” Interest focuses more on how interventions and measures of effect are related in a specific situation. This is different from a more traditional social science interest in finding causal relationships that apply generally to a variety of situations. As explained by Ray Pawson and Nick Tilley (1997), scientific realism is especially useful for evaluating justice programs because it centers on analyzing interventions in local contexts. See Chapters 3 and 10.

secondary analysis A form of research in which the data collected and processed by one researcher are reanalyzed, often for a different purpose, by another. This is especially appropriate in the case of survey data. Data archives are repositories, or libraries, for the storage and distribution of data for secondary analysis. See Chapter 9.

self-report survey Self-report surveys ask people to tell about crimes they have committed. This method is best for measuring drug use and other so-called victimless crimes. Confidentiality is especially important in self-report surveys. See Chapters 4 and 7.

simple random sample A type of probability sample in which the units composing a population are assigned numbers, a set of random numbers is then generated, and the units that have those numbers are included in the sample. Although probability theory and the calculations it provides assume this basic sampling method, it is seldom used for practical reasons. An alternative is the systematic sample (with a random start). See Chapter 6.
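
The three steps in this definition (number the units, generate random numbers, keep the units with those numbers) can be sketched in a few lines. The function name `simple_random_sample` and the roster of 200 case files are hypothetical, introduced only for illustration.

```python
import random


def simple_random_sample(units, n, seed=None):
    """Select n units by numbering them and drawing random numbers."""
    rng = random.Random(seed)
    # Step 1: assign a number (1..N) to every unit in the population list.
    numbered = dict(enumerate(units, start=1))
    # Step 2: generate n distinct random numbers in the range 1..N.
    drawn = rng.sample(range(1, len(units) + 1), n)
    # Step 3: include the units that carry those numbers.
    return [numbered[number] for number in sorted(drawn)]


# Hypothetical sampling frame: 200 case files (made-up identifiers).
roster = [f"case-{i:03d}" for i in range(1, 201)]
selected = simple_random_sample(roster, n=20, seed=1)
print(selected)
```

Drawing without replacement, as `random.sample` does, means no unit can be selected twice; that is an assumption of this sketch rather than something the definition states.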

randomization A technique for randomly assigning experimental subjects to experimental groups and control groups. See Chapter 5.

range A measure of dispersion, the distance that separates the highest and lowest values of a variable in some set of observations. In your class, for example, the range of ages might be from 17 to 37. See Chapter 11.

ratio measure A level of measurement that describes a variable whose attributes have all the qualities of nominal, ordinal, and interval measures and in addition are based on a true zero point. Length of prison sentence is an example of a ratio measure. See Chapter 4.

reliability That quality-of-measurement standard whereby the same data would have been collected each time in repeated observations of the same phenomenon. We would expect that the question, “Did you see a police officer in your neighborhood today?” would have higher reliability than the question, “About how many times in the past six months have you seen a police officer in your neighborhood?” This is not to be confused with validity. See Chapter 4.

replication Repeating a research study to test the findings of an earlier study, often under slightly different conditions or for a different group of subjects. Replication results either support earlier findings or cause us to question the accuracy of an earlier study. See Chapter 1.

retrospective research A type of longitudinal study that looks backward, asking subjects to recall events that happened earlier in their lives or tracing official records of someone’s previous actions. “How many current sex offenders were sexually abused as children?” is a retrospective question. Compare with prospective approach. See Chapter 3.

sample A subset of a population selected according to one or more criteria. Two general types are probability and nonprobability samples. See Chapter 6.

sample element The unit about which information is collected and that provides the basis of analysis. Typically, in survey research, elements are people. Other kinds of units can be the elements for criminal justice research: correctional facilities, gangs, police beats, or court cases. See Chapter 6.

sample statistic The summary description of a particular variable in a sample. For example, if the mean age of a sample of 100 professors on your campus is 41.1, then 41.1 is the sample statistic for professor age. We usually use sample statistics to estimate population parameters. Compare with sampling distribution. See Chapter 6.

sampling distribution The range, or array, of sample statistics we would obtain if we drew a very large


Within certain constraints, systematic sampling is a functional equivalent of simple random sampling and is usually easier to do. Typically the first unit is selected at random. See Chapter 6.

test of statistical significance A class of statistical computations that indicate the likelihood that the relationship observed between variables in a sample can be attributed to sampling error only. See Chapter 11. See also inferential statistics.

theory A theory is a systematic explanation for the observed facts and laws that relate to a particular aspect of life. For example, routine activities theory (see Cohen and Felson 1979) explains crime as the result of three key elements coming together: a suitable victim, a motivated offender, and the absence of capable guardians. See Chapter 1.

trend study A type of longitudinal study in which a given characteristic of some population is monitored over time. An example is the series of annual Uniform Crime Report totals for a jurisdiction. See Chapter 3.

typology Classifying observations in terms of their attributes. Sometimes referred to as taxonomies, typologies are typically created with nominal variables. For example, a typology of thieves might group them according to the types of cars they steal and the types of locations they search to find targets. See Chapter 4.

units of analysis The what or who being studied. Units of analysis may be individual people, groupings of people (a juvenile gang), formal organizations (a probation department), or social artifacts (crime reports). See Chapter 3.

univariate analysis The analysis of a single variable for purposes of description. Frequency distributions, averages, and measures of dispersion are examples of univariate analysis, as distinguished from bivariate and multivariate analyses. See Chapter 11.

validity (1) Whether statements about cause and effect are true (valid) or false (invalid). See Chapters 3 and 5. See also validity threats. (2) A descriptive term used for a measure that accurately reflects what it is intended to measure. For example, police records of auto theft are more valid measures than police records of shoplifting. It is important to realize that the ultimate validity of a measure can never be proved. Yet we may agree as to its relative validity on the basis of face validity, criterion-related validity, and construct validity. This must not be confused with reliability. See Chapter 4.

validity threats Possible sources of invalidity, or making false statements about cause and effect. Four categories of validity threats are linked to fundamental requirements for demonstrating cause: statistical

snowball sampling A method for drawing a nonprobability sample. Snowball samples are often used in field research. Each person interviewed is asked to suggest additional people for interviewing. See Chapters 6 and 8.

stakeholders Individuals with some interest, or stake, in a specific program. Any particular program may have multiple stakeholders with different interests and goals. See Chapter 10.

standard deviation A measure of dispersion about the mean. Conceptually, the standard deviation represents an average deviation of all values relative to the mean. See Chapter 11.
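
As a worked illustration of dispersion about the mean, the sketch below computes a standard deviation step by step and, for comparison, the range of the same values. The seven ages are invented, and dividing by N (the population form) is an assumption; some texts divide by N - 1 when working with a sample.

```python
import math
import statistics

# Hypothetical ages of seven students (made-up values).
ages = [17, 19, 20, 21, 22, 25, 37]

# Step 1: the mean of the observations.
mean_age = sum(ages) / len(ages)

# Step 2: each value's squared deviation from the mean.
squared_deviations = [(age - mean_age) ** 2 for age in ages]

# Step 3: average the squared deviations (dividing by N) and take the root.
sd = math.sqrt(sum(squared_deviations) / len(ages))

# The range is the distance between the highest and lowest values.
value_range = max(ages) - min(ages)

print(f"mean = {mean_age}, standard deviation = {sd:.2f}, range = {value_range}")
# The standard library's population formula gives the same result.
print(f"statistics.pstdev = {statistics.pstdev(ages):.2f}")
```

The single age of 37 lies far from the others, which is why the range is large relative to the standard deviation here.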

standard error A measure of sampling error, the standard error gives us a statistical estimate of how much a sample statistic might differ from the corresponding population parameter, solely by chance. Larger samples usually result in smaller standard errors. See Chapters 6 and 11.
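
The effect of sample size on the standard error can be shown with one common formula, the standard error of a sample proportion, sqrt(p(1 - p) / n). Using this formula is an assumption made for illustration, as are the function name and the values of p and n; the glossary entry itself does not give a formula.

```python
import math


def standard_error_of_proportion(p, n):
    """Standard error of a sample proportion p based on n cases."""
    if not 0 <= p <= 1:
        raise ValueError("p must be between 0 and 1")
    if n <= 0:
        raise ValueError("n must be positive")
    return math.sqrt(p * (1 - p) / n)


# Hypothetical survey result: half of respondents answer "yes".
se_100 = standard_error_of_proportion(0.5, 100)
se_400 = standard_error_of_proportion(0.5, 400)

print(f"n=100: standard error = {se_100:.3f}")
print(f"n=400: standard error = {se_400:.3f}")
```

Under this formula, quadrupling the sample size cuts the standard error in half, so gains in precision become more expensive as samples grow.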

statistical conclusion validity Whether we can find covariation between two variables. This is the first of three requirements for causal inference (see Chapter 3 for the other two). If two variables do not vary together (covariation), there cannot be a causal relationship between them. See Chapters 3 and 5 for more on statistical conclusion validity. Chapter 11 describes the role of sample size in finding statistical significance, which is conceptually related to statistical conclusion validity.

statistical significance A general term for the unlikeliness that relationships observed in a sample could be attributed to sampling error alone. See Chapter 11. See also test of statistical significance.

stratification The grouping of the units composing a population into homogeneous groups (or strata) before sampling. This procedure, which may be used in conjunction with simple random, systematic, or cluster sampling, improves the representativeness of a sample, at least in terms of the stratification variables. See Chapter 6.
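
A minimal sketch of stratification followed by random selection within each stratum is shown below. The officer roster, the use of rank as the stratification variable, and the choice to sample the same fraction from every stratum (proportionate allocation) are all assumptions made for illustration.

```python
import random
from collections import defaultdict


def stratified_sample(units, stratum_of, fraction, seed=None):
    """Group units into strata, then randomly sample a fraction of each."""
    rng = random.Random(seed)
    strata = defaultdict(list)
    for unit in units:
        strata[stratum_of(unit)].append(unit)  # group before sampling
    chosen = []
    for name in sorted(strata):
        members = strata[name]
        n = round(len(members) * fraction)     # same fraction per stratum
        chosen.extend(rng.sample(members, n))
    return chosen


# Hypothetical population of 300 officers, stratified by rank.
officers = (
    [("patrol", i) for i in range(200)]
    + [("sergeant", i) for i in range(80)]
    + [("lieutenant", i) for i in range(20)]
)
picked = stratified_sample(officers, stratum_of=lambda o: o[0], fraction=0.1, seed=3)

counts = defaultdict(int)
for rank, _ in picked:
    counts[rank] += 1
print(dict(counts))
```

Because each rank contributes in proportion to its size, the sample mirrors the population on the stratification variable, which is the sense in which representativeness improves.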

summary-based measure Crime measures that report only total crimes for a jurisdiction or other small area are summary-based measures of crime. The FBI Uniform Crime Reports is one well-known summary measure. Compare with incident-based measure. See Chapter 4.

systematic sampling A method of probability sampling in which every kth unit in a list is selected for inclusion in the sample; for example, every 25th student in the college directory of students. We compute k (also called the sampling interval) by dividing the size of the population by the desired sample size.
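
The computation of the sampling interval k can be sketched as follows. The directory of 2,500 students and the target of 100 are hypothetical numbers chosen so that k works out to 25, matching the example above; the function name and the use of integer division are assumptions of this sketch.

```python
import random


def systematic_sample(units, sample_size, seed=None):
    """Select every kth unit from a list, beginning at a random start."""
    k = len(units) // sample_size  # sampling interval: population / sample
    if k < 1:
        raise ValueError("sample_size cannot exceed the number of units")
    rng = random.Random(seed)
    start = rng.randrange(k)       # random start within the first interval
    return units[start::k][:sample_size]


# Hypothetical college directory of 2,500 students.
directory = [f"student-{i:04d}" for i in range(1, 2501)]
chosen = systematic_sample(directory, sample_size=100, seed=7)

print(len(chosen), chosen[:3])
```

Choosing the starting position at random within the first interval gives every unit in the list the same chance of selection.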


variables Logical groupings of attributes. The variable gender is made up of the attributes male and female. See Chapter 1.

victim survey A sample survey that asks people about their experiences as victims of crime. Victim surveys are one way to measure crime, and they are especially valuable for getting information about crimes not reported to police. The National Crime Victimization Survey is an example. See Chapters 4 and 7.

conclusion validity, internal validity, construct validity, and external validity (see separate entries in this glossary). In general, statistical conclusion validity and internal validity are concerned with bias; construct validity and external validity are concerned with generalization. See Chapters 3 and 5.

variable-oriented research A research strategy whereby a large number of variables are studied for one or a small number of cases or subjects. Time-series designs and case studies are examples. See Chapter 5.


periment in Home Detention. Final Report to the National Institute of Justice. Indianapolis: Indiana University, School of Public and Environmental Affairs.

Baumer, Terry L., and Dennis Rosenbaum. 1982. Combating Retail Theft: Programs and Strategies. Boston: Butterworth.

Bennett, Trevor, and Katy Holloway. 2005. “Association Between Multiple Drug Use and Crime.” International Journal of Offender Therapy 49: 63–81.

Bennis, Jason, Wesley G. Skogan, and Lynn Steiner. 2003. “The 2002 Beat Meeting Observation Study.” Community Policing Working Paper 26. Evanston, IL: Center for Policy Research, Northwestern University. www.northwestern.edu/ipr/publications/policing.html (accessed April 17, 2008).

Berk, Richard A., Heather Ladd, Heidi Graziano, and Jong-Ho Baek. 2003. “A Randomized Experiment Testing Inmate Classification Systems.” Criminology and Public Policy 2(2, March): 215–242.

Best, Joel. 2001. Damned Lies and Statistics: Untangling Numbers from the Media, Politicians, and Activists. Berkeley, CA: University of California Press.

Bichler, Gisela, and Ronald V. Clarke. 1996. “Eliminating Pay Phone Toll Fraud at the Port Authority Bus Terminal in Manhattan.” In Preventing Mass Transit Crime, ed. Ronald V. Clarke. Crime Prevention Studies, vol. 6. Monsey, NY: Criminal Justice Press.

Block, Richard L., and Carolyn Rebecca Block. 1995. “Space, Place, and Crime: Hot Spot Areas and Hot Spot Places of Liquor-Related Crime.” In Crime and Place, ed. John E. Eck and David Weisburd. Crime Prevention Studies, vol. 4. Monsey, NY: Criminal Justice Press.

Blumberg, Stephen J., Julian V. Luke, and Marcie L. Cynamon. 2006. “Telephone Coverage and Health Survey Estimates: Evaluating the Need for Concern About Wireless Substitution.” American Journal of Public Health 96(6, June): 926–931.

Boba, Rachel. 2005. Crime Analysis and Crime Mapping. Thousand Oaks, CA: Sage Publications.

Braga, Anthony A. 2002. Problem-Oriented Policing and Crime Prevention. Monsey, NY: Criminal Justice Press.

Braga, Anthony A., David M. Kennedy, Elin J. Waring, and Anne Morrison Piehl. 2001. “Problem-Oriented Policing and Youth Violence: An Evaluation of Boston’s Operation Ceasefire.” Journal of Research in Crime and Delinquency 38(3, August): 195–225.

Brantingham, Patricia L., and Paul J. Brantingham. 1991. “Notes on the Geometry of Crime.” In Environmental Criminology, 2d ed., ed. Paul J. Brantingham and Patricia L. Brantingham. Prospect Heights, IL: Waveland.

Bratton, William J. 1999. “Great Expectations: How Higher Expectations for Police Departments Can Lead to a Decrease in Crime.” In Measuring What Matters, Proceedings from the Policing Research Institute Meetings, ed. Robert

Academy of Criminal Justice Sciences. 2000. “Code of Ethics.” Greenbelt, MD: Academy of Criminal Justice Sciences. www.acjs.org/pubs/167_671_2922.cfm (accessed May 23, 2008).

Als-Nielsen, Bodil, Wendong Chen, Christian Gluud, and Lise L. Kjaergard. 2003. “Association of Funding and Conclusions in Randomized Drug Trials: A Reflection of Treatment Effect or Adverse Events?” Journal of the American Medical Association 290(7, August): 921–927.

American Association of University Professors. 2001. Protecting Human Beings: Institutional Review Boards and Social Science Research. AAUP Redbook. Washington, DC: American Association of University Professors. http://findarticles.com/p/articles/mi_qa3860/is_/ai_n8939770 (accessed May 23, 2008).

American Psychological Association. 2002. Ethical Principles of Psychologists and Code of Conduct. Washington, DC: American Psychological Association. www.apa.org/ethics (accessed May 23, 2008).

American Sociological Association. 1997. “Code of Ethics.” Washington, DC: American Sociological Association. www.asanet.org/cs/root/leftnav/ethics/ethics (accessed May 23, 2008).

Anderson, Craig A., and Brad Bushman. 2002. “The Effects of Media Violence on Society.” Science 295(March 29): 2377–2379.

Andresen, W. Carsten. 2005. State Police: Discretion and Traffic Enforcement. Unpublished Ph.D. dissertation. Newark, NJ: School of Criminal Justice, Rutgers University.

Babbie, Earl. 1990. Survey Research Methods. 2nd ed. Belmont, CA: Wadsworth.

Babbie, Earl. 1994. The Sociological Spirit. Belmont, CA: Wadsworth.

Babbie, Earl, Fred Halley, and Jeanne Zaino. 2007. Adventures in Social Research. 6th ed. Newbury Park, CA: Pine Forge Press.

Baron, Stephen W., and Timothy F. Hartnagel. 1998. “Street Youth and Criminal Violence.” Journal of Research in Crime and Delinquency 35(2): 166–192.

Baum, Katrina. 2006. Identity Theft, 2004. BJS Bulletin. Washington, DC: U.S. Department of Justice, Office of Justice Programs, Bureau of Justice Statistics (April).

Baumer, Eric, Janet L. Lauritsen, Richard Rosenfeld, and Richard Wright. 1998. “The Influence of Crack Cocaine on Robbery, Burglary, and Homicide Rates: A Cross-City Longitudinal Analysis.” Journal of Research in Crime and Delinquency 35: 316–340.

Baumer, Terry L., Michael G. Maxfield, and Robert I. Mendelsohn. 1993. “A Comparative Analysis of Three Electronically Monitored Home Detention Programs.” Justice Quarterly 10: 121–142.

Baumer, Terry L., and Robert I. Mendelsohn. 1990. The Electronic Monitoring on Non-Violent Convicted Felons: An Ex-

References


Langworthy. Washington, DC: U.S. Department of Justice, Office of Justice Programs, National Institute of Justice.

Brown, Rick, and Ronald V. Clarke. 2004. “Police Intelligence and Theft of Vehicles for Export: Recent U.K. Experience.” In Understanding and Preventing Car Theft, ed. Michael G. Maxfield and Ronald V. Clarke. Crime Prevention Studies, vol. 17. Monsey, NY: Criminal Justice Press.

Bureau of Justice Assistance. 1993. A Police Guide to Surveying Citizens and Their Environment. Washington, DC: U.S. Department of Justice, Office of Justice Programs, Bureau of Justice Assistance. NCJ-143711.

Bureau of Justice Statistics. 1993. Performance Measures for the Criminal Justice System. Discussion Papers from the BJS-Princeton Project. Washington, DC: U.S. Department of Justice, Office of Justice Programs, Bureau of Justice Statistics.

Bureau of Justice Statistics. 2002. “Data Quality Guidelines.” Washington, DC: U.S. Department of Justice, Office of Justice Programs, Bureau of Justice Statistics (October).

Bureau of Justice Statistics. 2006. Criminal Victimization in the United States, 2004 Statistical Tables. Washington, DC: U.S. Department of Justice, Office of Justice Programs, Bureau of Justice Statistics. www.ojp.usdoj.gov/bjs/abstract/cvusst.htm (accessed April 17, 2008).

Campbell, Donald T. 2003. “Introduction.” In Case Study Research: Design and Methods, 3d ed., ed. Robert K. Yin. Thousand Oaks, CA: Sage Publications.

Campbell, Donald T., and Julian Stanley. 1966. Experimental and Quasi-Experimental Designs for Research. Chicago: Rand McNally.

Cantor, David, and James P. Lynch. 2005. “Exploring the Effects of Changes in Design on the Analytical Uses of the NCVS Data.” Journal of Quantitative Criminology 21: 293–319.

Casady, Tom. 1999. “Privacy Issues in the Presentation of Geocoded Data.” Crime Mapping News 1(3, Summer). Washington, DC: Police Foundation.

Chaiken, Jan M., and Marcia R. Chaiken. 1982. Varieties of Criminal Behavior. Santa Monica, CA: Rand.

Chaiken, Jan M., and Marcia R. Chaiken. 1990. “Drugs and Predatory Crime.” In Crime and Justice: A Review of Research: Vol. 13, Drugs and Crime, ed. Michael Tonry and James Q. Wilson. Chicago: University of Chicago Press.

Chainey, Spencer, and Jerry Ratcliffe. 2005. GIS and Crime Mapping. New York: Wiley.

Chamard, Sharon. 2006. Partnering with Businesses to Address Public Safety Problems. Problem-Solving Tools Series, no. 5. Washington, DC: U.S. Department of Justice, Office of Community Oriented Policing Services. www.popcenter.org/tools/partnering/ (accessed May 23, 2008).

Chermak, Steven M., and Alexander Weiss. 1997. “The Effects of the Media on Federal Criminal Justice Policy.” Criminal Justice Policy Review 8: 323–341.

Chicago Community Policing Evaluation Consortium. 2003. Community Policing in Chicago, Years Eight and Nine. Chicago: Illinois Criminal Justice Information Authority.

Chicago Community Policing Evaluation Consortium. 2004. Community Policing in Chicago, Year Ten. Chicago: Illinois Criminal Justice Information Authority.

Clarke, Ronald, Paul Ekblom, Mike Hough, and Pat Mayhew. 1985. “Elderly Victims of Crime and Exposure to Risk.” Howard Journal 24(1): 1–9.

Clarke, Ronald V. 1996. “The Distribution of Deviance and Exceeding the Speed Limit.” British Journal of Criminology 36: 169–181.

Clarke, Ronald V. 1997a. “Deterring Obscene Phone Callers: The New Jersey Experience.” In Situational Crime Prevention: Successful Case Studies, 2d ed., ed. Ronald V. Clarke. New York: Harrow and Heston.

Clarke, Ronald V. 1997b. “Introduction.” In Situational Crime Prevention: Successful Case Studies, 2d ed., ed. Ronald V. Clarke. New York: Harrow and Heston.

Clarke, Ronald V., and John Eck. 2005. Crime Analysis for Problem Solvers in 60 Small Steps. Washington, DC: U.S. Department of Justice, Office of Community Oriented Policing. www.popcenter.org/learning/60Steps/ (accessed May 23, 2008).

Clarke, Ronald V., and Patricia M. Harris. 1992. “Auto Theft and Its Prevention.” In Crime and Justice: An Annual Review of Research, vol. 16, ed. Michael Tonry. Chicago: University of Chicago Press.

Clarke, Ronald V., and Phyllis A. Schultze. 2005. Researching a Problem. Response Guides. Washington, DC: U.S. Department of Justice, Office of Community Oriented Policing Services. www.cops.usdoj.gov/files/ric/Publications/e02052729.pdf (accessed May 23, 2008).

Cohen, Jacqueline, and Jens Ludwig. 2003. “Policing Crime Guns.” In Evaluating Gun Policy: Effects of Crime and Violence, ed. Jens Ludwig and Philip J. Cook. Washington, DC: Brookings Institution Press.

Cohen, Lawrence E., and Marcus Felson. 1979. “Social Change and Crime Rate Trends: A Routine Activity Approach.” American Sociological Review 44: 588–608.

Coleman, Veronica, et al. 1999. “Using Knowledge and Teamwork to Reduce Crime.” National Institute of Justice Journal (October).

Committee on Science, Engineering, and Public Policy. 1995. On Being a Scientist: Responsible Conduct in Research. Washington, DC: National Academy Press.

Cullen, Francis T., and Paul Gendreau. 2000. “Assessing Correctional Rehabilitation: Policy, Practice, and Prospects.” In Policies, Processes, and Decisions of the Criminal Justice System, ed. Julie Horney. Criminal Justice 2000, vol. 3. Washington, DC: U.S. Department of Justice, Office of Justice Programs, National Institute of Justice.

Cullen, Francis T., and Jody L. Sundt. 2003. “Reaffirming Evidence-Based Corrections.” Criminology and Public Policy 2(2, March): 353–358.

D’Alessio, Stewart J., Lisa Stolzenberg, and W. Clinton Terry. 1999. “‘Eyes on the Street’: The Impact of Tennessee’s Emergency Cellular Telephone Program on Alcohol Related Crashes.” Crime and Delinquency 45(4): 453–466.


Elliott, Delbert S., David Huizinga, and Suzanne S. Ageton. 1985. Explaining Delinquency and Drug Use. Thousand Oaks, CA: Sage Publications.

Engel, Robin Shepard, Jennifer M. Calnon, Lin Liu, and Richard Johnson. 2004. Project on Police-Citizen Contacts: Year 1 Final Report. Cincinnati, OH: Criminal Justice Research Center, University of Cincinnati.

Fabelo, Tony. 1997. “The Critical Role of Policy Research in Developing Effective Correctional Policies.” Corrections Management Quarterly 1(1): 25–31.

Fagan, Jeffrey, Frank E. Zimring, and June Kim. 1998. “Declining Homicide in New York: A Tale of Two Trends.” Journal of Criminal Law and Criminology 88: 1277–1324.

Faggiani, Donald, and Colleen McLaughlin. 1999. “Using National Incident-Based Reporting System Data for Strategic Crime Analysis.” Journal of Quantitative Criminology 15(2, June).

Farrell, Graham, Alan Edmunds, Louise Hobbs, and Gloria Laycock. 2000. RV Snapshot: UK Policing and Repeat Victimisation. Crime Reduction Research Series. Policing and Reducing Crime Unit, Paper 5. London: Home Office.

Farrington, David P., et al. 2003. “Comparing Delinquency Careers in Court Records and Self-Reports.” Criminology 41: 933–958.

Farrington, David P., Patrick A. Langan, Michael Tonry, and Darrick Jolliffe. 2004. “Introduction.” In Cross-National Studies in Crime and Justice, ed. David P. Farrington, Patrick A. Langan, and Michael Tonry. Washington, DC: U.S. Department of Justice, Office of Justice Programs, Bureau of Justice Statistics.

Farrington, David P., Lloyd E. Ohlin, and James Q. Wilson. 1986. Understanding and Controlling Crime: Toward a New Research Strategy. New York: Springer-Verlag.

Federal Bureau of Investigation. 2000. National Incident-Based Reporting System: Vol. 1, Data Collection Guidelines. Washington, DC: U.S. Department of Justice, Federal Bureau of Investigation (August).

Federal Bureau of Investigation. 2005. Crime in the United States 2004. Washington, DC: U.S. Department of Justice, Federal Bureau of Investigation. www.fbi.gov/ucr/cius_04/ (accessed May 23, 2008).

Federal Bureau of Investigation. 2007. Crime in the United States 2006. Washington, DC: U.S. Department of Justice, Federal Bureau of Investigation. www.fbi.gov/ucr/cius2006/ (accessed May 23, 2008).

Felson, Marcus. 2002. Crime and Everyday Life. 3d ed. Thousand Oaks, CA: Sage Publications.

Felson, Marcus, and Ronald V. Clarke. 1998. Opportunity Makes the Thief: Practical Theory for Crime Prevention. Police Research Series, Paper 98. London: Policing and Reducing Crime Unit, Home Office Research, Development and Statistics Directorate.

Felson, Marcus, et al. 1996. “Redesigning Hell: Preventing Crime and Disorder at the Port Authority Bus Terminal.” In Preventing Mass Transit Crime, ed. Ronald V. Clarke. Crime Prevention Studies, vol. 6. Monsey, NY: Criminal Justice Press.

Decker, Scott H. 2005. Using Offender Interviews to Inform Police Problem Solving. Problem Solving Tools Series. Washington, DC: U.S. Department of Justice, Office of Community Oriented Policing Services.

Decker, Scott H., and Barrik Van Winkle. 1996. Life in the Gang: Family, Friends, and Violence. New York: Cambridge University Press.

Dennis, Michael L. 1990. “Assessing the Validity of Randomized Field Experiments: An Example from Drug Abuse Treatment Research.” Evaluation Review 14: 347–373.

Devine, Joel A., and James D. Wright. 1993. The Greatest of Evils: Urban Poverty and the American Underclass. Hawthorne, NY: Aldine de Gruyter.

Dillman, Don A. 2006. Mail and Internet Surveys: The Tailored Design Method 2007 Update. 2d ed. New York: Wiley.

Dinkes, Rachel, Emily Forrest Cataldi, Wendy Lin-Kelly, and Thomas D. Snyder. 2007. Indicators of School Crime and Safety: 2007. Washington, DC: National Center for Education Statistics, Institute of Education Sciences, U.S. Department of Education, and Bureau of Justice Statistics, Office of Justice Programs, U.S. Department of Justice (December).

Ditton, Jason, and Stephen Farrall. 2007. “The British Crime Survey and the Fear of Crime.” In Surveying Crime in the 21st Century, ed. Mike Hough and Mike Maxfield. Crime Prevention Studies, vol. 22. Monsey, NY: Criminal Justice Press.

Duhart, Detis T. 2001. Violence in the Workplace, 1993–99. BJS Special Report. Washington, DC: U.S. Department of Justice, Office of Justice Programs, Bureau of Justice Statistics (December).

Durose, Matthew R., Erica L. Schmitt, and Patrick A. Langan. 2005. Contacts Between Police and the Public: Findings from the 2002 National Survey. Washington, DC: U.S. Department of Justice, Office of Justice Programs, Bureau of Justice Statistics (April).

Eck, John E. 2002. “Learning from Experience in Problem-Oriented Policing and Situational Prevention: The Positive Functions of Weak Evaluations and the Negative Functions of Strong Ones.” In Evaluation for Crime Prevention, ed. Nick Tilley. Crime Prevention Studies, vol. 14. Monsey, NY: Criminal Justice Press.

Eck, John E. 2003a. Assessing Responses to Problems: An Introductory Guide for Police Problem-Solvers. Problem-Oriented Guides for Police. Washington, DC: U.S. Department of Justice, Office of Community Oriented Policing Services.

Eck, John E. 2003b. “Police Problems: The Complexity of Problem Theory, Research and Evaluation.” In Problem-Oriented Policing: From Innovation to Mainstream, ed. Johannes Knutsson. Crime Prevention Studies, vol. 15. Monsey, NY: Criminal Justice Press.

Eck, John, et al. 2005. Mapping Crime: Understanding Hot Spots. NIJ Special Report. Washington, DC: U.S. Department of Justice, Office of Justice Programs, National Institute of Justice.

Eisenberg, Michael. 1999. Three Year Recidivism Tracking of Offenders Participating in Substance Abuse Treatment Programs. Austin, TX: Criminal Justice Policy Council.


Gottfredson, Denise C., Stacy S. Najaka, Brook W. Kearly, and Carlos M. Rocha. 2006. “Long-Term Effects of Participation in the Baltimore City Drug Treatment Court: Results from an Experimental Study.” Journal of Experimental Criminology 2: 67–98.

Gottfredson, Michael R., and Travis Hirschi. 1987. “The Methodological Adequacy of Longitudinal Research on Crime.” Criminology 25: 581–614.

Gottfredson, Michael R., and Travis Hirschi. 1990. A General Theory of Crime. Stanford, CA: Stanford University Press.

Greenwood, Peter. 1975. The Criminal Investigation Process. Santa Monica, CA: Rand Corporation.

Gurr, Ted Robert. 1989. “Historical Trends in Violent Crime: Europe and the United States.” In Violence in America: The History of Crime, ed. Ted Robert Gurr. Thousand Oaks, CA: Sage Publications.

Hall, Richard A. Spurgeon, Carolyn Brown Dennis, and Tere L. Chipman. 1999. The Ethical Foundations of Criminal Justice. Boca Raton, FL: CRC Press.

Haney, Craig, Curtis Banks, and Philip Zimbardo. 1973. “Interpersonal Dynamics in a Simulated Prison.” International Journal of Criminology and Penology 1: 69–97.

Haninger, Kevin, and Kimberly M. Thompson. 2004. “Content and Ratings of Teen-Rated Video Games.” Journal of the American Medical Association 21(7, February): 856–865.

Hanmer, Jalna, Sue Griffiths, and David Jerwood. 1999. Arresting Evidence: Domestic Violence and Repeat Victimisation. Police Research Series, Paper 104. London: Home Office.

Harries, Keith. 1999. Mapping Crime: Principle and Practice. Washington, DC: U.S. Department of Justice, Office of Justice Programs, National Institute of Justice.

Harrison, Paige M., and Allen J. Beck. 2005. Prisoners in 2004. BJS Bulletin. Washington, DC: U.S. Department of Justice, Office of Justice Programs, Bureau of Justice Statistics (October).

Hawkins, J. D., T. I. Herrenkohl, D. B. Farrington, F. Brewer, R. F. Catalano, T. W. Harachi, et al. 2000. “Predictors of Youth Violence.” Juvenile Justice Bulletin 4: 1–11.

Heeren, Timothy, Robert A. Smith, Suzette Morelock, and Ralph W. Hingson. 1985. “Surrogate Measures of Alcohol Involvement in Fatal Crashes: Are Conventional Indicators Adequate?” Journal of Safety Research 16: 127–134.

Hempel, Carl G. 1952. “Fundamentals of Concept Formation in Empirical Science.” In International Encyclopedia of Unified Science: Foundations of the Unity of Science. Chicago: University of Chicago Press.

Hesseling, Rene B. P. 1994. “Displacement: A Review of the Empirical Literature.” In Crime Prevention Studies, vol. 3, ed. Ronald V. Clarke. Monsey, NY: Criminal Justice Press.

Heumann, Milton, and Colin Loftin. 1979. “Mandatory Sentencing and the Abolition of Plea Bargaining: The Michigan Felony Firearm Statute.” Law and Society Review 13: 393–430.

Hewitt, Hugh. 2005. Blog: Understanding the Information Reformation That’s Changing Your World. Nashville, TN: Nelson Books.

Felson, Richard B., Steven F. Messner, Anthony W. Hoskin, and Glenn Deane. 2002. “Reasons for Reporting and Not Reporting Domestic Violence to the Police.” Criminology 40(3, August): 617–647.

Finkelhor, David, and Lisa M. Jones. 2004. Explanations for the Decline in Child Sexual Abuse Cases. Juvenile Justice Bulletin. Washington, DC: U.S. Department of Justice, Office of Justice Programs, Office of Juvenile Justice and Delinquency Prevention (January).

Finkelhor, David, and Richard Ormrod. 2004, December. Child Pornography: Patterns from NIBRS. Juvenile Justice Bulletin. Washington, DC: U.S. Department of Justice, Office of Justice Programs, Office of Juvenile Justice and Delinquency Prevention.

Finkelstein, Michael O., and Bruce Levin. 2001. Statistics for Lawyers. 2d ed. New York: Springer-Verlag.

Finn, Peter, and Andrea K. Newlyn. 1993. Miami’s Drug Court: A Different Approach. Program Focus. Washington, DC: U.S. Department of Justice, Office of Justice Programs, National Institute of Justice.

Forster, Emma, and Alison McCleery. 1999. “Computer Assisted Personal Interviewing: A Method of Capturing Sensitive Information.” IASSIST Quarterly 23(2, Summer): 26–38. www.iassistdata.org/publications/iq/ (accessed May 23, 2008).

Gaes, Gerald G., Scott D. Camp, Julianne B. Nelson, and William G. Saylor. 2004. Measuring Prison Performance: Government Privatization and Accountability. Walnut Creek, CA: AltaMira Press.

Gant, Frances, and Peter Grabosky. 2001. The Stolen Vehicle Parts Market. Trends and Issues, no. 215. Canberra, Australia: Australian Institute of Criminology (October).

Geerken, Michael R. 1994. “Rap Sheets in Criminological Research: Considerations and Caveats.” Journal of Quantitative Criminology 10: 3–21.

General Accounting Office. 1991. Using Structured Interviewing Techniques. Transfer Paper 10.1.5. Washington, DC: United States General Accounting Office.

General Accounting Office. 1996. Content Analysis: A Methodology for Structuring and Analyzing Written Material. Transfer Paper 10.3.1. Washington, DC: United States General Accounting Office.

Gill, Martin, et al. 2005. Technical Annex: Methods Used in Assessing the Impact of CCTV. Online Report 17/05. London: Home Office Research, Development and Statistics Directorate. www.homeoffice.gov.uk/rds/pdfs05/hors292.pdf (accessed May 23, 2008).

Gill, Martin, and Angela Spriggs. 2005. Assessing the Impact of CCTV. Home Office Research Study, 292. London: Her Majesty’s Stationery Office.

Glaser, Barney G., and Anselm Strauss. 1967. The Discovery of Grounded Theory. Chicago: University of Chicago Press.

Gottfredson, Denise C., Stacy S. Najaka, and Brook Kearly. 2003. “Effectiveness of Drug Treatment Courts: Evidence from a Randomized Trial.” Criminology and Public Policy 2(2, March): 171–196.

lence in New York City.” In The Crime Drop in America, ed. A. Blumstein and J. Wallman. New York: Cambridge University Press.

Johnson, Ida M. 1999. “School Violence: The Effectiveness of a School Resource Officer Program in a Southern City.” Journal of Criminal Justice 27: 173–192.

Johnson, Kelly Dedel. 2006. Witness Intimidation. Problem-Oriented Guides for Police, no. 42. Washington, DC: U.S. Department of Justice, Office of Community Oriented Policing Services.

Johnston, Lloyd D., Patrick M. O’Malley, Jerald G. Bachman, and John E. Schulenberg. 2005. Monitoring the Future National Survey Results on Drug Use, 1975–2004: Vol. I, Secondary School Students. NIH Publication no. 05-5727. Bethesda, MD: National Institute on Drug Abuse. www.monitoringthefuture.org/pubs.html (accessed May 23, 2008).

Kalichman, Seth C. 2000. Mandated Reporting of Suspected Child Abuse: Ethics, Law, and Policy. 2d ed. Washington, DC: American Psychological Association.

Karmen, Andrew. 2000. New York Murder Mystery: The True Story Behind the Crime Crash of the 1990s. New York: NYU Press.

Kelling, George L., Tony Pate, Duane Dieckman, and Charles E. Brown. 1974. The Kansas City Preventive Patrol Experiment: A Technical Report. Washington, DC: Police Foundation.

Kennedy, David M. 1998. “Pulling Levers: Getting Deterrence Right.” National Institute of Justice Journal 236(July): 2–8.

Kennedy, David M., Anne M. Piehl, and Anthony A. Braga. 1996. “Youth Gun Violence in Boston: Gun Markets, Serious Youth Offenders, and a Use Reduction Strategy.” Law and Contemporary Problems 59(1, Winter): 147–196.

Kennet, Joel, and Joseph Gfroerer, eds. 2005. Evaluating and Improving Methods Used in the National Survey on Drug Use and Health. Publication no. SMA 03-3768. DHHS Publication no. SMA 05-4044, Methodology Series M-5. Rockville, MD: Office of Applied Studies, Substance Abuse and Mental Health Services Administration. www.oas.samhsa.gov/nsduh/methods.cfm (accessed May 23, 2008).

Kershaw, Chris, et al. 2000. The 2000 British Crime Survey: England and Wales. Home Office Statistical Bulletin. London: Home Office Research, Development and Statistics Directorate.

Kessler, David A. 1999. “The Effects of Community Policing on Complaints Against Officers.” Journal of Quantitative Criminology 15(3): 333–372.

Killias, Martin. 1993. “Gun Ownership, Suicide and Homicide: An International Perspective.” Canadian Medical Association Journal 148: 289–306.

Killias, Martin, Marcelo F. Aebi, and Denis Ribeaud. 2000. “Learning Through Controlled Experiments: Community Service and Heroin Prescription in Switzerland.” Crime and Delinquency 46: 233–251.

Kim, So Young, and Wesley G. Skogan. 2003. “Statistical Analysis of Time Series Data on Problem Solving.” Com-

Hindelang, Michael J., Michael R. Gottfredson, and James Garofalo. 1978. Victims of Personal Crime: An Empirical Foundation for a Theory of Personal Victimization. Cambridge, MA: Ballinger.

Homel, Ross, and Jeff Clark. 1994. “The Prediction and Prevention of Violence in Pubs and Clubs.” In Crime Prevention Studies, vol. 3, ed. Ronald V. Clarke. Monsey, NY: Criminal Justice Press.

Homel, Ross, Steve Tomsen, and Jennifer Thommeny. 1992. “Public Drinking and Violence: Not Just an Alcohol Problem.” Journal of Drug Issues 22: 679–697.

Hood-Williams, John, and Tracey Bush. 1995. “Domestic Violence on a London Housing Estate.” Research Bulletin 37: 11–18.

Hoover, Kenneth R., and Todd Donovan. 2007. The Elements of Social Scientific Thinking. 9th ed. Belmont, CA: Wadsworth.

Hough, Mike, and Mike Maxfield, eds. 2007. Surveying Crime in the 21st Century. Crime Prevention Studies. Monsey, NY: Criminal Justice Press.

Hunter, Rosemary S., and Nancy Kilstrom. 1979. “Breaking the Cycle in Abusive Families.” American Journal of Psychiatry 136: 1318–1322.

Idaho State Police. 2005. Crime in Idaho 2004. Meridian, ID: Idaho State Police, Bureau of Criminal Identification, Uniform Crime Reporting Unit. www.isp.state.id.us/identification/ucr/2004/crime_in_Idaho_2004.html (accessed May 23, 2008).

Inciardi, James A. 1993. “Some Considerations on the Methods, Dangers, and Ethics of Crack-House Research.” Appendix A. In Women and Crack-Cocaine, James A. Inciardi, Dorothy Lockwood, and Anne E. Pottieger. New York: Macmillan.

Inciardi, James A., Dorothy Lockwood, and Anne E. Pottieger. 1993. Women and Crack-Cocaine. New York: Macmillan.

Jacob, Herbert. 1984. Using Published Data: Errors and Remedies. Thousand Oaks, CA: Sage Publications.

Jacobs, Bruce A. 1996. “Crack Dealers’ Apprehension Avoidance Techniques: A Case of Restrictive Deterrence.” Justice Quarterly 13: 359–381.

Jacobs, Bruce A. 1999. Dealing Crack: The Social World of Streetcorner Selling. Boston: Northeastern University Press.

Jacobs, Bruce A., and Jody Miller. 1998. “Crack Dealing, Gender, and Arrest Avoidance.” Social Problems 45(4): 550–569.

Jeffery, C. Ray. 1977. Crime Prevention Through Environmental Design. 2d ed. Thousand Oaks, CA: Sage Publications.

Johansen, Helle Krogh, and Peter C. Gotzsche. 1999. “Problems in the Design and Reporting of Trials of Antifungal Agents Encountered During Meta-Analysis.” Journal of the American Medical Association 282(18, November): 1752–1759.

Johnson, Bruce D., et al. 1985. Taking Care of Business: The Economics of Crime by Heroin Abusers. Lexington, MA: Lexington.

Johnson, Bruce D., Andrew Golub, and Eloise Dunlap. 2000. “The Rise and Decline of Drugs, Drug Markets, and Vio-

Macintyre, Stuart, and Ross Homel. 1996. “Danger on the Dance Floor: A Study of the Interior Design, Crowding and Aggression in Nightclubs.” In Policing for Prevention: Reducing Crime, Public Intoxication, and Injury, ed. Ross Homel. Crime Prevention Studies, vol. 7. Monsey, NY: Criminal Justice Press.

MacKenzie, Doris Layton, Katherine Browning, Stacy B. Skroban, and Douglas A. Smith. 1999. “The Impact of Probation on the Criminal Activities of Offenders.” Journal of Research in Crime and Delinquency 36(4): 423–453.

MacKenzie, Doris Layton, James W. Shaw, and Voncile B. Gowdy. 1993. An Evaluation of Shock Incarceration in Louisiana. Research in Brief. Washington, DC: U.S. Department of Justice, Office of Justice Programs, National Institute of Justice (June).

Maher, Lisa. 1997. Sexed Work: Gender, Race, and Resistance in a Brooklyn Drug Market. Oxford, UK: Clarendon Press.

Maltz, Michael D. 1994. “Deviating from the Mean: The Declining Significance of Significance.” Journal of Research in Crime and Delinquency 31: 434–463.

Maltz, Michael D. 2006. “Some P-Baked Thoughts (P > 0.5) on Experiments and Statistical Significance.” Journal of Experimental Criminology 2: 211–226.

Maltz, Michael D., and Marianne W. Zawitz. 1998. Displaying Violent Crime Trends Using Estimates from the National Crime Victimization Survey. Bureau of Justice Statistics Technical Report. Washington, DC: U.S. Department of Justice, Office of Justice Programs, Bureau of Justice Statistics.

Marshall, Catherine, and Gretchen B. Rossman. 1995. Designing Qualitative Research. Thousand Oaks, CA: Sage Publications.

Marx, Karl. 1880. “Revue Socialist.” Reprinted 1964. In Karl Marx: Selected Writings in Sociology and Social Philosophy, ed. T. N. Bottomore and Maximilien Rubel. New York: McGraw-Hill.

Mastrofski, Stephen D., et al. 1998. Systematic Observation of Public Police: Applying Field Research Methods to Policy Issues. Research Report. Washington, DC: U.S. Department of Justice, Office of Justice Programs, National Institute of Justice (December).

Mastrofski, Stephen D., and R. Richard Ritti. 1999. “Patterns of Community Policing: A View from Newspapers in the United States.” COPS Working Paper 2. Washington: U.S. Department of Justice, Office of Community Oriented Policing Services.

Matz, David. 2007. “Development and Key Results from the First Two Waves of the Offending Crime and Justice Survey.” In Surveying Crime in the 21st Century, ed. Mike Hough and Mike Maxfield. Crime Prevention Studies, vol. 22. Monsey, NY: Criminal Justice Press.

Maxfield, Michael G. 1999. “The National Incident-Based Reporting System: Research and Policy Applications.” Journal of Quantitative Criminology 15(2, June): 119–149.

Maxfield, Michael G. 2001. Guide to Frugal Evaluation for Criminal Justice. Final Report to the National Institute of Justice. Washington, DC: U.S. Department of Justice, Office of Justice Programs, National Institute of Justice.

munity Policing Working Paper 27. Evanston, IL: Center for Policy Research, Northwestern University. www.northwestern.edu/ipr/publications/policing.html (accessed May 23, 2008).

Kish, Leslie. 1965. Survey Sampling. New York: Wiley.

Klaus, Patsy. 2004. Carjacking, 1993–2002. BJS Special Report. Washington, DC: U.S. Department of Justice, Office of Justice Programs, Bureau of Justice Statistics (July).

Krueger, Richard A., and Mary Anne Casey. 2000. Focus Groups: A Practical Guide for Applied Research. 3d ed. Thousand Oaks, CA: Sage Publications.

Lange, James E., Mark B. Johnson, and Robert B. Voas. 2005. “Testing the Racial Profiling Hypothesis for Seemingly Disparate Traffic Stops on the New Jersey Turnpike.” Justice Quarterly 22: 193–223.

Langworthy, Robert, ed. 1999. Measuring What Matters. Proceedings from the Policing Research Institute Meetings. Washington, DC: U.S. Department of Justice, Office of Justice Programs, National Institute of Justice.

Larson, Richard C. 1975. “What Happened to Patrol Operations in Kansas City? A Review of the Kansas City Preventive Patrol Experiment.” Journal of Criminal Justice 3: 267–297.

Lattimore, Pamela K., and Joanna R. Baker. 1992. “The Impact of Recidivism and Capacity on Prison Populations.” Journal of Quantitative Criminology 8: 189–215.

Laub, John H., and Robert J. Sampson. 2003. Shared Beginnings, Divergent Lives: Delinquent Boys to Age 70. Cambridge, MA: Harvard University Press.

Lauritsen, Janet L., Robert J. Sampson, and John H. Laub. 1991. “The Link Between Offending and Victimization Among Adolescents.” Criminology 29: 265–292.

Lauritsen, Janet L., and Robin J. Schaum. 2005. Crime and Victimization in the Three Largest Metropolitan Areas, 1980–98. Bureau of Justice Statistics Technical Report. Washington, DC: U.S. Department of Justice, Office of Justice Programs, Bureau of Justice Statistics.

Laycock, Gloria. 2002. “Methodological Issues in Working with Policy Advisers and Practitioners.” In Analysis for Crime Prevention, ed. Nick Tilley. Crime Prevention Studies, vol. 13. Monsey, NY: Criminal Justice Press.

Leiber, Michael J., and Jayne M. Stairs. 1999. “Race, Contexts, and the Use of Intake Diversion.” Journal of Research in Crime and Delinquency 36(1): 56–86.

Lempert, Richard O. 1984. “From the Editor.” Law and Society Review 18: 505–513.

Levine, Robert. 1997. A Geography of Time: The Temporal Misadventures of a Social Psychologist. New York: Basic Books.

Lilly, J. Robert. 2006. “Issues Beyond Empirical EM Reports.” Criminology and Public Policy 5(1, February): 93–102.

Lineberry, Robert L. 1977. American Public Policy. New York: Harper and Row.

Loeber, Rolf, Magda Stouthamer-Loeber, Welmoet van Kammen, and David P. Farrington. 1991. “Initiation, Escalation and Desistance in Juvenile Offending and Their Correlates.” Journal of Criminal Law and Criminology 82(1): 36–82.

McVicker, Steve, and Roma Khanna. 2003. “DNA Find Sparks Call for Review: New Look at Policies in DA’s Office Urged.” Houston Chronicle, 15 March.

Merritt, Nancy, Terry Fain, and Susan Turner. 2006. “Oregon’s Get Tough Sentencing Reform: A Lesson in Justice System Adaptation.” Criminology and Public Policy 5(1, February–March): 5–36.

Mieczkowski, Thomas. 1990. “Crack Distribution in Detroit.” Contemporary Drug Problems 17: 9–30.

Mieczkowski, Thomas M. 1996. “The Prevalence of Drug Use in the United States.” In Crime and Justice: An Annual Review of Research, ed. Michael Tonry. Chicago: University of Chicago Press.

Miller, Jane E. 2004. The Chicago Guide to Writing About Numbers. Chicago: University of Chicago Press.

Miller, Joel. 2000. Profiling Populations Available for Stops and Searches. Police Research Series, Paper 131. London: Policing and Reducing Crime Unit, Home Office Research, Development and Statistics Directorate.

Mirrlees-Black, Catriona. 1995. “Estimating the Extent of Domestic Violence: Findings from the 1992 BCS.” Research Bulletin 37: 1–9.

Mirrlees-Black, Catriona. 1999. Domestic Violence: Findings from a New British Crime Survey Self-Completion Questionnaire. Home Office Research Study. London: Home Office Research, Development and Statistics Directorate.

Mirrlees-Black, Catriona, Pat Mayhew, and Andrew Percy. 1996. The 1996 British Crime Survey: England and Wales. Home Office Statistical Bulletin. London: Home Office Research, Development and Statistics Directorate.

Mitford, Jessica. 1973. Kind and Usual Punishment: The Prison Business. New York: Random House.

Monahan, John, et al. 1993. “Ethical and Legal Duties in Conducting Research on Violence: Lessons from the MacArthur Risk Assessment Study.” Violence and Victims 8(4): 387–396.

Moore, Mark H., and Anthony Braga. 2003. The “Bottom Line” of Policing: What Citizens Should Value (and Measure) in Police Performance. Washington, DC: Police Executive Research Forum.

Mott, Joy, and Catriona Mirrlees-Black. 1995. Self-Reported Drug Misuse in England and Wales: Findings from the 1992 British Crime Survey. Research and Planning Unit Paper 89. London: Home Office Research, Development and Statistics Directorate.

Murphy, Dean E. 2005. “Arrests Follow Searches in Medical Marijuana Raid.” New York Times, 23 June.

National Commission for the Protection of Human Subjects of Biomedical and Behavioral Research. 1979. The Belmont Report: Ethical Principles and Guidelines for the Protection of Human Subjects of Research. Washington, DC: U.S. Department of Health, Education, and Welfare.

National Institute of Justice. 2006. Research and Evaluation on the Abuse, Neglect, and Exploitation of Elderly Individuals, Older Women, and Residents of Residential Care Facilities. Washington, DC: U.S. Department of Justice, Office of Justice Programs, National Institute of Justice. www

www.ncjrs.org/pdffiles1/nij/187350.pdf (accessed May 23, 2008).

Maxfield, Michael G., and W. Carsten Andresen. 2002. Evaluation of New Jersey State Police In-Car Mobile Video Recording System. Final Report to the Office of the Attorney General. Newark, NJ: School of Criminal Justice, Rutgers University.

Maxfield, Michael G., and Terry L. Baumer. 1991. “Electronic Monitoring in Marion County Indiana.” Overcrowded Times 2: 5, 17.

Maxfield, Michael G., and Terry L. Baumer. 1992. “Home Detention with Electronic Monitoring: A Nonexperimental Salvage Evaluation.” Evaluation Review 16: 315–332.

Maxfield, Mike, Mike Hough, and Pat Mayhew. 2007. “Surveying Crime in the 21st Century: Summary and Recommendations.” In Surveying Crime in the 21st Century, ed. Mike Hough and Mike Maxfield. Crime Prevention Studies, vol. 22. Monsey, NY: Criminal Justice Press.

Maxfield, Michael G., and George L. Kelling. 2005. New Jersey State Police and Stop Data: What Do We Know, What Should We Know, and What Should We Do? Newark, NJ: Police Institute at Rutgers–Newark (March).

Maxfield, Michael G., and Cathy Spatz Widom. 1996. “The Cycle of Violence: Revisited Six Years Later.” Archives of Pediatrics and Adolescent Medicine 150: 390–395.

Maxwell, Christopher D., Joel H. Garner, and Jeffrey A. Fagan. 2001. The Effects of Arrest on Intimate Partner Violence: New Evidence from the Spouse Assault Replication Program. Research in Brief. Washington, DC: U.S. Department of Justice, Office of Justice Programs, National Institute of Justice (July).

Maxwell, Joseph A. 2005. Qualitative Research Design: An Interactive Approach. 2d ed. Thousand Oaks, CA: Sage Publications.

Mayhew, Patricia, Ronald V. Clarke, and David Elliott. 1989. “Motorcycle Theft, Helmet Legislation, and Displacement.” Howard Journal 28: 1–8.

McCahill, Michael, and Clive Norris. 2003. “Estimating the Extent, Sophistication and Legality of CCTV in London.” In CCTV, ed. Martin Gill. Leicester: Perpetuity Press.

McCall, George J. 1978. Observing the Law: Field Methods in the Study of Crime and the Criminal Justice System. New York: Free Press.

McCleary, Richard. 1992. Dangerous Men: The Sociology of Parole. 2d ed. New York: Harrow and Heston.

McCleary, Richard, Barbara C. Nienstedt, and James M. Erven. 1982. “Uniform Crime Reports as Organizational Outcomes: Three Time Series Experiments.” Social Problems 29: 361–372.

McDonald, Douglas C., and Christine Smith. 1989. Evaluating Drug Control and System Improvement Projects. Washington, DC: U.S. Department of Justice, Office of Justice Programs, National Institute of Justice.

McGarrell, Edmund F., Steven Chermak, Alexander Weiss, and Jeremy Wilson. 2001. “Reducing Firearms Violence Through Directed Police Patrol.” Criminology and Public Policy 1(1): 119–148.

Crime Analysis for Local and Regional Action.” In Understanding and Preventing Car Theft, ed. Michael G. Maxfield and Ronald V. Clarke. Crime Prevention Studies, vol. 17. Monsey, NY: Criminal Justice Press.

Pollard, Paul. 1995. “Pornography and Sexual Aggression.” Current Psychology 14: 200–221.

Pollock, Joycelyn M. 2003. Ethics in Crime and Justice: Dilemmas and Decisions. 4th ed. Belmont, CA: Wadsworth.

Posavec, Emil J., and Raymond G. Carey. 2002. Program Evaluation: Methods and Case Studies. 6th ed. Englewood Cliffs, NJ: Prentice Hall.

President’s Commission on Law Enforcement and Administration of Justice. 1967. The Challenge of Crime in a Free Society. Washington, DC: U.S. Government Printing Office.

Pudney, Stephen. 2002. The Road to Ruin? Sequences of Initiation Into Drug Use and Offending by Young People in Britain. Home Office Research Study, 253. London: Her Majesty’s Stationery Office. www.crimereduction.gov.uk/drugsalcohol62.htm (accessed May 23, 2008).

Quade, E. S. 1989. Policy Analysis for Public Decisions. 3d ed. Rev. Grace M. Carter. New York: North-Holland.

Ragin, Charles C. 2000. Fuzzy-Set Social Science. Chicago: University of Chicago Press.

Ramsay, Malcolm, and Sarah Partridge. 1999. Drug Misuse Declared in 1998: Results from the British Crime Survey. Home Office Research Study, 197. London: Her Majesty’s Stationery Office.

Ramsay, Malcolm, and Andrew Percy. 1996. Drug Misuse Declared: Results of the 1994 British Crime Survey. Home Office Research Study, 151. London: Her Majesty’s Stationery Office.

Rand, Michael, and Shannan M. Catalano. 2007. Criminal Victimization, 2006. Washington, DC: U.S. Department of Justice, Office of Justice Programs, Bureau of Justice Statistics. www.ojp.usdoj.gov/bjs/abstract/cv06.htm (accessed May 23, 2008).

Rand, Michael R., and Callie M. Rennison. 2005. “Bigger Is Not Necessarily Better: An Analysis of Violence Against Women Estimates from the National Crime Victimization Survey and the National Violence Against Women Survey.” Journal of Quantitative Criminology 21: 267–291.

Rasinski, Kenneth A. 1989. “The Effect of Question Wording on Public Support for Government Spending.” Public Opinion Quarterly 53: 388–394.

Ratcliffe, Jerry H. 2004. “The Hotspot Matrix: A Framework for the Spatio-Temporal Targeting of Crime Reduction.” Police Practice and Research 5: 5–23.

Reuter, Peter, Robert MacCoun, and Patrick Murphy. 1990. Money from Crime: A Study of the Economics of Drug Dealing in Washington, DC. Santa Monica, CA: Rand.

Reynolds, Paul D. 1979. Ethical Dilemmas and Social Research. San Francisco: Jossey-Bass.

Rich, Thomas F. 1999. “Mapping the Path to Problem Solving.” National Institute of Justice Journal (October).

Roberts, James C. 2002. Serving Up Trouble in the Barroom Environment. Unpublished Ph.D. dissertation, Rutgers Uni-

.ncjrs.gov/pdffiles1/nij/sl000746.pdf (accessed May 23, 2008).

National Research Council. 1996. The Evaluation of Forensic DNA Evidence. Washington, DC: National Academy Press.

National Research Council. 2001. Informing America’s Policy on Illegal Drugs: What We Don’t Know Keeps Hurting Us. Committee on Data and Research for Policy on Illegal Drugs, ed. Charles F. Manski, John V. Pepper, and Carol V. Petrie. Washington, DC: National Academy Press.

Nellis, Mike. 2006. “Surveillance, Rehabilitation, and Electronic Monitoring: Getting the Issues Clear.” Criminology and Public Policy 5(1, February): 103–108.

Newman, Oscar. 1972. Defensible Space. New York: Macmillan.

Newman, Oscar. 1996. Creating Defensible Space. Washington, DC: U.S. Department of Housing and Urban Development, Office of Policy Development and Research.

Nicholas, Sian, Chris Kershaw, and Alison Walker. 2007. Crime in England and Wales 2006/07. 4th ed. Home Office Statistical Bulletin. London: Home Office Research, Development and Statistics Directorate. www.homeoffice.gov.uk/rds/pdfs07/hosb1107.pdf (accessed May 23, 2008).

Nolan, James J., and Yoshio Akiyama. 1999. “An Analysis of Factors That Affect Law Enforcement Participation in Hate Crime Reporting.” Journal of Contemporary Criminal Justice 15(1): 111–127.

Nolan, James J., Yoshio Akiyama, and Samuel Berhanu. 2002. “The Hate Crime Statistics Act of 1990: Developing a Method for Measuring the Occurrence of Hate Violence.” American Behavioral Scientist 46(1): 136–153.

Painter, Kate. 1996. “The Influence of Street Lighting Improvements on Crime, Fear and Pedestrian Street Use After Dark.” Landscape and Urban Planning 35: 193–201.

Painter, Kate, and David P. Farrington. 1998. “Marital Violence in Great Britain and Its Relationship to Marital and Non-marital Rape.” International Review of Victimology 5(3/4): 257–276.

Patton, Michael Quinn. 2001. Qualitative Research and Evaluation Methods. 3d ed. Thousand Oaks, CA: Sage Publications.

Pawson, Ray, and Nick Tilley. 1997. Realistic Evaluation. Thousand Oaks, CA: Sage Publications.

Pease, Ken. 1998. Repeat Victimisation: Taking Stock. Crime Prevention and Detection Series, Paper 90. London: Home Office.

Perkins, Craig, and Darrell K. Gilliard. 1992. National Corrections Reporting Program, 1988. Washington, DC: U.S. Department of Justice, Office of Justice Programs, Bureau of Justice Statistics.

Perrone, Dina. 2006. “New York City Club Kids: A Contextual Understanding of Club Drug Use.” In Drugs, Clubs and Young People, ed. Bill Sanders. Hampshire, UK: Ashgate Publishing.

Pettiway, Leon E. 1995. “Copping Crack: The Travel Behavior of Crack Users.” Justice Quarterly 12(3): 499–524.

Plouffe, Nanci, and Rana Sampson. 2004. “Auto Theft and Theft from Autos in Parking Lots in Chula Vista, CA:

Violence Experiment.” Journal of Criminal Law and Criminology 83: 137–169.

Sherman, Lawrence W. 1995. “Hot Spots of Crime and Criminal Careers of Places.” In Crime and Place, ed. John E. Eck and David Weisburd. Crime Prevention Studies, vol. 4. Monsey, NY: Criminal Justice Press.

Sherman, Lawrence W., and Richard A. Berk. 1984. The Minneapolis Domestic Violence Experiment. Washington, DC: Police Foundation.

Sherman, Lawrence W., and Ellen G. Cohn. 1989. “The Impact of Research on Legal Policy: The Minneapolis Domestic Violence Experiment.” Law and Society Review 23: 117–144.

Sherman, Lawrence W., and Dennis P. Rogan. 1995. “Effects of Gun Seizures on Gun Violence: ‘Hot Spots’ Patrol in Kansas City.” Justice Quarterly 12(4): 673–694.

Sherman, Lawrence W., James W. Shaw, and Dennis P. Rogan. 1995. The Kansas City Gun Experiment. Research in Brief. Washington, DC: U.S. Department of Justice, Office of Justice Programs, National Institute of Justice (January).

Sieber, Joan E. 2001. Summary of Human Subjects Protection Issues Related to Large Sample Surveys. Washington, DC: U.S. Department of Justice, Office of Justice Programs, Bureau of Justice Statistics (June).

Sinauer, Nancy, et al. 1999. “Comparisons Among Female Homicides Occurring in Rural, Intermediate, and Urban Counties in North Carolina.” Homicide Studies 3: 107–128.

Singleton, Royce A., Jr., Bruce C. Straits, and Margaret Miller Straits. 2005. Approaches to Social Research. 4th ed. New York: Oxford University Press.

Silverman, Eli B. 1999. NYPD Battles Crime: Innovative Strategies in Policing. Boston: Northeastern University Press.

Skogan, Wesley G. 1985. Evaluating Neighborhood Crime Prevention Programs. The Hague, Netherlands: Ministry of Justice, Research and Documentation Centre.

Skogan, Wesley G. 1988. “Community Organizations and Crime.” In Crime and Justice: An Annual Review of Research, ed. Michael Tonry and Norval Morris. Chicago: University of Chicago Press.

Skogan, Wesley G. 1990a. Disorder and Decline: Crime and the Spiral of Decay in American Neighborhoods. New York: Free Press.

Skogan, Wesley G. 1990b. The Police and the Public in England and Wales. Home Office Research Study, 117. London: Her Majesty’s Stationery Office.

Skogan, Wesley G., and Michael G. Maxfield. 1981. Coping with Crime: Individual and Neighborhood Reactions. Thousand Oaks, CA: Sage Publications.

Smith, Steven K., and Carolyn C. DeFrances. 2003. Assessing Measurement Techniques for Identifying Race, Ethnicity, and Gender: Observation-Based Data Collection in Airports and at Immigration Checkpoints. Washington, DC: U.S. Department of Justice, Office of Justice Programs, Bureau of Justice Statistics (January).

Smith, Steven K., Greg W. Steadman, Todd D. Minton, and Meg Townsend. 1999. Criminal Victimization and Percep-

versity School of Criminal Justice. Newark, NJ: School of Criminal Justice, Rutgers University.

Roberts, Jennifer, et al. 2005. “A Test of Two Models of Recall for Violent Events.” Journal of Quantitative Criminology 21: 175–193.

Robins, Lee N. 1978. “Sturdy Childhood Predictors of Adult Antisocial Behavior: Replications from Longitudinal Studies.” Psychological Medicine 8: 611–622.

Rosenfeld, Richard, Timothy M. Bray, and Arlen Egley. 1999. “Facilitating Violence: A Comparison of Gang-Motivated, Gang-Affiliated, and Nongang Youth Homicides.” Journal of Quantitative Criminology 15(4): 495–516.

Rossi, Peter H., Howard E. Freeman, and Mark W. Lipsey. 1999. Evaluation: A Systematic Approach. 6th ed. Thousand Oaks, CA: Sage Publications.

Sampson, Rana. 2002. Acquaintance Rape of College Students. Problem-Oriented Guides for Police, no. 17. Washington, DC: U.S. Department of Justice, Office of Community Oriented Policing Services.

Sampson, Robert J., and John H. Laub. 1993. Crime in the Making: Pathways and Turning Points Through Life. Cambridge, MA: Harvard University Press.

Sampson, Robert J., and Stephen W. Raudenbush. 1999. “Systematic Social Observation of Public Spaces: A New Look at Disorder in Urban Neighborhoods.” American Journal of Sociology 105(3, November): 603–651.

Schuck, Amie M., and Cathy Spatz Widom. 2001. “Childhood Victimization and Alcohol Symptoms in Females: Causal Inferences and Hypothesized Mediators.” Child Abuse and Neglect 25: 1069–1092.

Semaan, Salaam, Jennifer Lauby, and Jon Liebman. 2002. “Street and Network Sampling in Evaluation Studies of HIV Risk-Reduction Interventions.” AIDS Reviews 4: 213–223.

Shadish, William R., Thomas D. Cook, and Donald T. Campbell. 2002. Experimental and Quasi-Experimental Designs for Generalized Causal Inference. Boston: Houghton Mifflin.

Shannon, David M., Todd E. Johnson, Shelby Searcy, and Alan Lott. 2002. “Using Electronic Surveys: Advice from Professionals.” Practical Assessment, Research and Evaluation 8(1).

Shea, Christopher. 2000. “Don’t Talk to Humans: The Crackdown on Social Science Research.” Lingua Franca 10(6, September).

Shearing, Clifford D., and Phillip C. Stenning. 1992. “From the Panopticon to Disney World: The Development of Discipline.” In Situational Crime Prevention: Successful Case Studies, ed. Ronald V. Clarke. New York: Harrow and Heston.

Sherman, Lawrence W. 1992a. “The Influence of Criminology on Criminal Law: Evaluating Arrests for Misdemeanor Domestic Violence.” Journal of Criminal Law and Criminology 83: 1–45.

Sherman, Lawrence W. 1992b. Policing Domestic Violence: Experiments and Dilemmas. New York: Free Press.

Sherman, Lawrence W., et al. 1992. “The Variable Effects of Arrest on Criminal Careers: The Milwaukee Domestic

Taylor, Ralph B. 1999. Crime, Grime, Fear, and Decline: A Longitudinal Look. Research in Brief. Washington, DC: U.S. Department of Justice, Office of Justice Programs, National Institute of Justice (July).

Telemarketing Sales Rule. 2003. 16 CFR Part 310.

Thompson, Kimberly M., and Kevin Haninger. 2001. “Violence in E-Rated Video Games.” Journal of the American Medical Association 286(5, August): 591–598.

Thompson, Steven K. 1997. Adaptive Sampling in Behavioral Surveys. NIDA Monograph no. 167. Bethesda, MD: U.S. Department of Health and Human Services, National Institute of Drug Abuse.

Tilley, Nick, and Gloria Laycock. 2002. Working Out What to Do: Evidence-Based Crime Reduction. Crime Reduction Re-search Series, Paper 11. London: Home Offi ce, Policing and Reducing Crime Unit, Research, Development and Statistics Directorate.

Tjaden, Patricia, and Nancy Thoennes. 2000. Full Report of the Prevalence, Incidence, and Consequences of Violence Against Women. Findings from the National Violence Against Women Survey. Research Report. Washington, DC: U.S. Department of Justice, Offi ce of Justice Programs, Na-tional Institute of Justice (November).

Tourangeau, Roger, and Tom W. Smith. 1996. “Asking Sen-sitive Questions: The Impact of Data Collection Mode, Question Format, and Question Context.” Public Opinion Quarterly 60: 275–304.

Townsley, Michael, Ross Homel, and Janet Chaseling. 2003. “Infectious Burglaries: A Test of the Near Repeat Hypothesis.” British Journal of Criminology 43: 615–633.

Tremblay, Pierre, Bernard Talon, and Doug Hurley. 2001. “Body Switching and Related Adaptations in the Resale of Stolen Vehicles.” British Journal of Criminology 41: 561–579.

United States Bureau of the Census. 2006. Statistical Abstract of the United States: 2006. Washington, DC: U.S. Bureau of the Census.

Van Dijk, Jan. 2007. “The International Crime Victims Survey and Complementary Measures of Corruption and Organised Crime.” In Surveying Crime in the 21st Century, ed. Mike Hough and Mike Maxfield. Crime Prevention Studies, vol. 22. Monsey, NY: Criminal Justice Press.

Van Kirk, Marvin. 1977. Response Time Analysis. Washington, DC: U.S. Department of Justice, National Institute of Law Enforcement and Administration of Justice.

Vazquez, Salvador P., Mary K. Stohr, and Marcus Purkiss. 2005. “Intimate Partner Violence Incidence and Characteristics: Idaho NIBRS 1995 to 2001 Data.” Criminal Justice Policy Review 16: 99–114.

Walker, Jeffrey T. 1994. “Fax Machines and Social Surveys: Teaching an Old Dog New Tricks.” Journal of Quantitative Criminology 10(2, June): 181–188.

Walker, Samuel. 1994. Sense and Nonsense About Crime and Drugs: A Policy Guide. 3d ed. Belmont, CA: Wadsworth.

Ward, V. M., J. T. Bertrand, and L. E. Brown. 1991. “The Comparability of Survey and Focus Group Results.” Evaluation Review 15: 266–283.

tions of Community Safety in 12 Cities, 1998. Washington, DC: U.S. Department of Justice, Office of Justice Programs, Bureau of Justice Statistics and Office of Community Oriented Police Services (June).

Smith, William R., et al. 2003. “Self-Reported Police Speeding Stops: Results from a North Carolina Record Check Survey.” In The North Carolina Highway Traffic Safety Study, Appendix E, Donald Tomaskovic-Devey and Cynthia Pfaff Wright. Raleigh, NC: North Carolina State University.

Smith, William R., Donald Tomaskovic-Devey, Matthew T. Zingraff, H. Marcinda Mason, Patricia Y. Warren, and Cynthia Pfaff Wright. 2003. “The North Carolina Highway Traffic Study.” Final Report on Racial Profiling to the National Institute of Justice.

Snyder, Howard N. 2000. Sexual Assault of Young Children as Reported to Law Enforcement: Victim, Incident, and Offender Characteristics. NIBRS Statistical Report. Washington, DC: U.S. Department of Justice, Office of Justice Programs, Bureau of Justice Statistics (July).

Spohn, Cassia. 1990. “The Sentencing Decisions of Black and White Judges: Expected and Unexpected Similarities.” Law and Society Review 24: 1197–1216.

Spohn, Cassia, and Julie Horney. 1991. “The Law’s the Law, but Fair Is Fair: Rape Shield Laws and Officials’ Assessment of Sexual History Evidence.” Criminology 29: 137–161.

Spriggs, Angela, Javier Argomaniz, Martin Gill, and Jane Bryan. 2005. Public Attitudes Towards CCTV: Results from the Pre-intervention Public Attitude Survey Carried Out in Areas Implementing CCTV. London: Research, Development and Statistics Directorate, Home Office. www.homeoffice.gov.uk/rds/pdfs05/rdsolr1005.pdf (accessed May 24, 2008).

Stecher, Brian M., and W. Alan Davis. 1987. How to Focus an Evaluation. Thousand Oaks, CA: Sage Publications.

Straus, Murray A. 1999. “The Controversy Over Domestic Violence by Women: A Methodological, Theoretical, and Sociology of Science Analysis.” In Violence in Intimate Relationships, ed. Ximena Arriaga and Stuart Oskamp. Thousand Oaks, CA: Sage Publications.

Substance Abuse and Mental Health Services Administration. 2005. Results from the 2004 National Survey of Drug Use and Health: National Findings. NSDUH Series H-28, Publication no. SMA 05-4062. Rockville, MD: U.S. Department of Health and Human Services, Substance Abuse and Mental Health Services Administration, Office of Applied Studies. www.oas.samhsa.gov/nsduh.htm (accessed May 24, 2008).

Surette, Ray. 2006. Media, Crime, and Justice: Images, Realities and Policies. 3rd ed. Belmont, CA: Wadsworth.

Sutton, Mike. 2007. “Improving National Crime Surveys with a Focus on Fraud, High-Tech Crimes, and Stolen Goods.” In Surveying Crime in the 21st Century, ed. Mike Hough and Mike Maxfield. Crime Prevention Studies, vol. 22. Monsey, NY: Criminal Justice Press.

Taxman, Faye S., and Lori Elis. 1999. “Expediting Court Dispositions: Quick Results, Uncertain Outcomes.” Journal of Research in Crime and Delinquency 36(1): 30–55.

Wilson, James Q., and Richard J. Herrnstein. 1985. Crime and Human Nature. New York: Simon and Schuster.

Wilson, James Q., and George Kelling. 1982. “Broken Windows: The Police and Neighborhood Safety.” Atlantic Monthly (March): 29–38.

Wilson, William Julius. 1996. When Work Disappears: The World of the New Urban Poor. New York: Knopf.

Winner, Lawrence, Lonn Lanza-Kaduce, Donna M. Bishop, and Charles E. Frazier. 1997. “The Transfer of Juveniles to Criminal Court: Reexamining Recidivism Over the Long Term.” Crime and Delinquency 43: 548–563.

Wolfgang, Marvin E., Robert M. Figlio, and Thorsten Sellin. 1972. Delinquency in a Birth Cohort. Chicago: University of Chicago Press.

Wolfgang, Marvin E., Robert M. Figlio, Paul E. Tracy, and Simon I. Singer. 1985. The National Survey of Crime Severity. NCJ-96017. Washington, DC: U.S. Department of Justice, Office of Justice Programs, Bureau of Justice Statistics.

Woodward, Lianne J., and David M. Fergusson. 2000. “Childhood and Adolescent Predictors of Physical Assault: A Prospective Longitudinal Study.” Criminology 38(1, February): 233–261.

Wooten, Harold B., and Herbert J. Hoelter. 1998. “Operation Spotlight: The Community Probation-Community Police Team Process.” Federal Probation 62(2): 30–35.

Wright, Doug, Peggy Barker, Joseph Gfroerer, and Lanny Piper. 2002. “Summary of NHSDA Design Changes in 1999.” In Redesigning an Ongoing National Household Survey: Methodological Issues. Publication no. SMA 03–3768, ed. Joseph Gfroerer, Joe Eyerman, and James Chromy. Rockville, MD: Office of Applied Studies, Substance Abuse and Mental Health Services Administration. www.oas.samhsa.gov/redesigningNHSDA.pdf (accessed May 27, 2008).

Wright, Richard T., and Scott H. Decker. 1994. Burglars on the Job: Streetlife and Residential Break-Ins. Boston: Northeastern University Press.

Wright, Richard T., and Scott H. Decker. 1997. Armed Robbers in Action: Stickups and Street Culture. Boston: Northeastern University Press.

Yin, Robert K. 2003. Case Study Research: Design and Methods. 3d ed. Thousand Oaks, CA: Sage Publications.

Zanin, Nicholas, Jon M. Shane, and Ronald V. Clarke. 2004. Reducing Drug Dealing in Private Apartment Complexes in Newark, New Jersey. Final Report to the Office of Community Oriented Police Services. Washington, DC: U.S. Department of Justice, Office of Community Oriented Police Services. www.popcenter.org/Library/researcherprojects/DrugsApartment.pdf (accessed May 27, 2008).

Weisburd, David, and Chester Britt. 2002. Statistics in Criminal Justice. 2d ed. Belmont, CA: Wadsworth.

Weisburd, David, Cynthia M. Lum, and Anthony Petrosino. 2001. “Does Research Design Affect Study Outcomes in Criminal Justice?” The Annals 578 (November): 50–70.

Weisburd, David, Anthony Petrosino, and Gail Mason. 1993. “Design Sensitivity in Criminal Justice Experiments.” In Crime and Justice: An Annual Review of Research, ed. Michael Tonry. Chicago: University of Chicago Press.

Weisel, Deborah. 1999. Conducting Community Surveys: A Practical Guide for Law Enforcement Agencies. Washington, DC: U.S. Department of Justice, Office of Justice Programs, Bureau of Justice Statistics and Office of Community Oriented Police Services (October).

Weisel, Deborah Lamm. 2003. “The Sequence of Analysis in Solving Problems.” In Problem-Oriented Policing: From Innovation to Mainstream, ed. Johannes Knutsson. Crime Prevention Studies, vol. 15. Monsey, NY: Criminal Justice Press.

Weisel, Deborah Lamm. 2005. Analyzing Repeat Victimization. Problem Solving Tools Series. Washington, DC: U.S. Department of Justice, Office of Community Oriented Policing Services.

Weiss, Carol H. 1995. “Nothing as Practical as Good Theory: Exploring Theory-Based Evaluation for Comprehensive Community Initiatives for Children and Families.” In New Approaches to Evaluating Community Initiatives: Concepts, Methods, and Contexts, ed. James P. Connell, Anne C. Kubisch, Lisbeth B. Schorr, and Carol H. Weiss. Washington, DC: Aspen Institute.

West, Donald J., and David P. Farrington. 1977. The Delinquent Way of Life. London: Heinemann.

Whitt, Hugh P. 2006. “Where Did the Bodies Go? The Social Construction of Suicide Data, New York City, 1976–1992.” Sociological Inquiry 76: 166–187.

Widom, Cathy Spatz. 1989a. “Child Abuse, Neglect, and Adult Behavior: Research Design and Findings on Criminality, Violence, and Child Abuse.” American Journal of Orthopsychiatry 59: 355–367.

Widom, Cathy Spatz. 1989b. “The Cycle of Violence.” Science 244 (April): 160–166.

Widom, Cathy Spatz. 1992. The Cycle of Violence. Research in Brief. Washington, DC: U.S. Department of Justice, Office of Justice Programs, National Institute of Justice (October).

Widom, Cathy Spatz, Barbara Luntz Weiler, and Linda B. Cotler. 1999. “Childhood Victimization and Drug Abuse: A Comparison of Prospective and Retrospective Findings.” Journal of Consulting and Clinical Psychology 67(6): 867–880.

Hood-Williams, John, 224–225
Horney, Julie, 163, 171, 177
Hoskin, Anthony W., 102
Hough, Mike, 160, 185, 301–302
Huizinga, David, 248
Hunter, Rosemary, 68–69

Inciardi, James, 33, 34–35

Jacob, Herbert, 233, 234, 239
Jacobs, Bruce A., 29, 35, 165, 210
Jeffery, Ray, 203
Jerwood, David, 238
Johansen, Helle Krogh, 33–34
Johnson, Bruce A., 29, 34, 58
Johnson, Ida M., 261
Johnson, Kelly Dedel, 255
Johnson, Mark B., 215
Johnson, Todd E., 184
Johnston, Lloyd D., 104
Jolliffe, Darrick, 55

Kalichman, Seth, 37
Karmen, Andrew, 58, 59
Kearly, Brook W., 231
Kelling, George L., 5, 59, 107, 221–222
Kennedy, David M., 133–135
Kennet, Joel, 177
Kershaw, Chris, 157
Kessler, David A., 65
Khanna, Roma, 93
Killias, Martin, 36, 67
Kilstrom, Nancy, 68–69
Kim, June, 58–59
Kim, Young, 132
Kinshott, G., 157
Klaus, Patsy, 102

Langan, Patrick A., 102
Lange, James E., 215
Lanza-Kaduce, Lonn, 231
Larson, Richard, 56–57
Lattimore, Pamela, 231
Laub, John H., 70, 231, 248
Lauritsen, Janet L., 172, 233, 248
Laycock, Gloria, 238, 281
Leiber, Michael, 163
Lempert, Richard O., 9
Likert, Rensis, 174
Lilly, J. Robert, 284
Lineberry, Robert, 256–257
Lin-Kelly, Wendy, 102
Lipsey, Mark W., 257, 260, 282

Figlio, Robert M., 66–67, 91–92
Finkelhor, David, 102
Finn, Peter, 262
Forster, Emma, 189
Fox, James Alan, 59
Freeman, Howard E., 257, 260, 282
Fujita, Shuryo, 277

Gallup, George, 170
Gant, Frances, 210
Garner, Joel H., 249
Garofalo, James, 301
Geerken, Michael, 94, 241–242
Gendreau, Paul, 52
Gfroerer, Joseph, 177, 188
Gill, Martin, 125, 127–128, 163, 273
Gilliard, Darrell, 240
Glaser, Barney, 11, 201
Glueck, Eleanor, 231
Glueck, Sheldon, 231
Golub, Andrew, 58
Gottfredson, Denise C., 121, 231, 268
Gottfredson, Michael R., 82, 97, 301
Gotzsche, Peter, 33–34
Gowdy, Voncile B., 261
Grabosky, Peter, 210
Greenwood, Peter, 6
Griffiths, Sue, 238
Gurr, Ted Robert, 233–234, 240

Hall, Richard, 27
Haney, Curtis, 43–45
Haninger, Kevin, 246
Hanmer, Jalna, 238
Harries, Keith, 276
Harris, Patricia, 281
Harrison, Paige M., 233
Hartnagel, Timothy, 165
Hawkins, J. D., 55
Heeren, Timothy, 95
Hempel, Carl G., 84–85
Herrnstein, Richard, 97
Hesseling, René, 36
Heumann, Milton, 248, 249
Hewitt, Hugh, 144
Hindelang, Michael J., 301
Hingson, Ralph W., 95
Hirschi, Travis, 82, 97
Hobbs, Louise, 238
Hoelter, Herbert J., 262
Holloway, Katy, 58
Homel, Ross, 11, 201, 222–224

Chermak, Steven M., 231
Clark, Jeff, 201, 222–224
Clarke, Ronald V., 20, 21, 36, 125, 126, 201, 214, 231, 272–273, 274, 280–281, 284, 301–302
Cohen, Jacqueline, 55
Cohen, Lawrence E., 11
Cohn, Ellen G., 8–9, 243
Coleman, Veronica, 276
Cook, Thomas D., 52, 53, 57, 118, 122, 129, 132–133, 134, 271
Cotler, Linda B., 33, 70, 187
Cullen, Francis T., 52, 255
Cynamon, Marcie L., 154, 190

D’Alessio, Stewart J., 64
Davis, W. Alan, 261
Deane, Glenn, 102
Decker, Scott H., 19, 35, 40, 42–43, 166, 209–212, 225, 226–227, 274
Dennis, Michael, 36
Devine, Joel, 70
Dieckman, Duane, 5
Dillman, Don, 183–184
Dinkes, Rachel, 102
Ditton, Jason, 96, 172
Duhart, Detis T., 102
Dunlap, Eloise, 58
Durose, Matthew R., 102

Eck, John, 20, 36, 122, 267, 274, 281, 282
Edmunds, Alan, 238
Egley, Arlen, 246–247
Eisenberg, Michael, 87–88
Ekblom, Paul, 301–302
Elis, Lori, 65
Elliott, David, 272–273
Elliott, Delbert S., 248
Engel, Robin Shepard, 220
Erven, James M., 129–130, 236, 243

Fabelo, Tony, 84–85, 87, 283, 284
Fagan, Jeffrey A., 58–59, 249
Faggiani, Donald, 280
Fain, Terry, 230, 272
Farrall, Stephen, 96, 172
Farrell, Graham, 238
Farrington, David P., 55, 94, 96, 106–107, 113, 117, 164, 264
Felson, Marcus, 11, 36, 165
Felson, Richard B., 102
Fergusson, David, 37

Ageton, Suzanne S., 248
Akiyama, Yoshio, 102, 196
Anderson, Craig A., 231
Andresen, W. Carsten, 206, 221–222
Argomaniz, Javier, 127

Baker, Joanna, 231
Banks, Craig, 43–45
Barker, Peggy, 188
Baron, Stephen, 165
Baum, Katrina, 102
Baumer, Eric, 4–5
Baumer, Terry L., 203, 233, 240, 262–263, 269–271, 273
Beck, Allen J., 233
Bennett, Trevor, 58
Bennis, Jason, 218
Berhanu, Samuel, 102
Berk, Richard A., 9, 121, 267, 268
Bertrand, J. T., 195–196
Bichler-Robertson, Gisela, 220–221, 231
Bishop, Donna M., 231
Block, Carolyn, 241
Block, Richard, 241
Blumberg, Stephen J., 154, 190
Blumstein, Alfred, 58–59
Boba, Rachel, 258
Braga, Anthony, 133–135, 281
Brantingham, Patricia, 203
Brantingham, Paul, 203
Bratton, William J., 58–59, 242
Bray, Timothy M., 246–247
Brown, Charles E., 5
Brown, L. E., 195–196
Brown, Rick, 21
Browning, Katherine, 65
Budd, T., 157
Bush, Tracy, 224–225
Bushman, Brad, 231

Campbell, Donald T., 52, 53, 57, 118, 122, 123, 129, 132–133, 134, 271
Cantor, David, 240
Carey, Raymond, 282
Casady, Tom, 30
Catalano, Shannon M., 160
Cataldi, Emily Forrest, 102
Chaiken, Jan M., 57
Chaiken, Marcia R., 57
Chainey, Spencer, 274
Chamard, Sharon, 274
Chaseling, Janet, 11

Name Index

Tjaden, Patricia, 30, 172, 190, 194
Tomaskovic-Devey, Donald, 220–221
Tomsen, Steve, 201, 222–224
Tourangeau, Roger, 192
Townsend, Meg, 103
Townsley, Michael, 11
Tracy, Paul E., 91–92
Tremblay, Pierre, 22
Turner, Susan, 230, 272

van Dijk, Jan, 190
Van Kirk, Marvin, 6
Van Winkle, Barrik, 35, 40, 42–43, 166
Vazquez, Salvador, 102
Voas, Robert V., 215
von Kammen, Welmoet, 106–107

Walker, Jeffrey, 184
Walker, Samuel, 19
Ward, V. M., 195–196
Waring, Elin J., 133–135
Warren, Patricia Y., 220–221
Weiler, Barbara Luntz, 33, 70, 187
Weisburd, David, 123
Weisel, Deborah Lamm, 103, 172, 190, 197, 274, 275
Weiss, Alexander, 231
Weiss, Carol H., 262
West, Donald, 94, 96
Whitt, Hugh, 236
Widom, Cathy Spatz, 33, 39, 52, 70, 94, 125, 126, 187, 235–236
Williams, Terry, 29
Wilson, James Q., 97, 107, 113, 117
Winner, Lawrence, 231
Wolfgang, Marvin E., 66–67, 91–92, 249
Woodward, Lianne, 37
Wooten, Harold B., 262
Wright, Cynthia Pfaff, 220–221
Wright, Doug, 188
Wright, James, 70
Wright, Richard T., 19, 166, 209–212, 225, 226–227, 233

Yin, Robert, 133, 134, 135

Zanin, Nicholas, 60
Zawitz, Marianne, 306–307
Zimbardo, Philip, 43–45
Zimring, Frank E., 58–59
Zingraff, Matthew T., 220–221

Scarcy, Shelby, 184
Schaum, Robin J., 172
Schmitt, Erica L., 102
Schuck, Amie M., 52
Schultze, Phyllis A., 274
Sellin, Thorsten, 66–67
Semaan, Salaam, 164
Shadish, William R., 52, 53, 57, 118, 122, 129, 132–133, 134, 271
Shannon, David M., 184
Shaw, James W., 261, 272
Shea, Christopher, 41
Shearing, Clifford, 202
Sherman, Lawrence W., 8–9, 55, 243, 249, 266, 267, 272, 280
Sieber, Joan E., 37, 41
Silverman, Eli B., 59, 258
Sinauer, Nancy, 231
Singer, Simon I., 91–92
Singleton, Royce A., Jr., 87
Skogan, Wesley G., 107–109, 132, 218, 247–248, 262
Skrobab, Stacy B., 65
Smith, Christine, 256, 261
Smith, Douglas A., 65
Smith, Robert A., 95
Smith, Steven K., 103
Smith, Tom W., 192
Smith, William R., 220–221
Snyder, Howard N., 102
Spohn, Cassia, 163, 248
Spriggs, Angela, 125, 127–128, 163, 273
Stairs, Jayne, 163
Stanley, Julian, 123
Steadman, Greg W., 103
Stecher, Brian M., 261
Steiner, Lynn, 218
Stenning, Phillip, 202
Stohr, Mary K., 102
Stolzenberg, Lisa, 64
Stouthamer-Loeber, Magda, 106–107
Straits, Bruce C., 87
Straits, Margaret Miller, 87
Straus, Murray A., 194
Strauss, Anselm, 11, 201
Sundt, Jody L., 255
Surette, Ray, 247
Sutton, Mike, 185

Taxman, Faye S., 65
Taylor, Ralph, 216
Terry, W. Clinton, 64
Thoennes, Nancy, 30, 172, 190, 194
Thommeny, Jennifer, 201, 222–224
Thompson, Kimberly, 246
Tilley, Nick, 60, 134, 264, 280, 281

Najaka, Stacy S., 231
Nellis, Mike, 284
Newlyn, Andrea K., 262
Newman, Oscar, 203
Nicholas, Sian, 160
Nienstedt, Barbara C., 129–130, 236, 243
Nolan, James J., 102, 196

Ohlin, Lloyd, 113, 117
Ormrod, Richard, 102

Painter, Kate, 164, 203, 213
Partidge, Sarah, 187
Pate, Tony, 5
Patton, Michael Quinn, 194–195, 205, 213–214
Pawson, Ray, 60, 134, 264, 280
Pease, Ken, 238
Percy, Andrew, 187, 193
Perkins, Craig, 240
Perrone, Dina, 28–29, 204
Pettiway, Leon, 165
Piehl, Anne Morrison, 133–135
Piper, Lanny, 188
Plouffe, Nanci, 275–276
Poklemba, John J., 242–243
Pollock, Jocelyn, 27
Posavec, Emil, 282
Pudney, Stephen, 58
Purkiss, Marcus, 102

Quade, E. S., 273

Ragin, Charles, 133
Ramsay, Malcolm, 187
Rand, Michael R., 102, 160
Raskinski, Kenneth, 175
Ratcliffe, Jerry, 274, 277
Raudenbush, Stephen, 107–108, 214
Rennison, Callie M., 102
Reuter, Peter, 29–30
Reynolds, Paul, 37–38, 45
Rich, Thomas, 277
Ritti, Richard, 244
Roberts, James, 201, 215
Roberts, Jennifer, 171, 177
Robins, Lee, 70
Rogan, Dennis P., 55, 272
Rosenbaum, Dennis, 203
Rosenfeld, Richard, 58–59, 233, 246–247
Rossi, Peter H., 257, 260, 268, 271–272, 282
Rossman, Gretchen, 203

Sampson, Rana, 255, 275–276
Sampson, Robert J., 70, 107–108, 214, 231, 248

Loeber, Rolf, 106–107
Loftin, Colin, 248, 249
Lott, Alan, 184
Ludwig, Jens, 55
Luke, Julian V., 154, 190
Lynch, James, 240

MacCoun, Robert, 30
Macintyre, Stuart, 201
MacKenzie, Doris Layton, 65, 261
Maher, Lisa, 176
Maltz, Michael, 295, 306–307, 310–311
Marshall, Catherine, 203
Marx, Karl, 170
Mason, H. Marcinda, 220–221
Mastrofski, Stephen, 216, 244
Matz, David, 177
Maxfield, Michael G., 4–5, 38, 88–89, 94, 160, 164–165, 181, 185, 193, 206, 221–222, 226, 236, 238, 243–244, 247, 262–263, 270–271, 282, 284–285
Maxwell, Christopher D., 249
Maxwell, Joseph, 11, 52, 53, 225
Mayhew, Pat, 185, 193, 272–273, 301–302
McCall, George J., 166, 202–203, 204, 205
McCleary, Richard, 129–130, 236, 240–241, 243
McDonald, Douglas C., 256, 261
McGarrell, Edmund F., 55
McLaughlin, Colleen, 280
McLeery, Alison, 189
McVicker, Steve, 93
Mele, Marie, 237–239, 242
Mendelsohn, Robert I., 4–5, 269–271, 273
Merritt, Nancy, 230, 272
Messner, Steven F., 102
Mieczkowski, Thomas, 29, 104
Milgram, Stanley, 45
Miller, Jody, 165
Minton, Todd D., 103
Mirrlees-Black, Catriona, 30, 187–188, 193, 224, 225
Mitford, Jessica, 40
Monahan, John, 30
Morelock, Suzette, 95
Mott, Joy, 187–188
Mulvey, E. P., 171, 177
Murphy, Dean E., 82
Murphy, Patrick, 30

California Healthy Kids Survey questionnaires, 181
Call-in polls, 144
Caller ID technology, 126
Calls-for-service (CFS) records, 243
Cameras
    for field research observations, 214–215
    and prevention experiment design, 127–128
Capacity for informed consent, 39
Car theft. See Auto theft
Case flow, adequacy of, 267–268
Case-oriented research, 133
Case study research, 133
    problem-oriented policing and, 275
    validity and, 134–135
CASI (computer-assisted self-interviewing), 187–188
CAT (computer-assisted telephone interviewing), 191
Causality, 7
Causation, 51–53. See also Validity
    and case study design, 135
    criteria for, 52–53
    drug use and, 57–60
    experiments and, 117–123
    necessary and sufficient causes, 53, 54
    New York, declining crime in, 58–59
    as probabilistic, 52
    scientific realism, 60–61
    summary of, 57
CCTV experiment design, 127–128
Cell phones and telephone surveys, 190
Census Bureau. See also NCVS (National Crime Victimization Survey)
    community victimization surveys, 103
    National Jail Census, 171
    published statistics, 232
    telephones, households with, 189
Centers for Disease Control and Prevention (CDC) questionnaires, 181
Central tendency measures. See Statistical analysis

    sampling bias, 143–144
    selection biases, 119–120
    snowball sampling and, 166
    and unsuccessful studies, 33–34
Bigotry, 15
Binomial variable, 149
Biomedical research, 28
    National Commission for the Protection of Human Subjects of Biomedical and Behavioral Research, 37
Bivariate analysis. See Statistical analysis
Blog polls, 144
Booster samples, 161
Boot camps, corrections, 281
Boston Gun Project, 133–135
British Crime Survey (BCS), 156–157
    computer-assisted self-interviewing (CASI), 187–188
    deficiencies of, 193
    and domestic violence, 238–239
    lifestyles and crime, 301–303
    question design in, 176
    sampling illustration, 161–162
“Broken Windows” (Wilson & Kelling), 107
Budget, proposal addressing, 77
Bureau of Justice Assistance
    environmental surveys guidelines, 216–217
Bureau of Justice Statistics (BJS). See also (COPS) Community Oriented Policing Services
    community victim surveys, 172
    corrections data, cautions on use of, 240
    on mandatory reporting requirements, 37
    published statistics from, 232–233
    questionnaires, copies of, 181
    records of, 230
    secondary data from, 248
CAI (computer-assisted interviewing), 187–189

American Bar Association Code of Professional Responsibility, 38
American Psychological Association code of ethics, 37
American Society of Criminology (ASC) code of ethics, 38
American Sociological Association code of ethics, 37–38
Analysis. See also Content analysis; Problem analysis; Statistical analysis; Units of analysis
    designing research and, 75
    ethics and, 33–34
    research proposal addressing, 77
Analytic mapping, 277
Anonymity of subjects, 32
Applied research, 20
    agency records in, 231
    political context of, 282–285
    scientific realism and, 280–281
Assertions, support for, 6
Attitudes and surveys, 172
Attributes, 15–17
Attrition, 120
Authority and human inquiry, 7–8
Auto theft
    crime mapping and, 277–279
    problem-oriented policing example, 275–276
    situational crime prevention and, 280–281
Available subjects, reliance on, 164–165
Averages, 289–291

Bars and violence, field research on, 222–224
Base, determination of, 289
Beliefs and human inquiry, 10
Bell-shaped curve, 150
The Belmont Report, 37
Beneficence as ethical principle, 37
Bias
    experiments and, 118
    human inquiry and, 10
    in questions and questionnaires, 175

Academic freedom, IRBs and, 41–42
Academy of Criminal Justice Sciences (ACJS) code of ethics, 38
Accuracy
    cluster sampling and, 157–158
    reliability and, 93
Accuracy in measurement, 92–93
Acquaintance Rape of College Students (Sampson), 255
Administrative Office of U.S. Courts published statistics, 232
Agency records, 230
    data collection procedures, 237–238
    decision making and, 236
    discretionary actions, effect of, 241
    duplication of data in, 243–244
    hybrid source, data from, 236–237
    new data collected by staff, 236–239
    nonpublic agency records, 234–236
    people-tracking in, 241–242
    published statistics, 232–234
    quality of records, 241–243
    reliability of, 239–244
    short term variations and, 236
    silo databases, 242–243
    social production of data, 240–241
    as summary data, 233
    topics for, 230–232
    types of, 232–239
    validity of, 239–244
    volume and errors, 243–244
Aggregates, 13–14
Agreement reality, 4–6
Alcohol use
    bars and violence, field research on, 223–224
    disorder and public drinking, 108
American Association of University Professors, 41

Subject Index

Cross-sectional studies, 66

Data. See also Agency records; Content analysis; Secondary analysis; Summary data
    analysis, 12
    published statistics, 232–234
    qualitative data, 23–24
    quantitative data, 23–24
Data collection, 12
    research proposal defining methods, 77
Deception of subjects, 33
    gang members study, 42–43
    informed consent and, 39
Deductive reasoning, 22–23
Delinquency
    NCVS (National Crime Victimization Survey) and, 102
    typology of, 106–107
Delivery of program, measuring, 264–265
Demographics and New York City reduced crime rate, 59
Dependent variables. See Variables
Descriptive research, 19
Descriptive statistics, 288
Designing research, 70–76, 115
    analysis in, 75
    application of conclusions, 75
    background research, 73
    conceptualization, 73
    defining terms, 73
    methods, choosing, 74
    observations in, 75
    operationalization, 74
    population, defining, 74
    process and, 71–72
    review of, 75–76
    sample, defining, 74
“Deviating from the Mean” (Maltz), 295
Dimensions. See also Time dimension
    conceptualization and, 83–84
Discriminant validity, 95
Disorder, index of, 107–108
Dispersion measures. See Statistical analysis
Displacement, ethics of, 36
Distributions. See Statistical analysis
DNA evidence, reliability of, 93

    coding in, 244–245
    defined, 244
    gang-related homicides, classifying, 246–247
    illustrations of, 246–247
    latent-content coding, 245
    manifest-content coding, 245
    reliability of coding, 245
    topics for, 230–232
    of video game violence, 246
Contexts of program, measuring, 264
Contingency questions, 177–178
Contingency tables, 300–301
Control beats, 5
Control groups, 115–116
Controlled probability sampling, 213
Convenience, sample of, 143
(COPS) Community Oriented Policing Services, 103
    community victim surveys, 172
    content analysis and, 244
    problem-oriented policing guides, 274
    stakeholders, education of, 282
Correctional Populations in the United States (BJS), 233, 234
Corrections boot camps, 281
Correlation and causation, 52
Court experience, typology of, 106
Cover letters for self-administered questionnaires, 182–183
Creaming, 120, 259
Crime Bill of 1994, 103
Crime calendars, 177
Crime mapping, 274, 276–280
    computerized crime mapping, 277
    ethical issues of software, 30
Crime prevention programs, 35–36
Crime seriousness, 82–83
    conceptualization of, 86–87
    operationalization of, 87
Criminal Justice Policy Council (CJPC), 283
Criminal Victimization in the United States, 232
Criterion-related validity, 95

Computer-assisted self-interviewing (CASI), 187–188
Computer-assisted telephone interviewing (CAT), 191
Computers. See also Internet
    CAI (computer-assisted interviewing), 187–189
    CASI (computer-assisted self-interviewing), 187–188
    CAT (computer-assisted telephone interviewing), 191
    crime mapping and, 277
    self-administered questionnaires, computer-based, 184–185
    self-reported items and, 177
    telephone interviews, computer-assisted, 190–191
    viruses in questionnaires, 185
Conceptions and concepts
    creating conceptual order, 84–86
    defined, 81–86
    definitions, 84–86
    of disorder, 107–108
Conceptualization, 97
    content analysis coding and, 244–245
    creating conceptual order, 84–86
    of crime seriousness, 86–87
    and designing research, 73
    and dimensions, 83–84
    indicators, specification of, 83–84
    process of, 83
Conclusions
    application of, 75
    statistical conclusion validity, 53–55
Confidence interval, 152
Confidence levels, 151–152
Confidence sampling, 164–165
Confidentiality of subjects, 32–33
    Federal Certificate of Confidentiality, 34
Consent. See Informed consent
Construct validity, 55–57, 95–96
    experiments and, 121–122
    validity and, 134
Content analysis, 230

Chi square comparisons, 307–309
Chicago Beat Meeting Observation Form, 218
Chicago Community Policing Evaluation Consortium, 172–173
Child abuse
    mandatory reporting requirements, 36–37
    nonequivalent-groups experimental design, 126
    nonpublic agency records on, 235–236
    retrospective research on, 68–69
City-level surveys, 103
Classical experiment. See also Experiments
    defined, 113–114
Closed-circuit television (CCTV) cameras experiment design, 127–128
Closed-ended questions, 173–174
Cluster sampling, 157–160
    inferential statistics and, 304
    with stratification, 158–160
Code of Federal Regulations, Title 45, Chapter 4.6, 38
Codes of professional ethics, 37–38
Coding, 94
    in content analysis, 244–245
Cohort studies, 66–67
    as nonequivalent-group design, 128
Community Oriented Policing Services. See (COPS) Community Oriented Policing Services
Community victimization surveys, 103, 172. See also NCVS (National Crime Victimization Survey)
Comparison groups, 125
    in child abuse research, 126
    in cohort designs, 128
    in obscene phone calls research, 126
Complete observer, 205
Composite measures, 105–109
CompuStat, 255
Computer-assisted interviewing (CAI), 187–189

    treatment integrity, maintaining, 268
    units of analysis in, 265
Evidence-based policy, 255
Exceptions
    regularities and, 13
    and rules, 10
Exclusive measurement, 88–89
Exhaustive measurement, 88–89
Experiential realities, 4–6
Experimental mortality, 120
Experiments, 113. See also Quasi-experimental designs; Testing; Time-series designs
    attrition, 120
    avoiding internal validity threats, 120–121
    basic design of, 115
    and causal inference, 117–123
    cohort designs, 128
    comparison groups, 125
    construct validity and, 121–122
    control groups, 115–116
    double-blind experiments, 116–117
    experimental mortality, 120
    external validity, threats to, 122
    groups, 115–116
        nonequivalent-groups designs, 125–138
        variations in, 124
    history and, 118, 120
    internal validity, threats to, 118–123
    maturation and, 118–119, 120
    measurements and, 119, 120
    nonequivalent-groups designs, 125–138
    pretesting and posttesting, 114–115, 120–121
    randomization and, 120–121
    selection biases, 119–120
    statistical conclusion validity threats, 122–123
    statistical regression, 119
    summary of, 135–136
    time dimension, ambiguity in, 120
    and validity threats, 117–123
    variable-oriented research, 133–135

Ethnicity and booster samples, 161
Evaluability assessment, 260–261
Evaluation research. See also Problem analysis
    agency acceptance, 267
    case flow, adequacy of, 267–268
    context of program, measuring, 264
    creaming, 259
    defined, 255, 257
    delivery of program, measuring, 264–265
    designs for, 266–273
    development of, 260
    dysfunctional attitudes about, 282
    evaluability assessment, 260–261
    goals in, 260, 261–262
    home detention randomized study, 269–271
    illustration of randomized study, 269–271
    impact assessment in, 259
    limits of randomized designs, 268–269
    linking process to, 257–259
    measurement in, 263–266
    micromodel in, 262
    midstream changes and treatment integrity, 268
    minimizing exceptions to random assignment, 267
    nonequivalent-groups designs, 271–272
    outcomes, specifying, 263–264
    policy process, 256–259
    problem formulation in, 261–263
    process evaluation in, 259, 273
    and public policy, 283–284
    quasi-experimental designs, 271–273
    questions for, 261
    randomized evaluation designs, 266–271
    secondary data for, 249
    and stakeholders, 263, 282–284
    target population for, 265
    time-series designs in, 272–273
    topics for, 255–259

Errors. See also Sampling error
    agency records, volume and, 243–244
    in human inquiry, 8–10
    non-sampling errors, 304–305
    in social science, 10
    standard error, 150–151, 304–305
Essential nature definition, 84–86
Ethical issues
    in analysis of findings, 33–34
    anonymity of subjects, 32
    and applied research, 282
    The Belmont Report, 37
    codes of professional ethics, 37–38
    compliance, promoting, 37–42
    confidentiality of subjects, 32–33
    controversies involving, 42–46
    crime-mapping software, 30
    deception of subjects, 33
    definition of, 27
    and displacement, 36
    emotional trauma of participants, avoiding, 30
    and extreme field research, 28–29
    of field research, 204
    gang members, study of, 42–43
    harm to participants, avoiding, 27–31
    informed consent, 39
    institutional review boards (IRBs), 38–42
    legal liability, 34–35
    mandatory reporting requirements, 36–37
    and participant observation, 204
    promotion of crime by research, 35–36
    recognition of, 27
    in reporting findings, 33–34
    and special populations, 39–41
    staff misbehavior, 35
    Stanford Prison Experiment, 42–45
    substantive ethical issues, 27
    voluntary participation requirements, 31–32
    withholding desirable treatments, 36

Domestic violence
    arrest and, 8–9
    field research on, 224–225
    police records, improving, 237–239
    surveys and, 193
Double-blind experiments, 116–117
    randomization in, 117
    subjects for, 116–117
Drug Abuse Resistance Education (D.A.R.E.), 281
Drug Abuse Warning Network (DAWN), 233
Drug use
    causation and, 57–60
    field research on dealers, 212
    monitoring future, 104
    National Survey on Drug Use and Health (NSDUH), 104
    self-reported use, measuring, 177
    snowball sampling and, 165
    victimization surveys and, 103
Drug Use Forecasting (DUF), 233

E-mail of self-administered questionnaires, 184–185
Ecological fallacy, 63
Electronic formats, published data in, 234
Electronic monitoring (ELMO), 4–5
    political objections to, 284
    randomized study of, 269–271
Electronic surveys, 184–185
Embarrassment of participants, 28
Emotional trauma of participants, avoiding, 30
Empirical research, 6
    interpretation in, 24
    measurement in, 24
Empirical support, 6
Employment status, measurement of, 88
Environmental surveys, 216–217
    in auto theft illustration, 275–276
    safety audits, 220–221
Epistemology, 6
Equal probability of selection method (EPSEM) samples, 144–145


  and inaccurate observation, 8
  and overgeneralization, 8–9
  political bias and, 10
  replication of inquiry, 9
  and selective observation, 9–10
  tradition and, 7
Hypotheses, 11
  in case study research, 135
  null hypothesis, 307–308
  testing, 13
Ideographic explanation, 21–22
  criteria for assessing, 52
  scientific realism and, 60
Ideological bias, 10
Illogical reasoning, 10
Immunity of confidential information, 34
Impact assessment, 259
In-person interviews. See Interviews
Inaccurate observation, 8
Incident-based measures, 100
Independent variables. See Variables
Index of disorder, 107–109
Indicators and conceptualization, 83–84
Individualistic fallacy, 63
Individuals
  behavior of, 13–14
  identification of, 14–15
  as sampling frames, 153
  as units of analysis, 61
Inductive reasoning, 22–23
Inferences from longitudinal studies, 67
Inferential statistics, 288, 303–311
  cautions in interpreting, 309–311
  chi square comparisons, 307–309
  definition of statistical significance, 305
  interval estimates, 306–307
  point estimates, 306–307
  substantive significance, 309–310
  tests of statistical significance, 305–306
  univariate inferences, 304–305
  visualizing statistical significance, 306–307
Informal conversational interviews, 205

  ethical issues and juvenile members, 42–43
  National Youth Gang Survey, 64–65
Gender
  measurement and, 88
  stratified sampling and, 156
General Social Survey, 172, 296–297
  biased items in, 175
Generalizability
  of field research, 226–227
  validity threats and, 121–123
Generalizations from focus groups, 195–196
GIS and Crime Mapping (Chainey & Ratcliffe), 274
Glossary, 313–320
Goals in evaluation research, 260, 261–262
Government Accounting Office’s content analysis guide, 247
Graffiti, study of, 203
Grounded theory, 11, 201
Groups. See also Experiments
  comparison groups, 125
  control groups, 115–116
  as units of analysis, 61
Harm to participants, 27–31
  potential for harm, determining, 30
History, experiments and, 118, 120
Home detention. See Electronic monitoring (ELMO)
Home Office Research Bulletin, 224–225
Homeless persons and surveys, 193
Homicide
  collection procedures, changes in, 240
  content analysis of gang-related homicides, 246–247
  in New York City, 59
  published statistics on, 233–234
  rates, computing, 295–296, 298–299
  Supplementary Homicide Reports (SHR), 100
Human inquiry, 6–8
  authority and, 7–8
  errors in, 8–10
  ideological bias and, 10
  and illogical reasoning, 10

  formal organizations, access to, 207–210
  generalizability of, 226–227
  guide for interviews, 206
  illustrations of, 219–224
  letters of introduction, 208–209
  linking observations with other data, 217–219
  meeting with organizations for, 209–210
  natural settings and, 202
  New Jersey State Police study, 221–222, 226
  photographs of observations, 214–215
  purposive sampling in, 212–214
  questions in, 205–207
  recording observations, 214–219
  reliability of, 225–226
  roles of observer, 203–205
  snowball sampling, 165–166, 210–211
  sponsors in formal organizations, 207–208
  strengths and weaknesses of, 224–227
  structured observations, 216–217
  subcultures, access to, 210
  telephone calls to organizations for, 209
  topics for, 202–203
  on traffic enforcement, 219–222
  unstructured interviews in, 205
  validity and, 122, 224–225
  voice recorders, use of, 214–215
  voluntary participation requirement, 31–32
Fishing holes, 226
Focus group surveys, 195–196
Forecasting and agency records, 231
Frequency distributions, 288–289
Frequency of offending, 171
FTC (Federal Trade Commission) telemarketing restrictions, 190
Gambler’s fallacy, 10
Gangs
  content analysis of gang-related homicides, 246–247

  variables, examination of, 114
  variations in classical design, 123–124
Explanation. See Ideographic explanation; Nomothetic explanation
Explanatory research, 19–20
  agency records in, 231
  causation and, 51–53
  on perception of crime, 172
  validity threats in, 122
Exploratory research, 18–19
External validity, 55
  case study research and, 135
  of drug and crime research, 58–60
  experiments, threats to, 122
Extreme field research, 28–29
Face validity, 95
Facts for policy makers, 283
Faxed questionnaires, 184
FBI (Federal Bureau of Investigation)
  descriptive studies by, 19
  hierarchy rule, 99
  NIBRS (National Incident-Based Reporting System), 100–102
  published statistics, 232
Federal Bureau of Prisons published statistics, 232
Federal Child Abuse Prevention and Treatment Act of 1974, 36–37
Field notes, use of, 215–216
Field research, 201–202
  access to subjects, gaining, 207–214
  audiotape recordings for, 214–215
  on bars and violence, 222–224
  cameras for recording observations, 214–215
  cases for observation, selecting, 210–212
  Chicago Beat Meeting Observation Form, 218
  costs of, 224
  designing research with, 74
  environmental surveys, 216–217
  ethical issues, 204
  extreme field research, ethics of, 28–29
  field notes, use of, 215–216


  ratio measures, 90–91
  of recidivism, 84–85
  reliability of, 93–94
  research proposal defining, 77
  as scoring, 87–88
  statistical conclusion validity threats and, 123
  summary-based measure of crime, 100
  summary of, 104–105, 109
  test-retest method, 93–94
  typologies, 106–107
  UCR (Uniform Crime Report) and, 99–100
  validity of, 94–96
  of victimless crime, 98
Measurements Group, questionnaires from, 181
Media, Crime, and Justice: Images, Realities, and Policies (Surette), 247
Median, 289–291
Meetings in field research, 209–210
Member checks, 134
Mental capacity for informed consent, 39
Methodology, 6
Micromodel in evaluation research, 262
Mode, 289–291
MTF (Monitoring the Future: A Continuing Study of the Lifestyles and Values of Youth), 104–105
  self-administered questionnaires from, 182
Multiple measures, comparing, 96
Multistage cluster sampling, 157
Multivariate analysis. See Statistical analysis
Murder. See Homicide
Mutually exclusive measurement, 88–89
National Academy of Sciences ethics booklet, 38
National Archive of Criminal Justice Data (NACJD), 249
National Commission for the Protection of Human Subjects of Biomedical and Behavioral Research, 37
National Crime Victimization Survey (NCVS). See NCVS (National Crime Victimization Survey)

Mail surveys
  responsibility for conducting, 197
  self-administered questionnaires, 182
Mandatory reporting requirements, 36–37
Manifest-content coding, 245
Market acceptability testing, 195
Market research, surveys and, 195
Matrix questions, 178–180
Maturation and experiments, 118–119, 120
Mean, 289–291
Measurement. See also Operationalization; Statistical analysis
  agency records and, 236
  attentiveness to process, 197
  coding, 94
  composite measures, 105–109
  construct validity and, 95–96, 122
  crime, approaches to measuring, 97–105
  criterion-related validity, 95
  defined, 87
  discriminant validity, 95
  of disorder, 107–109
  empirical research and, 24
  in evaluation research, 263–266
  exclusive measurement, 88–89
  exhaustive measurement, 88–89
  experiments and, 119, 120
  face validity, 95
  implication of levels, 91–92
  incident-based measures, 100
  interrater reliability, 94
  interval measures, 90
  levels of, 89–91
  multiple measures, comparing, 96
  NCVS (National Crime Victimization Survey) and, 102
  NIBRS (National Incident-Based Reporting System), 100–102
  nominal measures, 89–90
  ordinal measures, 90
  police, crimes known to, 98–102
  quality, criteria for, 92–96

Jail stay concept, 88–89
Jill Dando Institute of Crime Science, 274
Johns Hopkins University Medical School, Journal of Negative Observations in Genetic Oncology (NOGO), 34
Johnson, Lyndon, 5
Journal of Negative Observations in Genetic Oncology (NOGO), 34
Junk e-mail, 184
Junk phone calls, 190
Justice as ethical principle, 37
Juveniles. See also Delinquency; Gangs
  electronic monitoring (ELMO) and, 4–5
  informed consent and, 39–40
  MTF (Monitoring the Future: A Continuing Study of the Lifestyles and Values of Youth), 104
Kansas City Preventive Patrol Experiment, 5
  and construct validity, 56
Language and informed consent, 39
Large-claim interventions, 281
Latent-content coding, 245
Lawyers and field research access, 210
Legal liability, 34–35
Letters
  cover letters for self-administered questionnaires, 182–183
  field research letters of introduction, 208–209
Level of significance, defined, 305
Lifestyles and crime, 301
Likert scale, 174
  matrix questions, 178
Literature review in research proposal, 77
Logic. See Reasoning and logic
Logical support, 6
Longitudinal studies, 66–67
  collection methods, changes in, 240
  logical inferences from, 67
  retrospective studies, 67–70
  time-series designs, 128–133

Informants for field research, 210
Informed consent, 39
  example statement, 40
Inquiry. See Human inquiry; Social science
Institutional review boards (IRBs), 38–42
  rights of researchers and, 41–42
  special populations, 39–41
Institutions. See also Institutional review boards (IRBs)
  as units of analysis, 62
Internal validity, 55
  case study research and, 134–135
  experiments and threats to, 118
  regression and, 58
Internet
  secondary data, websites for, 249
  surveys, 184–185
  weblog polls, 144
Interpretation
  empirical research and, 24
  focus groups and, 196
Interrater reliability, 94
Interrupted time-series designs, 129, 132
Interuniversity Consortium for Political and Social Research (ICPSR), 248–249
Interval measures, 90
Interventions and situational crime prevention, 280–281
Interviews. See also Telephone interviews
  comparison of survey methods, 191–192
  computer-assisted interviewing (CAI), 187–189
  coordination and control of, 186–187
  familiarity with questionnaire, 186
  informal conversational interviews, 205
  probing for responses, 186
  role of interviewer, 185–186
  snowball sampling, 165–166
  specialized interviewing, 194–195
  training of interviewers, 186–187


National Criminal Justice Reference Service (NCJRS), 3
National Household Survey on Drug Abuse, 188
National Institute of Justice (NIJ), 9
  secondary data from, 248
  stakeholders, education of, 282
  topics awarded funds from, 284
National Institute on Drug Abuse, 104
National Jail Census, 171
National Organization for the Reform of Marijuana Laws (NORML), 82
National Research Act, 37
National Research Council, 93
National Survey on Drug Use and Health (NSDUH), 104, 105
National Violence Against Women Survey, 172
National Youth Gang Survey, 64–65
National Youth Survey (NYS), 248
Natural field experiments, 126
NCVS (National Crime Victimization Survey), 30, 66, 102–103, 105
  collection procedures, changes in, 240
  content analysis and, 231
  contingency questions in, 178, 179
  deficiencies of, 193
  importance of, 192
  as panel study, 67
  primary sampling units (PSUs), 160–161
  published statistics from, 232
  sampling illustration, 160–161
  as targeted victim survey, 172
  telephone interviews in, 190
Necessary causes, 53, 54
Negative findings, reporting, 34
New Jersey State Police study, 221–222, 226
New York, declining crime in, 58–59
NIBRS (National Incident-Based Reporting System), 100–102, 105
  state or regional patterns in, 280

NIMBY (not in my backyard!) responses, 195
Nominal measures, 89–90
Nomothetic explanation, 21–22
  criteria for assessing, 52
  scientific realism and, 60
Non-sampling errors, 304–305
Nonequivalent-groups designs, 125–138
  in evaluation research, 271–272
Nonprobability sampling. See Sampling
Nonpublic agency records, 234–236
Normal curve, 150
Norms, 13
  of voluntary participation, 31
Notes in field research, 215–216
Null hypothesis, 307–308
Objectivity and politics, 284–285
Obscene phone call study, 126
Observation. See also Field research
  designing research and, 75
  inaccurate observation, 8
  as qualitative data, 23
  selective observation, 9–10
  social science and, 11
  time-series designs, 128–133
  units of, 61
Observer-as-participant, 204–205
Offenders. See also Prisoners; Self-reports
  as field research contact, 210
  surveys, 103–104
Office of Justice Programs, 34
Office of Juvenile Justice and Delinquency Prevention (OJJDP)
  National Youth Gang Survey, 64–65
Open-ended questions, 173–174
Operational definitions, 84–86
Operationalization, 74, 86–92. See also Measurement
  content analysis coding and, 244–245
  of crime seriousness, 87
  victim surveys, 102–103
Ordinal measures, 90
Organizations
  field research subjects, access as, 207–210
  as sampling frames, 153
  as units of analysis, 62
Overgeneralization, 8–9

Panel studies, 67
Parameter estimates, 149–150
Parole violators, records on, 242–243
Parsing, 275
Participant-as-observer, 204
Participant observation, 203–204
Patterns
  aggregate patterns, 13–14
  and human inquiry, 6–7
  of regularity, 13
Percentage-down convention, 297–300
Percentages, distributions by, 289
Perceptions and surveys, 172
Periodicity, 155
Personal interviews. See Interviews
Photographs for field research observations, 214–215
Physical disorder, 108
Placebos, 116
Police Foundation, 5
  domestic violence and arrest study, 8–9
Police patrols, 5
Policing Domestic Violence: Experiments and Dilemmas (Sherman), 266
Policy analysis
  applied research and, 20
  surveys and, 173
Policy impacts/outputs, 246–247
Policy process, 256–257
  and problem analysis, 258
Politics
  applied research and, 282–285
  and bias, 10
  objectivity and, 284–285
Population. See also Sampling
  defined, 145
  designing research and, 74
  ethical issues with special populations, 39–41
  probability sampling, 141–143
  sampling frames and, 153–154
Postcode Address File (PAF) lists, 161–162

Posttesting, 120–121
  and experiments, 114–115
  posttest-only designs, 124
Precision in measurement, 92–93
President’s Commission on Law Enforcement and Administration of Justice, 5
Pretesting, 120–121
  content analysis coding scheme, 245
  and experiments, 114–115
  measurement and, 124
  and purposive sampling, 163
Prevalence of offending, 171
Primary sampling units (PSUs), 160–161
Prisoners
  ethical issues and, 40–41
  female prisoners, number of, 223
  Stanford Prison Experiment, 42–45
Private investigators and field research, 210
Proactive beats, 5
Probability, 7
  causation and, 52
Probability theory. See also Sampling
  confidence interval, 152
  confidence levels, 151–152
  sampling error, estimating, 150–151
  summary of, 152–153
Probing for responses, 186
Problem analysis, 249. See also Crime mapping
  applied research and, 20, 280–281
  guides for, 274–275
  policy process and, 258
  problem-oriented policing, 274–280
  and scientific realism, 273–281
  space-based analysis, 276–277, 280
  teams, development of, 276
  time-based analysis, 273, 276–277, 280
  topics for, 255–259
Problem formulation in evaluation research, 261–263
Problem guides, 275


Problem-Oriented Guides series, 255
Problem-oriented policing, 274–280
  auto theft illustration, 275–276
  case studies on, 275
  guides to, 274–275
Problem solving, 274
Process evaluations, 273
Professional survey firms, 197–198
Program evaluation. See Evaluation research
Proposals. See Research proposals
Prospective studies, 68–70
Prostitution, victimization surveys and, 103
Public drinking, 108
Published statistics, 232–234
Purposive sampling, 162–163
  in field research, 212–214
Qualitative data, 23–24
  case study design and, 133
  field research and, 201
Quality in measurement, 92–96
Quantitative data, 23–24
Quasi-experimental designs, 124–135. See also Case study research; Time-series designs
  in evaluation research, 271–273
  summary of, 135–136
Questions and questionnaires. See also Interviews; Self-administered questionnaires
  biased items and terms in, 175
  clarity of items in, 174–175
  closed-ended questions, 173–174
  construction of questionnaires, 177–181
  contingency questions, 177–178
  demographic questions, ordering of, 180–181
  in evaluation research, 261
  in field research, 205–207
  focus groups and, 196
  format of questionnaires, 177
  guidelines for, 173–177
  matrix questions, 178–180
  negative items, avoiding, 174–175
  open-ended questions, 173–174
  ordering items in questionnaire, 180–181
  self-report items, designing, 175–177
  short items, use of, 174
  social desirability of questions, 175
  sources of existing questionnaires, 181
  standardized questionnaires, use of, 192–193
  statements in questionnaires, 174
Quota sampling, 163–164

Random-digit dialing (RDD), 190
Random sampling, 154
  inferential statistics and, 304
Random selection methods, 145
Randomization, 120–121
  in double-blind experiments, 117
  evaluation research designs, 266–271
Range, 291–293
Rape shield laws, 163
Rates, computing, 295–296, 298–299
Ratio measures, 90–91
Reactive beats, 5
Real definitions, 84–86
Reasoning and logic
  deductive reasoning, 22–23
  ideographic explanation, 21–22
  illogical reasoning, 10
  inductive reasoning, 22–23
  nomothetic explanation, 21–22
  of probability sampling, 141–143
  social science and, 11
Recidivism, 81
  defined, 84–85
Recording. See also Agency records
  field research observations, 214–219
References in research proposal, 77
Regression. See Statistical regression
Regression to the mean, 119
Regularity, patterns of, 13
Relationships and variables, 17–18
Reliability
  of agency records, 239–244
  of content analysis coding, 245
  and field research, 225–226
  interrater reliability, 94
  of measurement, 93–94
  and nonpublic agency records, 236
  surveys and, 193
  test-retest method, 93–94
  validity to reliability analogy, 96
Reliance on available subjects, sampling by, 164–165
Repeat victimization, 238–239
Replication of inquiry, 9
Reporting
  ethical issues in, 33–34
  mandatory reporting requirements, 36–37
  organizing research reports, 73
Representativeness of sample, 144–145
Research
  applied research, 20
  avenues for inquiry, 20–24
  descriptive research, 19
  explanatory research, 19–20
  exploratory research, 18–19
  legal liability, 34–35
  process, 71–72
  purposes of, 18–20
Research proposals, 76–78
  elements of, 76
Respect as ethical principle, 37
Response rate on self-administered questionnaires, 183–184
Retrospective studies, 67–70
Rival hypotheses, 135
Rules, 13

Safety audits, 220–221
Sample element, 145
Sample statistics, 145
Sampling. See also Field research; Snowball sampling; Stratified sampling
  available subjects, reliance on, 164–165
  bias, 143–144
  British Crime Survey example, 161–162
  cluster sampling, 157–160
  confidence sampling, 164–165
  convenience, sample of, 143
  defined, 141
  designs, types of, 74, 154–160
  disproportionate stratified sampling, 156–157
  equal probability of selection method (EPSEM) samples, 144–145
  multistage cluster sampling, 157
  NCVS (National Crime Victimization Survey) example, 160–161
  nonprobability sampling, 141, 162–166
    available subjects, reliance on, 164–165
    purposive sampling, 162–163
    quota sampling, 163–164
    review of, 166
    snowball sampling, 165–166
  parameter estimates, 149–150
  periodicity, 155
  population parameter, 145
  probability sampling, 141–143
  purposive sampling, 162–163, 212–214
  quota sampling, 163–164
  representativeness of sample, 144–145
  review of, 162
  sample element, 145
  sample statistics, 145
  simple random sampling, 154
  standard error, 150–151, 304–305
  stratified sampling
    cluster sampling with, 158–160
    disproportionate stratified sampling, 156–157
  street-corner sampling, 304
  systematic sampling, 154–155
Sampling distribution, 145–149
  summary of, 152–153
Sampling error, 150–151
  and cluster sampling, 158
  inferential statistics and, 304–305


  non-sampling errors, 304–305
  for sampling, 155
Sampling frame, 149
  in field research, 213
  populations of, 153–154
Sampling units, 157
Schedule in research proposal, 77
Science, 1. See also Social science
  role of, 6
  traditional image of, 14
Scientific realism, 60–61
  and applied research, 280–281
  problem analysis and, 273–281
  situational crime prevention, 280–281
  variable-oriented research and, 133–135
Seasonality in time-series design, 129
Secondary analysis, 230, 247–250
  advantages and disadvantages of, 249–250
  sources of secondary data, 248–249
  validity of, 249
Selection biases, 119–120
Selective observation, 9–10
Self-administered questionnaires, 181–185
  acceptable response rates, 183–184
  comparison of survey methods, 191–192
  computer-based self-administration, 184–185
  cover letters for, 182–183
  follow-up mailings, 183
  mail distribution of, 182
  warning mailings, 182–183
Self-reports, 171–172
  intervals, interviewing subjects at, 176
  multiple measures, comparing, 96
  offender surveys, 103–104
  questions, designing, 175–177
Sentencing policies, 283
Seriousness of crime. See Crime seriousness
Shoplifting, field study of, 203
Silo databases, 242–243
Simple random sampling, 154

Situational crime prevention, 280–281
  political objections to, 284
Skewed distributions, 294–295
Small-claim, small-area problem solving, 281
Snowball sampling, 165–166
  in field research, 210–211
  referral chart, 211
Social artifacts as units of analysis, 62–63
Social desirability of questions, 175
Social disorder index, 107–108
Social groups. See Groups
Social production of data, 240–241
Social science
  aggregate patterns and, 13–14
  avenues for inquiry, 20–24
  errors and, 10
  foundations of, 11–20
  theory and, 11–13
Socioeconomic status (SES), defined, 86
Sourcebook of Criminal Justice Statistics (BJS), 232–233
Space-based problem analysis, 276–277, 280
Spam filters, 184
Special populations. See also Juveniles; Prisoners
  ethical issues with, 39–41
Specialized interviewing, 194–195
Speeding, field research on, 219–222
Sponsors in formal organizations, 207–208
Squaring deviations, 291
Staff misbehavior, 35
Stakeholders in evaluation research, 263, 282–284
Standard deviation, 291–293
Standard error, 150–151, 304–305
Stanford Prison Experiment, 42–45
Statements in questionnaires, 174
Statistical analysis. See also Inferential statistics
  base, determination of, 289
  bivariate analysis, 296–301
    defined, 288
    percentage-down convention, 297–300
  central tendency, measures of, 289–291
    dispersion measures compared to, 293–295
  chi square comparisons, 307–309
  dispersion, measures of, 291–293
    central tendency measure compared to, 293–295
  distributions
    comparisons of, 293–295
    skewed distributions, 294–295
    of univariate data, 288–289
  frequency distributions, 288–289
  multivariate analysis, 301–303
    defined, 288
    tables in, 301–303
  percentage-down convention, 297–300
  percentages, distributions by, 289
  rates, computing, 295–296
  skewed distributions, 294–295
  tables
    in bivariate analysis, 297–300
    chi square comparisons, 307–309
    contingency tables, 300–301
    lifestyles and crime, 301–303
    in multivariate analysis, 301–303
    percentaging, 297–300
    review of constructing and reading, 300
  univariate analysis, 288–296
    inferential statistics and, 304–305
Statistical conclusion validity, 53–55, 55
  and drug and crime research, 59
  experiments, threats to, 122–123
Statistical regression, 58
  experiments and, 119
Statistical significance. See Inferential statistics
Statistics
  agency records and, 232–234
  sample statistics, 145
Status offenses, 102

Strategic Approaches to Community Safety Initiatives (SACSI), 276
Stratified sampling, 155–156
  cluster sampling with, 158–160
  disproportionate stratified sampling, 156–157
  inferential statistics and, 304
Street-corner sampling, 304
Street directories, sampling from, 154
Street lighting, evaluation of, 203
Structured observations, 216–217
Subcultures and field research, 210
Subjects. See also Deception of subjects; Ethical issues
  anonymity of, 32
  confidentiality of, 32–33
  for double-blind experiments, 116–117
  evaluation research, adequacy of case flow for, 267–268
  field research subjects, access to, 207–214
  in research proposal, 77
Substance Abuse and Mental Health Services Administration, 104
Substantive significance, 309–310
Sufficient causes, 53, 54
Summary data, 100
  agency records as, 233
Supplementary Homicide Reports (SHR), 100, 105
Surveys. See also Interviews; Questions and questionnaires; Self-administered questionnaires; Self-reports; Telephone interviews
  for applied studies, 172–173
  costs of conducting, 196–197
  counting crime, 171
  designing research with, 74
  electronic surveys, 184–185
  environmental surveys, 216–217
  focus group surveys, 195–196
  offender surveys, 103–104


Surveys (continued)
  on perceptions and attitudes, 172
  professional survey firms, 197–198
  responsibility for conducting, 196–198
  specialized interviewing, 194–195
  strengths and weaknesses of, 192–194
  targeted victim surveys, 172
  topics appropriate to, 171–173
  victim surveys, 102–103
Systematic sampling, 154–155
  inferential statistics and, 304
Tables. See Statistical analysis
Target population in evaluation research, 265
Targeted victim surveys, 172
Tax maps, sampling from, 154
Taxonomy, 106
Telemarketing, 190
Telephone interviews, 189–191
  comparison of survey methods, 191–192
  computer-assisted telephone interviews, 190–191
  problems with, 190
  random-digit dialing (RDD), 190
Telephones. See also Telephone interviews
  directories, sampling from, 153–154
  field research, calling organizations for, 209
  obscene phone calls, quasi-experimental design for research on, 126
Test-retest method, 93–94
Testing. See also Posttesting; Pretesting
  experiments, validity and, 119
  and internal validity, 119
  statistical significance, tests of, 305–306
  validity threats and, 124
Theft. See Auto theft
Theories, 11–13
  and relationships, 18
Threats. See Validity
Three strikes laws, 283
Time dimension, 65–70

  causation and, 52
  cohort studies, 66–67
  cross-sectional studies, 66
  and designing research, 71
  experiments and, 120
  in field research, 213
  longitudinal studies, 66–67
  in problem analysis, 273, 276–277, 280
  secondary analysis and, 248
  summary of, 70
Time-series designs, 128–133
  in evaluation research, 272–273
  interrupted time-series designs, 129, 132
  single-series design, modification of, 132
  with switching replications, 132–133
  validity and, 132–133
  variations in, 132–133
Topics
  for agency records, 230–232
  for content analysis, 230–232
  for evaluation research, 255–259
  for field research, 202–203
  and National Institute of Justice (NIJ) funding, 284
  for problem analysis, 255–259
Tradition
  and beliefs, 6
  human inquiry and, 7
Traffic enforcement
  field research on, 219–222
  video equipment for, 206
Treatment integrity, maintaining, 268
Trend studies, 66
  time-series designs and, 128–133
Typologies, 106–107

UCR (Uniform Crime Report), 19, 105
  hierarchy rule, 99
  measuring crimes by, 99–100
  as summary-based measure of crime, 100
  as trend study, 66
  usefulness of data, 233
Units of analysis, 61–65
  in designing research, 71
  ecological fallacy, 63
  in evaluation research, 265
  groups as, 61–62
  individualistic fallacy, 63
  individuals, 61
  and measuring crime, 97–99
  in National Youth Gang Survey, 64–65
  organizations as, 62
  review of, 63–65
  sample elements and, 145
  social artifacts as, 62–63
Univariate analysis, 288–296
University of Surrey Question Bank questionnaires, 181
Unstructured interviews, 205
U.S. Department of Health and Human Services (HHS). See also Institutional review boards (IRBs)
  human research subjects, protection of, 38
  National Survey on Drug Use and Health (NSDUH), 104
U.S. Department of Justice
  incident-based measures, 100
  NIBRS (National Incident-Based Reporting System), 100–102
Using Published Data: Errors and Remedies (Jacobs), 239
Validity, 53–61. See also Construct validity; External validity; Internal validity; Statistical conclusion validity; Statistical regression
  of agency records, 239–244
  case study design and, 134
  criterion-related validity, 95
  discriminant validity, 95
  experiments and, 117–123
  face validity, 95
  field research and, 122, 224–225
  generalizability and, 121–123
  of measurement, 94–96
  and nonpublic agency records, 236
  reliability to validity analogy, 96
  secondary analysis and, 249
  of self-reports, 96
  summary of, 57
  surveys and, 193
  testing and, 124
  time-series designs and, 132–133

Variable-oriented research, 133–135
Variables, 14–15. See also Causation; Statistical analysis
  and attributes, 15–17
  binomial variable, 149
  dependent variables, 18
    experiments and, 114
  experiments and, 114
  independent variables, 18
    experiments and, 114
  individual variables, 14–15
  levels of measurement and, 91
  relationships between, 17–18
  for stratified sampling, 156
Variance, 291–292
Victimless crime
  measuring, 98
  NCVS (National Crime Victimization Survey) and, 102
Victims and victimization, 102–103. See also NCVS (National Crime Victimization Survey)
  agency records on, 232
  community victimization surveys, 103
  lifestyles and, 301–303
Video
  for field research observations, 214
  traffic stops, equipment for, 206
Video game violence, content analysis of, 246
Violence. See also Domestic violence
  bars and violence, field research on, 222–224
  content analysis and, 231, 246
Voice recorders for field research observations, 214–215
Volume of data and errors, 243–244
Voluntary participation in research, 31–32
Warning mailings, 182–183
Weather as sampling dimension, 213
Weblog polls, 144
Websites for secondary data, 249
Withholding desirable treatments, 36
Witness Intimidation (Johnson), 255

Distribution of Chi Square, continued

Probability

df      .30     .20     .10     .05     .02     .01    .001

 1    1.074   1.642   2.706   3.841   5.412   6.635  10.827
 2    2.408   3.219   4.605   5.991   7.824   9.210  13.815
 3    3.665   4.642   6.251   7.815   9.837  11.341  16.268
 4    4.878   5.989   7.779   9.488  11.668  13.277  18.465
 5    6.064   7.289   9.236  11.070  13.388  15.086  20.517

 6    7.231   8.558  10.645  12.592  15.033  16.812  22.457
 7    8.383   9.803  12.017  14.067  16.622  18.475  24.322
 8    9.524  11.030  13.362  15.507  18.168  20.090  26.125
 9   10.656  12.242  14.684  16.919  19.679  21.666  27.877
10   11.781  13.442  15.987  18.307  21.161  23.209  29.588

11   12.899  14.631  17.275  19.675  22.618  24.725  31.264
12   14.011  15.812  18.549  21.026  24.054  26.217  32.909
13   15.119  16.985  19.812  22.362  25.472  27.688  34.528
14   16.222  18.151  21.064  23.685  26.873  29.141  36.123
15   17.322  19.311  22.307  24.996  28.259  30.578  37.697

16   18.418  20.465  23.542  26.296  29.633  32.000  39.252
17   19.511  21.615  24.769  27.587  30.995  33.409  40.790
18   20.601  22.760  25.989  28.869  32.346  34.805  42.312
19   21.689  23.900  27.204  30.144  33.687  36.191  43.820
20   22.775  25.038  28.412  31.410  35.020  37.566  45.315

21   23.858  26.171  29.615  32.671  36.343  38.932  46.797
22   24.939  27.301  30.813  33.924  37.659  40.289  48.268
23   26.018  28.429  32.007  35.172  38.968  41.638  49.728
24   27.096  29.553  33.196  36.415  40.270  42.980  51.179
25   28.172  30.675  34.382  37.652  41.566  44.314  52.620

26   29.246  31.795  35.563  38.885  42.856  45.642  54.052
27   30.319  32.912  36.741  40.113  44.140  46.963  55.476
28   31.391  34.027  37.916  41.337  45.419  48.278  56.893
29   32.461  35.139  39.087  42.557  46.693  49.588  58.302
30   33.530  36.250  40.256  43.773  47.962  50.892  59.703

Source: R. A. Fisher and F. Yates, Table IV from Statistical Tables for Biological, Agricultural, and Medical Research, 6th Edition, 1974. Longman Group Ltd.
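
To illustrate how the chi-square table is read, here is a minimal Python sketch. It is not taken from the text: the observed counts are hypothetical, and the function names (chi_square_statistic, upper_tail_probability) are invented for illustration. It computes a chi-square statistic for a two-by-two table, takes df as (rows − 1) × (columns − 1), and compares the result with the tabled critical value for df = 1 at the .05 level (3.841). The closed-form probability function covers only df = 1 and df = 2, which is enough to spot-check the first two rows of the table.

```python
import math


def chi_square_statistic(observed):
    """Return (chi-square statistic, df) for a table of observed counts."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    grand_total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(observed):
        for j, count in enumerate(row):
            # Expected count if the row and column variables were independent
            expected = row_totals[i] * col_totals[j] / grand_total
            statistic += (count - expected) ** 2 / expected
    df = (len(observed) - 1) * (len(col_totals) - 1)
    return statistic, df


def upper_tail_probability(x, df):
    """P(chi-square > x), using closed forms that hold only for df = 1 or 2."""
    if df == 1:
        return math.erfc(math.sqrt(x / 2.0))
    if df == 2:
        return math.exp(-x / 2.0)
    raise ValueError("closed form implemented only for df of 1 or 2")


# Hypothetical counts (assumption): rows are two groups, columns are yes / no
observed = [[30, 20], [20, 30]]
statistic, df = chi_square_statistic(observed)

CRITICAL_05_DF1 = 3.841  # tabled value for df = 1, probability .05
significant = statistic > CRITICAL_05_DF1

print(f"chi-square = {statistic:.3f}, df = {df}")
print(f"exceeds the .05 critical value: {significant}")
print(f"upper-tail probability: {upper_tail_probability(statistic, df):.4f}")
```

With these invented counts every expected cell is 25, so the statistic is 4.0 on 1 degree of freedom. That is just above 3.841, so the relationship would be judged statistically significant at the .05 level but not at the .02 level (5.412).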

  • Front Cover
  • Title Page
  • Copyright
  • Contents
  • Preface
  • PART ONE: An Introduction to Criminal Justice Inquiry
    • Chapter 1: Criminal Justice and Scientific Inquiry
      • Introduction
      • HOME DETENTION
      • What Is This Book About?
      • Personal Human Inquiry
      • ARREST AND DOMESTIC VIOLENCE
      • Errors in Personal Human Inquiry
      • Foundations of Social Science
      • Purposes of Research
      • Differing Avenues for Inquiry
      • Knowing through Experience: Summing Up and Looking Ahead
      • Main Points
    • Chapter 2: Ethics and Criminal Justice Research
      • Introduction
      • Ethical Issues in Criminal Justice Research
      • ETHICS AND EXTREME FIELD RESEARCH
      • Promoting Compliance with Ethical Principles
      • ETHICS AND JUVENILE GANG MEMBERS
      • Ethical Controversies
      • Main Points
  • PART TWO: Structuring Criminal Justice Inquiry
    • Chapter 3: General Issues in Research Design
      • Introduction
      • Causation in the Social Sciences
      • Validity and Causal Inference
      • CAUSATION AND DECLINING CRIME IN NEW YORK CITY
      • Units of Analysis
      • UNITS OF ANALYSIS IN THE NATIONAL YOUTH GANG SURVEY
      • The Time Dimension
      • How to Design a Research Project
      • The Research Proposal
      • Answers to the Units-of-Analysis Exercise
      • Main Points
    • Chapter 4: Concepts, Operationalization, and Measurement
      • Introduction
      • Conceptions and Concepts
      • WHAT IS RECIDIVISM?
      • Operationalization Choices
      • JAIL STAY
      • Criteria for Measurement Quality
      • Measuring Crime
      • UNITS OF ANALYSIS AND MEASURING CRIME
      • Composite Measures
      • Measurement Summary
      • Main Points
    • Chapter 5: Experimental and Quasi-Experimental Designs
      • Introduction
      • The Classical Experiment
      • Experiments and Causal Inference
      • Variations in the Classical Experimental Design
      • Quasi-Experimental Designs
      • Experimental and Quasi-Experimental Designs Summarized
      • Main Points
  • PART THREE: Modes of Observation
    • Chapter 6: Sampling
      • Introduction
      • The Logic of Probability Sampling
      • Probability Theory and Sampling Distribution
      • Populations and Sampling Frames
      • Types of Sampling Designs
      • Illustration: Two National Crime Surveys
      • Nonprobability Sampling
      • Main Points
    • Chapter 7: Survey Research and Other Ways of Asking Questions
      • Introduction
      • Topics Appropriate to Survey Research
      • Guidelines for Asking Questions
      • Questionnaire Construction
      • DON’T START FROM SCRATCH!
      • Self-Administered Questionnaires
      • In-Person Interview Surveys
      • Telephone Surveys
      • Comparison of the Three Methods
      • Strengths and Weaknesses of Survey Research
      • Other Ways of Asking Questions
      • Should You Do It Yourself?
      • Main Points
    • Chapter 8: Field Research
      • Introduction
      • Topics Appropriate to Field Research
      • The Various Roles of the Observer
      • Asking Questions
      • Gaining Access to Subjects
      • Recording Observations
      • Illustrations of Field Research
      • CONDUCTING A SAFETY AUDIT
      • Strengths and Weaknesses of Field Research
      • Main Points
    • Chapter 9: Agency Records, Content Analysis, and Secondary Data
      • Introduction
      • Topics Appropriate for Agency Records and Content Analysis
      • Types of Agency Records
      • IMPROVING POLICE RECORDS OF DOMESTIC VIOLENCE
      • Reliability and Validity
      • HOW MANY PAROLE VIOLATORS WERE THERE LAST MONTH?
      • Content Analysis
      • Secondary Analysis
      • Main Points
  • PART FOUR: Application and Analysis
    • Chapter 10: Evaluation Research and Problem Analysis
      • Introduction
      • Topics Appropriate for Evaluation Research and Problem Analysis
      • Getting Started
      • Designs for Program Evaluation
      • Problem Analysis and Scientific Realism
      • The Political Context of Applied Research
      • WHEN POLITICS ACCOMMODATES FACTS
      • Main Points
    • Chapter 11: Interpreting Data
      • Introduction
      • Univariate Description
      • Describing Two or More Variables
      • MURDER ON THE JOB
      • Inferential Statistics
      • Main Points
  • Glossary
  • References
  • Name Index
  • Subject Index