Statistical Blunders by the Proponents and Opponents of Accountability
Resource Index for presentation, April 16, Education Writers Association 2004 Seminar
David Rogosa, Stanford University
rag_at_stat.stanford.edu
Handouts for the April 16 EWA presentation
Premise: Current events wrapper for the statistical material: who deserves custody of accountability, states or feds? Recent NCLB revolts: O'Connell and 14 states 3/24, Rendell PA 4/8.
Accountability is not a bad thing, but it can be done badly. And that's where statisticians (should) come in: to ensure that the policy directives are implemented in a defensible form.
Most "experts" in the educational research community that you as journalists would reasonably rely upon for expertise in assessment and accountability issues cannot supply it. Arising from this dearth of knowledge on statistical issues key to accountability systems (or even large-scale assessments) is the opportunity for many leading figures in educational research to substitute their own ideological (anti-testing) biases for the facts, or to bash testing programs for self-promotional purposes. All educational researchers best left behind?
NCLB Research and Journalism Vignettes
- The Volatility Scam
Claims of "volatility" in school-level scores from testing programs, by Linn and Haug (2002) and by Kane and Staiger (2002), represent a serious threat to defensible policy uses of test scores in school accountability systems. The claimed volatility in year-to-year improvement has also warped NCLB design (growth versus status models); see the CCSSO AYP paper (e.g., Fig. 4). But the volatility claims are based on blunders at the level of high school statistics instruction.
See "Confusions about Consistency in Improvement" (especially the introductory examples in section 1.2); additional, perhaps less accessible, material specific to Kane-Staiger is in "Irrelevance of Reliability Coefficients to Accountability Systems".
Related material on successive cohorts vs. matched longitudinal comparisons: "A Better Student Data System for California"
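The sampling-variability point underneath the volatility critique can be sketched in a few lines of simulation (all numbers here are illustrative, not drawn from the papers above): even when every school's true performance is exactly constant from year to year, observed school mean scores for small schools swing substantially, purely from which students happen to be tested.

```python
import random
import statistics

random.seed(1)

def school_mean(n_students, true_mean=50.0, sd=20.0):
    """Observed mean score for one school-year: n_students draws
    around a fixed true school performance level (assumed values)."""
    return statistics.mean(random.gauss(true_mean, sd) for _ in range(n_students))

def year_to_year_sd(n_students, n_schools=2000):
    """SD of year-to-year change in observed school means when every
    school's true performance is exactly constant across years."""
    changes = [school_mean(n_students) - school_mean(n_students)
               for _ in range(n_schools)]
    return statistics.pstdev(changes)

# Smaller schools show far larger "volatility" from sampling error alone.
for n in (25, 100, 400):
    print(n, round(year_to_year_sd(n), 2))
```

The year-to-year SD of the change tracks sd*sqrt(2/n), so a grade cohort of 25 looks roughly four times as "volatile" as one of 400 with no real instability anywhere in the data.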
- "Margin of Error" and the NCLB Confidence Intervals
The "margin of error" here rests on a misunderstanding of elementary statistical concepts that leads to hilarious assertions. If it could be, it is.
NCLB incarnation through the Confidence Interval adjustment to the AMO: "close enough is good enough."
Miscalculations of burden of proof and benefit of doubt inspired by the CCSSO AYP paper, "Making Valid and Reliable Decisions in Determining Adequate Yearly Progress."
NCLB Confidence Interval analyses, quantifying benefit of the doubt:
1. The NCLB "99% confidence" scam: Utah-style calculations, November 2003
2. How the Confidence Interval (margin of error) Procedures Destroy the Credibility of State NCLB Plans, draft 11/03
Sidenote: Participation Rate, Exhibit A in a case against the feds being fit custodians for accountability.
Proper understanding of the role of statistical uncertainty in accountability procedures. Basic truth for the AMO: statistical variability forces educational attainment to blow past the criterion. "California's AMOs Are More Formidable Than They Appear," October 2003.
Older margin-of-error-folly material for California accountability: book chapter treatment of the Orange County Register attacks on the California API. Prior versions (9/02): see the "Commentaries on the Orange County Register Series" at the API Research Page
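The "close enough is good enough" adjustment can be sketched as follows. The AMO value, school sizes, and the 99% z-multiplier are illustrative assumptions, not any particular state's plan: a school passes if the upper limit of a confidence interval around its observed percent proficient reaches the target, so the smaller the group, the wider the interval and the larger the benefit of the doubt.

```python
import math

def passes_with_ci(percent_proficient, n_students, amo, z=2.576):
    """'Benefit of the doubt' check: the school makes the Annual
    Measurable Objective (AMO) if the upper limit of a z-level
    confidence interval around observed percent proficient
    reaches the AMO (normal approximation to the binomial)."""
    p = percent_proficient / 100.0
    se = math.sqrt(p * (1 - p) / n_students)   # binomial standard error
    upper = (p + z * se) * 100.0
    return upper >= amo

# Hypothetical AMO of 20% proficient, school at 10% observed:
print(passes_with_ci(10.0, n_students=30, amo=20.0))    # prints True
print(passes_with_ci(10.0, n_students=1000, amo=20.0))  # prints False
```

With 30 students and a 99% interval, a school at half the target still "makes" the AMO; the identical score with 1000 students does not — the adjustment rewards small groups, not attainment.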
- NCLB: No Subgroup Left Behind
Too much of a good thing? Herding cats; effects on false positives and false negatives.
Critics fall short (PACE calls for elimination of the subgroup criteria): "Assessing the Effects of Multiple Subgroups," rebuttal to the PACE Policy Brief "Penalizing Diverse Schools? Similar test scores, but different students, bring federal sanctions," December 2003.
Full set of reports at the NCLB section of the API Research Page. Similar blunders (equality of opportunity vs. equality of results) are seen in the Kane-Staiger claims about California awards; see section 4 of "Irrelevance of Reliability Coefficients to Accountability Systems"
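The "herding cats" arithmetic behind the false-positive concern is simple compounding (the per-subgroup pass rate below is an illustrative figure, not an estimate from any state's data): when every subgroup must clear its target independently, even a high chance of each subgroup passing multiplies down, so diverse schools face more chances to fail by luck alone.

```python
def prob_school_fails(per_group_pass_prob, n_subgroups):
    """Chance a school misses AYP when each of n_subgroups must
    independently clear its target; pass rates compound."""
    return 1.0 - per_group_pass_prob ** n_subgroups

# Suppose each subgroup clears its target 95% of the time even when
# the school is genuinely on track (assumed figure):
for k in (1, 4, 8):
    print(k, round(prob_school_fails(0.95, k), 3))
# prints:
# 1 0.05
# 4 0.185
# 8 0.337
```

A school reporting eight subgroups faces roughly a one-in-three chance of a spurious failure under these assumptions, versus one-in-twenty for a homogeneous school — the mechanism behind "Penalizing Diverse Schools?".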
- Teacher Credentialing: NCLB "Highly Qualified" Teachers
Data analysis support for the teacher credentialing mandate? Spurious correlation versus potential effects.
- Closing the Gap: Progress of Groups and Subgroups
Progress of groups and subgroups in the four-peat data analysis for California. Class, not race?
Older analyses in the Interpretive Notes series on the API Research Page.
- Demographics Are Far from Deterministic
The California Teachers Association (and other critics of testing programs) seek to undermine the credibility of assessment programs with slogans such as "It's All Zip Codes" and by renaming the API the "affluent parent index". Many policy researchers (e.g., the California Budget Project) feed this misrepresentation with unthinking correlational and multiple regression analyses. Reasonable data analysis shows that schools (and students) with similar demographic composition have very different educational performance.
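The distinction between a strong overall correlation and deterministic demographics can be sketched on synthetic data (the demographic index, score relation, and spread below are all invented for illustration): look within a narrow band of the demographic index rather than only at the correlation coefficient.

```python
import random
import statistics

random.seed(2)

# Synthetic schools: a demographic index (0-100) and an API-like score
# that correlates with demographics yet keeps large school-level spread.
demo = [random.uniform(0, 100) for _ in range(5000)]
score = [500 + 3 * d + random.gauss(0, 80) for d in demo]  # assumed relation

# Overall correlation looks impressive...
mx, my = statistics.mean(demo), statistics.mean(score)
sx, sy = statistics.pstdev(demo), statistics.pstdev(score)
r = sum((x - mx) * (y - my) for x, y in zip(demo, score)) / (len(demo) * sx * sy)

# ...but among schools with nearly identical demographics the spread is huge.
band = [s for d, s in zip(demo, score) if 48 <= d <= 52]
print(round(r, 2), round(statistics.pstdev(band)))
```

A correlation around 0.7 coexists with a within-band standard deviation nearly as large as the full slogan-driven analyses ignore: knowing a school's zip-code profile leaves most of the performance variation unexplained.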
Acknowledgements
Support for the research reported here has been provided by:
- the California Department of Education, Policy and Evaluation Division.
- the Educational Research and Development Centers Program, PR/Award Number R305B60002, as administered by the Institute of Education Sciences, U.S. Department of Education.
The findings and opinions expressed do not reflect the positions or policies of the National Institute on Student Achievement, Curriculum, and Assessment, the Institute of Education Sciences, or the U.S. Department of Education.