Educational Accountability: Fair and Balanced
Resource Index for presentation, September 7 Seminar on Testing, Hechinger Institute and Center on Education Policy
David Rogosa, Stanford University
rag_at_stat.stanford.edu
NEW Handouts from September 7 presentation
Premise:
Most "experts" in the educational research community whom you as journalists would reasonably rely upon for expertise in assessment and accountability issues cannot supply it. Arising from this dearth of knowledge about the statistical issues key to accountability systems (or even large-scale assessments) is the opportunity for many leading figures in educational research to substitute their own ideological (anti-testing) biases for the facts, or to bash testing programs for self-promotional purposes. All educational researchers best left behind?
Accountability is not a bad thing, but it can be done badly. And that's where statisticians (should) come in: to ensure that the policy directives are implemented in a defensible form.
Policy Research and Journalism Vignettes
- The Volatility Scam
Claims of "volatility" in school-level scores from testing programs, by Linn and Haug (2002) and by Kane and Staiger (2002), represent a serious threat to defensible policy uses of test scores in school accountability systems. However, such claims are based on blunders at the level of high school statistics instruction. See Confusions about Consistency in Improvement (especially the introductory examples in section 1.2); additional, perhaps less accessible, material specific to Kane-Staiger is in Irrelevance of Reliability Coefficients to Accountability Systems.
- "Margin of Error" Nonsense and the Orange County Register Debacle
This "margin of error" rests on a misunderstanding of elementary statistical concepts that leads to hilarious assertions. Sadly, last August the Orange County Register based its series of attacks on the California API on this nonsense; chief experts/charlatans: Richard Hill and Thomas Kane.
NEW 12/03. Book Chapter treatment of the Orange County Register folly.
Older versions: see the "Commentaries on the Orange County Register Series" at the API Research Page; in particular, the "High School Intern and the API Dollars" in What's the Magnitude of False Positives in GPA Award Programs? and the "Blood Pressure Parable" in Application of OCR "margin of error" to API Award Programs.
- Sanctions are Not the Flip-side of Awards
Award programs, such as the California API GPA, have false positives and false negatives, and these are not symmetric. Basing sanctions on a failure to reach award criteria is undesirable. In other words, where should "the benefit of the doubt" be applied? Properties of award programs are discussed in various documents on the API Research Page.
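The asymmetry of false positives and false negatives can be illustrated with a minimal simulation (hypothetical numbers, not the API award rules): schools with no true gain occasionally clear an observed-gain award threshold by luck, while a much larger share of truly improving schools miss it, so "failed to earn an award" is far weaker evidence than "earned an award".

```python
import random

random.seed(2)
n_schools = 100_000
noise_sd = 20.0     # sampling noise in an observed school score gain
threshold = 25.0    # observed gain required for an award

# Half the schools truly improved (true gain 30), half did not (true gain 0);
# the observed gain is the true gain plus sampling noise.
outcomes = []
for _ in range(n_schools):
    truly_improved = random.random() < 0.5
    true_gain = 30.0 if truly_improved else 0.0
    observed_gain = true_gain + random.gauss(0, noise_sd)
    outcomes.append((truly_improved, observed_gain >= threshold))

false_pos = (sum(1 for t, a in outcomes if a and not t)
             / sum(1 for t, _ in outcomes if not t))
false_neg = (sum(1 for t, a in outcomes if t and not a)
             / sum(1 for t, _ in outcomes if t))
print(f"false positive rate (award, no true gain):  {false_pos:.3f}")
print(f"false negative rate (no award, true gain): {false_neg:.3f}")
```

With these assumed numbers roughly 10% of non-improving schools get an award while about 40% of genuinely improving schools are denied one, so a sanction triggered by "no award" would mostly hit schools that did improve.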
- Accuracy of Individual Scores
Properties of individual student scores, such as the student percentile rank scores from standardized tests that go to parents and schools and are also sometimes used for high-stakes decisions, are typically described by test reliability coefficients. Unfortunately, reliability coefficients are one of the dumbest ideas ever and provide little useful information. Various documents and analyses on the accuracy of individual scores (including analyses of the CAT/6 and Stanford 9) are provided on the Accuracy Guide page. The simplest place to start is the Shoe-Shopping Example.
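One standard reason reliability coefficients say little about individual-score accuracy can be shown in a few lines (a textbook illustration, not the analyses on the Accuracy Guide page): the classical reliability coefficient is true-score variance over observed-score variance, so the very same measurement error, and thus the very same accuracy for any individual, produces a "high" or "low" reliability depending only on how spread out the group happens to be.

```python
import random
import statistics

random.seed(3)
n_students = 50_000
error_sd = 5.0  # identical measurement error SD in both groups

def reliability(true_sd):
    # Classical test theory: observed = true score + measurement error;
    # reliability = true variance / observed variance.
    true = [random.gauss(50, true_sd) for _ in range(n_students)]
    observed = [t + random.gauss(0, error_sd) for t in true]
    return statistics.variance(true) / statistics.variance(observed)

rel_hi = reliability(15.0)  # heterogeneous group
rel_lo = reliability(5.0)   # homogeneous group, same error SD
print(f"heterogeneous group (true SD 15): reliability = {rel_hi:.2f}")
print(f"homogeneous group  (true SD  5): reliability = {rel_lo:.2f}")
```

The reliabilities come out near 0.90 and 0.50 even though every individual score carries exactly the same error SD of 5 points; a parent asking "how accurate is my child's score?" learns nothing from either coefficient.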
- Demographics are Far from Deterministic
The California Teachers Association (and other critics of testing programs) seek to undermine the credibility of assessment programs with slogans such as "It's All Zip Codes" and by renaming the API the "affluent parent index". Many policy researchers (e.g., the California Budget Project) feed this misrepresentation with unthoughtful correlational and multiple regression analyses. Reasonable data analysis shows that schools (and students) with similar demographic composition have very different educational performance. See the analyses in the Interpretive Notes series on the API Research Page.
NEW 10/03: four-peat data analysis for California.
- NCLB, Where Accountability Came to Die?
A teaser for forthcoming work: should the question mark be removed?
Wise man statement:
"It is a bad system to punish people when you set standards they can't possibly make," said Roy Romer, superintendent of the Los Angeles Unified School District, the largest school system in the state. (Los Angeles Times, Aug 16)
NEW
1. California's AMOs Are More Formidable Than They Appear. October 2003
2. The NCLB "99% confidence" scam: Utah-style calculations. November 2003
3. Why NCLB is a Statistical Sham. Part I: How the Confidence Interval (margin of error) Procedures Destroy the Credibility of State NCLB Plans. Draft, November 2003
4. Assessing the effects of multiple subgroups: Rebuttal to PACE Policy Brief "Penalizing Diverse Schools? Similar test scores, but different students, bring federal sanctions". December 2003
Discussion Item. Rebutting bad research: process and policy?
Is the null set the best and only answer? What do, and what should, journalists do after reporting in good faith on demonstrably incompetent research? Contrast education with reporting on medical research (e.g., the New York Times Tuesday Health section).
Education examples: charter schools, teacher credentialing
Public Forum on School Accountability: "A Better Student Data System for California"
Acknowledgements
Support for the research reported here has been provided by
- the California Department of Education, Policy and Evaluation Division.
- the Educational Research and Development Centers Program, PR/Award Number R305B60002, as administered by the Institute of Education Sciences, U.S. Department of Education. The findings and opinions expressed do not reflect the positions or policies of the National Institute on Student Achievement, Curriculum, and Assessment, the Institute of Education Sciences, or the U.S. Department of Education.