A Step-by-Step Guide to the CAATS Model
Foreword by Richard Kunkel
Preface
Acknowledgments
About the Authors
1. Expectations and Options for Accountability and Teacher Assessment
What This Chapter Is About
The Challenge from the National Commission on Teaching and America’s Future
Title II of the Higher Education Act Amendments of 1998
No Child Left Behind Legislation
National Research Council (2001) -- the Committee on Assessment and Teacher Quality
What A Few Others Have Said: A Brief Review of the Literature on Testing and Licensure
Standards: The Roadmap to Accountability and Scientifically-Based Performance Assessment
The Principal Sets of Standards Governing Our Work
National and State Pedagogical and Content Standards
Unit Accreditation and Operational Standards
Technical Standards for Measurement of Teacher Competency
Some Major Threats to Validity in Most Current Assessment Systems
Conceptual Frameworks: Pulling It All Together
NCATE Standards
AERA, APA, & NCME
INTASC Principles: Where NCATE and AERA, APA, & NCME Standards Converge
Making Sense of Conceptual Frameworks
Our Conceptual Framework: What We Value
Assessment Options
Records of Training Completed
Tests and Exam Scores
Observations of Performance
Portfolios of Assessable Artifacts
Job-Related Tasks and Work Sample Products
K-12 Student Work Samples
Wrap-Up
Activity #1: What’s Happening in Your State and School?
Activity #2: Questionnaire for Faculty Views on Competency Assessment
Activity #3: Assessment Belief Scale
Activity #4: Assessment Options
2. Portfolios – To Be or Not To Be?? That IS the Question!
What This Chapter Is About
The Portfolio: Panacea or Pandora’s Box
Portfolios as Certification “Tests”: Lessons from Standards and History
Assessment Illiteracy, Paradigm Shifts, and Conflicting Purposes
The Conflict of Formative vs. Summative Evaluation
The Conflict of Program Approval vs. Accreditation
The Conflict of Business or Legislative Personal/Professional Perspectives vs. Accountability Through Title II Requirements
The Conflict of Academic Freedom vs. Accountability
The Conflict of Constructivism vs. Positivism
Recommendations for Use of Portfolios in Accountability Contexts
Ten Recommendations for Assessment System Design
A Recommended, Standards-Based Model
Overview of “Competency Assessment Aligned with Teacher Standards” (CAATS) Model
CAATS Step 1: Define content, purpose, use, and other contextual
CAATS Step 2: Develop a valid sampling plan
CAATS Step 3: Create or update tasks aligned with standards and
CAATS Step 4: Design and implement data aggregation tracking and management systems
CAATS Step 5: Ensure credibility and utility of data
Wrap-Up
A Note about Chapters 8 and 9
Activities
Activity #1: Review Your Feelings
Activity #2: Thinking about Conflicting Paradigms
Activity #3: Getting Your Action Plan Started
3. CAATS Step 1: Assessment Design Inputs
Where We Have Been So Far
What This Chapter Is About
CAATS Step 1A: Define the Purposes and Uses of the System
The Importance of Purpose and Use
Different Strokes for Different Folks for Different Purposes
Purpose and Use in the Accountability Context
More on Accountability-Based Systems: State Program Approval
CAATS Step 1B: Define the Propositions or Principles that Guide the System
CAATS Step 1C: Define the Conceptual Framework or Contents of the System
So What Is Assessment Content?
Standards as the Link between Purpose, Use, and Content
CAATS Step 1D: Review Local Factors That Impact the System
Wrap-Up
Worksheets
Worksheet #1: Purpose, Use, Propositions, Content, & Context Checksheet
Worksheet #2: Purpose, Use, Content, Draft
Worksheet #3: Propositions
Worksheet #4: Contextual Analysis
4. CAATS Step 2: Planning with a Continuing Eye on Valid Assessment Decisions
Where We Have Been So Far
What This Chapter Is About
CAATS Step 2A: Organize Standards into Content Domains
All Those Standards Sets
Organizing for Alignment
“A Rose Is a Rose” Or More of the Same
Why Bother?
Crosswalks and Standards
CAATS Step 2B: Visualize the Competent Teacher Based on the Standards
CAATS Step 2C: Brainstorm a Set of Summative Tasks (Sampling Plan)
CAATS Step 2D: Sort Tasks into Formative and Summative Assessments
CAATS Step 2E: Build Assessment Frameworks
Framework Options
A Special Case: Aligning Tasks with NCATE Requirements
Wrap-Up
Worksheets
Worksheet #1: Organizing for Alignment (Version 1)
Worksheet #1: Organizing for Alignment (Version 2)
Worksheet #2: Our Critical Skills
Worksheet #3: Visualizing the Competent Teacher
Worksheet #4: Critical Task List
Worksheet #5: Sorting Formative and Summative Tasks
Worksheet #6: List of Summative Assessments by Competency Type
Worksheet #7: List of Summative Assessments by Levels of Inference
Worksheet #8: List of Summative Assessments by Points in Time
Worksheet #9: Matrix of Standard by Competency Type
Worksheet #10: Matrix of Critical Tasks by Competency Type and Benchmark
Worksheet #11: Aligning Tasks with NCATE Thematic Portfolios
5. CAATS Step 3: Writing Tasks Designed to Maximize Validity and Reliability
Where We Have Been So Far
What This Chapter Is About
CAATS Step 3A: Determine the Task Format for Data Aggregation
What Happens When There Is No Format?
Can I Use Percents and Total Points? No, They Don’t Cut It!
Elements of a Common Task Format
CAATS Step 3B: Create New Tasks or Modify Existing Tasks
Basic Concepts about Tasks
Hints and Advice about Writing Tasks
Rubric Examples: A Rose Is a Rose
CAATS Step 3C: Conduct First Validity Study
CAATS Step 3D: Align Tasks with Instruction
Wrap-Up
Worksheets
Worksheet #1: Proficiency Level Descriptions
Worksheet #2: Task Design
Worksheet #3: Standards and Indicators Coverage Report
Worksheet #4: Individual Task Review for Job-Relatedness
Worksheet #5: Checklist for Reviewing Individual Tasks
Worksheet #6: Instructional Alignment
6. CAATS Step 4: Decision-Making and Data Management
Where We Have Been So Far
What This Chapter Is About
CAATS Step 4A: Determine How Data Will Be Aggregated
So What Is Data Aggregation?
The Relationship of Data Aggregation (Step 4A), Cut Scores (Step 4B), and
An Approach to Decision-Making without Using Points and Percents
CAATS Step 4B: Set Standards for Minimal Competency
Different Strokes for Different Folks – Déjà Vu or Purpose Revisited
Is 100% Really Reasonable?
Criticality Yardstick Approach (CYA) to Cut Score Setting in Complex Performance Assessments
First Cut
Second Cut
Working with Judges
An Example of What to Say to the Judges
CAATS Step 4C: Select and Develop a Tracking System
Sharing Information for Decision-Making: The Big Challenge
Data Storage Option #1: Course Grades or Records of Participation/Attendance
Data Storage Option #2: Teacher Folders
Data Storage Option #3: Portfolios
Data Storage Option #4: Electronic Data Management System
Reporting Aggregated Data
CAATS Step 4D: Develop Management System
Advising and Due Process
Scoring Procedures
Implementation
Wrap-Up
Worksheets
Worksheet #1: Cut Score Decisions
Worksheet #2: Sample Format for Candidate/Teacher Tracking Form
Worksheet #3: Format for Data Aggregation
Worksheet #4: Management Plan
Worksheet #5: Rater Monitoring Record
7. CAATS Step 5: Credible Data
Where We Have Been So Far
What This Chapter Is About
What Is Psychometric Integrity and Why Do We Have to Worry About It?
CAATS Step 5A: Create a Plan to Provide Evidence of Validity, Reliability, Fairness, and Utility
Elements of a Plan
Element #1: Purpose and Use
Element #2: Construct Measured
Element #3: Interpretation and Reporting of Scores
Element #4: Assessment Specifications and Content Map
Element #5: Assessor/Rater Selection and Training Procedures
Element #6: Analysis Methodology
Element #7: External Review Personnel and Methodology
Element #8: Evidence of Validity, Reliability, and Fairness (VRF)
CAATS Step 5B: Implement the Plan Conscientiously
Wrap-Up
Worksheets and Examples
Worksheet #1: Assessment Specifications
Worksheet #2: Analysis of Appropriateness of Decisions for Teacher Failures
Worksheet #3: Program Improvement Record
Worksheet #4: Expert Rescoring
Worksheet #5: Fairness Review
Worksheet #6: Analysis of Remediation Efforts and EO Impact
Worksheet #7: Psychometric Plan Format
Example #1 (Empirical Data): Logistic Ruler for Content Validity
Example #2 (Empirical Data): Computation of the Lawshe (1975) Content
Example #3 (Empirical Data): Disparate Impact Analysis
Example #4 (Empirical Data): Computation of Cohen’s Kappa (1960) for
Example #5 (Empirical Data): Spearman Correlation Coefficient and Scatterplot:
Example #6 (Empirical Data): Pearson Correlation Coefficient and Scatterplot:
Example #7 (Empirical Data): Correlation Matrix and Scatterplots Knowledge,
Example #8 (Empirical Data): T-Test Comparing Mathematics and Science
Example #9 (Empirical Data): Differential Item and Person Functioning
8. The Trouble with Tribbles: Standard Setting for Professional Certification
What This Chapter Is About
An Overview of Cut Score Setting
From Norm-Referenced Objective Tests to Criterion-Referenced Subjective Tasks
Controlling Human Judgment
Difficulty vs. Importance
Social Consequences
Tidbits to Remember
Cut Score Setting in Traditional, Objective Tests
Holistic Impressions: Method One
Item Content: Method Two
Performance of Examinees: Method Three
Combination Approach: Method Four
Standard or Cut Score Setting in Performance Tests for Professionals: Is it the Same as Multiple Choice?
Quotes to Remember
The Extended Angoff Procedure for Performance Assessment
The Judgmental Policy Capturing Approach for Performance Assessment
The Dominant Profile Judgment for Performance Assessment
Standard Setting Using Item Response Theory
What’s Really Wrong with Current Approaches to Cut Score Setting
Berk’s Suggestions and Commentaries Over the Years
The Criticality Yardstick Approach
Wrap-Up
Activities
Activity #1: Change the Difficulty of an Item
Activity #2: Change the Criticality of a Task
Activity #3: Replicate a Cut Score
Activity #4: Take Your Choice
9. Using Teacher Scores for Continuous Improvement
What This Chapter Is About
Reasons Why We Use the Rasch Model
The Classical Approach
A Quick Overview of Where Rasch Fits Into the Grand Scheme of IRT Models
Rasch: The Basics
Getting Started
Differences that Item Writers Make
Guttman Scaling
A Sample Rasch Ruler
From Pictures to Numbers
The Fit Statistic
Gain Scores – Real or Imagined?
Ratings and Raters
Learning More about Rasch
Wrap-Up
Activities
Activity #1: Decision Making Tool for Measurement
10. Legal Integrity
What IF?? A Legal Scenario
Psychometric Issues and Legal Challenges
Legal Issues and Precedents
End Note
Appendix: Tasks Developed for Florida Alternative Certification Program
Glossary
References
Index