Measurement and Evaluation
Data-collection methods for assessment purposes typically fall into two categories: direct and indirect. Both are valuable, but indirect evidence by itself is insufficient; direct evidence is required. Ideally, a program collects both types.
Why is direct evidence of student learning required?
Direct evidence reveals what students have learned, while indirect evidence can help faculty interpret direct evidence and guide improvements. For example, if students self-report on a survey (indirect evidence of learning) that their knowledge of world geography is excellent but later fail a multiple-choice world geography test (direct evidence), that's useful information. The indirect evidence by itself isn't meaningful without the direct evidence of students' knowledge. Programs can collect both direct and indirect evidence of student learning to gain a better picture of their students.
Measurement Best Practices
Direct and indirect measures.
- Assessment measures are divided into two broad classes: direct and indirect measures.
- Direct measures are those in which students demonstrate learning and their work is measured with a scoring device.
- Indirect measures include those in which students report their attitudes, perceptions, or feelings about their learning, usually in the form of a survey. Indirect measures also include data that is related to students but that cannot be directly tied to an outcome (e.g., job/graduate school placement rates, publication counts, GPA, course grades).
- Direct measures are preferable as the primary form of assessment, but some objectives may only be measurable with more indirect methods.
Choose methods that will
- allow students to demonstrate specific student learning outcomes.
- be credible to the faculty and the intended users of the results.
- provide useful, meaningful and actionable information that can be used as a basis for decision-making. Quantity is not the goal. Faculty members must be willing to discuss and make changes to the program (as needed) based on the results.
Try not to “reinvent the wheel.”
- Use or modify existing evidence whenever possible. Assessment is usually already happening in courses. Inventory what evidence of student learning and perceptions about the program already exist. A curriculum matrix is a useful tool when conducting the inventory.
- Course-embedded assessments are often the most efficient and authentic way to assess learning.
- Consider adapting assessment tasks and/or scoring devices already in use by faculty in the discipline, in other departments, or at other institutions.
- Work to keep an assessment task as authentic to expected learning as possible.
- Use more than one indicator of learning whenever possible, whether that be multiple direct measures or a combination of direct and indirect measures.
Be aware of semantic issues.
- Use inclusive and culturally sensitive language.
- Keep assessment questions simple and to the point.
- Avoid leading or “double-barreled” questions, which produce participant responses that are not focused on the relevant issue.
Pilot test when possible.
- It is generally best to pilot exams, assessment tasks, surveys, and interview questions so that any confusion about how questions are interpreted is discovered before full administration. Cognitive interviewing techniques can be useful to obtain feedback during pilot testing.
Keep the length of the assessment measure as short as possible.
- Assessment tasks, surveys, or focus groups/interviews that are too long lose their effectiveness and may result in inaccurate measurement of achievement or in higher non-response or incomplete-response rates.
- Make sure the assessment is feasible to carry out given the program's resources and amount of time faculty members are willing to invest in assessment activities.
Direct and Indirect Measures
Examples of Direct Measures of Student Learning
- Course-embedded tests, assignments, or projects
- Culminating experiences: capstone projects, senior theses, senior exhibits or performances
- Portfolio assessment
- Licensure, certification, or professional exams
- Essay questions blind scored by faculty across the department, division, school, or college
- Internal and external juried review of comprehensive senior projects, exhibitions, and performances
- Employer's or internship supervisor's direct evaluations of students' performances
- Faculty assessment of a student publication or conference presentation
Adapted from Cecilia Lopez, NCA Commission on Institutions of Higher Education: Opportunities for Improvement: Advice from Consultant-Evaluators of Programs to Assess Student Learning, March 1996
Indirect Measures of Student Learning
- Alumni, employer, and student surveys
- Exit interviews of graduates and focus groups
- Interviews of instructors, program coordinators, residence hall leaders, and others who have direct contact with students
- Graduate follow-up studies
- Retention and transfer studies
- Length of time to degree
- SAT / ACT scores
- Graduation rates and transfer rates
- Job placement data
- Observing and recording students’ behaviors
Sources:
(1) University of Hawaii - Choose a Method to Collect Data/Evidence
(2) Center for Effective Collaboration and Practice’s list of Indirect Measures
Rubric Templates & Samples
Direct Assessments
Embedded Assignments
When faculty members collaborate to reach consensus on what is acceptable and exemplary student work in their discipline, students receive more consistent grading and feedback from professors in the program.
Embedded assessments may take the form of an embedded test or quiz or an embedded course assignment; each approach has its own benefits and drawbacks.
Portfolios
Keeping a portfolio can lead students to become more reflective and increase their motivation to learn.
Licensure or Certification Exams
Pre-/Post-test
Internship Supervisor's direct evaluation
Indirect Assessments
Surveys
Surveys are the most commonly used indirect assessment method. Surveys are useful tools for collecting information regarding attitudes, beliefs, experiences, values, needs, demographic information, perceptions, etc.
When To Use Surveys
As an assessment tool, surveys should be employed when the goal is to draw relatively quick conclusions about the perceptions of a target population. Surveys can reach a large number of people in a short amount of time and typically produce data that is easy to analyze. Your college and program can get disaggregated data from the Assessment team's surveys. You may also add college- or program-specific questions for students in your program by contacting assessment@ksu.edu.
Focus Groups
Focus groups are an indirect assessment method in which a small group of individuals is assembled to provide feedback and insight into a particular product, program, service, concept, etc. In a focus group, questions are asked in an interactive setting in which participants are free to talk with other group members.
When To Use Focus Groups
Focus groups are best used when group discussion or interaction among people would bring out insights that would not be ascertained through individual interviews or survey items. Additionally, focus groups are useful when rich data is needed and there aren’t sufficient resources available to conduct individual interviews.
Interviews
Interviews are one-on-one data collection events that allow for direct questioning of research subjects. Interviews can be conducted either in person or via the telephone.
When To Use Interviews
Interviews are best used when the purpose is to gain in-depth insight into individuals’ perceptions, establish personal contact with participants, and/or follow up on prior survey findings.
Rubrics
What is a Rubric?
Rubrics are guides that help to score the performance of students. Rubrics are to be used by instructors when creating, scoring, grading, or providing feedback on course tasks and assignments, and by students to guide the development of quality work. These performance guides can help to provide a common understanding between students and instructors regarding the learning and performance expectations. Rubrics can also be used to evaluate student work collected for purposes of assessment, program evaluation, and improvement of student learning.
Why are rubrics used?
- A rubric creates a common framework and language for assessment.
- Complex products or behaviors can be examined efficiently.
- Well-trained reviewers apply the same criteria and standards.
- Rubrics are criterion-referenced, rather than norm-referenced. Raters ask, "Did the student meet the criteria for level 5 of the rubric?" rather than "How well did this student do compared to other students?"
- Using rubrics can lead to substantive conversations among faculty.
- When faculty members collaborate to develop a rubric, it promotes shared expectations and grading practices.
How to develop a rubric
A rubric is composed of four basic parts. In its simplest form, a rubric includes the following (a brief sketch follows this list):
- A task description. The outcome being assessed with instructions students receive for an assignment.
- The characteristics to be rated (rows). The skills, knowledge, and/or behavior to be demonstrated.
- Levels of mastery/scale (columns). Labels used to describe the levels of mastery should be tactful and clear. Commonly used labels include:
- Not meeting, approaching, meeting, exceeding
- Exemplary, proficient, marginal, unacceptable
- Advanced, intermediate high, intermediate, novice
- 1, 2, 3, 4
- A description of each characteristic at each level of mastery/scale (cells).
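For programs that keep their rubrics in electronic form, the four parts map naturally onto a simple data structure. The sketch below (in Python) is purely illustrative; the characteristic names, level labels, and cell descriptions are invented examples, not a recommended template.

```python
# A minimal sketch of an analytic rubric as a plain data structure.
# The characteristic names, level labels, and cell text below are invented
# examples for illustration only, not an official or recommended template.

rubric = {
    # Task description: the outcome being assessed and the assignment context.
    "task": "Written research summary (outcome: written communication)",
    # Levels of mastery/scale (columns), ordered low to high.
    "levels": ["Unacceptable", "Marginal", "Proficient", "Exemplary"],
    # Characteristics to be rated (rows), each with one description per level (cells).
    "characteristics": {
        "Organization": [
            "No discernible structure",
            "Some structure, but ideas are hard to follow",
            "Clear structure with minor lapses",
            "Logical, engaging structure throughout",
        ],
        "Use of evidence": [
            "No sources cited",
            "Sources cited but weakly connected to claims",
            "Relevant sources support most claims",
            "Well-chosen sources integrated with every claim",
        ],
    },
}

# Print each cell: the description a rater reads when deciding which level
# best matches a piece of student work.
for characteristic, cells in rubric["characteristics"].items():
    for level, description in zip(rubric["levels"], cells):
        print(f"{characteristic} | {level}: {description}")
```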
Developing a rubric
Step 1: Identify what you want to assess
Step 2: Define the characteristics to be rated (rows). Specify the skills, knowledge, and/or behaviors that you will be looking for.
- Include the characteristics required to show mastery/proficiency for the assignment as well as those that are most important to the student learning outcome.
Step 3: Identify the levels of mastery (columns).
- Aim for around four because any more than this makes it difficult to differentiate qualities of achievement.
- Always allow for a category beyond the expected rigor for the assignment, although few may achieve this level.
- The levels of mastery seldom directly reflect the grade for an assignment.
Step 4: Describe each level of mastery for each characteristic (cells).
- Describe work that meets the expected rigor of the assignment. This is the second-to-highest category.
- Describe the best work you could expect using these characteristics, beyond typical expectations. This describes the top category.
- Describe an unacceptable product. This describes the lowest category.
- Develop descriptions of intermediate-level products for intermediate categories.
Important: Each description and each characteristic should be mutually exclusive and defined well enough so that another faculty member could reliably rate students' work.
Step 5: Test the rubric for clarity, rigor, scoring reliability.
- Discuss with colleagues and students. Review feedback and revise.
- Score a sample of student work to identify consistency of scoring and efficiency of use.
Tip: Faculty members often find it useful to establish the minimum score needed for student work to be deemed passable. For example, faculty members may decide that a "1" or "2" on a 4-point scale (4=exemplary, 3=proficient, 2=marginal, 1=unacceptable) does not meet the minimum quality expectations. They may set their criterion for success as: 90% of the students must score 3 or higher. If assessment study results fall short, action will need to be taken.
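As a rough illustration of how such a criterion might be checked once scores are collected, here is a minimal sketch; the scores, the 4-point scale, and the 90% target below are hypothetical.

```python
# Hypothetical scores on a 4-point rubric
# (4 = exemplary, 3 = proficient, 2 = marginal, 1 = unacceptable).
scores = [4, 3, 3, 2, 4, 3, 3, 3, 1, 4, 3, 3]

threshold = 3   # minimum score faculty agreed counts as meeting expectations
target = 0.90   # criterion for success: 90% of students at or above threshold

proportion = sum(s >= threshold for s in scores) / len(scores)
print(f"{proportion:.0%} of students scored {threshold} or higher")

if proportion >= target:
    print("Criterion for success met.")
else:
    print("Criterion not met -- results suggest action is needed.")
```

A spreadsheet works just as well; the point is simply to compare the proportion of students at or above the agreed-upon score against the criterion for success.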
Important: When developing a rubric for program assessment, enlist the help of colleagues. Rubrics promote shared expectations and consistent grading practices which benefit faculty members and students in the program.
Step 6: Provide rubric when assigning the assessment task so students can use it to guide their work.
Adapted from the University of Hawaii at Manoa
Calibrating a Rubric
When using a rubric for program assessment purposes, faculty members apply the rubric to pieces of student work (e.g., reports, oral presentations, design projects). To produce dependable scores, each faculty member needs to interpret the rubric in the same way. The process of training faculty members to apply the rubric is called "norming." It's a way to calibrate the faculty members so that scores are accurate and consistent across the faculty. Below are directions for an assessment coordinator carrying out this process.
Suggested materials for a calibration session:
- Copies of the rubric
- Copies of student work that illustrate each level of mastery. Suggestion: have 6 example pieces (2 low, 2 middle, 2 high)
- Score sheets
Hold the scoring session in a room that allows the scorers to spread out as they rate the student pieces.
Process:
- Describe the purpose of the activity, stressing that the purpose is to assess the scoring device (rubric), not individual students or faculty, and describe ethical guidelines, including respect for confidentiality and privacy.
- Describe the scoring rubric and its categories. Explain how it was developed.
- Analytic: Explain that readers should rate each dimension of an analytic rubric separately, and they should apply the criteria without concern for how often each score (level of mastery) is used. Holistic: Explain that readers should assign the score or level of mastery that best describes the whole piece.
- Give each scorer a copy of several student products that are exemplars of different levels of performance. Ask each scorer to independently apply the rubric to each of these products, writing their ratings on their individual scoring sheet.
- Once everyone is done, collect the ratings and display them so everyone can see the degree of agreement (a simple agreement check is sketched below). Alternatively, the facilitator could ask raters to raise their hands when their rating category is announced, making the extent of agreement very clear to everyone and identifying raters who routinely give unusually high or low ratings.
- Guide the group in a discussion of their ratings. Attempt to reach consensus on the most appropriate rating for each of the products being examined by inviting people who gave different ratings to explain their judgments. You might allow the group to revise the rubric to clarify its use but avoid allowing the group to drift away from the rubric and learning outcome(s) being assessed.
- Once the group is comfortable with how the rubric is applied, the rating begins. Explain how to record ratings using the score sheet and explain the procedures. Reviewers begin scoring.
- If you can quickly summarize the scores, present a summary to the group at the end of the reading. You might end the meeting with a discussion of five questions:
- Are results sufficiently reliable?
- What do the results mean? Are we satisfied with the extent of students' learning?
- Who needs to know the results?
- What are the implications of the results for curriculum, pedagogy, or student support services?
- How might the assessment rubric be improved?
Adapted from the University of Hawaii at Manoa
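When the group discusses whether results are sufficiently reliable, a common starting point is exact percent agreement between pairs of raters (more formal statistics, such as Cohen's kappa, can follow). The sketch below is illustrative only; the rater names and ratings are hypothetical.

```python
from itertools import combinations

# Hypothetical ratings: each rater's scores for the same six sample papers,
# using a 4-point rubric scale.
ratings = {
    "Rater A": [3, 2, 4, 1, 3, 2],
    "Rater B": [3, 2, 3, 1, 3, 2],
    "Rater C": [4, 2, 4, 1, 2, 2],
}

# Exact percent agreement for each pair of raters.
for (name_1, scores_1), (name_2, scores_2) in combinations(ratings.items(), 2):
    matches = sum(a == b for a, b in zip(scores_1, scores_2))
    print(f"{name_1} vs {name_2}: {matches / len(scores_1):.0%} exact agreement")
```

If agreement is low, revisit the cell descriptions for the characteristics on which raters most often disagree before scoring the full sample.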
Tips for Developing and Using Rubrics
Ideas for Using Rubrics in Courses
- Hand out the rubric with the assignment so students will know your expectations and how they'll be graded.
- Use a rubric for grading student work and return the rubric with the grading on it. Include additional comments, either within each section or at the end.
- Develop a rubric with your students for an assignment or group project. Students can then monitor themselves and their peers using agreed-upon criteria that they helped develop. Many faculty members find that students will create higher standards for themselves than faculty members would impose on them.
- Have students apply your rubric to sample products before they create their own. Faculty members report that students are quite accurate when doing this, and this process should help them evaluate their own projects as they are being developed. The ability to evaluate, edit, and improve draft documents is an important skill.
- Have students exchange paper drafts and give peer feedback using the rubric. Then, give students a few days to revise before submitting the final draft to you. You might also require that they turn in the draft and peer-scored rubric with their final paper.
- Have students self-assess their products using the rubric and hand in their self-assessment with the product; then, faculty members and students can compare self- and faculty-generated evaluations.
Tips for developing a rubric
- Find and adapt an existing rubric! It is rare to find a rubric that is exactly right for your situation, but you can adapt an already existing rubric that has worked well for others and save a great deal of time. A faculty member in your program may already have a good one.
- Evaluate the rubric. Ask yourself: A) Does the rubric relate to the outcome(s) being assessed? (If yes, success!) B) Does it address anything extraneous? (If yes, delete.) C) Is the rubric useful, feasible, manageable, and practical? (If yes, find multiple ways to use the rubric: program assessment, assignment grading, peer review, student self assessment.)
- Collect samples of student work that exemplify each point on the scale or level. A rubric will not be meaningful to students or colleagues until the anchors/benchmarks/exemplars are available.
- Expect to revise.
- When you have a good rubric, SHARE IT via the Assessment Rubric Collection! assessment@ksu.edu
Adapted from the University of Hawaii at Manoa
2017 Critical Thinking Rubric Workshop Materials
Other Guides to Effective Student Learning Measurement
- Tips for Assessment in Large Classes (from Texas Tech University)
- Creating assessments in CANVAS
- Writing good multiple choice questions
- Designing Better Quizzes