Measurement and Evaluation
Data-collection methods for assessment purposes typically fall into two categories: direct and indirect. Both are valuable, but indirect evidence by itself is insufficient; direct evidence is required. Ideally, a program collects both types.
Why is direct evidence of student learning required?
Direct evidence reveals what students have learned, while indirect evidence can help faculty interpret direct evidence and guide improvements. For example, if students self-report on a survey (indirect evidence of learning) that their knowledge of world geography is excellent but later fail a multiple-choice world geography test (direct evidence), that's useful information. The indirect evidence by itself isn't meaningful without the direct evidence of students' knowledge. Programs can collect both direct and indirect evidence of student learning to gain a better picture of their students.
Measurement Best Practices
Direct and indirect measures.
- Assessment measures are divided into two broad classes: direct and indirect measures.
- Direct measures are those in which students demonstrate learning and their work is measured with a scoring device.
- Indirect measures include those in which students report their attitudes, perceptions, or feelings about their learning, usually in the form of a survey. Indirect measures also include data that is related to students but that cannot be directly tied to an outcome (e.g., job/graduate school placement rates, publication counts, GPA, course grades).
- Direct measures are preferable as the primary form of assessment, but some objectives may only be measurable with more indirect methods.
Choose methods that will
- allow students to demonstrate specific student learning outcomes.
- be credible to the faculty and the intended users of the results.
- provide useful, meaningful and actionable information that can be used as a basis for decision-making. Quantity is not the goal. Faculty members must be willing to discuss and make changes to the program (as needed) based on the results.
Try not to “reinvent the wheel.”
- Use or modify existing evidence whenever possible. Assessment is usually already happening in courses. Inventory what evidence of student learning and perceptions about the program already exist. A curriculum matrix is a useful tool when conducting the inventory.
- Course-embedded assessments are often the most efficient and authentic way to assess learning.
- Consider adapting assessment tasks and/or scoring devices already in use by faculty in the discipline, in other departments, or at other institutions.
- Work to keep an assessment task as authentic to expected learning as possible.
- Use more than one indicator of learning whenever possible, whether that be multiple direct measures or a combination of direct and indirect measures.
Be aware of semantic issues.
- Use inclusive and culturally sensitive language.
- Keep assessment questions simple and to the point.
- Avoid leading or “double-barreled” questions, which produce participant responses that are not focused on the relevant issue.
Pilot test when possible.
- It is generally best to pilot exams, assessment tasks, surveys, and interview questions so that any confusion about how questions are interpreted is discovered before full administration. Cognitive interviewing techniques can be useful to obtain feedback during pilot testing.
Keep the length of the assessment measure as short as possible.
- Assessment tasks, surveys, or focus groups/interviews that are too long lose their effectiveness and may result in inaccurate measurement of achievement or in higher non-response or incomplete-response rates.
- Make sure the assessment is feasible to carry out given the program's resources and amount of time faculty members are willing to invest in assessment activities.
Direct and Indirect Measures
Examples of Direct Measures of Student Learning
- Course-embedded tests, assignments, or projects
- Culminating experiences: capstone projects, senior theses, senior exhibits or performances
- Portfolio assessment
- Licensure, certification, or professional exams
- Essay questions blind scored by faculty across the department, division, school, or college
- Internal and external juried review of comprehensive senior projects, exhibitions, and performances
- Employer's or internship supervisor's direct evaluations of students' performances
- Faculty assessment of a student publication or conference presentation
Adapted from Cecilia Lopez, NCA Commission on Institutions of Higher Education: Opportunities for Improvement: Advice from Consultant-Evaluators of Programs to Assess Student Learning, March 1996
Indirect Measures of Student Learning
- Alumni, employer, and student surveys
- Exit interviews of graduates and focus groups
- Interviews of instructors, program coordinators, residence hall leaders, and others who have direct contact with students
- Graduate follow-up studies
- Retention and transfer studies
- Length of time to degree
- SAT / ACT scores
- Graduation rates and transfer rates
- Job placement data
- Observing and recording students’ behaviors
Sources:
(1) University of Hawaii - Choose a Method to Collect Data/Evidence
(2) Center for Effective Collaboration and Practice’s list of Indirect Measures
Rubric Templates & Samples
Direct Assessments
Embedded Assignments
When faculty members collaborate to reach consensus on what is acceptable and exemplary student work in their discipline, students receive more consistent grading and feedback from professors in the program.
Embedded assessments may take the form of an embedded test or quiz or an embedded course assignment; each approach has its own benefits and drawbacks.
Portfolios
Keeping a portfolio can lead students to become more reflective and increase their motivation to learn.
Licensure or Certification Exams
Pre-/Post-test
Internship Supervisor's direct evaluation
Indirect Assessments
Surveys
Surveys are the most commonly used indirect assessment method. Surveys are useful tools for collecting information regarding attitudes, beliefs, experiences, values, needs, demographic information, perceptions, etc.
When To Use Surveys
As an assessment tool, surveys should be employed when the goal is to draw relatively quick conclusions about the perceptions of a target population. Surveys can reach a large number of people in a short amount of time and typically produce data that is easy to analyze. Your college and program can get disaggregated data from the Assessment team's surveys. You may also add college- or program-specific questions for students in your program by contacting assessment@ksu.edu.
Focus Groups
Focus groups are an indirect assessment method in which a small group of individuals is assembled to provide feedback and insight into a particular product, program, service, concept, etc. In a focus group, questions are asked in an interactive setting in which participants are free to talk with other group members.
When To Use Focus Groups
Focus groups are best used when group discussion or interaction among people would bring out insights that would not be ascertained through individual interviews or survey items. Additionally, focus groups are useful when rich data is needed and there aren’t sufficient resources available to conduct individual interviews.
Interviews
Interviews are one-on-one data collection events that allow for direct questioning of research subjects. Interviews can be conducted either in person or via the telephone.
When To Use Interviews
Interviews are best used when the purpose is to gain in-depth insight into individuals’ perceptions, establish personal contact with participants, and/or follow up on prior survey findings.
Rubrics
What is a Rubric?
Rubrics are guides that help to score the performance of students. Rubrics are to be used by instructors when creating, scoring, grading, or providing feedback on course tasks and assignments, and by students to guide the development of quality work. These performance guides can help to provide a common understanding between students and instructors regarding the learning and performance expectations. Rubrics can also be used to evaluate student work collected for purposes of assessment, program evaluation, and improvement of student learning.
Why are rubrics used?
- A rubric creates a common framework and language for assessment.
- Complex products or behaviors can be examined efficiently.
- Well-trained reviewers apply the same criteria and standards.
- Rubrics are criterion-referenced, rather than norm-referenced. Raters ask, "Did the student meet the criteria for level 5 of the rubric?" rather than "How well did this student do compared to other students?"
- Using rubrics can lead to substantive conversations among faculty.
- When faculty members collaborate to develop a rubric, it promotes shared expectations and grading practices.
How to develop a rubric
A rubric is composed of four basic parts. In its simplest form, a rubric includes the following (a brief sketch follows this list):
- A task description. The outcome being assessed with instructions students receive for an assignment.
- The characteristics to be rated (rows). The skills, knowledge, and/or behavior to be demonstrated.
- Levels of mastery/scale (columns). Labels used to describe the levels of mastery should be tactful and clear. Commonly used labels include:
- Not meeting, approaching, meeting, exceeding
- Exemplary, proficient, marginal, unacceptable
- Advanced, intermediate high, intermediate, novice
- 1, 2, 3, 4
- A description of each characteristic at each level of mastery/scale (cells).
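For programs that keep their rubrics in electronic form, the four parts map naturally onto a simple data structure. The sketch below (in Python) is purely illustrative; the characteristic names, level labels, and cell descriptions are invented examples, not a recommended template.

```python
# A minimal sketch of an analytic rubric as a plain data structure.
# The characteristic names, level labels, and cell text below are invented
# examples for illustration only, not an official or recommended template.

rubric = {
    # Task description: the outcome being assessed and the assignment context.
    "task": "Written research summary (outcome: written communication)",
    # Levels of mastery/scale (columns), ordered low to high.
    "levels": ["Unacceptable", "Marginal", "Proficient", "Exemplary"],
    # Characteristics to be rated (rows), each with one description per level (cells).
    "characteristics": {
        "Organization": [
            "No discernible structure",
            "Some structure, but ideas are hard to follow",
            "Clear structure with minor lapses",
            "Logical, engaging structure throughout",
        ],
        "Use of evidence": [
            "No sources cited",
            "Sources cited but weakly connected to claims",
            "Relevant sources support most claims",
            "Well-chosen sources integrated with every claim",
        ],
    },
}

# Print each cell: the description a rater reads when deciding which level
# best matches a piece of student work.
for characteristic, cells in rubric["characteristics"].items():
    for level, description in zip(rubric["levels"], cells):
        print(f"{characteristic} | {level}: {description}")
```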
Developing a rubric
Step 1: Identify what you want to assess
Step 2: Define the characteristics to be rated (rows). Specify the skills, knowledge, and/or behaviors that you will be looking for.
- Include the characteristics required to show mastery/proficiency for the assignment as well as those that are most important to the student learning outcome.
Step 3: Identify the levels of mastery (columns).
- Aim for around four because any more than this makes it difficult to differentiate qualities of achievement.
- Always allow for a category beyond the expected rigor for the assignment, although few may achieve this level.
- The levels of mastery seldom directly reflect the grade for an assignment.
Step 4: Describe each level of mastery for each characteristic (cells).
- Describe work that meets the expected rigor of the assignment. This is the second-to-highest category.
- Describe the best work you could expect using these characteristics, beyond typical expectations. This describes the top category.
- Describe an unacceptable product. This describes the lowest category.
- Develop descriptions of intermediate-level products for intermediate categories.
Important: Each description and each characteristic should be mutually exclusive and defined well enough so that another faculty member could reliably rate students' work.
Step 5: Test the rubric for clarity, rigor, scoring reliability.
- Discuss with colleagues and students. Review feedback and revise.
- Score a sample of student work to identify consistency of scoring and efficiency of use.
Tip: Faculty members often find it useful to establish the minimum score needed for student work to be deemed passable. For example, faculty members may decide that a "1" or "2" on a 4-point scale (4=exemplary, 3=proficient, 2=marginal, 1=unacceptable) does not meet the minimum quality expectations. They may set their criterion for success as: 90% of the students must score 3 or higher. If assessment study results fall short, action will need to be taken.
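As a rough illustration of how such a criterion might be checked once scores are collected, here is a minimal sketch; the scores, the 4-point scale, and the 90% target below are hypothetical.

```python
# Hypothetical scores on a 4-point rubric
# (4 = exemplary, 3 = proficient, 2 = marginal, 1 = unacceptable).
scores = [4, 3, 3, 2, 4, 3, 3, 3, 1, 4, 3, 3]

threshold = 3   # minimum score faculty agreed counts as meeting expectations
target = 0.90   # criterion for success: 90% of students at or above threshold

proportion = sum(s >= threshold for s in scores) / len(scores)
print(f"{proportion:.0%} of students scored {threshold} or higher")

if proportion >= target:
    print("Criterion for success met.")
else:
    print("Criterion not met -- results suggest action is needed.")
```

A spreadsheet works just as well; the point is simply to compare the proportion of students at or above the agreed-upon score against the criterion for success.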
Important: When developing a rubric for program assessment, enlist the help of colleagues. Rubrics promote shared expectations and consistent grading practices which benefit faculty members and students in the program.
Step 6: Provide rubric when assigning the assessment task so students can use it to guide their work.
Adapted from the University of Hawaii at Manoa
Calibrating a Rubric
When using a rubric for program assessment purposes, faculty members apply the rubric to pieces of student work (e.g., reports, oral presentations, design projects). To produce dependable scores, each faculty member needs to interpret the rubric in the same way. The process of training faculty members to apply the rubric is called "norming." It's a way to calibrate the faculty members so that scores are accurate and consistent across the faculty. Below are directions for an assessment coordinator carrying out this process.
Suggested materials for a calibration session:
- Copies of the rubric
- Copies of student work that illustrate each level of mastery. Suggestion: have 6 example pieces (2 low, 2 middle, 2 high)
- Score sheets
Hold the scoring session in a room that allows the scorers to spread out as they rate the student pieces.
Process:
- Describe the purpose of the activity, stressing that the purpose is to assess the scoring device (rubric), not individual students or faculty, and describe ethical guidelines, including respect for confidentiality and privacy.
- Describe the scoring rubric and its categories. Explain how it was developed.
- Analytic: Explain that readers should rate each dimension of an analytic rubric separately, and they should apply the criteria without concern for how often each score (level of mastery) is used. Holistic: Explain that readers should assign the score or level of mastery that best describes the whole piece.
- Give each scorer a copy of several student products that are exemplars of different levels of performance. Ask each scorer to independently apply the rubric to each of these products, writing their ratings on their individual scoring sheet.
- Once everyone is done, collect the ratings and display them so everyone can see the degree of agreement (a simple agreement check is sketched below). Alternatively, the facilitator could ask raters to raise their hands when their rating category is announced, making the extent of agreement very clear to everyone and identifying raters who routinely give unusually high or low ratings.
- Guide the group in a discussion of their ratings. Attempt to reach consensus on the most appropriate rating for each of the products being examined by inviting people who gave different ratings to explain their judgments. You might allow the group to revise the rubric to clarify its use but avoid allowing the group to drift away from the rubric and learning outcome(s) being assessed.
- Once the group is comfortable with how the rubric is applied, the rating begins. Explain how to record ratings using the score sheet and explain the procedures. Reviewers begin scoring.
- If you can quickly summarize the scores, present a summary to the group at the end of the reading. You might end the meeting with a discussion of five questions:
- Are results sufficiently reliable?
- What do the results mean? Are we satisfied with the extent of students' learning?
- Who needs to know the results?
- What are the implications of the results for curriculum, pedagogy, or student support services?
- How might the assessment rubric be improved?
Adapted from the University of Hawaii at Manoa
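When the group discusses whether results are sufficiently reliable, a common starting point is exact percent agreement between pairs of raters (more formal statistics, such as Cohen's kappa, can follow). The sketch below is illustrative only; the rater names and ratings are hypothetical.

```python
from itertools import combinations

# Hypothetical ratings: each rater's scores for the same six sample papers,
# using a 4-point rubric scale.
ratings = {
    "Rater A": [3, 2, 4, 1, 3, 2],
    "Rater B": [3, 2, 3, 1, 3, 2],
    "Rater C": [4, 2, 4, 1, 2, 2],
}

# Exact percent agreement for each pair of raters.
for (name_1, scores_1), (name_2, scores_2) in combinations(ratings.items(), 2):
    matches = sum(a == b for a, b in zip(scores_1, scores_2))
    print(f"{name_1} vs {name_2}: {matches / len(scores_1):.0%} exact agreement")
```

If agreement is low, revisit the cell descriptions for the characteristics on which raters most often disagree before scoring the full sample.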
Tips for Developing and Using Rubrics
Ideas for Using Rubrics in Courses
- Hand out the rubric with the assignment so students will know your expectations and how they'll be graded.
- Use a rubric for grading student work and return the rubric with the grading on it. Include additional comments, either within each section or at the end.
- Develop a rubric with your students for an assignment or group project. Students can then monitor themselves and their peers using agreed-upon criteria that they helped develop. Many faculty members find that students will create higher standards for themselves than faculty members would impose on them.
- Have students apply your rubric to sample products before they create their own. Faculty members report that students are quite accurate when doing this, and this process should help them evaluate their own projects as they are being developed. The ability to evaluate, edit, and improve draft documents is an important skill.
- Have students exchange paper drafts and give peer feedback using the rubric. Then, give students a few days to revise before submitting the final draft to you. You might also require that they turn in the draft and peer-scored rubric with their final paper.
- Have students self-assess their products using the rubric and hand in their self-assessment with the product; then, faculty members and students can compare self- and faculty-generated evaluations.
Tips for developing a rubric
- Find and adapt an existing rubric! It is rare to find a rubric that is exactly right for your situation, but you can adapt an already existing rubric that has worked well for others and save a great deal of time. A faculty member in your program may already have a good one.
- Evaluate the rubric. Ask yourself: A) Does the rubric relate to the outcome(s) being assessed? (If yes, success!) B) Does it address anything extraneous? (If yes, delete.) C) Is the rubric useful, feasible, manageable, and practical? (If yes, find multiple ways to use the rubric: program assessment, assignment grading, peer review, student self assessment.)
- Collect samples of student work that exemplify each point on the scale or level. A rubric will not be meaningful to students or colleagues until the anchors/benchmarks/exemplars are available.
- Expect to revise.
- When you have a good rubric, SHARE IT via the Assessment Rubric Collection! assessment@ksu.edu
Adapted from the University of Hawaii at Manoa
2017 Critical Thinking Rubric Workshop Materials
Other Guides to Effective Student Learning Measurement
- Tips for Assessment in Large Classes (from Texas Tech University)
- Creating assessments in CANVAS
- Writing good multiple choice questions
- Designing Better Quizzes