Friday, 16 August 2019

what makes a good assessment?

Not only in medical schools, but in any educational experience, assessment is the bane of students' existence. Yet it is increasingly being re-evaluated as a valuable tool for learning.

So if students hate assessments (they always will, no matter how many candies you give them), how do we at least make them effective and useful for the students?

A good assessment should have:
  • Generalizability
    • an "A" scored on an exam should mean that the student has exceptional ability in a certain field. Generalizability means that the score reflects the ability: for example, a student who scored an A in Anatomy should have more knowledge of anatomy in general than another student who scored a C.
    • a good assessment with generalizability is one whose results let us conclude something. for example, a pass on a driving licence exam suggests that the person is equipped with enough knowledge and skill to drive a car (hopefully)
  • Validity (with evidence of validity) - does the test measure what it is supposed to measure?
    • content validity
      • the assessment content should reflect the learning outcomes and teaching strategies.
      • some common mistakes are:
        • content underrepresentation - too little content is sampled by the assessment - e.g. a 5-minute interview is the only assessment used to decide whether a student enters a university.
        • content-irrelevant variance - irrelevant content is assessed - e.g. in anatomy, "name the carpal bones, in Latin."
    • response process
      • concerns the administration, management and implementation of the exam
      • if the students say "we don't think it is fair to be tested this way" or "we don't know how to pass this exam", there is a need to re-evaluate the validity of the exam.
    • internal structure
      • item difficulty and discrimination index
      • reliability
      • score variance
    • relationship to other variables
      • how strongly the assessment correlates with other assessments in terms of the abilities tested - for example, a clinical Mini-CEX score should correlate more with an OSCE than with a knowledge-based MCQ.
      • hence an assessment can be taken as valid if its score correlates positively with the score of another exam that tests the same abilities.
    • consequences
      • the influence the assessment has on the stakeholders - i.e. the students, educators, and society.
      • for example, an assessment so difficult that it discourages students, or drives them to insanity, is not a valid assessment.
  • Reliability - the degree to which an assessment tool produces stable and consistent results
    • test-retest reliability - does retaking the same test produce similar results?
    • parallel-forms reliability - does testing the same construct with a different test produce the same score?
    • inter-rater reliability - do different judges give the same student similar scores?
    • internal consistency reliability - does testing the same construct with different questions produce consistent answers?
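To make the internal-structure ideas concrete, here is a minimal Python sketch of classical item analysis on a tiny invented 0/1 score matrix - the scores and the top/bottom-half split are made up purely for illustration:

```python
# Item analysis on a small dichotomous (0/1) score matrix:
# rows = students, columns = items. Data invented for illustration.
scores = [
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 1, 1],
    [1, 1, 1, 1],
    [0, 0, 0, 0],
    [1, 1, 0, 1],
]

n_students = len(scores)
n_items = len(scores[0])

# Difficulty index: proportion of students answering each item correctly.
difficulty = [sum(row[j] for row in scores) / n_students for j in range(n_items)]

# Discrimination index: difference in item success rate between the
# top and bottom halves of students ranked by total score.
ranked = sorted(scores, key=sum, reverse=True)
half = n_students // 2
top, bottom = ranked[:half], ranked[-half:]
discrimination = [
    sum(r[j] for r in top) / half - sum(r[j] for r in bottom) / half
    for j in range(n_items)
]

# Cronbach's alpha: internal-consistency reliability.
def variance(xs):
    m = sum(xs) / len(xs)
    return sum((x - m) ** 2 for x in xs) / (len(xs) - 1)  # sample variance

item_vars = [variance([row[j] for row in scores]) for j in range(n_items)]
total_var = variance([sum(row) for row in scores])
k = n_items
alpha = (k / (k - 1)) * (1 - sum(item_vars) / total_var)

print("difficulty:", difficulty)
print("discrimination:", discrimination)
print("alpha:", round(alpha, 3))
```

Roughly: items with difficulty near 0 or 1 tell us little, items with low or negative discrimination deserve review, and an alpha above about 0.7 is conventionally taken as acceptable internal consistency.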
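The relationship-to-other-variables idea can likewise be checked numerically: correlate scores on two assessments that are supposed to tap the same abilities. A small self-contained sketch (all score lists invented):

```python
# Convergent-validity check: correlate scores on two assessments that
# claim to test the same abilities (e.g. Mini-CEX vs OSCE).
# The paired score lists below are invented for illustration.

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (sx * sy)

mini_cex = [72, 65, 80, 58, 90, 75]
osce     = [70, 60, 85, 55, 88, 78]
mcq      = [60, 88, 55, 70, 62, 50]

print("Mini-CEX vs OSCE:", round(pearson(mini_cex, osce), 2))
print("Mini-CEX vs MCQ: ", round(pearson(mini_cex, mcq), 2))
```

A high positive correlation with the OSCE (and a weaker one with the MCQ) would support the claim that the Mini-CEX is measuring clinical ability rather than pure factual recall.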
a more straightforward explanation was given by Van der Vleuten (1996), who proposed five criteria for determining the usefulness of a particular method of assessment:
  • reliability (the degree to which the measurement is accurate and reproducible), 
  • validity (whether the assessment measures what it claims to measure), 
  • impact on future learning and practice, 
  • acceptability to learners and faculty, 
  • costs (to the individual trainee, the institution, and society at large).
Van der Vleuten CPM. The assessment of professional competence: developments, research and practical implications. Adv Health Sci Educ. 1996;1:41-67.
