• An application of judgment analysis to examination marking in psychology

      Elander, James; Hardman, David; University of Derby; London Guildhall University (Wiley, 2002)
      Statistical combinations of specific measures have been shown to be superior to expert judgement in several fields. In this study judgement analysis was applied to examination marking to investigate factors that influenced marks awarded and contributed to differences between first and second markers. Seven markers in psychology rated 551 examination answers on seven ‘aspects’ for which specific assessment criteria had been developed to support good practice in assessment. The aspects were addressing the question, covering the area, understanding, evaluation, development of argument, structure and organisation, and clarity. Principal components analysis indicated one major factor and no more than two minor factors underlying the seven aspects. Aspect ratings were used to predict overall marks, using multiple regression to ‘capture’ the marking policies of individual markers. These varied from marker to marker in terms of the numbers of aspect ratings that made independent contributions to the prediction of overall marks and the extent to which aspect ratings explained the variance in overall marks. The number of independently predictive aspect ratings, and the amount of variance in overall marks explained by aspect ratings, were consistently higher for first markers (question setters) than for second markers. Co-markers’ overall marks were then used as an external criterion to test the extent to which a simple model consisting of the sum of the aspect ratings improved on overall marks in the prediction of co-markers’ marks. The model significantly increased the variance in co-markers’ marks accounted for, but only for second markers, who had neither taught the material nor set the question. Further research is needed to develop the criteria and especially to establish the reliability and validity of specific aspects of assessment.
The present results support the view that, for second markers at least, combined measures of specific aspects of examination answers may help to improve the reliability of marking.
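
The two regression steps described in the abstract can be illustrated with a minimal sketch. It is not the authors' analysis: the data are synthetic, and the sample size, the weights, the noise levels and the helper names (`solve`, `fit_ols`) are all assumptions made for illustration. Step 1 regresses one marker's overall marks on the seven aspect ratings to 'capture' a marking policy; step 2 checks how much the sum of the aspect ratings adds to the overall mark when predicting a co-marker's mark.

```python
import random

def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x

def fit_ols(rows, y):
    """Ordinary least squares with an intercept; returns (coefficients, R squared)."""
    X = [[1.0] + list(r) for r in rows]
    k = len(X[0])
    xtx = [[sum(x[i] * x[j] for x in X) for j in range(k)] for i in range(k)]
    xty = [sum(x[i] * yi for x, yi in zip(X, y)) for i in range(k)]
    beta = solve(xtx, xty)
    pred = [sum(b * v for b, v in zip(beta, x)) for x in X]
    mean_y = sum(y) / len(y)
    ss_res = sum((yi - p) ** 2 for yi, p in zip(y, pred))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return beta, 1.0 - ss_res / ss_tot

ASPECTS = ["addressing the question", "covering the area", "understanding",
           "evaluation", "development of argument",
           "structure and organisation", "clarity"]

# Synthetic data (assumption): each answer has a latent quality, seven noisy
# aspect ratings, one marker's overall mark and a co-marker's overall mark.
rng = random.Random(0)
aspects, overall, co_mark = [], [], []
for _ in range(200):
    quality = rng.gauss(0, 1)
    ratings = [quality + rng.gauss(0, 0.7) for _ in ASPECTS]
    aspects.append(ratings)
    # Hypothetical policy: this marker weights only two of the seven aspects.
    overall.append(55 + 4 * ratings[2] + 3 * ratings[3] + rng.gauss(0, 3))
    co_mark.append(55 + 8 * quality + rng.gauss(0, 4))

# Step 1: policy capturing, overall mark regressed on the aspect ratings.
policy_beta, policy_r2 = fit_ols(aspects, overall)

# Step 2: does the sum of aspect ratings add to the overall mark when
# predicting the co-marker's mark?
_, r2_overall_only = fit_ols([[o] for o in overall], co_mark)
_, r2_with_sum = fit_ols([[o, sum(a)] for o, a in zip(overall, aspects)], co_mark)

for name, weight in zip(ASPECTS, policy_beta[1:]):
    print(f"{name:28s} weight {weight:6.2f}")
print(f"policy R^2: {policy_r2:.3f}")
print(f"co-marker R^2, overall mark only: {r2_overall_only:.3f}")
print(f"co-marker R^2, overall + aspect sum: {r2_with_sum:.3f}")
```

Because the second model nests the first, its R squared cannot be lower; the study's question is whether the increase is statistically significant, which this sketch does not test.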