The ATLAS Research Lab has a keen interest in understanding how testing strategies affect students and their academic performance. One of the ways the lab is doing this is by exploring how modifications to traditional multiple-choice questions (MCQs) can impact educational outcomes in an undergraduate course on human embryology.
Although considered an elective by many, the ANA301H1 undergraduate embryology course (Faculty of Arts & Science, University of Toronto) attracts a diverse range of students with varying academic foci. This diversity provides an excellent opportunity, which the research team has leveraged, to observe the impact of different types of testing on student learning across disciplines. But why focus on embryology? The course deals with complex topics like the differentiation of mesodermal and ectodermal tissues (think muscle development versus skin). While many students like me grasp these broader concepts, we often struggle with the finer details of developmental pathways. This challenge highlights a prevalent issue: comprehending overarching concepts does not necessarily mean being able to apply them in more intricate scenarios. By tweaking how MCQs are designed, the research team hopes to bridge the gap between knowing and applying the complex embryological concepts learned in lecture.
During a traditional MCQ, the test-taker is expected to read a question statement (the question stem) and select one correct response from a list of 3-5 options. The incorrect responses that students have to rule out are called distractors. While this assessment format can be easy to administer and grade, it tends to reflect a superficial understanding of content and is often criticized for being an inaccurate measure of student learning.

“I never felt that traditional MCQs were appropriately assessing my students,” explains Dr. Danielle Bentley, the Lead PI of the ATLAS Research Lab and the Course Director of ANA301. “The research literature suggests poor validity, which in education research means that the testing strategy isn’t a true assessment of student knowledge. With traditional MCQs, we know that students can guess when they don’t know the answer, earning them a score of 100% on a topic they don’t understand. We also know that students may have a peripheral understanding of a concept but select the incorrect answer, earning them a score of 0% on a topic that they have a fundamental understanding of. These two extremes are the exact two scenarios I wanted to address when I designed the critique-and-correct multiple-choice questions (ccMCQs) that I employ across my courses.”

Dr. Bentley’s insight highlights a fundamental flaw in traditional MCQs: they often reward guessing but fail to credit partial knowledge. As a student, I have personally experienced both extremes: getting answers right by chance and getting questions wrong despite understanding the big picture. The introduction of ccMCQs, however, offers a more balanced approach that allows students to demonstrate partial knowledge, making it a fairer assessment format.
The ccMCQ format, as its name suggests, requires students to engage with and critique all options. When completing ccMCQs, students are tasked not just with selecting the correct response (worth 1 mark), but also with correcting two of the remaining three distractors (worth 0.5 marks each). This approach promotes a deeper understanding of the material and may alleviate some of the stress associated with the all-or-none approach of traditional MCQs. Importantly, this format does not diminish academic rigour; instead, it raises the stakes by assigning more marks to each question, thus preserving high standards while providing a more robust assessment of knowledge.
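To make the scoring concrete, here is a minimal sketch, in Python, of how a single ccMCQ might be tallied under the scheme described above; the function and variable names are hypothetical and serve only to illustrate how partial knowledge earns partial credit.

    def score_ccmcq(selected_correctly, corrections_accepted):
        """Hypothetical illustration of ccMCQ scoring: 1 mark for selecting the
        correct response, plus 0.5 marks for each of the two required distractor
        corrections that the grader accepts (0, 1, or 2)."""
        corrections = max(0, min(corrections_accepted, 2))  # only two corrections are required
        return (1.0 if selected_correctly else 0.0) + 0.5 * corrections

    # A student who misses the correct option but critiques both required
    # distractors still earns 1.0 of the 2.0 available marks, whereas a
    # traditional MCQ would have scored the same attempt as 0.
    print(score_ccmcq(False, 2))  # 1.0
    print(score_ccmcq(True, 2))   # 2.0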
Indeed, our latest data from n = 908 ANA301 students (2019-2024) show that the same questions, when graded as ccMCQs, do a better job of predicting student performance than when graded as traditional MCQs. For each of the term tests, and also for the final course grade (R^2 = 0.70 for pooled ccMCQs, R^2 = 0.54 for pooled traditional MCQs), ccMCQs consistently had stronger predictive power. Plus, ccMCQs had less variation in student scores (F = 1.41, p < 0.001, 95% CI = 1.31-1.52), which means they provide a more precise measure of what students have learned. All of this is to say that ccMCQs might be a more valid way to assess student learning than traditional MCQs.
But what do the students think about this new question format? After a roundtable discussion between myself and two other ANA301 students who wrote this assessment format, [Nancy] and [Joe], several important insights came to light. We all agreed that ccMCQs encourage a deeper demonstration of knowledge, particularly by requiring students to explain why certain distractors are wrong, even if they are unsure of the correct answer. “Explaining why the distractors were incorrect helped me connect the material to the bigger picture. It felt more like problem-solving than just picking an answer,” [Joe] shared. The complexity of distinguishing between distractors often made the process more challenging than expected. “In my experience, the format was challenging and introduced unfamiliar stresses compared to a traditional MCQ exam, but this was intentional—it was designed to be more intellectually stimulating,” [Nancy] explained. There were times when the subtlety of the incorrect answers created confusion, making it difficult to focus on the core concept being tested. This suggests that while the ccMCQ format is valuable for deep learning and assessment validity, clearer and more thoughtfully crafted distractors could enhance its effectiveness without compromising the rigour of the exam.
While the ccMCQ format may seem to favour students with stronger critical-thinking and writing skills, the critiques require only a few words, with no emphasis on grammar or syntax. So, keen students with weaker writing skills should still be able to perform well, provided they understand the course content. An important drawback to consider is the logistical demand on the teaching side. Unlike traditional MCQs, ccMCQs are much harder to grade, as they require more human resources to evaluate not only the correct answer, but also the student’s rationale for each distractor. Moreover, ccMCQs may take more time to complete than traditional MCQs, potentially increasing stress for students. However, the test design accounts for this by providing additional time to accommodate the more complex cognitive demands, ensuring that time pressure alone does not unfairly impact student performance. “It was harder to just rely on memorization with the ccMCQs. I had to really understand why an answer was right or wrong, which made the test more stressful but also more rewarding,” [Nancy] further explains. While ccMCQs undoubtedly promote critical engagement, the increased cognitive load they impose could unintentionally heighten stress for some students, potentially affecting performance.
Despite these hurdles, the overall student response has been positive. As learning environments and student audiences continue to evolve, teaching innovations like the ccMCQ format seem to be a step in the right direction, helping students demonstrate what they have learned more directly.