7 min read

Oral Assessment at Scale: How AI Makes It Possible

Oral exams have always been the gold standard for assessing understanding. AI now makes them practical for classes of any size.

Oral Assessment · EdTech · AI in Education
By Dr. Ehoneah Obed · Founder, Pruuva

Ask any professor what the most reliable way to assess whether a student truly understands something, and you will hear some version of the same answer: talk to them.

This is not a new insight. Oral examination has been central to education for centuries. The viva voce (literally "with living voice") has been part of the doctoral defense since the days of medieval European universities. Medical schools assess clinical reasoning through oral boards. Law professors use the Socratic method to test whether students can think through a problem in real time. The reason these traditions persist is straightforward: it is very difficult to fake understanding when someone is asking you follow-up questions.

Every educator knows this. The problem has never been whether oral assessment works. The problem has been making it work at scale.

The Math That Kills Oral Exams

Consider a fairly standard university course with 150 enrolled students. Even a brief 10-minute oral assessment with each student adds up to 25 hours of faculty time, and that is before you account for scheduling, transitions between students, note-taking, and the inevitable no-shows that need to be rescheduled.

For a large introductory course with 300 or 500 students, the numbers become absurd. No professor, no matter how dedicated, can conduct meaningful oral assessments at that scale while also teaching, conducting research, mentoring graduate students, and serving on committees.
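If you want to see that arithmetic laid out, here is a quick back-of-the-envelope calculation as a minimal Python sketch. The 10-minute figure matches the example above, and scheduling overhead is deliberately excluded, so these totals are a floor rather than an estimate.

```python
# Back-of-the-envelope faculty time for one-on-one oral exams.
# 10 minutes per student matches the example above; scheduling,
# transitions, and note-taking are excluded, so this is a floor.

def faculty_hours(students: int, minutes_per_exam: int = 10) -> float:
    return students * minutes_per_exam / 60

for enrollment in (150, 300, 500):
    print(f"{enrollment} students -> {faculty_hours(enrollment):.1f} hours of oral exams")
# 150 -> 25.0 hours, 300 -> 50.0 hours, 500 -> 83.3 hours
```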

So we default to written exams and term papers. These are easy to distribute, relatively fast to grade (especially with TAs), and logistically manageable. They are also, as recent events have made painfully clear, easy to game with generative AI.

The irony is hard to miss. We know how to assess understanding well. We just cannot do it for everyone.

What Changes When AI Conducts the Conversation

The breakthrough is not a new pedagogical theory. It is a practical one. What if the conversation could be conducted by an AI that adapts to each student's responses in real time?

This is not a chatbot asking multiple-choice questions. An AI-powered oral probe is a structured conversation that follows the student's reasoning, adjusts its depth based on the responses, and generates a detailed evidence report for the educator to review.

Here is what that looks like in practice.

It adapts to each student individually. A written exam gives the same questions to every student. An adaptive probe works differently. If a student gives a strong, confident answer, the AI pushes deeper into the concept. If a student is struggling, it comes at the idea from a different angle. The result is that each student gets an assessment calibrated to their actual level of understanding, not a one-size-fits-all test.
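To make "adjusts its depth" less abstract, here is a deliberately simplified sketch of what that branching could look like. The strength score, the thresholds, and the question templates are my illustrative assumptions, not a description of Pruuva's actual logic.

```python
# Illustrative sketch only: the 0-to-1 strength score, the thresholds,
# and the question templates are assumptions, not the real system.

def next_question(concept: str, answer_strength: float) -> str:
    """Pick a follow-up based on how strong the last answer was (0.0 to 1.0)."""
    if answer_strength >= 0.8:
        # Strong, confident answer: push deeper into the same concept.
        return f"Good. What happens to {concept} if your key assumption doesn't hold?"
    if answer_strength >= 0.5:
        # Partial answer: ask for clarification of the shaky part.
        return f"Can you walk me through how {concept} applies in your example?"
    # Struggling: come at the idea from a different angle.
    return f"Let's try another route: can you give an everyday example of {concept}?"

# e.g. next_question("confounding variables", 0.9)
```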

It is consistent in ways humans cannot be. This is an uncomfortable truth about oral exams: human examiners are inconsistent. Studies have shown that examiners give different scores depending on the time of day, how many students they have already assessed, and even whether they have eaten recently. An AI probe applies the same evaluative framework to every student without fatigue, mood, or unconscious bias affecting the outcome.

It produces structured evidence, not just a score. After each probe, the system generates a comprehension report that shows which concepts the student demonstrated understanding of, where they struggled, and how their responses mapped to the course learning objectives. For educators, this is dramatically more useful than a percentage score on a written test. It shows you where the gaps are.
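As a rough illustration of what "structured evidence" can mean, here is one hypothetical shape for such a report. The field names simply mirror the description above; they are assumptions, not Pruuva's actual schema.

```python
# Hypothetical report structure; field names mirror the prose above
# and are assumptions, not the product's real data model.
from dataclasses import dataclass, field

@dataclass
class ConceptEvidence:
    concept: str             # e.g. "confounding variables"
    learning_objective: str  # the course objective this concept maps to
    demonstrated: bool       # did the student show understanding here?
    notes: str               # short summary of the relevant exchange

@dataclass
class ComprehensionReport:
    student_id: str
    assignment: str
    evidence: list[ConceptEvidence] = field(default_factory=list)

    def gaps(self) -> list[str]:
        """The concepts the student struggled with -- where the gaps are."""
        return [e.concept for e in self.evidence if not e.demonstrated]
```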

It works whether you have 15 students or 1,500. The system scales without requiring additional faculty time. Every student gets the same quality of assessment regardless of class size. This is the part that really changes the equation.

What the Experience Actually Feels Like

I think it is important to be concrete about this, because "AI oral assessment" can sound intimidating until you see how it actually works.

A typical probe takes somewhere between 5 and 10 minutes. The flow goes like this:

  1. The student submits their work. This could be an essay, a lab report, a case analysis, a problem set, or really any assignment where understanding matters.
  2. The system reviews the submission and prepares questions that target the key concepts, arguments, and reasoning in the student's own work.
  3. The student enters a conversation where the AI asks questions, listens to the answers, and follows up based on what the student says. If the student makes a strong point, the AI might ask them to extend it. If something seems unclear, it asks for clarification.
  4. When the probe is complete, both the student and the educator receive a comprehension report.
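To make that flow concrete, here is a minimal, self-contained sketch of the same four steps. Every name and the canned questions are illustrative assumptions, not a real API.

```python
# Self-contained sketch of the four steps above; all names and the
# canned questions are illustrative, not an actual interface.

def prepare_questions(submission: str) -> list[str]:
    # Step 2: a real system would target the submission's own concepts
    # and arguments; these placeholders only show the shape of it.
    return [
        "Why did you choose the approach you took in this work?",
        "What would change if your main assumption didn't hold?",
    ]

def run_probe(submission: str, ask_student) -> list[tuple[str, str]]:
    """Steps 1-3: walk through the questions and record each exchange."""
    transcript = []
    for question in prepare_questions(submission):
        answer = ask_student(question)       # step 3: one conversational turn
        transcript.append((question, answer))
    return transcript                        # step 4: the raw material for the report

# Usage (text-only stand-in for the real conversation):
# transcript = run_probe(essay_text, ask_student=input)
```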

The feel of it is less like an interrogation and more like a good office hours conversation. The kind where a professor says "that's interesting, tell me more about why you chose that approach" or "what would change if this assumption didn't hold?" It is the conversation every student deserves to have about their work, but that no single faculty member can have with hundreds of students.

Honest Answers to Common Concerns

People have reasonable questions about this approach, and I want to address them directly.

"My students would be terrified of an oral assessment."

Some nervousness is normal, especially for the first probe. But here is what we have seen consistently: students adjust quickly after their first probe. Many end up preferring it to written exams because they feel they can actually show what they know instead of hoping their written answers capture their full understanding. Students frequently say things like "I know this better than my paper shows," and a probe lets them demonstrate exactly that.

"Can AI really assess understanding the way a professor would?"

The AI is not replacing the professor's judgment. Think of it more as a structured interview conducted on the educator's behalf. The AI gathers evidence through conversation, and the educator reviews that evidence. The professor still makes the final call. What changes is that they now have a rich, detailed picture of each student's comprehension instead of just a written submission.

"What about students with speech differences, anxiety disorders, or language barriers?"

This is something we think about constantly. A well-designed probe system focuses on conceptual understanding, not on how fluently or eloquently someone speaks. Students who need accommodations can receive adjusted timing, alternative question formats, or other modifications. The goal is always to let the student show what they know, and to remove barriers to doing so.

The Ripple Effect on Learning

There is something interesting that happens when students know they will need to explain their work verbally after submitting it. Their entire approach to the assignment changes.

They read more carefully. They think more deeply about why an argument holds, not just whether it sounds good on paper. They engage with the material on a level that goes beyond producing a polished submission, because they know they cannot just hand in something they do not understand.

This is the part that excites me most about this approach. Detection-based integrity systems create an adversarial dynamic where students try to avoid getting caught and institutions try to catch them. Verification creates a different dynamic entirely. It tells students: we care about what you actually learned, and we are going to give you a chance to show it.

That is not just a better integrity system. It is a better learning system.

The Road Ahead

The technology to make oral assessment scalable exists right now. The real challenges are the human ones: adoption, integration with existing institutional workflows, and building the kind of trust that comes from seeing the evidence firsthand.

For the educators who have always known that a conversation is the best way to understand what a student really knows, the message is simple. You were right all along. The technology has finally caught up.

Ready to verify understanding?

Join educators who are moving from detection to evidence-based assessment.

Get early access

Keep reading

Why AI Detection Is Failing Higher Education


AI detection tools promise to catch AI-generated work, but false positives, bias, and an arms race make them unreliable. There's a better path forward.

Beyond Plagiarism: Rethinking Academic Integrity in the AI Era


Academic integrity policies built around plagiarism don't work in the AI era. It's time to reframe integrity around demonstrated understanding.