JABSOM Students Take Top Prize for AI Tool That Evaluates Faculty Debriefing

Inside JABSOM’s SimTiki lab, medical students practice clinical scenarios on lifelike manikins designed to prepare them for real patient care. In the debrief that follows, a faculty member guides students through what went well and what didn’t. Other faculty members periodically evaluate those who lead debrief sessions, a process that helps ensure students receive high-quality feedback but that can also be time-consuming and, at times, subjective.

Now, a team of JABSOM students, Biostatistics Core staff, and SimTiki faculty is studying whether artificial intelligence can enhance and strengthen that process.

Second-year medical students (MS2s) Kristal Xie and Yash Vyas developed a project that uses AI to evaluate faculty who conduct debriefing sessions. They worked with Biostats Core members Kyle Ishikawa and Noa Brenner; SimTiki faculty Benjamin Berg and Jannet Lee-Jayaram; and SimTiki Research Fellow David Hyunchang Kim of the Department of Anesthesiology and Pain Medicine, Gangnam Severance Hospital, Yonsei University College of Medicine. The work earned first place in the student poster category at the UH x Google: Accelerating Research in the Age of AI symposium.

Using video from simulation sessions, the research team prompted an AI model to review debriefs and score them against an existing standardized debriefing evaluation framework. The model was also prompted to explain the rationale behind each rating, mirroring how a human evaluator would assess performance.
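The team’s code is not published here, but a rough, hypothetical sketch of that kind of workflow might look like the following, assuming a Gemini-style API for uploading video; the model name, prompt wording, rubric, and file name are illustrative placeholders, not the team’s actual setup.

```python
# Hypothetical sketch: score a recorded debrief with a multimodal model.
# Assumes the google-generativeai SDK; model name, rubric, and file path
# are illustrative, not the team's actual configuration.
import time
import google.generativeai as genai

genai.configure(api_key="YOUR_API_KEY")

# Upload the simulation debrief recording and wait for processing.
video = genai.upload_file(path="debrief_session.mp4")
while video.state.name == "PROCESSING":
    time.sleep(5)
    video = genai.get_file(video.name)

# The prompt mirrors a human rater: score each rubric element and justify it.
prompt = (
    "You are evaluating the faculty member leading this simulation debrief. "
    "Using a standardized debriefing evaluation framework, rate each element "
    "of the debrief on its scale and give a brief rationale for every rating. "
    'Return JSON: [{"element": ..., "score": ..., "rationale": ...}].'
)

model = genai.GenerativeModel("gemini-1.5-pro")
response = model.generate_content([video, prompt])
print(response.text)  # JSON-like list of element scores with rationales
```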

“As a student, it's important you get that face-to-face kind of feedback,” Vyas said. “That is still happening here [in our study] because that is really important, but this way to ‘evaluate the evaluator,’ that's just a way to kind of try to make that process better and try to standardize it more.”

“This is about standardizing the process and making it more efficient,” Xie said. “There’s always going to be some subjectivity, but this helps create a more consistent baseline.”

Early results were promising. AI-generated evaluations showed a high level of agreement with those of human expert reviewers, suggesting the tool could serve as a reliable supplement to existing methods and could reduce the time required to evaluate debriefers and give them feedback. For students, that added layer of consistency matters.

“It’s reassuring to know there’s a system in place to make sure we’re all getting the same quality of feedback,” Xie said. “You want to make sure that the debriefing conversation that's going on is more objective rather than subjective. Each individual [debriefer] will have their personal bias or may lean towards one thing. So I think having an objective criterion about how the debriefing should go is really important in making sure that students are getting an overall quality debriefing and not having it vary from faculty member to faculty member.”

Beyond JABSOM, the team sees the project as a way to help make high-quality feedback more accessible in medical training programs, regardless of resources.

“One of the exciting parts of this is how scalable it could be,” Ishikawa said. “You’re taking something that requires a lot of time and expertise and finding a way to support it more efficiently.”

That combination of innovation and real-world application helped the project stand out at the UH x Google symposium, where the JABSOM team of Xie, Vyas, and Ishikawa was the only one representing the medical school in a strong field spanning multiple disciplines.

“I feel like it's an incredible feeling because not only are we representing JABSOM, but I think it just shows how important the work that we've done is to other people and how it can potentially spark or advance further research into the topic,” Xie said.

The project allowed Ishikawa to step outside the traditional statistical assistance he provides JABSOM researchers.

“I’m using a lot of Google's API, and someone would argue that it's a little bit more of a programmatic project because the stats happen after the data is retrieved,” he said. “In this case, a lot of the work is put into uploading it to Google and then getting the responses and organizing the responses. So this project is sort of opening our minds to the services that we can provide.”
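As a rough illustration of the “getting the responses and organizing the responses” step Ishikawa describes, the hypothetical snippet below parses model output into a table and compares it with human ratings. The sample data, column names, and choice of a weighted kappa statistic are assumptions made for the sketch, not the team’s actual analysis.

```python
# Hypothetical sketch of organizing model output for analysis.
# Sample data, column names, and the agreement statistic are illustrative.
import json
import pandas as pd
from sklearn.metrics import cohen_kappa_score

# Example model output (JSON) and matching human ratings for the same debrief.
model_json = """[
  {"element": "learning environment", "score": 6, "rationale": "..."},
  {"element": "engaged learners", "score": 5, "rationale": "..."}
]"""
ai_scores = pd.DataFrame(json.loads(model_json))
human_scores = pd.DataFrame({
    "element": ["learning environment", "engaged learners"],
    "score": [6, 4],
})

# Merge AI and human ratings by rubric element, then estimate agreement.
merged = ai_scores.merge(human_scores, on="element", suffixes=("_ai", "_human"))
kappa = cohen_kappa_score(merged["score_ai"], merged["score_human"],
                          weights="quadratic")
print(merged[["element", "score_ai", "score_human"]])
print(f"Weighted kappa: {kappa:.2f}")
```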

If you’d like to learn more about the research, Xie and Vyas will be presenting at the upcoming Biomedical Sciences Symposium on April 24 and 25.

Faculty or students seeking Biostatistics Core support can learn more here.