Google's AMIE AI Tried Taking Patient Histories Before Real Doctor Visits

Google Research and Beth Israel Deaconess Medical Center (BIDMC) just published the results of a study that actually put AMIE, Google’s conversational medical AI, in front of real patients before their primary care appointments. This isn’t another demo with actors playing patients. It was a prospective, IRB-approved feasibility study at an academic medical center.

Let me be clear about what this is and isn’t. AMIE didn’t diagnose anyone. It didn’t replace doctors. What it did was handle the pre-visit clinical history taking — the part where a patient describes their symptoms, medical history, and reasons for coming in. Think of it as an AI-powered intake process that generates a structured summary for the physician before they even walk into the exam room.

How the study actually worked

The setup is worth understanding because it’s more grounded than most AI-in-medicine experiments I’ve seen. Patients booked for new, non-emergency episodic complaints — both in-person and telehealth — were invited to participate during the booking process. They had time to review the protocols and were explicitly told opting out wouldn’t affect their care.

Participants interacted with AMIE through a secure web link before their appointment, whether that visit was in person or by telehealth. Here’s the key safety mechanism: a physician supervised the entire AI-patient chat via a live video call with screen sharing. This “AI supervisor” had a predefined set of safety criteria and was trained to intervene if anything went sideways. That’s not just CYA; it’s how you responsibly test an experimental system with real people.

The AI generated a transcript and summary of the interaction, which, with patient consent, went to the clinician before the appointment. The supervising physician model mirrors how medical trainees operate — they talk to patients under supervision, get feedback, and gradually earn autonomy.
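If it helps to see the flow as a sequence of gates, here is a minimal sketch of the workflow as I read it from the description above. To be clear, this is my own illustration, not anything from Google or BIDMC: the class, the stage names, and the method names are all hypothetical, and the sample patient line is made up.

```python
from dataclasses import dataclass, field
from enum import Enum, auto
from typing import List, Optional


class Stage(Enum):
    # Hypothetical stage names; the study does not define these.
    INVITED = auto()
    ENROLLED = auto()
    CHATTING = auto()
    STOPPED_BY_SUPERVISOR = auto()
    CHAT_COMPLETE = auto()
    SHARED_WITH_CLINICIAN = auto()


@dataclass
class PreVisitSession:
    """Toy model of one patient's pre-visit session (illustrative only)."""
    stage: Stage = Stage.INVITED
    transcript: List[str] = field(default_factory=list)
    summary: Optional[str] = None

    def _require(self, expected: Stage) -> None:
        if self.stage is not expected:
            raise RuntimeError(f"expected {expected.name}, got {self.stage.name}")

    def enroll(self) -> None:
        # Participation is opt-in; declining does not affect care.
        self._require(Stage.INVITED)
        self.stage = Stage.ENROLLED

    def start_chat(self, supervisor_present: bool) -> None:
        # The chat only runs while a physician is watching live.
        self._require(Stage.ENROLLED)
        if not supervisor_present:
            raise RuntimeError("chat may not start without the supervising physician")
        self.stage = Stage.CHATTING

    def add_turn(self, text: str, safety_concern: bool = False) -> None:
        # The supervisor can halt the chat if a safety criterion is met.
        self._require(Stage.CHATTING)
        self.transcript.append(text)
        if safety_concern:
            self.stage = Stage.STOPPED_BY_SUPERVISOR

    def finish(self, summary: str) -> None:
        self._require(Stage.CHATTING)
        self.summary = summary
        self.stage = Stage.CHAT_COMPLETE

    def share_with_clinician(self, patient_consents: bool) -> bool:
        # The transcript and summary go to the clinician only with consent.
        self._require(Stage.CHAT_COMPLETE)
        if patient_consents:
            self.stage = Stage.SHARED_WITH_CLINICIAN
        return patient_consents


session = PreVisitSession()
session.enroll()
session.start_chat(supervisor_present=True)
session.add_turn("Patient: I have had a cough for two weeks.")  # made-up example
session.finish(summary="Two-week cough.")
shared = session.share_with_clinician(patient_consents=True)
```

The point of modeling it this way is that every step has a precondition: no chat without enrollment and a supervisor, no summary without a completed chat, and nothing reaches the clinician without the patient’s consent.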

What I find interesting about this approach

Google has been talking about AMIE since early 2024, when they showed it outperforming primary care physicians in simulated diagnostic conversations. That was impressive but also, let’s be honest, a bit suspect. Simulated patients aren’t real patients. They don’t have complicated histories, conflicting symptoms, or the emotional weight of actual illness.

This study addresses that gap directly. Moving from synthetic scenarios to real-world clinical workflow is the hardest step in medical AI deployment. The regulatory, ethical, and practical hurdles are enormous. I’ve seen too many promising AI tools die on that bridge.

What Google and BIDMC did here is smart: they didn’t try to replace the doctor-patient interaction. They augmented the pre-visit process — a relatively low-risk, high-value task that’s often rushed or incomplete in busy clinics. Every primary care physician I’ve talked to complains about not having enough time to take thorough histories. If AMIE can reliably handle that, it’s genuinely useful.

The limitations you need to know about

The paper is honest about the study’s constraints. It’s single-arm, single-center, and the sample size isn’t huge. There’s no control group comparing AMIE-assisted visits against standard care. The AI supervisor model, while necessary for safety, means we don’t know how AMIE performs without a safety net.

More importantly, this study measures feasibility, not efficacy. It asks “can we do this safely?” not “does this improve outcomes?” That’s appropriate for a first step, but we need to be clear-eyed about what the data actually shows.

I also wonder about the patient population. People who agree to participate in an AI study might be more tech-savvy or open-minded than the average clinic patient. Selection bias is real, and it matters when you’re evaluating a tool that might eventually serve a diverse, sometimes skeptical patient base.

Where this fits in the bigger picture

This is a necessary step, but it’s early. The evidence roadmap for medical AI requires multiple phases: synthetic testing, feasibility studies, efficacy trials, and finally implementation research. Google just checked the feasibility box.

The fact that they partnered with BIDMC rather than their own health division is notable. Academic medical centers have the infrastructure and regulatory experience for this kind of clinical research. Google doesn’t want to become a hospital — they want their tools used in hospitals.

What I’d really like to see next is a randomized controlled trial comparing AMIE-assisted visits against standard care, measuring things like diagnostic accuracy, clinician satisfaction, and patient outcomes. That’s the kind of evidence that moves the needle with skeptical healthcare systems.

For now, this study tells us that a conversational medical AI can take pre-visit histories safely in a real clinical setting with proper oversight. That’s not nothing; many AI systems never make it this far. But the distance between “feasible under supervision” and “broadly deployed” is measured in years and millions of dollars of additional research.

I’ll be watching where Google takes this next. If they’re serious about clinical translation, they’ll need to replicate these results across multiple sites, with diverse populations, and eventually without the safety net. That’s the hard part, and we’re not there yet.
