AI Media

When Quantum Meets AI: IBM’s Quantum Computer Made an Existing AI Model Smarter

May 26, 2026 · by Moe · 8 min read

So what happened here?

A tiny quantum add-on just unlocked answers a state-of-the-art AI couldn’t get right. Here’s what that means and how you can access quantum hardware yourself today.

For years, the recipe for making an AI model more capable has been straightforward, if expensive: add more data, run longer training cycles, or scale up the number of parameters inside the model. GPT-5.5, for example, is estimated to carry somewhere between 2 and 5 trillion parameters, a number that demands enormous infrastructure just to run, let alone improve.

But a research team at Multiverse Computing just published results that challenge that assumption in a quietly dramatic way. By attaching a tiny quantum computing component to Meta’s Llama 3.1 8B language model and running the hybrid system on IBM’s quantum hardware, they achieved something notable: the model began answering questions correctly that it had previously gotten wrong, while adding only a negligible number of new parameters.

This is the first time “quantum enhancement” has been demonstrated on a production-scale, widely deployed large language model running on real superconducting quantum hardware. It’s still early days, but the implications are worth understanding.

When Quantum Meets AI

The standard approach to improving an AI model’s accuracy is to reduce its perplexity, a measure of how “confused” the model is when predicting the next word in a sequence. Lower perplexity means the model produces more coherent, accurate outputs. Higher perplexity is associated with erratic or incorrect responses.
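To make the metric concrete, here is a minimal sketch of how perplexity is computed from the probabilities a model assigns to each correct next token. The probability lists are made-up illustrative values, not figures from the study.

```python
import math

def perplexity(token_probs):
    """Perplexity = exp of the average negative log-probability per token."""
    avg_nll = -sum(math.log(p) for p in token_probs) / len(token_probs)
    return math.exp(avg_nll)

# Hypothetical probabilities assigned to the correct next token at each step.
confident = [0.9, 0.8, 0.95, 0.85]
unsure = [0.3, 0.2, 0.25, 0.4]

print(perplexity(confident))  # close to 1: the model is rarely surprised
print(perplexity(unsure))     # noticeably higher: the model is more "confused"
```

A model that always assigned probability 0.25 to the right token would have a perplexity of 4, as if it were guessing among four equally likely options.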

Normally, reducing perplexity requires significant resources: fine-tuning on new data, longer training runs, or simply making the model bigger by adding more parameters.

The Multiverse Computing team took a different path. Rather than retraining Meta’s Llama 3.1 8B (an 8 billion parameter model), they froze all of its original parameters, leaving the base model completely untouched, and inserted a small set of specially designed mathematical components called Cayley-parameterized unitary adapters (CUAs) into one of the model’s layers.

These adapters were trained on a classical computer first. Then the combined system, the original frozen model plus the new CUA components, was executed on IBM’s 156-qubit quantum processor during inference (the moment when the model generates a response).
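The name hints at the underlying math: a Cayley transform turns a handful of free parameters into a matrix that is guaranteed to be unitary, which is the kind of operation a quantum circuit can implement. The sketch below is an illustration of that general idea, not the paper’s implementation. It assumes a real-valued (orthogonal) special case with a tiny 3x3 matrix, and the helper names and parameter values are hypothetical.

```python
def skew_from_params(params, n):
    """Build an n x n real skew-symmetric matrix (A^T = -A) from n*(n-1)/2 parameters."""
    A = [[0.0] * n for _ in range(n)]
    k = 0
    for i in range(n):
        for j in range(i + 1, n):
            A[i][j] = params[k]
            A[j][i] = -params[k]
            k += 1
    return A

def solve(M, B):
    """Solve M X = B by Gauss-Jordan elimination with partial pivoting."""
    n = len(M)
    aug = [M[i][:] + B[i][:] for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(n):
            if r != col:
                f = aug[r][col]
                aug[r] = [vr - f * vc for vr, vc in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]

def cayley(A):
    """Cayley transform U = (I + A)^-1 (I - A); orthogonal when A is skew-symmetric."""
    n = len(A)
    i_plus = [[(1.0 if i == j else 0.0) + A[i][j] for j in range(n)] for i in range(n)]
    i_minus = [[(1.0 if i == j else 0.0) - A[i][j] for j in range(n)] for i in range(n)]
    return solve(i_plus, i_minus)

def apply(U, v):
    """Matrix-vector product U v."""
    return [sum(u * x for u, x in zip(row, v)) for row in U]

def norm(v):
    return sum(x * x for x in v) ** 0.5

# Three hypothetical trainable parameters define a 3 x 3 transformation.
U = cayley(skew_from_params([0.3, -1.2, 0.7], 3))
hidden = [1.0, 2.0, -0.5]  # stand-in for a slice of a hidden-state vector
print(norm(hidden), norm(apply(U, hidden)))  # lengths match: U only rotates the vector
```

The appeal of this construction is that training can update the free parameters however it likes and the result always stays a valid unitary-style transformation, so there is nothing to re-normalize afterwards.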

The result? A 1.4% reduction in perplexity, achieved by adding just 6,000 new parameters to a model that already has 8 billion. That’s a 0.000075% increase in parameter count for a meaningful jump in output quality.
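The percentage is easy to verify. This short check uses only the two figures quoted above:

```python
base_params = 8_000_000_000   # Llama 3.1 8B, as stated above
adapter_params = 6_000        # parameters added by the adapters, as stated above

increase_pct = adapter_params / base_params * 100
print(f"{increase_pct:.6f}%")          # 0.000075%
print(base_params // adapter_params)   # base parameters per added parameter
```

Put differently, the adapters add roughly one new parameter for every 1.3 million already in the model.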

From Abstract Metric to Real Answers

Perplexity might sound like an academic measurement, but the improvement showed up in concrete, practical ways.

When tested on a multiple-choice astronomy question about which planets have rings, the base Llama model incorrectly indicated that only Saturn does. The quantum-enhanced version correctly identified all Jovian planets as ringed.

On a biology question about the population-genetic consequences of gene flow, the base model selected “Hardy–Weinberg disruption,” which is wrong. The CUA-enhanced model correctly identified increased genetic homogeneity as the answer.

These aren’t trivial differences. They demonstrate that the quantum layer is nudging the model’s probability distributions in ways that surface more accurate knowledge that was already latent in the model’s weights.
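A toy illustration of what “nudging” a probability distribution can look like on a multiple-choice question: when two answers are nearly tied, a small shift in the model’s raw scores (logits) is enough to change which one wins. All numbers here are invented for illustration and are not taken from the study.

```python
import math

def softmax(logits):
    """Convert raw scores into probabilities that sum to 1."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

choices = ["A", "B", "C", "D"]
base_logits = [2.0, 2.1, 0.5, 0.1]     # hypothetical: "B" narrowly beats "A"
adjustment = [0.15, -0.05, 0.0, 0.0]   # hypothetical small nudge from an adapter
adjusted_logits = [b + a for b, a in zip(base_logits, adjustment)]

def top_choice(logits):
    """Return the answer choice with the highest probability."""
    probs = softmax(logits)
    return choices[probs.index(max(probs))]

print(top_choice(base_logits))      # B
print(top_choice(adjusted_logits))  # A
```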

Why This Approach Is Different

In classical AI development, you generally can’t get something for (almost) nothing. Better performance costs compute, memory, and time.

The quantum approach described here exploits a fundamentally different kind of computation. Quantum circuits can represent and manipulate mathematical relationships (specifically, unitary transformations in high-dimensional spaces) in ways that are extraordinarily expensive or intractable to replicate classically at scale. The CUAs act as a kind of mathematical lens, reweighting the model’s outputs using quantum-native operations.

The key trade-off is noise. Quantum hardware is incredibly sensitive: interference can come from nearby qubits, Earth’s magnetic field, ambient Wi-Fi signals, and even cosmic radiation. This is precisely why the adapters are intentionally kept small: larger quantum circuits expose computations to more noise, which degrades the results. Managing this noise is currently the central engineering challenge in the field.
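A back-of-the-envelope model shows why circuit size matters so much. It assumes every gate fails independently with the same probability, which is a simplification, and the 0.5% error rate is a placeholder rather than a measured figure for any IBM device.

```python
def estimated_fidelity(num_gates, gate_error):
    """Chance that no gate fails, assuming independent errors per gate."""
    return (1.0 - gate_error) ** num_gates

GATE_ERROR = 0.005  # hypothetical 0.5% error per gate, for illustration only

for num_gates in (10, 100, 1000):
    print(num_gates, round(estimated_fidelity(num_gates, GATE_ERROR), 3))
```

Under this simple model the chance of an error-free run shrinks exponentially with the number of gates, so a circuit ten times larger is far more than ten times less reliable.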

The researchers are candid that the magnitude of the perplexity improvement will grow as quantum hardware matures: more qubits, better error correction, and higher gate fidelity will all compound the effect. But the significance of the current result, as they note in the paper, is simply that the improvement exists at all.

What Comes Next

The team’s roadmap points toward encoding the entire quantum circuit, not just the adapter layer, directly onto quantum hardware. This would theoretically allow an LLM to achieve performance levels currently requiring vastly larger classical models, but with far fewer parameters and a much smaller infrastructure footprint.

Ultimately, the goal is reaching quantum advantage: the point at which a quantum-based system can achieve results that no classical computer could replicate in any practical timeframe.

We’re not there yet. But this research lays a credible first stone on that path.

So… Can You Actually Access a Quantum Computer Right Now?

Yes, and more easily than most people realize. You don’t need a lab coat or a government budget. Here’s a practical overview of where quantum hardware is accessible today:

IBM Quantum Platform

quantum.ibm.com | Free tier available

IBM operates the largest publicly accessible fleet of quantum systems in the world, with over 240,000 registered users. Their Open Plan offers free cloud access to a selection of IBM quantum systems with monthly usage quotas suitable for learning, prototyping, and small experiments. The same 156-qubit IBM Heron processor used in the Multiverse Computing study is part of this network.

Pay-as-you-go and premium plans unlock priority access and larger system time for production workloads. IBM’s Qiskit is the standard SDK for writing and submitting quantum circuits.

Best for: Researchers, developers, students, and anyone wanting to reproduce or build on the kind of work described in this article.

Amazon Braket (AWS)

aws.amazon.com/braket | Pay-per-use

Amazon Braket gives you access to multiple quantum hardware providers under one roof, including trapped-ion processors from IonQ, superconducting processors from Rigetti, neutral-atom systems from QuEra, and more. Pricing is usage-based with no minimums, making it accessible for experimentation. For more intensive use, Braket Direct offers reserved device access and direct engagement with quantum computing specialists.

Best for: Teams already operating in the AWS ecosystem who want to compare hardware modalities or run hybrid quantum-classical workflows.

Microsoft Azure Quantum

azure.microsoft.com/products/quantum | Pay-per-use

Azure Quantum aggregates hardware from IonQ, Quantinuum, Rigetti, Pasqal, and others, alongside Microsoft’s own topological qubit research track. It integrates natively into the Azure cloud and supports Q# as well as Qiskit and Cirq. For teams building enterprise applications, Azure Quantum’s managed infrastructure and compliance tooling are a genuine advantage.

Best for: Enterprise teams and organizations already invested in the Microsoft stack.

Google Cloud Quantum

cloud.google.com/solutions/quantum-computing | Research access + Pasqal marketplace

Google’s Sycamore and Willow processors remain largely reserved for Google’s own research, but Google Cloud offers access to Pasqal’s neutral-atom hardware through its marketplace. Google’s Willow chip achieved remarkable error correction benchmarks in late 2024, and the roadmap targets a cryptographically relevant quantum computer in the coming years.

Best for: Researchers focused on neutral-atom computing or working with Google’s TensorFlow Quantum framework.

D-Wave Leap

cloud.dwavesys.com | Free developer access available

D-Wave takes a different architectural approach: quantum annealing rather than gate-based quantum circuits. This makes it particularly suited to optimization problems (logistics, scheduling, resource allocation) rather than general-purpose quantum algorithms. The Leap service offers free developer access with real-time QPU response times and an impressive 99.9% uptime claim.

Best for: Optimization-focused use cases in business and operations research.
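Problems for annealers are commonly written as a QUBO (quadratic unconstrained binary optimization): pick 0/1 values that minimize a weighted sum of single variables and pairs. The sketch below solves a tiny, made-up QUBO by brute force on a classical machine just to show the shape of the problem. It does not call D-Wave’s service, and the weights are hypothetical.

```python
from itertools import product

def qubo_energy(Q, x):
    """Energy of binary assignment x, where Q maps index pairs (i, j) to weights."""
    return sum(w * x[i] * x[j] for (i, j), w in Q.items())

def brute_force_minimum(Q, num_vars):
    """Try every 0/1 assignment and return the lowest-energy one (small problems only)."""
    return min(product((0, 1), repeat=num_vars), key=lambda x: qubo_energy(Q, x))

# Hypothetical toy problem: choosing either option is rewarded (-1),
# but choosing both at once is penalized (+2).
Q = {(0, 0): -1.0, (1, 1): -1.0, (0, 1): 2.0}

best = brute_force_minimum(Q, 2)
print(best, qubo_energy(Q, best))
```

Brute force doubles in cost with every added variable, which is exactly the scaling that annealing hardware aims to sidestep on larger instances.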

Multi-Vendor Platforms (qBraid, Strangeworks, Classiq)

For those who want to experiment across multiple hardware vendors without managing separate accounts, platforms like qBraid, Strangeworks, and Classiq aggregate access to IBM, IonQ, Rigetti, D-Wave, Quantinuum, and others under a single interface. These are particularly useful for benchmarking the same circuit across different hardware architectures.

The Takeaway

The Multiverse Computing result matters not because it transforms AI overnight, but because it demonstrates a new direction: instead of scaling classical infrastructure ever upward, quantum components can be slotted into existing AI systems to unlock improvements that would otherwise be computationally out of reach.

We are, in all likelihood, at the very beginning of quantum-enhanced AI. The hardware is noisy, the circuits are small, and the improvements are incremental. But the proof of concept now exists on real production hardware, and access to quantum computing has never been more open.

If you’ve been curious about quantum computing, there’s never been a better time to start experimenting. The hardware is a cloud call away.

References: Aizpurua et al., arXiv:2605.05914 (May 2026); IBM Quantum Platform; AWS Braket; Multiverse Computing.
https://www.livescience.com/technology/quantum/scientists-trained-an-ai-model-using-an-ibm-quantum-computer-and-it-answered-questions-correctly-that-the-base-model-couldnt
