A PennyLane plugin for the Maestro quantum simulator by Qoro Quantum.
Drop-in replacement for default.qubit: a one-line change, same code, faster results.
# Before
dev = qp.device("default.qubit", wires=20)
# After — up to 20× faster
dev = qp.device("maestro.qubit", wires=20)

Why Maestro?
- Iterate Faster — Up to 20× faster statevector simulations for VQE/QAOA loops.
- Scale Up — Simulate 1000+ qubits using Maestro's optimized MPS backend.
- Sample from MPS — The only PennyLane MPS backend that supports shot-based sampling.
- GPU Ready — 64× faster than PennyLane GPU on MPS workloads via cuQuantum.
pip install pennylane-maestro

That's it. This installs pennylane (≥0.38) and qoro-maestro (≥0.2.8) automatically.
import pennylane as qp
import numpy as np
dev = qp.device("maestro.qubit", wires=2)
@qp.qnode(dev)
def circuit(theta):
    qp.RX(theta, wires=0)
    qp.CNOT(wires=[0, 1])
    return qp.expval(qp.PauliZ(1))

print(circuit(np.pi / 4))

Benchmarked on PennyLane 0.44.1. Run examples/benchmark_lightning_vs_maestro.py to reproduce.
| Qubits | default.qubit | lightning.qubit | maestro.qubit | Speedup vs default.qubit | Speedup vs lightning.qubit |
|---|---|---|---|---|---|
| 20 | 977 ms | 115 ms | 45 ms | 22× | 2.6× |
| 22 | 4.31 s | 543 ms | 184 ms | 23× | 3.0× |
| 24 | 10.5 s | 2.36 s | 820 ms | 13× | 2.9× |
| 26 | DNF | 10.1 s | 3.56 s | ∞ | 2.8× |
| Qubits | default.qubit | default.tensor (quimb) | maestro.qubit (MPS) | Speedup vs quimb |
|---|---|---|---|---|
| 100 | OOM | 90 ms | 11 ms | 8× |
| 500 | OOM | 689 ms | 90 ms | 7.7× |
| 1000 | OOM | 1.96 s | 207 ms | 9.5× |
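The OOM entries follow from how dense statevector memory grows with qubit count. The sketch below is a back-of-the-envelope estimate, not a description of Maestro's internals. It assumes 16 bytes per amplitude (complex128) and treats the MPS figure as an upper bound in which every bond sits at the maximum dimension. The function names are illustrative, and the bond dimension of 256 is borrowed from the configuration example rather than from the benchmark itself.

```python
BYTES_PER_AMPLITUDE = 16  # assumption: complex128, i.e. two 64-bit floats


def statevector_bytes(n_qubits: int) -> int:
    """Memory for a dense statevector, which stores 2**n amplitudes."""
    return (2 ** n_qubits) * BYTES_PER_AMPLITUDE


def mps_bytes_upper_bound(n_qubits: int, bond_dim: int) -> int:
    """Upper bound for an MPS: one (bond_dim x 2 x bond_dim) tensor per qubit."""
    return n_qubits * 2 * bond_dim * bond_dim * BYTES_PER_AMPLITUDE


# Dense storage doubles with every added qubit.
for n in (20, 26, 30, 100):
    print(f"{n:>4} qubits: {statevector_bytes(n):.3e} bytes dense")

# MPS storage grows linearly in the qubit count at a fixed bond dimension.
for n in (100, 500, 1000):
    mib = mps_bytes_upper_bound(n, bond_dim=256) / 2 ** 20
    print(f"{n:>4} qubits: at most {mib:.0f} MiB as an MPS (bond dimension 256)")
```

Under these assumptions a 26-qubit statevector already needs 1 GiB, while a 100-qubit dense state is far beyond any machine. The MPS bound stays in the hundreds of MiB because it scales with the qubit count times the square of the bond dimension.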
🔥 Only Maestro supports this. Neither default.tensor nor Qiskit Aer MPS offers shot-based sampling through PennyLane.
| Qubits | maestro.qubit (MPS) | Unique Samples |
|---|---|---|
| 100 | 259 ms | 10,000 |
| 500 | 1.22 s | 10,000 |
| 1000 | 2.43 s | 10,000 |
All backends are selected via keyword arguments — no code changes needed:
# CPU Statevector (default)
dev = qp.device("maestro.qubit", wires=20)
# MPS for 100+ qubits
dev = qp.device("maestro.qubit", wires=100,
                simulation_type="MatrixProductState",
                max_bond_dimension=256)
# Stabilizer for Clifford circuits
dev = qp.device("maestro.qubit", wires=1000,
                simulation_type="Stabilizer")
# Finite shots
dev = qp.device("maestro.qubit", wires=10, shots=10_000)
# GPU (requires separate license — see below)
dev = qp.device("maestro.qubit", wires=28, simulator_type="Gpu")

All available options
| simulator_type | Description |
|---|---|
| "QCSim" | Qoro's optimized CPU simulator (default) |
| "Gpu" | CUDA-accelerated GPU simulator |
| "CompositeQCSim" | p-block simulation |
| simulation_type | Description |
|---|---|
| "Statevector" | Full statevector (default) |
| "MatrixProductState" | MPS / tensor-train for large qubit counts |
| "Stabilizer" | Clifford-only stabilizer |
| "TensorNetwork" | General tensor network |
| "PauliPropagator" | Pauli propagation |
| "ExtendedStabilizer" | Extended stabilizer |
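Because simulator_type and simulation_type differ by only a few letters, it is easy to pass a value to the wrong keyword. The helper below is a hypothetical convenience, not part of pennylane-maestro. It checks both values against the tables above before the keyword arguments reach qp.device. It does not check whether a given combination is supported (for example, a GPU simulator with a particular simulation type), since the tables do not say.

```python
# Option names copied from the tables above.
SIMULATOR_TYPES = {"QCSim", "Gpu", "CompositeQCSim"}
SIMULATION_TYPES = {
    "Statevector",
    "MatrixProductState",
    "Stabilizer",
    "TensorNetwork",
    "PauliPropagator",
    "ExtendedStabilizer",
}


def maestro_kwargs(simulator_type="QCSim", simulation_type="Statevector", **extra):
    """Hypothetical helper: validate option names and return device kwargs."""
    if simulator_type not in SIMULATOR_TYPES:
        raise ValueError(
            f"unknown simulator_type {simulator_type!r}; "
            f"expected one of {sorted(SIMULATOR_TYPES)}"
        )
    if simulation_type not in SIMULATION_TYPES:
        raise ValueError(
            f"unknown simulation_type {simulation_type!r}; "
            f"expected one of {sorted(SIMULATION_TYPES)}"
        )
    # Extra options such as max_bond_dimension are passed through unchanged.
    return {"simulator_type": simulator_type, "simulation_type": simulation_type, **extra}


kwargs = maestro_kwargs(simulation_type="MatrixProductState", max_bond_dimension=256)
print(kwargs)

# Intended use (requires pennylane-maestro to be installed):
#   dev = qp.device("maestro.qubit", wires=100, **kwargs)
```

Passing "Gpu" as a simulation_type, a likely slip, raises a ValueError that lists the accepted names.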
Hamiltonians are evaluated natively via Maestro's batched estimate() — all Pauli terms in a single C++ call:
import pennylane as qp
from pennylane import numpy as np  # autograd-aware NumPy, so qp.grad treats params as trainable
n_qubits = 20
H = qp.Hamiltonian(
    [0.5] * (n_qubits - 1) + [0.3] * n_qubits,
    [qp.PauliZ(i) @ qp.PauliZ(i + 1) for i in range(n_qubits - 1)] +
    [qp.PauliX(i) for i in range(n_qubits)]
)
dev = qp.device("maestro.qubit", wires=n_qubits)

@qp.qnode(dev, diff_method="parameter-shift")
def vqe_circuit(params):
    for i in range(n_qubits):
        qp.RY(params[i], wires=i)
    for i in range(n_qubits - 1):
        qp.CNOT(wires=[i, i + 1])
    return qp.expval(H)

params = np.random.random(n_qubits)
energy = vqe_circuit(params)
gradient = qp.grad(vqe_circuit)(params)

See examples/ising_phase_transition.py for a full demo that simulates a 100-qubit transverse-field Ising model and uses Maestro's exclusive MPS shot sampling to extract magnetization distributions and spatial correlations. Runs in ~30 seconds.
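The VQE example requests diff_method="parameter-shift". For a gate generated by a Pauli operator, such as RY, that rule recovers the exact derivative from two shifted evaluations of the same circuit: df/dθ = [f(θ + s) − f(θ − s)] / (2 sin s), usually with s = π/2. The sketch below illustrates the rule on a one-qubit toy case, RY(θ)|0⟩ measured in Z, using only the standard library. It is a stand-in for intuition and does not call PennyLane or Maestro.

```python
import math


def expval_z_after_ry(theta: float) -> float:
    """<Z> for the state RY(theta)|0>, whose amplitudes are (cos(t/2), sin(t/2))."""
    a0 = math.cos(theta / 2)
    a1 = math.sin(theta / 2)
    return a0 * a0 - a1 * a1  # analytically equal to cos(theta)


def parameter_shift(f, theta: float, shift: float = math.pi / 2) -> float:
    """Two-term parameter-shift derivative of f at theta."""
    return (f(theta + shift) - f(theta - shift)) / (2 * math.sin(shift))


theta = 0.3
shifted = parameter_shift(expval_z_after_ry, theta)
exact = -math.sin(theta)  # d/dtheta of cos(theta)
print(f"parameter-shift: {shifted:.12f}")
print(f"analytic:        {exact:.12f}")
```

With the two-term rule, each trainable parameter costs two circuit evaluations, which is why fast expectation-value evaluation matters inside a VQE loop.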
PennyLane automatically decomposes any unsupported gate (e.g. Rot) into Maestro's native gate set. No manual intervention needed.
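As an illustration of what such a decomposition looks like, PennyLane defines Rot(φ, θ, ω) as the product RZ(ω) RY(θ) RZ(φ). The sketch below checks that identity numerically with plain 2×2 matrices. Whether RZ and RY belong to Maestro's native gate set, and which decomposition is actually applied, are assumptions here; the code only verifies the matrix identity.

```python
import cmath
import math


def rz(angle):
    """Matrix of RZ(angle)."""
    return [[cmath.exp(-0.5j * angle), 0], [0, cmath.exp(0.5j * angle)]]


def ry(angle):
    """Matrix of RY(angle)."""
    c, s = math.cos(angle / 2), math.sin(angle / 2)
    return [[c, -s], [s, c]]


def matmul(x, y):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(x[i][k] * y[k][j] for k in range(2)) for j in range(2)] for i in range(2)]


def rot(phi, theta, omega):
    """Closed-form matrix of Rot(phi, theta, omega)."""
    c, s = math.cos(theta / 2), math.sin(theta / 2)
    return [
        [cmath.exp(-0.5j * (phi + omega)) * c, -cmath.exp(0.5j * (phi - omega)) * s],
        [cmath.exp(-0.5j * (phi - omega)) * s, cmath.exp(0.5j * (phi + omega)) * c],
    ]


phi, theta, omega = 0.4, 1.1, -0.7
# Matrices act right to left: RZ(phi) is applied first, RZ(omega) last.
decomposed = matmul(rz(omega), matmul(ry(theta), rz(phi)))
target = rot(phi, theta, omega)
max_diff = max(abs(decomposed[i][j] - target[i][j]) for i in range(2) for j in range(2))
print(f"largest entry-wise difference: {max_diff:.2e}")
```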
For large-scale workloads, Maestro supports CUDA-accelerated simulation (statevector, MPS, tensor network) via NVIDIA cuQuantum.
dev = qp.device("maestro.qubit", wires=28, simulator_type="Gpu")

| Simulator | Relative Runtime |
|---|---|
| Maestro GPU | 1× |
| Maestro CPU | 5× |
| Qibo GPU | 7.5× |
| Qiskit CPU | 14× |
| PennyLane GPU | 64× |
Larger instances failed to run on some platforms, limiting comparison.
→ Request GPU access & free trial at maestro.qoroquantum.net
If you use pennylane-maestro in your research, please cite:
@misc{pennylane_maestro,
author = {{Qoro Quantum}},
title = {PennyLane-Maestro: High-Performance C++ Backend for PennyLane},
year = {2026},
publisher = {GitHub},
howpublished = {\url{https://github.com/QoroQuantum/pennylane-maestro}}
}

Maintained by Qoro Quantum (team@qoroquantum.de).
Licensed under GPL-3.0. See the LICENSE file for details.