QDP
Backend detection and selection for QDP.
Available backends:
- _qdp (Rust+CUDA) -- native extension, highest performance
- torch (PyTorch) -- reference implementation, must be explicitly selected
Auto-detection only activates the Rust backend. To use the PyTorch
reference backend, call force_backend(Backend.PYTORCH) or use
the .backend("pytorch") builder method on QdpBenchmark /
QuantumDataLoader.
Backend
class Backend(enum.Enum)
Available QDP encoding backends.
get_qdp
@lru_cache(maxsize=1)
def get_qdp() -> ModuleType | None
Return the _qdp Rust extension module, or None if unavailable.
get_torch
@lru_cache(maxsize=1)
def get_torch() -> ModuleType | None
Return the torch module, or None if unavailable.
get_backend
def get_backend() -> Backend
Return the active backend.
Only the Rust backend is auto-detected. The PyTorch reference
backend must be selected explicitly via :func:force_backend.
force_backend
def force_backend(backend: Backend | None) -> None
Override automatic backend detection.
Pass None to restore auto-detection. Primarily useful for
testing and benchmarking.
require_backend
def require_backend() -> Backend
Return the current backend or raise if none is available.
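The detection and override behaviour described above can be sketched roughly as follows. This is only an illustration of the pattern (a cached optional import plus a module-level override), not the package's actual code: the `Backend.RUST` member name, the `_forced` variable, returning `None` when nothing is detected, and the `RuntimeError` raised by `require_backend` are all assumptions.

```python
from __future__ import annotations

import enum
import importlib
from functools import lru_cache
from types import ModuleType


class Backend(enum.Enum):
    RUST = "rust"        # assumed member name
    PYTORCH = "pytorch"  # Backend.PYTORCH is named in the text above


_forced: Backend | None = None  # hypothetical override slot


@lru_cache(maxsize=1)
def get_qdp() -> ModuleType | None:
    """Try the native extension once; the result (even None) is cached."""
    try:
        return importlib.import_module("_qdp")
    except ImportError:
        return None


def force_backend(backend: Backend | None) -> None:
    """Set an explicit backend, or pass None to restore auto-detection."""
    global _forced
    _forced = backend


def get_backend() -> Backend | None:
    """An explicit override wins; otherwise only the Rust backend is detected."""
    if _forced is not None:
        return _forced
    return Backend.RUST if get_qdp() is not None else None


def require_backend() -> Backend:
    """Return the active backend, raising when none is usable."""
    backend = get_backend()
    if backend is None:
        raise RuntimeError("no QDP backend available")  # assumed exception type
    return backend
```

Note that PyTorch is never picked automatically in this sketch, mirroring the rule that the reference backend must be selected explicitly.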
Benchmark API: supports Rust-optimized pipeline and PyTorch reference backend.
Usage:
    from qumat_qdp import QdpBenchmark, ThroughputResult, LatencyResult

    # Rust backend (default):
    result = (QdpBenchmark(device_id=0).qubits(16).encoding("amplitude")
              .batches(100, size=64).warmup(2).run_throughput())

    # PyTorch reference (must be explicitly selected):
    result = (QdpBenchmark(device_id=0).backend("pytorch").qubits(16)
              .encoding("amplitude").batches(100, size=64).run_throughput())
ThroughputResult
@dataclass
class ThroughputResult()
Throughput benchmark measurement.
Returned by :meth:QdpBenchmark.run_throughput. duration_sec is the
measured timed section after any configured warmup batches. vectors_per_sec
is computed over total_batches * batch_size encoded input vectors.
LatencyResult
@dataclass
class LatencyResult()
Latency benchmark measurement.
Returned by :meth:QdpBenchmark.run_latency. duration_sec is the same
timed interval used for throughput, and latency_ms_per_vector is the
average milliseconds per encoded input vector across the measured batches.
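As a worked example of how the two result types relate, the sketch below derives both figures from one timed interval. The field names `duration_sec`, `vectors_per_sec`, and `latency_ms_per_vector` come from the descriptions above; the `ThroughputSketch`, `LatencySketch`, and `summarize` names are hypothetical and exist only for this illustration.

```python
from dataclasses import dataclass


@dataclass
class ThroughputSketch:
    duration_sec: float
    vectors_per_sec: float


@dataclass
class LatencySketch:
    duration_sec: float
    latency_ms_per_vector: float


def summarize(duration_sec: float, total_batches: int, batch_size: int):
    """Derive throughput and mean latency from the same timed interval."""
    vectors = total_batches * batch_size  # encoded input vectors in the timed section
    throughput = ThroughputSketch(duration_sec, vectors / duration_sec)
    latency = LatencySketch(duration_sec, duration_sec * 1000.0 / vectors)
    return throughput, latency
```

For 100 batches of 64 vectors timed at 2.0 seconds, this gives 3200 vectors per second and 0.3125 ms per vector.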
QdpBenchmark
class QdpBenchmark()
Builder for throughput/latency benchmarks.
Supports two backends:
- "rust" (default): Rust-optimized pipeline (no Python for-loop, GIL released).
- "pytorch": Pure PyTorch reference implementation (must be explicitly selected).
qubits
def qubits(n: int) -> QdpBenchmark
Set the number of qubits for benchmarked encodings.
Arguments:
n: Number of qubits in each encoded output state.
Returns:
This builder for fluent chaining.
encoding
def encoding(method: str) -> QdpBenchmark
Set the encoding method to benchmark.
Arguments:
method: Encoding name, for example "amplitude", "angle", "basis", "iqp", or "iqp-z".
Returns:
This builder for fluent chaining.
batches
def batches(total: int, size: int = 64) -> QdpBenchmark
Set the number and size of benchmark batches.
Arguments:
total: Number of timed batches to process.
size: Number of vectors in each batch.
Returns:
This builder for fluent chaining.
prefetch
def prefetch(n: int) -> QdpBenchmark
Accept a prefetch setting for fluent API compatibility.
The current Rust benchmark pipeline manages work internally and the
PyTorch reference path does not use a Python-side prefetch queue, so
n is intentionally ignored.
Arguments:
n: Requested prefetch depth; currently unused.
Returns:
self for fluent builder chaining.
warmup
def warmup(n: int) -> QdpBenchmark
Set the number of warmup batches run before timing.
Arguments:
n: Number of batches to execute before measurements begin.
Returns:
This builder for fluent chaining.
backend
def backend(name: str) -> QdpBenchmark
Select the benchmark execution backend.
"rust" (the default) uses the native optimized pipeline exposed by
the _qdp extension and raises at run time if that extension or entry
point is unavailable. "pytorch" uses the pure-PyTorch reference
implementation on the selected CUDA device when usable, otherwise CPU.
Arguments:
name: Backend name, either "rust" or "pytorch".
Raises:
ValueError: If name is not a supported backend.
Returns:
self for fluent builder chaining.
dtype
def dtype(dtype: str) -> QdpBenchmark
Set pipeline element dtype: 'f64' (default) or 'f32'.
'f32' activates the zero-copy float32 batch path where the encoding
supports it; encodings without an f32 kernel automatically fall back to
f64 inside the Rust pipeline.
run_throughput
def run_throughput() -> ThroughputResult
Run the configured throughput benchmark.
qubits() and batches() must be configured before calling this
method. The default "rust" backend calls the native _qdp
pipeline with any configured warmup batches; "pytorch" runs the
reference encoder loop and synchronizes CUDA timing when applicable.
Raises:
ValueError: If required benchmark parameters are missing.
RuntimeError: If the Rust backend is selected but unavailable.
Returns:
A :class:ThroughputResult containing elapsed seconds and
encoded vectors per second.
run_latency
def run_latency() -> LatencyResult
Run the configured latency benchmark.
qubits() and batches() must be configured before calling this
method. The Rust backend reports latency from the native pipeline; the
PyTorch backend derives average latency from its throughput run.
Raises:
ValueError: If required benchmark parameters are missing.
RuntimeError: If the Rust backend is selected but unavailable.
Returns:
A :class:LatencyResult containing elapsed seconds and mean
milliseconds per encoded vector.
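To make the builder contract concrete, here is a minimal sketch of a fluent benchmark builder in the same spirit. It is not the QdpBenchmark implementation: `MiniBenchmark` and the `encode_batch` callback are hypothetical, and the real Rust backend runs its loop natively rather than in Python. The sketch only shows the chaining, the required-parameter check, and warmup batches being excluded from the timed section.

```python
from __future__ import annotations

import time
from typing import Callable


class MiniBenchmark:
    """Hypothetical stand-in illustrating the fluent builder pattern."""

    def __init__(self) -> None:
        self._qubits: int | None = None
        self._total: int | None = None
        self._size = 64
        self._warmup = 0

    def qubits(self, n: int) -> MiniBenchmark:
        self._qubits = n
        return self  # returning self enables chaining

    def batches(self, total: int, size: int = 64) -> MiniBenchmark:
        self._total, self._size = total, size
        return self

    def warmup(self, n: int) -> MiniBenchmark:
        self._warmup = n
        return self

    def run_throughput(self, encode_batch: Callable[[int, int], object]):
        if self._qubits is None or self._total is None:
            raise ValueError("call qubits() and batches() before running")
        for _ in range(self._warmup):  # untimed warmup batches
            encode_batch(self._qubits, self._size)
        start = time.perf_counter()
        for _ in range(self._total):  # timed section
            encode_batch(self._qubits, self._size)
        duration = max(time.perf_counter() - start, 1e-12)  # guard against a zero interval
        return duration, (self._total * self._size) / duration
```

A call such as `MiniBenchmark().qubits(4).batches(5, size=8).warmup(2).run_throughput(fn)` invokes `fn` seven times but times only the last five.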
Unified Python facade for explicit QDP backend selection.
QdpEngine
class QdpEngine()
Select and delegate to a native QDP encoding backend.
QdpEngine is the small public Python facade used by callers that want
explicit backend selection. backend="cuda" routes to the Rust/CUDA
extension-backed engine. backend="amd" and backend="triton_amd"
route to the AMD/Triton implementation. The selected backend is exposed as
self.backend ("cuda" or "amd") and all encode() calls are
forwarded to that engine.
Arguments:
device_id: GPU device ordinal to use.
precision: Numeric precision requested from the backend, such as "float32" or "float64" when supported by that backend.
backend: Backend selector. Valid values are "cuda", "amd", and "triton_amd".
Raises:
ValueError: If backend is not one of the supported selectors.
__init__
def __init__(device_id: int = 0,
precision: str = "float32",
backend: str = "cuda") -> None
Create a backend-selecting QDP engine facade.
Arguments:
device_id: GPU device ordinal to use.
precision: Numeric precision requested from the backend.
backend: Backend selector, one of "cuda", "amd", or "triton_amd".
Raises:
ValueError: If backend is not supported.
encode
def encode(data: Any,
num_qubits: int,
encoding_method: str = "amplitude") -> Any
Encode input samples with the configured backend.
Arguments:
data: Input samples accepted by the selected backend.
num_qubits: Number of qubits in the output state vector.
encoding_method: Encoding strategy, such as "amplitude", "angle", "basis", "iqp", "iqp-z", or "phase" when supported by the backend.
Raises:
ValueError: If the backend does not support encoding_method.
Returns:
Backend-native encoded tensor or tensor-like result.
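The selector rules described for QdpEngine (three accepted selectors, two exposed backend names) can be sketched as a small lookup. The `resolve_backend` helper and `_SELECTORS` table are hypothetical names used only for illustration; the facade's internals may be organised differently.

```python
# Accepted selector -> value exposed as self.backend, per the description above.
_SELECTORS = {"cuda": "cuda", "amd": "amd", "triton_amd": "amd"}


def resolve_backend(selector: str) -> str:
    """Map a user-facing selector to the exposed backend name."""
    try:
        return _SELECTORS[selector]
    except KeyError:
        valid = ", ".join(sorted(_SELECTORS))
        raise ValueError(
            f"unsupported backend {selector!r}; expected one of: {valid}"
        ) from None
```

Both "amd" and "triton_amd" resolve to the same engine, so callers that inspect the exposed backend name only ever see "cuda" or "amd".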
Unified tensor facade for backend-native QDP results.
QdpTensor
@dataclass
class QdpTensor()
DLPack-compatible wrapper for backend-native QDP tensor results.
The Rust/CUDA path and other native backends may return objects whose
concrete tensor type is backend-specific. QdpTensor preserves that
object in value while exposing __dlpack__ and __dlpack_device__
so consumers such as PyTorch can import it without a copy.
Arguments:
value: Backend-native tensor-like object. It must implement the DLPack protocol when converted with to_torch() or torch.from_dlpack.
backend: Human-readable backend name used in error messages.
Raises:
RuntimeError: If value does not implement the required DLPack methods when conversion is attempted.
__dlpack__
def __dlpack__(stream: int | None = None) -> Any
Return a DLPack capsule for the wrapped backend tensor.
Arguments:
stream: Optional consumer stream to pass through to the wrapped tensor's __dlpack__ implementation.
Raises:
RuntimeError: If the wrapped value does not implement __dlpack__.
Returns:
A DLPack capsule representing value.
__dlpack_device__
def __dlpack_device__() -> Any
Return the DLPack device descriptor for the wrapped tensor.
Raises:
RuntimeError: If the wrapped value does not implement __dlpack_device__.
Returns:
The (device_type, device_id) tuple reported by value.
to_torch
def to_torch() -> Any
Convert the wrapped tensor to a PyTorch tensor via DLPack.
Returns:
A torch.Tensor sharing storage with the backend tensor
when the backend's DLPack producer supports zero-copy exchange.
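The delegation described for QdpTensor can be illustrated with a small wrapper that forwards the two DLPack protocol methods and raises `RuntimeError` when the wrapped object lacks them. `TensorWrapperSketch` and `FakeProducer` are hypothetical names; the fake producer returns a plain tuple instead of a real capsule so the example runs without a GPU or PyTorch.

```python
from dataclasses import dataclass
from typing import Any


@dataclass
class TensorWrapperSketch:
    value: Any               # backend-native tensor-like object
    backend: str = "unknown"  # used only in error messages

    def _require(self, name: str):
        method = getattr(self.value, name, None)
        if method is None:
            raise RuntimeError(
                f"{self.backend} result does not implement {name}"
            )
        return method

    def __dlpack__(self, stream=None):
        # Pass the consumer stream through to the wrapped producer.
        return self._require("__dlpack__")(stream=stream)

    def __dlpack_device__(self):
        return self._require("__dlpack_device__")()


class FakeProducer:
    """Stand-in producer; a real one would return a DLPack capsule."""

    def __dlpack__(self, stream=None):
        return ("capsule", stream)

    def __dlpack_device__(self):
        return (2, 0)  # (device_type, device_id); 2 is kDLCUDA in DLPack
```

Because the wrapper itself exposes the protocol, a consumer such as `torch.from_dlpack` can be handed the wrapper directly instead of the inner value.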
QuantumTensor
Backward-compatible alias for :class:QdpTensor.
Quantum Data Loader: Python builder for Rust-backed batch iterator.
Usage:
    import torch
    from qumat_qdp import QuantumDataLoader

    loader = (QuantumDataLoader(device_id=0).qubits(16).encoding("amplitude")
              .batches(100, size=64).source_synthetic())
    for qt in loader:
        batch = torch.from_dlpack(qt)
        ...
QuantumDataLoader
class QuantumDataLoader()
Builder for batched QDP encoding iterators.
QuantumDataLoader can generate synthetic input samples or read supported
file formats, then encode each batch with the selected backend. The default
"rust" backend returns Rust-backed QuantumTensor batches, while the
explicit "pytorch" backend returns torch.Tensor batches. The
"auto" backend tries the Rust extension first and falls back to PyTorch
when the native extension is unavailable.
__init__
def __init__(device_id: int = 0,
num_qubits: int = 16,
batch_size: int = 64,
total_batches: int = 100,
encoding_method: str = "amplitude",
seed: int | None = None) -> None
Create a loader builder with default synthetic batching settings.
Arguments:
device_id: GPU device ordinal used by native and PyTorch backends.
num_qubits: Number of qubits in each encoded output state.
batch_size: Number of samples per emitted batch.
total_batches: Maximum number of batches to emit.
encoding_method: Encoding method name.
seed: Optional synthetic data seed.
Raises:
ValueError: If any initial setting is invalid.
qubits
def qubits(n: int) -> QuantumDataLoader
Set the number of qubits used by subsequent encodings.
n must be a positive integer. The value controls the encoded state
size (for example, amplitude and phase-style encodings produce vectors
of length 2**n) and the expected input width for encodings such as
"angle" and "iqp-z".
Arguments:
n: Positive qubit count.
Raises:
ValueError: If n is not a positive integer.
Returns:
self for fluent builder chaining.
encoding
def encoding(method: str) -> QuantumDataLoader
Set the quantum feature encoding method.
Valid values are "amplitude", "angle", "basis",
"iqp", "iqp-z", and "phase". Use these canonical
lowercase names because the selected backend receives the string exactly
as supplied. The PyTorch reference backend supports the same methods as
Arguments:
method: Encoding method name.
Raises:
ValueError: If method is empty, not a string, or not a supported encoding.
Returns:
self for fluent builder chaining.
batches
def batches(total: int, size: int = 64) -> QuantumDataLoader
Set the number of batches to produce and samples per batch.
Both total and size must be positive integers. For synthetic
sources, total is the exact number of generated batches. For file
sources handled by the PyTorch fallback, iteration stops at the smaller
of total and the number of complete/partial batches available from
the loaded file.
Arguments:
total: Positive maximum number of batches to emit.
size: Positive number of samples per encoded batch.
Raises:
ValueError: If either argument is not a positive integer.
Returns:
self for fluent builder chaining.
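The batch-count rule above can be worked through with a short sketch. The `emitted_batches` helper is hypothetical, and it reads "complete/partial batches" as meaning a trailing partial batch still counts (a ceiling division), which is an interpretation rather than confirmed behaviour.

```python
from __future__ import annotations


def emitted_batches(total: int, size: int, num_samples: int | None = None) -> int:
    """Number of batches iteration would yield under the rule described above.

    num_samples=None models a synthetic source; an integer models a loaded file.
    """
    for name, value in (("total", total), ("size", size)):
        if not isinstance(value, int) or value <= 0:
            raise ValueError(f"{name} must be a positive integer")
    if num_samples is None:
        return total  # synthetic: exactly `total` batches
    available = -(-num_samples // size)  # ceiling division counts a partial batch
    return min(total, available)
```

With `size=64`, a file holding 130 samples yields at most three batches (64, 64, and 2 samples) no matter how large `total` is.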
source_synthetic
def source_synthetic(total_batches: int | None = None) -> QuantumDataLoader
Select the synthetic data source.
Synthetic data is the default when no file source is configured, but
calling this method records the source choice explicitly. Use
seed() to make generated samples reproducible where the selected
backend supports seeded generation. If total_batches is provided,
it overrides the current batch count and must be a positive integer.
Selecting both source_synthetic() and source_file() on the same
loader is rejected when iteration starts.
Arguments:
total_batches: Optional positive replacement for the configured number of batches.
Raises:
ValueError: If total_batches is provided but is not a positive integer.
Returns:
self for fluent builder chaining.
source_file
def source_file(path: str, streaming: bool = False) -> QuantumDataLoader
Use a file data source.
Non-streaming native loading accepts .parquet, .arrow,
.feather, .ipc, .npy, .pt, .pth, and .pb files.
The PyTorch fallback path supports only .npy, .pt, and .pth
inputs because it loads the full tensor into memory before encoding.
Streaming mode is native-only and currently accepts .parquet files.
Remote s3:// and gs:// paths are accepted when the native remote
I/O feature is enabled; remote query strings and fragments are rejected.
Arguments:
path: Local or supported remote input path.
streaming: Whether to request native streaming file loading.
Raises:
ValueError: If path is empty, includes an unsupported remote query/fragment, or requests streaming for an unsupported extension.
Returns:
self for fluent builder chaining.
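The path rules above can be approximated with the standard library. This is a sketch of the validation logic only: `check_source_path` and the constant names are hypothetical, the extension sets are copied from the description, and the real loader may perform additional checks (for example whether remote I/O is compiled in).

```python
from pathlib import PurePosixPath
from urllib.parse import urlsplit

NATIVE_EXTS = {".parquet", ".arrow", ".feather", ".ipc", ".npy", ".pt", ".pth", ".pb"}
TORCH_FALLBACK_EXTS = {".npy", ".pt", ".pth"}
STREAMING_EXTS = {".parquet"}
REMOTE_SCHEMES = {"s3", "gs"}


def check_source_path(path: str, streaming: bool = False) -> str:
    """Validate a source path and return its lowercase extension."""
    if not path:
        raise ValueError("path must not be empty")
    parts = urlsplit(path)
    is_remote = parts.scheme in REMOTE_SCHEMES
    if is_remote and (parts.query or parts.fragment):
        raise ValueError("remote paths must not carry a query string or fragment")
    # For remote URLs only the path component determines the extension.
    ext = PurePosixPath(parts.path if is_remote else path).suffix.lower()
    if streaming and ext not in STREAMING_EXTS:
        raise ValueError(f"streaming is not supported for {ext or 'this'} files")
    return ext
```

A caller could then compare the returned extension against `TORCH_FALLBACK_EXTS` to decide whether the PyTorch fallback path is able to load the file.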
seed
def seed(s: int | None = None) -> QuantumDataLoader
Set or clear the synthetic data seed.
None leaves the loader unseeded for the native Rust path and maps to
the PyTorch reference path's default deterministic seed. Integer seeds
must fit Rust u64 so the same configuration can be passed to the
native backend.
Arguments:
s: None or an integer in [0, 2**64 - 1].
Raises:
ValueError: If s is not None or a valid Rust u64.
Returns:
self for fluent builder chaining.
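The u64 constraint on seeds amounts to a simple range check, sketched below. `validate_seed` is a hypothetical helper, and rejecting `bool` values (which are technically integers in Python) is an assumption of this sketch rather than documented behaviour.

```python
from __future__ import annotations

U64_MAX = 2**64 - 1  # largest value a Rust u64 can hold


def validate_seed(s: int | None) -> int | None:
    """Return the seed unchanged if it is None or fits in a Rust u64."""
    if s is None:
        return None
    # Assumption: bools are rejected even though bool subclasses int.
    if isinstance(s, bool) or not isinstance(s, int):
        raise ValueError("seed must be None or an integer")
    if not 0 <= s <= U64_MAX:
        raise ValueError(f"seed must be in [0, {U64_MAX}]")
    return s
```

Checking the range on the Python side means an out-of-range seed fails with a clear `ValueError` before any value is handed to the native extension.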
null_handling
def null_handling(policy: str) -> QuantumDataLoader
Set how nullable file inputs are handled by the native loader.
Valid policies are "fill_zero" (replace nulls with zero before
encoding) and "reject" (fail on null input). The policy is passed
through to Rust file and synthetic loader creation when available. The
PyTorch fallback loaders do not consume this setting because supported
.npy/.pt/.pth inputs are loaded as dense tensors.
Arguments:
policy: Null handling policy, either "fill_zero" or "reject".
Raises:
ValueError: If policy is not supported.
Returns:
self for fluent builder chaining.
backend
def backend(name: str) -> QuantumDataLoader
Set encoding backend: 'rust', 'pytorch', or 'auto'.
'auto': tries the Rust backend first and falls back to the PyTorch
reference backend if the Rust extension is unavailable, emitting a
RuntimeWarning when the fallback occurs. 'rust' raises if the
extension is missing. 'pytorch' always uses the pure-PyTorch path.
Returns self for chaining.
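The three backend modes can be summarised in a short decision function. This sketch is illustrative only: `resolve_loader_backend` and its `rust_available` flag are hypothetical, and the `RuntimeError` for a missing extension and the `ValueError` for an unknown name are assumed exception types.

```python
import warnings


def resolve_loader_backend(name: str, rust_available: bool) -> str:
    """Pick the concrete backend for 'rust', 'pytorch', or 'auto'."""
    if name not in {"rust", "pytorch", "auto"}:
        raise ValueError(f"unknown backend {name!r}")  # assumed behaviour
    if name == "pytorch":
        return "pytorch"  # always the pure-PyTorch path
    if rust_available:
        return "rust"
    if name == "rust":
        raise RuntimeError("Rust extension is not available")  # assumed type
    # name == "auto" and the extension is missing: fall back, but say so.
    warnings.warn(
        "Rust extension unavailable; falling back to the PyTorch backend",
        RuntimeWarning,
        stacklevel=2,
    )
    return "pytorch"
```

Emitting a warning on fallback keeps "auto" convenient while still making it visible when a run is measuring the slower reference path.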
as_torch_dataset
def as_torch_dataset()
Wrap this loader as a torch.utils.data.IterableDataset.
Returns a dataset that yields one encoded batch (torch.Tensor) per
iteration step, compatible with torch.utils.data.DataLoader.
Example::
    from qumat_qdp import QuantumDataLoader
    import torch

    dataset = (QuantumDataLoader()
               .qubits(16).encoding("amplitude")
               .batches(100, size=64)
               .source_synthetic()
               .as_torch_dataset())
    loader = torch.utils.data.DataLoader(dataset, batch_size=None, num_workers=0)
    for batch in loader:
        ...  # batch is torch.Tensor, shape (64, 2**16)
Note: batch_size=None in DataLoader disables DataLoader's own batching;
num_workers=0 is required because the Rust backend holds GPU state that
cannot be pickled for multi-process workers.
__iter__
def __iter__() -> Iterator[object]
Return iterator that yields one encoded batch per step.
With the default "rust" backend, yields QuantumTensor
(use torch.from_dlpack(qt)). With .backend("pytorch"),
yields torch.Tensor directly.
Triton AMD backend for QDP encodings on ROCm.
is_triton_amd_available
def is_triton_amd_available() -> bool
Return whether the Triton AMD backend appears usable.
Returns:
True when PyTorch reports a ROCm device, Triton imports, and
the active Triton target is HIP or cannot be queried reliably.
TritonAmdEngine
@dataclass
class TritonAmdEngine()
ROCm/Triton implementation of the QDP encoder interface.
This engine targets AMD GPUs through a PyTorch ROCm runtime plus the Triton
Python package. encode() accepts "amplitude", "angle",
"basis", "iqp", "iqp-z", and "phase". The phase encoder
uses a fused Triton HIP kernel for float32 and 1 <= num_qubits <= 32;
other supported cases fall back to vectorized PyTorch operations on the same
ROCm device.
precision accepts "float32"/"f32"/"float" and
"float64"/"f64"/"double". Runtime availability is checked when
encode() is called and raises a descriptive RuntimeError if PyTorch
ROCm or Triton is unavailable.
Arguments:
device_id: ROCm device ordinal, addressed through PyTorch as cuda:{device_id}.
precision: Floating-point precision for real inputs and complex outputs.
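The precision aliases listed above reduce to a lookup table. The `normalize_precision` helper is a hypothetical name, and raising `ValueError` for an unknown alias is an assumption; only the alias spellings themselves are taken from the description.

```python
_PRECISION_ALIASES = {
    "float32": "float32", "f32": "float32", "float": "float32",
    "float64": "float64", "f64": "float64", "double": "float64",
}


def normalize_precision(precision: str) -> str:
    """Map an accepted alias to its canonical name."""
    try:
        return _PRECISION_ALIASES[precision]
    except KeyError:
        raise ValueError(f"unsupported precision {precision!r}") from None
```

Normalising once up front lets the rest of an engine compare against just two canonical strings.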
check_runtime
def check_runtime() -> None
Validate that the process can use the Triton AMD backend.
Raises:
RuntimeError: If PyTorch ROCm support or Triton is unavailable.
encode_amplitude
def encode_amplitude(data: Any, num_qubits: int) -> Any
Encode real-valued samples as normalized amplitudes.
Arguments:
data: One- or two-dimensional samples with width at most 2**num_qubits.
num_qubits: Number of qubits in the encoded state.
Raises:
ValueError: If the sample width exceeds the state length.
Returns:
Complex tensor of shape (batch, 2**num_qubits).
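For intuition, amplitude encoding of a single sample can be written in pure Python. This sketch is not the Triton kernel: it assumes short samples are zero-padded to length 2**num_qubits and that an all-zero sample is rejected, neither of which is stated above, and `encode_amplitude_sketch` is a hypothetical name.

```python
import math


def encode_amplitude_sketch(sample, num_qubits):
    """L2-normalize one real-valued sample into a state of length 2**num_qubits."""
    dim = 2 ** num_qubits
    if len(sample) > dim:
        raise ValueError(f"sample width {len(sample)} exceeds state length {dim}")
    norm = math.sqrt(sum(x * x for x in sample))
    if norm == 0.0:
        raise ValueError("cannot normalize an all-zero sample")  # assumed behaviour
    padded = list(sample) + [0.0] * (dim - len(sample))  # assumed zero-padding
    return [complex(x / norm, 0.0) for x in padded]
```

The squared magnitudes of the result sum to 1, which is what makes the output a valid quantum state vector.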
encode_angle
def encode_angle(data: Any, num_qubits: int) -> Any
Encode samples with product-state angle encoding.
Arguments:
data: One- or two-dimensional angle samples with exactly num_qubits values per sample.
num_qubits: Number of qubits in the encoded state.
Raises:
ValueError: If the sample width is not num_qubits.
Returns:
Complex tensor of shape (batch, 2**num_qubits).
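A product-state angle encoding can be sketched for one sample as below. The per-qubit convention used here, cos(x_k) for bit 0 and sin(x_k) for bit 1 with qubit k mapped to bit k of the basis index, is an assumption chosen for illustration; the engine's actual convention (for example half-angles or a different bit order) is not specified above. `encode_angle_sketch` is a hypothetical name.

```python
import math


def encode_angle_sketch(angles, num_qubits):
    """Build the tensor-product state of per-qubit (cos x_k, sin x_k) pairs."""
    if len(angles) != num_qubits:
        raise ValueError(f"expected {num_qubits} angles, got {len(angles)}")
    state = []
    for index in range(2 ** num_qubits):
        amplitude = 1.0
        for k, theta in enumerate(angles):
            bit = (index >> k) & 1  # assumed: qubit k <-> bit k of the index
            amplitude *= math.sin(theta) if bit else math.cos(theta)
        state.append(complex(amplitude, 0.0))
    return state
```

Because each qubit factor has unit norm, the full product state is normalized without an explicit normalization step.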
encode_basis
def encode_basis(data: Any, num_qubits: int) -> Any
Encode integer basis-state indices as one-hot quantum states.
Arguments:
data: One-dimensional indices or a two-dimensional column of indices in [0, 2**num_qubits - 1].
num_qubits: Number of qubits in the encoded state.
Raises:
ValueError: If indices are empty, malformed, or out of range.
Returns:
Complex one-hot tensor of shape (batch, 2**num_qubits).
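Basis encoding is the simplest of the encoders to illustrate: each index selects the single nonzero amplitude of its row. The pure-Python sketch below mirrors the validation rules listed above for one-dimensional input; `encode_basis_sketch` is a hypothetical name and nested lists stand in for a tensor.

```python
def encode_basis_sketch(indices, num_qubits):
    """Return one one-hot complex row of length 2**num_qubits per index."""
    dim = 2 ** num_qubits
    if len(indices) == 0:
        raise ValueError("indices must not be empty")
    rows = []
    for i in indices:
        if not isinstance(i, int) or not 0 <= i < dim:
            raise ValueError(f"index {i!r} is outside [0, {dim - 1}]")
        row = [0j] * dim
        row[i] = 1 + 0j  # amplitude 1 on the selected basis state
        rows.append(row)
    return rows
```

For `num_qubits=2`, index 2 produces the row `[0, 0, 1, 0]`, i.e. the basis state |10> under the usual binary labelling.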
encode_iqp
def encode_iqp(data: Any, num_qubits: int, *, enable_zz: bool = True) -> Any
Encode samples with the IQP feature map.
Arguments:
data: IQP parameters. With enable_zz=True, each sample must contain num_qubits + num_qubits * (num_qubits - 1) // 2 values; otherwise each sample must contain num_qubits values.
num_qubits: Number of qubits in the encoded state.
enable_zz: Include pairwise ZZ interactions when True.
Raises:
ValueError: If the parameter width does not match the IQP variant.
Returns:
Complex tensor of shape (batch, 2**num_qubits).
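The expected parameter width follows directly from the formula above: one value per qubit plus, when ZZ terms are enabled, one value per unordered qubit pair. The helper name `iqp_width` is hypothetical.

```python
def iqp_width(num_qubits: int, enable_zz: bool = True) -> int:
    """Number of input values each sample must provide."""
    single = num_qubits  # one parameter per qubit
    pairs = num_qubits * (num_qubits - 1) // 2  # one per unordered pair
    return single + pairs if enable_zz else single
```

For 4 qubits this gives 4 + 6 = 10 values with ZZ interactions and 4 without, so input width grows quadratically with qubit count in the full variant.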
encode_phase
def encode_phase(data: Any, num_qubits: int) -> Any
Encode samples as equal-magnitude states with data-dependent phase.
Arguments:
data: One- or two-dimensional phase samples with exactly num_qubits values per sample.
num_qubits: Number of qubits in the encoded state.
Raises:
ValueError: If the sample width is not num_qubits.
Returns:
Complex tensor of shape (batch, 2**num_qubits).
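One way to obtain an equal-magnitude state with data-dependent phases is sketched below. The specific phase rule, where the phase of a basis state is the sum of x_k over its set bits, is an assumption made for illustration and may differ from the formula the fused kernel implements; `encode_phase_sketch` is a hypothetical name.

```python
import cmath
import math


def encode_phase_sketch(phases, num_qubits):
    """Every amplitude has magnitude 2**(-n/2); only its phase depends on the data."""
    if len(phases) != num_qubits:
        raise ValueError(f"expected {num_qubits} phases, got {len(phases)}")
    scale = 1.0 / math.sqrt(2 ** num_qubits)
    state = []
    for index in range(2 ** num_qubits):
        # Assumed rule: add x_k whenever bit k of the basis index is set.
        total = sum(phi for k, phi in enumerate(phases) if (index >> k) & 1)
        state.append(scale * cmath.exp(1j * total))
    return state
```

Under this assumed rule the state is the product of per-qubit states (|0> + e^{i x_k}|1>)/sqrt(2), which is why all magnitudes come out equal.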
encode
def encode(data: Any,
num_qubits: int,
encoding_method: str = "amplitude") -> Any
Encode input samples using a named Triton AMD encoding.
Arguments:
data: Input samples for the selected encoding method.
num_qubits: Number of qubits in the encoded state.
encoding_method: One of "amplitude", "angle", "basis", "iqp", "iqp-z", or "phase".
Raises:
RuntimeError: If the Triton AMD runtime is unavailable.
ValueError: If encoding_method is unsupported or inputs are invalid for the selected encoder.
Returns:
Complex tensor of shape (batch, 2**num_qubits).