Quantum Support Vector Machine in PennyLane¶
Photo by Maksym Kaharlytskyi on Unsplash
In this Notebook, we will give a introduction of how to use Quantum Support Vector Mechine in PennyLane. For demonstrate purpose, we will work on 2 categories of load_wine
.
Python Workflow¶
- Preparing data set from load_wine
- PennyLane and scikit-learn
- Dataset dimension reduction
- Custom feature maps
Preparing data set from load_wine¶
First, let's import Sklean package load_wine
to explore our data. The wine recofnition dataset a labeled dataset with information about wines. There are 13 numerical variables that describe the color intensitym alcohol concentration, and other things.
Learn more about tutorial for
load_wine
: https://scikit-learn.org/stable/modules/generated/sklearn.datasets.load_wine.htmlFind out raw data of the
load_wine
: https://archive.ics.uci.edu/ml/machine-learning-databases/wine/wine.data
# import `load_wine` dataset
from sklearn.datasets import load_wine
import numpy as np
# Set seed for reproducibility
np.random.seed(seed=5678)
x,y = load_wine(return_X_y=True)
In load_wine
, the first 59 elements must belong to the first category (label 0) while the 71 sybsequent belongs to the second one (label 1). We can run the following code to obtain a labeled dataset with two categories.
# Let's print out our dataset
x = x[:59+71]
y = y[:59+71]
# print out x and y to explore our data
print("="*10)
print(f" x data\n {x}")
print("="*10)
print(f" The first row from the raw data: \n {x[0]}")
print(f" Number of x variables : {len(x[0])}")
print("="*10)
print(f" y data\n {y}")
========== x data [[1.423e+01 1.710e+00 2.430e+00 ... 1.040e+00 3.920e+00 1.065e+03] [1.320e+01 1.780e+00 2.140e+00 ... 1.050e+00 3.400e+00 1.050e+03] [1.316e+01 2.360e+00 2.670e+00 ... 1.030e+00 3.170e+00 1.185e+03] ... [1.179e+01 2.130e+00 2.780e+00 ... 9.700e-01 2.440e+00 4.660e+02] [1.237e+01 1.630e+00 2.300e+00 ... 8.900e-01 2.780e+00 3.420e+02] [1.204e+01 4.300e+00 2.380e+00 ... 7.900e-01 2.570e+00 5.800e+02]] ========== The first row from the raw data: [1.423e+01 1.710e+00 2.430e+00 1.560e+01 1.270e+02 2.800e+00 3.060e+00 2.800e-01 2.290e+00 5.640e+00 1.040e+00 3.920e+00 1.065e+03] Number of x variables : 13 ========== y data [0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1]
Here,
x
is the data set includes the 13 variables andy
is the corresponding of thex
.0
represents the category 0 and1
represents the category 1.
Let's call train_test_split
from sklearn.model_selection
to help us split our training and test dataset. We set the training_size = 0.9
as we take 90% of the raw data from x and y as a training set and the remaining 10% as the test set.
from sklearn.model_selection import train_test_split
# Split the data into training and test sets with training size of 0.9.
x_tr, x_test, y_tr, y_test = train_test_split(x,y, train_size=0.9)
Before we do any quantum operations, we should normalized our dataset. We will use the easiest way of normalization ways: scaling each of the variables linearly in such way that the maximum absolute value taken by each variable be 1.
Please run MaxAbsScaler()
object from sklearn.preprocessing
to normalize our training set.
from sklearn.preprocessing import MaxAbsScaler
# Normalize our training data into a range of {0,1}
scaler = MaxAbsScaler()
x_tr = scaler.fit_transform(x_tr)
Now, all of our data are positive number between 0 and 1. It is worth to notice that we don't normalize the test simultaneously since we will includes the information from the training set. Instead, we normalize the test set independently from the training set. Please run the following code to normalize the testing set.
# Normalize our test set independently from the training set to avoid includes training set data leaks into test set.
x_test = scaler.transform(x_test)
x_test = np.clip(x_test,0,1)
PennyLane and scikit-learn¶
Since out dataset has 13 variables, using angle encoding or the ZZ feature map need 13 qubtis. This maight not be feasible if we want our kernal to be simulated on some powerful computer. Therefore, we will work on amplitude encoding using 4 qubots, which can encode up to 16 qubits, and we will refill the remaining qubits with zeros.
Here's the installation instruction for lightning.qubit
:
https://docs.pennylane.ai/projects/lightning/en/stable/lightning_qubit/installation.html
Amplitude Encoding¶
import pennylane as qml
# initialize our quantum device
nqubits = 4
dev = qml.device(name = 'default.qubit', wires = nqubits)
# define our kernel circuit
@qml.qnode(dev)
def kernel_circ(a, b):
# encode a into amplitude vector of n qubits
qml.AmplitudeEmbedding(
a, wires=range(nqubits), pad_with=0, normalize=True
)
# Conpute the inverse(adjoint) of the amplitude encoding of b
qml.adjoint(qml.AmplitudeEmbedding(
b, wires=range(nqubits), pad_with=0, normalize=True
))
return qml.probs(wires=range(nqubits))
Here,
- We use
AmplitudeEmbedding()
to returns an operation equivalent to the amplitude encoding of its first arguemnt. In our case, we useda
andb
for this first argument. AndAmplitudeEmbedding
encodes $2^n$ features into the amplitude vector of $n$ qubits. - There are the classical data that our kernel function takes as input. More, we also asked
AmplitudeEmbedding()
to noramlize each input vector for us, just as amplitude encoding needs us to do (as we were told in basic quantum state). - We use
pad_with = 0
to fill the remaining qubits with zeros since we are only considernqubits = 4
case out of total of 13 cases. - We implemented
qml.adjoint()
to compute the adjoint (inverse) of the amplitude encodingb
. - Lastly, we retrive the probabilities of measuring each possible state in the computational basis by using
qml.probs(wires=range(nqubits))
. The first element of this array will be the output of our kernel.
Recall that if the amplitude encoding feature map is given an inputs $x_{0},\cdots,x_{2^{n}-1}$, it simply prepares the state
$$ | \phi(\overrightarrow{a}\rangle) = \frac{1}{\sqrt{\sum_{k}x_{k}^{2}}} \sum_{k=0}^{2^{n}-1}x_{k}|k\rangle. $$
You can run the below to check if the circuit works as expected. The first entry should return 1, which corresponds to the output of the kernel.
kernel_circ(x_tr[0],x_tr[0])
tensor([1.00000000e+00, 9.90011286e-35, 1.49909334e-32, 2.57335126e-33, 1.47137684e-32, 8.95091217e-34, 4.65150803e-33, 9.97555849e-35, 4.93038066e-32, 1.26313737e-34, 7.40552773e-33, 4.06038555e-33, 1.37899948e-32, 6.66163615e-34, 1.50394221e-33, 2.17790719e-36], requires_grad=True)
Now, let's run the following code to train our model. In order to use a custom kernel, you are required to provide a kernel
function accepting two arrays, A
and B
, and returning a matrix with entries (j,k)
containing the kernel applied to A[j]
and B[k]
.
from sklearn.svm import SVC
def qkernal(A, B):
# [0] for return the first entry: the measured value.
return np.array([[kernel_circ(a,b)[0] for b in B] for a in A] )
# Fit the model
svm = SVC(kernel=qkernal).fit(x_tr, y_tr)
The training can take up to a few mintues depends on the performance of your computer. Once it's over, you can check the accuracy of your trained model with the following instructions:
from sklearn.metrics import accuracy_score
score_test_amp = accuracy_score(svm.predict(x_test), y_test)
print(f"Amplitude encoding test score: {score_test_amp}")
Amplitude encoding test score: 0.9230769230769231
In our casem this gives an accuracy of 0.92, meaning that the SVM is capable of classifying most of the elements in the test dataset correctly.
Dataset dimension reduction¶
We've seen how to use amplitude encoding to take full advantage of the 13 variables of our dataset while only using 4 qubits. Now, let's see how can we reduce the number of variables in the dataset - while trying to minimize the loss of information - and thus be able to use other feature maps that could perhaps yield better results.
Here, we will try to reduce the number of variables in our dataset into 8 and we train our QVSM with the new angle encoding.
The method we are going to use in this section is called principal component analysis.
Principle directions¶
When you have a dataset with $n$ variables, you are basically have a set of points in $R^{n}$.
- The first principle direction is the direction of the line that best fits the data as measured by the mean squared error.
- The second principle direction is the direction of the line that best fits the data while being orthogonal to the first principle direction.
This goes on in such way that the $k$-th principal direction is that of the line that best fits the data while being orthogonal to the first, second, and all the way up to $(k-1)$-th principle direction.
Let's consider an orthonormal basis $\{v_{1},\cdots,v_{n}\}$ of $R^{n}$ in which $v_{j}$ points in the direction of the $j$-th principal component. The vectors in this orthonormal basis will be of the form $v_{j} = (v_{j}^{1},\cdots,v_{j}^{n}) \in R^{n}$.
When using principal component analysis, we compute the vector of the aforementioned basis. Then we define variables
$$ \overrightarrow{x}_{j} = v_{j}^{1}x_{1} + \cdots +v_{j}^{n}x_{n}. $$
The PCA
class uses fit
method that analyzes the data and figures out the best way to reduce its dimensionality using principal component analysis. Follow by transform, which can then transform any data in the way it learned to do when fit
was invoked.
from sklearn.decomposition import PCA
# Apply PCA method with targeted dimension of 8
pca = PCA(n_components=8)
xs_tr = pca.fit_transform(x_tr)
xs_test = pca.fit_transform(x_test)
Angle Encoding¶
nqubits = 8
dev = qml.device(name = 'default.qubit', wires = nqubits)
# Apply Angle encoding
@qml.qnode(dev)
def kernel_circ(a,b):
qml.AngleEmbedding(a, wires = range(nqubits))
qml.adjoint(qml.AngleEmbedding(b, wires=range(nqubits)))
return qml.probs(wires=range(nqubits))
# Fit the model
svm = SVC(kernel=qkernal).fit(xs_tr, y_tr)
score_test_angle = accuracy_score(svm.predict(xs_test), y_test)
print(f"Angle encoding test score: {score_test_angle}")
Angle encoding test score: 0.9230769230769231
You should get the result of around 92.3% of accuracy and should have less time then the amplitude encoding with all 13 varaibels.
Custom feature maps¶
In this section, we will train a QSVM on the reduced dataset using our own implementation of the ZZ feature map.
from itertools import combinations
# creating a ZZFeatureMap
def ZZFeatureMap(nqubits, data):
# Number of variables that we will load:
# Could be smaller thatn the number of qubits
nload = min(len(data), nqubits)
# apply Hadamard and Rz rotation for each qubit,
for i in range(nload):
qml.Hadamard(i)
qml.RZ(2.0 * data[i], wires=i)
for pair in list(combinations(range(nload),2)):
q0 = pair[0]
q1 = pair[1]
qml.CZ(wires=[q0, q1])
qml.RZ(2.0 * (np.pi - data[q0]) * (np.pi - data[q1]), wires = q1)
qml.CZ(wires=[q0, q1])
We have used the combinations
function from the itertools
module. This function take 2 arguments: an array arr
and an integer l
. And it returns an array with all the sorted tuples of length l
with elements from the array arr
.
Let's use it as our kernel function and train our model!
nqubits = 4
dev = qml.device(name = 'default.qubit', wires = nqubits)
@qml.qnode(dev)
def kernel_circ(a,b):
ZZFeatureMap(nqubits, a)
qml.adjoint(ZZFeatureMap)(nqubits, b) ## qml.adjoint(fn: Operator)(nqubits, b)
return qml.probs(wires=range(nqubits))
# Fit the model
svm = SVC(kernel=qkernal).fit(xs_tr, y_tr)
score_test_ZZ = accuracy_score(svm.predict(xs_test), y_test)
print(f"ZZ feature map encoding test score: {score_test_ZZ}")
ZZ feature map encoding test score: 0.8461538461538461
It is a fact that qml.adjoint
is acting on the ZZFeatureMap
function itself, not on its qubit.
print(f"------------------------------------------")
print(f"Feature map | Test Score |")
print(f"------------------------------------------")
print(f"Amplitude encoding | {score_test_amp:10.3f} | ")
print(f"Angle encoding | {score_test_angle:10.3f} |")
print(f"ZZ feature map | {score_test_ZZ:10.3f} | ")
print(f"------------------------------------------")
------------------------------------------ Feature map | Test Score | ------------------------------------------ Amplitude encoding | 0.923 | Angle encoding | 0.923 | ZZ feature map | 0.846 | ------------------------------------------
import sys
import platform
import pennylane
import pennylane_lightning
print("="*10 + " Version Information " + "="*10)
print(f"Python : {sys.version}")
print(f"Operating System : {platform.system()} {platform.release()} ({platform.architecture()[0]})")
print("="*41)
print(f"Pennylane : {pennylane.__version__}")
print(f"pennylane_lightning : {pennylane_lightning.__version__} ")
print("="*41)
========== Version Information ========== Python : 3.11.11 (main, Dec 11 2024, 10:28:39) [Clang 14.0.6 ] Operating System : Darwin 24.3.0 (64bit) ========================================= Pennylane : 0.26.0 pennylane_lightning : 0.28.0 =========================================
References¶
[1]. Combarro, E. F., & González-Castillo, S. (2023). A practical guide to quantum machine learning and quantum optimisation: Hands-on approach to modern quantum algorithms. Packt Publishing.