Quantum Support Vector Machine in PennyLane¶

Photo by Maksym Kaharlytskyi on Unsplash

In this Notebook, we will give a introduction of how to use Quantum Support Vector Mechine in PennyLane. For demonstrate purpose, we will work on 2 categories of load_wine.

Python Workflow¶

Preparing data set from load_wine
PennyLane and scikit-learn
Dataset dimension reduction
Custom feature maps

Preparing data set from load_wine¶

First, let's import Sklean package load_wine to explore our data. The wine recofnition dataset a labeled dataset with information about wines. There are 13 numerical variables that describe the color intensitym alcohol concentration, and other things.

Learn more about tutorial for load_wine: https://scikit-learn.org/stable/modules/generated/sklearn.datasets.load_wine.html
Find out raw data of the load_wine: https://archive.ics.uci.edu/ml/machine-learning-databases/wine/wine.data

In [1]:

Copied!





# import `load_wine` dataset
from sklearn.datasets import load_wine
import numpy as np

# Set seed for reproducibility
np.random.seed(seed=5678)

x,y = load_wine(return_X_y=True)
# import `load_wine` dataset
from sklearn.datasets import load_wine
import numpy as np

# Set seed for reproducibility
np.random.seed(seed=5678)

x,y = load_wine(return_X_y=True)

In load_wine, the first 59 elements must belong to the first category (label 0) while the 71 sybsequent belongs to the second one (label 1). We can run the following code to obtain a labeled dataset with two categories.

In [2]:

Copied!





# Let's print out our dataset
x = x[:59+71]
y = y[:59+71]

# print out x and y to explore our data
print("="*10)
print(f" x data\n {x}")
print("="*10)
print(f" The first row from the raw data: \n {x[0]}")
print(f" Number of x variables : {len(x[0])}")
print("="*10)
print(f" y data\n {y}")
# Let's print out our dataset
x = x[:59+71]
y = y[:59+71]

# print out x and y to explore our data
print("="*10)
print(f" x data\n {x}")
print("="*10)
print(f" The first row from the raw data: \n {x[0]}")
print(f" Number of x variables : {len(x[0])}")
print("="*10)
print(f" y data\n {y}")

==========
 x data
 [[1.423e+01 1.710e+00 2.430e+00 ... 1.040e+00 3.920e+00 1.065e+03]
 [1.320e+01 1.780e+00 2.140e+00 ... 1.050e+00 3.400e+00 1.050e+03]
 [1.316e+01 2.360e+00 2.670e+00 ... 1.030e+00 3.170e+00 1.185e+03]
 ...
 [1.179e+01 2.130e+00 2.780e+00 ... 9.700e-01 2.440e+00 4.660e+02]
 [1.237e+01 1.630e+00 2.300e+00 ... 8.900e-01 2.780e+00 3.420e+02]
 [1.204e+01 4.300e+00 2.380e+00 ... 7.900e-01 2.570e+00 5.800e+02]]
==========
 The first row from the raw data: 
 [1.423e+01 1.710e+00 2.430e+00 1.560e+01 1.270e+02 2.800e+00 3.060e+00
 2.800e-01 2.290e+00 5.640e+00 1.040e+00 3.920e+00 1.065e+03]
 Number of x variables : 13
==========
 y data
 [0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1]

Here,

x is the data set includes the 13 variables and y is the corresponding of the x.
0 represents the category 0 and 1 represents the category 1.

Let's call train_test_split from sklearn.model_selection to help us split our training and test dataset. We set the training_size = 0.9 as we take 90% of the raw data from x and y as a training set and the remaining 10% as the test set.

In [3]:

Copied!

from sklearn.model_selection import train_test_split

# Split the data into training and test sets with training size of 0.9.
x_tr, x_test, y_tr, y_test = train_test_split(x,y, train_size=0.9)
from sklearn.model_selection import train_test_split

# Split the data into training and test sets with training size of 0.9.
x_tr, x_test, y_tr, y_test = train_test_split(x,y, train_size=0.9)

Before we do any quantum operations, we should normalized our dataset. We will use the easiest way of normalization ways: scaling each of the variables linearly in such way that the maximum absolute value taken by each variable be 1.

Please run MaxAbsScaler() object from sklearn.preprocessing to normalize our training set.

In [4]:

Copied!

from sklearn.preprocessing import MaxAbsScaler

# Normalize our training data into a range of {0,1}
scaler = MaxAbsScaler()
x_tr = scaler.fit_transform(x_tr)
from sklearn.preprocessing import MaxAbsScaler

# Normalize our training data into a range of {0,1}
scaler = MaxAbsScaler()
x_tr = scaler.fit_transform(x_tr)

Now, all of our data are positive number between 0 and 1. It is worth to notice that we don't normalize the test simultaneously since we will includes the information from the training set. Instead, we normalize the test set independently from the training set. Please run the following code to normalize the testing set.

In [5]:

Copied!

# Normalize our test set independently from the training set to avoid includes training set data leaks into test set.
x_test = scaler.transform(x_test)
x_test = np.clip(x_test,0,1)
# Normalize our test set independently from the training set to avoid includes training set data leaks into test set.
x_test = scaler.transform(x_test)
x_test = np.clip(x_test,0,1)

PennyLane and scikit-learn¶

Since out dataset has 13 variables, using angle encoding or the ZZ feature map need 13 qubtis. This maight not be feasible if we want our kernal to be simulated on some powerful computer. Therefore, we will work on amplitude encoding using 4 qubots, which can encode up to 16 qubits, and we will refill the remaining qubits with zeros.

Here's the installation instruction for lightning.qubit:

https://docs.pennylane.ai/projects/lightning/en/stable/lightning_qubit/installation.html

Amplitude Encoding¶

In [6]:

Copied!





import pennylane as qml

# initialize our quantum device
nqubits = 4
dev = qml.device(name = 'default.qubit', wires = nqubits)

# define our kernel circuit
@qml.qnode(dev)
def kernel_circ(a, b):
    # encode a into amplitude vector of n qubits
    qml.AmplitudeEmbedding(
        a, wires=range(nqubits), pad_with=0, normalize=True
    )
    # Conpute the inverse(adjoint) of the amplitude encoding of b
    qml.adjoint(qml.AmplitudeEmbedding(
        b, wires=range(nqubits), pad_with=0, normalize=True
    ))
    return qml.probs(wires=range(nqubits))
import pennylane as qml

# initialize our quantum device
nqubits = 4
dev = qml.device(name = 'default.qubit', wires = nqubits)

# define our kernel circuit
@qml.qnode(dev)
def kernel_circ(a, b):
    # encode a into amplitude vector of n qubits
    qml.AmplitudeEmbedding(
        a, wires=range(nqubits), pad_with=0, normalize=True
    )
    # Conpute the inverse(adjoint) of the amplitude encoding of b
    qml.adjoint(qml.AmplitudeEmbedding(
        b, wires=range(nqubits), pad_with=0, normalize=True
    ))
    return qml.probs(wires=range(nqubits))

Here,

We use AmplitudeEmbedding() to returns an operation equivalent to the amplitude encoding of its first arguemnt. In our case, we used a and b for this first argument. And AmplitudeEmbedding encodes $2^n$ features into the amplitude vector of $n$ qubits.
There are the classical data that our kernel function takes as input. More, we also asked AmplitudeEmbedding() to noramlize each input vector for us, just as amplitude encoding needs us to do (as we were told in basic quantum state).
We use pad_with = 0 to fill the remaining qubits with zeros since we are only consider nqubits = 4 case out of total of 13 cases.
We implemented qml.adjoint() to compute the adjoint (inverse) of the amplitude encoding b.
Lastly, we retrive the probabilities of measuring each possible state in the computational basis by using qml.probs(wires=range(nqubits)). The first element of this array will be the output of our kernel.

Recall that if the amplitude encoding feature map is given an inputs $x_{0},\cdots,x_{2^{n}-1}$, it simply prepares the state

$$ | \phi(\overrightarrow{a}\rangle) = \frac{1}{\sqrt{\sum_{k}x_{k}^{2}}} \sum_{k=0}^{2^{n}-1}x_{k}|k\rangle. $$

You can run the below to check if the circuit works as expected. The first entry should return 1, which corresponds to the output of the kernel.

In [7]:

Copied!

kernel_circ(x_tr[0],x_tr[0])
kernel_circ(x_tr[0],x_tr[0])

Out[7]:

tensor([1.00000000e+00, 9.90011286e-35, 1.49909334e-32, 2.57335126e-33,
        1.47137684e-32, 8.95091217e-34, 4.65150803e-33, 9.97555849e-35,
        4.93038066e-32, 1.26313737e-34, 7.40552773e-33, 4.06038555e-33,
        1.37899948e-32, 6.66163615e-34, 1.50394221e-33, 2.17790719e-36], requires_grad=True)

Now, let's run the following code to train our model. In order to use a custom kernel, you are required to provide a kernel function accepting two arrays, A and B, and returning a matrix with entries (j,k) containing the kernel applied to A[j] and B[k].

In [8]:

Copied!





from sklearn.svm import SVC

def qkernal(A, B):
    # [0] for return the first entry: the measured value.
    return np.array([[kernel_circ(a,b)[0] for b in B] for a in A] )

# Fit the model
svm = SVC(kernel=qkernal).fit(x_tr, y_tr) 
from sklearn.svm import SVC

def qkernal(A, B):
    # [0] for return the first entry: the measured value.
    return np.array([[kernel_circ(a,b)[0] for b in B] for a in A] )

# Fit the model
svm = SVC(kernel=qkernal).fit(x_tr, y_tr) 

The training can take up to a few mintues depends on the performance of your computer. Once it's over, you can check the accuracy of your trained model with the following instructions:

In [12]:

Copied!

from sklearn.metrics import accuracy_score

score_test_amp = accuracy_score(svm.predict(x_test), y_test)
print(f"Amplitude encoding test score: {score_test_amp}")
from sklearn.metrics import accuracy_score

score_test_amp = accuracy_score(svm.predict(x_test), y_test)
print(f"Amplitude encoding test score: {score_test_amp}")

Amplitude encoding test score: 0.9230769230769231

In our casem this gives an accuracy of 0.92, meaning that the SVM is capable of classifying most of the elements in the test dataset correctly.

Dataset dimension reduction¶

We've seen how to use amplitude encoding to take full advantage of the 13 variables of our dataset while only using 4 qubits. Now, let's see how can we reduce the number of variables in the dataset - while trying to minimize the loss of information - and thus be able to use other feature maps that could perhaps yield better results.

Here, we will try to reduce the number of variables in our dataset into 8 and we train our QVSM with the new angle encoding.

The method we are going to use in this section is called principal component analysis.

Principle directions¶

When you have a dataset with $n$ variables, you are basically have a set of points in $R^{n}$.

The first principle direction is the direction of the line that best fits the data as measured by the mean squared error.
The second principle direction is the direction of the line that best fits the data while being orthogonal to the first principle direction.

This goes on in such way that the $k$-th principal direction is that of the line that best fits the data while being orthogonal to the first, second, and all the way up to $(k-1)$-th principle direction.

Let's consider an orthonormal basis $\{v_{1},\cdots,v_{n}\}$ of $R^{n}$ in which $v_{j}$ points in the direction of the $j$-th principal component. The vectors in this orthonormal basis will be of the form $v_{j} = (v_{j}^{1},\cdots,v_{j}^{n}) \in R^{n}$.

When using principal component analysis, we compute the vector of the aforementioned basis. Then we define variables

$$ \overrightarrow{x}_{j} = v_{j}^{1}x_{1} + \cdots +v_{j}^{n}x_{n}. $$

The PCA class uses fit method that analyzes the data and figures out the best way to reduce its dimensionality using principal component analysis. Follow by transform, which can then transform any data in the way it learned to do when fit was invoked.

In [14]:

Copied!

from sklearn.decomposition import PCA

# Apply PCA method with targeted dimension of 8
pca = PCA(n_components=8)

xs_tr = pca.fit_transform(x_tr)
xs_test = pca.fit_transform(x_test)
from sklearn.decomposition import PCA

# Apply PCA method with targeted dimension of 8
pca = PCA(n_components=8)

xs_tr = pca.fit_transform(x_tr)
xs_test = pca.fit_transform(x_test)

Angle Encoding¶

Figure. Angle encoding of an input using different rotation angle.

In [15]:

Copied!





nqubits = 8
dev = qml.device(name = 'default.qubit', wires = nqubits) 

# Apply Angle encoding
@qml.qnode(dev)
def kernel_circ(a,b):
    qml.AngleEmbedding(a, wires = range(nqubits))
    qml.adjoint(qml.AngleEmbedding(b, wires=range(nqubits)))
    return qml.probs(wires=range(nqubits))
nqubits = 8
dev = qml.device(name = 'default.qubit', wires = nqubits) 

# Apply Angle encoding
@qml.qnode(dev)
def kernel_circ(a,b):
    qml.AngleEmbedding(a, wires = range(nqubits))
    qml.adjoint(qml.AngleEmbedding(b, wires=range(nqubits)))
    return qml.probs(wires=range(nqubits))

In [16]:

Copied!

# Fit the model
svm = SVC(kernel=qkernal).fit(xs_tr, y_tr) 

score_test_angle = accuracy_score(svm.predict(xs_test), y_test)
print(f"Angle encoding test score: {score_test_angle}")
# Fit the model
svm = SVC(kernel=qkernal).fit(xs_tr, y_tr) 

score_test_angle = accuracy_score(svm.predict(xs_test), y_test)
print(f"Angle encoding test score: {score_test_angle}")

Angle encoding test score: 0.9230769230769231

You should get the result of around 92.3% of accuracy and should have less time then the amplitude encoding with all 13 varaibels.

Custom feature maps¶

Figure. ZZ feature map of three qubits with inputs x

In this section, we will train a QSVM on the reduced dataset using our own implementation of the ZZ feature map.

In [17]:

Copied!





from itertools import combinations

# creating a ZZFeatureMap
def ZZFeatureMap(nqubits, data):
    # Number of variables that we will load:
    # Could be smaller thatn the number of qubits
    nload = min(len(data), nqubits)
    
    # apply Hadamard and Rz rotation for each qubit,
    for i in range(nload):
        qml.Hadamard(i)
        qml.RZ(2.0 * data[i], wires=i)
        
    for pair in list(combinations(range(nload),2)):
        q0 = pair[0]
        q1 = pair[1]

        qml.CZ(wires=[q0, q1])
        qml.RZ(2.0 * (np.pi - data[q0]) * (np.pi - data[q1]), wires = q1)
        qml.CZ(wires=[q0, q1])
from itertools import combinations

# creating a ZZFeatureMap
def ZZFeatureMap(nqubits, data):
    # Number of variables that we will load:
    # Could be smaller thatn the number of qubits
    nload = min(len(data), nqubits)
    
    # apply Hadamard and Rz rotation for each qubit,
    for i in range(nload):
        qml.Hadamard(i)
        qml.RZ(2.0 * data[i], wires=i)
        
    for pair in list(combinations(range(nload),2)):
        q0 = pair[0]
        q1 = pair[1]

        qml.CZ(wires=[q0, q1])
        qml.RZ(2.0 * (np.pi - data[q0]) * (np.pi - data[q1]), wires = q1)
        qml.CZ(wires=[q0, q1])

We have used the combinations function from the itertools module. This function take 2 arguments: an array arr and an integer l. And it returns an array with all the sorted tuples of length l with elements from the array arr.

Let's use it as our kernel function and train our model!

In [18]:

Copied!





nqubits = 4
dev = qml.device(name = 'default.qubit', wires = nqubits) 

@qml.qnode(dev)
def kernel_circ(a,b):
    ZZFeatureMap(nqubits, a)
    qml.adjoint(ZZFeatureMap)(nqubits, b) ## qml.adjoint(fn: Operator)(nqubits, b)
    return qml.probs(wires=range(nqubits))
nqubits = 4
dev = qml.device(name = 'default.qubit', wires = nqubits) 

@qml.qnode(dev)
def kernel_circ(a,b):
    ZZFeatureMap(nqubits, a)
    qml.adjoint(ZZFeatureMap)(nqubits, b) ## qml.adjoint(fn: Operator)(nqubits, b)
    return qml.probs(wires=range(nqubits))

In [19]:

Copied!





# Fit the model
svm = SVC(kernel=qkernal).fit(xs_tr, y_tr) 
score_test_ZZ = accuracy_score(svm.predict(xs_test), y_test)
print(f"ZZ feature map encoding test score: {score_test_ZZ}")
# Fit the model
svm = SVC(kernel=qkernal).fit(xs_tr, y_tr) 
score_test_ZZ = accuracy_score(svm.predict(xs_test), y_test)
print(f"ZZ feature map encoding test score: {score_test_ZZ}")

ZZ feature map encoding test score: 0.8461538461538461

It is a fact that qml.adjoint is acting on the ZZFeatureMap function itself, not on its qubit.

In [32]:

Copied!





print(f"------------------------------------------")
print(f"Feature map                 | Test Score |")
print(f"------------------------------------------")
print(f"Amplitude encoding          | {score_test_amp:10.3f} | ")
print(f"Angle encoding              | {score_test_angle:10.3f} |")
print(f"ZZ feature map              | {score_test_ZZ:10.3f} | ")
print(f"------------------------------------------")
print(f"------------------------------------------")
print(f"Feature map                 | Test Score |")
print(f"------------------------------------------")
print(f"Amplitude encoding          | {score_test_amp:10.3f} | ")
print(f"Angle encoding              | {score_test_angle:10.3f} |")
print(f"ZZ feature map              | {score_test_ZZ:10.3f} | ")
print(f"------------------------------------------")

------------------------------------------
Feature map                 | Test Score |
------------------------------------------
Amplitude encoding          |      0.923 | 
Angle encoding              |      0.923 |
ZZ feature map              |      0.846 | 
------------------------------------------

In [20]:

Copied!





import sys
import platform
import pennylane
import pennylane_lightning


print("="*10 + " Version Information " + "="*10)
print(f"Python              : {sys.version}")
print(f"Operating System    : {platform.system()} {platform.release()} ({platform.architecture()[0]})")
print("="*41)
print(f"Pennylane           : {pennylane.__version__}")
print(f"pennylane_lightning : {pennylane_lightning.__version__} ")
print("="*41)
import sys
import platform
import pennylane
import pennylane_lightning


print("="*10 + " Version Information " + "="*10)
print(f"Python              : {sys.version}")
print(f"Operating System    : {platform.system()} {platform.release()} ({platform.architecture()[0]})")
print("="*41)
print(f"Pennylane           : {pennylane.__version__}")
print(f"pennylane_lightning : {pennylane_lightning.__version__} ")
print("="*41)

========== Version Information ==========
Python              : 3.11.11 (main, Dec 11 2024, 10:28:39) [Clang 14.0.6 ]
Operating System    : Darwin 24.3.0 (64bit)
=========================================
Pennylane           : 0.26.0
pennylane_lightning : 0.28.0 
=========================================

References¶

[1]. Combarro, E. F., & González-Castillo, S. (2023). A practical guide to quantum machine learning and quantum optimisation: Hands-on approach to modern quantum algorithms. Packt Publishing.