Quantum Approximate Optimization Algorithm

The Quantum Approximate Optimization Algorithm (QAOA) is a gate-based algorithm that can be considered the counterpart to quantum annealing in the quantum circuit model. QAOA utilizes quantum circuits—with their qubits, quantum gates, and measurements—to solve combinatorial optimization problems formulated in the QUBO framework.

After reading this chapter,

You will understand how QAOA works.
You will know how to design the circuits used in the algorithm.
You will be able to solve your own combinatorial optimization problems using QAOA in Qiskit and PennyLane.

From adiabatic computing to QAOA

Discretizing adiabatic quantum computing

In the Adiabatic Quantum Computing & Quantum Annealing (AQQA), we've learn how to use adiabatic quantum computing and the quantum annearling to calculate combinatorial optimization problems. Both methods relied on the adiabatic theorem, which we used a time-dependent Hamiltonian that induced a continuous transformation of the state of a quantum system. You may wonder is there any sort of analog to this way of solving optimization problems for circuit-based quantum computers.

QAOA was initially proposed as a discretization of adiabatic quantum computing with the goal of approximating the optimal solution to combinatorial optimization problems. Recall that the Hamiltonian used in adiabatic quantum computing and quantum annealing is of the form:

$H(t) = A(t)H_{0} + B(t)H_{1},$

with $H_{0}$ and $H_{1}$ two fixed Hamiltonians and $A(t)$ and $B(t)$ functions satifying $A(0) = B(T) = 1$ and $A(T) = B(0) = 0$ , where $T$ is the total time of the process. It turns out that the evolution of the quantum system is governed by the famous time-depent Schrödinger equation.

In QAOA, all we need to know is that, applying discretization, we can express the solution as a production of operators of the form

$e^{i\Delta t(A(t_{c})H_{0} + B (t_{c})H_{1})}$

applied to the initial state. Where,

$i$ is the imaginary unit.
$t_{c}$ is a fixed time point in $[0,T]$
$\Delta t$ is a small amount of time.

It's worth to notice that:

In the inteval $[t_{c}, t_{c} + \Delta]$ we assume that the Hamiltonian is constant and equal to $H(t_{c}) = A(t_c)H_{0} + B(t_{c})H_{1}$ .
The smaller $\Delta t$ is, the better this approximation will be.
It is also critical to know that $e^{i\Delta t(A(t_{c})H_{0} + B (t_{c})H_{1})}$ is a unitary transformation.
Some of the quantum gates such as $R_X$ , $R_Y$ , and $R_Z$ are also exponentials of some matirces.

Thus, by using the descretiation technique, if $\lvert \psi \rangle$ is the initial state, then the final state can be approximated by

$\bigg( \prod_{m = 0}^{p}e^{i\Delta t(A(t_{m})H_{0} + B (t_{m})H_{1})} \bigg) \lvert \psi_{0} \rangle$

where $t_{m} = m\frac{\Delta t}{T}$ , and $p = \frac{T}{\Delta t}$ (discretized by $p$ steps).

To compute this state with a quantum circuit, we just need an additional approximation. You know that $e^{a+b} = e^{a}e^{b}$ for any real numbers. However, this identity doesn't hold for a matrices, unless the matrices commute. However, if $\Delta t$ is small, then

$e^{i\Delta t(A(t_{m})H_{0} + B (t_{m})H_{1})} \approx e^{i\Delta tA(t_{c})H_{0}}\ e^{i\Delta t B (t_{c})H_{1}},$

which is known as the Lie-Trotter formula.

Putting theLie-Trotter formula with the discretization we mentioned above, we can obtain,

$\bigg( \prod_{m = 0}^{p}e^{i\Delta t(A(t_{m})H_{0} + B (t_{m})H_{1})} \bigg) \lvert \psi_{0} \rangle = \prod_{m = 0}^{p} e^{i\Delta tA(t_{c})H_{0}}\ e^{i\Delta t B (t_{c})H_{1}} \lvert \psi_{0} \rangle,$

which is the inspiration for QAOA.

QAOA Algorithm

Let's start with a combinatorial optimization problem that we want to solve, and we encode it into an Ising Hamiltonian $H_{1}$ . To find its ground state and solve our problem, we seek to apply a quantum state evolution similar to that of quantum annealing, but using a quantum circuit instead of a quantum annealer.

To simulate with a quantum circuit the evolution of a state under a time-dependent Hamiltonian, you only need to take an initial state $\lvert \psi \rangle$ and the alternate for $p$ times the application of the operators $e^{i\gamma H_{1}}$ and $e^{i\beta H_{0}}$ for some values of $\gamma$ and $\beta$ . We will see that the unitary transformations $e^{i\gamma H_{1}}$ and $e^{i\beta H_{0}}$ can be implemented with just one-qubit and two-qubit quantum gates.

What we are doing is using a quantum circuit to prepare a state of the form

$e^{i\beta_{p}H_{0}}e^{i\gamma_{p}H_{1}} \dots e^{i\beta_{2}H_{0}}e^{i\gamma_{2}H_{1}} e^{i\beta_{1}H_{0}}e^{i\gamma_{1}H_{1}} \lvert \psi_{0} \rangle,$

where $p \geq 1.$ Usually, we collect all the coefficients in the exponents in two tuples $\beta = (\beta_{1}, \cdots, \beta_{p})$ and $\gamma = (\gamma_{1}, \cdots, \gamma_{p})$ and we denote the whole state by $\lvert \beta , \gamma \rangle$ .

In QAOA, we choose a fixed value of $p$ and we have some values for $\beta$ and $\gamma$ . We'll just consider these values to be plain real numbers.

So how to choose the best possible values? Remember that we are just trying to find the ground state of $H_{1}$ , the lower the value of the energy $\langle \beta, \gamma \lvert H_{1} \lvert \beta, \gamma \rangle$ , the better. In this way, we have transformed our optimization problem into finding the values $\beta$ and $\gamma$ that minimize

$E(\beta, \gamma) = \langle \beta, \gamma \lvert H_{1} \lvert \beta, \gamma \rangle.$

Since $\beta$ , $\gamma$ , and energy $E(\beta, \gamma)$ are real numbers, we are just having a problem of finding a minimum for a real-valued function with real inputs. Therefore, we can just apply some optimization algorith such as the gradient descent algorithm. However, as we know, the number of amplitudes needed to describe a state like $\lvert \beta , \gamma \rangle$ is exponential in the number of qubits that we are using. Thus, computing $E(\beta, \gamma)$ may be difficult with just a classical computer.

Luckly, estimating value of $E(\beta, \gamma)$ is something that we can do very efficiently with a quantum computer - when the number of terms in $H_1$ is polynomial in the number of qubits.

Thus, we can use the classical optimization algorithm for function minimization and use a quantum computer to estimate $E$ value. Then we give that value back to the classical algorithm until it needs another of value $E$ . This is what we call a hybrid algorithm.

Once we obtained the optimal values $\beta^{*}$ and $\gamma^{*}$ for $\beta$ and $\gamma$ , we can use the quantum computer once more in order to prepare the state $\lvert \beta^{*}, \gamma^{*}\rangle$ . This state should have a sizeable overlap with the ground state of $H_{1}$ , so when we measure it in the computational basis, we will have a good chance of obtaining a string of zeros and ones that is a good solution to our original problem.

Takeaways

The input to QAOA is an Ising Hamiltonian $H_{1}$ , the ground state of which we wich to approximate becasuse it encodes the solution to a certain combinatorial optimization problem.
We consider the energy function $E(\beta, \gamma)$ as defined before and we proceed to minimize it.
Pick $p \geq 1$ and some initial values $\beta_{0}$ and $\gamma_{0}$ for some optimization algorithm.
Run the optimization in a classic computer, use quantum computer $E$ to prepare the state $\lvert \beta, \gamma \rangle$ and to estimate energy, retrun the value back to the optmization algorithm in a classic computer.
Continue this porcess until the classical optmization algorithm finds the optimal values of $\beta^{*}$ and $\gamma^{*}$ .
Using qunautm copmuter to prepare $\lvert \beta^{*}, \gamma^{*}\rangle$ .

Here is the pseudocode in the following algorithm:

Algorithm 5.1 (QAOA).

Choose a value for p.
Choose a starting set of values:
- β = (β₁, ..., βₚ)
- γ = (γ₁, ..., γₚ)
While the stopping criteria are not met, do the following:
- Prepare state |β, γ⟩. This is done on the quantum computer!
- From measurements of |β, γ⟩, estimate E(β, γ).
- Update β and γ according to the minimization algorithm.
Obtain the optimal values:
- β* and γ* returned by the minimization algorithm.
Prepare state |β*, γ*⟩. This is done on the quantum computer!
Measure the state to obtain an approximate solution.

Circuits of QAOA

For QAOA, it always involves with preparing state of the form

$\lvert \beta, \gamma \rangle = e^{i\beta_{p}H_{0}}e^{i\gamma_{p}H_{1}} \dots e^{i\beta_{2}H_{0}}e^{i\gamma_{2}H_{1}} e^{i\beta_{1}H_{0}}e^{i\gamma_{1}H_{1}} \lvert \psi_{0} \rangle,$

where $\lvert \psi_{0} \rangle$ is the ground state of $H_{0}$ . As we know, $H_{0}$ is usually taken to be $-\sum_{j=0}^{n-1} X_{j}$ , while $H_{1}$ is an Ising Hamiltonian of the form

$-\sum_{j,k} J_{jk}Z_{j}Z_{k} - \sum_{j} h_{j}Z_{j}$

where the coefficients $J_{jk}$ and $h_{j}$ are real numbers. The ground state of $H_{0}$ is $\bigotimes_{i=1}^{n-1} \lvert + \rangle$ . This state can be esaily prepared, you just need to use Hadamard gate on each qubit of the circuit starting from $\lvert 0 \rangle$ .

Let's focus on the operations of the form $e^{i \beta_{k}H_{0}}$ , with $\beta_{k}$ a real number. We also know that $H_{0} = -\sum_{j=0}^{n-1} X_{j}$ and that all $X_{j}$ matrices commute with each other. Thus, we can say

$e^{i \beta_{k}H_{0}} = e^{-i \beta_{k} \sum_{j=0}^{n-1}X_{j}} = \prod_{j=0}^{n-1} e^{-i \beta_{k}X_{j}}$

Since $e^{i \beta X_{j}}$ is the expression for the rotation gate $R_{X}(2\beta)$ , we just need to apply this gate to each of the qubits in our circuit.

Next, we will take care of the $e^{i \gamma_{l}H_{1}}$ for any real coefficient $\gamma_{l}$ . We know that $H_{1}$ is a sum of terms of the form $J_{jk}Z_{j}Z_{k}$ and $h_{j}Z_{j}$ . Since these matrices commute with each other, we get

$e^{i \gamma_{l}H_{1}} = e^{i\gamma_{l}(\sum_{j,k} J_{jk}Z_{j}Z_{k} + \sum_{j}h_{j}Z_{j})} = \prod_{j,k}e^{-i\gamma_{l}J_{jk}Z_{j}Z_{k}} \prod_{j}e^{-i \gamma_{l} h_{j} Z_{j}}$

Same, the operations of the form $e^{-i\gamma_{l}h_{j}Z_{j}}$ can be carried out with rotation gate $R_{Z}$ . Thus, we only need to figure out how to represent $e^{-i \gamma_{l} h_{j} Z_{j}}$ using quantum gates.

First, let's denote the real number $\gamma_{l}J_{jk}$ by $a$ .
Notice that $e^{iaZ_{j}Z_{k}}$ is the exponential of a diagonal matrix, since $Z_{j}Z_{k}$ is the tensor product of diagonal matrices.
If $\lvert x \rangle$ is a computational basis, we have the following properties

$\begin{array}{ll} e^{-iaZ_{j}Z_{k}}\lvert x \rangle = e^{-ia}\lvert x \rangle & \text{if qubit j and k have the same values} \\ e^{-iaZ_{j}Z_{k}}\lvert x \rangle = e^{ia}\lvert x \rangle & \text{if qubit j and k have the different values} \end{array}$

This unitary action is implemented by the circuit below, where we have only depicted qubits $j$ and $k$ .

Figure. Circuit Implementation of $e^{-iaZ_{j}Z_{k}}$

Imagine that the Ising Hamiltonian of your problem is $3Z_{0}Z_{2} - Z_{1}Z_{2} + 2Z_{0}$ Then, the circuit used by QAOA to prepare $\beta, \gamma$

Figure. QAOA circuit with $p=1$

Don't worry, let's go through this together!

First, we prepare the ground state of $H_0$ with a column of Hadamard gates.
Remember that we set $a = \gamma_{l}J_{jk}$ and $e^{-iaZ_{j}Z_{k}}\lvert x \rangle = e^{ia}\lvert x \rangle$ if qubit j and k have the different values. Since $J = 3$ , we implemented $R_{Z}(2a = 2 \times 3\gamma_{l} = 6 \gamma_{1})$ with CNOT gate between qubit 0 and 2.
Next, we perform the same operation between qubit 1 and 2 with $R_{Z}(2a = 2 \times -1\gamma_{l} = -2 \gamma_{1})$ and CNOT gates.
Then we use an $R_Z$ gate on qubit $0$ to implement $e^{-i2\gamma_{1}Z_{0}}$ .
Finally, a column of $R_X(2\beta_1)$ gates implements $e^{-i\beta_{1}\sum_{j}X_{j}}$

If we increase the number of layers $p$ , the circuit would grow by repeating for another $p-1$ times the very same circuit structure shown above except for the inintal Hadamard gates. More, we have to replace $\beta$ and $\gamma$ with corresponding $p$ .

Estimating the energy

After knowing how to construct a quantum circuit, let's see how to estimate the energy for the state $\lvert \beta, \gamma \rangle$ .

Here, we are more interested in the energy of the $\lvert \beta, \gamma \rangle$ since this is the quantity that we want to minimize. That's beeing say, we need to evaluate $\langle \beta, \gamma \lvert H_{1} \lvert \beta, \gamma \rangle$ . Of course, we don't have to access to the state vector since we are preparing the state with a quantum computer.

Since we already know how to evaluate efficiently $\langle x\lvert H_{1} \lvert x \rangle$ for any basis state $\lvert x \rangle$ . In fact, $\langle x\lvert H_{1} \lvert x \rangle$ is the value of $x$ in the cost function of our combinatorial optimization problem, because we derived $H_{1}$ from it. We only need to notice that $\langle x\lvert Z_{j} \lvert x \rangle = 1$ if the $j$ -th bit of $x$ is 0 and that $\langle x\lvert Z_{j} \lvert x \rangle = -1$ otherwise. In the same fashion, $\langle x\lvert Z_{j}Z_{k} \lvert x \rangle = 1$ if $j$ -th and $k$ -th bits of x are equal and $\langle x\lvert Z_{j}Z_{k} \lvert x \rangle = -1$ if they are different.

Let's look into this problem, for instance, try to evaluate $\langle x \lvert H_{1} \lvert x \rangle$ if $H_{1} = 3Z_{0}Z_{2}-Z_{1}Z_{2}+2Z_{0}$ ,

$\langle 101 \lvert H_{1} \lvert 101\rangle = 3\langle 101 \lvert Z_{0}Z_{2} \lvert 101\rangle - \langle 101 \lvert Z_{1}Z_{2} \lvert 101\rangle + 2\langle 101 \lvert Z_{0} \lvert 101\rangle = 3 + 1 - 2 = 4$

Since state $101$ gives that $j=0,2$ -th bits. Therefore, $Z_{j}Z_{k}$ = 1 if $Z_{0}Z_{2} = -1$ otherwise. In the same fashion, $1$ -th bits is 0, therefore, only $\langle 101|Z_{1}|101\rangle =1$ .

Practice

Try to evaluate $\langle 100 \lvert H_{1} \lvert 100\rangle$ with $H_{1} = 3Z_{0}Z_{2}-Z_{1}Z_{2}+2Z_{0}$ = ?

Answer

$H_{1} = 3Z_{0}Z_{2}-Z_{1}Z_{2}+2Z_{0} = 3 \times (-1) - 1 + 2 \times (-1) = -6$ .

We also know that the can always write $\lvert \beta, \gamma \rangle$ as a linear combination of basis state,

$\lvert \beta, \gamma \rangle = \sum_{x} a_{x} \lvert x \rangle$

for certain amplitudes $a_{x}$ such that $\sum_{x}|a_{x}|^{2} = 1$ . Therefore, we can have,

$\langle \beta, \gamma \lvert H_{1} \lvert \beta, \gamma \rangle = \bigg( \sum_{y} a_{y}^{*} \langle y \lvert \bigg) H_{1} \bigg( \sum_{y} a_{x}^{*} \lvert x \rangle \bigg) = \sum_{y}\sum_{x} a_{y}^{*} a_{x} \langle y \lvert H_{1} \lvert x \rangle = \sum_{x} \lvert a_{x} \lvert ^{2} \langle x \lvert H_{1} \lvert x \rangle.$

because $H_{1} \lvert c \rangle$ is always a multiple of $\lvert x \rangle$ ( $H_{1}$ is a diagonal matrix since it is a sum of diagonal matrices), because $\langle y | x \rangle = 0$ when $y\neq x$ , and because $a_{y}^{*} a_{x} = | a_{x} |^{2}$ .

From this, we obtain

$\langle \beta, \gamma \lvert H_{1} \lvert \beta, \gamma \rangle \approx \sum_{x} \frac{m_{x}}{M} \langle x | H_{1} | x \rangle,$

where $m_{x}$ is the number of times that $x$ was measured. The higher the value of $M$ , the better this approximation will be.

QUBO and HOBO

The Higher Order Binary Optimization (HOBO) or Polynomial Unconstrained Binary Optimization (PUBO), are the optimization problems that deals with minimizing a binary polynomial - of any degree - with no additional restrictions.

Solving HOBO by tansforming it to QUBO problems

One way to solve HOBO problems is by transforming them into QUBO problems. For an example, you can substitute prodcuts $xy$ by a new binary variables $z$ as long as you introduce a penalty term $xy - 2xz - 2yz +3z$ , which is $0$ if and only if $xy = z$ . This kind of transformations can be access in D-Wave via BinaryPolynomial objects that you can reduce polynomials of degree $2$ with the make_quadratic function.

Solving QUBO via QAOA

We can consider a binary polynomial of any degree and transform it using the techniques we have covered in QUBO section. We will end up having a Hamiltonian that is a sum of tensor products of $Z_{j}$ matrices. Luckly, we can now deal with more than just one or two $Z_{j}$ matrices.

In the same fashion, if $\lvert x \rangle$ is a basis state, we have

$e^{-iaZ_{j1}Z_{j2}\cdots Z_{jm}} \lvert x \rangle = e^{-ia} \lvert x \rangle$

if the sum of the bits of $x$ in positions $j_{1}, j_{2}, \cdots, j_{m}$ is even, and

$$ e^{-iaZ_{j1}Z_{j2}\cdots Z_{jm}} \lvert x \rangle = e^{ia} \lvert x \rangle $$ if the sum is odd.

This unitary action can be implemented by using consecutive CNOT gates with control qubits in $j_{1}, j_{1}, \cdots, j_{m-1}$ and targets in $j_{m}$ then a $R_Z$ gate with parameter $2a$ on qubit $j_{m}$ and again, consecutive CNOT gates with control qubits $j_{m-1}, j_{m-2}, \cdots, j_{1}$ and targets in $j_{m}$ . You can see the implementation in the image down below for case $e^{-iaZ_{0}Z_{1}Z_{3}}$ .

Figure. Implementation of $ e^{-iaZ_{0}Z_{1}Z_{3}}$.

We can also expand the same concept to help us evaluate the Hamiltonian $H_{1}$ that includes tensor products of $Z$ matrices. We have

$\langle x | Z_{j_1}Z_{j_2} \cdots Z_{m} | x \rangle = 1$

if the sum of the bits of $x$ positions $j_{1}, j_{1}, \cdots, j_{m}$ is even. And

$\langle x | Z_{j_1}Z_{j_2} \cdots Z_{m} | x \rangle = -1$

if the sum of the bits of $x$ positions $j_{1}, j_{1}, \cdots, j_{m}$ is odd.

Example

Please evaluate $\langle 100| H_{1} |100 \rangle$ with $H_{1} = Z_{0}Z_{1}Z_{2} + 3Z_{0}Z_{2} - Z_{1}Z_{2} + 2Z_{0}$

Answer

Notice that the convention for labelling qubits in a state vector like $|x\rangle = |x_{n}x_{n-1}\cdots x_{0}\rangle$ follows the big-endian convention, where:

The leftmost bit corresponds to the most significant qubit ( $x_{n}$ ), which has the highest index.
The rightmost bit corresponds to the most significant qubit ( $x_{0}$ ), which has the lowest index.

This convention is used by IBM Qiskit and Google Cirq.

(a) $Z_0 Z_1 Z_2$

Positions involved: $j_0 = 0, j_1 = 1, j_2 = 2$
Sum of bits: $x_0 + x_1 + x_2 = 0 + 0 + 1 = 1$ (odd)
Result: $\langle 100 | Z_0 Z_1 Z_2 | 100 \rangle = -1$

(b) $3 Z_0 Z_2$

Positions involved: $j_0 = 0, j_2 = 2$
Sum of bits: $x_0 + x_2 = 0 + 1 = 1$ (odd)
Result: $\langle 100 | Z_0 Z_2 | 100 \rangle = -1$
Coefficient: $3$
Contribution: $3 \times (-1) = -3$

(c) $- Z_1 Z_2$

Positions involved: $j_1 = 1, j_2 = 2$
Sum of bits: $x_1 + x_2 = 0 + 1 = 1$ (odd)
Result: $\langle 100 | Z_1 Z_2 | 100 \rangle = -1$
Coefficient: $-1$
Contribution: $-1 \times (-1) = 1$

(d) $2 Z_0$

Position involved: $j_0 = 0$
Sum of bits: $x_0 = 0$ (even)
Result: $\langle 100 | Z_0 | 100 \rangle = 1$
Coefficient: $2$
Contribution: $2 \times 1 = 2$

Now, sum up the results from all terms $\langle 100 | H_1 | 100 \rangle = (-1)+ (-3) + 1 + 2 = -1$

Using QAOA with Qiskit

See Solving QUBO problems with QAOA in Qiskit

Using QAOA with PennyLane

See Using QAOA with PennyLane

Summary

In this chapter, you've learn the most popular quantum algorithms used to solve optimization problmes with gate-based quantum computers. You also learned that QAOA is a "discretization version" of a quantum annealing and it is implemented in hybrid way such that we run the classical optimization tool to update parameters such as $\beta$ and $\gamma$ and then use the powerful qunatum computer to prepare its energy state. And you surely know how to use these circuits to estimate expectation values in an efficient way.