This is distilled from https://learn.microsoft.com/en-us/azure/quantum/ under Creative Commons Attribution 4.0 International
Vectors and matrices in quantum computing¶
Some familiarity with linear algebra is essential to understand quantum computing. This article introduces the basic concepts of linear algebra and how to work with vectors and matrices in quantum computing.
Vectors¶
A column vector, or vector for short, $v$ of dimension (or size) $n$ is a collection of $n$ complex numbers $(v_1,v_2,\ldots,v_n)$ arranged as a column:
$$v =\begin{bmatrix} v_1\\\\ v_2\\\\ \vdots\\\\ v_n \end{bmatrix}$$
The norm of a vector $v$ is defined as $\sqrt{\sum_i |v_i|^2}$. A vector is called a unit vector if its norm is $1$.
The adjoint of a column vector $v$ is a row vector denoted as $v^\dagger$ and is defined as the conjugate transpose of $v$. For a column vector $v$ of dimension $n$, the adjoint is a row vector of dimension $1 \times n$:
$$\begin{bmatrix}v_1 \\\\ \vdots \\\\ v_n \end{bmatrix}^\dagger = \begin{bmatrix}v_1^* & \cdots & v_n^* \end{bmatrix}$$
where $v_i^*$ denotes the complex conjugate of $v_i$.
Using linear algebra, the state of a qubit $ \psi = a \ket{0} + b \ket{1}$ is described as a quantum state vector $\begin{bmatrix} a \\\\ b \end{bmatrix}$, where $|a|^2 + |b|^2 = 1$. For more information, see The qubit.
Scalar product¶
Two vectors can be multiplied together through the scalar product, also known as dot product or inner product. As the name implies, the result of the scalar product of two vectors is a scalar. The scalar product gives the projection of one vector onto another and is used to express one vector as a sum of other simpler vectors. The scalar product between two column vectors $u$ and $v$ is denoted as $\left\langle u, v\right\rangle = u^\dagger v $ and is defined as
$$ \left\langle u, v\right\rangle = u^\dagger v= \begin{bmatrix}u_1^* & \cdots & u_n^* \end{bmatrix} \begin{bmatrix}v_1\\\\ \vdots\\\\ v_n \end{bmatrix} = u_1^* v_1 + \cdots + u_n^* v_n. $$
With the scalar product, the norm of a vector $v$ can be written as $\sqrt{\langle v, v\rangle}$.
You can multiply a vector with a number $a$ to form a new vector whose entries are multiplied by $a$. You can also add two vectors $u$ and $v$ to form a new vector whose entries are the sum of the entries of $u$ and $v$. These operations are the following:
$$\mathrm{If}~u =\begin{bmatrix} u_1\\\\ u_2\\\\ \vdots\\\\ u_n \end{bmatrix}~\mathrm{and}~ v =\begin{bmatrix} v_1\\\\ v_2\\\\ \vdots\\\\ v_n \end{bmatrix},~\mathrm{then}~ au+bv =\begin{bmatrix} au_1+bv_1\\\\ au_2+bv_2\\\\ \vdots\\\\ au_n+bv_n \end{bmatrix} $$
Matrices¶
A matrix of size $m \times n$ is a collection of $m\cdot n$ complex numbers arranged in $m$ rows and $n$ columns as shown below:
$M = \begin{bmatrix} M_{11} ~~ M_{12} ~~ \cdots ~~ M_{1n}\\\\ M_{21} ~~ M_{22} ~~ \cdots ~~ M_{2n}\\\\ \ddots\\\\ M_{m1} ~~ M_{m2} ~~ \cdots ~~ M_{mn}\\\\ \end{bmatrix}$
[!NOTE] Note that a vector of dimension $n$ is simply a matrix of size $n \times 1$.
Quantum operations are represented by squared matrices, that is, the number of rows and columns are equal. For example, single-qubit operations are represented by $2 \times 2$ matrices, like the Pauli $X$ operation
$$X = \begin{bmatrix} 0 & 1 \\\\ 1 & 0 \end{bmatrix}$$
[!TIP] In Q#, the Pauli $X$ operation is represented by the
Xoperation.
As with vectors, you can multiply a matrix with a number $c$ to obtain a new matrix where every entry is multiplied with $c$, and two matrices of the same size can be added to produce a new matrix whose entries are the sum of the respective entries of the two matrices.
Matrix multiplication¶
You can also multiply a matrix $M$ of dimension $m\times n$ and a matrix $N$ of dimension $n \times p$ to get a new matrix $P$ of dimension $m \times p$ as follows:
$$ \begin{align} &\begin{bmatrix} M_{11} ~~ M_{12} ~~ \cdots ~~ M_{1n}\\\\ M_{21} ~~ M_{22} ~~ \cdots ~~ M_{2n}\\\\ \ddots\\\\ M_{m1} ~~ M_{m2} ~~ \cdots ~~ M_{mn} \end{bmatrix} \begin{bmatrix} N_{11} ~~ N_{12} ~~ \cdots ~~ N_{1p}\\\\ N_{21} ~~ N_{22} ~~ \cdots ~~ N_{2p}\\\\ \ddots\\\\ N_{n1} ~~ N_{n2} ~~ \cdots ~~ N_{np} \end{bmatrix}=\begin{bmatrix} P_{11} ~~ P_{12} ~~ \cdots ~~ P_{1p}\\\\ P_{21} ~~ P_{22} ~~ \cdots ~~ P_{2p}\\\\ \ddots\\\\ P_{m1} ~~ P_{m2} ~~ \cdots ~~ P_{mp} \end{bmatrix} \end{align} $$
where the entries of $P$ are $P_{ik} = \sum_j M_{ij}N_{jk}$. For example, the entry $P_{11}$ is the scalar product of the first row of $M$ with the first column of $N$. Note that since a vector is simply a special case of a matrix, this definition extends to matrix-vector multiplication.
Special types of matrices¶
One special square matrix is the identity matrix, denoted $\mathbb{I}$, which has all its diagonal elements equal to $1$ and the remaining elements equal to $0$:
$\mathbb{I}=\begin{bmatrix} 1 ~~ 0 ~~ \cdots ~~ 0\\\\ 0 ~~ 1 ~~ \cdots ~~ 0\\\\ ~~ \ddots\\\\ 0 ~~ 0 ~~ \cdots ~~ 1 \end{bmatrix}.$
For a square matrix $A$, a matrix $B$ is its inverse if $AB = BA = \mathbb{I}$. If a matrix $A$ has an inverse, the inverse matrix is unique and is written as $A^{-1}$.
For any matrix $M$, the adjoint or conjugate transpose of $M$ is a matrix $N$ such that $N_{ij} = M_{ji}^*$. The adjoint of $M$ is denoted $M^\dagger$.
A matrix $U$ is unitary if $UU^\dagger = U^\dagger U = \mathbb{I}$ or equivalently, $U^{-1} = U^\dagger$. One important property of unitary matrices is that they preserve the norm of a vector. This happens because
$\langle v,v \rangle=v^\dagger v = v^\dagger U^{-1} U v = v^\dagger U^\dagger U v = \langle U v, U v\rangle.$
[!NOTE] Quantum operations are represented by unitary matrices, which are squared matrices whose adjoint is equal to their inverse.
A matrix $M$ is called Hermitian if $M=M^\dagger$.
In quantum computing, there are essentially only two matrices that you encounter: Hermitian and unitary.
Tensor product¶
Another important operation is the tensor product, also called the matrix direct product or Kronecker product.
Consider the two vectors $v=\begin{bmatrix}a \\\\ b \end{bmatrix} $ and $u =\begin{bmatrix} c \\\\ d \end{bmatrix} $. Their tensor product is denoted as $v \otimes u$ and results in a block matrix.
$$ \begin{bmatrix} a \\\\ b \end{bmatrix} \otimes \begin{bmatrix} c \\\\ d \end{bmatrix} = \begin{bmatrix} a \begin{bmatrix} c \\\\ d \end{bmatrix} \\\\[1.5em] b \begin{bmatrix} c \\\\ d \end{bmatrix} \end{bmatrix} = \begin{bmatrix} a c \\\\ a d \\\\ b c \\\\ b d \end{bmatrix} $$
[!NOTE] Note that the tensor product is distinguished from matrix multiplication, which is an entirely different operation.
The tensor product is used to represent the combined state of multiple qubits. The real power of quantum computing comes from leveraging multiple qubits to perform computations. For more, see Operations on multiple qubits.
The tensor product of two square matrices $M$ and $N$ of size $n\times n$ is a larger matrix $P=M\otimes N$ of size $n^2 \times n^2$. For example:
$$ \begin{bmatrix} a\ b \\\\ c\ d \end{bmatrix} \otimes \begin{bmatrix} e\ f\\\\ g\ h \end{bmatrix} = \begin{bmatrix} a\begin{bmatrix} e\ f\\\\ g\ h \end{bmatrix} b\begin{bmatrix} e\ f\\\\ g\ h \end{bmatrix} \\\\[1em] c\begin{bmatrix} e\ f\\\\ g\ h \end{bmatrix} d\begin{bmatrix} e\ f\\\\ g\ h \end{bmatrix} \end{bmatrix} = \begin{bmatrix} ae\ af\ be\ bf \\\\ ag\ ah\ bg\ bh \\\\ ce\ cf\ de\ df \\\\ cg\ ch\ dg\ dh \end{bmatrix}. $$
Advanced matrix concepts in quantum computing¶
This article explores the concepts of eigenvalues, eigenvectors, and exponentials. These concepts form a fundamental set of matrix tools that are used to describe and implement quantum algorithms. For the basics of vectors and matrices in quantum computing, see Vectors and matrices.
Eigenvalues and eigenvectors¶
Consider a square matrix $M$ and a vector $v$. The vector $v$ is an eigenvector of $M$ if $Mv = cv$ for some number $c$. The integer $c$ is the eigenvalue corresponding to the eigenvector $v$.
In general, a matrix $M$ may transform a vector into any other vector. An eigenvector is special because it's unchanged except for being multiplied by a number. If $v$ is an eigenvector with eigenvalue $c$, then $av$ is also an eigenvector (for any nonzero $a$) with the same eigenvalue. For example, for the identity matrix, every vector $v$ is an eigenvector with eigenvalue $1$.
As another example, consider a diagonal matrix $D$, which only has non-zero entries on the diagonal:
$$ \begin{bmatrix} d_1 & 0 & 0 \\\\ 0 & d_2 & 0 \\\\ 0 & 0 & d_3 \end{bmatrix}. $$
The vectors
$$\begin{bmatrix}1 \\\\ 0 \\\\ 0 \end{bmatrix}, \begin{bmatrix}0 \\\\ 1 \\\\ 0\end{bmatrix} \text{and} \begin{bmatrix}0 \\\\ 0 \\\\ 1\end{bmatrix}$$
are eigenvectors of this matrix with eigenvalues $d_1$, $d_2$, and $d_3$, respectively. If $d_1$, $d_2$, and $d_3$ are distinct numbers, then these vectors (and their multiples) are the only eigenvectors of the matrix $D$.
In general, for a diagonal matrix it's easy to read off the eigenvalues and eigenvectors. The eigenvalues are all the numbers appearing on the diagonal, and their respective eigenvectors are the unit vectors with one entry equal to $1$ and the remaining entries equal to $0$.
Note in the example that the eigenvectors of $D$ form a basis for $3$-dimensional vectors. A basis is a set of vectors such that any vector can be written as a linear combination of them. More explicitly, $v_1$, $v_2$, and $v_3$ form a basis if any vector $v$ can be written as $v=a_1 v_1 + a_2 v_2 + a_3 v_3$ for some numbers $a_1$, $a_2$, and $a_3$.
Spectral theorem¶
In quantum computing, there are only two matrices that you encounter: Hermitian and unitary. Recall that a Hermitian matrix (also called self-adjoint) is a complex square matrix equal to its own complex conjugate transpose, while a unitary matrix is a complex square matrix whose inverse equals its complex conjugate transpose.
There's a general result known as the spectral theorem, which implies that for any Hermitian or unitary matrix $M$, there exists a unitary $U$ such that $M=U^\dagger D U$ for some diagonal matrix $D$. Furthermore, the diagonal entries of $D$ are the eigenvalues of $M$, and columns of $U^\dagger$ are the corresponding eigenvectors.
This factorization is known as spectral decomposition or eigendecomposition.
Matrix exponential¶
A matrix exponential is defined in exact analogy to the exponential function. The matrix exponential of a matrix $A$ can be expressed as
$$ e^A=\mathbf{1} + A + \frac{A^2}{2!}+\frac{A^3}{3!}+\cdots $$
Matrix exponentials are important because quantum mechanical time evolution is described by a unitary matrix of the form $e^{iB}$ for Hermitian matrix $B$. For this reason, performing matrix exponentials is a fundamental part of quantum computing and as such Q# offers intrinsic routines for describing these operations.
The easiest way to understand how to compute the exponential of a matrix is through the eigenvalues and eigenvectors of that matrix. Specifically, the spectral theorem discussed above says that for every Hermitian or unitary matrix $A$ there exists a unitary matrix $U$ and a diagonal matrix $D$ such that $A=U^\dagger D U$. Because of the properties of unitarity, $A^2 = U^\dagger D^2 U$ and similarly for any power $p$ $A^p = U^\dagger D^p U$. If one substitutes this into the operator definition of the operator exponential:
$$ e^A= U^\dagger \left(\mathbf{1} +D +\frac{D^2}{2!}+\cdots \right)U= U^\dagger \begin{bmatrix}\exp(D_{11}) & 0 &\cdots &0\\\\ 0 & \exp(D_{22})&\cdots& 0\\\\ \vdots &\vdots &\ddots &\vdots\\\\ 0&0&\cdots&\exp(D_{NN}) \end{bmatrix} U. $$
In other words, if you transform to the eigenbasis of the matrix $A$, then computing the matrix exponential is equivalent to computing the ordinary exponential of the eigenvalues of the matrix. As many operations in quantum computing involve performing matrix exponentials, this trick of transforming into the eigenbasis of a matrix to simplify performing the operator exponential appears frequently. It's the basis behind many quantum algorithms such as Trotter–Suzuki-style quantum simulation methods discussed later in this guide.
Another useful property holds for involutory matrices. An involutory matrix $B$ is both unitary and Hermitian, that is, $B=B^{-1}=B^\dagger$. Then, an involutory matrix is a square matrix equal to its own inverse, $B^2=\mathbf{1}$. By applying this property to the above expansion of the matrix exponential, grouping the $\mathbf{1}$ and the $B$ terms together, and applying Maclaurin's theorem to the cosine and sine functions, the identity
$$e^{iBx}=\mathbf{1} \cos(x)+ iB\sin(x)$$
holds for any real value $x$. This trick is especially useful because it allows you to reason about the actions that matrix exponentials have, even if the dimension of $B$ is exponentially large, for the special case when $B$ is involutory.
The qubit in quantum computing¶
Just as bits are the fundamental object of information in classical computing, qubits (quantum bits) are the fundamental object of information in quantum computing. To understand this correspondence, this article looks at the simplest example: a single qubit.
Representing a qubit¶
While a bit, or binary digit, can have a value either $0$ or $1$, a qubit can have a value that is either $0$, $1$ or a quantum superposition of $0$ and $1$.
The state of a single qubit can be described by a two-dimensional column vector of unit norm, that is, the magnitude squared of its entries must sum to $1$. This vector, called the quantum state vector, holds all the information needed to describe the one-qubit quantum system just as a single bit holds all of the information needed to describe the state of a binary variable. For the basics of vectors and matrices in quantum computing, see Vector and matrices.
Any two-dimensional column vector of real or complex numbers with norm $1$ represents a possible quantum state held by a qubit. Thus $\begin{bmatrix} \alpha \\\\ \beta \end{bmatrix}$ represents a qubit state if $\alpha$ and $\beta$ are complex numbers satisfying $|\alpha|^2 + |\beta|^2 = 1$. Some examples of valid quantum state vectors representing qubits include
$$\begin{bmatrix} 1 \\\\ 0 \end{bmatrix}, \begin{bmatrix} 0 \\\\ 1 \end{bmatrix}, \begin{bmatrix} \frac{1}{\sqrt{2}} \\\\ \frac{1}{\sqrt{2}} \end{bmatrix}, \begin{bmatrix} \frac{1}{\sqrt{2}} \\\\ \frac{-1}{\sqrt{2}} \end{bmatrix}, \text{ and }\begin{bmatrix} \frac{1}{\sqrt{2}} \\\\ \frac{i}{\sqrt{2}} \end{bmatrix}.$$
The quantum state vectors $\begin{bmatrix} 1 \\\\ 0 \end{bmatrix}$ and $\begin{bmatrix} 0 \\\\ 1 \end{bmatrix}$ take a special role. These two vectors form a basis for the vector space that describes the qubit's state. This means that any quantum state vector can be written as a sum of these basis vectors. Specifically, the vector $\begin{bmatrix} x \\\\ y \end{bmatrix}$ can be written as $x \begin{bmatrix} 1 \\\\ 0 \end{bmatrix} + y \begin{bmatrix} 0 \\\\ 1 \end{bmatrix}$. While any rotation of these vectors would serve as a perfectly valid basis for the qubit, this particular one is chosen, by calling it the computational basis.
These two quantum states are taken to correspond to the two states of a classical bit, namely $0$ and $1$. The standard convention is to choose
$$0\equiv \begin{bmatrix} 1 \\\\ 0 \end{bmatrix}, \qquad 1 \equiv \begin{bmatrix} 0 \\\\ 1 \end{bmatrix},$$
although the opposite choice could equally well be taken. Thus, out of the infinite number of possible single-qubit quantum state vectors, only two correspond to states of classical bits; all other quantum states do not.
Measuring a qubit¶
How to represent a qubit being explained, one can gain some intuition for what these states represent by discussing the concept of measurement. A measurement corresponds to the informal idea of “looking” at a qubit, which immediately collapses the quantum state to one of the two classical states $\begin{bmatrix} 1 \\\\ 0 \end{bmatrix}$ or $\begin{bmatrix} 0 \\\\ 1 \end{bmatrix}$. When a qubit given by the quantum state vector $\begin{bmatrix} \alpha \\\\ \beta \end{bmatrix}$ is measured, the outcome $0$ is obtained with probability $|\alpha|^2$ and the outcome $1$ with probability $|\beta|^2$. On outcome $0$, the qubit's new state is $\begin{bmatrix} 1 \\\\ 0 \end{bmatrix}$; on outcome $1$ its state is $\begin{bmatrix} 0 \\\\ 1 \end{bmatrix}$. Note that these probabilities sum up to $1$ because of the normalization condition $|\alpha|^2 + |\beta|^2 = 1$.
The properties of measurement also mean that the overall sign of the quantum state vector is irrelevant. Negating a vector is equivalent to $\alpha \rightarrow -\alpha$ and $\beta \rightarrow -\beta$. Because the probability of measuring $0$ and $1$ depends on the magnitude squared of the terms, inserting such signs does not change the probabilities whatsoever. Such phases are commonly called "global phases" and more generally can be of the form $e^{i \phi}$ rather than just $\pm 1$.
A final important property of measurement is that it does not necessarily damage all quantum state vectors. If one starts with a qubit in the state $\begin{bmatrix} 1 \\\\ 0 \end{bmatrix}$, which corresponds to the classical state $0$, measuring this state will always yield the outcome $0$ and leave the quantum state unchanged. In this sense, if there are only classical bits (for example, qubits that are either $\begin{bmatrix}1 \\\\ 0 \end{bmatrix}$ or $\begin{bmatrix}0 \\\\ 1 \end{bmatrix}$) then measurement does not damage the system. This means that one can replicate classical data and manipulate it on a quantum computer just as one could do on a classical computer. The ability, however, to store information in both states at once is what elevates quantum computing beyond what is possible classically and further robs quantum computers of the ability to copy quantum data indiscriminately, see also the no-cloning theorem.
Visualizing qubits and transformations using the Bloch sphere¶
Qubits may also be pictured in $3$D using the Bloch sphere representation. The Bloch sphere gives a way of describing a single-qubit quantum state (which is a two-dimensional complex vector) as a three-dimensional real-valued vector. This is important because it allows us to visualize single-qubit states and thereby develop reasoning that can be invaluable in understanding multi-qubit states (where sadly the Bloch sphere representation breaks down). The Bloch sphere can be visualized as follows:

The arrows in this diagram show the direction in which the quantum state vector is pointing and each transformation of the arrow can be thought of as a rotation about one of the cardinal axes. While thinking about a quantum computation as a sequence of rotations is a powerful intuition, it is challenging to use this intuition to design and describe algorithms. Q# alleviates this issue by providing a language for describing such rotations.
Single-qubit operations¶
Quantum computers process data by applying a universal set of quantum gates that can emulate any rotation of the quantum state vector. This notion of universality is akin to the notion of universality for traditional (for example, classical) computing where a gate set is considered to be universal if every transformation of the input bits can be performed using a finite length circuit. In quantum computing, the valid transformations that we are allowed to perform on a qubit are unitary transformations and measurement. The adjoint operation or the complex conjugate transpose is of crucial importance to quantum computing because it is needed to invert quantum transformations.
Single-qubit operations, or single-qubit quantum gates can be classified into two categories: Clifford gates and the non-Clifford gates. Non-Clifford gates consist only of $T$-gate (also known as the $\pi/8$ gate).
$$ T=\begin{bmatrix} 1 & 0 \\\\ 0 & e^{i\pi/4} \end{bmatrix}. $$
The standard set of single-qubit Clifford gates, included by default in Q#, include
$$ H=\frac{1}{\sqrt{2}}\begin{bmatrix} 1 & 1 \\\\ 1 &-1 \end{bmatrix} , \qquad S =\begin{bmatrix} 1 & 0 \\\\ 0 & i \end{bmatrix}= T^2, \qquad X=\begin{bmatrix} 0 &1 \\\\ 1& 0 \end{bmatrix}= HT^4H, $$
$$ Y = \begin{bmatrix} 0 & -i \\\\ i & 0 \end{bmatrix}=T^2HT^4 HT^6, \qquad Z=\begin{bmatrix}1&0\\\\ 0&-1 \end{bmatrix}=T^4. $$
Here the operations $X$, $Y$ and $Z$ are used especially frequently and are named Pauli operators after their creator Wolfgang Pauli. Together with the non-Clifford gate (the $T$-gate), these operations can be composed to approximate any unitary transformation on a single qubit.
While the previous constitute the most popular primitive gates for describing operations on the logical level of the stack (think of the logical level as the level of the quantum algorithm), it is often convenient to consider less basic operations at the algorithmic level, for example operations closer to a function description level. Fortunately, Q# also has methods available for implementing higher-level unitaries, which in turn allow high-level algorithms to be implemented without explicitly decomposing everything down to Clifford and $T$-gates.
The simplest such primitive is the single qubit-rotation. Three single-qubit rotations are typically considered: $R_x$, $R_y$ and $R_z$. To visualize the action of the rotation $R_x(\theta)$, for example, imagine pointing your right thumb along the direction of the $x$-axis of the Bloch sphere and rotating the vector with your hand through an angle of $\theta/2$ radians. This confusing factor of $2$ arises from the fact that orthogonal vectors are $180^\circ$ apart when plotted on the Bloch sphere, yet are actually $90^\circ$ degrees apart geometrically. The corresponding unitary matrices are:
\begin{align*} &R_z(\theta) = e^{-i\theta Z/2} = \begin{bmatrix} e^{-i\theta/2} & 0\\\\ 0& e^{i\theta/2} \end{bmatrix}, \\\\ &R_x(\theta) = e^{-i\theta X/2} = HR_z(\theta)H = \begin{bmatrix} \cos(\theta/2) & -i\sin(\theta/2)\\\\ -i\sin(\theta/2) & \cos(\theta/2) \end{bmatrix}, \\\\ &R_y(\theta) = e^{-i\theta Y/2} = SHR_z(\theta)HS^\dagger = \begin{bmatrix} \cos(\theta/2) & -\sin(\theta/2)\\\\ \sin(\theta/2) & \cos(\theta/2) \end{bmatrix}. \end{align*}
Just as any three rotations can be combined together to perform an arbitrary rotation in three dimensions, it can be seen from the Bloch sphere representation that any unitary matrix can be written as a sequence of three rotations as well. Specifically, for every unitary matrix $U$ there exists $\alpha,\beta,\gamma,\delta$ such that $U= e^{i\alpha} R_x(\beta)R_z(\gamma)R_x(\delta)$. Thus $R_z(\theta)$ and $H$ also form a universal gate set although it is not a discrete set because $\theta$ can take any value. For this reason, and due to applications in quantum simulation, such continuous gates are crucial for quantum computation, especially at the quantum algorithm design level. To achieve fault-tolerant hardware implementation, they will ultimately be compiled into discrete gate sequences that closely approximate these rotations.
Operations on multiple qubits¶
This article reviews the rules used to build multi-qubit states out of single-qubit states and discusses the gate operations needed to include in a gate set to form a universal many-qubit quantum computer. These tools are necessary to understand the gate sets that are commonly used in Q# code. They're also important to gain intuition about why quantum effects such as entanglement or interference render quantum computing more powerful than classical computing.
Single-qubit and multi-qubit gates¶
The true power of quantum computing only becomes evident as you increase the number of qubits. Single qubits possess some counter-intuitive features, such as the ability to be in more than one state at a given time. However, if all you had in a quantum computer were single-qubit gates, then a calculator and certainly a classical supercomputer would dwarf its computational power.
Quantum computing power arises, in part, because the dimension of the vector space of quantum state vectors grows exponentially with the number of qubits. This means that while a single qubit can be trivially modeled, simulating a fifty-qubit quantum computation would arguably push the limits of existing supercomputers. Increasing the size of the computation by only one extra qubit doubles the memory required to store the state and roughly doubles the computational time. This rapid doubling of computational power is why a quantum computer with a relatively small number of qubits can far surpass the most powerful supercomputers of today, tomorrow, and beyond for some computational tasks.
Two-qubit states¶
If you have two separate qubits, one in the state $\psi=\begin{bmatrix} \alpha \\\\ \beta \end{bmatrix}$ and the other in the state $\phi=\begin{bmatrix} \gamma \\\\ \delta \end{bmatrix}$, the corresponding two-qubit state is given by the tensor product (or Kronecker product) of vectors, which is defined as follows
$$ \psi \otimes \phi = \begin{bmatrix} \alpha \\\\ \beta \end{bmatrix} \otimes \begin{bmatrix} \gamma \\\\ \delta \end{bmatrix} =\begin{bmatrix} \alpha \begin{bmatrix} \gamma \\\\ \delta \end{bmatrix} \\\\ \beta \begin{bmatrix}\gamma \\\\ \delta \end{bmatrix} \end{bmatrix} = \begin{bmatrix} \alpha\gamma \\\\ \alpha\delta \\\\ \beta\gamma \\\\ \beta\delta \end{bmatrix}. $$
Therefore, given two single-qubit states $\psi$ and $\phi$, each of dimension 2, the corresponding two-qubit state $\psi \otimes \phi$ is 4-dimensional. The vector
$$ \begin{bmatrix} \alpha_{00} \\\\ \alpha_{01} \\\\ \alpha_{10} \\\\ \alpha_{11} \end{bmatrix} $$
represents a quantum state on two qubits if
$$|\alpha_{00}|^2+|\alpha_{01}|^2+|\alpha_{10}|^2+|\alpha_{11}|^2=1.$$
More generally, you can see that the quantum state of $n$ qubits is represented by a unit vector $v_1 \otimes v_2 \otimes \cdots \otimes v_n$ of dimension $2 \cdot 2 \cdot 2 \cdots = 2^n$ using this construction. Just as with single qubits, the quantum state vector of multiple qubits holds all the information needed to describe the system's behavior. For more information about vectors and tensor products, see Vectors and Matrices in Quantum Computing.
The computational basis for two-qubit states is formed by the tensor products of one-qubit basis states. For example, you have
$$ \begin{align} 00 \equiv \begin{bmatrix}1 \\\\ 0 \end{bmatrix}\otimes \begin{bmatrix}1 \\\\ 0 \end{bmatrix} &= \begin{bmatrix}1 \\\\ 0\\\\ 0\\\\ 0 \end{bmatrix},\qquad 01 \equiv \begin{bmatrix}1 \\\\ 0 \end{bmatrix}\otimes \begin{bmatrix}0 \\\\ 1 \end{bmatrix} = \begin{bmatrix}0 \\\\ 1\\\\ 0\\\\ 0 \end{bmatrix},\\\\ 10 \equiv \begin{bmatrix}0 \\\\ 1 \end{bmatrix}\otimes \begin{bmatrix}1 \\\\ 0 \end{bmatrix} &= \begin{bmatrix}0 \\\\ 0\\\\ 1\\\\ 0 \end{bmatrix},\qquad 11 \equiv \begin{bmatrix}0 \\\\ 1 \end{bmatrix}\otimes \begin{bmatrix}0 \\\\ 1 \end{bmatrix} = \begin{bmatrix}0 \\\\ 0\\\\ 0\\\\ 1 \end{bmatrix}. \end{align} $$
Note that while you can always take the tensor product of two single-qubit states to form a two-qubit state, not all two-qubit quantum states can be written as the tensor product of two single-qubit states. For example, there are no states $\psi=\begin{bmatrix} \alpha \\\\ \beta \end{bmatrix}$ and $\phi=\begin{bmatrix} \gamma \\\\ \delta \end{bmatrix}$ such that their tensor product is the state
$$\psi\otimes \phi = \begin{bmatrix} 1/\sqrt{2} \\\\ 0 \\\\ 0 \\\\ 1/\sqrt{2} \end{bmatrix}.$$
Such a two-qubit state, which cannot be written as the tensor product of single-qubit states, is called an "entangled state"; the two qubits are said to be entangled. Loosely speaking, because the quantum state cannot be thought of as a tensor product of single qubit states, the information that the state holds is not confined to either of the qubits individually. Rather, the information is stored non-locally in the correlations between the two states. This non-locality of information is one of the major distinguishing features of quantum computing over classical computing and is essential for a number of quantum protocols including quantum error correction.
Measuring two-qubit states¶
Measuring two-qubit states is very similar to single-qubit measurements. Measuring the state
$$ \begin{bmatrix} \alpha_{00} \\\\ \alpha_{01} \\\\ \alpha_{10} \\\\ \alpha_{11} \end{bmatrix} $$
yields $00$ with probability $|\alpha_{00}|^2$, $01$ with probability $|\alpha_{01}|^2$, $10$ with probability $|\alpha_{10}|^2$, and $11$ with probability $|\alpha_{11}|^2$. The variables $\alpha_{00}, \alpha_{01}, \alpha_{10},$ and $\alpha_{11}$ were deliberately named to make this connection clear. After the measurement, if the outcome is $00$ then the quantum state of the two-qubit system has collapsed and is now
$$ 00 \equiv \begin{bmatrix} 1 \\\\ 0 \\\\ 0 \\\\ 0 \end{bmatrix}. $$
It's also possible to measure just one qubit of a two-qubit quantum state. When you measure only one qubit of a two-qubit state, the impact of measurement is subtly different than measuring two qubits. This is because the entire state isn't collapsed to a computational basis state, rather it's collapsed to only one subsystem. In other words, measuring one qubit of a two-qubit state only collapses the related subsystem to a computational basis state.
To see this, consider measuring the first qubit of the following state, which is formed by applying the Hadamard transform $H$ on two qubits initially set to the "0" state:
$$ H^{\otimes 2} \left( \begin{bmatrix}1 \\\\ 0 \end{bmatrix}\otimes \begin{bmatrix}1 \\\\ 0 \end{bmatrix} \right) = \frac{1}{2}\begin{bmatrix}1 & 1 & 1 & 1 \\\\ 1 & -1 & 1 & -1 \\\\ 1 & 1 & -1 & -1 \\\\ 1 & -1 & -1 & 1 \end{bmatrix}\begin{bmatrix}1\\\\ 0\\\\ 0\\\\ 0\end{bmatrix} = \frac{1}{2}\begin{bmatrix}1\\\\ 1\\\\ 1\\\\ 1\end{bmatrix} \mapsto \begin{cases}\text{outcome }=0 & \frac{1}{\sqrt{2}}\begin{bmatrix}1\\\\ 1\\\\ 0\\\\ 0 \end{bmatrix}\\\\ \text{outcome }=1 & \frac{1}{\sqrt{2}}\begin{bmatrix}0\\\\ 0\\\\ 1\\\\ 1 \end{bmatrix}\\\\ \end{cases}. $$ Both outcomes have a 50% probability of occurring. That can be intuited from the fact that the quantum state before measurement does not change if $0$ is swapped with $1$ on the first qubit.
The mathematical rule for measuring the first or second qubit is simple. Let $e_k$ be the $k^{\rm th}$ computational basis vector and $S$ be the set of all $e_k$ such that the qubit in question takes the value $1$ for that value of $k$. For example, if you are interested in measuring the first qubit then $S$ would consist of $e_1\equiv 10$ and $e_3\equiv 11$. Similarly, if you are interested in the second qubit $S$ would consist of $e_2\equiv 01$ and $e_3 \equiv 11$. Then the probability of measuring the chosen qubit to be $1$ is for state vector $\psi$
$$ P(\text{outcome}=1)= \sum_{e_k \text{ in the set } S}\psi^\dagger e_k e_k^\dagger \psi. $$
[!NOTE] This article uses the little-endian format to label the computational basis. In little endian format, the least significant bits come first. For example, the number four in little-endian format is represented by the string of bits 001.
Since each qubit measurement can only yield $0$ or $1$, the probability of measuring $0$ is simply $1-P(\text{outcome}=1)$. This is why you only need a formula for the probability of measuring $1$.
The action that such a measurement has on the state can be expressed mathematically as
$$ \psi \mapsto \frac{\sum_{e_k \text{ in the set } S} e_k e_k^\dagger \psi}{\sqrt{P(\text{outcome}=1)}}. $$
The cautious reader may worry about what happens if the denominator is zero. While such state is undefined, you don't need to worry about such eventualities because the probability is zero!
If you take $\psi$ to be the uniform state vector given above and are interested in measuring the first qubit then
$$ P(\text{measurement of first qubit}=1) = (\psi^\dagger e_1)(e_1^\dagger \psi)+(\psi^\dagger e_3)(e_3^\dagger \psi)=|e_1^\dagger \psi|^2+|e_3^\dagger \psi|^2. $$
Note that this is just the sum of the two probabilities that would be expected for measuring the results $10$ and $11$. For our example, this evaluates to
$$ \frac{1}{4}\left|\begin{bmatrix}0&0&1&0\end{bmatrix}\begin{bmatrix}1\\\\ 1\\\\ 1\\\\ 1\end{bmatrix} \right|^2+\frac{1}{4}\left|\begin{bmatrix}0&0&0&1\end{bmatrix}\begin{bmatrix}1\\\\ 1\\\\ 1\\\\ 1\end{bmatrix} \right|^2=\frac{1}{2}. $$
which perfectly matches our intuition. Similarly, the state after the first qubit is measured as $1$ can be written as
$$ \frac{\frac{e_1}{2}+\frac{e_3}{2}}{\sqrt{\frac{1}{2}}}=\frac{1}{\sqrt{2}}\begin{bmatrix} 0\\\\ 0\\\\ 1\\\\ 1\end{bmatrix} $$
again in accordance with our intuition.
Two-qubit operations¶
As in the single-qubit case, any unitary transformation is a valid operation on qubits. In general, a unitary transformation on $n$ qubits is a matrix $U$ of size $2^n \times 2^n$ (so that it acts on vectors of size $2^n$), such that $U^{-1} = U^\dagger$. For example, the CNOT (controlled-NOT) gate is a commonly used two-qubit gate and is represented by the following unitary matrix:
$$ \operatorname{CNOT} = \begin{bmatrix} 1\ 0\ 0\ 0 \\\\ 0\ 1\ 0\ 0 \\\\ 0\ 0\ 0\ 1 \\\\ 0\ 0\ 1\ 0 \end{bmatrix} $$
We can also form two-qubit gates by applying single-qubit gates on both qubits. For example, if you apply the gates
$$ \begin{bmatrix} a\ b\\\\ c\ d \end{bmatrix} $$
and
$$\begin{bmatrix} e\ f\\\\ g\ h \end{bmatrix} $$
to the first and second qubits, respectively, this is equivalent to applying the two-qubit unitary given by their tensor product: $$\begin{bmatrix} a\ b\\\\ c\ d \end{bmatrix} \otimes \begin{bmatrix} e\ f\\\\ g\ h \end{bmatrix}= \begin{bmatrix} ae\ af\ be\ bf \\\\ ag\ ah\ bg\ bh \\\\ ce\ cf\ de\ df \\\\ cg\ ch\ dg\ dh \end{bmatrix}.$$
Thus, you can form two-qubit gates by taking the tensor product of some known single-qubit gates. Some examples of two-qubit gates include $H \otimes H$, $X \otimes \mathbf{1}$, and $X \otimes Z$.
Note that while any two single-qubit gates define a two-qubit gate by taking their tensor product, the converse is not true. Not all two-qubit gates can be written as the tensor product of single-qubit gates. Such a gate is called an entangling gate. One example of an entangling gate is the CNOT gate.
The intuition behind a controlled-not gate can be generalized to arbitrary gates. A controlled gate in general is a gate that acts as identity unless a specific qubit is $1$. You denote a controlled unitary, controlled in this case on the qubit labeled $x$, with a $\Lambda\_x(U)$. As an example $\Lambda_0(U) e\_{1}\otimes {\psi}=e\_{1}\otimes U{\psi}$ and $\Lambda\_0(U) e\_{0}\otimes {\psi}=e\_{0}\otimes{\psi}$, where $e\_0$ and $e\_1$ are the computational basis vectors for a single qubit corresponding to the values $0$ and $1$. For example, consider the following controlled-$Z$ gate then you can express this as
$$ \Lambda\_0(Z)= \begin{bmatrix}1&0&0&0\\\\0&1&0&0\\\\0&0&1&0\\\\0&0&0&-1 \end{bmatrix}=(\mathbf{1}\otimes H)\operatorname{CNOT}(\mathbf{1}\otimes H). $$
Building controlled unitaries in an efficient manner is a major challenge. The simplest way to implement this requires forming a database of controlled versions of fundamental gates and replacing every fundamental gate in the original unitary operation with its controlled counterpart. This is often quite wasteful and clever insight often can be used to just replace a few gates with controlled versions to achieve the same impact. For this reason, the framework provides the ability to perform either the naive method of controlling or allow the user to define a controlled version of the unitary if an optimized hand-tuned version is known.
Gates can also be controlled using classical information. A classically controlled not-gate, for example, is just an ordinary not-gate but it is only applied if a classical bit is $1$ as opposed to a quantum bit. In this sense, a classically controlled gate can be thought of as an if statement in the quantum code wherein the gate is applied only in one branch of the code.
As in the single-qubit case, a two-qubit gate set is universal if any $4\times 4$ unitary matrix can be approximated by a product of gates from this set to arbitrary precision. One example of a universal gate set is the Hadamard gate, the T gate, and the CNOT gate. By taking products of these gates, you can approximate any unitary matrix on two qubits.
Many-qubit systems¶
We follow exactly the same patterns explored in the two-qubit case to build many-qubit quantum states from smaller systems. Such states are built by forming tensor products of smaller states. For example, consider encoding the bit string $1011001$ in a quantum computer. You can encode this as
$$ 1011001 \equiv \begin{bmatrix} 0 \\\\ 1 \end{bmatrix}\otimes \begin{bmatrix} 1 \\\\ 0 \end{bmatrix}\otimes \begin{bmatrix} 0 \\\\ 1 \end{bmatrix}\otimes \begin{bmatrix} 0 \\\\ 1 \end{bmatrix} \otimes \begin{bmatrix} 1 \\\\ 0 \end{bmatrix}\otimes \begin{bmatrix} 1 \\\\ 0 \end{bmatrix}\otimes \begin{bmatrix} 0 \\\\ 1 \end{bmatrix}. $$
Quantum gates work in exactly the same way. For example, if you wish to apply the $X$ gate to the first qubit and then perform a CNOT between the second and third qubits you may express this transformation as
\begin{align} &(X \otimes \operatorname{CNOT}_{12}\otimes \mathbf{1}\otimes \mathbf{1} \otimes \mathbf{1} \otimes \mathbf{1}) \begin{bmatrix} 0 \\\\ 1 \end{bmatrix}\otimes \begin{bmatrix} 1 \\\\ 0 \end{bmatrix}\otimes \begin{bmatrix} 0 \\\\ 1 \end{bmatrix}\otimes \begin{bmatrix} 0 \\\\ 1 \end{bmatrix} \otimes \begin{bmatrix} 1 \\\\ 0 \end{bmatrix}\otimes \begin{bmatrix} 1 \\\\ 0 \end{bmatrix}\otimes \begin{bmatrix} 0 \\\\ 1 \end{bmatrix}\\\\ &\qquad\qquad\equiv 0011001. \end{align}
In many qubit systems, there is often a need to allocate and deallocate qubits that serve as temporary memory for the quantum computer. Such a qubit is said to be auxiliary. By default, you can assume the qubit state is initialized to $e_0$ upon allocation. You can further assume that it is returned again to $e_0$ before deallocation. This assumption is important because if an auxiliary qubit becomes entangled with another qubit register when it becomes deallocated then the process of deallocation will damage the auxiliary qubit. For this reason, you always assume that such qubits are reverted to their initial state before being released.
Finally, although new gates needed to be added to our gate set to achieve universal quantum computing for two qubit quantum computers, no new gates need to be introduced in the multi-qubit case. The gates $H$, $T$ and CNOT form a universal gate set on many qubits because any general unitary transformation can be broken into a series of two qubit rotations. You then can leverage the theory developed for the two-qubit case and use it again here when you have many qubits.
[!NOTE] While the linear algebraic notation that has been used thus far can certainly be used to describe multi-qubit states, it becomes increasingly cumbersome as you increase the size of the states. The resulting column-vector for a length 7 bit string, for example, is $128$ dimensional, which makes expressing it using the notation described previously very cumbersome. Instead, Dirac notation, a symbolic shorthand that simplifies the representation of quantum states, is used. For more information, see Dirac notation.
Dirac notation in quantum computing¶
Dirac notation is designed to fit the precise needs of expressing states and linear algebra in quantum mechanics. It's named after the physicist Paul Dirac, who developed the notation in the 1930s. Dirac notation is a concise and powerful way to describe quantum states and operations. It's used in quantum computing to describe quantum states, quantum operations, and quantum measurements.
This article introduce you to Dirac notation and show you how to use it to describe quantum states and operations.
Limitations of column vector notation¶
While column vector notation is common in linear algebra, it's often used in quantum computing, especially when dealing with multiple qubits. For example, when you define $\psi$ to be a vector it's not explicitly clear whether $\psi$ is a row or a column vector. Thus, if $\phi$ and $\psi$ are vectors, then it's equally unclear if $\phi \psi$ is even defined, because the shapes of $\phi$ and $\psi$ may be unclear in the context. Beyond the ambiguity about the shapes of vectors, expressing even simple vectors using linear algebraic notation can be cumbersome. For example, if you wish to describe an $n$-qubit state where each qubit takes the value $0$, then you would formally express the state as
$$\begin{bmatrix}1 \\\\ 0 \end{bmatrix}\otimes \cdots \otimes\begin{bmatrix}1 \\\\ 0 \end{bmatrix}. $$
Evaluating this tensor product is impractical because the vector lies in an exponentially large space. As such, this notation is, in fact, the best description of the state that can be given using the previous notation.
Types of vectors in Dirac notation¶
There are two types of vectors in Dirac notation: the bra vector and the ket vector, so named because when put together they form a braket or inner product. If $\psi$ is a column vector, then you can write it in Dirac notation as $\ket{\psi}$, where the $\ket{\cdot}$ denotes that it's a unit column vector, for example, a ket vector. Similarly, the row vector $\psi^\dagger$ is expressed as $\bra{\psi}$. In other words, $\psi^\dagger$ is obtained by applying entry-wise complex conjugation to the elements of the transpose of $\psi$. The bra-ket notation directly implies that $\braket{\psi|\psi}$ is the inner product of vector $\psi$ with itself, which is by definition $1$.
More generally, if $\psi$ and $\phi$ are quantum state vectors, then their inner product is $\braket{\phi|\psi}$. This inner product implies that the probability of measuring the state $\ket{\psi}$ to be $\ket{\phi}$ is $|\braket{\phi|\psi}|^2$.
The following convention is used to describe the quantum states that encode the values of zero and one (the single-qubit computational basis states):
$$ \begin{bmatrix} 1 \\\\ 0 \end{bmatrix} = \ket{0},\qquad \begin{bmatrix} 0 \\\\ 1 \end{bmatrix} = \ket{1}. $$
Example: Represent the Hadamard operation with Dirac notation¶
The following notation is often used to describe the states that result from applying the Hadamard gate to $\ket{0}$ and $\ket{1}$. These states correspond to the unit vectors in the $+x$ and $-x$ directions on the Bloch sphere:
$$ \frac{1}{\sqrt{2}}\begin{bmatrix} 1 \\\\ 1 \end{bmatrix}=H\ket{0} = \ket{+},\qquad \frac{1}{\sqrt{2}}\begin{bmatrix} 1 \\\\ -1 \end{bmatrix} =H\ket{1} = \ket{-} . $$
These states can also be expanded using Dirac notation as sums of $\ket{0}$ and $\ket{1}$:
$$ \ket{+} = \frac{1}{\sqrt{2}}(\ket{0} + \ket{1}),\qquad \ket{-} = \frac{1}{\sqrt{2}}(\ket{0} - \ket{1}). $$
Computational basis vectors¶
This demonstrates why these states are often called a computational basis: every quantum state can always be expressed as sums of computational basis vectors and such sums are easily expressed using Dirac notation. The converse is also true in that the states $\ket{+}$ and $\ket{-}$ also form a basis for quantum states. You can see this from the fact that
$$ \ket{0} = \frac{1}{\sqrt{2}}(\ket{+} + \ket{-}),\qquad \ket{1} = \frac{1}{\sqrt{2}}(\ket{+} - \ket{-}). $$
As an example of Dirac notation, consider the braket $\braket{0 | 1}$, which is the inner product between $0$ and $1$. It can be written as
$$ \braket{0 | 1}=\begin{bmatrix} 1 & 0 \end{bmatrix}\begin{bmatrix}0\\\\ 1\end{bmatrix}=0. $$
This example says that $\ket{0}$ and $\ket{1}$ are orthogonal vectors, meaning that $\braket{0 | 1} = \braket{1 | 0} =0$. Also, by definition $\braket{0 | 0} = \braket{1 | 1}=1$, which means that the two computational basis vectors can also be called orthonormal.
These orthonormal properties are used in the following example. If you have a state $\ket{\psi} = {\frac{3}{5}} \ket{1} + {\frac{4}{5}} \ket{0}$, then because $\braket{1 | 0} =0$ the probability of measuring $1$ is
$$ \big|\braket{1 | \psi}\big|^2= \left|\frac{3}{5}\braket{1 | 1} +\frac{4}{5}\braket{1 | 0}\right|^2=\frac{9}{25}. $$
Tensor product notation¶
Dirac notation also includes an implicit tensor product structure. This structure is important because in quantum computing, the state vector described by two uncorrelated quantum registers is the tensor products of the two state vectors. Concisely describing the tensor product structure, or lack thereof, is vital if you want to explain a quantum computation. The tensor product structure implies that you can write $\psi \otimes \phi$ for any two quantum state vectors $\phi$ and $\psi$ as $\ket{\psi} \otimes \ket{\phi}$. However, by convention writing $\otimes$ in between the vectors is unnecessary, and you can write $\ket{\psi} \ket{\phi} = \ket{\psi \phi} $. For more information about vectors and tensor products, see Vectors and Matrices in Quantum Computing. For example, the state with two qubits initialized to the zero state is:
$$ \ket{0} \otimes \ket{0} = \ket{0} \ket{0} = \ket{00} = \begin{bmatrix} 1 \\\\ 0 \end{bmatrix} \otimes \begin{bmatrix} 1 \\\\ 0 \end{bmatrix} = \begin{bmatrix} 1 \\\\ 0 \\\\ 0 \\\\ 0 \end{bmatrix}. $$
Similarly, the state $\ket{p}$ for integer $p$ represents a quantum state that encodes the integer $p$ in binary representation. For example, if you wish to express the number $5$ using an unsigned binary encoding, you could equally express it as
$$ \ket{1}\ket{0}\ket{1} = \ket{101} = \ket{5}. $$
Within this notation, $\ket{0}$ need not refer to a single-qubit state but rather to a qubit register that stores a binary encoding of $0$. The differences between these two notations are clear from the context. This convention is useful for simplifying the first example, which can be written in any of the following ways:
$$ \begin{bmatrix}1 \\\\ 0 \end{bmatrix}\otimes \cdots \otimes\begin{bmatrix}1 \\\\ 0 \end{bmatrix} = \ket{0} \otimes \cdots \otimes \ket{0}= |0\cdots 0\rangle = \ket{0}^{\otimes n} $$
where $\ket{0}^{\otimes n}$ represents the tensor product of $n$ $\ket{0}$ quantum states.
Example: Describe superposition with Dirac notation¶
As another example of how you can use Dirac notation to describe a quantum state, consider the following equivalent ways of writing a quantum state that is an equal superposition over every possible bit string of length $n$
$$ H^{\otimes n} \ket{0} = \frac{1}{2^{n/2}} \sum_{j=0}^{2^n-1} \ket{j} = \ket{+}^{\otimes n}. $$
Here you may wonder why the sum goes from $0$ to $2^{n}-1$ for $n$ bits. First, note that there are $2^{n}$ different configurations that $n$ bits can take. You can see this by noting that one bit can take $2$ values but two bits can take $4$ values and so forth. In general, this means that there are $2^n$ different possible bit strings but the largest value encoded in any of them $1\cdots 1=2^n-1$ and hence it is the upper limit for the sum. As a side note, in this example you did not use $\ket{+}^{\otimes n}=\ket{+}$ in analogy to $\ket{0}^{\otimes n} = \ket{0}$. This notational convention is reserved for the computational basis state with every qubit initialized to zero. While such a convention is sensible in this case, it's not employed in the quantum computing literature.
Express linearity with Dirac notation¶
Another feature of Dirac notation is the fact that it's linear. For example, for two complex numbers $\alpha$ and $\beta$, you can write
$$ \ket{\psi} \otimes ( \alpha\ket{\phi} + \beta\ket{\chi})= \alpha\ket{\psi}\ket{\phi} + \beta\ket{\psi}\ket{\chi}.$$
That is to say, you can distribute the tensor product notation in Dirac notation so that taking tensor products between state vectors ends up looking just like ordinary multiplication.
Bra vectors follow a similar convention to ket vectors. For example, the vector $\bra{\psi}\bra{\phi}$ is equivalent to the state vector $\psi^\dagger \otimes \phi^\dagger=(\psi\otimes \phi)^\dagger$. If the ket vector $\ket{\psi}$ is $\alpha \ket{0} + \beta \ket{1}$, then the bra vector version of the vector is $\bra{\psi}=\ket{\psi}^\dagger = (\bra{0}\alpha^* +\bra{1}\beta^*)$.
As an example, imagine that you wish to calculate the probability of measuring the state $\ket{\psi} = \frac{3}{5} \ket{1} + \frac{4}{5} \ket{0}$ using a quantum program for measuring states to be either $\ket{+}$ or $\ket{-}$. Then the probability that the device would output that the state is $\ket{-}$ is
$$|\braket{- | \psi}|^2= \left|\frac{1}{\sqrt{2}}(\bra{0} - \bra{1})(\frac{3}{5} \ket{1} + \frac{4}{5} \ket{0}) \right|^2=\left|-\frac{3}{5\sqrt{2}} + \frac{4}{5\sqrt{2}}\right|^2=\frac{1}{50}.$$
The fact that the negative sign appears in the calculation of the probability is a manifestation of quantum interference, which is one of the mechanisms by which quantum computing gains advantages over classical computing.
ketbra or outer product¶
The final item worth discussing in Dirac notation is the ketbra or outer product. The outer product is represented within Dirac notations as $\ket{\psi} \bra{\phi}$, and sometimes called ketbras because the bras and kets occur in the opposite order as brakets. The outer product is defined via matrix multiplication as $\ket{\psi} \bra{\phi} = \psi \phi^\dagger$ for quantum state vectors $\psi$ and $\phi$. The simplest, and arguably most common example of this notation, is
$$ \ket{0} \bra{0} = \begin{bmatrix}1\\\\ 0 \end{bmatrix}\begin{bmatrix}1&0 \end{bmatrix}= \begin{bmatrix}1 &0\\\\ 0 &0\end{bmatrix} \qquad \ket{1} \bra{1} = \begin{bmatrix}0\\\\ 1 \end{bmatrix}\begin{bmatrix}0&1 \end{bmatrix}= \begin{bmatrix}0 &0\\\\ 0 &1\end{bmatrix}. $$
Ketbras are often called projectors because they project a quantum state onto a fixed value. Since these operations aren't unitary (and do not even preserve the norm of a vector), a quantum computer cannot deterministically apply a projector. However projectors do a beautiful job of describing the action that measurement has on a quantum state. For example, if you measure a state $\ket{\psi}$ to be $0$, then the resulting transformation that the state experiences as a result of the measurement is
$$\ket{\psi} \rightarrow \frac{(\ket{0} \bra{0})\ket{\psi}}{|\braket{0 | \psi}|}= \ket{0},$$
as you would expect if you measured the state and found it to be $\ket{0}$. To reiterate, such projectors cannot be applied on a state in a quantum computer deterministically. Instead, they can at best be applied randomly with the result $\ket{0}$ appearing with some fixed probability. The probability of such a measurement succeeding can be written as the expectation value of the quantum projector in the state
$$ \bra{\psi} (\ket{0} \bra{0})\ket{\psi} = |\braket{\psi | 0}|^2, $$
which illustrates that projectors give a new way of expressing the measurement process.
If instead you consider measuring the first qubit of a multi-qubit state to be $1$, then you can also describe this process conveniently using projectors and Dirac notation:
$$ P(\text{first qubit = 1})= \bra{\psi}\left(\ket{1}\bra{1}\otimes \mathbf{1}^{\otimes n-1}\right) \ket{\psi}. $$
Here the identity matrix can be conveniently written in Dirac notation as
$$ \mathbb{I} = \ket{0} \bra{0}+\ket{1} \bra{1}= \begin{bmatrix}1&0\\\\ 0&1 \end{bmatrix}. $$
For the case where there are two-qubits the projector can be expanded as
$$ \ket{1} \bra{1} \otimes \mathbb{I} = \ket{1}\bra{1} \otimes (\ket{0} \bra{0}+\ket{1} \bra{1})= \ket{10}\bra{10} + \ket{11}\bra{11}. $$
you can then see that this is consistent with the discussion about measurement likelihoods for multiqubit states using column-vector notation:
$$ P(\text{first qubit = 1})= \psi^\dagger (e_{10}e_{10}^\dagger + e_{11}e_{11}^\dagger)\psi = |e_{10}^\dagger \psi|^2 + |e_{11}^\dagger \psi|^2, $$
which matches the multi-qubit measurement discussion. The generalization of this result to the multi-qubit case, however, is slightly more straightforward to express using Dirac notation than column-vector notation, and is entirely equivalent to the previous treatment.
Density operators¶
Another useful operator to express using Dirac notation is the density operator, sometimes also known as a state operator. As the quantum state vector, the density operator describes the quantum state of a system. While quantum state vectors can only represent pure states, density operators can also represent mixed states.
More generally, a given matrix $\rho$ is a valid density operator if the following conditions are fulfilled:
- $\rho$ is a matrix of complex numbers
- $\rho = \rho^{\dagger}$ (that is, $\rho$ is Hermitian)
- Every eigenvalue $p$ of $\rho$ is non-negative
- All the eigenvalues of $\rho$ sum to 1
Together, these conditions guarantee that $\rho$ can be thought of as an ensemble. A density operator for a quantum state vector $\ket{\psi}$ takes the form $\rho = \sum_i p_i \ket{\psi_i} \bra{\psi_i}$ is an eigenvalue decomposition of $\rho$, then $\rho$ describes the ensemble $\rho = \{ \ket{\psi_i} \text{with probability} p_i \}$.
Pure quantum states are those that are characterized by a single ket vector or wavefunction, and cannot be written as a statistical mixture (or convex combination) of other quantum states. A mixed quantum state is a statistical ensemble of pure states.
On a Bloch sphere, pure states are represented by a point on the surface of the sphere, whereas mixed states are represented by an interior point. The mixed state of a single qubit is represented by the center of the sphere, by symmetry. The purity of a state can be visualized as the degree in which it is close to the surface of the sphere.
This concept of representing the state as a matrix, rather than a vector, is often convenient because it gives a convenient way of representing probability calculations, and also allows you to describe both statistical uncertainty and quantum uncertainty within the same formalism.
A density operator $\rho$ represents a pure state if and only if:
- $\rho$ can be written as an outer product of a state vector, $\rho=\ket{\psi}\bra{\psi}$
- $\rho =\rho^2$
- $tr(\rho^2)=1$
To tell how close a given density operator $\rho$ is to being pure, you can look at the trace (that is, the sum of the diagonal elements) of $\rho^2$. A density operator represents a pure state if and only if $tr(\rho ^{2})=1$.
Q# gate sequences equivalent to quantum states¶
A final point worth raising about quantum notation and the Q# programming language: the beginning of this document mentioned that the quantum state is the fundamental object of information in quantum computing. It may then come as a surprise that in Q# there is no notion of a quantum state. Instead, all states are described only by the operations used to prepare them. The previous example is an excellent illustration of this. Rather than expressing a uniform superposition over every quantum bit string in a register, you can represent the result as $H^{\otimes n} \ket{0}$. This exponentially shorter description of the state not only has the advantage that you can classically reason about it, but it also concisely defines the operations needed to be propagated through the software stack to implement the algorithm. For this reason, Q# is designed to emit gate sequences rather than quantum states; however, at a theoretical level the two perspectives are equivalent.
Single-qubit and multi-qubit Pauli measurements¶
As you work with Q#, you find that Pauli measurements are a common type of measurement. Pauli measurements generalize computational basis measurements to include measurements in other bases and of parity between different qubits. In such cases, it is common to discuss measuring a Pauli operator, which is an operator such as $X,Y,Z$ or $Z\otimes Z, X\otimes X, X\otimes Y$, and so forth. For the basics of quantum measurement, see The qubit and Multiple qubits.
Discussing measurement in terms of Pauli operators is common in the subfield of quantum error correction.
Q# guide follows a similar convention; this article explains this alternative view of measurements.
[!TIP] In Q#, multi-qubit Pauli operators are generally represented by arrays of type
Pauli[]. For example, to represent $X \otimes Z \otimes Y$, you can use the array[PauliX, PauliZ, PauliY].
Before delving into the details of how to think of a Pauli measurement, it is useful to think about what measuring a single qubit inside a quantum computer does to the quantum state. Imagine a $n$-qubit quantum state; then measuring one qubit immediately rules out half of the $2^n$ possibilities that state could be in. In other words, the measurement projects the quantum state onto one of two half-spaces. You can generalize the way you think about measurement to reflect this intuition.
In order to concisely identify these subspaces, one needs a language for describing them. One way to describe the two subspaces is by specifying them through a matrix that just has two unique eigenvalues, taken by convention to be $\pm 1$. For a simple example of describing subspaces in this way, consider $Z$:
$$ \begin{align} Z & = \begin{bmatrix} 1 & 0 \\\\ 0 & -1 \end{bmatrix}. \end{align} $$
By reading the diagonal elements of the Pauli-$Z$ matrix, one can see that $Z$ has two eigenvectors, $\ket{0}$ and $\ket{1}$, with corresponding eigenvalues $\pm 1$.
Thus, if a measurement of the qubit results in Zero (corresponding to the state $\ket{0}$), it is known that the state of the qubit is a $+1$ eigenstate of the $Z$ operator.
Similarly, if the result is One, it is known that the state of the qubit is a $-1$ eigenstate of $Z$.
This process is referred to in the language of Pauli measurements as "measuring Pauli $Z$," and is entirely equivalent to performing a computational basis measurement.
Any $2\times 2$ matrix that is a unitary transformation of $Z$ also satisfies this criteria. That is, one could also use a matrix $A=U^\dagger Z U$, where $U$ is any other unitary matrix, to give a matrix that defines the two outcomes of a measurement in its $\pm 1$ eigenvectors. The notation of Pauli measurements references this unitary equivalence by identifying $X,Y,Z$ measurements as equivalent measurements that one could do to gain information from a qubit. These measurements are given here for convenience.
| Pauli Measurement | Unitary transformation |
|---|---|
| $Z$ | $\mathbf{1}$ |
| $X$ | $H$ |
| $Y$ | $HS^{\dagger}$ |
That is, using this language, "measure $Y$" is equivalent to applying $HS^\dagger$ and then measuring in the computational basis, where S is an intrinsic quantum operation sometimes called the "phase gate," and can be simulated using the unitary matrix
$$ \begin{align} S = \begin{bmatrix} 1 & 0 \\\\ 0 & i \end{bmatrix}. \end{align} $$
It is also equivalent to applying $HS^\dagger$ to the quantum state vector and then measuring $Z$, such that the following operation is equivalent to Measure([PauliY], [q]):
Q#
operation MeasureY(qubit : Qubit) : Result {
mutable result = Zero;
within {
Adjoint S(q);
H(q);
} apply {
set result = M(q);
}
return result;
}
The correct state is then found by transforming back to the computational basis, which amounts to applying $SH$ to the quantum state vector; in the code snippet, the transformation back to the computational basis is handled automatically with the use of the within … apply block.
In Q#, the outcome---that is, the classical information extracted from interacting with the state---is given using a Result value $j \in \\{\texttt{Zero}, \texttt{One}\\}$ indicating if the result is in the $(-1)^j$ eigenspace of the Pauli operator measured.
Multiple-qubit measurements¶
Measurements of multi-qubit Pauli operators are defined similarly, as seen from:
$$ Z\otimes Z = \begin{bmatrix}1 &0 &0&0\\\\ 0&-1&0&0\\\\ 0&0&-1&0\\\\ 0&0&0&1\end{bmatrix}. $$
Thus the tensor products of two Pauli-$Z$ operators forms a matrix composed of two spaces consisting of $+1$ and $-1$ eigenvalues. As with the single-qubit case, both constitute a half-space meaning that half of the accessible vector space belongs to the $+1$ eigenspace and the remaining half to the $-1$ eigenspace. In general, it is easy to see from the definition of the tensor product that any tensor product of Pauli-$Z$ operators and the identity also obeys this. For example,
$$ \begin{align} Z \otimes \mathbf{1} = \begin{bmatrix} 1 & 0 & 0 & 0 \\\\ 0 & 1 & 0 & 0 \\\\ 0 & 0 & -1 & 0 \\\\ 0 & 0 & 0 & -1 \end{bmatrix}. \end{align} $$
As before, any unitary transformation of such matrices also describes two half-spaces labeled with $\pm 1$ eigenvalues. For example, $X\otimes X = H\otimes H(Z\otimes Z)H\otimes H$ from the identity that $Z=HXH$. Similar to the one-qubit case, all two-qubit Pauli-measurements may be written as $U^\dagger (Z\otimes 1) U$ for $4\times 4$ unitary matrices $U$. The transformations are enumerated in the following table.
[!NOTE] In this table, $\operatorname{SWAP}$ is used to indicate the matrix $$ \begin{align} \operatorname{SWAP} & = \left(\begin{matrix} 1 & 0 & 0 & 0 \\\\ 0 & 0 & 1 & 0 \\\\ 0 & 1 & 0 & 0 \\\\ 0 & 0 & 0 & 1 \end{matrix}\right) \end{align} $$ used to simulate the intrinsic operation
SWAP.
| Pauli Measurement | Unitary transformation |
|---|---|
| $Z\otimes \mathbf{1}$ | $\mathbf{1}\otimes \mathbf{1}$ |
| $X\otimes \mathbf{1}$ | $H\otimes \mathbf{1}$ |
| $Y\otimes \mathbf{1}$ | $HS^\dagger\otimes \mathbf{1}$ |
| $\mathbf{1} \otimes Z$ | $\operatorname{SWAP}$ |
| $\mathbf{1} \otimes X$ | $(H\otimes \mathbf{1})\operatorname{SWAP}$ |
| $\mathbf{1} \otimes Y$ | $(HS^\dagger\otimes \mathbf{1})\operatorname{SWAP}$ |
| $Z\otimes Z$ | $\operatorname{CNOT}\_{10}$ |
| $X\otimes Z$ | $\operatorname{CNOT}\_{10}(H\otimes \mathbf{1})$ |
| $Y\otimes Z$ | $\operatorname{CNOT}\_{10}(HS^\dagger\otimes \mathbf{1})$ |
| $Z\otimes X$ | $\operatorname{CNOT}\_{10}(\mathbf{1}\otimes H)$ |
| $X\otimes X$ | $\operatorname{CNOT}\_{10}(H\otimes H)$ |
| $Y\otimes X$ | $\operatorname{CNOT}\_{10}(HS^\dagger\otimes H)$ |
| $Z\otimes Y$ | $\operatorname{CNOT}\_{10}(\mathbf{1} \otimes HS^\dagger)$ |
| $X\otimes Y$ | $\operatorname{CNOT}\_{10}(H\otimes HS^\dagger)$ |
| $Y\otimes Y$ | $\operatorname{CNOT}\_{10}(HS^\dagger\otimes HS^\dagger)$ |
Here, the CNOT operation appears for the following reason.
Each Pauli measurement that does not include the $\mathbf{1}$ matrix is equivalent up to a unitary to $Z\otimes Z$ by the earlier reasoning.
The eigenvalues of $Z\otimes Z$ only depend on the parity of the qubits that comprise each computational basis vector, and the controlled-not operations serve to compute this parity and store it in the first bit.
Then once the first bit is measured, one can recover the identity of the resultant half-space, which is equivalent to measuring the Pauli operator.
Also, while it may be tempting to assume that measuring $Z\otimes Z$ is the same as sequentially measuring $Z\otimes \mathbb{1}$ and then $\mathbb{1} \otimes Z$, this assumption would be false. The reason is that measuring $Z\otimes Z$ projects the quantum state into either the $+1$ or $-1$ eigenstate of these operators. Measuring $Z\otimes \mathbb{1}$ and then $\mathbb{1} \otimes Z$ projects the quantum state vector first onto a half space of $Z\otimes \mathbb{1}$ and then onto a half space of $\mathbb{1} \otimes Z$. As there are four computational basis vectors, performing both measurements reduces the state to a quarter-space and hence reduces it to a single computational basis vector.
Correlations between qubits¶
Another way of looking at measuring tensor products of Pauli matrices such as $X\otimes X$ or $Z\otimes Z$ is that these measurements let you look at information stored in the correlations between the two qubits. Measuring $X\otimes 1$ lets you look at information that is locally stored in the first qubit. While both types of measurements are equally valuable in quantum computing, the former illuminates the power of quantum computing. It reveals that in quantum computing, often the information you wish to learn is not stored in any single qubit but rather stored non-locally in all the qubits at once, and therefore only by looking at it through a joint measurement (e.g. $Z\otimes Z$) does this information become manifest.
Arbitrary Pauli operators such as $X\otimes Y \otimes Z \otimes \mathbf{1}$ can also be measured. All such tensor products of Pauli operators have only two eigenvalues $\pm 1$ and both eigenspaces constitute half-spaces of the entire vector space. Thus they coincide with the requirements stated earlier.
In Q#, such measurements return $j$ if the measurement yields a result in the eigenspace of sign $(-1)^j$. Having Pauli measurements as a built-in feature in Q# is helpful because measuring such operators requires long chains of controlled-NOT gates and basis transformations to describe the diagonalizing $U$ gate needed to express the operation as a tensor product of $Z$ and $1$. By being able to specify that you wish to do one of these pre-defined measurements, you don't need to worry about how to transform your basis such that a computational basis measurement provides the necessary information. Q# handles all the necessary basis transformations for you automatically.
The No-Cloning Theorem¶
Quantum information is powerful. It enables you to do amazing things such as factor numbers exponentially faster than the best known classical algorithms, or efficiently simulate correlated electron systems that classically require exponential cost to simulate accurately. However, there are limitations to the power of quantum computing. One such limitation is given by the No-Cloning Theorem.
The No-Cloning Theorem is aptly named. It disallows cloning of generic quantum states by a quantum computer. The proof of the theorem is remarkably straightforward. While a full proof of the no-cloning theorem is too technical for this article, the proof in the case of no additional auxiliary qubits is within the scope.
For such a quantum computer, the cloning operation must be described with a unitary matrix. Quantum measurement is disallowed, since it would corrupt the quantum state to be cloned. To simulate the cloning operation, the unitary matrix used needs to have the property that $$ U \ket{\psi} \ket{0} = \ket{\psi} \ket{\psi} $$ for any state $\ket{\psi}$. The linearity property of matrix multiplication then implies that for any second quantum state $\ket{\phi}$,
$$ \begin{align} U \left[ \frac{1}{\sqrt{2}}\left(\ket{\phi}+\ket{\psi} \right) \right] \ket{0} & = \frac{1}{\sqrt{2}} U\ket{\phi}\ket{0} + \frac{1}{\sqrt{2}} U\ket{\psi}\ket{0} \\\\ & = \frac{1}{\sqrt{2}} \left( \ket{\phi} \ket{\phi} + \ket{\psi} \ket{\psi} \right) \\\\ & \ne \left( \frac{1}{\sqrt{2}}\left(\ket{\phi}+\ket{\psi} \right) \right) \otimes \left( \frac{1}{\sqrt{2}}\left(\ket{\phi}+\ket{\psi} \right) \right). \end{align} $$
This provides the fundamental intuition behind the No-Cloning Theorem: any device that copies an unknown quantum state must induce errors on at least some of the states it copies. While the key assumption that the cloner acts linearly on the input state can be violated through the addition and measurement of auxiliary qubits, such interactions also leak information about the system through the measurement statistics and prevent exact cloning in such cases as well.
The No-Cloning Theorem is important for qualitative understanding of quantum computing because if you could clone quantum states inexpensively then you would be granted a near-magical ability to learn from quantum states. Indeed, you could violate Heisenberg's vaunted uncertainty principle. Alternatively, you could use an optimal cloner to take a single sample from a complex quantum distribution and learn everything you could possibly learn about that distribution from just one sample. This would be like you flipping a coin and observing heads and then upon telling a friend about the result having them respond "Ah the distribution of that coin must be Bernoulli with $p=0.512643\ldots$!" Such a statement would be nonsensical because one bit of information (the heads outcome) simply cannot provide the many bits of information needed to encode the distribution without substantial prior information. Similarly, without prior information one cannot perfectly clone a quantum state just as one cannot prepare an ensemble of such coins without knowing $p$.
Information is not free in quantum computing. Each qubit measured gives a single bit of information, and the No-Cloning Theorem shows that there is no backdoor that can be exploited to get around the fundamental tradeoff between information gained about the system and the disturbance invoked upon it.
Theory of Grover's search algorithm¶
In this article you'll find a detailed theoretical explanation of the mathematical principles that make Grover's algorithm work.
For a practical implementation of Grover's algorithm to solve mathematical problems, see Implement Grover's search algorithm.
Statement of the problem¶
Any search task can be expressed with an abstract function $f(x)$ that accepts search items $x$. If the item $x$ is a solution to the search task, then $f(x)=1$. If the item $x$ isn't a solution, then $f(x)=0$. The search problem consists of finding any item $x_0$ such that $f(x_0)=1$.
The task that Grover's algorithm aims to solve can be expressed as follows: given a classical function $f(x):\\{0,1\\}^n \rightarrow\\{0,1\\}$, where $n$ is the bit-size of the search space, find an input $x_0$ for which $f(x_0)=1$. The complexity of the algorithm is measured by the number of uses of the function $f(x)$. Classically, in the worst-case scenario, $f(x)$ has to be evaluated a total of $N-1$ times, where $N=2^n$, trying out all the possibilities. After $N-1$ elements, it must be the last element. Grover's quantum algorithm can solve this problem much faster, providing a quadratic speed up. Quadratic here implies that only about $\sqrt{N}$ evaluations would be required, compared to $N$.
Outline of the algorithm¶
Suppose there are $N=2^n$ eligible items for the search task and they are index by assigning each item an integer from $0$ to $N-1$. Further, suppose that there are $M$ different valid inputs, meaning that there are $M$ inputs for which $f(x)=1$. The steps of the algorithm are then as follows:
- Start with a register of $n$ qubits initialized in the state $\ket{0}$.
- Prepare the register into a uniform superposition by applying $H$ to each qubit of the register: $$|\text{register}\rangle=\frac{1}{\sqrt{N}} \sum_{x=0}^{N-1}|x\rangle$$
- Apply the following operations to the register $N_{\text{optimal}}$ times:
- The phase oracle $O_f$ that applies a conditional phase shift of $-1$ for the solution items.
- Apply $H$ to each qubit in the register.
- A conditional phase shift of $-1$ to every computational basis state except $\ket{0}$. This can be represented by the unitary operation $-O_0$, as $O_0$ represents the conditional phase shift to $\ket{0}$ only.
- Apply $H$ to each qubit in the register.
- Measure the register to obtain the index of a item that's a solution with very high probability.
- Check if it's a valid solution. If not, start again.
$N_\text{optimal} = \left\lfloor \frac{\pi}{4}\sqrt{\frac{N}{M}}-\frac{1}{2} \right\rfloor$ is the optimal number of iterations that maximizes the likelihood of obtaining the correct item by measuring the register.
[!NOTE] The joint application of the steps 3.b, 3.c and 3.d is usually known in the literature as Grover's diffusion operator.
The overall unitary operation applied to the register is:
$$(-H^{\otimes n}O_0H^{\otimes n}O_f)^{N_{\text{optimal}}}H^{\otimes n}$$
Following the register's state step by step¶
To illustrate the process, let's follow the mathematical transformations of the state of the register for a simple case in which there are only two qubits and the valid element is $\ket{01}.$
The register starts in the state: $$|\text{register}\rangle=|00\rangle$$
After applying $H$ to each qubit the register's state transforms to: $$|\text{register}\rangle = \frac{1}{\sqrt{4}} \sum_{i \in \\{0,1\\}^2}|i\rangle=\frac12(\ket{00}+\ket{01}+\ket{10}+\ket{11})$$
Then, the phase oracle is applied to get: $$|\text{register}\rangle = \frac12(\ket{00}-\ket{01}+\ket{10}+\ket{11})$$
Then $H$ acts on each qubit again to give: $$|\text{register}\rangle = \frac12(\ket{00}+\ket{01}-\ket{10}+\ket{11})$$
Now the conditional phase shift is applied on every state except $\ket{00}$: $$|\text{register}\rangle = \frac12(\ket{00}-\ket{01}+\ket{10}-\ket{11})$$
Finally, the first Grover iteration ends by applying $H$ again to get: $$|\text{register}\rangle = \ket{01}$$
By following the above steps, the valid item is found in a single iteration. As you will see later, this is because for N=4 and a single valid item, $N_\text{optimal}=1$.
Geometrical explanation¶
To see why Grover's algorithm works, let's study the algorithm from a geometrical perspective. Let $\ket{\text{bad}}$ be the superposition of all states that aren't a solution to the search problem. Supposing there are $M$ valid solutions:
$$\ket{\text{bad}}=\frac{1}{\sqrt{N-M}}\sum_{x:f(x)=0}\ket{x}$$
The state $\ket{\text{good}}$ is defined as the superposition of all states that are a solution to the search problem:
$$\ket{\text{good}}=\frac{1}{\sqrt{M}}\sum_{x:f(x)=1}\ket{x}$$
Since good and bad are mutually exclusive sets because an item cannot be valid and not valid, the states $\ket{\text{good}}$ and $\ket{\text{bad}}$ are orthogonal. Both states form the orthogonal basis of a plane in the vector space. One can use this plane to visualize the algorithm.
:::image type="content" source="media/plane-grovers.png" alt-text="Plot of the plane in the Bloch sphere projected by the orthogonal good and bad vectors.":::
Now, suppose $\ket{\psi}$ is an arbitrary state that lives in the plane spanned by $\ket{\text{good}}$ and $\ket{\text{bad}}$. Any state living in that plane can be expressed as:
$$\ket{\psi} = \alpha \ket{\text{good}} + \beta \ket{\text{bad}}$$
where $\alpha$ and $\beta$ are real numbers. Now, let's introduce the reflection operator $R_{\ket{\psi}}$, where $\ket{\psi}$ is any qubit state living in the plane. The operator is defined as:
$$R_{\ket{\psi}}=2\ket{\psi}\bra{\psi}-\mathcal{I}$$
It is called the reflection operator about $\ket{\psi}$ because it can be geometrically interpreted as reflection about the direction of $\ket{\psi}$. To see it, take the orthogonal basis of the plane formed by $\ket{\psi}$ and its orthogonal complement $\ket{\psi^{\perp}}$. Any state $\ket{\xi}$ of the plane can be decomposed in such basis:
$$\ket{\xi}=\mu \ket{\psi} + \nu {\ket{\psi^{\perp}}}$$
If one applies the operator $R_{\ket{\psi}}$ to $\ket{\xi}$:
$$R_{\ket{\psi}}\ket{\xi}=\mu \ket{\psi} - \nu {\ket{\psi^{\perp}}}$$
The operator $R_\ket{\psi}$ inverts the component orthogonal to $\ket{\psi}$ but leaves the $\ket{\psi}$ component unchanged. Therefore, $R_\ket{\psi}$ is a reflection about $\ket{\psi}$.
:::image type="content" source="media/reflection-operator.png" alt-text="Plot of the reflection operator about the quantum state visualized in the plane.":::
Grover's algorithm, after the first application of $H$ to every qubit, starts with an uniform superposition of all states. This can be written as:
$$\ket{\text{all}} = \sqrt{\frac{M}{N}}\ket{\text{good}} + \sqrt{\frac{N-M}{N}}\ket{\text{bad}}$$
:::image type="content" source="media/starting-state.png" alt-text="Plot of the starting state as a superposition of the good and bad states in the plane.":::
And thus the state lives in the plane. Note that the probability of obtaining a correct result when measuring from the equal superposition is just $|\braket{\text{good}|{\text{all}}}|^2=M/N$, which is what you would expect from a random guess.
The oracle $O_f$ adds a negative phase to any solution to the search problem. Therefore, it can be written as a reflection about the $\ket{\text{bad}}$ axis:
$$O_f = R_{\ket{\text{bad}}} = 2\ket{\text{bad}}\bra{\text{bad}} - \mathcal{I}$$
Analogously, the conditional phase shift $O_0$ is just an inverted reflection about the state $\ket{0}$:
$$O_0 = R_{\ket{0}}= -2\ket{0}\bra{0} + \mathcal{I}$$
Knowing this fact, it's easy to check that the Grover diffusion operation $-H^{\otimes n} O_0 H^{\otimes n}$ is also a reflection about the state $\ket{all}$. Just do:
$$-H^{\otimes n} O_0 H^{\otimes n}=2H^{\otimes n}\ket{0}\bra{0}H^{\otimes n} -H^{\otimes n}\mathcal{I}H^{\otimes n} = 2\ket{\text{all}}\bra{\text{all}} - \mathcal{I} = R_{\ket{\text{all}}}$$
It has been demonstrated that each iteration of Grover's algorithm is a composition of two reflections $R_\ket{\text{bad}}$ and $R_\ket{\text{all}}$.
:::image type="content" source="media/grovers-iteration.png" alt-text="Plot of the Grover iteration visualized as a sequence of two reflections in the plane.":::
The combined effect of each Grover iteration is a counterclockwise rotation of an angle $2\theta$. Fortunately, the angle $\theta$ is easy to find. Since $\theta$ is just the angle between $\ket{\text{all}}$ and $\ket{\text{bad}}$, one can use the scalar product to find the angle. It is known that $\cos{\theta}=\braket{\text{all}|\text{bad}}$, so one needs to calculate $\braket{\text{all}|\text{bad}}$. From the decomposition of $\ket{\text{all}}$ in terms of $\ket{\text{bad}}$ and $\ket{\text{good}}$, it follows:
$$\theta = \arccos{\left(\braket{\text{all}|\text{bad}}\right)}= \arccos{\left(\sqrt{\frac{N-M}{N}}\right)} $$
The angle between the state of the register and the $\ket{\text{good}}$ state decreases with each iteration, resulting in a higher probability of measuring a valid result. To calculate this probability, you just need to calculate $|\braket{\text{good}|\text{register}}|^2$. Denoting the angle between $\ket{\text{good}}$ and $\ket{\text{register}}$ $\gamma (k)$, where $k$ is the iteration count:
$$\gamma (k) = \frac{\pi}{2}-\theta -k2\theta = \frac{\pi}{2} -(2k + 1) \theta $$
Therefore, the probability of success is:
$$P(\text{success}) = \cos^2(\gamma(k)) = \sin^2\left[(2k +1)\arccos \left( \sqrt{\frac{N-M}{N}}\right)\right]$$
Optimal number of iterations¶
As the probability of success can be written as a function of the number of iterations, the optimal number of iterations $N_{\text{optimal}}$ can be found by computing the smallest positive integer that (approximately) maximizes the success probability function.
:::image type="content" source="media/success-probability-grovers.png" alt-text="A sinusoidal plot of the success probability as a function of Grover iterations. The optimal number of iterations is near the first peak.":::
It is known that $\sin^2{x}$ reaches its first maximum for $x=\frac{\pi}{2}$, so:
$$\frac{\pi}{2}=(2k_{\text{optimal}} +1)\arccos \left( \sqrt{\frac{N-M}{N}}\right)$$
This gives:
$$k_{\text{optimal}} = \frac{\pi}{4\arccos\left(\sqrt{1-M/N}\right)}-1/2 = \frac{\pi}{4}\sqrt{\frac{N}{M}}-\frac{1}{2}-O\left(\sqrt\frac{M}{N}\right)$$
Where in the last step, $\arccos \sqrt{1-x} = \sqrt{x} + O(x^{3/2})$.
Therefore, you can pick $N_\text{optimal}$ to be $N_\text{optimal} = \left\lfloor \frac{\pi}{4}\sqrt{\frac{N}{M}}-\frac{1}{2} \right\rfloor$.
Complexity analysis¶
From the previous analysis, $O\left(\sqrt{\frac{N}{M}}\right)$ queries of the oracle $O_f$ are needed to find a valid item. However, can the algorithm be implemented efficiently in terms of time complexity? $O_0$ is based on computing Boolean operations on $n$ bits and is known to be implementable using $O(n)$ gates. There are also two layers of $n$ Hadamard gates. Both of these components thus require only $O(n)$ gates per iteration. Because $N=2^n$, it follows that $O(n)=O(log(N))$. Therefore, if $O\left(\sqrt{\frac{N}{M}}\right)$ iterations and $O(log(N))$ gates per iteration are needed, the total time complexity (without taking into account the oracle implementation) is $O\left(\sqrt{\frac{N}{M}}log(N)\right)$.
The overall complexity of the algorithm ultimately depends on the complexity of the implementation of the oracle $O_f$. If a function evaluation is much more complicated on a quantum computer than on a classical one, the overall algorithm runtime will be longer in the quantum case, even though technically, it uses fewer queries.
References¶
If you want to continue learning about Grover's algorithm, you can check any of the following sources:
- Original paper by Lov K. Grover
- Quantum search algorithms section of Nielsen, M. A. & Chuang, I. L. (2010). Quantum Computation and Quantum Information.
- Grover's algorithm on Arxiv.org
Quantum circuit diagram conventions¶
Sometimes quantum algorithms are easier to understand in a circuit diagram than in the equivalent written matrix representation. This article explains how to read quantum circuit diagrams and their conventions.
For more information, see How to visualize quantum circuits diagrams.
Reading quantum circuit diagrams¶
In a quantum circuit, time flows from left to right. Quantum gates are ordered in chronological order with the left-most gate as the gate first applied to the qubits.
Take the following quantum circuit diagram as an example:
:::image type="content" source="media\annotated-sample-circuit.png" alt-text="Diagram of a quantum circuit with two registers, one hadamard gate, one controlled gate and one measurement. ":::
- Qubit register: Qubit registers are displayed as horizontal lines, with each line representing a qubit. The top line is qubit register labeled 0, the second line is qubit register labeled 1, and so on.
- Quantum gate: Quantum operations are represented by quantum gates. The term quantum gate is analogous to classical logic gates. Gates acting on one or more qubit registers are denoted as a box. In this example, the symbol represents a Hadamard operation.
- Controlled gate: Controlled gates act on two or more qubits. In this example, the symbol represent a CNOT gate. The black circle represents the control qubit, and the cross within a circle represents the target qubit.
- Measurement operation: The meter symbol represents a measurement operation. The measurement operation takes a qubit register as input and outputs classical information.
Applying quantum gates¶
Because time flows from left to right, the left-most gate is applied first For example, the action of the following quantum circuit is the unitary matrix $CBA$.
:::image type="content" source="media\3.svg" alt-text="Diagram of quantum gates being applied left-to-right in a quantum circuit.":::
[!NOTE] Matrix multiplication obeys the opposite convention: the right-most matrix is applied first. In quantum circuit diagrams, however, the left-most gate is applied first. This difference can at times lead to confusion, so it is important to note this significant difference between the linear algebraic notation and quantum circuit diagrams.
Inputs and outputs of quantum circuits¶
In a quantum circuit diagram, the wires entering a quantum gate represent the qubits that are input to the quantum gate, and the wires exiting the quantum gate represent the qubits that are output from the quantum gate.
The number of inputs of a quantum gate are equal to the number of outputs of a quantum gate. This is because quantum operations are unitary and hence reversible. If a quantum gate had more outputs than inputs, it would not be reversible and hence not unitary, which is a contradiction.
For this reason any box drawn in a circuit diagram must have precisely the same number of wires entering it as exiting it.
Multi-qubit operations¶
Multi-qubit circuit diagrams follow similar conventions to single-qubit ones. For example, a two-qubit unitary operation $B$ can be defined to be $(H S\otimes X)$, so the equivalent quantum circuit is the following:
:::image type="content" source="media\4.svg" alt-text="Circuit diagram of a two-qubit unitary operation.":::
You can also view $B$ as having an action on a single two-qubit register rather than two one-qubit registers depending on the context in which the circuit is used.
Perhaps the most useful property of such abstract circuit diagrams is that they allow complicated quantum algorithms to be described at a high level without having to compile them down to fundamental gates. This means that you can get an intuition about the data flow for a large quantum algorithm without needing to understand all the details of how each of the subroutines within the algorithm work.
Controlled gates¶
Quantum controlled gates are two-qubit gates that apply a single-qubit gate to a target qubit if a control qubit is in a specific state.
For example, consider a quantum controlled gate, denoted $\Lambda(G)$, where a single qubit's value controls the application of the $G$ operation,. The controlled gate $\Lambda(G)$ can be understood by looking at the following example of a product state input:
$\Lambda(G) (\alpha \ket{0} + \beta \ket{1}) \ket{\psi} = \alpha \ket{0} \ket{\psi} + \beta \ket{1} G\ket{\psi}$
That is to say, the controlled gate applies $G$ to the register containing $\psi$ if and only if the control qubit takes the value $1$. In general, such controlled operations are described in circuit diagrams by the following symbol:
:::image type="content" source="media\5.svg" alt-text="Circuit diagram of a singly controlled gate.":::
Here the black circle denotes the quantum bit on which the gate is controlled and a vertical wire denotes the unitary that is applied when the control qubit takes the value $1$.
For the special cases where $G=X$ and $G=Z$, the following notation is used to describe the controlled version of the gates (note that the controlled-X gate is the CNOT gate):
:::image type="content" source="media\6.svg" alt-text="Circuit diagram for special cases of controlled gates.":::
Q# provides methods to automatically generate the controlled version of an operation, which saves the programmer from having to hand code these operations. An example of this is shown below:
qsharp
operation PrepareSuperposition(qubit : Qubit) : Unit
is Ctl { // Auto-generate the controlled specialization of the operation
H(qubit);
}
Classically controlled gates¶
Quantum gates can also be applied after a measurement, where the result of the measurement acts as a classical control bit.
The following symbol represents a classically controlled gate, where $G$ is applied conditioned on the classical control bit being value $1$:
:::image type="content" source="media\8.svg" alt-text="Circuit diagram representing a controlled operation.":::
Measurement operator¶
Measurement operations take a qubit register, measures it, and outputs the result as classical information.
A measurement operation is denoted by a meter symbol and always takes as input a qubit register (denoted by a solid line) and outputs classical information (denoted by a double line). Specifically, the symbol of the measurement operation looks like:
:::image type="content" source="media\7.svg" alt-text="Symbol representing a measurement operation.":::
In Q#, the Measure operator implements the measurement operation.
Example: Unitary transformation¶
Consider the unitary transformation $\text{ CNOT}_{01}(H\otimes 1)$. This gate sequence is of fundamental significance to quantum computing because it creates a maximally entangled two-qubit state:
$\mathrm{CNOT}_{01}(H\otimes 1)\ket{00} = \frac{1}{\sqrt{2}} \left(\ket{00} + \ket{11} \right),$
Operations with this or greater complexity are ubiquitous in quantum algorithms and quantum error correction.
The circuit diagram for preparing this maximally entangled quantum state is:
:::image type="content" source="media\1.svg" alt-text="Circuit diagram for a maximally entangled two-qubit state.":::
The symbol behind the Hadamard gate represents a CNOT gate, where the black circle indicates the control qubit and the cross within a circle indicates the target qubit. This quantum circuit is depicted as acting on two qubits (or equivalently two registers consisting of one qubit).
Example: Teleportation circuit diagram¶
Quantum teleportation is one of the best quantum algorithm for illustrating circuit components.
Quantum teleportation is a protocol that allows a quantum state to be transmitted from one qubit to another, with the help of a shared entangled state between the sender and receiver, and classical communication between them.
For learning purposes, the sender is called Alice, the receiver is called Bob, and the qubit to be teleported is called the message qubit. Alice and Bob hold one qubit each, and Alice has an extra qubit which is the message qubit.
The following circuit diagram illustrates the teleportation protocol:
:::image type="content" source="media\annotated-teleportation-circuit.png" alt-text="Diagram of the quantum circuit of the teleportation protocol.":::
Let's break down the steps of the teleportation protocol:
- Qubit register q0 is the message qubit, qubit register q1 is Alice's qubit, and qubit register q2 is Bob's qubit. The message qubit is in an unknown state, and Alice and Bob's qubits are in the $\ket{0}$ state.
- A Hadamard gate is applied to Alice's qubit. Because the qubit is in the $\ket{0}$ state, the resulting state is $\frac{1}{\sqrt{2}}(\ket{0} + \ket{1})$.
- A CNOT gate is applied to Alice and Bob's qubits. Alice's qubit is the control qubit, and Bob's qubit is the target qubit. The resulting state is $\frac{1}{\sqrt{2}}(\ket{00} + \ket{11})$. Alice and Bob now share an entangled state.
- A CNOT gate is applied to the message qubit and Alice's qubit. Since Alice's qubit is also entangled with Bob's qubit, the resulting state is a three-qubit entangled state.
- A Hadamard gate is applied to the message qubit.
- Alice measures her two qubits and tells the measurement results to Bob, which isn't reflected in the circuit. The measurement results are two classical bits, which can take the values 00, 01, 10, or 1
- Two classically controlled Pauli gates X and Z are applied to Bob's qubit, depending on the result bit being value $1$. The resulting state is the original message qubit state.
Entanglement and correlations¶
Entanglement is a fundamental concept in quantum mechanics that describes a quantum correlation between quantum systems. When two or more qubits are entangled, the state of one qubit is dependent on the state of the other qubit, even if they're far apart. This quantum correlation is a unique feature of quantum systems that doesn't have a classical counterpart.
This article provides an overview of entanglement, correlations, and explains how to create entanglement using quantum gates.
What is entanglement?¶
Imagine that you have two qubits, $A$ and $B$. The qubits are independent from each other, which means that the information about the state of qubit $A$, whatever it is, belongs only to qubit $A$. Similarly, the information about the state of qubit $B$ belongs to qubit $B$. In this case, the qubits aren't entangled, because they aren't sharing any information about their states.
Now imagine that you entangle the qubits. If qubits $A$ and $B$ are entangled, the information about the state of qubit $A$ isn't independent of the state of qubit $B$. When entangled, information is shared between both qubits, and there's no way to know the state of qubit $A$ or qubit $B$. You can only describe the state of the global system, not the state of the individual qubits.
Entanglement is a quantum correlation between two or more particles. If two particles are entangled, they can't be described independently, but only as a whole system.
Two or more particles can be entangled even if they are separated by large distances. This correlation is stronger than any classical correlation, and it's a key resource for quantum information processing tasks such as quantum teleportation, quantum cryptography, and quantum computing. If you want to learn how to teleport a qubit using entanglement, check out this module in the Azure Quantum training path.
[!NOTE] Entanglement is a property of multi-qubit systems, not of single qubits. That is, a single qubit can't be entangled.
Defining entanglement in quantum systems¶
Imagine two qubits $A$ and $B$ such that the state of the global system $\ket{\phi}$ is:
$$\ket{\phi}=\frac1{\sqrt2}(\ket{0_A 0_B}+ \ket{1_A 1_B})$$
[!NOTE] In Dirac notation, $\ket{0_A 0_B}= |0\rangle_\text{A} |0\rangle_\text{B}$. The first position corresponds to the first qubit, and the second position corresponds to the second qubit.
The global system $\ket{\phi}$ is in a superposition of the states $|00\rangle$ and $|11\rangle$. But what is the individual state of qubit $A$? And of qubit $B$? If you try to describe the state of qubit $A$ without considering the state of qubit $B$, you fail. Subsystems $A$ and $B$ are entangled and can't be described independently.
If you measure both qubits, only two outcomes are possible: $\ket{00}$ and $\ket{11}$, each with the same probability of $\frac{1}{2}$. The probability of obtaining the states $|01\rangle$ and $|10\rangle$ is zero.
But what happens if you measure only one qubit? When two particles are entangled the measurement outcomes are also correlated. That is, whatever operation happens to the state of one qubit in an entangled pair, also affects to the state of the other qubit.
If you measure only the qubit $A$ and you get the $|0\rangle$ state, this means that the global system collapses to the state $\ket{00}$. This is the only possible outcome, since the probability of measuring $|01\rangle$ is zero. So, without measuring the qubit $B$ you can be sure that the second qubit is also in $|0\rangle$ state. The measurement outcomes are correlated because the qubits are entangled.
The quantum state $\ket{\phi}$ is called a Bell state. There are four Bell states:
$$\ket{\phi^{+}}=\frac1{\sqrt2}\ket{00} + \frac1{\sqrt2}\ket{11}$$ $$\ket{\phi^{-}}=\frac1{\sqrt2}\ket{00} - \frac1{\sqrt2}\ket{11} $$ $$ \ket{\psi^{+}}=\frac1{\sqrt2}\ket{01} + \frac1{\sqrt2}\ket{10} $$ $$\ket{\psi^{-}}=\frac1{\sqrt2}\ket{01} - \frac1{\sqrt2}\ket{10} $$
[!NOTE] This example uses two qubits, but quantum entanglement isn't limited to two qubits. In general it's possible that multiple-qubit systems share entanglement.
Creating entanglement with quantum operations¶
You can use quantum operations to create quantum entanglement. One of the most common ways to create entanglement to two qubits in the state $|00\rangle$ is by applying the Hadamard operation $H$ and the controlled-NOT operation $CNOT$ to transform them into the Bell state $\ket{\phi^+}=\frac1{\sqrt2}(|00\rangle+|11\rangle)$.
The $CNOT$ operation takes two qubits as input, one acts as control qubit and the other is the target qubit. The CNOT operation flips the state of the target qubit if, and only if, the state of the control qubit is $|1\rangle$.
| Input | Output |
|---|---|
| $\ket{00}$ | $\ket{00}$ |
| $\ket{01}$ | $\ket{01}$ |
| $\ket{10}$ | $\ket{11}$ |
| $\ket{11}$ | $\ket{10}$ |
Here's how it works:
Take two qubits in the state $|00\rangle$. The first qubit is the control qubit and the second qubit is the target qubit.
Create a superposition state only in the control qubit by applying $H$.
$$H |0_c\rangle= \frac{1}{\sqrt{2}}(|0_c\rangle+|1_c\rangle)$$
[!NOTE] The subscripts ${}_c$ and ${}_t$ specify the control and target qubits.
Apply the $CNOT$ operator to the control qubit, which is in a superposition state, and the target qubit, which is in the state $|0_t\rangle$.
$$ CNOT \frac{1}{\sqrt{2}}(\ket{0_c}+\ket{1_c})\ket{0}_t = CNOT \frac{1}{\sqrt2}(\ket{0_c 0_t}+|\ket{1_c 0_t})= $$ $$ =\frac{1}{\sqrt2}(CNOT \ket{0_c 0_t} + CNOT \ket{1_c 0_t})= $$ $$= \frac{1}{\sqrt2}(\ket{0_c 0_t}+\ket{1_c 1_t})$$
[!TIP] To learn how to entangle two qubits with Q#, see Quickstart: Create your first Q# program.
Separability and quantum entanglement¶
Entanglement can be seen as the lack of separability: a state is entangled when it isn't separable.
A quantum state is separable if it can be written as a product state of the subsystems. That is, a state $\ket{\phi}_{\text{AB}}$ is separable if it can be written as a combination of product states of the subsystems, that is $\ket{\phi}_{\text{AB}} = \ket{a}_A \otimes \ket{b}_B$.
Entanglement in pure states¶
A pure quantum state is a single ket vector, such as the state $\ket{+} = \frac{1}{\sqrt{2}}(\ket{0} + \ket{1})$.
Pure states can't be written as a statistical mixture (or convex combination) of other quantum states.
On the Bloch sphere, pure states are represented by a point on the surface of the sphere, whereas mixed states are represented by an interior point.
A pure state $\ket{\phi}_{AB}$ is entangled if it can't be written as a combination of product states of the subsystems, that is $\ket{\phi}_{AB} = \ket{a}_A \otimes \ket{b}_B$.
For example, consider the state $$ \ket{\psi}_{AB} = \frac{1}{2} (\ket{00} + \ket{10} +\ket{01} +\ket{11})$$
At first, the state $\ket{\psi}_{AB}$ doesn't look like a product state, but if we rewrite the state as
$$ \ket{\psi}_{AB} = \frac{1}{\sqrt{2}} (\ket{0}_A +\ket{1}_A) \otimes \frac{1}{\sqrt{2}} (\ket{0}_B +\ket{1}_B)= \ket{+}_A \ket{+}_B$$
the state $\ket{\psi}_{\text{AB}}$ is a product state, therefore it's not entangled.
Entanglement in mixed states¶
Mixed quantum states are a statistical ensemble of pure states. To describe mixed states is easier to use their density matrix $\rho$ rather than the ket notation.
A mixed state $\rho$ is separable if it can be written as a convex combination of product states of the subsystems, such as
$$\rho = \sum_j p_j \rho^{A}_j \otimes \rho^{B}_j$$
where $p_j \geq 0, \sum p_j = 1$ and $\rho^{A}_j \geq 0, \rho^{B}_j \geq 0$.
For more information, see Density matrices.
A mixed state $\rho$ is entangled if it's not separable, that is, it can't be written as a convex combination of product states.
[!NOTE]
- If an entangled state $\rho$ is pure, then it contains only quantum correlations.
- If an entangled state $\rho$ is mixed, then it contains both classical and quantum correlations.
Understanding classical correlations¶
Classical correlations are due to the lack of knowledge of the state of the system. That is, there's some randomness associated to classical correlation, but it can be eliminated by gaining knowledge.
For example, consider two boxes, each containing one ball. You know that both balls are the same color, either blue or red. If you open one box and find out that the ball inside is blue, then we know that the other ball is blue too. Therefore, they are correlated. However, the uncertainty that we have when opening the box is due to our lack of knowledge, it's not fundamental. The ball was blue before we opened the box. Thus, this is a classical correlation, not a quantum correlation.
The mixed quantum state of the system formed by the two boxes $\rho_{boxes}$ can be written as
$$ \rho_{boxes} = \frac{1}{2} (\ket{red}\bra{red}_{A} \otimes \ket{red}\bra{red}_B) +\frac{1}{2} (\ket{blue}\bra{blue}_A \otimes \ket{blue}\bra{blue}_B) $$
Notice that the state $\rho_{boxes}$ is separable, where $p_1 = p_2 = \frac{1}{2}$ then it contains only classical correlations. Another example of a mixed separable state is
$$ \rho = \frac{1}{2} (\ket{0}\bra{0}_A \otimes \ket{0}\bra{0}_B) +\frac{1}{2} (\ket{1}\bra{1}_A \otimes \ket{1}\bra{1}_B) $$
Now, consider the following state:
$$ \rho = \frac{1}{4} (\ket{00}\bra{00} + \ket{00}\bra{11} + \ket{11}\bra{00} + \ket{11}\bra{11}) = \ket{\phi^+}\bra{\phi^+} $$
In this case, our knowledge of the state is perfect, we know with maximal certainty that the system $AB$ is in the Bell state $\ket{\phi^+}$ and $\rho$ is a pure state. Therefore, there aren't classical correlations. But if we measure an observable on subsystem $A$, we obtain a random result which gives us information about the state of the subsystem $B$. This randomness is fundamental, namely these are quantum correlations.
An example of a quantum state that contains both classical and quantum correlations is
$$ \rho = \frac{1}{2} (\ket{\phi^+}\bra{\phi^+} + \ket{\phi^-}\bra{\phi^-})$$
[!NOTE] A separable state contains only classical correlations.
Introduction to quantum error correction¶
This article explains the basics of quantum error correction, the types of quantum errors, and some common quantum error correction codes. It also provides an example of how to correct errors using the three-qubit code.
What is quantum error correction?¶
Quantum error correction (QEC) is a technique that allows us to protect quantum information from errors. Error correction is especially important in quantum computers, because efficient quantum algorithms make use of large-scale quantum computers, which are sensitive to noise.
The basic principle behind quantum error correction is that the number of bits used to encode a given amount of information is increased. This redundancy allows the code to detect and correct errors.
The error rates for quantum computers are typically higher than classical computer's errors due to the challenges associated with building and operating quantum systems. Noise, decoherence, and imperfections in quantum gates can cause errors in quantum computations. Current quantum computers have error rates in the range of 1% to 0.1%. In other words, this means that on average one out of every 100 to 1000 quantum gate operations results in an error.
Types of quantum errors¶
There are two fundamental types of quantum errors: bit flips and phase flips.
Bit flip errors occur when a qubit changes from $\ket{0}$ to $\ket{1}$ or vice versa. Bit flip errors are also known as $\sigma_x$-errors, because they map the qubit states $\sigma_x \ket{0} = \ket{1}$ and $\sigma_x \ket{1} = \ket{0}$. This error is analogous to a classical bit flip error.
Phase flip errors occur when a qubit changes its phase. They are also known as $\sigma_z$-errors, because they map the qubit states $\sigma_z \ket{0} = \ket{0}$ and $\sigma_z \ket{1} = -\ket{1}$. This type of error has no classical analog.
In quantum computing, quantum errors can manifest as bit flips, phase flips, or a combination of both.
How does quantum error correction work?¶
Quantum error correction codes work by encoding the quantum information into a larger set of qubits, called the physical qubits. The joint state of the physical qubits represents a logical qubit.
The physical qubits are subject to errors due to decoherence and imperfections in quantum gates. The code is designed so that errors can be detected and corrected by measuring some of the qubits in the code.
For example, imagine you want to send the single-qubit message $\ket{0}$. You could use three physical qubits to encode the message, sending $\ket{000}$, which is known as a codeword. This error-correcting code is a repetition code, because the message is repeated three times.
Now, imagine that a single bit-flip error occurs during transmission so that what the recipient receives is the state $\ket{010}$. In this scenario, the recipient may be able to infer that the intended message is $\ket{000}$. However, if the message is subject to two bit-flip errors, the recipient may infer an incorrect message. Finally, if all three bits are flipped so that the original message $\ket{000}$ becomes $\ket{111}$, the recipient has no way of knowing an error occurred.
The code distance of a QEC code is the minimum number of errors that change one codeword into another, that is, the number of errors that can't be detected. The code distance $d$ can be defined as
$$d = 2t + 1$$
where $t$ is the number of errors the code can correct. For example, the three-bit code can detect and correct one bit-flip error, so $t = 1$, and thus the code distance is $d = 3$.
Note that repetition codes, such as the three-bit code used in this example, can only correct bit-flip errors, and not phase flip errors. To correct both types of errors, more sophisticated quantum error correction codes are needed.
Types of QEC codes¶
There are many different types of QEC codes, each with its own properties and advantages. Some common QEC codes are:
Repetition code: The simplest quantum error correction code, where a single qubit is encoded into multiple qubits by repeating it multiple times. The repetition code can correct bit flip errors, but not phase flip errors.
Shor code: The first quantum error correction code, developed by Peter Shor. It encodes one logical qubit into nine physical qubits. Shor code can correct one-bit flip error or one phase flip error, but it can't correct both types of errors at the same time.
Steane code: This is a seven-qubit code that can correct both bit flip and phase flip errors. It has the advantage of being fault-tolerant, meaning that the error correction process itself doesn't introduce extra errors.
Surface code: This is a topological error correction code that uses a two-dimensional lattice of qubits to encode logical qubits. It has a high error correction threshold and is considered one of the most promising techniques for large-scale, fault-tolerant quantum computing. The surface code is used by the Azure Quantum Resource Estimator.
Hastings-Haah code: This quantum error correction code offers better space-time costs than surface codes on Majorana qubits in many regimes. For gate-based instruction sets, the overhead is larger, which makes this approach less efficient than the surface code.
Example: The three-qubit code¶
The three-qubit error correction code is a simple repetition code that can detect and correct one bit flip error. It encodes a single logical qubit into three physical qubits by repeating the qubit three times.
[!TIP] Check out the Q# code sample for the three-qubit code.
Imagine you want to send an arbitrary single qubit $\ket{\phi}= \alpha\ket{0} + \beta \ket{1}$. To avoid errors, you encode the basis states $\ket{0}$ and $\ket{1}$ into a joint state of three qubits. The two logical basis states are $\ket{0_L} = \ket{000}$ and $\ket{1_L} = \ket{111}$.
Therefore, the single qubit $\ket{\phi}= \alpha \ket{0} + \beta\ket{1}$ is encoded as:
$$\ket{\phi_L} = \alpha \ket{000} + \beta\ket{111} = \alpha \ket{0_L} + \beta \ket{1_L}$$
Let's break down the steps of the three-qubit code.
Preparing the qubits¶
First, you encode your single qubit $\ket{\phi}=\alpha\ket{0} +\beta\ket{1}$ into a joint state of three qubits.
Next, you prepare two further qubits in the state $\ket{0}$. So, the global state of all three qubits is $(\alpha\ket{0} +\beta\ket{1})\ket{0}\ket{0}= \alpha\ket{000} + \beta\ket{100} $.
Lastly, you encode the single qubit into a joint state of three qubits, by applying two CNOT operations. The first CNOT uses the first qubit as control and acts on the second qubit, producing $\alpha \ket{000} + \beta \ket{110}$. The second CNOT uses the first qubit as control and acts on the third qubit. The state of the three qubits is now $\alpha\ket{000} + \beta\ket{111}$.
Sending the qubits¶
You send all three qubits. Assuming only one-bit flip errors can occur, the received qubits are in one of the following states:
| State | Error |
|---|---|
| $\alpha \ket{000} + \beta \ket{111}$ | No error |
| $\alpha \ket{100} + \beta\ket{011}$ | Qubit 1 |
| $\alpha \ket{010} + \beta\ket{101}$ | Qubit 2 |
| $\alpha \ket{001} + \beta\ket{110}$ | Qubit 3 |
Adding auxiliary qubits¶
First, you introduce two more qubits, prepared in the state $\ket{00}$. This auxiliary pair of qubits are used to extract information of the error without directly measuring or obtaining information about the logical state.
Next, you carry out four CNOT operations: the first two operations use the first and second received qubits as control and act on the first auxiliary qubit, and last two operations use the first and third received qubits as control and act on the second auxiliary bit. The total state of all five qubits is now:
| State | Error |
|---|---|
| $(\alpha \ket{000} + \beta \ket{111})\ket{00}$ | No error |
| $(\alpha \ket{100} + \beta\ket{011})\ket{11}$ | Qubit 1 |
| $(\alpha \ket{010} + \beta\ket{101})\ket{10}$ | Qubit 2 |
| $(\alpha \ket{001} + \beta\ket{110})\ket{01}$ | Qubit 3 |
Retrieving the error syndrome¶
To retrieve the error information, you measure the two auxiliary qubits in the computational basis states $\ket{0}$ and $\ket{1}$. By doing this, you recover the joint state, which is called the error syndrome because it helps diagnose the errors in the received qubits.
Now you know which of the four possible states the three received qubits are in. You can correct the error by applying the correction operation. In this case you're dealing with bit flip errors, so the correction is a $\sigma_x$ operation applied to one (or none) of the qubits.
For example, if the error syndrome is $\ket{00}$, then the received qubits are in the state $\alpha \ket{000} + \beta\ket{111}$, which is the state you originally sent. If the error syndrome is $\ket{11}$, then the received qubits are in the state $\alpha \ket{100} + b\ket{011}$. There's a bit flip error on the first qubit, which you can correct by applying a $\sigma_x$ operation to the first qubit.
| Error syndrome | Collapse state | Correction |
|---|---|---|
| $\ket{00}$ | $ \alpha \ket{000} + \beta\ket{111}$ | Do nothing |
| $\ket{01}$ | $\alpha \ket{100} + \beta\ket{011}$ | Apply $\sigma_x$ to qubit 3 |
| $\ket{10}$ | $\alpha \ket{010} + \beta\ket{101}$ | Apply $\sigma_x$ to qubit 2 |
| $\ket{11}$ | $\alpha \ket{001} + \beta\ket{110}$ | Apply $\sigma_x$ to qubit 1 |
Extracting the original qubit¶
Finally, to extract the single qubit you wanted to transmit originally, you apply two CNOT operations: one uses the first qubit as control and acts on the second qubit, and other uses the first qubit as control and acts on the third one.
The state of the first qubit is now $\alpha \ket{0} + \beta \ket{1}$, which is the original qubit you wanted to transmit.
[!IMPORTANT] The QEC code doesn't gain any information regarding the coefficients $\alpha$ and $\beta$, hence superpositions of the computational state remain intact during correction.
Quantum intermediate representation¶
Quantum Intermediate Representation (QIR) is an intermediate representation which serves as a common interface between quantum programming languages/frameworks and targeted quantum computation platforms. QIR specifies a set of rules for representing quantum programs using a language and hardware agnostic format within the LLVM IR. The QIR is a project developed by the QIR Alliance of which Microsoft is one of its members.
What is an intermediate representation?¶
A common pattern in classical compilers is to start by compiling the source language into an intermediate representation. An intermediate representation is – as its name indicates – an intermediary step in the conversion of instructions from source code to machine language.
An intermediate representation acts as an abstract representation of a program. All programs, irrespective of the language they are written in, are translated into this intermediate representation by a so-called front-end compiler, while a back-end component is responsible for translating that intermediate representation into a machine representation. The intermediate representation allows thus to decouple source languages from hardware platforms and makes it possible to build a compiler in a modular way, where each new language only requires a new front end to be supported on all platforms for which a back end is available.
The intermediate representation is typically designed to allow many different source languages to be represented. Moreover, at this intermediate level it is also possible to perform some optimization and circuit rearrangement that makes the final implementation more efficient. Once the final target execution platform is known, the intermediate representation can be compiled to actual executable code.
This approach allows many source languages to share a common set of optimizers and executable generators. It also makes it easy to compile a single source language for many different targets. The intermediate representation provides a common platform that can be shared across many sources and targets and allows a great deal of reuse in compiler machinery.
What is Quantum Intermediate Representation?¶
QIR is an intermediate representation for quantum programs developed by the QIR Alliance, to which Microsoft belongs. It provides a common interface that supports many languages and target platforms for quantum computation. You can think of QIR as a universal mid-layer language that enables communication between high-level languages and machines. While Q# compiles to QIR, QIR is not specific to Q#: any quantum programming framework can leverage QIR to represent a quantum program. It is hardware-agnostic, which means that it does not specify a quantum instruction or gate set, leaving that to the target computing environment.
QIR is based on the popular open-source LLVM classical compiler. LLVM is a collection of modular and reusable compiler and toolchain technologies that has been adapted by a wide set of languages. QIR specifies a set of rules for representing quantum constructs in LLVM, however it does not require any extensions or modifications to LLVM.
The fact that LLVM is the underlying toolchain means that QIR is naturally able to process both classical and quantum logic. This feature is essential for hybrid quantum–classical algorithms, which have become increasingly important for applications of quantum computing. In addition, it allows you to leverage compilation and optimization tools from the classical computing industry, and, therefore, to reduce the cost of writing translations.
Many leading quantum computing industries have adopted QIR already. For example, NVIDIA, Oak Ridge National Laboratory, Quantinuum, Quantum Circuits Inc., and Rigetti Computing are building toolchains that leverage QIR.
For more information, see the QIR Specification. If you are interested in compiler tools and projects that use QIR, please take a look at these QIR repositories.
What is the QIR Alliance?¶
The QIR Alliance is a joint effort to develop a forward-looking quantum intermediate representation with the goal to enable full interoperability within the quantum ecosystem, reduce development effort from all parties, and provide a representation suitable for current and future heterogeneous quantum processors.
Quantum SDKs and languages appear and evolve at a fast pace, along with new quantum processors with unique and distinct capabilities from each other. To provide interoperability between new languages and new hardware capabilities it is imperative for the ecosystem to develop and share an intermediate representation that works with present and future quantum hardware.
With their collective work and partnership, the QIR Alliance aims to:
- Reduce the required development effort for all parties by promoting interoperability between different frameworks and languages.
- Enable the development of shared libraries both for quantum application development, and for quantum compiler development.
- Build on state-of-the-art compiler technology and leverage existing tools, libraries and learnings from high-performance computing.
- Allow for incremental and progressive evolution in how classical and quantum computations can interact at the hardware level.
- Provide the flexibility to easily connect emerging technologies in a way that permits experimentation with distinct and differentiated hardware capabilities.
The QIR Alliance is part of the Linux Foundation’s Joint Development Foundation work on open standards. Founding members include Microsoft, as well as Quantinuum (formerly Honeywell), Oak Ridge National Laboratory, Quantum Circuits Inc. and Rigetti Computing.
What does Quantum Intermediate Representation look like?¶
Since QIR is based on LLVM, QIR looks like LLVM.
For example, consider the following Q# code to generate a Bell pair:
qsharp
operation CreateBellPair(q1 : Qubit, q2 : Qubit) : Unit {
H(q1);
CNOT(q1, q2);
}
When compiled to QIR, this becomes:
define void @CreateBellPair__body(%Qubit* %q1, %Qubit* %q2) {
entry:
call void @__quantum__qis__h(%Qubit* %q1)
call void @__quantum__qis__cnot(%Qubit* %q1, %Qubit* %q2)
ret void
}
In this snippet, you can see a few QIR features:
- Operations in Q# (or any other quantum programming language) are represented by LLVM functions.
- Qubits are represented as pointers to a named opaque structure type called
%Qubit.
While the QIR for the CreateBellPair operation is very simple, QIR inherits all of the capabilities of LLVM to express loops, conditionals, and other complex control flow. QIR also inherits LLVM’s ability to express arbitrary classical computation.
For more information, watch Microsoft’s developer session from the 2021 Q2B event.
Why is Quantum Intermediate Representation important?¶
QIR is an essential tool when running quantum algorithms on real hardware. But intermediate representations can play an important role even if you just want to develop algorithms at a more theoretical level.
For example, one application enabled by QIR is to use the Clang compiler, a C language front end for LLVM, to compile QIR into executable machine code for a classical target. This provides an easy path to building a simulator in C or C++ by implementing the quantum instructions, which could simplify the creation of quantum simulators.
Moreover, you could use the intermediate representation to generate code that is later provided as input into a quantum simulator – instead of a real device – which could potentially use a different language than the source code. In this way, you can easily compare and benchmark different languages or simulators using a common framework.
In terms of code optimization, there are optimization steps that can be performed at the intermediate level that can make the overall algorithm implementation more efficient. Investigating this optimization of your input code can help you get a better understanding of where to make algorithms more efficient and how to improve the quantum programming languages.
Another application is to use the standard LLVM “pass” infrastructure to create quantum code optimizers that operate on QIR. The language- and hardware-independent approach of QIR enables reusing those optimizers for different computation languages and computing platforms with almost no effort.
Understanding quantum oracles¶
An oracle, $O$, is an unexposed operation that is used as input to another algorithm. Often, such operations are defined using a classical function $f : \\{0, 1\\}^n \to \\{0, 1\\}^m$, which takes an $n$-bit binary input and produces an $m$-bit binary output. To do so, consider a particular binary input $x = (x_{0}, x_{1}, \dots, x_{n-1})$. You can label qubit states as $\ket{\vec{x}} = \ket{x_{0}} \otimes \ket{x_{1}} \otimes \cdots \otimes \ket{x_{n-1}}$.
You may first attempt to define $O$ so that $O\ket{x} = \ket{f(x)}$, but this method has a couple of problems. First, $f$ may have a different size of input and output ($n \ne m$), such that applying $O$ would change the number of qubits in the register. Second, even if $n = m$, the function may not be invertible: if $f(x) = f(y)$ for some $x \ne y$, then $O\ket{x} = O\ket{y}$ but $O^\dagger O\ket{x} \ne O^\dagger O\ket{y}$. This means you won't be able to construct the adjoint operation $O^\dagger$, and oracles have to have an adjoint defined for them.
Define an oracle by its effect on computational basis states¶
You can deal with both of these problems by introducing a second register of $m$ qubits to hold the answer. Then, define the effect of the oracle on all computational basis states: for all $x \in \\{0, 1\\}^n$ and $y \in \\{0, 1\\}^m$,
$$ \begin{align} O(\ket{x} \otimes \ket{y}) = \ket{x} \otimes \ket{y \oplus f(x)}. \end{align} $$
Now $O = O^\dagger$ by construction and you've resolved both of the earlier problems.
[!TIP] To see that $O = O^{\dagger}$, note that $O^2 = \mathbf{1}$ since $a \oplus b \oplus b = a$ for all $a, b \in \{0, 1\}$. As a result, $O \ket{x} \ket{y \oplus f(x)} = \ket{x} \ket{y \oplus f(x) \oplus f(x)} = \ket{x} \ket{y}$.
Importantly, defining an oracle this way for each computational basis state $\ket{x}\ket{y}$ also defines how $O$ acts for any other state. This behavior follows immediately from the fact that $O$, like all quantum operations, is linear in the state that it acts on. Consider the Hadamard operation, for example, which is defined by $H \ket{0} = \ket{+}$ and $H \ket{1} = \ket{-}$. If you wish to know how $H$ acts on $\ket{+}$, you can use that $H$ is linear,
$$ \begin{align} H\ket{+} & = \frac{1}{\sqrt{2}} H(\ket{0} + \ket{1}) = \frac{1}{\sqrt{2}} (H\ket{0} + H\ket{1}) \\\\ & = \frac{1}{\sqrt{2}} (\ket{+} + \ket{-}) = \frac12 (\ket{0} + \ket{1} + \ket{0} - \ket{1}) = \ket{0}. \end{align} $$
When defining the oracle $O$, you can similarly use that any state $\ket{\psi}$ on $n + m$ qubits can be written as
$$ \begin{align} \ket{\psi} & = \sum_{x \in \\{0, 1\\}^n, y \in \\{0, 1\\}^m} \alpha(x, y) \ket{x} \ket{y} \end{align} $$
where $\alpha : \\{0, 1\\}^n \times \\{0, 1\\}^m \to \mathbb{C}$ represents the coefficients of the state $\ket{\psi}$. Thus,
$$ \begin{align} O \ket{\psi} & = O \sum_{x \in \\{0, 1\\}^n, y \in \\{0, 1\\}^m} \alpha(x, y) \ket{x} \ket{y} \\\\ & = \sum_{x \in \\{0, 1\\}^n, y \in \\{0, 1\\}^m} \alpha(x, y) O \ket{x} \ket{y} \\\\ & = \sum_{x \in \\{0, 1\\}^n, y \in \\{0, 1\\}^m} \alpha(x, y) \ket{x} \ket{y \oplus f(x)}. \end{align} $$
Phase oracles¶
Alternatively, you can encode $f$ into an oracle $O$ by applying a phase based on the input to $O$. For example, you might define $O$ such that $$ \begin{align} O \ket{x} = (-1)^{f(x)} \ket{x}. \end{align} $$
If a phase oracle acts on a register initially in a computational basis state $\ket{x}$, then this phase is a global phase and hence not observable. But such an oracle can be a powerful resource if applied to a superposition or as a controlled operation. For example, consider a phase oracle $O_f$ for a single-qubit function $f$. Then, $$ \begin{align} O_f \ket{+} & = O_f (\ket{0} + \ket{1}) / \sqrt{2} \\\\ & = ((-1)^{f(0)} \ket{0} + (-1)^{f(1)} \ket{1}) / \sqrt{2} \\\\ & = (-1)^{f(0)} (\ket{0} + (-1)^{f(1) - f(0)} \ket{1}) / \sqrt{2} \\\\ & = (-1)^{f(0)} Z^{f(0) - f(1)} \ket{+}. \end{align} $$
[!NOTE] Note that $Z^{-1}=Z^{\dagger}=Z$ and therefore $Z^{f(0)-f(1)}=Z^{f(1)-f(0)}.$
More generally, both views of oracles can be broadened to represent classical functions, which return real numbers instead of only a single bit.
Choosing the best way to implement an oracle depends heavily on how this oracle is to be used within a given algorithm. For example, Deutsch-Jozsa algorithm relies on the oracle implemented in the first way, while Grover's algorithm relies on the oracle implemented in the second way.
For more information, see the discussion in Gilyén et al. 1711.00465.
The role of T gates and T factories in quantum computing¶
This article describes the role of T gates and T factories in fault tolerant quantum computing. Giving a quantum algorithm, the estimation of required resources for running the T gates and T factories becomes crucial to determine the feasibility of the algorithm. The Azure Quantum Resource Estimator calculates the number of T states needed to run the algorithm, the number of physical qubits for a single T factory, and the runtime of the T factory.
Universal set of quantum gates¶
According to DiVincenzo's criteria a scalable quantum computer must be able to implement a universal set of quantum gates. A universal set contains all the gates needed to perform any quantum computation, that is, any computation must decompose back into a finite sequence of universal gates. At a minimum, a quantum computer must be able to move single qubits to any position on the Bloch Sphere (using single-qubit gates), as well as introduce entanglement in the system, which requires a multi-qubit gate.
There are only four functions that map one bit to one bit on a classical computer. In contrast, there are an infinite number of unitary transformations on a single qubit on a quantum computer. Therefore, no finite set of primitive quantum operations or gates, can exactly replicate the infinite set of unitary transformations allowed in quantum computing. This means, unlike classical computing, it is impossible for a quantum computer to implement every possible quantum program exactly using a finite number of gates. Thus quantum computers cannot be universal in the same sense of classical computers. As a result, when we say that a set of gates is universal for quantum computing we actually mean something slightly weaker than we mean with classical computing.
For universality, it's required that a quantum computer only approximates every unitary matrix within a finite error using a finite length gate sequence.
In other words, a set of gates is a universal gate set if any unitary transformation can be approximately written as a product of gates from this set. It's required that for any prescribed error bound, there exist gates $G_{1}, G_{2}, \ldots, G_N$ from the gate set such that
$$ G_N G_{N-1} \cdots G_2 G_1 \approx U. $$
Because the convention for matrix multiplication is to multiply from right to left the first gate operation in this sequence, $G_N$, is actually the last one applied to the quantum state vector. More formally, such a gate set is said to be universal if for every error tolerance $\epsilon>0$ there exists $G_1, \ldots, G_N$ such that the distance between $G_N\ldots G_1$ and $U$ is at most $\epsilon$. Ideally the value of $N$ needed to reach this distance of $\epsilon$ should scale poly-logarithmically with $1/\epsilon$.
For example, the set formed by Hadamard, CNOT and T gates is a universal set, from which any quantum computation (on any number of qubits) can be generated. The Hadamard and the T gate set generates any single-qubit gate:
$$ H=\frac{1}{\sqrt{2}}\begin{bmatrix} 1 & 1 \\\\ 1 &-1 \end{bmatrix}, \qquad T=\begin{bmatrix} 1 & 0 \\\\ 0 & e^{i\pi/4} \end{bmatrix}. $$
In a quantum computer, quantum gates can be classified into two categories: Clifford gates and non-Clifford gates, in this case, the T gate. Quantum programs made from only Clifford gates can be simulated efficiently using a classical computer, and therefore, non-Clifford gates are required to obtain quantum advantage. In many quantum error correction (QEC) schemes the so-called Clifford gates are easy to implement, that is they require very few resources in terms of operations and qubits to implement fault tolerantly, whereas non-Clifford gates are quite costly when requiring fault tolerance. In a universal quantum gate set, the T gate is commonly used as the non-Clifford gate.
The standard set of single-qubit Clifford gates, included by default in Q#, include
$$ H=\frac{1}{\sqrt{2}}\begin{bmatrix} 1 & 1 \\\\ 1 &-1 \end{bmatrix} , \qquad S =\begin{bmatrix} 1 & 0 \\\\ 0 & i \end{bmatrix}= T^2, \qquad X=\begin{bmatrix} 0 &1 \\\\ 1& 0 \end{bmatrix}= HT^4H, $$
$$ Y = \begin{bmatrix} 0 & -i \\\\ i & 0 \end{bmatrix}=T^2HT^4 HT^6, \qquad Z=\begin{bmatrix}1&0\\\\ 0&-1 \end{bmatrix}=T^4. $$
Together with the non-Clifford gate (the T gate), these operations can be composed to approximate any unitary transformation on a single qubit.
T factories in the Azure Quantum Resource Estimator¶
The non-Clifford T gate preparation is crucial because the other quantum gates are not sufficient for universal quantum computation. To implement non-Clifford operations for practical-scale algorithms, low error rate T gates (or T states) is required. However, they can be difficult to directly implement on logical qubits, and can also be difficult for some physical qubits.
In a fault tolerant quantum computer, the required low error rate T states are produced using a T state distillation factory, or T factory for short. These T factories typically involve a sequence of rounds of distillation, where each round takes in many noisy T states encoded in a smaller distance code, processes them using a distillation unit, and outputs fewer less noisy T states encoded in a larger distance code, with the number of rounds, distillation units, and distances all being parameters which can be varied. This procedure is iterated, where the output T states of one round are fed into the next round as inputs.
Based on the duration of the T factory, the Azure Quantum Resource Estimator determines how often a T factory can be invoked before it exceeds the total runtime of the algorithm, and thus how many T states can be produced during the algorithm runtime. Usually, more T states are required than what can be produced within the invocations of a single T factory during the algorithm runtime. In order to produce more T states, the Resource Estimator uses copies of the T factories.
T factory physical estimation¶
The Resource Estimator calculates the total number of T states needed to run the algorithm, and the number of physical qubits for a single T factory and its runtime.
The goal is to produce all T states within the algorithm runtime with as few T factory copies as possible. The following diagram shows an example of the runtime of the algorithm and the runtime of one T factory. You can see that the runtime of the T factory is shorter than the runtime of the algorithm. In this example, one T factory can distill one T state. Two questions arise:
- How often can the T factory be invoked before the end of the algorithm?
- How many copies of the T factory distillation round are necessary to create the number of T states required during the algorithm's runtime?
Before the end of the algorithm, the T factory can be invoked eight times, which is called a distillation round. For example, if you need 30 T states, a single T factory is invoked eight times during runtime of the algorithm and thus, it creates eight T states. Then you need four copies of the T factory distillation round running in parallel to distill the 30 T states needed.
[!NOTE] Note that T factory copies and T factory invocations aren't the same.
The T state distillation factories are implemented in a sequence of rounds, where each round consists of a set of copies of distillation units running in parallel. The Resource Estimator calculates how many physical qubits are needed to run one T factory and for how long the T factory runs, among other required parameters.
You can only do full invocations of a T factory. Therefore, there may be situations in which the accumulated runtime of all T factory invocations is less than the algorithm runtime. Since qubits are reused by different rounds, the number of physical qubits for one T factory is the maximum number of physical qubits used for one round. The runtime of the T factory is the sum of the runtime in all rounds.
[!NOTE] If the physical T gate error rate is lower than the required logical T state error rate, the Resource Estimator cannot perform a good resource estimation. When you submit a resource estimation job, you may encounter that the T factory cannot be found because the required logical T state error rate is either too low or too high.
For more information, see Appendix C of Assessing requirements to scale to practical quantum advantage.