Quantum computing is a buzzword that gets thrown around quite a bit. Unfortunately, despite its virality in pop culture and quasi-scientific Internet communities, its capabilities remain quite limited.
As a very new field, quantum computing presents a complete paradigm shift from the traditional model of classical computing. Classical bits — which can be 0 or 1 — are replaced in quantum computing with qubits, which instead hold a probability of being measured as 0 or 1.
Relying on the quirks of physics at a very, very small scale, a qubit is forced into a state of 0 or 1, with a certain probability of each, every time it is measured. For instance, if a qubit is in a state with a 0.85:0.15 probability split, we would expect it to measure zero about 85% of the time and one about 15% of the time.
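This measurement behavior is easy to simulate classically. A minimal sketch — the `measure_qubit` function and its parameters are illustrative stand-ins, not part of any quantum library:

```python
import random

def measure_qubit(p_zero, shots=10_000, seed=0):
    """Classically simulate repeated measurements of a qubit that
    collapses to 0 with probability p_zero and to 1 otherwise."""
    rng = random.Random(seed)
    zeros = sum(1 for _ in range(shots) if rng.random() < p_zero)
    return zeros / shots, (shots - zeros) / shots

frac_zero, frac_one = measure_qubit(0.85)
print(frac_zero, frac_one)  # close to 0.85 and 0.15
```

Over many shots, the measured fractions converge to the underlying probabilities, which is exactly why quantum algorithms typically repeat measurements many times.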
Although quantum computing still has a long way to go, machine learning is an especially promising potential avenue. To get a simple grasp of the computing power quantum computing could offer, consider this:
- A qubit can hold both 0 and 1 at once. So, two qubits together can hold information about four states — 00, 01, 10, and 11 — three qubits about eight states, and so on.
- Hence — at least, theoretically — it takes on the order of 2ⁿ classical values to represent the information stored in n qubits.
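The 2ⁿ scaling follows from the fact that an n-qubit state is described by one value per basis state. A quick sketch:

```python
def state_vector_size(n_qubits):
    # An n-qubit state is described by 2**n amplitudes, one per
    # basis state (00...0 through 11...1).
    return 2 ** n_qubits

for n in (1, 2, 3, 10):
    print(n, state_vector_size(n))  # 2, 4, 8, 1024
```

Ten qubits already correspond to 1,024 basis states; fifty would correspond to over a quadrillion, which is what makes the scaling so tantalizing.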
Beyond this, the fluid probabilistic nature of quantum circuits may offer unique advantages to deep learning, which gains its power from the probabilistic flow and transformation of information through networks.
Quantum machine learning is catching on. TensorFlow, Google’s popular deep learning framework, relatively recently launched TensorFlow Quantum.
This article will introduce quantum variations of three machine learning methods and algorithms: transfer learning, k-means, and the convolutional neural network. It will attempt to do so with as little quantum knowledge as needed, and to demonstrate some important considerations when designing quantum applications of machine learning.
Quantum Transfer Learning
Transfer learning is perhaps one of the biggest successes of deep learning. Given that deep learning models take a tremendous amount of time to train, transfer learning offers a valuable way to speed up training. Furthermore, the model often arrives at a better solution using transfer learning than if it were trained from scratch.
As an idea, transfer learning is relatively simple — a “base model”, which shall be denoted A, is trained on a generic task. Then, an additional block of layers, which shall be denoted B, is appended to A. Often, the last few layers of A will be chopped off before B is added. Afterwards, the model is “fine-tuned” on the specific dataset, where A’ (the modified A) provides a filter of sorts for B to extract meaningful information relevant to the specific task at hand.
We can formalize this idea of building a “hybrid neural network” for transfer learning as follows:
- Train a generic network A on a generic dataset to perform a generic task (predict a certain label).
- Take a section A’ of the generic network A, and attach a new block B to A’. While A’ is pre-trained and hence should be frozen (made untrainable), B is trainable.
- Train this A’B hybrid model on a specific dataset to perform a specific task.
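The freeze-and-fine-tune recipe above can be sketched classically. In this toy example, the `Layer` class and its `trainable` flag are illustrative stand-ins for a real framework's API, with A’ frozen and B trainable:

```python
class Layer:
    def __init__(self, weight, trainable):
        self.weight = weight
        self.trainable = trainable

    def forward(self, x):
        return self.weight * x

def trainable_parameters(model):
    # Only layers marked trainable would receive gradient updates
    # during fine-tuning on the specific dataset.
    return [layer for layer in model if layer.trainable]

A_prime = [Layer(0.5, trainable=False)]   # pre-trained, frozen
B = [Layer(2.0, trainable=True)]          # new block, trainable
hybrid = A_prime + B                      # the A'B hybrid model

x = 3.0
for layer in hybrid:                      # forward pass through A'B
    x = layer.forward(x)
print(x)                                  # 3.0 * 0.5 * 2.0 = 3.0
print(len(trainable_parameters(hybrid)))  # only B's layer trains
```

In a real framework the frozen block would simply be excluded from the optimizer, but the structure is the same: A’ filters the input, B learns the task-specific mapping.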
Given that there are two components, A’ and B, and each component can be a classical or quantum network, there are four possible types of hybrid neural networks.
- Classical-to-classical (CC). The traditional form of transfer learning.
- Classical-to-quantum (CQ). The classical pre-trained network acts as a filter for the quantum network to use. This method’s practicality is particularly alluring.
- Quantum-to-classical (QC). The quantum pre-trained network acts as a filter for the classical network to use. Perhaps this will be more plausible in the future when quantum computing develops more.
- Quantum-to-quantum (QQ). A completely quantum hybrid network. Likely implausible on a feasible level now, but perhaps will be promising later.
Classical-to-quantum networks are particularly interesting and practical, as large input samples are preprocessed and thinned down to only the most important features. These extracted features can then be post-processed by quantum circuits, which — at the current stage of development — can take in significantly fewer features than classical networks.
On the other hand, quantum-to-classical networks treat the quantum system as the feature extractor, and a classical network is used to further post-process these extracted features. There are two use cases for QC networks.
- The dataset consists of quantum states. For instance, if some information about a quantum state needs to be predicted, the quantum feature-extractor would seem to be the right tool to process the inputs. Alternatively, quantum-mechanical systems like molecules and superconductors can benefit from a quantum feature extractor.
- A very good quantum computer outperforms classical feature extractors.
In tests, the authors find that these quantum-classical hybrid models can attain similar scores to standard completely-classical networks. Given how early quantum computing is, this is indeed promising news.
Quantum Convolutional Neural Networks
Convolutional neural networks have become commonplace in image recognition, along with other use-cases, like signal processing. The size of these networks continues to grow, though, and quantum computing could offer a heavy speedup over classical machine learning methods.
The QCNN algorithm is highly similar to the classical CNN algorithm. However, it’s quite interesting to see some of the other considerations and changes implemented to allow for the quantum method.
First, note that quantum circuits require quantum random access memory, or QRAM. This acts like RAM, but the address and output registers consist of qubits rather than bits. It was developed such that the time to insert, update, or delete any entry in the memory is O(log²(n)).
Consider the forward pass for a convolutional “block” of the design, which is similar to that of a classical CNN, but slightly different.
- Perform the quantum convolution. This is where the quantum operation occurs. This is done in QRAM, and a nonlinearity is applied.
- Quantum sampling. Perform a sampling such that all positions and values can be obtained if their exact value is known with a high probability. Hence, the probabilistic qubit value gets “converted” into a classical form. This is known as quantum tomography.
- QRAM update and pooling. The QRAM needs to be updated, and pooling is done — like the convolution — in the QRAM structure.
The sampling step is the main difference between a classical and a quantum forward pass — sampling is often needed in quantum algorithms both for robustness (quantum states are sensitive and easily disturbed) and for speed.
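The tomography idea can be illustrated classically: estimate each probabilistic output value by repeated sampling, yielding an ordinary classical feature map. A rough sketch, with illustrative names:

```python
import random

def sample_feature_map(probs, shots=5000, seed=1):
    """Classical stand-in for the tomography/sampling step: estimate
    each probabilistic output value by repeated measurement."""
    rng = random.Random(seed)
    estimates = []
    for p in probs:
        hits = sum(1 for _ in range(shots) if rng.random() < p)
        estimates.append(hits / shots)
    return estimates

true_map = [0.2, 0.5, 0.9]
print(sample_feature_map(true_map))  # each entry close to its true value
```

The more shots taken, the more precisely each value is pinned down — which is the sense in which values "known with a high probability" get converted into classical form.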
The speedup of the forward pass for Quantum CNNs compared to classical ones is —
- Exponential in the number of kernels
- Quadratic on the dimensions of the input
That’s a big speedup!
This sampling step, however, comes with the restriction that the nonlinear function must be bounded — it is difficult, especially in the quantum world, to sample from an infinitely large space of possible values. So, the ReLU function may be redefined to be capped at y = 1, making it look more like a flattened version of the sigmoid function.
This indeed is a drawback of sampling, and an interesting demonstration of the tradeoffs present in using quantum algorithms.
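Such a capped ReLU is trivial to write down. A sketch, using the cap of y = 1 mentioned above:

```python
def capped_relu(x, cap=1.0):
    # ReLU bounded above at `cap`, so the nonlinearity's output
    # range stays finite and can be sampled from.
    return min(max(x, 0.0), cap)

print([capped_relu(x) for x in (-2.0, 0.3, 5.0)])  # [0.0, 0.3, 1.0]
```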
Q-Means
To begin with, unlabeled data is accumulating at an unprecedented rate. Labels are expensive, so there is a need to deal with unlabeled data effectively and efficiently. Quantum computing can offer a significant speedup over traditional classical unsupervised learning algorithms, which has large implications for handling this flood of unlabeled information.
The traditional classical k-means algorithm is commonly used for clustering. Using repeated alternation between two steps, the algorithm returns the locations of the “centroids” (center of each cluster):
- Label assignment. Each data point is assigned the label of the closest centroid. (Centroid locations are randomly set initially.)
- Centroid estimation. Update each centroid to be the average of the data points assigned to the corresponding cluster.
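The two alternating steps can be sketched in a few lines of purely classical, one-dimensional code:

```python
def assign_labels(points, centroids):
    # Label assignment: each point goes to its nearest centroid.
    return [min(range(len(centroids)),
                key=lambda j: abs(p - centroids[j])) for p in points]

def update_centroids(points, labels, k):
    # Centroid estimation: each centroid becomes the mean of its cluster.
    centroids = []
    for j in range(k):
        members = [p for p, l in zip(points, labels) if l == j]
        centroids.append(sum(members) / len(members) if members else 0.0)
    return centroids

points = [0.0, 0.2, 0.1, 9.8, 10.0, 10.2]
centroids = [0.0, 5.0]             # toy initialization
for _ in range(5):                 # alternate the two steps
    labels = assign_labels(points, centroids)
    centroids = update_centroids(points, labels, 2)
print(centroids)  # roughly [0.1, 10.0]
```

Real implementations work in many dimensions and handle initialization more carefully, but the two-step alternation is the whole algorithm.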
Consider, now, δ-k-means, which can be thought of as a noisy — but still classical — version of k-means. Assume δ is a preset parameter. The algorithm alternates between the same two steps, with some added noise:
- Label assignment. Each data point is assigned a random centroid whose distance is less than δ. That is, any centroid whose distance from the data point is less than a threshold has an equal chance of assignment.
- Centroid estimation. During the calculation of the location of each centroid, add δ/2 Gaussian noise.
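A rough one-dimensional sketch of the noisy steps, under one plausible reading of the assignment rule (centroids within δ of the minimum distance are equally likely candidates):

```python
import random

def noisy_assign(point, centroids, delta, rng):
    # Any centroid within `delta` of the nearest one is a candidate;
    # pick uniformly at random among the candidates.
    dists = [abs(point - c) for c in centroids]
    best = min(dists)
    candidates = [j for j, d in enumerate(dists) if d <= best + delta]
    return rng.choice(candidates)

def noisy_centroid(members, delta, rng):
    # Centroid estimation with Gaussian noise of scale delta / 2.
    return sum(members) / len(members) + rng.gauss(0.0, delta / 2)

rng = random.Random(0)
print(noisy_assign(0.1, [0.0, 5.0], 0.5, rng))  # 0: only one candidate
```

When δ is small relative to the cluster separation, the noise rarely changes an assignment, which is why δ-k-means can still converge to a good clustering.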
Lastly, consider q-means, which is a truly quantum variant of k-means. As a quick prerequisite, recall that qubits contain probabilities; this makes them especially prone to measurement errors and noise from the environment, as opposed to bits.
- Label assignment. Estimate, via quantum methods, the distance between each data point and each centroid. This quantum distance estimation inherently carries a certain level of noise. Then, assign each data point to its (noisily) closest centroid.
- Centroid estimation. Using the same quantum tomography idea discussed in the sampling step of the QCNN method, states that can be measured correctly with a high probability are “converted” into classical form. There is, again, a certain level of noise inherent in this operation.
q-means seems very similar to k-means. The difference, though, is the noise; the introduction of δ-k-means acts as the “classical version” of q-means that captures that element of noise. The proposers behind q-means prove that analyzing δ-k-means can reveal information about how the q-means algorithm runs.
For instance, when the (non-zero) value of δ is chosen appropriately, the δ-k-means algorithm often converges to a clustering with accuracy similar to — if not better than — that of k-means. Thus, while there is less freedom in choosing the amount of noise in the quantum variant, one can expect q-means to perform comparably to k-means.
Similarly, the δ-k-means algorithm is polylogarithmic in its running time. The q-means algorithm, then, is also polylogarithmic, a speedup over the k-means algorithm allowed for by introducing some error and relaxing stricter and more precise calculations.
Currently, q-means is too complex for quantum simulators or current quantum computers to test. However, via the δ-k-means algorithm, there is empirical evidence that q-means can perform at a level generally similar to k-means.
What is the purpose of quantum clustering, then? Further research may allow for clustering of quantum states or data, as well as spatial clustering of molecules and other very small phenomena — a very important task. In general, quantum methods seem to have some potential to surpass classical methods at traditional tasks as well.