EEUM researchers prove the impact of quantum variational circuits in the “policy-based” context

HasLab – INESC TEC UMinho researchers have carried out a study that shows the feasibility and usefulness of variational quantum circuits in reinforcement learning, using quantum circuits as the central processing unit responsible for automatic decision-making. It is one of the first research papers to demonstrate the impact of variational quantum circuits in the policy-based context, and it was recently published in the journal Quantum Machine Intelligence.

Variational quantum circuits are frequently used in machine learning, from automatic pattern recognition in data to informed decision-making. However, most research results on these circuits concern supervised or generative learning, and there is little literature on the impact of variational circuits – or even quantum computing – in the context of reinforcement learning.

To address this gap, a group of researchers from HasLab – INESC TEC UMinho sought, on the one hand, to demonstrate the usefulness of these circuits in the domain of reinforcement learning (particularly in the policy-based context) and, on the other hand, to compare them with classical models such as neural networks.

“This paper shows that variational circuits can indeed be used as processing units in the context of reinforcement learning, both in applications with classical and purely quantum data”, explained André Sequeira, a researcher at HasLab – INESC TEC UMinho and student at the School of Engineering of the University of Minho.
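To make the idea of a variational circuit acting as a decision-making unit more concrete, the following is a minimal sketch and not the method used in the paper. It assumes a single-qubit circuit RY(theta) applied to |0>, simulated classically, whose measurement probabilities serve as the policy over two actions. The two-armed bandit environment, the reward probabilities, the learning rate and the function names are all hypothetical choices made for illustration.

```python
import math
import random


def policy(theta):
    """Action probabilities from measuring RY(theta)|0> in the computational basis."""
    # RY(theta)|0> = cos(theta/2)|0> + sin(theta/2)|1>, so P(0) = cos^2(theta/2).
    p0 = math.cos(theta / 2.0) ** 2
    return [p0, 1.0 - p0]


def grad_log_prob(theta, action):
    """Derivative of log pi(action | theta) with respect to theta."""
    half = theta / 2.0
    # d/dtheta log cos^2(theta/2) = -tan(theta/2)
    # d/dtheta log sin^2(theta/2) = 1 / tan(theta/2)
    return -math.tan(half) if action == 0 else 1.0 / math.tan(half)


def train(episodes=2000, learning_rate=0.05, seed=0):
    """REINFORCE on a hypothetical two-armed bandit (illustrative only)."""
    rng = random.Random(seed)
    reward_prob = [0.2, 0.8]   # assumed chance of reward for each action
    theta = math.pi / 2.0      # start from a uniform policy
    baseline = 0.0             # running average reward, used to reduce variance
    for _ in range(episodes):
        probs = policy(theta)
        action = 0 if rng.random() < probs[0] else 1
        reward = 1.0 if rng.random() < reward_prob[action] else 0.0
        # Policy-gradient update: theta += lr * (reward - baseline) * grad log pi
        theta += learning_rate * (reward - baseline) * grad_log_prob(theta, action)
        # Keep theta away from 0 and pi, where the log-probability gradient diverges.
        theta = min(max(theta, 0.1), math.pi - 0.1)
        baseline += 0.05 * (reward - baseline)
    return theta


if __name__ == "__main__":
    learned_theta = train()
    print("learned action probabilities:", policy(learned_theta))
```

In this toy setting the single rotation angle is the only trainable parameter, and the update pushes it towards the action that is rewarded more often. Real applications would use multi-qubit circuits and estimate gradients from measurements on quantum hardware or a simulator.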

The applicability of variational quantum circuits to machines and real problems is still being studied, especially in the field of state control and preparation, but also in the classical domain. André Sequeira pointed out that these algorithms may have a “transformative impact on several areas, like the financial sector or recommendation systems”.

Another relevant result confirms that the models have significantly fewer trainable parameters than the neural networks traditionally used in the problems that served as reference for this research. The same researcher added that the theory makes it possible to prove that these algorithms need a number of samples that scales logarithmically with the total number of trainable parameters.

In other words, as problems become larger, they require correspondingly larger quantum circuits, with more parameters to be optimised. The theory has shown that these models can be trained efficiently in terms of the number of samples, which grows with the size of the circuit in a logarithmic (and not proportional or even exponential) way.
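As a rough illustration of what logarithmic growth means in practice, the short sketch below compares a sample count that grows with the logarithm of the number of parameters against one that grows in proportion to it. The constant factor and the function names are hypothetical, and the numbers are not taken from the paper; only the shape of the growth is the point.

```python
import math


def samples_logarithmic(num_params, constant=100.0):
    """Sample count growing like constant * ln(num_params); constant is an assumption."""
    return math.ceil(constant * math.log(num_params))


def samples_proportional(num_params, constant=100.0):
    """Sample count growing like constant * num_params, shown for contrast."""
    return math.ceil(constant * num_params)


if __name__ == "__main__":
    for num_params in (10, 100, 1000, 10000):
        print(
            num_params,
            samples_logarithmic(num_params),
            samples_proportional(num_params),
        )
```

Under logarithmic growth, squaring the number of parameters only doubles the required samples (up to rounding), whereas under proportional growth it multiplies them by the same factor as the parameter count.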

The scientific paper that stemmed from this research is one of the first in the scientific community to focus on this area of knowledge, showing the impact of variational quantum circuits in the policy-based context of reinforcement learning. For quantum computing, it is also another case study that confirms the feasibility of variational quantum algorithms as an alternative to purely quantum solutions, which are associated with long circuits that are hard to execute.

The paper “Policy Gradients Using Variational Quantum Circuits” was published in the Springer group’s journal Quantum Machine Intelligence, recognised as a Q1-level journal. The work was developed by André Sequeira, researcher at INESC TEC High-Assurance Software Laboratory (HASLab), together with Luís Paulo Santos and Luís Soares Barbosa, who are also HasLab – INESC TEC UMinho researchers and professors at the School of Engineering of the University of Minho.

The researchers mentioned in this news piece are associated with INESC TEC and UMinho.