This research aims to increase our understanding and our mathematical control of the “natural” (i.e. “spontaneous/emergent”) information-processing skills shown by Artificial Intelligence (AI), namely by neural networks and learning machines. Indeed, AI is experiencing a “magic moment”: theorists finally have access to the “big data” needed to train these networks, so that their capabilities can be checked concretely.
Among a plethora of variations on the theme, a family of algorithms collectively termed “Deep Learning” is showing impressive successes in several fields, ranging from scientific applications (e.g. statistical learning and feature extraction from high-dimensional data for health care) to more applied ones (e.g. image and video processing and/or natural language processing). As an immediate consequence of these recent triumphs (see e.g. [1]), the quest for a deeper (mathematical) control of these systems keeps growing, and we aim to contribute to the construction of a “rationale” for Deep Learning by taking advantage of methods and techniques typical of Theoretical Physics.
Indeed, in the past decades Theoretical Physics has been heavily involved in the mathematical formalization of the emergent/spontaneous properties shown by neural networks, such as distributed memory, pattern classification, feature extraction, multitasking capabilities and much more. The bulk of the contributions came from Statistical Mechanics and Stochastic Processes: the former has been used to paint the “phase diagrams” of crucial networks in statistical learning (e.g. restricted Boltzmann machines) as well as in pattern recognition (e.g. Hopfield neural networks), while the latter has been naturally adapted to describe the dynamical evolution of the (artificial) neurons and synapses building up the aforementioned networks.
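To fix notation (these are the standard textbook definitions, recalled here only for the reader's convenience): for $N$ binary neurons $\sigma_i=\pm 1$ storing $P$ patterns $\xi^{\mu}\in\{-1,+1\}^{N}$, the Hopfield cost function, the Mattis overlaps and a Glauber-like stochastic update rule read

$$
H_N(\sigma|\xi) \;=\; -\frac{1}{2N}\sum_{i,j=1}^{N}\sum_{\mu=1}^{P}\xi_i^{\mu}\xi_j^{\mu}\,\sigma_i\sigma_j \;=\; -\frac{N}{2}\sum_{\mu=1}^{P} m_{\mu}^{2},
\qquad
m_{\mu} \;=\; \frac{1}{N}\sum_{i=1}^{N}\xi_i^{\mu}\sigma_i ,
$$

$$
\mathbb{P}\big(\sigma_i \to -\sigma_i\big) \;=\; \frac{1}{2}\Big[1-\tanh\!\big(\beta\,\sigma_i h_i(\sigma)\big)\Big],
\qquad
h_i(\sigma) \;=\; \frac{1}{N}\sum_{\mu=1}^{P}\xi_i^{\mu}\sum_{j\neq i}\xi_j^{\mu}\sigma_j ,
$$

where $\beta$ tunes the noise level and $\alpha=P/N$ the load of the network; the phase diagrams mentioned above are drawn precisely in the plane of these control parameters.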
We have proved that the mathematical framework(s) stemming from Theoretical Physics can be enlarged so as to include classical and relativistic mechanics too: in particular, for these models (i.e. Boltzmann machines and Hopfield networks), the variational principle usually underlying the minimization of a cost function (in their learning/retrieval algorithms) can be shown to coincide sharply with the Least Action Principle. As a natural consequence, the equations for the evolution of the order parameters (e.g. the Mattis overlaps with the stored patterns) in the space of the tunable parameters (e.g. noise level, load of the network) coincide with the equations of motion prescribed by Lagrangian Mechanics, and this allows drawing a number of conclusions.
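Schematically (the notation here is ours and is meant only as a reminder of the classical statement): if the order parameters, e.g. the Mattis overlaps, are read as generalized coordinates $q_{\mu}(t)$ and the tunable parameters provide the effective “time” $t$, the correspondence identifies the extremization of the cost functional with the Least Action Principle,

$$
\delta \int_{t_1}^{t_2} L\big(q(t),\dot{q}(t),t\big)\,dt \;=\; 0
\quad\Longrightarrow\quad
\frac{d}{dt}\,\frac{\partial L}{\partial \dot{q}_{\mu}} \;-\; \frac{\partial L}{\partial q_{\mu}} \;=\; 0 ,
$$

so that the evolution of the order parameters is governed by Euler-Lagrange equations of motion in the space of the control parameters.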
First, this bridge between the mathematics involved in a rationale for AI and Lagrangian mechanics makes it possible to import an arsenal of mathematical weapons ready to be used by researchers working on machine learning and neural networks: for instance, dynamical instabilities of the network’s evolution can now be inspected by classical Hopf bifurcation theory, and conserved quantities, if present, can be studied by inspecting symmetries à la Noether.
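As a concrete instance of the latter (this is the standard statement of Noether's theorem, recalled only for convenience): if the Lagrangian is left invariant by a continuous transformation $q_{\mu}\to q_{\mu}+\varepsilon\,\delta q_{\mu}$, then

$$
Q \;=\; \sum_{\mu}\frac{\partial L}{\partial \dot{q}_{\mu}}\,\delta q_{\mu},
\qquad
\frac{dQ}{dt}\;=\;0 \ \text{ along the equations of motion},
$$

i.e. $Q$ is conserved, which is exactly the kind of quantity alluded to above.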
Then, focusing on the current priority of AI research, namely Deep Learning, the perspective we offer makes it clear that Boltzmann machines and Hopfield networks play solely the role of the “classical limit” (storing just the pairwise correlation functions of the learnt/retrieved patterns of information) of a much broader theory (i.e., the relativistic extension), where all the higher-order correlation functions are properly accounted for. It is worth pointing out that the relativistic generalization shows several “deep-learning-like” skills: beyond developing the general mechanical approach to neural networks, in the paper we have extensively shown (both analytically and numerically) how the relativistic extension outperforms its “classical limit”.
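To illustrate this point, assume (purely as a toy example; the specific form and all parameter values below are our own illustrative choices, not those of the paper) that the relativistic cost function takes the kinetic-energy-like square-root form $E_{\mathrm{rel}}(\sigma) = -N\sqrt{1+\sum_{\mu} m_{\mu}^{2}}$. Its Taylor expansion, $-N\sqrt{1+\sum_{\mu} m_{\mu}^{2}} \approx -N\big(1+\tfrac{1}{2}\sum_{\mu} m_{\mu}^{2}-\tfrac{1}{8}\big(\sum_{\mu} m_{\mu}^{2}\big)^{2}+\dots\big)$, reproduces the classical pairwise term at leading order and then systematically adds higher-order correlation terms. A minimal Monte Carlo sketch that lets one compare pattern retrieval under the two cost functions at a finite noise level is:

```python
import numpy as np

# Minimal sketch (illustrative parameters only): single-spin Metropolis dynamics
# at inverse noise level beta, using either the classical Hopfield cost function
#     E_cl(sigma)  = -(N/2) * sum_mu m_mu^2
# or the square-root ("relativistic") form assumed above
#     E_rel(sigma) = -N * sqrt(1 + sum_mu m_mu^2),
# with Mattis overlaps m_mu = (1/N) * sum_i xi_i^mu * sigma_i.

rng = np.random.default_rng(0)
N, P, beta, sweeps = 400, 20, 2.0, 30      # neurons, patterns, inverse noise, MC sweeps
xi = rng.choice([-1, 1], size=(P, N))      # random binary patterns

def energy(sigma, relativistic):
    m2 = np.sum((xi @ sigma / N) ** 2)     # sum of squared Mattis overlaps
    return -N * np.sqrt(1.0 + m2) if relativistic else -0.5 * N * m2

def run(sigma0, relativistic, seed=1):
    rng_mc = np.random.default_rng(seed)   # same random stream for both cost functions
    sigma, e = sigma0.copy(), energy(sigma0, relativistic)
    for _ in range(sweeps):
        for i in rng_mc.permutation(N):    # one Metropolis sweep over all neurons
            sigma[i] *= -1                 # propose a spin flip
            e_new = energy(sigma, relativistic)
            if e_new <= e or rng_mc.random() < np.exp(-beta * (e_new - e)):
                e = e_new                  # accept the flip
            else:
                sigma[i] *= -1             # reject: undo the flip
    return sigma

# Initial state: pattern 0 corrupted by flipping 25% of its entries.
flip = rng.random(N) < 0.25
sigma0 = np.where(flip, -xi[0], xi[0])

for rel in (False, True):
    m0 = float(xi[0] @ run(sigma0, rel)) / N
    print(("relativistic" if rel else "classical") + f" retrieval overlap: {m0:+.3f}")
```

Both runs use the same random stream, so any difference in the final overlap with the retrieved pattern comes from the cost function alone, i.e. from the higher-order terms carried by the square-root form.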
Next steps in this branch of research will be achieved through a systematic exploration of the proposed relation between AI and Lagrangian Mechanics, with the hope that this analogy can act as a little Pandora's box: we plan to report our findings soon.
Reference:
- [1] LeCun, Yann, Yoshua Bengio, and Geoffrey Hinton. “Deep learning.” Nature 521.7553 (2015): 436–444.