In the current artificial intelligence (AI) landscape, the buzz around large language models (LLMs) has led to a race toward creating increasingly larger neural networks. However, not every application can support the computational and memory demands of very large deep learning models.
The constraints of these environments have led to some interesting research directions. Liquid neural networks, a novel type of deep learning architecture developed by researchers at MIT's Computer Science and Artificial Intelligence Laboratory (CSAIL), offer a compact, adaptable and efficient solution to certain AI problems. These networks are designed to address some of the inherent challenges of traditional deep learning models.
Liquid neural networks can spur new innovations in AI and are particularly exciting in areas where traditional deep learning models struggle, such as robotics and self-driving cars.
What are liquid neural networks?
“The inspiration for liquid neural networks was thinking about the existing approaches to machine learning and considering how they fit with the kind of safety-critical systems that robots and edge devices offer,” Daniela Rus, the director of MIT CSAIL, told VentureBeat. “On a robot, you cannot really run a large language model because there isn’t really the computation [power] and [storage] space for that.”
Rus and her collaborators wanted to create neural networks that were both accurate and compute-efficient so that they could run on a robot's onboard computers without a connection to the cloud.
At the same time, they were inspired by research on the biological neurons found in small organisms, such as the worm C. elegans, which performs complicated tasks with only 302 neurons. The result of their work was liquid neural networks (LNNs).
Liquid neural networks represent a significant departure from traditional deep learning models. They use a mathematical formulation that is less computationally expensive and stabilizes neurons during training. The key to LNNs' efficiency lies in their use of differential equations whose dynamics adjust with the input, which lets them keep adapting to new situations after training, a capability not found in typical neural networks.
“Basically what we do is increase the representation learning capacity of a neuron over existing models by two insights,” Rus said. “First is a kind of a well-behaved state space model that increases the neuron stability during learning. And then we introduce nonlinearities over the synaptic inputs to increase the expressivity of our model during both training and inference.”
LNNs also use a wiring architecture that is different from traditional neural networks and allows for lateral and recurrent connections within the same layer. The underlying mathematical equations and the novel wiring architecture enable liquid networks to learn continuous-time models that can adjust their behavior dynamically.
“This model is very interesting because it is able to be dynamically adapted after training based on the inputs it sees,” Rus said. “And the time constants that it observes are dependent on the inputs that it sees, and so we have much more flexibility and adaptation through this formulation of the neuron.”
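For readers curious about the underlying math, the liquid time-constant (LTC) formulation the group has published can be sketched roughly as follows (a simplified rendering, with τ, f, A and θ used as generic symbols for the base time constant, the learned nonlinearity, a bias vector and the model parameters):

```latex
\frac{d\mathbf{x}(t)}{dt}
  = -\left[\frac{1}{\tau} + f\big(\mathbf{x}(t), \mathbf{I}(t), t, \theta\big)\right] \odot \mathbf{x}(t)
  + f\big(\mathbf{x}(t), \mathbf{I}(t), t, \theta\big) \odot A
```

Here x(t) is the hidden state, I(t) the incoming signal and f a small learned network applied to the synaptic inputs. Because f depends on the input, each neuron's effective time constant shifts with the data it sees, which is the input-dependent adaptation Rus describes above.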
The advantages of liquid neural networks
One of the most striking features of LNNs is their compactness. For example, a classic deep neural network requires around 100,000 artificial neurons and half a million parameters to perform a task such as keeping a car in its lane. In contrast, Rus and her colleagues were able to train an LNN to accomplish the same task with just 19 neurons.
This significant reduction in size has several important consequences, Rus said. First, it enables the model to run on small computers found in robots and other edge devices. And second, with fewer neurons, the network becomes much more interpretable. Interpretability is a significant challenge in the field of AI. With traditional deep learning models, it can be difficult to understand how the model arrived at a particular decision.
“When we only have 19 neurons, we can extract a decision tree that corresponds to the firing patterns and essentially the decision-making flow in the system with 19 neurons,” Rus said. “We cannot do that for 100,000 or more.”
Another challenge that LNNs address is the issue of causality. Traditional deep learning systems often struggle with understanding causal relationships, leading them to learn spurious patterns that are not related to the problem they are solving. LNNs, on the other hand, appear to have a better grasp of causal relationships, allowing them to better generalize to unseen situations.
For instance, the researchers at MIT CSAIL trained LNNs and several other types of deep learning models for object detection on a stream of video frames taken in the woods in summer. When the trained LNN was tested in a different setting, it was still able to perform the task with high accuracy. In contrast, other types of neural networks experienced a significant performance drop when the setting changed.
“We observed that only the liquid networks were able to still complete the task in the fall and in the winter because these networks focus on the task, not on the context of the task,” Rus said. “The other models did not succeed at solving the task, and our hypothesis is that it’s because the other models rely a lot on analyzing the context of the test, not just the task.”
Attention maps extracted from the models show that LNNs assign higher values to the main focus of the task, such as the road in driving tasks and the target object in object detection, which is why they can adapt when the context changes. Other models tend to spread their attention across irrelevant parts of the input.
“Altogether, we have been able to achieve much more adaptive solutions because you can train in one environment and then that solution, without further training, can be adapted to other environments,” Rus said.
The applications and limitations of liquid neural networks
LNNs are primarily designed to handle continuous data streams, such as video streams, audio streams and sequences of temperature measurements, among other types of time-varying data.
“In general, liquid networks do well when we have time series data … you need a sequence in order for liquid networks to work well,” Rus said. “However, if you try to apply the liquid network solution to some static database like ImageNet, that’s not going to work so well.”
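To make the sequence requirement concrete, here is a minimal, hypothetical sketch of an LTC-style cell unrolled over a stream of inputs. It uses plain NumPy with forward-Euler integration; the weight names, sizes and constants are illustrative assumptions, not the MIT code:

```python
import numpy as np

def ltc_step(x, u, W_in, W_rec, b, tau, A, dt=0.05):
    """One forward-Euler step of a liquid time-constant-style neuron.

    x: hidden state (n,), u: current input (m,).
    The nonlinearity over the synaptic inputs makes the effective
    time constant depend on what the cell currently sees.
    """
    f = np.tanh(W_in @ u + W_rec @ x + b)      # input- and state-dependent gate
    dx = -(1.0 / tau + f) * x + f * A          # LTC-style dynamics
    return x + dt * dx

rng = np.random.default_rng(0)
n, m = 19, 4                                   # e.g. 19 neurons reading 4 sensor features
W_in = rng.normal(scale=0.1, size=(n, m))
W_rec = rng.normal(scale=0.1, size=(n, n))
b, tau, A = np.zeros(n), np.ones(n), np.ones(n)

x = np.zeros(n)
stream = rng.normal(size=(100, m))             # stand-in for a continuous sensor stream
for u in stream:
    x = ltc_step(x, u, W_in, W_rec, b, tau, A) # state evolves step by step over the sequence
```

A static image classifier, by contrast, sees each sample in isolation, so there is no evolving state for the liquid dynamics to act on, which matches Rus's point about datasets like ImageNet.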
The nature and characteristics of LNNs make them especially suitable for computationally constrained and safety-critical applications such as robotics and autonomous vehicles, where data is continuously fed to machine learning models.
The MIT CSAIL team has already tested LNNs in single-robot settings, where they have shown promising results. In the future, they plan to extend their tests to multi-robot systems and other types of data to further explore the capabilities and limitations of LNNs.