Top 5 ways to make better AI with less data
1. Transfer Learning
Transfer learning is used a lot in machine learning these days because the benefits are big. The general idea is simple: you train a big neural network on a general task with a lot of data and a lot of training. When you then have a specific problem, you sort of “cut the end off” the big network and train a few new layers with your own data. The big network already understands a lot of general patterns, so with transfer learning you don’t have to teach the network those patterns again.
A good example: say you want to train a network to recognize images of different dog breeds. Without transfer learning you need a lot of data, maybe 100,000 images of different breeds, since the network has to learn everything from scratch. If you train a new model with transfer learning, you might only need 50 images of every breed.
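As a concrete illustration, here is a minimal transfer-learning sketch in PyTorch/torchvision (the framework, the ResNet-18 backbone and the 120-breed head are my own illustrative choices, not something from the original post): an ImageNet-pretrained network is frozen so its general patterns are kept, and only a small new output layer is trained on your own images.

```python
import torch.nn as nn
from torchvision import models

# Illustrative sketch: reuse an ImageNet-pretrained ResNet-18 and only train
# a new final layer for (say) 120 dog breeds on a small dataset.
backbone = models.resnet18(weights="IMAGENET1K_V1")
for param in backbone.parameters():
    param.requires_grad = False                           # keep the general patterns frozen
backbone.fc = nn.Linear(backbone.fc.in_features, 120)     # new, trainable classification head
# From here, train only backbone.fc on your own ~50 images per breed.
```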
You can read more about Transfer Learning here.
2. Active learning
Active learning is a data collection strategy that lets you pick the data your AI models would benefit the most from during training. Let’s stick with the dog breed example. You have trained a model that can differentiate between breeds, but for some reason the model always has trouble identifying German Shepherds. With an active learning strategy you would, automatically or at least through an established process, pick out these images and send them for labelling.
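A minimal sketch of one common active-learning recipe, uncertainty sampling (the model API with predict_proba is an assumption on my part, in the style of scikit-learn): score the unlabelled pool by the model’s confidence and send the least confident images, such as the troublesome German Shepherds, off for labelling.

```python
import numpy as np

def select_for_labelling(model, unlabelled_images, budget=50):
    # model.predict_proba is assumed to return one probability per class per image
    probs = model.predict_proba(unlabelled_images)        # shape: (n_images, n_classes)
    confidence = probs.max(axis=1)                        # probability of the top class
    most_uncertain = np.argsort(confidence)[:budget]      # least confident images first
    return [unlabelled_images[i] for i in most_uncertain]
```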
I made a longer post about how active learning works here.
3. Better data
I’ve put in a strategy here that might sound obvious but is sometimes overlooked. With better quality data you often need far less of it, since the AI does not have to train through the same amount of noise and wrong signals. In the media, AI is often talked about as “with a lot of data you can do anything”. But in many cases, making an extra effort to get rid of bad data and make sure that only correctly labelled data is used for training makes more sense than going for more data.
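One small, hedged example of what “better data” can mean in practice (the file and column names are hypothetical): dropping examples whose labels conflict, so the model never trains on contradictory signals.

```python
import pandas as pd

# Hypothetical labels.csv with columns: image_id, label
df = pd.read_csv("labels.csv")
labels_per_image = df.groupby("image_id")["label"].nunique()
conflicting_ids = labels_per_image[labels_per_image > 1].index
clean_df = df[~df["image_id"].isin(conflicting_ids)]      # keep only consistently labelled data
```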
4. GANs
GANs, or Generative Adversarial Networks, are a way of building neural networks that sounds almost futuristic in its design. Basically, this kind of neural network is built by having two networks compete against each other in a game where one network creates new fake training examples from the dataset and the other tries to guess which data is fake and which is real. The network building fake data is called the generator and the network trying to tell fake from real is called the discriminator. This is a deep learning approach, and both networks keep improving during the game. When the generator is so good at generating fake data that the discriminator consistently has trouble separating fake from real, we have a finished model.
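To make the game concrete, here is a toy generator/discriminator pair in PyTorch (architecture and dimensions are purely illustrative): the generator turns random noise into fake samples and the discriminator outputs the probability that a sample is real.

```python
import torch
import torch.nn as nn

generator = nn.Sequential(
    nn.Linear(64, 128), nn.ReLU(),
    nn.Linear(128, 784), nn.Tanh(),          # e.g. a flattened 28x28 image
)
discriminator = nn.Sequential(
    nn.Linear(784, 128), nn.LeakyReLU(0.2),
    nn.Linear(128, 1), nn.Sigmoid(),         # probability that the input is real
)

fake = generator(torch.randn(16, 64))        # the generator proposes fake data
realness = discriminator(fake)               # the discriminator tries to call its bluff
```

During training, the two networks are optimized with opposing losses until the discriminator can no longer reliably tell fake from real.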
For GANs you still need a lot of data, but you don’t need as much labelled data, and since it’s usually the labelling that is the costly part, you can save time and money on your data with this approach.
5. Probabilistic Programming
One of my very favorite technologies. Probabilistic programming has a lot of benefits, and one of them is that you can often get away with using less data. The reason is simply that you build “priors” into your models. That means you can code your domain knowledge into the model and let the data take it from there. In many other machine learning approaches, everything has to be learned by the model from scratch, no matter how obvious it is.
A good example here is document data capture models. In many cases the data we are looking for is given away by the keyword to the left of it, like the common format “ID number: #number#”. With probabilistic programming you can tell the model before training that you expect the data to be to the right of the keyword. A neural network taught from scratch has to learn this on its own, requiring more data.
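As a loose sketch of what “coding in a prior” can look like, here is a hypothetical PyMC model (library choice, numbers and variable names are all mine): the belief that the value sits roughly 100 pixels to the right of its keyword is written down as a prior, and a handful of labelled documents then refine it.

```python
import pymc as pm

# Observed horizontal offsets (value x-position minus keyword x-position), in pixels
observed_offsets = [110.0, 95.0, 130.0, 102.0]

with pm.Model():
    offset = pm.Normal("offset", mu=100.0, sigma=50.0)   # prior: value ~100 px to the right
    noise = pm.HalfNormal("noise", sigma=25.0)
    pm.Normal("obs", mu=offset, sigma=noise, observed=observed_offsets)
    trace = pm.sample(1000, progressbar=False)           # posterior belief after a little data
```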
You can also read more about probabilistic programming here: https://www.danrose.ai/blog/63qy8s3vwq8p9mogsblddlti4yojon
Robot takes contact-free measurements of patients’ vital signs
By Anne Trafton
During the current coronavirus pandemic, one of the riskiest parts of a health care worker’s job is assessing people who have symptoms of Covid-19. Researchers from MIT, Boston Dynamics, and Brigham and Women’s Hospital hope to reduce that risk by using robots to remotely measure patients’ vital signs.
The robots, which are controlled by a handheld device, can also carry a tablet that allows doctors to ask patients about their symptoms without being in the same room.
“In robotics, one of our goals is to use automation and robotic technology to remove people from dangerous jobs,” says Henwei Huang, an MIT postdoc. “We thought it should be possible for us to use a robot to remove the health care worker from the risk of directly exposing themselves to the patient.”
Using four cameras mounted on a dog-like robot developed by Boston Dynamics, the researchers have shown that they can measure skin temperature, breathing rate, pulse rate, and blood oxygen saturation in healthy patients, from a distance of 2 meters. They are now making plans to test it in patients with Covid-19 symptoms.
“We are thrilled to have forged this industry-academia partnership in which scientists with engineering and robotics expertise worked with clinical teams at the hospital to bring sophisticated technologies to the bedside,” says Giovanni Traverso, an MIT assistant professor of mechanical engineering, a gastroenterologist at Brigham and Women’s Hospital, and the senior author of the study.
The researchers have posted a paper on their system on the preprint server techRxiv, and have submitted it to a peer-reviewed journal. Huang is one of the lead authors of the study, along with Peter Chai, an assistant professor of emergency medicine at Brigham and Women’s Hospital, and Claas Ehmke, a visiting scholar from ETH Zurich.
Measuring vital signs
When Covid-19 cases began surging in Boston in March, many hospitals, including Brigham and Women’s, set up triage tents outside their emergency departments to evaluate people with Covid-19 symptoms. One major component of this initial evaluation is measuring vital signs, including body temperature.
The MIT and BWH researchers came up with the idea to use robotics to enable contactless monitoring of vital signs, to allow health care workers to minimize their exposure to potentially infectious patients. They decided to use existing computer vision technologies that can measure temperature, breathing rate, pulse, and blood oxygen saturation, and worked to make them mobile.
To achieve that, they used a robot known as Spot, which can walk on four legs, similarly to a dog. Health care workers can maneuver the robot to wherever patients are sitting, using a handheld controller. The researchers mounted four different cameras onto the robot — an infrared camera plus three monochrome cameras that filter different wavelengths of light.
The researchers developed algorithms that allow them to use the infrared camera to measure both elevated skin temperature and breathing rate. For body temperature, the camera measures skin temperature on the face, and the algorithm correlates that temperature with core body temperature. The algorithm also takes into account the ambient temperature and the distance between the camera and the patient, so that measurements can be taken from different distances, under different weather conditions, and still be accurate.
Measurements from the infrared camera can also be used to calculate the patient’s breathing rate. As the patient breathes in and out, wearing a mask, their breath changes the temperature of the mask. Measuring this temperature change allows the researchers to calculate how rapidly the patient is breathing.
The three monochrome cameras each filter a different wavelength of light — 670, 810, and 880 nanometers. These wavelengths allow the researchers to measure the slight color changes that result when hemoglobin in blood cells binds to oxygen and flows through blood vessels. The researchers’ algorithm uses these measurements to calculate both pulse rate and blood oxygen saturation.
“We didn’t really develop new technology to do the measurements,” Huang says. “What we did is integrate them together very specifically for the Covid application, to analyze different vital signs at the same time.”
Continuous monitoring
In this study, the researchers performed the measurements on healthy volunteers, and they are now making plans to test their robotic approach in people who are showing symptoms of Covid-19, in a hospital emergency department.
While in the near term, the researchers plan to focus on triage applications, in the longer term, they envision that the robots could be deployed in patients’ hospital rooms. This would allow the robots to continuously monitor patients and also allow doctors to check on them, via tablet, without having to enter the room. Both applications would require approval from the U.S. Food and Drug Administration.
The research was funded by the MIT Department of Mechanical Engineering and the Karl van Tassel (1925) Career Development Professorship, and Boston Dynamics.
Can RL from pixels be as efficient as RL from state?
By Misha Laskin, Aravind Srinivas, Kimin Lee, Adam Stooke, Lerrel Pinto, Pieter Abbeel
A remarkable characteristic of human intelligence is our ability to learn tasks quickly. Most humans can learn reasonably complex skills like tool-use and gameplay within just a few hours, and understand the basics after only a few attempts. This suggests that data-efficient learning may be a meaningful part of developing broader intelligence.
On the other hand, Deep Reinforcement Learning (RL) algorithms can achieve superhuman performance on games like Atari, Starcraft, Dota, and Go, but require large amounts of data to get there. Achieving superhuman performance on Dota took over 10,000 human years of gameplay. Unlike simulation, skill acquisition in the real world is constrained to wall-clock time. In order to see similar breakthroughs to AlphaGo in real-world settings, such as robotic manipulation and autonomous vehicle navigation, RL algorithms need to be data-efficient — they need to learn effective policies within a reasonable amount of time.
To date, it has been commonly assumed that RL operating on coordinate state is significantly more data-efficient than pixel-based RL. However, coordinate state is just a human crafted representation of visual information. In principle, if the environment is fully observable, we should also be able to learn representations that capture the state.
Recent advances in data-efficient RL
Recently, there have been several algorithmic advances in Deep RL that have improved learning policies from pixels. The methods fall into two categories: (i) model-free algorithms and (ii) model-based (MBRL) algorithms. The main difference between the two is that model-based methods learn a forward transition model $p(s_{t+1} \mid s_t, a_t)$ while model-free ones do not. Learning a model has several distinct advantages: it is possible to use the model to plan through action sequences, generate fictitious rollouts as a form of data augmentation, and temporally shape the latent space.
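For illustration, a forward transition model can be as simple as a network that maps the current state and action to a prediction of the next state; the sketch below is mine and the dimensions are arbitrary, not taken from any of the cited papers.

```python
import torch.nn as nn

state_dim, action_dim = 50, 6                # illustrative sizes
forward_model = nn.Sequential(
    nn.Linear(state_dim + action_dim, 256), nn.ReLU(),
    nn.Linear(256, state_dim),               # deterministic stand-in for p(s_{t+1} | s_t, a_t)
)
```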
However, a distinct disadvantage of model-based RL is complexity. Model-based methods operating on pixels require learning a model, an encoding scheme, a policy, various auxiliary tasks such as reward prediction, and stitching these parts together to make a whole algorithm. Visual MBRL methods have a lot of moving parts and tend to be less stable. On the other hand, model-free methods such as Deep Q Networks (DQN), Proximal Policy Optimization (PPO), and Soft Actor-Critic (SAC) learn a policy in an end-to-end manner optimizing for one objective. While traditionally, the simplicity of model-free RL has come at the cost of sample-efficiency, recent improvements have shown that model-free methods can in fact be more data-efficient than MBRL and, more surprisingly, result in policies that are as data efficient as policies trained on coordinate state. In what follows we will focus on these recent advances in pixel-based model-free RL.
Why now?
Over the last few years, two trends have converged to make data-efficient visual RL possible. First, end-to-end RL algorithms have become increasingly more stable through algorithms like the Rainbow DQN, TD3, and SAC. Second, there has been tremendous progress in label-efficient learning for image classification using contrastive unsupervised representations (CPCv2, MoCo, SimCLR) and data augmentation (MixUp, AutoAugment, RandAugment). In recent work from our lab at BAIR (CURL, RAD), we combined contrastive learning and data augmentation techniques from computer vision with model-free RL to show significant data-efficiency gains on common RL benchmarks like Atari, DeepMind control, ProcGen, and OpenAI gym.
Contrastive Learning in RL Setting
CURL was inspired by recent advances in contrastive representation learning in computer vision (CPC, CPCv2, MoCo, SimCLR). Contrastive learning aims to maximize / minimize similarity between two similar / dissimilar representations of an image. For example, in MoCo and SimCLR, the objective is to maximize agreement between two data-augmented versions of the same image and minimize it between all other images in the dataset, where optimization is performed with a Noise Contrastive Estimation loss. Through data augmentation, these representations internalize powerful inductive biases about invariance in the dataset.
In the RL setting, we opted for a similar approach and adopted the momentum contrast (MoCo) mechanism, a popular contrastive learning method in computer vision that uses a moving average of the query encoder parameters (momentum) to encode the keys to stabilize training. There are two main differences in setup: (i) the RL dataset changes dynamically and (ii) visual RL is typically performed on stacks of frames to access temporal information like velocities. Rather than separating contrastive learning from the downstream task as done in vision, we learn contrastive representations jointly with the RL objective. Instead of discriminating across single images, we discriminate across the stack of frames.
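A simplified sketch of this setup in PyTorch, paraphrasing the CURL paper rather than reproducing the authors’ code: two augmented crops of the same frame stack are encoded as query and key, the key encoder is a momentum (exponential moving average) copy of the query encoder, and an InfoNCE-style loss treats the matching pair as the positive.

```python
import torch
import torch.nn.functional as F

def curl_loss(query_encoder, key_encoder, W, obs_anchor, obs_positive):
    # obs_anchor / obs_positive: two augmented crops of the same frame stack
    q = query_encoder(obs_anchor)                          # (batch, dim)
    with torch.no_grad():
        k = key_encoder(obs_positive)                      # momentum encoder, no gradients
    logits = q @ W @ k.t()                                 # bilinear similarity scores (batch, batch)
    labels = torch.arange(q.size(0), device=q.device)      # positives lie on the diagonal
    return F.cross_entropy(logits, labels)                 # InfoNCE-style objective

def momentum_update(query_encoder, key_encoder, tau=0.05):
    # the key encoder trails the query encoder as an exponential moving average
    for p_q, p_k in zip(query_encoder.parameters(), key_encoder.parameters()):
        p_k.data.mul_(1 - tau).add_(tau * p_q.data)
```

In CURL this contrastive loss is simply added to the usual RL losses and minimized jointly.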
By combining contrastive learning with Deep RL in the above manner we found, for the first time, that pixel-based RL can be nearly as data-efficient as state-based RL on the DeepMind control benchmark suite. In the figure below, we show learning curves for DeepMind control tasks where contrastive learning is coupled with SAC (red) and compared to state-based SAC (gray).
We also demonstrate data-efficiency gains on the Atari 100k step benchmark. In this setting, we couple CURL with an Efficient Rainbow DQN (Eff. Rainbow) and show that CURL outperforms the prior state-of-the-art (Eff. Rainbow, SimPLe) on 20 out of 26 games tested.
RL with Data Augmentation
Given that random cropping was a crucial component in CURL, it is natural to ask — can we achieve the same results with data augmentation alone? In Reinforcement Learning with Augmented Data (RAD), we performed the first extensive study of data augmentation in Deep RL and found that for the DeepMind control benchmark the answer is yes. Data augmentation alone can outperform prior competing methods, match, and sometimes surpass the efficiency of state-based RL. Similar results were also shown in concurrent work – DrQ.
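For reference, random crop in this context is tiny to implement; the sketch below is an illustrative version (not the authors’ code) that crops a whole frame stack at the same location so temporal information is preserved.

```python
import numpy as np

def random_crop(frame_stack, output_size=84):
    # frame_stack: (channels, height, width); all stacked frames share one crop window
    _, h, w = frame_stack.shape
    top = np.random.randint(0, h - output_size + 1)
    left = np.random.randint(0, w - output_size + 1)
    return frame_stack[:, top:top + output_size, left:left + output_size]
```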
We found that RAD also improves generalization on the ProcGen game suite, showing that data augmentation is not limited to improving data-efficiency but also helps RL methods generalize to test-time environments.
If data augmentation works for pixel-based RL, can it also improve state-based methods? We introduced a new state-based augmentation — random amplitude scaling — and showed that simple RL with state-based data augmentation achieves state-of-the-art results on OpenAI gym environments and outperforms more complex model-free and model-based RL algorithms.
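As I read the RAD paper, random amplitude scaling simply rescales the state by a random factor; a minimal version (the range below is illustrative) looks like this:

```python
import numpy as np

def random_amplitude_scaling(state, low=0.6, high=1.4):
    # scale the state vector by a factor drawn uniformly from [low, high]
    return state * np.random.uniform(low, high)
```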
Contrastive Learning vs Data Augmentation
If data augmentation with RL performs so well, do we need unsupervised representation learning? RAD outperforms CURL because it only optimizes for what we care about, which is the task reward. CURL, on the other hand, jointly optimizes the reinforcement and contrastive learning objectives. If the metric used to evaluate and compare these methods is the score attained on the task at hand, a method that purely focuses on reward optimization is expected to be better, as long as it implicitly ensures similarity consistencies on the augmented views.
However, many problems in RL cannot be solved with data augmentations alone. For example, RAD would not be applicable to environments with sparse-rewards or no rewards at all, because it learns similarity consistency implicitly through the observations coupled to a reward signal. On the other hand, the contrastive learning objective in CURL internalizes invariances explicitly and is therefore able to learn semantic representations from high dimensional observations gathered from any rollout regardless of the reward signal. Unsupervised representation learning may therefore be a better fit for real-world tasks, such as robotic manipulation, where the environment reward is more likely to be sparse or absent.
This post is based on the following papers:
- CURL: Contrastive Unsupervised Representations for Reinforcement Learning
Michael Laskin*, Aravind Srinivas*, Pieter Abbeel
Thirty-seventh International Conference on Machine Learning (ICML), 2020.
arXiv, Project Website
- Reinforcement Learning with Augmented Data
Michael Laskin*, Kimin Lee*, Adam Stooke, Lerrel Pinto, Pieter Abbeel, Aravind Srinivas
arXiv, Project Website
References
- Hafner et al. Learning Latent Dynamics for Planning from Pixels. ICML 2019.
- Hafner et al. Dream to Control: Learning Behaviors by Latent Imagination. ICLR 2020.
- Kaiser et al. Model-Based Reinforcement Learning for Atari. ICLR 2020.
- Lee et al. Stochastic Latent Actor-Critic: Deep Reinforcement Learning with a Latent Variable Model. arXiv 2019.
- Henaff et al. Data-Efficient Image Recognition with Contrastive Predictive Coding. ICML 2020.
- He et al. Momentum Contrast for Unsupervised Visual Representation Learning. CVPR 2020.
- Chen et al. A Simple Framework for Contrastive Learning of Visual Representations. ICML 2020.
- Kostrikov et al. Image Augmentation Is All You Need: Regularizing Deep Reinforcement Learning from Pixels. arXiv 2020.
This article was initially published on the BAIR blog, and appears here with the authors’ permission.
What is artificial intelligence? – The pragmatic definition
There’s no definitive or academically correct definition of artificial intelligence. The common ones usually define AI as computer models that perform tasks in a human-like manner and simulate intelligent behaviour.
The truth is that for each expert in the field of AI you ask about the definition, you will get a new variation of it. In some cases you might even get a very religious schooling on how it’s only AI if the system contains deep learning models. Ask many startup founders, and their systems “contain AI” even when it’s just a very simple regression algorithm like we have had in Excel for ages. In fact, a VC fund recently found that less than half of startups claiming to be AI startups actually have any kind of AI.
So when is something AI? My definition is pretty pragmatic. For me, AI has a set of common features, and if they are present, I would define the system as AI. Especially when working with AI, these features will be very much in your face. In other words, if it looks like a duck, swims like a duck, and quacks like a duck, then it probably is a duck. The same goes for AI.
The common features I see:
- The system simulates a human act or task.
- The learning comes from examples instead of instructions. This entails the need for data.
- Before the system is ready, you cannot predict that input X will give you exactly output Y, only approximately.
So to me it’s not really important which underlying algorithm does the magic. If the common features are present, it will feel the same to use and to develop, and the challenges you face will be the same.
It might seem like I’m taking AI to a simpler and less impactful place with this definition, but this is just a scratch on the surface. The underlying human, technical and organizational challenges of working with AI are enormous.