The researchers, from the University of Cambridge, programmed their robotic chef with a ‘cookbook’ of eight simple salad recipes. After watching a video of a human demonstrating one of the recipes, the robot was able to identify which recipe was being prepared and make it.
In addition, the videos helped the robot incrementally add to its cookbook. At the end of the experiment, the robot came up with a ninth recipe on its own. Their results, reported in the journal IEEE Access, demonstrate how video content can be a valuable and rich source of data for automated food production, and could enable easier and cheaper deployment of robot chefs.
Robotic chefs have been featured in science fiction for decades, but in reality, cooking is a challenging problem for a robot. Several commercial companies have built prototype robot chefs, although none of these are currently commercially available, and they lag well behind their human counterparts in terms of skill.
Human cooks can learn new recipes through observation, whether that’s watching another person cook or watching a video on YouTube, but programming a robot to make a range of dishes is costly and time-consuming.
“We wanted to see whether we could train a robot chef to learn in the same incremental way that humans can – by identifying the ingredients and how they go together in the dish,” said Grzegorz Sochacki from Cambridge’s Department of Engineering, the paper’s first author.
Sochacki, a PhD candidate in Professor Fumiya Iida’s Bio-Inspired Robotics Laboratory, and his colleagues devised eight simple salad recipes and filmed themselves making them. They then used a publicly available neural network to train their robot chef. The neural network had already been trained to identify a range of different objects, including the fruits and vegetables used in the eight salad recipes (broccoli, carrot, apple, banana and orange).
Using computer vision techniques, the robot analysed each frame of video and was able to identify the different objects and features, such as a knife and the ingredients, as well as the human demonstrator’s arms, hands and face. Both the recipes and the videos were converted to vectors, and the robot performed mathematical operations on the vectors to determine the similarity between a demonstration and each recipe in its cookbook.
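To make the vector comparison concrete, here is a minimal sketch in Python. It is not the researchers' implementation: the encoding (one ingredient count per vector position), the choice of cosine similarity as the comparison, and the two cookbook entries are all assumptions made for illustration. Only the five ingredient names come from the study.

```python
import math

# Ingredient names are from the article; the count-based encoding is an assumption.
INGREDIENTS = ["broccoli", "carrot", "apple", "banana", "orange"]

def to_vector(counts):
    """Turn an {ingredient: count} mapping into a fixed-length list of floats."""
    return [float(counts.get(name, 0)) for name in INGREDIENTS]

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors; 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

def best_match(demo, cookbook):
    """Return (recipe_name, score) for the cookbook entry most similar to demo."""
    scored = [(name, cosine_similarity(demo, vec)) for name, vec in cookbook.items()]
    return max(scored, key=lambda pair: pair[1])

# Hypothetical cookbook entries, invented for this example.
cookbook = {
    "apple_carrot": to_vector({"apple": 2, "carrot": 2}),
    "banana_orange": to_vector({"banana": 1, "orange": 2}),
}

# A demonstration in which one carrot was missed by the detector.
demo = to_vector({"apple": 2, "carrot": 1})
name, score = best_match(demo, cookbook)
print(name, round(score, 3))
```

Even with one ingredient missed, the demonstration vector still points in nearly the same direction as the intended recipe, which is one plausible reason a similarity-based comparison can tolerate imperfect detection.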
By correctly identifying the ingredients and the actions of the human chef, the robot could determine which of the recipes was being prepared. The robot could infer that if the human demonstrator was holding a knife in one hand and a carrot in the other, the carrot would then get chopped up.
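The knife-and-carrot inference can be illustrated as a simple rule over the objects detected in each hand. This is a sketch under stated assumptions, not the method from the paper: the function `infer_action` and the `TOOL_ACTIONS` mapping are hypothetical names introduced here.

```python
# Hypothetical mapping from a detected tool to the action it implies.
TOOL_ACTIONS = {"knife": "chop"}

# Ingredient names are from the article.
INGREDIENTS = {"broccoli", "carrot", "apple", "banana", "orange"}

def infer_action(left_hand, right_hand):
    """Guess the upcoming action from what each hand holds.

    Each argument is a detected object label or None for an empty hand.
    Returns (action, ingredient) when exactly one tool and one ingredient
    are held, otherwise None.
    """
    held = [obj for obj in (left_hand, right_hand) if obj is not None]
    tools = [obj for obj in held if obj in TOOL_ACTIONS]
    foods = [obj for obj in held if obj in INGREDIENTS]
    if len(tools) == 1 and len(foods) == 1:
        return (TOOL_ACTIONS[tools[0]], foods[0])
    return None

print(infer_action("knife", "carrot"))  # a knife plus a carrot implies chopping the carrot
print(infer_action("carrot", None))     # no tool in hand, so no action is inferred
```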
Of the 16 videos it watched, the robot recognised the correct recipe 93% of the time, even though it only detected 83% of the human chef’s actions. The robot was also able to detect that slight variations in a recipe, such as making a double portion or normal human error, were variations and not a new recipe. The robot also correctly recognised the demonstration of a new, ninth salad, added it to its cookbook and made it.
“It’s amazing how much nuance the robot was able to detect,” said Sochacki. “These recipes aren’t complex – they’re essentially chopped fruits and vegetables, but it was really effective at recognising, for example, that two chopped apples and two chopped carrots is the same recipe as three chopped apples and three chopped carrots.”
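One way to see why a double portion can count as the same recipe while a genuinely different salad is treated as new: a direction-based measure such as cosine similarity ignores overall scale. The sketch below assumes that measure and an arbitrary threshold of 0.9; neither is taken from the paper, and `recognise_or_learn` is a hypothetical helper. Vectors list counts in the order broccoli, carrot, apple, banana, orange.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors; 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

def recognise_or_learn(demo, cookbook, threshold=0.9):
    """Match demo against the cookbook, or add it as a new recipe.

    Returns (recipe_name, is_new). The threshold is an assumed value.
    """
    best_name, best_score = None, 0.0
    for name, vec in cookbook.items():
        score = cosine_similarity(demo, vec)
        if score > best_score:
            best_name, best_score = name, score
    if best_name is not None and best_score >= threshold:
        return best_name, False
    new_name = f"recipe_{len(cookbook) + 1}"
    cookbook[new_name] = list(demo)  # incremental learning: store the new recipe
    return new_name, True

# Hypothetical cookbook with one recipe: two carrots and two apples.
cookbook = {"recipe_1": [0.0, 2.0, 2.0, 0.0, 0.0]}

# Three carrots and three apples: same proportions, so same recipe.
print(recognise_or_learn([0.0, 3.0, 3.0, 0.0, 0.0], cookbook))

# Broccoli, banana and orange: no overlap, so it is stored as a new recipe.
print(recognise_or_learn([1.0, 0.0, 0.0, 1.0, 1.0], cookbook))
```

The threshold controls the trade-off: set it too low and distinct salads collapse into one recipe, too high and ordinary human error starts producing spurious new entries.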
The videos used to train the robot chef are not like the food videos made by some social media influencers, which are full of fast cuts and visual effects, and quickly move back and forth between the person preparing the food and the dish they’re preparing. For example, the robot would struggle to identify a carrot if the human demonstrator had their hand wrapped around it – for the robot to identify the carrot, the human demonstrator had to hold up the carrot so that the robot could see the whole vegetable.
“Our robot isn’t interested in the sorts of food videos that go viral on social media – they’re simply too hard to follow,” said Sochacki. “But as these robot chefs get better and faster at identifying ingredients in food videos, they might be able to use sites like YouTube to learn a whole range of recipes.”
The research was supported in part by Beko plc and the Engineering and Physical Sciences Research Council (EPSRC), part of UK Research and Innovation (UKRI).
Reference:
Grzegorz Sochacki et al. ‘Recognition of Human Chef’s Intentions for Incremental Learning of Cookbook by Robotic Salad Chef.’ IEEE Access (2023). DOI: 10.1109/ACCESS.2023.3276234
Researchers have trained a robotic ‘chef’ to watch and learn from cooking videos, and recreate the dish itself.
The text in this work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.