Computerworld

Military-funded robots can learn by watching YouTube

So far, the robots are only teaching themselves to cook
  • Tim Hornyak (IDG News Service)
  • 30 January, 2015 23:22
University of Maryland computer scientist Yiannis Aloimonos (center) is helping robots such as Rethink Robotics' Baxter learn new skills by watching YouTube videos.

Those fearing the rise of an all-powerful artificial intelligence like Skynet, take note: Robots are now learning by watching YouTube.

Depending on your views of the video-sharing service, that can be hilarious or frightening. But so far, the machines are just watching cooking videos, according to researchers backed by the U.S. Defense Advanced Research Projects Agency (DARPA).

Computer scientists at the University of Maryland have succeeded in getting humanoid robots to reproduce what they see in a set of YouTube cooking clips, including recognizing, grabbing and using the right kitchen tools.

Part of DARPA's Mathematics of Sensing, Exploitation and Execution program, the research involves getting the machines to understand what's happening in a scene, not just recognize the objects within it.

More significantly, the machines were able to decide on their own which combination of the observed motions would most efficiently accomplish the task at hand.

"Cooking is complex in terms of manipulation, the steps involved and the tools you use," University of Maryland Computer Lab director Yiannis Aloimonos said in a posting on the University's website. "If you want to cut a cucumber, for example, you need to grab the knife, move it into place, make the cut and observe the results to make sure you did them properly."

For the lower-level tasks in the cooking experiment, the team used convolutional neural networks (CNNs), learning frameworks based on biological models, to process the visual data. One CNN classified hand grasps, while another handled object recognition.
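The paper describes a more elaborate pipeline, but the division of labor can be sketched in a few lines. The minimal example below, assuming PyTorch and purely illustrative class counts, layer sizes and 64x64 input crops, shows the general idea of running one small CNN for grasp classification and a second for object recognition; it is not the researchers' code.

```python
# Hypothetical sketch: two independent CNN classifiers over cropped video
# frames, one for grasp types and one for objects. Sizes are illustrative.
import torch
import torch.nn as nn

def make_cnn(num_classes: int) -> nn.Module:
    """A tiny convolutional classifier for 64x64 RGB crops."""
    return nn.Sequential(
        nn.Conv2d(3, 16, kernel_size=3, padding=1), nn.ReLU(),
        nn.MaxPool2d(2),                        # 64x64 -> 32x32
        nn.Conv2d(16, 32, kernel_size=3, padding=1), nn.ReLU(),
        nn.MaxPool2d(2),                        # 32x32 -> 16x16
        nn.Flatten(),
        nn.Linear(32 * 16 * 16, num_classes),   # class scores
    )

grasp_cnn = make_cnn(num_classes=6)    # e.g. power vs. precision grasps
object_cnn = make_cnn(num_classes=12)  # e.g. knife, cucumber, bowl ...

crop = torch.randn(1, 3, 64, 64)            # one cropped video frame
grasp_id = grasp_cnn(crop).argmax(dim=1)    # predicted grasp type
object_id = object_cnn(crop).argmax(dim=1)  # predicted object
```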

On top of these, the robots classified the individual actions they saw into what the researchers describe as words in a sentence. The machines were then able to put the individual vocabulary units together into goal-oriented sentences.
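The "sentence" stage can likewise be illustrated with a short, hypothetical sketch: per-segment predictions are treated as vocabulary items, here simple (action, tool, target) triples, and chained into a goal-oriented command sequence. The tokens and notation are assumptions for illustration, not drawn from the paper.

```python
# Hypothetical sketch: chaining recognized action "words" into a
# goal-oriented "sentence". The tokens below are illustrative.
from collections import namedtuple

Word = namedtuple("Word", ["action", "tool", "target"])

# Example per-segment predictions from the vision modules.
observed = [
    Word(action="grasp", tool="hand",  target="knife"),
    Word(action="move",  tool="knife", target="cucumber"),
    Word(action="cut",   tool="knife", target="cucumber"),
]

def to_sentence(words):
    """Join action words into one goal-oriented command sequence."""
    return " -> ".join(f"{w.action}({w.tool}, {w.target})" for w in words)

print(to_sentence(observed))
# grasp(hand, knife) -> move(knife, cucumber) -> cut(knife, cucumber)
```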

The researchers recorded recognition accuracy of 79 percent for objects, 91 percent for grasp types and 83 percent for predicted actions, according to their paper on the experiment.

"The ability to learn actions from human demonstrations is one of the major challenges for the development of intelligent systems," Aloimonos and colleagues, including a researcher from the National Information Communications Technology Research Centre of Excellence in Australia (NICTA), wrote in the paper, which doesn't discuss whether the robots reached the stage of actually being able to cook and whether their cooking is edible or not.

"Our ultimate goal is to build a self-learning robot that is able to enrich its knowledge about fine-grained manipulation actions by 'watching' demo videos."

The work was presented this week at a meeting of the Association for the Advancement of Artificial Intelligence in Austin, Texas.

Tim Hornyak covers Japan and emerging technologies for The IDG News Service. Follow Tim on Twitter at @robotopia.