When our robot overlords arrive, will they decide to kill us or co-operate with us?

New research from DeepMind, Alphabet Inc.’s London-based artificial intelligence unit, could ultimately shed light on this fundamental question.

They have been investigating the conditions in which reward-optimizing beings, whether human or robot, would choose to co-operate rather than compete. The answer could have implications for how computer intelligence may eventually be deployed to manage complex systems such as an economy, city traffic flows or environmental policy.

Joel Leibo, the lead author of a paper DeepMind published online Thursday, said in an email that his team’s research indicates that whether agents learn to co-operate or compete depends strongly on the environment in which they operate.

While the research has no immediate real-world application, it could help DeepMind design artificial intelligence agents that work together in environments with imperfect information. In the future, such work could help these agents navigate a world full of intelligent entities, both human and machine, whether in transport networks or stock markets.

DeepMind’s paper describes how researchers used two different games to investigate how software agents learn to compete or co-operate.

In the first, two of these agents had to maximize the number of apples they could gather in a two-dimensional digital environment. Researchers could vary how frequently apples appeared. They found that when apples were scarce, the agents quickly learned to attack one another, zapping, or "tagging," their opponent with a ray that temporarily immobilized them. When apples were abundant, the agents preferred to coexist more peacefully.
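The mechanics described above can be sketched as a toy simulation. Everything here (class name, the `respawn_prob` scarcity knob, the freeze duration) is an illustrative assumption, not DeepMind's actual environment, which is a richer two-dimensional grid world:

```python
import random

class Gathering:
    """Toy sketch of the apple-gathering game described above.
    Names and parameters are illustrative, not DeepMind's code."""

    def __init__(self, respawn_prob=0.5, freeze_steps=3, seed=0):
        self.rng = random.Random(seed)
        self.respawn_prob = respawn_prob  # scarcity knob: lower = scarcer apples
        self.freeze_steps = freeze_steps  # how long a tagged agent stays immobilized
        self.apple_present = True
        self.frozen = {0: 0, 1: 0}       # remaining frozen steps per agent
        self.scores = {0: 0, 1: 0}

    def step(self, actions):
        """actions: dict mapping agent id (0 or 1) to 'gather' or 'tag'."""
        for agent, action in actions.items():
            if self.frozen[agent] > 0:        # an immobilized agent does nothing
                self.frozen[agent] -= 1
                continue
            if action == 'tag':               # zap the opponent with the beam
                self.frozen[1 - agent] = self.freeze_steps
            elif action == 'gather' and self.apple_present:
                self.scores[agent] += 1       # reward for collecting the apple
                self.apple_present = False
        if not self.apple_present and self.rng.random() < self.respawn_prob:
            self.apple_present = True         # apple respawns at the given rate
        return dict(self.scores)
```

The point of the toy version is only to show the trade-off the agents face: a step spent tagging is a step not spent gathering, but it also locks the opponent out of the next few apples.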

Rather chillingly, however, when the researchers ran this same game with more intelligent agents drawing on larger neural networks (a kind of machine intelligence designed to mimic how certain parts of the human brain work), they found that those agents would "try to tag the other agent more frequently, i.e. behave less co-operatively, no matter how we vary the scarcity of apples," they wrote in a blog post on DeepMind's website.

In a second game, called Wolfpack, the AI agents played wolves that had to learn to capture “prey.” Success resulted in a reward not just for the wolf making the capture, but for all wolves present within a certain radius of the capture. The more wolves present in this capture radius, the more points all the wolves would receive.
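The shared-reward rule described above can be sketched in a few lines of Python. The function name, parameters and radius value are illustrative assumptions, not DeepMind's actual implementation:

```python
def wolfpack_rewards(wolf_positions, capture_pos, capture_radius=2.0, base_reward=1.0):
    """Sketch of Wolfpack's shared-reward rule: every wolf within
    capture_radius of the capture earns a reward, and the reward
    grows with the number of wolves present in that radius."""
    def dist(a, b):
        return ((a[0] - b[0]) ** 2 + (a[1] - b[1]) ** 2) ** 0.5

    nearby = [w for w in wolf_positions if dist(w, capture_pos) <= capture_radius]
    reward = base_reward * len(nearby)   # more wolves nearby -> bigger payoff each
    return [reward if w in nearby else 0.0 for w in wolf_positions]
```

Under a rule like this, a lone hunter earns less than a wolf that waits for the pack, which is why staying close to the other agents pays off.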

In this game, the agents generally learned to co-operate. Unlike in the apple-gathering game, in Wolfpack the more cognitively advanced the agent was, the better it learned to co-operate. The researchers postulate that this is because in the apple-gathering game the zapping behaviour was the more complex one, requiring the agent to aim the beam at its opponent, while in Wolfpack co-operation was the more complex behaviour.

The researchers speculated that the less sophisticated artificial intelligence systems had more difficulty mastering these complex behaviours, and so could not learn to use them effectively.

DeepMind, which Google purchased in 2014, is best known for having created an artificial intelligence that can beat the world's top human players in the ancient Asian strategy game Go. In November, DeepMind announced it was working with Blizzard Entertainment Inc., the division of Activision Blizzard that makes the video game StarCraft II, to turn that game into a platform for AI research.

Leibo said that the agents used in the apple-gathering and Wolfpack experiments had no short-term memory, and as a result could not make any inferences about the intent of the other agent. "Going forward it would be interesting to equip agents with the ability to reason about other agents' beliefs and goals," he said.

In the meantime, it might be wise to keep a few spare apples around.