A Breakthrough for A.I. Technology: Passing an 8th Grade Science Test


SAN FRANCISCO — Four years ago, more than 700 computer scientists competed in a contest to build artificial intelligence that could pass an eighth-grade science test. There was $80,000 in prize money on the line.

They all flunked. Even the most sophisticated system couldn’t do better than 60 percent on the test. A.I. couldn’t match the language and logic skills that students are expected to have when they enter high school.

But on Wednesday, the Allen Institute for Artificial Intelligence, a prominent lab in Seattle, unveiled a new system that passed the test with room to spare. It correctly answered more than 90 percent of the questions on an eighth-grade science test and more than 80 percent on a 12th-grade exam.

The system, called Aristo, is an indication that in just the past several months researchers have made significant progress in developing A.I. that can understand languages and mimic the logic and decision-making of humans.

The world’s top research labs are rapidly improving a machine’s ability to understand and respond to natural language. Machines are getting better at analyzing documents, finding information, answering questions and even generating language of their own.

Aristo was built solely for multiple-choice tests. It took standard exams written for students in New York, though the Allen Institute removed all questions that included pictures and diagrams. Answering questions like that would have required additional skills that combine language understanding and logic with so-called computer vision.

Some test questions, like this one from the eighth-grade exam, required little more than information retrieval:

A group of tissues that work together to perform a specific function is called:

(1) an organ

(2) an organism

(3) a system

(4) a cell
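To make "information retrieval" concrete, here is a minimal sketch of how a lookup-style question like the one above could be answered by matching words against reference text. It is an illustration only, not Aristo's actual method: the tiny `CORPUS`, the `STOPWORDS` list and the `score` and `answer` functions are all hypothetical names invented for this example, and a real system would search a far larger body of text with a learned model rather than simple word overlap.

```python
import re

# Hypothetical mini-corpus standing in for a science reference (an assumption,
# not the data Aristo used).
CORPUS = [
    "A group of tissues that work together to perform a specific function is called an organ.",
    "A cell is the basic unit of structure and function in living things.",
    "An organism is an individual living thing.",
    "Organs that work together form an organ system.",
]

# Very common words that carry little meaning for matching.
STOPWORDS = {"a", "an", "the", "of", "that", "to", "is", "in", "and"}


def tokens(text):
    """Lowercase the text and return its set of non-stopword words."""
    return {w for w in re.findall(r"[a-z]+", text.lower()) if w not in STOPWORDS}


def score(question, option):
    """Best word-overlap between (question + option) and any corpus sentence
    that mentions every word of the option."""
    option_words = tokens(option)
    query = tokens(question) | option_words
    best = 0.0
    for sentence in CORPUS:
        words = tokens(sentence)
        if not option_words <= words:
            continue  # sentence does not mention this answer option at all
        best = max(best, len(query & words) / len(query))
    return best


def answer(question, options):
    """Pick the option whose supporting sentence overlaps the question most."""
    return max(options, key=lambda option: score(question, option))


question = ("A group of tissues that work together to perform "
            "a specific function is called:")
options = ["an organ", "an organism", "a system", "a cell"]
print(answer(question, options))  # prints "an organ"
```

Word matching of this kind works only when the answer is stated almost verbatim somewhere in the reference text, which is why it is of little help on questions that call for reasoning.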

But others, like this question from the same exam, required logic:

Which change would most likely cause a decrease in the number of squirrels living in an area?

(1) a decrease in the number of predators

(2) a decrease in competition between the squirrels

(3) an increase in available food

(4) an increase in the number of forest fires

Researchers at the Allen Institute started work on Aristo — they wanted to build a “digital Aristotle” — in 2013, just after the lab was founded by the Seattle billionaire and Microsoft co-founder Paul Allen. They saw standardized science tests as a more meaningful alternative to typical A.I. benchmarks, which relied on games like chess and backgammon or tasks created solely for machines.

A science test isn’t something that can be mastered just by learning rules. It requires making connections using logic. An increase in forest fires, for example, could kill squirrels or decrease the food supply needed for them to thrive and reproduce.

The Allen Institute built its Aristo system on top of Bert, a language technology developed by Google. Researchers fed Bert a wide range of questions and answers. In time, it learned to answer similar questions on its own.

Not long ago, researchers at the lab defined the behavior of their test-taking system one line of software code at a time. Sometimes they still do that painstaking coding. But now that the system can learn from digital data on its own, it can improve at a much faster rate.

Systems like Bert, called “language models,” now drive a wide range of research projects, including conversational systems and tools designed to identify false news. With more data and more computing power, researchers believe, the technology will continue to improve.

But Oren Etzioni, who oversees the Allen Institute, stressed that the future of these systems was hard to predict and that language was only one piece of the puzzle.

Ms. Liu and her fellow Microsoft researchers have tried to build a system that can pass the Graduate Record Examination (GRE), the test required for admission to graduate school.

The language section was doable, she said, but building the reasoning skills required for the math section was another matter. “It was far too challenging.”



Source link Nytimes.com
