May 11, 1995
Vol. 14, No. 17


    Teach your robot well: Chip learns to recycle

    In the Animate Agent Laboratory on the second floor of Ryerson Hall, Chip the robot can be found standing at attention, cables snaking from underneath his metallic belly to the computers ringing the room. Aluminum can in "hand," the robot is surrounded by the latest tools of its trade: trash bins -- some empty, some overflowing with aluminum cans, others piled high with wads of paper. Chip is learning to recycle.

    This brainchild of the Artificial Intelligence Laboratory of the Computer Science Department is preparing for an international robot competition, and recycling is considered a state-of-the-art task for animate agents.

    The task at last year's competition, in which 12 robots from universities around the world competed, was to pick up a piece of trash from the floor and throw it in a waste bin. Chip was the only robot actually to pick up the trash and put it in the bin, though communication problems slowed him down, and he was penalized for the time he took to complete the task. Another robot was able to identify the trash on the floor, but politely asked a bystander to pick it up. Still another entry, a team of three robots, pushed the trash across the floor toward the bin. Chip ended up in fifth place, but this year his team is determined to be at the top of the class in the competition, which will be held this summer in Montreal.

    Michael Swain, Associate Professor in Computer Science and one of Chip's designers, likened taking Chip through the competition to being on the pit crew at a car race. "You try to make sure that your hardware makes it all the way through the contest," he said, "and that's almost more difficult than building it in the first place!"

    Chip's "body" is cobbled together from make-do parts, but it's the software that is the key ingredient. Swain and Chip's co-creator, James Firby, Associate Professor in Computer Science, have created an integrated system for Chip, making him a sensing and planning robot that can adapt to a changing environment. The combination of vision, visual recognition and planning algorithms makes Chip one of the best integrated systems in the world. "We don't need to do state-of-the-art mechanics -- Chip can't juggle, for example -- but his software is very sophisticated," Firby said.

    Teaching animate agents to operate in a human environment is not as easy as it seems. "A good deal of what's hard about this is getting the robot to do the kinds of things that people do without thinking about them," Firby said. "For example, seeing something on the floor for a person is effortless. But for a robot, it's a very hard thing to do."

    Industrial robots, which operate in a fixed environment, don't need to be able to sense anything -- their environments can be carefully controlled, so that finding something doesn't involve actually looking for it. "If they try to pick something up, it had better be where it's supposed to be," Firby said. Sensing, navigating and adapting to a changing environment are the main challenges for robots trying to operate in a natural, human-populated environment.

    One challenge for the computer scientist is to overcome the misconceptions people have about robots they see in movies and on television -- Star Wars' C-3PO, for example -- that seem to be able to do everything except experience human emotions. "So many people have seen robots in movies do so much more than robots in artificial-intelligence labs that it's hard to convince people that the robots in movies are not actually doing all that they appear to be doing," Swain said.

    Firby said, "We're primarily interested in two things. First, how to perceive the world so that a robot can tell what's going on. And second, how to organize plans and knowledge so that a robot can react in a reasonable sort of way."

    For Chip, the first step is vision. For the trash and recycling tasks, he has to be able to find something on the floor. With the two cameras mounted on his "head" for stereo perception, Chip looks down to receive an image. Video signals are fed into a computer, and a monitor lets Firby and Swain see what Chip is "seeing." Chip processes the image, looking for edges, where light and dark patches meet -- at a very basic level, this is how the human eye works -- and groups the edges into shapes. The floor itself doesn't show up as long as it doesn't have any detectable edges.
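    The article doesn't give Chip's actual algorithm, but the edge-finding idea it describes -- marking places where light and dark patches meet, so that a featureless floor disappears -- can be sketched in a few lines. This is an illustration only; the threshold and image format are invented.

```python
# Minimal sketch of edge detection by intensity differences -- an
# illustration of the idea described above, not Chip's actual code.

def find_edges(image, threshold=50):
    """Mark pixels where brightness changes sharply from a neighbor.

    `image` is a 2-D list of grayscale values (0 = dark, 255 = light).
    Returns a same-sized grid of booleans: True where an edge was found.
    """
    rows, cols = len(image), len(image[0])
    edges = [[False] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            # Compare each pixel with its right and lower neighbors.
            for dr, dc in ((0, 1), (1, 0)):
                nr, nc = r + dr, c + dc
                if nr < rows and nc < cols:
                    if abs(image[r][c] - image[nr][nc]) > threshold:
                        edges[r][c] = True
    return edges

# A uniform floor produces no edges; a bright can on a dark floor does.
floor = [[30] * 5 for _ in range(5)]
scene = [row[:] for row in floor]
scene[2][2] = 220            # one bright "can" pixel
assert not any(any(row) for row in find_edges(floor))
assert any(any(row) for row in find_edges(scene))
```

    Grouping the marked pixels into connected shapes would be the next step; the key point is that a floor with no detectable intensity changes simply vanishes from the edge map.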

    "This works pretty well, until you're on a carpet with a pattern, or a tile floor. Then we would have to look for changes in color or changes in texture," Firby said. "Right now we're pretty much relying on the trash to be on a floor that doesn't have any edges.

    "For a person, this sort of knowledge about how the world works -- that the floor is down, a carpet has a regular pattern -- is intuitive. So when a robot can see an object on the floor, people might say, what's the big deal? Anyone could see that. But to get Chip to see something on the floor is really a very complex task," Firby said.

    Once Chip sees something on the floor, he compares it with models of things he knows are trash. His stereo vision allows him to determine the distance to the object. The number of lines, or edges, on an object helps him determine whether it is a crumpled piece of paper or an aluminum can.
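    The two cues in that paragraph -- stereo disparity for distance, edge count for a rough paper-versus-can guess -- can be sketched as follows. The focal length, camera baseline, and edge-count cutoff are invented numbers for illustration, not Chip's real calibration.

```python
# Sketch of the two cues described above: stereo disparity for distance,
# and edge count for a rough paper-vs-can guess. The focal length,
# baseline and edge-count cutoff here are invented for illustration.

def stereo_distance(x_left, x_right, focal_px=500.0, baseline_m=0.2):
    """Distance from the disparity between the two camera images.

    The same point appears at pixel column x_left in one camera and
    x_right in the other; the shift (disparity) shrinks with distance:
        Z = focal_length * baseline / disparity
    """
    disparity = x_left - x_right
    if disparity <= 0:
        raise ValueError("object at infinity or mismatched points")
    return focal_px * baseline_m / disparity

def classify_trash(edge_count, cutoff=12):
    """Crumpled paper shows many short edges; a can shows few long ones."""
    return "paper" if edge_count > cutoff else "can"

assert abs(stereo_distance(320, 270) - 2.0) < 1e-9   # 500 * 0.2 / 50 = 2 m
assert classify_trash(30) == "paper"
assert classify_trash(6) == "can"
```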

    Chip then decides how best to move toward the trash so that he can pick it up. "It turns out that moving is another task that seems like it ought to be pretty simple," said Firby. "And as long as nothing else is moving, it is. But if there are people who are going to step into the way, that presents a problem."

    Chip has to sense his environment continually, using sonar to find objects that may lie between him and his objective, Firby explained.
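    The continual-sensing idea Firby describes -- re-checking the sonar at every step rather than planning a route once -- might look like this in miniature. The direction names, readings, and stop distance are invented for illustration.

```python
# Sketch of continual obstacle checking with sonar while moving toward a
# goal. The readings and stop distance are invented; the point is the
# sense-act loop: re-check the sensors at every step rather than plan once.

def step_toward_goal(sonar_readings_m, stop_distance_m=0.5):
    """Decide one motion step from the current ring of sonar readings.

    `sonar_readings_m` maps a direction name to the nearest echo in
    meters. Returns "advance" when the way ahead is clear, otherwise a
    turn away from the blocked side.
    """
    if sonar_readings_m["front"] > stop_distance_m:
        return "advance"
    # Something (a person, a chair) is in the way: turn toward open space.
    if sonar_readings_m["left"] > sonar_readings_m["right"]:
        return "turn_left"
    return "turn_right"

assert step_toward_goal({"front": 3.0, "left": 1.0, "right": 1.0}) == "advance"
assert step_toward_goal({"front": 0.3, "left": 2.0, "right": 0.4}) == "turn_left"
```

    Calling this on every control cycle is what lets the robot cope with a person stepping into its path after the original route was chosen.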

    "People can tell that a chair isn't going to move toward them, but Chip has no intuition about those kinds of things," he said. "People do all of that stuff rather effortlessly, but Chip has trouble telling whether an object is even there on the ground. These kinds of things are where all the real challenges are for us."

    Chip aligns himself with the trash on the floor and reaches down to pick it up. Then he looks for a trash bin, scanning the room and comparing the objects he sees with stored models of a trash can until he finds a match. As long as Chip has seen a trash bin before, he can recognize it -- he has to "acquire" images of new trash cans and be told that's what they are.

    Swain, who concentrates on Chip's visual processing and recognition system, said, "We are doing state-of-the-art visual processing. We can find a trash can better than anyone in the world! Some people believe that you can only work on general problems in vision, or that you do hacks that are only going to work for your own special problem. But we think that you can actually have the best of both worlds."

    Chip then moves toward the trash bin (or recycling bin, in the case of an aluminum can) and drops the item in -- if he manages to keep his grip on it in the first place, that is. If the can or trash slips out of his "hand," Chip has to first sense that it's no longer there, then figure out where it went, and then go through the whole process of realigning with it and picking it up again.

    In artificial-intelligence lingo, this is called invoking a new "plan." The core of Chip's brain is a series of these kinds of plans, developed by Firby. "I am interested in how the knowledge that each of us has about how to do things is organized," Firby said. "Like 20 different ways to push a button, for example -- depending on whether it is an elevator button, an on-off switch on a machine or a dimmer switch for a light. The world is actually quite complex, and the way you deal with that is to know lots of simple things, and the way you do more complex things is to build up libraries of simple tasks and combine them.

    "We've been working on this sort of thing for Chip, writing lots and lots of plans for simple tasks. It's knowing this sort of stuff that lets you cope with the world in any reasonable sort of way -- the kinds of things that little kids spend hours and hours learning, how to push a button or flick a switch, for example.

    "Over the next couple of years, as we expand our library of little plans, we'll have a robot that can do all sorts of things," he said. "And that will be really exciting."
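    Firby's description -- a library of small named plans, with a new plan invoked when something goes wrong, such as the can slipping out of the gripper -- can be sketched as a toy. The plan names and steps below are invented; Chip's actual plan architecture is far richer.

```python
# Toy sketch of the plan-library idea described above: complex behavior
# built by combining small named plans, with a new plan invoked when a
# step fails (e.g. the can slips out of the gripper). Plan names and
# steps are invented for illustration.

PLANS = {
    "recycle-can":  ["find-can", "pick-up", "find-bin", "drop-in-bin"],
    "recover-drop": ["find-can", "pick-up"],     # re-acquire a dropped can
}

def execute(plan_name, do_step, log=None):
    """Run a plan's steps; on failure, invoke the recovery plan first."""
    log = [] if log is None else log
    for step in PLANS[plan_name]:
        log.append(step)
        if not do_step(step):
            # Step failed: run the recovery plan, then carry on with the
            # remaining steps of the original plan.
            execute("recover-drop", do_step, log)
    return log

# Simulate the can slipping out of the gripper the first time.
drops = ["pick-up"]          # steps that fail once
def flaky_step(step):
    if step in drops:
        drops.remove(step)
        return False
    return True

trace = execute("recycle-can", flaky_step)
assert trace == ["find-can", "pick-up",          # first attempt fails
                 "find-can", "pick-up",          # recovery plan re-acquires
                 "find-bin", "drop-in-bin"]
```

    Growing the library is then a matter of writing more small plans, exactly the "lots and lots of plans for simple tasks" Firby describes.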

    The ultimate goal for Swain and Firby is to build a fully autonomous robot that can do service-oriented tasks, such as cleaning house or taking care of the lawn, or, in an office context, delivering packages or performing janitorial tasks. They acknowledge that it will take some time, but say that their goal is well within reach.

    Along the way, however, technology developed for Chip has the potential to be used in other contexts. Swain is working on adapting his visual processing system to search large visual databases -- to look quickly for a particular frame of a movie, for example, or to find photographs in a photographic library. Firby has two offshoots in the works: the planning architecture he designed is being tested at Johnson Space Center for potential use in service robots, and he is interested in developing a semi-autonomous wheelchair that would pick things up for people who cannot do it themselves.

    -- Diana Steele

    Spy on Chip's progress through Internet camera To watch how Chip develops, viewers can peek into the Animate Agent Laboratory using a "spy" camera hooked up to the World Wide Web. The URL is http://vision.uchicago.edu/cgi-bin/labcam. As many as 1,000 people a day from around the world view the still images captured by "Labcam" via the Internet, and more than 165,000 pictures have been taken since postdoc Peter Prokopowicz set up the camera last October. The camera swivels so often -- viewers can point the camera anywhere in the room -- that the head on which it turns has worn out and has had to be replaced.

    A viewer might see graduate student Roger E. "Ari" Kahn (who gets "fan mail" from people who have seen him via Labcam) teaching Chip how to pay attention to where he's pointing -- a task akin to teaching a dog where to look to find something that you want him to fetch. Kahn stands where Chip can see him, waves his hand and points to an object on the floor. Chip recognizes Kahn as a human being because Kahn is moving slightly; he then finds the top of Kahn's head and draws a ray from there to the tip of his fingers. He then looks for something on the ground near the terminus of the ray.
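    The geometry of that trick -- extend the head-to-fingertip ray until it hits the floor, then search near where it lands -- is simple enough to sketch. Coordinates and heights here are invented; z is height above the floor in meters.

```python
# Sketch of the pointing geometry described above: draw a ray from the
# top of the head through the fingertip and extend it to the floor, then
# search for an object near where it lands. All numbers are invented.

def pointing_target(head, fingertip):
    """Extend the head-to-fingertip ray until it meets the floor (z = 0).

    `head` and `fingertip` are (x, y, z) tuples. Returns the (x, y)
    floor point the person is pointing at.
    """
    hx, hy, hz = head
    fx, fy, fz = fingertip
    if fz >= hz:
        raise ValueError("ray points upward, never reaches the floor")
    # Solve head.z + t * (fingertip.z - head.z) = 0 for t, then step
    # that far along the ray in x and y as well.
    t = hz / (hz - fz)
    return (hx + t * (fx - hx), hy + t * (fy - hy))

# A head at 1.8 m pointing through a fingertip at 1.2 m lands at x = 3.
x, y = pointing_target((0.0, 0.0, 1.8), (1.0, 0.0, 1.2))
assert abs(x - 3.0) < 1e-9 and abs(y) < 1e-9
```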

    "We've gotten used to the fact that people from all over the world are watching us," Prokopowicz said. "But we have to be careful not to embarrass ourselves, because the people watching often save the pictures. Sometimes they mail me a picture where I have a particularly stupid expression on my face, just to show me what I sometimes look like."

    -- D.S.