Imagine you are moving a couch with a friend. As you start, you both need to crouch down, securely grab it and simultaneously lift it up.
As you carry it, you may say a word here and there, but that is not enough to coordinate your actions. You mostly communicate through force: you can feel whether to move forward or backward, left or right, and the force can tell you to raise the couch higher, put it down or stop moving.
Miloš Žefran, professor of electrical and computer engineering, and Barbara Di Eugenio, professor of computer science, understand the importance of clear communication while collaborating. Now, they have teamed up to enable robots to communicate and collaborate with people.
The goal of the project is to develop a computational and data-driven framework for a robot to collaborate with humans in everyday tasks that involve physical interaction, such as handing over or moving an object together. Using models learned from observing humans perform these tasks, the robot will engage in back-and-forth communication, where both spoken words and forces are exchanged.
“This is a continuation of our research [to assist the elderly], but it’s more specific than before,” Žefran said. “Before it was about trying to understand how multiple modalities in communication can be brought together. Now we are focusing on force and speech and trying to understand the back-and-forth exchanges between collaborators in everyday tasks.”
In natural language processing (NLP), Di Eugenio’s area of expertise, researchers have studied models of dialogue between humans. These models make it possible to determine who is leading a conversation and who is following. Humans know what to do to become a leader and when to be a follower.
“We are trying to adapt the approach developed in NLP to inform the robot how to act in collaborative tasks,” Žefran said. “If the robot is a participant instead of another human, when should the robot become a leader instead of a follower? The robot, in a sense, has to be programmed to understand all of these things.”
“The tasks that we are interested in are part of how people collaborate in real life,” Di Eugenio said. “Like preparing dinner together, moving furniture together, or passing food at the dinner table. When we execute these collaborative tasks, signals at different levels of abstraction are exchanged: speech is at a high level, while force exchanges are at a lower level. The two combine in such a way that, for example, the item we carry together doesn’t fall. We are trying to model that coordination.”
“When we are holding something together, the only thing that I can sense from your part is the force you are exerting on the object,” Žefran added.
There is still a lot of research to be done on interpreting conversations in which people collaborate on physical tasks. The two researchers see these force exchanges as gestures that need to be incorporated to make sense of the entire interaction. Conversation is just the language part, but the whole interaction includes all these signals, and one needs to make sense of all of them so that, ultimately, the robot can interact in an appropriate way.
“There is always a physical connection – either direct or through an object,” Žefran said. “Through the physical connection, you communicate forces. And through these forces I can tell you what to do. If we are moving and I stop the object, you will feel it and you will stop. So, I have signaled something to you.”
“If you listen to language on its own or sense forces on their own, you don’t get the complete picture,” he said. “For example, in language, you may not get the full meaning unless you see what the person is doing. So there’s additional information that helps you understand. We are trying to formalize the way we can inform the meaning of language from force signals. This is something that has not been studied so far. We also want to use language to understand forces. We are putting all of this in one space where there is no distinction between force signal and speech.”
In order to collaborate with a person, the robot needs to be smart enough to know whether to do something physically or say something, and there needs to be a smooth transition between the two.
“It’s amazing that people are so good at coordinating, and we don’t even think about it unless something goes wrong,” Di Eugenio said. “When we move something together or hand something to each other, I don’t consciously pay attention to the force I sense on your part so I can safely let the object go. Humans are very good at this. How can we get robots to do the same thing?”
“This is sometimes called embodied intelligence,” Žefran said. “It is very difficult, because it requires reasoning about the physical world and all the information you have about it. It’s different from using AI algorithms to process data. That’s where we are trying to make some progress.”