The PlayMate: An Object Manipulation Scenario
Development
Year 1
The emphasis for the first year was on integrating spatial understanding
from vision with that from language. This required building systems that
could maintain several models of the world and exchange information
between them. We were able to demonstrate a system that could couple the
learning of the names of objects with the learning of their appearance,
and could then answer questions about the relative locations and names of
these objects. The system allows the human speaker to move objects in and
out of the scene in real time.
Year 2
During the second year we worked on integrating vision and dialogue
with planning and the learning of ontologies. This required a new
kind of software architecture that maintains a distributed set of
representations of the scene, the intentions of the actors, and of its
general knowledge. In addition to all the functionality of the first-year
system, the resulting system is capable of learning the meanings of
colour words and of descriptions of shape and size. It is also able to
follow instructions to move objects. These can be specified in terms of
spatial relations that are quite natural for humans, for example "Put the
blue thing to the left of the red box".
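As a rough illustration of how an instruction such as "Put the blue thing to the left of the red box" might be grounded in a scene model, here is a minimal sketch in Python. It is not the PlayMate implementation: the names SceneObject, left_of, resolve and goal_satisfied, the coordinate frame, and the margin value are all assumptions made for this example.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class SceneObject:
    """Hypothetical scene entry; x grows to the right in the viewer's frame."""
    name: str
    colour: str
    x: float
    y: float


def left_of(a: SceneObject, b: SceneObject, margin: float = 0.05) -> bool:
    """Qualitative relation: a is clearly to the left of b (margin is assumed)."""
    return a.x < b.x - margin


def resolve(scene: list[SceneObject], colour: str) -> SceneObject:
    """Pick the unique object with the given colour, or fail if ambiguous."""
    matches = [obj for obj in scene if obj.colour == colour]
    if len(matches) != 1:
        raise LookupError(f"expected one {colour} object, found {len(matches)}")
    return matches[0]


def goal_satisfied(scene: list[SceneObject], mover: str, landmark: str) -> bool:
    """Check whether the mover-coloured object is left of the landmark-coloured one."""
    return left_of(resolve(scene, mover), resolve(scene, landmark))


ball = SceneObject("ball", "blue", x=0.6, y=0.2)
box = SceneObject("box", "red", x=0.4, y=0.2)
before = [ball, box]
# Simulate the move by placing the ball at a new x position.
after = [replace(ball, x=0.2), box]

print(goal_satisfied(before, "blue", "red"))
print(goal_satisfied(after, "blue", "red"))
```

In this sketch the instruction is treated as a goal state to be checked against the scene rather than as a fixed motion, which is one plausible reading of "spatial relations" as used above.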
Year 3
During the third year we worked on integrating manipulation with
continual planning. This means that the robot can replan manipulations to
achieve qualitative states, such as "put the red thing to the left of the
blue thing", even when the human interferes with its activities. If the
world turns out differently than expected, it replans on the fly. The
third year also allowed integration of incremental processing of
utterances with information from other modalities via binding. Finally, we
used the PlayMate scenario to investigate trade-offs in the space of
architectures.
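The replanning behaviour described above can be sketched as a simple execution-monitoring loop. This is an illustrative toy in Python, not the planner used in the PlayMate: the one-dimensional slot world, the functions plan and run, and the interfere callback are all hypothetical names introduced for this example.

```python
def plan(world, mover, landmark):
    """Toy planner: unit steps that shift `mover` until it is left of `landmark`."""
    steps = []
    pos = world[mover]
    while pos >= world[landmark]:
        pos -= 1
        steps.append((mover, pos))
    return steps


def run(world, mover, landmark, interfere=None, max_ticks=50):
    """Execute the plan one step per tick, replanning when the world surprises us.

    Returns the number of replans that were needed.
    """
    expected = dict(world)          # what the robot predicts the world looks like
    steps = plan(world, mover, landmark)
    replans = 0
    for tick in range(max_ticks):
        if interfere is not None:
            interfere(tick, world)  # the human may move objects at any tick
        if world != expected:       # observation contradicts the prediction
            steps = plan(world, mover, landmark)
            replans += 1
        if not steps:               # an empty plan means the goal already holds
            return replans
        obj, target = steps.pop(0)
        world[obj] = target         # carry out one manipulation step
        expected = dict(world)
    raise RuntimeError("goal not reached within max_ticks")


def human(tick, world):
    """Hypothetical interference: the human moves the blue object at tick 1."""
    if tick == 1:
        world["blue"] = 1


world = {"red": 5, "blue": 3}       # slot indices; a smaller index is further left
replans = run(world, "red", "blue", interfere=human)
print(replans, world)
```

The goal is expressed as a qualitative state (red left of blue) rather than a fixed target position, so when the blue object is moved the loop simply plans again from the observed state.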
Year 4
During Year 4 we built on the planning and binding work from Year 3 to
produce a robotic control architecture based on a novel fusion of these
technologies. Using cross-modal binding as an information exchange
substrate, we gave the MAPSIM continual planner the role of a process
mediator able to trigger actions across the whole architecture. The
planner was fed by goals produced by a motivation subsystem that could
convert from cross-modal intentional content into MAPL (the language of
MAPSIM). Using this framework, the control flow in the PlayMate became
considerably simpler and more principled, and the abstract interface
provided by the planner in CAST made the system considerably more modular
(and thus more easily ported to the Explorer). With this approach we
integrated an action recognition subsystem into the PlayMate
demonstrator, which allowed us both to recognise and to play (using the
manipulation work from Year 3) simple games with coloured shapes. We also
built upon previous work on visual learning to provide a cleaner
interface for implicit and explicit property learning, whilst using this
as a platform for further exploring ideas about clarification (which we
began in Year 1). Within the planner itself, we started to
treat separate CAST subarchitectures as separate agents in a multi-agent
planning problem, allowing us to broaden the scope of the internal
processing behaviours that we could reason about.