Motivation

In order to be fully robust and responsive to a dynamically changing real-world environment, intelligent robots will need to engage in a variety of simultaneous reasoning modalities. In particular, we consider their needs to:

  1. reason with commonsense knowledge (knowledge that is normally, but not always, true),
  2. model their nondeterministic action outcomes and partial observability, and
  3. plan toward maximizing long-term rewards.


Examples of commonsense knowledge include "birds can fly" and "people prefer coffee in the morning". Both are normally true, but not always: penguins are birds that cannot fly. One of the challenges in commonsense reasoning is that new knowledge must be able to "defeat" old knowledge when the two conflict.


On the one hand, existing research in knowledge representation and reasoning (KRR) has enabled efficient, robust commonsense reasoning, but KRR methods are poorly suited to planning under uncertainty in action outcomes and observations. On the other hand, probabilistic planning algorithms can plan under such uncertainty toward maximizing long-term reward, but are ill-equipped for general reasoning tasks.




Method: our first attempt - CORPP

The original CORPP algorithm (details in our AAAI'15 paper) uses a commonsense reasoner to:

  1. compute the set of possible worlds (that together specify the state space for planning); and
  2. generate an informative prior belief distribution for POMDP-based probabilistic planning.

The following figure presents a visualization of this two-step process.
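
The two steps can also be sketched in code. The example below is a minimal illustration, not the implementation from the paper: plain Python stands in for the commonsense reasoner, and the attributes, the hard constraint, and the default weights are all hypothetical values chosen for the example.

```python
from itertools import product

# Hypothetical attributes of a delivery request (not taken from the paper).
DOMAINS = {
    "item": ["coffee", "sandwich"],
    "room": ["office", "lab"],
    "time": ["morning"],  # assumed to be known when reasoning starts
}


def is_possible(world):
    """Hard constraint (assumed): sandwiches are never delivered to the lab."""
    return not (world["item"] == "sandwich" and world["room"] == "lab")


def weight(world):
    """Soft default (assumed): people prefer coffee in the morning."""
    if world["time"] == "morning" and world["item"] == "coffee":
        return 0.8
    return 0.2


def possible_worlds(domains):
    """Step 1: enumerate attribute assignments that satisfy the hard constraint."""
    names = list(domains)
    candidates = (dict(zip(names, values)) for values in product(*domains.values()))
    return [w for w in candidates if is_possible(w)]


def prior_belief(worlds):
    """Step 2: normalize the soft weights into a prior over the possible worlds."""
    weights = [weight(w) for w in worlds]
    total = sum(weights)
    return [x / total for x in weights]


worlds = possible_worlds(DOMAINS)
belief = prior_belief(worlds)
for world, probability in zip(worlds, belief):
    print(world, round(probability, 3))
```

The list `worlds` plays the role of the planner's state space, and `belief` is the prior that a POMDP solver would start from instead of a uniform distribution.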


CORPP has been applied to spoken dialog systems, helping robots to ask "smart" questions. The following video shows a service robot that can buy an item for a person and deliver it to a room. The goal is to use spoken language to identify this request accurately and efficiently.

The main challenge in such spoken dialog systems comes from unreliable speech recognition. For instance, the robot sometimes finds it difficult to distinguish between "coffee" and "toffee" due to their similar pronunciations.
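
A small belief-update sketch shows why an informative prior helps here. It applies Bayes' rule to a single noisy observation. The confusion probabilities and the prior are assumed numbers for illustration, not values measured on the robot.

```python
def update_belief(belief, observation, obs_model):
    """Bayes' rule: b'(s) is proportional to P(observation | s) * b(s)."""
    unnormalized = {s: obs_model[s][observation] * p for s, p in belief.items()}
    total = sum(unnormalized.values())
    return {s: p / total for s, p in unnormalized.items()}


# Assumed confusion model: P(recognized word | intended item).
OBS_MODEL = {
    "coffee": {"coffee": 0.6, "toffee": 0.4},
    "toffee": {"coffee": 0.4, "toffee": 0.6},
}

# Assumed prior, e.g. derived from "people prefer coffee in the morning".
prior = {"coffee": 0.8, "toffee": 0.2}

# The recognizer reports "toffee" once.
posterior = update_belief(prior, "toffee", OBS_MODEL)
print(posterior)
```

With these numbers, one recognition of "toffee" is not enough to overturn the prior: "coffee" remains the more likely request, so asking a confirming question is more sensible than acting on the recognized word.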




Method: the iCORPP method

More recently, we have developed the interleaved CORPP (iCORPP) algorithm that dynamically constructs (PO)MDPs. This requires the commonsense reasoners to not only reason about the possible worlds (state space) but also reason about the reward and transition functions.

The following video shows an illustrative trial where iCORPP helps the robot in navigation tasks in a relatively small domain:

  1. the domain map is discretized into thirty positions;
  2. only five weather conditions are modeled; and
  3. three times are modeled (morning, noon, and afternoon).

The MDP constructed by iCORPP includes only 60 states, while still adapting to exogenous domain changes (such as changes in time, human walker positions, and weather). The traditional approach of enumerating all combinations of attribute values produces more than 2^69 states, which cannot be solved (accurately or approximately) in practice.
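
The blow-up from enumeration comes from multiplying attribute domain sizes. The sketch below only illustrates that growth: the three domain sizes come from the list above, while the extra binary attributes are hypothetical, since the full attribute set behind the reported counts is not given on this page.

```python
from math import prod

# Domain sizes stated in the list above.
listed = {"position": 30, "weather": 5, "time": 3}


def joint_states(domain_sizes):
    """Number of states when every combination of attribute values is enumerated."""
    return prod(domain_sizes.values())


print(joint_states(listed))  # 30 * 5 * 3

# Hypothetical: 20 additional binary attributes, for illustration only.
extended = dict(listed)
extended.update({f"flag_{i}": 2 for i in range(20)})
print(joint_states(extended))  # each binary attribute doubles the count
```

Each added attribute multiplies the state count by its domain size, which is why constructing only the states relevant to the current task keeps the model small.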




Papers

  • Shiqi Zhang, Piyush Khandelwal, and Peter Stone. "Dynamically Constructed (PO)MDPs for Adaptive Robot Planning." Thirty-First AAAI Conference on Artificial Intelligence (AAAI). 2017
    [pdf] [extended version] [bib]
  • Shiqi Zhang and Peter Stone. "CORPP: Commonsense Reasoning and Probabilistic Planning, as Applied to Dialog with a Mobile Robot." Twenty-Ninth AAAI Conference on Artificial Intelligence (AAAI). 2015
    [pdf] [bib]