Teaching mobile robots to navigate complex outdoor environments is crucial for real-world applications such as delivery or search and rescue. However, this is also a difficult problem because the robot needs to perceive the environment and then identify paths to the goal. Another common challenge is that the robot needs to overcome uneven terrain, such as stairs, curbs or rocks on a path, while avoiding obstacles and pedestrians. In our previous work, we explored the second challenge by teaching a quadruped robot to navigate complex uneven obstacles and various outdoor terrains.
In “IndoorSim-to-OutdoorReal: Learning to navigate outdoors without any outdoor experience”, we present our latest work to solve the robotic challenge of reasoning about the perceived environment to identify a viable way to navigate the outdoor environment. We present a learning-based indoor-outdoor transfer algorithm that uses deep reinforcement learning to train a navigation policy in a simulated indoor environment and successfully transfers the same policy to a real outdoor environment. We also introduce Context-Maps (user-created maps based on environmental observations) that are used in our algorithm to enable efficient long-range navigation. We show that with this policy, robots can successfully traverse hundreds of meters in a new outdoor environment, around previously unseen outdoor obstacles (trees, bushes, buildings, pedestrians, etc.) and under different weather conditions (sunny, cloudy, sunset).
PointGoal navigation
User input can tell the robot where to go with commands like “Go to the Android statue,” with images that show the target location, or simply by picking a point on a map. In this work, we specify the navigation goal (a selected point on the map) as a coordinate relative to the robot’s current position (i.e., “go to ∆x, ∆y”), which is also known as the PointGoal Visual Navigation (PointNav) task. PointNav is a general formulation for navigation tasks and one of the standard choices for indoor navigation. However, due to the varied visuals, uneven terrain, and long-range goals of outdoor environments, training a PointNav policy for outdoor environments is a challenging task.
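As a concrete illustration, a PointGoal is just the goal's offset expressed in the robot's own frame. The helper below is a minimal sketch of that transform, not code from the paper; the function name and frame conventions are assumptions.

```python
import math

def pointgoal(robot_xy, robot_yaw, goal_xy):
    """Express a map goal as a "go (dx, dy)" offset in the robot's
    own frame, as in the PointNav task formulation.
    Hypothetical helper; the exact frame convention is an assumption."""
    dx = goal_xy[0] - robot_xy[0]
    dy = goal_xy[1] - robot_xy[1]
    # Rotate the world-frame offset into the robot's heading frame.
    cos_t, sin_t = math.cos(-robot_yaw), math.sin(-robot_yaw)
    return (cos_t * dx - sin_t * dy, sin_t * dx + cos_t * dy)
```

For example, a robot at the origin facing along +y (yaw = π/2) sees a goal at (0, 1) as one meter straight ahead.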
Indoor-to-outdoor transfer
Recent successes in training wheeled and legged robotic agents to navigate indoor environments have been made possible by the development of fast, scalable simulators and the availability of large-scale datasets of photorealistic 3D scans of indoor environments. To capitalize on these successes, we are developing indoor-outdoor transfer techniques that allow our robots to learn from a simulated indoor environment and be deployed in a real outdoor environment.
To overcome the differences between the simulated indoor environment and the real outdoor environment, we use kinematic control and image augmentation techniques in our training system. When using kinematic control, we assume that there is a reliable low-level motion controller that can control the robot to accurately reach a new location. This assumption allows us to directly move the robot to the target location during simulation training through Euler integration, and frees us from explicitly modeling the underlying robot dynamics in simulation, which dramatically improves the throughput of simulation data generation. Previous work has shown that kinematic control can lead to better sim-to-real transfer compared to a dynamic control approach, where the full dynamics of the robot are modeled and a low-level motion controller is required to move the robot.
Left: kinematic control; right: dynamic control |
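The kinematic-control idea above can be sketched in a few lines: at each step the simulator simply Euler-integrates the commanded velocities and teleports the robot to the resulting pose, skipping body dynamics entirely. This is a sketch under assumed conventions (unicycle model, commanded linear and angular velocity), not the actual simulator code.

```python
import math

def kinematic_step(x, y, yaw, v, omega, dt):
    """One Euler-integration step of a unicycle model: place the
    simulated robot directly at the pose implied by the commanded
    linear (v) and angular (omega) velocities, without simulating
    full body dynamics. The real simulator's integration details
    are assumptions."""
    x += v * math.cos(yaw) * dt
    y += v * math.sin(yaw) * dt
    yaw += omega * dt
    return x, y, yaw
```

Because each step is a single pose update rather than a physics solve, rollouts like this are what make simulation data generation so much cheaper than dynamic control.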
For initial experiments, we created an outdoor maze-like environment using objects typically found indoors, and used Boston Dynamics’ Spot robot to test navigation. We found that the robot could navigate around novel obstacles in the new outdoor environment.
The Spot robot successfully navigates around obstacles found indoors, placed in a new outdoor environment, with a policy trained entirely in simulation. |
However, when faced with an unfamiliar outdoor obstacle not seen during training, such as a large slope, the robot failed to navigate it.
The robot cannot navigate slopes because slopes are rare in indoor environments and the robot has not been trained to handle them. |
To enable the robot to ascend and descend slopes, we apply an image augmentation technique during simulation training. Specifically, we randomly tilt the simulated camera on the robot up or down by up to 30 degrees during training. This augmentation effectively forces the robot to perceive slopes even though the floor is level. Training on these perceived slopes enables the robot to navigate slopes in the real world.
By randomly shifting the camera angle during simulated training, the robot can now climb and descend slopes. |
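The camera-tilt augmentation amounts to perturbing the camera's pitch before rendering each training frame. The sketch below assumes a uniform distribution over ±30 degrees; the actual sampling scheme used in training is not stated in the source.

```python
import math
import random

def randomize_camera_pitch(base_pitch_rad, max_tilt_deg=30.0):
    """Tilt the simulated camera up or down by a random angle of up
    to 30 degrees, so a flat floor can appear sloped in the rendered
    depth image. Illustrative sketch; the sampling distribution is
    an assumption."""
    tilt_deg = random.uniform(-max_tilt_deg, max_tilt_deg)
    return base_pitch_rad + math.radians(tilt_deg)
```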
Since the robots were trained only in a simulated indoor environment, where they typically need to move toward a goal just a few meters away, we found that the learned network failed to process longer-range inputs (e.g., the policy failed to advance 100 meters in an empty space). To enable the policy network to handle the long-range inputs that are common in outdoor navigation, we rescale the goal vector using the log of the goal distance.
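One way to realize this rescaling is to feed the network the log of the distance plus a unit direction, so that 5 m and 500 m goals land in a similar numeric range. The source only states that the log of the goal distance is used; the exact encoding below is an assumption.

```python
import math

def encode_goal(dx, dy):
    """Encode a PointGoal as (log-scaled distance, unit direction),
    compressing long-range goals into a range the policy network can
    handle. The exact encoding is an assumption; the source only
    states that the log of the goal distance is used."""
    dist = math.hypot(dx, dy)
    return (math.log(1.0 + dist), dx / dist, dy / dist)
```

Note how slowly the first component grows: a 100 m goal encodes to roughly 4.6, comfortably within the range seen for indoor-scale goals after scaling.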
Context-Maps for complex long-range navigation
Putting it all together, the robot can navigate outdoors toward a goal while walking over uneven terrain and avoiding trees, pedestrians, and other outdoor obstacles. However, one key component is still missing: the robot’s ability to plan an efficient long-range path. At this scale of navigation, wrong turns and backtracking can be costly. For example, we find that the local exploration strategy of the standard PointNav policy is insufficient for finding a long-range goal and usually leads to a dead end (shown below). This is because the robot navigates without context of its environment, and the optimal path may not be visible to the robot initially.
Navigation policies without environmental context do not address the complex objectives of long-range navigation. |
To enable the robot to take context into account and purposefully plan an efficient path, we provide a Context-Map (a binary image that represents a top-down occupancy map of the region the robot is in) as an additional observation for the robot. Below is an example of a Context-Map, where the black region represents areas occupied by obstacles and the white region represents areas the robot can navigate. Green and red circles mark the start and goal locations of the navigation task. Through the Context-Map, we can provide hints to the robot (e.g., the narrow opening on the route below) to help it plan an efficient navigation route. In our experiments, we create a Context-Map for each route guided by Google Maps satellite images. We denote this variant of PointNav with environmental context as Context-Guided PointNav.
An example of a context map (right) for the navigation task (left). |
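As a data structure, a Context-Map is just a binary occupancy grid. The sketch below rasterizes it from axis-aligned obstacle rectangles for illustration; the real maps in the experiments are drawn by hand from satellite imagery, so the construction here is an assumption.

```python
import numpy as np

def make_context_map(height, width, obstacle_boxes):
    """Build a binary top-down Context-Map: 0 = occupied (rendered
    black), 1 = navigable (rendered white). `obstacle_boxes` holds
    (r0, r1, c0, c1) pixel rectangles. Illustrative rasterization
    only; real Context-Maps are sketched by a human operator."""
    cmap = np.ones((height, width), dtype=np.uint8)
    for r0, r1, c0, c1 in obstacle_boxes:
        cmap[r0:r1, c0:c1] = 0  # mark the rectangle as an obstacle
    return cmap
```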
It is important to note that the Context-Map does not need to be precise, because it only serves as a rough outline for planning. While navigating, the robot still needs to rely on its onboard cameras to detect and adapt to pedestrians that are not on the map. In our experiments, a human operator quickly sketches a Context-Map from a satellite image, marking the regions that should be avoided. This Context-Map, along with other onboard sensory inputs, including depth images and the relative position to the goal, is fed to a neural network with attention models (i.e., transformers), trained using DD-PPO, a distributed implementation of proximal policy optimization, in large-scale simulation.
The Context-Guided PointNav architecture consists of a three-layer convolutional neural network (CNN) to process depth images from the robot’s camera, and a multilayer perceptron (MLP) to process the goal vector. The features are passed into a gated recurrent unit (GRU). An additional CNN encoder processes the Context-Map (top-down map). We compute the scaled dot product attention between the map features and the depth-image features, and process them with a second GRU (Context Attn., Depth Attn.). The output of the policy is the linear and angular velocities for the Spot robot to follow. |
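The fusion step named in the caption, scaled dot-product attention between map and depth features, can be written in a few lines of NumPy. This is a sketch of the operation itself, not the trained model; the feature shapes are assumptions.

```python
import numpy as np

def scaled_dot_attention(q, k, v):
    """Scaled dot-product attention, the operation used to fuse
    context-map and depth-image features. Shapes: q (nq, d),
    k (nk, d), v (nk, dv). Plain-NumPy sketch, not the actual
    policy network."""
    scores = q @ k.T / np.sqrt(q.shape[-1])
    scores -= scores.max(axis=-1, keepdims=True)  # numerical stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)  # row-wise softmax
    return weights @ v
```

With identical keys, the attention weights are uniform and the output is simply the mean of the value rows, which is a handy sanity check.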
Results
We evaluated our system on three long-range outdoor navigation tasks. The provided Context-Maps are rough, incomplete outlines of the environment that omit obstacles such as cars, trees, or chairs.
With the proposed algorithm, our robot reached the distant goal location 100% of the time, without a single collision or human intervention. The robot navigated around pedestrians and real-world clutter that was not represented in the Context-Map, and traversed a variety of terrains, including dirt slopes and grass.
Route 1
Route 2
Route 3
Conclusion
This work opens up robotic navigation research to the less-explored domain of diverse outdoor environments. Our indoor-to-outdoor transfer algorithm uses zero real-world outdoor experience and does not require the simulator to model predominantly-outdoor phenomena (terrain, ditches, pavements, vehicles, etc.). The success of the approach comes from a combination of robust locomotion control, a low sim-to-real gap for depth and map sensors, and large-scale training in simulation. We showed that providing robots with approximate, high-level maps can enable long-range navigation in novel outdoor environments. Our results provide compelling evidence for challenging the hypothesis that a new simulator must be created for each new scenario we want to explore. For more information, see our project page.
Acknowledgments
We would like to thank Sonia Chernova, Tingnan Zhang, April Zitkovich, Dhruv Batra, and Jie Tan for their advice and contributions to the project. We would also like to thank Naoki Yokoyama, Nuby Lee, Diego Reiss, Ben Janis, and Gus Kuretas for helping us set up the robot experiments.