Latency-Free Driving Scene Prediction for On-Road Teledriving With Future-Image-Generation

1Dongguk University, 2Korea Advanced Institute of Science and Technology (KAIST)

*Corresponding Author
IEEE Transactions on Intelligent Transportation Systems

Abstract

Teledriving could serve as a practical solution for handling unforeseen situations in autonomous driving. However, the latency of transmission networks remains a prominent concern. Despite advancements such as 5G networks, delays in remote driving scenes cannot be entirely eliminated, potentially leading to unwanted incidents. Although a few attempts have been made to address this issue by predicting future driving scenes, these efforts have been limited in their ability to foresee clear and relevant driving scenarios. This study presents a method to predict a latency-free future driving scene. Unlike prior approaches, our method incorporates the command signal of the remote driver into the prediction network, in addition to the past driving video frames and vehicle status. As a result, we can accurately predict relevant and clear latency-free future driving scenes. A deep neural network combining convolutional long short-term memory (ConvLSTM) and a generative adversarial network (GAN) was used to predict the future driving scenes corresponding to the latency. The dataset used to train the network was gathered from on-road teledriving experiments, with a maximum vehicle speed of 53 km/h and a driving route of approximately 1.3 km. The proposed method can estimate the future driving scene up to 0.5 s ahead, surpassing both baseline video prediction methods and a variant that does not utilize the driver's input command.

Teleoperated Driving System

In this study, an approach for predicting future driving videos is developed to feed back a latency-free driving video to the operator during teledriving. Unlike previous studies, the operator's command signals are used together with the delayed information from the vehicle. This overcomes the limitation of existing methods, which do not reflect the future vehicle state, and enables the prediction of a realistic future driving video that reflects the driver's intention. The figure below illustrates the proposed network for predicting latency-free driving videos based on the delayed information from the vehicle (image and vehicle status) and the command signal (manual input from the teleoperator). The objective is to overcome latency of up to 0.5 s. This maximum latency was established by considering the aforementioned impact of latency on human operating ability, the latency of 5G data transmission, and the video inference time.


[Figure: Overview of the proposed teleoperated driving system and latency-free video prediction network]
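As a concrete illustration of this latency budget, the sketch below shows how a required prediction horizon could be derived from measured delays. Only the 0.5 s cap comes from the paper; the function and variable names are hypothetical.

```python
# Minimal sketch of choosing the prediction horizon for latency compensation.
# Assumption: the horizon must cover the network transmission delay plus the
# model's inference time, capped at the 0.5 s design limit from the paper.
# Names (prediction_horizon, MAX_HORIZON_S) are illustrative.

MAX_HORIZON_S = 0.5  # maximum latency the network is designed to compensate

def prediction_horizon(network_latency_s: float, inference_time_s: float) -> float:
    """Return how far ahead the generator must predict so the operator
    sees a latency-free driving scene."""
    horizon = network_latency_s + inference_time_s
    if horizon > MAX_HORIZON_S:
        raise ValueError(
            f"total delay {horizon:.3f} s exceeds the {MAX_HORIZON_S} s design limit"
        )
    return horizon

# Example: 0.35 s of 5G transport delay plus 0.10 s of inference time
# requires predicting 0.45 s into the future.
print(f"{prediction_horizon(0.35, 0.10):.2f}")  # 0.45
```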

Driving Scene Prediction Network Architecture

As depicted in the figure below, the proposed method consists of generator and discriminator networks. The generator network takes as inputs the delayed driving video frames up to a specific time t and the status information of the remote vehicle at time t. In addition, it predicts future frames conditioned on the command signals from the operator.


[Figure: Architecture of the generator and discriminator networks]
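To make the data flow concrete, here is a minimal PyTorch sketch of a ConvLSTM-based generator that conditions its rollout on delayed past frames, vehicle status, and the operator's command signals. All layer sizes, class names, and the channel-broadcast scheme are illustrative assumptions rather than the authors' exact architecture, and the GAN discriminator (trained to distinguish predicted frames from real ones) is omitted for brevity.

```python
# Illustrative sketch: ConvLSTM generator conditioned on vehicle status and
# operator commands. Assumptions: frames are normalized to [-1, 1], and the
# per-step status/command values are tiled over the image plane as extra
# input channels.
import torch
import torch.nn as nn

class ConvLSTMCell(nn.Module):
    def __init__(self, in_ch: int, hidden_ch: int, kernel_size: int = 3):
        super().__init__()
        self.hidden_ch = hidden_ch
        # One convolution jointly produces the input, forget, cell, and output gates.
        self.gates = nn.Conv2d(in_ch + hidden_ch, 4 * hidden_ch,
                               kernel_size, padding=kernel_size // 2)

    def forward(self, x, state):
        h, c = state
        i, f, g, o = torch.chunk(self.gates(torch.cat([x, h], dim=1)), 4, dim=1)
        c = torch.sigmoid(f) * c + torch.sigmoid(i) * torch.tanh(g)
        h = torch.sigmoid(o) * torch.tanh(c)
        return h, c

class Generator(nn.Module):
    def __init__(self, img_ch: int = 3, cond_ch: int = 3, hidden_ch: int = 64):
        super().__init__()
        # cond_ch channels carry the vehicle status and command signal per step.
        self.cell = ConvLSTMCell(img_ch + cond_ch, hidden_ch)
        self.to_frame = nn.Conv2d(hidden_ch, img_ch, kernel_size=1)

    @staticmethod
    def broadcast(vec, h, w):
        # Tile a per-step condition vector over the image plane as extra channels.
        return vec[:, :, None, None].expand(-1, -1, h, w)

    def forward(self, past_frames, conds, n_future: int):
        # past_frames: (B, T, C, H, W) delayed frames up to time t
        # conds:       (B, T + n_future, cond_ch) status + command for each step
        B, T, C, H, W = past_frames.shape
        h = past_frames.new_zeros(B, self.cell.hidden_ch, H, W)
        c = torch.zeros_like(h)
        # Warm the recurrent state up on the delayed observations.
        for k in range(T):
            x = torch.cat([past_frames[:, k],
                           self.broadcast(conds[:, k], H, W)], dim=1)
            h, c = self.cell(x, (h, c))
        # Roll forward, feeding each predicted frame back in while conditioning
        # on the operator's future command signals.
        preds, frame = [], past_frames[:, -1]
        for k in range(n_future):
            x = torch.cat([frame, self.broadcast(conds[:, T + k], H, W)], dim=1)
            h, c = self.cell(x, (h, c))
            frame = torch.tanh(self.to_frame(h))  # next frame in [-1, 1]
            preds.append(frame)
        return torch.stack(preds, dim=1)  # (B, n_future, C, H, W)
```

In adversarial training, a discriminator would score these predicted frames against ground-truth future frames, pushing the generator toward sharper, more realistic predictions.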