HMI

2024-04-25

Redefining Safety for Autonomous Vehicles

Authors: Philip Koopman, William Widen

Link: http://arxiv.org/abs/2404.16768v1

Abstract: Existing definitions and associated conceptual frameworks for computer-based system safety should be revisited in light of real-world experiences from deploying autonomous vehicles. Current terminology used by industry safety standards emphasizes mitigation of risk from specifically identified hazards, and carries assumptions based on human-supervised vehicle operation. Operation without a human driver dramatically increases the scope of safety concerns, especially due to operation in an open world environment, a requirement to self-enforce operational limits, participation in an ad hoc sociotechnical system of systems, and a requirement to conform to both legal and ethical constraints. Existing standards and terminology only partially address these new challenges. We propose updated definitions for core system safety concepts that encompass these additional considerations as a starting point for evolving safety approaches to address these additional safety challenges. These results might additionally inform framing safety terminology for other autonomous system applications.

Cross-Domain Spatial Matching for Camera and Radar Sensor Data Fusion in Autonomous Vehicle Perception System

Authors: Daniel Dworak, Mateusz Komorkiewicz, Paweł Skruch, Jerzy Baranowski

Link: http://arxiv.org/abs/2404.16548v1

Abstract: In this paper, we propose a novel approach to address the problem of camera and radar sensor fusion for 3D object detection in autonomous vehicle perception systems. Our approach builds on recent advances in deep learning and leverages the strengths of both sensors to improve object detection performance. Precisely, we extract 2D features from camera images using a state-of-the-art deep learning architecture and then apply a novel Cross-Domain Spatial Matching (CDSM) transformation method to convert these features into 3D space. We then fuse them with extracted radar data using a complementary fusion strategy to produce a final 3D object representation. To demonstrate the effectiveness of our approach, we evaluate it on the NuScenes dataset. We compare our approach to both single-sensor performance and current state-of-the-art fusion methods. Our results show that the proposed approach achieves superior performance over single-sensor solutions and could directly compete with other top-level fusion methods.

2024-04-23

OccGen: Generative Multi-modal 3D Occupancy Prediction for Autonomous Driving

Authors: Guoqing Wang, Zhongdao Wang, Pin Tang, Jilai Zheng, Xiangxuan Ren, Bailan Feng, Chao Ma

Link: http://arxiv.org/abs/2404.15014v1

Abstract: Existing solutions for 3D semantic occupancy prediction typically treat the task as a one-shot 3D voxel-wise segmentation perception problem. These discriminative methods focus on learning the mapping between the inputs and occupancy map in a single step, lacking the ability to gradually refine the occupancy map and the reasonable scene imaginative capacity to complete the local regions somewhere. In this paper, we introduce OccGen, a simple yet powerful generative perception model for the task of 3D semantic occupancy prediction. OccGen adopts a "noise-to-occupancy" generative paradigm, progressively inferring and refining the occupancy map by predicting and eliminating noise originating from a random Gaussian distribution. OccGen consists of two main components: a conditional encoder that is capable of processing multi-modal inputs, and a progressive refinement decoder that applies diffusion denoising using the multi-modal features as conditions. A key insight of this generative pipeline is that the diffusion denoising process is naturally able to model the coarse-to-fine refinement of the dense 3D occupancy map, therefore producing more detailed predictions. Extensive experiments on several occupancy benchmarks demonstrate the effectiveness of the proposed method compared to the state-of-the-art methods. For instance, OccGen relatively enhances the mIoU by 9.5%, 6.3%, and 13.3% on nuScenes-Occupancy dataset under the multi-modal, LiDAR-only, and camera-only settings, respectively. Moreover, as a generative perception model, OccGen exhibits desirable properties that discriminative models cannot achieve, such as providing uncertainty estimates alongside its multiple-step predictions.

Enhancing High-Speed Cruising Performance of Autonomous Vehicles through Integrated Deep Reinforcement Learning Framework

Authors: Jinhao Liang, Kaidi Yang, Chaopeng Tan, Jinxiang Wang, Guodong Yin

Link: http://arxiv.org/abs/2404.14713v1

Abstract: High-speed cruising scenarios with mixed traffic greatly challenge the road safety of autonomous vehicles (AVs). Unlike existing works that only look at fundamental modules in isolation, this work enhances AV safety in mixed-traffic high-speed cruising scenarios by proposing an integrated framework that synthesizes three fundamental modules, i.e., behavioral decision-making, path-planning, and motion-control modules. Considering that the integrated framework would increase the system complexity, a bootstrapped deep Q-Network (DQN) is employed to enhance the deep exploration of the reinforcement learning method and achieve adaptive decision making of AVs. Moreover, to make AV behavior understandable by surrounding human-driven vehicles (HDVs) and to prevent unexpected operations caused by misinterpretations, we derive an inverse reinforcement learning (IRL) approach to learn the reward function of skilled drivers for the path planning of lane-changing maneuvers. Such a design enables AVs to achieve a human-like tradeoff between multi-performance requirements. Simulations demonstrate that the proposed integrated framework can guide AVs to take safe actions while guaranteeing high-speed cruising performance.

2024-04-22

PLUTO: Pushing the Limit of Imitation Learning-based Planning for Autonomous Driving

Authors: Jie Cheng, Yingbing Chen, Qifeng Chen

Link: http://arxiv.org/abs/2404.14327v1

Abstract: We present PLUTO, a powerful framework that pushes the limit of imitation learning-based planning for autonomous driving. Our improvements stem from three pivotal aspects: a longitudinal-lateral aware model architecture that enables flexible and diverse driving behaviors; an innovative auxiliary loss computation method that is broadly applicable and efficient for batch-wise calculation; and a novel training framework that leverages contrastive learning, augmented by a suite of new data augmentations to regulate driving behaviors and facilitate the understanding of underlying interactions. We assessed our framework using the large-scale real-world nuPlan dataset and its associated standardized planning benchmark. Impressively, PLUTO achieves state-of-the-art closed-loop performance, beating other competing learning-based methods and surpassing the current top-performing rule-based planner for the first time. Results and code are available at https://jchengai.github.io/pluto.

Autonomous Forest Inventory with Legged Robots: System Design and Field Deployment

Authors: Matías Mattamala, Nived Chebrolu, Benoit Casseau, Leonard Freißmuth, Jonas Frey, Turcan Tuna, Marco Hutter, Maurice Fallon

Link: http://arxiv.org/abs/2404.14157v1

Abstract: We present a solution for autonomous forest inventory with a legged robotic platform. Compared to their wheeled and aerial counterparts, legged platforms offer an attractive balance of endurance and low soil impact for forest applications. In this paper, we present the complete system architecture of our forest inventory solution which includes state estimation, navigation, mission planning, and real-time tree segmentation and trait estimation. We present preliminary results for three campaigns in forests in Finland and the UK and summarize the main outcomes, lessons, and challenges. Our UK experiment at the Forest of Dean with the ANYmal D legged platform, achieved an autonomous survey of a 0.96 hectare plot in 20 min, identifying over 100 trees with typical DBH accuracy of 2 cm.

Collaborative Perception Datasets in Autonomous Driving: A Survey

Authors: Melih Yazgan, Mythra Varun Akkanapragada, J. Marius Zoellner

Link: http://arxiv.org/abs/2404.14022v1

Abstract: This survey offers a comprehensive examination of collaborative perception datasets in the context of Vehicle-to-Infrastructure (V2I), Vehicle-to-Vehicle (V2V), and Vehicle-to-Everything (V2X). It highlights the latest developments in large-scale benchmarks that accelerate advancements in perception tasks for autonomous vehicles. The paper systematically analyzes a variety of datasets, comparing them based on aspects such as diversity, sensor setup, quality, public availability, and their applicability to downstream tasks. It also highlights the key challenges such as domain shift, sensor setup limitations, and gaps in dataset diversity and availability. The importance of addressing privacy and security concerns in the development of datasets is emphasized, regarding data sharing and dataset creation. The conclusion underscores the necessity for comprehensive, globally accessible datasets and collaborative efforts from both technological and research communities to overcome these challenges and fully harness the potential of autonomous driving.

Neural Radiance Field in Autonomous Driving: A Survey

Authors: Lei He, Leheng Li, Wenchao Sun, Zeyu Han, Yichen Liu, Sifa Zheng, Jianqiang Wang, Keqiang Li

Link: http://arxiv.org/abs/2404.13816v1

Abstract: Neural Radiance Field (NeRF) has garnered significant attention from both academia and industry due to its intrinsic advantages, particularly its implicit representation and novel view synthesis capabilities. With the rapid advancements in deep learning, a multitude of methods have emerged to explore the potential applications of NeRF in the domain of Autonomous Driving (AD). However, a conspicuous void is apparent within the current literature. To bridge this gap, this paper conducts a comprehensive survey of NeRF's applications in the context of AD. Our survey is structured to categorize NeRF's applications in Autonomous Driving (AD), specifically encompassing perception, 3D reconstruction, simultaneous localization and mapping (SLAM), and simulation. We delve into in-depth analysis and summarize the findings for each application category, and conclude by providing insights and discussions on future directions in this field. We hope this paper serves as a comprehensive reference for researchers in this domain. To the best of our knowledge, this is the first survey specifically focused on the applications of NeRF in the Autonomous Driving domain.

2024-04-21

Soar: Design and Deployment of A Smart Roadside Infrastructure System for Autonomous Driving

Authors: Shuyao Shi, Neiwen Ling, Zhehao Jiang, Xuan Huang, Yuze He, Xiaoguang Zhao, Bufang Yang, Chen Bian, Jingfei Xia, Zhenyu Yan, Raymond Yeung, Guoliang Xing

Link: http://arxiv.org/abs/2404.13786v1

Abstract: Recently, smart roadside infrastructure (SRI) has demonstrated the potential of achieving fully autonomous driving systems. To explore the potential of infrastructure-assisted autonomous driving, this paper presents the design and deployment of Soar, the first end-to-end SRI system specifically designed to support autonomous driving systems. Soar consists of both software and hardware components carefully designed to overcome various system and physical challenges. Soar can leverage the existing operational infrastructure like street lampposts for a lower barrier of adoption. Soar adopts a new communication architecture that comprises a bi-directional multi-hop I2I network and a downlink I2V broadcast service, which are designed based on off-the-shelf 802.11ac interfaces in an integrated manner. Soar also features a hierarchical DL task management framework to achieve desirable load balancing among nodes and enable them to collaborate efficiently to run multiple data-intensive autonomous driving applications. We deployed a total of 18 Soar nodes on existing lampposts on campus, which have been operational for over two years. Our real-world evaluation shows that Soar can support a diverse set of autonomous driving applications and achieve desirable real-time performance and high communication reliability. Our findings and experiences in this work offer key insights into the development and deployment of next-generation smart roadside infrastructure and autonomous driving systems.

Autonomous Robot for Disaster Mapping and Victim Localization

Authors: Michael Potter, Rahil Bhowal, Richard Zhao, Anuj Patel, Jingming Cheng

Link: http://arxiv.org/abs/2404.13767v1

Abstract: In response to the critical need for effective reconnaissance in disaster scenarios, this research article presents the design and implementation of a complete autonomous robot system using the Turtlebot3 with Robotic Operating System (ROS) Noetic. Upon deployment in closed, initially unknown environments, the system aims to generate a comprehensive map and identify any present 'victims' using AprilTags as stand-ins. We discuss our solution for search and rescue missions, while additionally exploring more advanced algorithms to improve search and rescue functionalities. We introduce a Cubature Kalman Filter to help reduce the mean squared error [m] for AprilTag localization and an information-theoretic exploration algorithm to expedite exploration in unknown environments. Just like turtles, our system takes it slow and steady, but when it's time to save the day, it moves at ninja-like speed! Despite Donatello's shell, he's no slowpoke - he zips through obstacles with the agility of a teenage mutant ninja turtle. So, hang on tight to your shells and get ready for a whirlwind of reconnaissance! Full pipeline code https://github.com/rzhao5659/MRProject/tree/main Exploration code https://github.com/rzhao5659/MRProject/tree/main
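
The Cubature Kalman Filter mentioned in the abstract above propagates a Gaussian estimate through a set of deterministic sample points. As a rough illustration only (not the authors' code), the sketch below generates the standard third-degree cubature points for a mean and covariance. The function names `cholesky`, `cubature_points`, and `mean_and_cov` are hypothetical, and pure Python is used to keep the example self-contained.

```python
import math


def cholesky(P):
    """Lower-triangular S with S S^T = P, for a symmetric positive-definite P."""
    n = len(P)
    S = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(S[i][k] * S[j][k] for k in range(j))
            if i == j:
                S[i][j] = math.sqrt(P[i][i] - s)
            else:
                S[i][j] = (P[i][j] - s) / S[j][j]
    return S


def cubature_points(mean, P):
    """Return the 2n equally weighted points mean +/- sqrt(n) * (column j of S)."""
    n = len(mean)
    S = cholesky(P)
    scale = math.sqrt(n)
    points = []
    for sign in (1.0, -1.0):
        for j in range(n):
            points.append([mean[i] + sign * scale * S[i][j] for i in range(n)])
    return points


def mean_and_cov(points):
    """Equal-weight sample mean and covariance of the given points."""
    m, n = len(points), len(points[0])
    mu = [sum(p[i] for p in points) / m for i in range(n)]
    cov = [[sum((p[i] - mu[i]) * (p[j] - mu[j]) for p in points) / m
            for j in range(n)] for i in range(n)]
    return mu, cov


# Hypothetical 2D landmark estimate (values chosen only for illustration).
pts = cubature_points([1.0, 2.0], [[4.0, 1.0], [1.0, 3.0]])
mu, cov = mean_and_cov(pts)
```

By construction the equal-weight mean and covariance of the points reproduce the input mean and covariance. A full filter would additionally pass each point through the motion and measurement models before recombining them.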

A Practical Multilevel Governance Framework for Autonomous and Intelligent Systems

Authors: Lukas D. Pöhler, Klaus Diepold, Wendell Wallach

Link: http://arxiv.org/abs/2404.13719v1

Abstract: Autonomous and intelligent systems (AIS) facilitate a wide range of beneficial applications across a variety of different domains. However, technical characteristics such as unpredictability and lack of transparency, as well as potential unintended consequences, pose considerable challenges to the current governance infrastructure. Furthermore, the speed of development and deployment of applications outpaces the ability of existing governance institutions to put in place effective ethical-legal oversight. New approaches for agile, distributed and multilevel governance are needed. This work presents a practical framework for multilevel governance of AIS. The framework enables mapping actors onto six levels of decision-making including the international, national and organizational levels. Furthermore, it offers the ability to identify and evolve existing tools or create new tools for guiding the behavior of actors within the levels. Governance mechanisms enable actors to shape and enforce regulations and other tools, which when complemented with good practices contribute to effective and comprehensive governance.

2024-04-19

FipTR: A Simple yet Effective Transformer Framework for Future Instance Prediction in Autonomous Driving

Authors: Xingtai Gui, Tengteng Huang, Haonan Shao, Haotian Yao, Chi Zhang

Link: http://arxiv.org/abs/2404.12867v1

Abstract: The future instance prediction from a Bird's Eye View (BEV) perspective is a vital component in autonomous driving, which involves future instance segmentation and instance motion prediction. Existing methods usually rely on a redundant and complex pipeline which requires multiple auxiliary outputs and post-processing procedures. Moreover, estimated errors on each of the auxiliary predictions will lead to degradation of the prediction performance. In this paper, we propose a simple yet effective fully end-to-end framework named Future Instance Prediction Transformer (FipTR), which views the task as BEV instance segmentation and prediction for future frames. We propose to adopt instance queries representing specific traffic participants to directly estimate the corresponding future occupied masks, and thus get rid of complex post-processing procedures. Besides, we devise a flow-aware BEV predictor for future BEV feature prediction composed of a flow-aware deformable attention that takes backward flow guiding the offset sampling. A novel future instance matching strategy is also proposed to further improve the temporal coherence. Extensive experiments demonstrate the superiority of FipTR and its effectiveness under different temporal BEV encoders.

2024-04-18

An Online Spatial-Temporal Graph Trajectory Planner for Autonomous Vehicles

Authors: Jilan Samiuddin, Benoit Boulet, Di Wu

Link: http://arxiv.org/abs/2404.12256v1

Abstract: The autonomous driving industry is expected to grow by over 20 times in the coming decade and, thus, motivates researchers to delve into it. The primary focus of their research is to ensure safety, comfort, and efficiency. An autonomous vehicle has several modules responsible for one or more of the aforementioned items. Among these modules, the trajectory planner plays a pivotal role in the safety of the vehicle and the comfort of its passengers. The module is also responsible for respecting kinematic constraints and any applicable road constraints. In this paper, a novel online spatial-temporal graph trajectory planner is introduced to generate safe and comfortable trajectories. First, a spatial-temporal graph is constructed using the autonomous vehicle, its surrounding vehicles, and virtual nodes along the road with respect to the vehicle itself. Next, the graph is forwarded into a sequential network to obtain the desired states. To support the planner, a simple behavioral layer is also presented that determines kinematic constraints for the planner. Furthermore, a novel potential function is also proposed to train the network. Finally, the proposed planner is tested on three different complex driving tasks, and the performance is compared with two frequently used methods. The results show that the proposed planner generates safe and feasible trajectories while achieving similar or longer distances in the forward direction and comparable ride comfort.

Trajectory Planning for Autonomous Vehicle Using Iterative Reward Prediction in Reinforcement Learning

Authors: Hyunwoo Park

Link: http://arxiv.org/abs/2404.12079v1

Abstract: Traditional trajectory planning methods for autonomous vehicles have several limitations. Heuristic and explicit simple rules make trajectories lack generality and complex motion. One approach to resolving these limitations is trajectory planning using reinforcement learning. However, reinforcement learning suffers from unstable learning, and prior works on trajectory planning using reinforcement learning did not consider uncertainties. In this paper, we propose a trajectory planning method for autonomous vehicles using reinforcement learning. The proposed method includes an iterative reward prediction method that stabilizes the learning process, and an uncertainty propagation method that makes the reinforcement learning agent aware of the uncertainties. The proposed method is evaluated in the CARLA simulator. Compared to the baseline method, we have reduced the collision rate by 60.17%, and increased the average reward to 30.82 times.

S4TP: Social-Suitable and Safety-Sensitive Trajectory Planning for Autonomous Vehicles

Authors: Xiao Wang, Ke Tang, Xingyuan Dai, Jintao Xu, Quancheng Du, Rui Ai, Yuxiao Wang, Weihao Gu

Link: http://arxiv.org/abs/2404.11946v1

Abstract: On public roads, autonomous vehicles (AVs) face the challenge of frequent interactions with human-driven vehicles (HDVs), which render uncertain driving behavior due to varying social characteristics among humans. To effectively assess the risks prevailing in the vicinity of AVs in social interactive traffic scenarios and achieve safe autonomous driving, this article proposes a social-suitable and safety-sensitive trajectory planning (S4TP) framework. Specifically, S4TP integrates the Social-Aware Trajectory Prediction (SATP) and Social-Aware Driving Risk Field (SADRF) modules. SATP utilizes Transformers to effectively encode the driving scene and incorporates an AV's planned trajectory during the prediction decoding process. SADRF assesses the expected surrounding risk degrees during AVs-HDVs interactions, each with different social characteristics, visualized as two-dimensional heat maps centered on the AV. SADRF models the driving intentions of the surrounding HDVs and predicts trajectories based on the representation of vehicular interactions. S4TP employs an optimization-based approach for motion planning, utilizing the predicted HDVs' trajectories as input. With the integration of SADRF, S4TP executes real-time online optimization of the planned trajectory of AV within low-risk regions, thus improving the safety and the interpretability of the planned trajectory. We have conducted comprehensive tests of the proposed method using the SMARTS simulator. Experimental results in complex social scenarios, such as unprotected left turn intersections, merging, cruising, and overtaking, validate the superiority of our proposed S4TP in terms of safety and rationality. S4TP achieves a pass rate of 100% across all scenarios, surpassing the current state-of-the-art methods Fanta of 98.25% and Predictive-Decision of 94.75%.

2024-04-17

Developing Situational Awareness for Joint Action with Autonomous Vehicles

Authors: Robert Kaufman, David Kirsh, Nadir Weibel

Link: http://arxiv.org/abs/2404.11800v1

Abstract: Unanswered questions about how human-AV interaction designers can support riders' informational needs hinder Autonomous Vehicle (AV) adoption. To achieve joint human-AV action goals - such as safe transportation, trust, or learning from an AV - sufficient situational awareness must be held by the human, AV, and human-AV system collectively. We present a systems-level framework that integrates cognitive theories of joint action and situational awareness as a means to tailor communications that meet the criteria necessary for goal success. This framework is based on four components of the shared situation: AV traits, action goals, subject-specific traits and states, and the situated driving context. AV communications should be tailored to these factors and be sensitive when they change. This framework can be useful for understanding individual, shared, and distributed human-AV situational awareness and designing for future AV communications that meet the informational needs and goals of diverse groups and in diverse driving contexts.

Designing Touchscreen Menu Interfaces for In-Vehicle Infotainment Systems: the Effect of Depth and Breadth Trade-off and Task Types on Visual-Manual Distraction

Authors: Louveton Nicolas, McCall Rod, Engel Thomas

Link: http://arxiv.org/abs/2404.11469v1

Abstract: Multitasking with a touch screen user-interface while driving is known to negatively impact driving performance and safety. Literature shows that list scrolling interfaces generate more visual-manual distraction than structured menus and sequential navigation. Depth and breadth trade-offs for structured navigation have been studied. However, little is known about how secondary task characteristics interact with those trade-offs. In this study, we hypothesize that both the menu's depth and task complexity interact in generating visual-manual distraction. Using a driving simulation setup, we collected telemetry and eye-tracking data to evaluate driving performance. Participants were multitasking with a mobile app, presenting a range of eight depth and breadth trade-offs under three types of secondary tasks, involving different cognitive operations (Systematic reading, Search for an item, Memorize items' state). The results confirm our hypothesis. Systematic interaction with menu items generated a visual demand that increased with the menu's depth, while visual demand reached an optimum for Search and Memory tasks. We discuss implications for design: In a multitasking context, display design effectiveness must be assessed while considering the menu's layout but also the cognitive processes involved.

Autonomous aerial perching and unperching using omnidirectional tiltrotor and switching controller

Authors: Dongjae Lee, Sunwoo Hwang, Jeonghyun Byun, Seung Jae Lee, H. Jin Kim

Link: http://arxiv.org/abs/2404.11310v1

Abstract: Aerial unperching of multirotors has received little attention as opposed to perching that has been investigated to elongate operation time. This study presents a new aerial robot capable of both perching and unperching autonomously on/from a ferromagnetic surface during flight, and a switching controller to avoid rotor saturation and mitigate overshoot during transition between free-flight and perching. To enable stable perching and unperching maneuvers on/from a vertical surface, a lightweight (≈ 1 kg), fully actuated tiltrotor that can hover at 90° pitch angle is first developed. We design a perching/unperching module composed of a single servomotor and a magnet, which is then mounted on the tiltrotor. A switching controller including exclusive control modes for transitions between free-flight and perching is proposed. Lastly, we propose a simple yet effective strategy to ensure robust perching in the presence of measurement and control errors and avoid collisions with the perching site immediately after unperching. We validate the proposed framework in experiments where the tiltrotor successfully performs perching and unperching on/from a vertical surface during flight. We further show effectiveness of the proposed transition mode in the switching controller by ablation studies where large overshoot and even collision with a perching site occur. To the best of the authors' knowledge, this work presents the first autonomous aerial unperching framework using a fully actuated tiltrotor.

How to deal with glare for improved perception of Autonomous Vehicles

Authors: Muhammad Z. Alam, Zeeshan Kaleem, Sousso Kelouwani

Link: http://arxiv.org/abs/2404.10992v1

Abstract: Vision sensors are versatile and can capture a wide range of visual cues, such as color, texture, shape, and depth. This versatility, along with the relatively inexpensive availability of machine vision cameras, played an important role in adopting vision-based environment perception systems in autonomous vehicles (AVs). However, vision-based perception systems can be easily affected by glare in the presence of a bright source of light, such as the sun or the headlights of the oncoming vehicle at night or simply by light reflecting off snow or ice-covered surfaces; scenarios encountered frequently during driving. In this paper, we investigate various glare reduction techniques, including the proposed saturated pixel-aware glare reduction technique for improved performance of the computer vision (CV) tasks employed by the perception layer of AVs. We evaluate these glare reduction methods based on various performance metrics of the CV algorithms used by the perception layer. Specifically, we considered object detection, object recognition, object tracking, depth estimation, and lane detection which are crucial for autonomous driving. The experimental findings validate the efficacy of the proposed glare reduction approach, showcasing enhanced performance across diverse perception tasks and remarkable resilience against varying levels of glare.

2024-04-16

Safety-critical Autonomous Inspection of Distillation Columns using Quadrupedal Robots Equipped with Roller Arms

Authors: Jaemin Lee, Jeeseop Kim, Aaron D. Ames

Link: http://arxiv.org/abs/2404.10938v1

Abstract: This paper proposes a comprehensive framework designed for the autonomous inspection of complex environments, with a specific focus on multi-tiered settings such as distillation column trays. Leveraging quadruped robots equipped with roller arms, and through the use of onboard perception, we integrate essential motion components including: locomotion, safe and dynamic transitions between trays, and intermediate motions that bridge a variety of motion primitives. Given the slippery and confined nature of column trays, it is critical to ensure safety of the robot during inspection, therefore we employ a safety filter and footstep re-planning based upon control barrier function representations of the environment. Our framework integrates all system components into a state machine encoding the developed safety-critical planning and control elements to guarantee safety-critical autonomy, enabling autonomous and safe navigation and inspection of distillation columns. Experimental validation in an environment, consisting of industrial-grade chemical distillation trays, highlights the effectiveness of our multi-layered architecture.
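
To give a feel for what a control barrier function (CBF) safety filter does, here is a deliberately tiny sketch. It is not the paper's quadruped controller: it assumes a hypothetical 1D system with dynamics x' = u and a safe set h(x) = x_max - x >= 0. The CBF condition dh/dt + alpha * h >= 0 then reduces to u <= alpha * (x_max - x), so the minimally invasive filter is a simple clamp. The names `cbf_filter` and `simulate` and all numeric values are assumptions for illustration.

```python
def cbf_filter(u_nom, x, x_max, alpha):
    """Return the input closest to u_nom that satisfies u <= alpha * (x_max - x)."""
    return min(u_nom, alpha * (x_max - x))


def simulate(x0, u_nom, x_max, alpha=2.0, dt=0.01, steps=1000):
    """Forward-Euler rollout of x' = u with the filter applied at every step."""
    x = x0
    trajectory = [x]
    for _ in range(steps):
        u = cbf_filter(u_nom, x, x_max, alpha)
        x += dt * u
        trajectory.append(x)
    return trajectory


# The nominal command always pushes toward the boundary at x_max = 1.0.
traj = simulate(x0=0.0, u_nom=1.0, x_max=1.0)
```

Far from the boundary the nominal command passes through unchanged. Close to it, the filter scales the command down so the state approaches `x_max` without crossing it, provided `alpha * dt <= 1` in this discretized rollout.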

Trajectory Planning using Reinforcement Learning for Interactive Overtaking Maneuvers in Autonomous Racing Scenarios

Authors: Levent Ögretmen, Mo Chen, Phillip Pitschi, Boris Lohmann

Link: http://arxiv.org/abs/2404.10658v1

Abstract: Conventional trajectory planning approaches for autonomous racing are based on the sequential execution of prediction of the opposing vehicles and subsequent trajectory planning for the ego vehicle. If the opposing vehicles do not react to the ego vehicle, they can be predicted accurately. However, if there is interaction between the vehicles, the prediction loses its validity. For high interaction, instead of a planning approach that reacts exclusively to the fixed prediction, a trajectory planning approach is required that incorporates the interaction with the opposing vehicles. This paper demonstrates the limitations of a widely used conventional sampling-based approach within a highly interactive blocking scenario. We show that high success rates are achieved for less aggressive blocking behavior but that the collision rate increases with more significant interaction. We further propose a novel Reinforcement Learning (RL)-based trajectory planning approach for racing that explicitly exploits the interaction with the opposing vehicle without requiring a prediction. In contrast to the conventional approach, the RL-based approach achieves high success rates even for aggressive blocking behavior. Furthermore, we propose a novel safety layer (SL) that intervenes when the trajectory generated by the RL-based approach is infeasible. In that event, the SL generates a sub-optimal but feasible trajectory, avoiding termination of the scenario because no valid solution was found.

PreGSU-A Generalized Traffic Scene Understanding Model for Autonomous Driving based on Pre-trained Graph Attention Network

Authors: Yuning Wang, Zhiyuan Liu, Haotian Lin, Junkai Jiang, Shaobing Xu, Jianqiang Wang

Link: http://arxiv.org/abs/2404.10263v1

Abstract: Scene understanding, defined as learning, extraction, and representation of interactions among traffic elements, is one of the critical challenges toward high-level autonomous driving (AD). Current scene understanding methods mainly focus on one concrete single task, such as trajectory prediction and risk level evaluation. Although they perform well on specific metrics, the generalization ability is insufficient to adapt to the real traffic complexity and downstream demand diversity. In this study, we propose PreGSU, a generalized pre-trained scene understanding model based on graph attention network to learn the universal interaction and reasoning of traffic scenes to support various downstream tasks. After the feature engineering and sub-graph module, all elements are embedded as nodes to form a dynamic weighted graph. Then, four graph attention layers are applied to learn the relationships among agents and lanes. In the pre-train phase, the understanding model is trained on two self-supervised tasks: Virtual Interaction Force (VIF) modeling and Masked Road Modeling (MRM). Based on the artificial potential field theory, VIF modeling enables PreGSU to capture the agent-to-agent interactions while MRM extracts agent-to-road connections. In the fine-tuning process, the pre-trained parameters are loaded to derive detailed understanding outputs. We conduct validation experiments on two downstream tasks, i.e., trajectory prediction in urban scenario, and intention recognition in highway scenario, to verify the generalized ability and understanding ability. Results show that compared with the baselines, PreGSU achieves better accuracy on both tasks, indicating the potential to be generalized to various scenes and targets. Ablation study shows the effectiveness of pre-train task design.

Autonomous Implicit Indoor Scene Reconstruction with Frontier Exploration

Authors: Jing Zeng, Yanxu Li, Jiahao Sun, Qi Ye, Yunlong Ran, Jiming Chen

Link: http://arxiv.org/abs/2404.10218v1

Abstract: Implicit neural representations have demonstrated significant promise for 3D scene reconstruction. Recent works have extended their applications to autonomous implicit reconstruction through the Next Best View (NBV) based method. However, the NBV method cannot guarantee complete scene coverage and often necessitates extensive viewpoint sampling, particularly in complex scenes. In the paper, we propose to 1) incorporate frontier-based exploration tasks for global coverage with implicit surface uncertainty-based reconstruction tasks to achieve high-quality reconstruction, and 2) introduce a method to achieve implicit surface uncertainty using color uncertainty, which reduces the time needed for view selection. Further, with these two tasks, we propose an adaptive strategy for switching modes in view path planning, to reduce time and maintain superior reconstruction quality. Our method exhibits the highest reconstruction quality among all planning methods and superior planning efficiency in methods involving reconstruction tasks. We deploy our method on a UAV and the results show that our method can plan multi-task views and reconstruct a scene with high quality.
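
The frontier-based exploration mentioned in this abstract can be illustrated with a minimal sketch. It uses the common textbook definition of a frontier (a free cell that borders an unknown cell) on a 2D occupancy grid. The cell encoding and the function name `find_frontiers` are assumptions made for illustration; the paper itself works with implicit 3D representations, which this sketch does not attempt to reproduce.

```python
# Hypothetical cell encoding for a 2D occupancy grid (not taken from the paper).
FREE, OCCUPIED, UNKNOWN = 0, 1, -1


def find_frontiers(grid):
    """Return (row, col) of every free cell with a 4-connected unknown neighbour."""
    rows, cols = len(grid), len(grid[0])
    frontiers = []
    for r in range(rows):
        for c in range(cols):
            if grid[r][c] != FREE:
                continue
            # Check the four direct neighbours; stop at the first unknown one.
            for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                nr, nc = r + dr, c + dc
                if 0 <= nr < rows and 0 <= nc < cols and grid[nr][nc] == UNKNOWN:
                    frontiers.append((r, c))
                    break
    return frontiers


grid = [
    [FREE, FREE, UNKNOWN],
    [FREE, OCCUPIED, UNKNOWN],
    [FREE, FREE, FREE],
]
frontier_cells = find_frontiers(grid)
```

An exploration planner would typically pick one of these cells as the next navigation goal, which is what pushes the robot toward unobserved space and gives global coverage.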

2024-04-15

Hierarchical Fault-Tolerant Coverage Control for an Autonomous Aerial Agent

Authors: Savvas Papaioannou, Christian Vitale, Panayiotis Kolios, Christos G. Panayiotou, Marios M. Polycarpou

Link: http://arxiv.org/abs/2404.09838v1

Abstract: Fault-tolerant coverage control involves determining a trajectory that enables an autonomous agent to cover specific points of interest, even in the presence of actuation and/or sensing faults. In this work, the agent encounters control inputs that are erroneous; specifically, its nominal control inputs are perturbed by stochastic disturbances, potentially disrupting its intended operation. Existing techniques have focused on deterministically bounded disturbances or relied on the assumption of Gaussian disturbances, whereas non-Gaussian disturbances have primarily been tackled via scenario-based stochastic control methods. However, the assumption of Gaussian disturbances is generally limited to linear systems, and scenario-based methods can become computationally prohibitive. To address these limitations, we propose a hierarchical coverage controller that integrates mixed-trigonometric-polynomial moment propagation to propagate non-Gaussian disturbances through the agent's nonlinear dynamics. Specifically, the first stage generates an ideal reference plan by optimising the agent's mobility and camera control inputs. The second-stage fault-tolerant controller then aims to follow this reference plan, even in the presence of erroneous control inputs caused by non-Gaussian disturbances. This is achieved by imposing a set of deterministic constraints on the moments of the system's uncertain states.

Sampling for Model Predictive Trajectory Planning in Autonomous Driving using Normalizing Flows

Authors: Georg Rabenstein, Lars Ullrich, Knut Graichen

Link: http://arxiv.org/abs/2404.09657v1

Abstract: Alongside optimization-based planners, sampling-based approaches are often used in trajectory planning for autonomous driving due to their simplicity. Model predictive path integral control is a framework that builds upon optimization principles while incorporating stochastic sampling of input trajectories. This paper investigates several sampling approaches for trajectory generation. In this context, normalizing flows originating from the field of variational inference are considered for the generation of sampling distributions, as they model transformations of simple to more complex distributions. Accordingly, learning-based normalizing flow models are trained for a more efficient exploration of the input domain for the task at hand. The developed algorithm and the proposed sampling distributions are evaluated in two simulation scenarios.
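
To make the sampling step of model predictive path integral control concrete, the following is a minimal sketch of one update. It deliberately uses plain Gaussian perturbations as a stand-in for the learned normalizing-flow distributions the abstract describes. The 1D point-mass dynamics, the cost function, and all names (`rollout_cost`, `mppi_step`) are assumptions chosen for illustration, not the authors' setup.

```python
import math
import random


def rollout_cost(x0, v0, controls, target, dt=0.1):
    """Simulate a hypothetical 1D point mass and accumulate a quadratic cost."""
    x, v, cost = x0, v0, 0.0
    for u in controls:
        v += u * dt
        x += v * dt
        cost += (x - target) ** 2 + 0.01 * u ** 2
    return cost


def mppi_step(x0, v0, nominal, target, n_samples=256, sigma=1.0, lam=1.0, rng=None):
    """One path-integral update: perturb, evaluate, and softmin-average the inputs."""
    rng = rng or random.Random(0)
    # Gaussian noise here; a learned sampler would replace this line.
    noises = [[rng.gauss(0.0, sigma) for _ in nominal] for _ in range(n_samples)]
    costs = [
        rollout_cost(x0, v0, [u + e for u, e in zip(nominal, eps)], target)
        for eps in noises
    ]
    c_min = min(costs)  # subtract the minimum for numerical stability
    weights = [math.exp(-(c - c_min) / lam) for c in costs]
    total = sum(weights)
    return [
        u + sum(w * eps[t] for w, eps in zip(weights, noises)) / total
        for t, u in enumerate(nominal)
    ]


nominal = [0.0] * 10
updated = mppi_step(0.0, 0.0, nominal, target=1.0)
```

The quality of the update depends on how many samples land in low-cost regions, which is the motivation for replacing the Gaussian sampler with a distribution shaped for the task.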

AAM-VDT: Vehicle Digital Twin for Tele-Operations in Advanced Air Mobility

Authors: Tuan Anh Nguyen, Taeho Kwag, Vinh Pham, Viet Nghia Nguyen, Jeongseok Hyun, Minseok Jang, Jae-Woo Lee

Link: http://arxiv.org/abs/2404.09621v1

Abstract: This study advanced tele-operations in Advanced Air Mobility (AAM) through the creation of a Vehicle Digital Twin (VDT) system for eVTOL aircraft, tailored to enhance remote control safety and efficiency, especially for Beyond Visual Line of Sight (BVLOS) operations. By synergizing digital twin technology with immersive Virtual Reality (VR) interfaces, we notably elevate situational awareness and control precision for remote operators. Our VDT framework integrates immersive tele-operation with a high-fidelity aerodynamic database, essential for authentically simulating flight dynamics and control tactics. At the heart of our methodology lies an eVTOL's high-fidelity digital replica, placed within a simulated reality that accurately reflects physical laws, enabling operators to manage the aircraft via a master-slave dynamic, substantially outperforming traditional 2D interfaces. The architecture of the designed system ensures seamless interaction between the operator, the digital twin, and the actual aircraft, facilitating exact, instantaneous feedback. Experimental assessments, involving propulsion data gathering, simulation database fidelity verification, and tele-operation testing, verify the system's capability in precise control command transmission and maintaining the digital-physical eVTOL synchronization. Our findings underscore the VDT system's potential in augmenting AAM efficiency and safety, paving the way for broader digital twin application in autonomous aerial vehicles.

Towards Collaborative Autonomous Driving: Simulation Platform and End-to-End System

Authors: Genjia Liu, Yue Hu, Chenxin Xu, Weibo Mao, Junhao Ge, Zhengxiang Huang, Yifan Lu, Yinda Xu, Junkai Xia, Yafei Wang, Siheng Chen

Link: http://arxiv.org/abs/2404.09496v1

Abstract: Vehicle-to-everything-aided autonomous driving (V2X-AD) has a huge potential to provide a safer driving solution. Despite extensive research in transportation and communication to support V2X-AD, the actual utilization of these infrastructures and communication resources in enhancing driving performance remains largely unexplored. This highlights the necessity of collaborative autonomous driving: a machine learning approach that optimizes the information sharing strategy to improve the driving performance of each vehicle. This effort necessitates two key foundations: a platform capable of generating data to facilitate the training and testing of V2X-AD, and a comprehensive system that integrates full driving-related functionalities with mechanisms for information sharing. From the platform perspective, we present V2Xverse, a comprehensive simulation platform for collaborative autonomous driving. This platform provides a complete pipeline for collaborative driving. From the system perspective, we introduce CoDriving, a novel end-to-end collaborative driving system that properly integrates V2X communication over the entire autonomous pipeline, promoting driving with shared perceptual information. The core idea is a novel driving-oriented communication strategy. Leveraging this strategy, CoDriving improves driving performance while optimizing communication efficiency. We make comprehensive benchmarks with V2Xverse, analyzing both modular performance and closed-loop driving performance. Experimental results show that CoDriving: i) significantly improves the driving score by 62.49% and drastically reduces the pedestrian collision rate by 53.50% compared to the SOTA end-to-end driving method, and ii) achieves sustaining driving performance superiority over dynamic constraint communication conditions.

2024-04-13

Selection of Time Headway in Connected and Autonomous Vehicle Platoons under Noisy V2V Communication

Authors: Guoqi Ma, Prabhakar R. Pagilla, Swaroop Darbha

Link: http://arxiv.org/abs/2404.08889v1

Abstract: In this paper, we investigate the selection of time headway to ensure robust string stability in connected and autonomous vehicle platoons in the presence of signal noise in Vehicle-to-Vehicle (V2V) communication. In particular, we consider the effect of noise in communicated vehicle acceleration from the predecessor vehicle to the follower vehicle on the selection of the time headway in predecessor-follower type vehicle platooning with a Constant Time Headway Policy (CTHP). Employing a CTHP based control law for each vehicle that utilizes on-board sensors for measurement of position and velocity of the predecessor vehicle and wireless communication network for obtaining the acceleration of the predecessor vehicle, we investigate how time headway is affected by communicated signal noise. We derive constraints on the CTHP controller gains for predecessor acceleration, velocity error and spacing error and a lower bound on the time headway which will ensure robust string stability of the platoon against signal noise. We provide comparative numerical simulations on an example to illustrate the main result.
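
A constant time headway policy of the kind described here can be sketched as follows. The control law is a commonly used textbook form with gains on predecessor acceleration, velocity error, and spacing error; it is not claimed to be the exact law or gain set from the paper. The gain values, the standstill distance `d0`, and the function names are assumptions for illustration.

```python
import random


def cthp_accel(gap, v_self, v_pred, a_pred_rx,
               h=1.0, d0=5.0, k_a=0.5, k_v=2.0, k_p=1.0):
    """Follower acceleration under a constant time headway spacing policy.

    The desired gap grows with the follower's own speed: d0 + h * v_self.
    a_pred_rx is the predecessor acceleration as received over V2V (may be noisy).
    """
    spacing_error = gap - d0 - h * v_self
    return k_a * a_pred_rx + k_v * (v_pred - v_self) + k_p * spacing_error


def simulate_pair(t_end=60.0, dt=0.01, v_pred=20.0, noise_std=0.0, seed=0):
    """Forward-Euler simulation of one follower behind a cruising predecessor."""
    rng = random.Random(seed)
    x_pred, x_self, v_self = 30.0, 0.0, 15.0
    for _ in range(int(t_end / dt)):
        # The predecessor holds constant speed, so only noise is received.
        a_rx = 0.0 + rng.gauss(0.0, noise_std)
        a = cthp_accel(x_pred - x_self, v_self, v_pred, a_rx)
        x_pred += v_pred * dt
        x_self += v_self * dt
        v_self += a * dt
    return x_pred - x_self, v_self


final_gap, final_speed = simulate_pair()
```

With zero noise the follower settles at the predecessor's speed and at a gap of `d0 + h * v`. Setting `noise_std` above zero injects noise into the communicated acceleration, which is the effect whose influence on the admissible headway `h` the abstract studies.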

Benefits of V2V communication in connected and autonomous vehicles in the presence of delays in communicated signals

Authors: Guoqi Ma, Prabhakar R. Pagilla, Swaroop Darbha

Link: http://arxiv.org/abs/2404.08879v1

Abstract: In this paper, we investigate the effect of signal delay in communicated information in connected and autonomous vehicles. In particular, we relate this delay's effect on the selection of the time headway in predecessor-follower type vehicle platooning with a constant time headway policy (CTHP). We employ a CTHP control law for each vehicle in the platoon by considering two cases: cooperative adaptive cruise control (CACC) strategy where information from only one predecessor vehicle is employed and CACC+ where information from multiple predecessor vehicles is employed. We investigate how the lower bound on the time headway is affected by signal transmission delay due to wireless communication. We provide a systematic approach to the derivation of the lower bound of the time headway and selection of the appropriate CTHP controller gains for predecessor acceleration, velocity error and spacing error which will ensure robust string stability of the platoon under the presence of signal delay. We corroborate the main result with numerical simulations.

2024-04-12

WROOM: An Autonomous Driving Approach for Off-Road Navigation

Authors: Dvij Kalaria, Shreya Sharma, Sarthak Bhagat, Haoru Xue, John M. Dolan

Link: http://arxiv.org/abs/2404.08855v1

Abstract: Off-road navigation is a challenging problem both at the planning level to get a smooth trajectory and at the control level to avoid flipping over, hitting obstacles, or getting stuck at a rough patch. There have been several recent works using classical approaches involving depth map prediction followed by smooth trajectory planning and using a controller to track it. We design an end-to-end reinforcement learning (RL) system for an autonomous vehicle in off-road environments using a custom-designed simulator in the Unity game engine. We warm-start the agent by imitating a rule-based controller and utilize Proximal Policy Optimization (PPO) to improve the policy based on a reward that incorporates Control Barrier Functions (CBF), facilitating the agent's ability to generalize effectively to real-world scenarios. The training involves agents concurrently undergoing domain-randomized trials in various environments. We also propose a novel simulation environment to replicate off-road driving scenarios and deploy our proposed approach on a real buggy RC car. Videos and additional results: https://sites.google.com/view/wroom-utd/home

Enhancing Autonomous Vehicle Training with Language Model Integration and Critical Scenario Generation

Authors: Hanlin Tian, Kethan Reddy, Yuxiang Feng, Mohammed Quddus, Yiannis Demiris, Panagiotis Angeloudis

Link: http://arxiv.org/abs/2404.08570v1

Abstract: This paper introduces CRITICAL, a novel closed-loop framework for autonomous vehicle (AV) training and testing. CRITICAL stands out for its ability to generate diverse scenarios, focusing on critical driving situations that target specific learning and performance gaps identified in the Reinforcement Learning (RL) agent. The framework achieves this by integrating real-world traffic dynamics, driving behavior analysis, surrogate safety measures, and an optional Large Language Model (LLM) component. It is proven that the establishment of a closed feedback loop between the data generation pipeline and the training process can enhance the learning rate during training, elevate overall system performance, and augment safety resilience. Our evaluations, conducted using the Proximal Policy Optimization (PPO) and the HighwayEnv simulation environment, demonstrate noticeable performance improvements with the integration of critical case generation and LLM analysis, indicating CRITICAL's potential to improve the robustness of AV systems and streamline the generation of critical scenarios. This ultimately serves to hasten the development of AV agents, expand the general scope of RL training, and ameliorate validation efforts for AV safety.

Maturity of Vehicle Digital Twins: From Monitoring to Enabling Autonomous Driving

Authors: Robert Klar, Niklas Arvidsson, Vangelis Angelakis

Link: http://arxiv.org/abs/2404.08438v1

Abstract: Digital twinning of vehicles is an iconic application of digital twins, as the concept of twinning dates back to the twinning of NASA space vehicles. Although digital twins (DTs) in the automotive industry have been recognized for their ability to improve efficiency in design and manufacturing, their potential to enhance land vehicle operation has yet to be fully explored. Most existing DT research on vehicle operations, aside from the existing body of work on autonomous guided vehicles (AGVs), focuses on electrified passenger cars. However, the use and value of twinning varies depending on the goal, whether it is to provide cost-efficient and sustainable freight transport without disruptions, sustainable public transport focused on passenger well-being, or fully autonomous vehicle operation. In this context, DTs are used for a range of applications, from real-time battery health monitoring to enabling fully autonomous vehicle operations. This leads to varying requirements, complexities, and maturities of the implemented DT solutions. This paper analyzes recent trends in DT-driven efficiency gains for freight, public, and autonomous vehicles and discusses their required level of maturity based on a maturity tool. The application of our DT maturity tool reveals that most DTs have reached level 3 and enable real-time monitoring. Additionally, DTs of level 5 already exist in closed environments, allowing for restricted autonomous operation.

2024-04-11

LLM Agents can Autonomously Exploit One-day Vulnerabilities

Authors: Richard Fang, Rohan Bindu, Akul Gupta, Daniel Kang

Link: http://arxiv.org/abs/2404.08144v2

Abstract: LLMs have become increasingly powerful, both in their benign and malicious uses. With the increase in capabilities, researchers have been increasingly interested in their ability to exploit cybersecurity vulnerabilities. In particular, recent work has conducted preliminary studies on the ability of LLM agents to autonomously hack websites. However, these studies are limited to simple vulnerabilities. In this work, we show that LLM agents can autonomously exploit one-day vulnerabilities in real-world systems. To show this, we collected a dataset of 15 one-day vulnerabilities that include ones categorized as critical severity in the CVE description. When given the CVE description, GPT-4 is capable of exploiting 87% of these vulnerabilities compared to 0% for every other model we test (GPT-3.5, open-source LLMs) and open-source vulnerability scanners (ZAP and Metasploit). Fortunately, our GPT-4 agent requires the CVE description for high performance: without the description, GPT-4 can exploit only 7% of the vulnerabilities. Our findings raise questions around the widespread deployment of highly capable LLM agents.

NeuroNCAP: Photorealistic Closed-loop Safety Testing for Autonomous Driving

Authors: William Ljungbergh, Adam Tonderski, Joakim Johnander, Holger Caesar, Kalle Åström, Michael Felsberg, Christoffer Petersson

Link: http://arxiv.org/abs/2404.07762v2

Abstract: We present a versatile NeRF-based simulator for testing autonomous driving (AD) software systems, designed with a focus on sensor-realistic closed-loop evaluation and the creation of safety-critical scenarios. The simulator learns from sequences of real-world driving sensor data and enables reconfigurations and renderings of new, unseen scenarios. In this work, we use our simulator to test the responses of AD models to safety-critical scenarios inspired by the European New Car Assessment Programme (Euro NCAP). Our evaluation reveals that, while state-of-the-art end-to-end planners excel in nominal driving scenarios in an open-loop setting, they exhibit critical flaws when navigating our safety-critical scenarios in a closed-loop setting. This highlights the need for advancements in the safety and real-world usability of end-to-end planners. By publicly releasing our simulator and scenarios as an easy-to-run evaluation suite, we invite the research community to explore, refine, and validate their AD models in controlled, yet highly configurable and challenging sensor-realistic environments. Code and instructions can be found at https://github.com/wljungbergh/NeuroNCAP

2024-04-10

Incorporating Explanations into Human-Machine Interfaces for Trust and Situation Awareness in Autonomous Vehicles

Authors: Shahin Atakishiyev, Mohammad Salameh, Randy Goebel

Link: http://arxiv.org/abs/2404.07383v1

Abstract: Autonomous vehicles often make complex decisions via machine learning-based predictive models applied to collected sensor data. While this combination of methods provides a foundation for real-time actions, self-driving behavior primarily remains opaque to end users. In this sense, explainability of real-time decisions is a crucial and natural requirement for building trust in autonomous vehicles. Moreover, as autonomous vehicles still cause serious traffic accidents for various reasons, timely conveyance of upcoming hazards to road users can help improve scene understanding and prevent potential risks. Hence, there is also a need to supply autonomous vehicles with user-friendly interfaces for effective human-machine teaming. Motivated by this problem, we study the role of explainable AI and human-machine interface jointly in building trust in vehicle autonomy. We first present a broad context of the explanatory human-machine systems with the "3W1H" (what, whom, when, how) approach. Based on these findings, we present a situation awareness framework for calibrating users' trust in self-driving behavior. Finally, we perform an experiment on our framework, conduct a user study on it, and validate the empirical findings with hypothesis testing.

Enhanced Cooperative Perception for Autonomous Vehicles Using Imperfect Communication

Authors: Ahmad Sarlak, Hazim Alzorgan, Sayed Pedram Haeri Boroujeni, Abolfazl Razi, Rahul Amin

Link: http://arxiv.org/abs/2404.08013v1

Abstract: Sharing and joint processing of camera feeds and sensor measurements, known as Cooperative Perception (CP), has emerged as a new technique to achieve higher perception qualities. CP can enhance the safety of Autonomous Vehicles (AVs) where their individual visual perception quality is compromised by adverse weather conditions (haze as foggy weather), low illumination, winding roads, and crowded traffic. To cover the limitations of former methods, in this paper, we propose a novel approach to realize an optimized CP under constrained communications. At the core of our approach is recruiting the best helper from the available list of front vehicles to augment the visual range and enhance the Object Detection (OD) accuracy of the ego vehicle. In this two-step process, we first select the helper vehicles that contribute the most to CP based on their visual range and lowest motion blur. Next, we implement a radio block optimization among the candidate vehicles to further improve communication efficiency. We specifically focus on pedestrian detection as an exemplary scenario. To validate our approach, we used the CARLA simulator to create a dataset of annotated videos for different driving scenarios where pedestrian detection is challenging for an AV with compromised vision. Our results demonstrate the efficacy of our two-step optimization process in improving the overall performance of cooperative perception in challenging scenarios, substantially improving driving safety under adverse conditions. Finally, we note that the networking assumptions are adopted from LTE Release 14 Mode 4 side-link communication, commonly used for Vehicle-to-Vehicle (V2V) communication. Nonetheless, our method is flexible and applicable to arbitrary V2V communications.

Multi-Agent Soft Actor-Critic with Global Loss for Autonomous Mobility-on-Demand Fleet Control

Authors: Zeno Woywood, Jasper I. Wiltfang, Julius Luy, Tobias Enders, Maximilian Schiffer

Link: http://arxiv.org/abs/2404.06975v1

Abstract: We study a sequential decision-making problem for a profit-maximizing operator of an Autonomous Mobility-on-Demand system. Optimizing a central operator's vehicle-to-request dispatching policy requires efficient and effective fleet control strategies. To this end, we employ a multi-agent Soft Actor-Critic algorithm combined with weighted bipartite matching. We propose a novel vehicle-based algorithm architecture and adapt the critic's loss function to appropriately consider global actions. Furthermore, we extend our algorithm to incorporate rebalancing capabilities. Through numerical experiments, we show that our approach outperforms state-of-the-art benchmarks by up to 12.9% for dispatching and up to 38.9% with integrated rebalancing.
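
The weighted bipartite matching step that the abstract combines with Soft Actor-Critic can be illustrated in isolation. The sketch below solves a tiny vehicle-to-request assignment by brute force; the weight matrix is hypothetical (in the paper the weights would come from the learned agents), and the function name `best_assignment` is an assumption for illustration.

```python
from itertools import permutations


def best_assignment(weights):
    """Maximum-weight matching by exhaustive search.

    weights[v][r] is the score of vehicle v serving request r.
    Assumes there are at least as many requests as vehicles.
    Returns (assignment, total) where assignment[v] is the request for vehicle v.
    """
    n_vehicles, n_requests = len(weights), len(weights[0])
    best, best_total = None, float("-inf")
    for perm in permutations(range(n_requests), n_vehicles):
        total = sum(weights[v][r] for v, r in enumerate(perm))
        if total > best_total:
            best, best_total = perm, total
    return best, best_total


# Hypothetical scores for 3 vehicles and 3 requests.
weights = [
    [4, 1, 3],
    [2, 0, 5],
    [3, 2, 2],
]
assignment, total = best_assignment(weights)
```

Exhaustive search is only practical for a handful of vehicles. At fleet scale a polynomial-time solver such as the Hungarian algorithm would be the usual choice.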

PACP: Priority-Aware Collaborative Perception for Connected and Autonomous Vehicles

Authors: Zhengru Fang, Senkang Hu, Haonan An, Yuang Zhang, Jingjing Wang, Hangcheng Cao, Xianhao Chen, Yuguang Fang

Link: http://arxiv.org/abs/2404.06891v1

Abstract: Surrounding perceptions are quintessential for safe driving for connected and autonomous vehicles (CAVs), where the Bird's Eye View has been employed to accurately capture spatial relationships among vehicles. However, severe inherent limitations of BEV, like blind spots, have been identified. Collaborative perception has emerged as an effective solution to overcoming these limitations through data fusion from multiple views of surrounding vehicles. While most existing collaborative perception strategies adopt a fully connected graph predicated on fairness in transmissions, they often neglect the varying importance of individual vehicles due to channel variations and perception redundancy. To address these challenges, we propose a novel Priority-Aware Collaborative Perception (PACP) framework to employ a BEV-match mechanism to determine the priority levels based on the correlation between nearby CAVs and the ego vehicle for perception. By leveraging submodular optimization, we find near-optimal transmission rates, link connectivity, and compression metrics. Moreover, we deploy a deep learning-based adaptive autoencoder to modulate the image reconstruction quality under dynamic channel conditions. Finally, we conduct extensive studies and demonstrate that our scheme significantly outperforms the state-of-the-art schemes by 8.27% and 13.60%, respectively, in terms of utility and precision of the Intersection over Union.

Monocular 3D lane detection for Autonomous Driving: Recent Achievements, Challenges, and Outlooks

Authors: Fulong Ma, Weiqing Qi, Guoyang Zhao, Linwei Zheng, Sheng Wang, Ming Liu

Link: http://arxiv.org/abs/2404.06860v1

Abstract: 3D lane detection plays a crucial role in autonomous driving by extracting structural and traffic information from the road in 3D space to assist the self-driving car in rational, safe, and comfortable path planning and motion control. Due to the consideration of sensor costs and the advantages of visual data in color information, in practical applications, 3D lane detection based on monocular vision is one of the important research directions in the field of autonomous driving, which has attracted more and more attention in both industry and academia. Unfortunately, recent progress in visual perception seems insufficient to develop completely reliable 3D lane detection algorithms, which also hinders the development of vision-based fully autonomous self-driving cars, i.e., achieving level 5 autonomous driving, driving like human-controlled cars. This is one of the conclusions drawn from this review paper: there is still a lot of room for improvement and significant improvements are still needed in the 3D lane detection algorithm for autonomous driving cars using visual sensors. Motivated by this, this review defines, analyzes, and reviews the current achievements in the field of 3D lane detection research, and the vast majority of the current progress relies heavily on computationally complex deep learning models. In addition, this review covers the 3D lane detection pipeline, investigates the performance of state-of-the-art algorithms, analyzes the time complexity of cutting-edge modeling choices, and highlights the main achievements and limitations of current research efforts. The survey also includes a comprehensive discussion of available 3D lane detection datasets and the challenges that researchers have faced but have not yet resolved. Finally, our work outlines future research directions and welcomes researchers and practitioners to enter this exciting field.

Enhancing Safety in Mixed Traffic: Learning-Based Modeling and Efficient Control of Autonomous and Human-Driven Vehicles

Authors: Jie Wang, Yash Vardhan Pant, Lei Zhao, Michał Antkiewicz, Krzysztof Czarnecki

Link: http://arxiv.org/abs/2404.06732v1

Abstract: With the increasing presence of autonomous vehicles (AVs) on public roads, developing robust control strategies to navigate the uncertainty of human-driven vehicles (HVs) is crucial. This paper introduces an advanced method for modeling HV behavior, combining a first-principles model with Gaussian process (GP) learning to enhance velocity prediction accuracy and provide a measurable uncertainty. We validated this innovative HV model using real-world data from field experiments and applied it to develop a GP-enhanced model predictive control (GP-MPC) strategy. This strategy aims to improve safety in mixed vehicle platoons by integrating uncertainty assessment into distance constraints. Comparative simulation studies with a conventional model predictive control (MPC) approach demonstrated that our GP-MPC strategy ensures more reliable safe distancing and fosters efficient vehicular dynamics, achieving notably higher speeds within the platoon. By incorporating a sparse GP technique in HV modeling and adopting a dynamic GP prediction within the MPC framework, we significantly reduced the computation time of GP-MPC, marking it only 4.6% higher than that of the conventional MPC. This represents a substantial improvement, making the process about 100 times faster than our preliminary work without these approximations. Our findings underscore the effectiveness of learning-based HV modeling in enhancing both safety and operational efficiency in mixed-traffic environments, paving the way for more harmonious AV-HV interactions.

2024-04-09

Autonomous Evaluation and Refinement of Digital Agents

Authors: Jiayi Pan, Yichi Zhang, Nicholas Tomlin, Yifei Zhou, Sergey Levine, Alane Suhr

Link: http://arxiv.org/abs/2404.06474v2

Abstract: We show that domain-general automatic evaluators can significantly improve the performance of agents for web navigation and device control. We experiment with multiple evaluation models that trade off between inference cost, modularity of design, and accuracy. We validate the performance of these models in several popular benchmarks for digital agents, finding between 74.4 and 92.9% agreement with oracle evaluation metrics. Finally, we use these evaluators to improve the performance of existing agents via fine-tuning and inference-time guidance. Without any additional supervision, we improve state-of-the-art performance by 29% on the popular benchmark WebArena, and achieve a 75% relative improvement in a challenging domain transfer scenario.

Towards Autonomous Driving with Small-Scale Cars: A Survey of Recent Development

Authors: Dianzhao Li, Paul Auerbach, Ostap Okhrin

Link: http://arxiv.org/abs/2404.06229v1

Abstract: While engaging with the unfolding revolution in autonomous driving, a challenge presents itself: how can we effectively raise awareness within society about this transformative trend? While full-scale autonomous driving vehicles often come with a hefty price tag, the emergence of small-scale car platforms offers a compelling alternative. These platforms not only serve as valuable educational tools for the broader public and young generations but also function as robust research platforms, contributing significantly to the ongoing advancements in autonomous driving technology. This survey outlines various small-scale car platforms, categorizing them and detailing the research advancements accomplished through their usage. The conclusion provides proposals for promising future directions in the field.

2024-04-08

Design of Transit-Centric Multimodal Urban Mobility System with Autonomous Mobility-on-Demand

Authors: Xiaotong Guo, Jinhua Zhao

Link: http://arxiv.org/abs/2404.05885v1

Abstract: This paper addresses the pressing challenge of urban mobility in the context of growing urban populations, changing demand patterns for urban mobility, and emerging technologies like Mobility-on-Demand (MoD) platforms and Autonomous Vehicles (AVs). As urban areas swell and demand patterns change, the integration of Autonomous Mobility-on-Demand (AMoD) systems with existing public transit (PT) networks presents great opportunities to enhance urban mobility. We propose a novel optimization framework for solving the Transit-Centric Multimodal Urban Mobility with Autonomous Mobility-on-Demand (TCMUM-AMoD) at scale. The system operator (public transit agency) determines the network design and frequency settings of the PT network, fleet sizing and allocations of AMoD system, and the pricing for using the multimodal system with the goal of minimizing passenger disutility. Passengers' mode and route choice behaviors are modeled explicitly using discrete choice models. A first-order approximation algorithm is introduced to solve the problem at scale. Using a case study in Chicago, we showcase the potential to optimize urban mobility across different demand scenarios. To our knowledge, ours is the first paper to jointly optimize transit network design, fleet sizing, and pricing for the multimodal mobility system while considering passengers' mode and route choices.

Human-Machine Interaction in Automated Vehicles: Reducing Voluntary Driver Intervention

Authors: Xinzhi Zhong, Yang Zhou, Varshini Kamaraj, Zhenhao Zhou, Wissam Kontar, Dan Negrut, John D. Lee, Soyoung Ahn

Link: http://arxiv.org/abs/2404.05832v1

Abstract: This paper develops a novel car-following control method to reduce voluntary driver interventions and improve traffic stability in Automated Vehicles (AVs). Through a combination of experimental and empirical analysis, we show how voluntary driver interventions can instigate substantial traffic disturbances that are amplified along the traffic upstream. Motivated by these findings, we present a framework for driver intervention based on evidence accumulation (EA), which describes the evolution of the driver's distrust in automation, ultimately resulting in intervention. Informed through the EA framework, we propose a deep reinforcement learning (DRL)-based car-following control for AVs that is strategically designed to mitigate unnecessary driver intervention and improve traffic stability. Numerical experiments are conducted to demonstrate the effectiveness of the proposed control model.
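
The evidence accumulation idea in this abstract can be illustrated with a toy accumulator. This is only a sketch under assumptions, not the authors' model: the function name `first_intervention_step`, the `leak` parameter, and the fixed `threshold` are hypothetical, and the per-step evidence values are made up for illustration. Distrust grows with each piece of evidence, and an intervention is triggered once it crosses the threshold.

```python
def first_intervention_step(evidence, threshold, leak=0.0):
    """Return the index of the first step at which accumulated distrust
    reaches `threshold`, or None if it never does.

    evidence  -- per-step evidence against the automation (hypothetical values)
    threshold -- distrust level at which the driver intervenes (assumption)
    leak      -- fraction of distrust forgotten each step (assumption)
    """
    distrust = 0.0
    for step, e in enumerate(evidence):
        # Decay the previous distrust, add the new evidence, clamp at zero.
        distrust = max(0.0, (1.0 - leak) * distrust + e)
        if distrust >= threshold:
            return step
    return None


if __name__ == "__main__":
    # Strong evidence accumulates quickly and triggers an intervention.
    print(first_intervention_step([0.5, 0.5, 0.5], threshold=1.0))
    # Weak evidence never reaches the threshold.
    print(first_intervention_step([0.1, 0.1, 0.1], threshold=1.0))
```

A controller that aims to reduce voluntary interventions would, in this picture, try to keep the per-step evidence small enough that the accumulator stays below the threshold.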

Design and Simulation of Time-energy Optimal Anti-swing Trajectory Planner for Autonomous Tower Cranes

Authors: Souravik Dutta, Yiyu Cai

Link: http://arxiv.org/abs/2404.05581v1

Abstract: For autonomous crane lifting, optimal trajectories of the crane are required as reference inputs to the crane controller to facilitate feedforward control. Reducing the unactuated payload motion is a crucial issue for under-actuated tower cranes with spherical pendulum dynamics. The planned trajectory should be optimal in terms of both operating time and energy consumption, to facilitate optimum output spending optimum effort. This article proposes an anti-swing tower crane trajectory planner that can provide time-energy optimal solutions for the Computer-Aided Lift Planning (CALP) system developed at Nanyang Technological University, which facilitates collision-free lifting path planning of robotized tower cranes in autonomous construction sites. The current work introduces a trajectory planning module to the system that utilizes the geometric outputs from the path planning module and optimally scales them with time information. Firstly, analyzing the non-linear dynamics of the crane operations, the tower crane is established as differentially flat. Subsequently, the multi-objective trajectory optimization problems for all the crane operations are formulated in the flat output space through consideration of the mechanical and safety constraints. Two multi-objective evolutionary algorithms, namely Non-dominated Sorting Genetic Algorithm (NSGA-II) and Generalized Differential Evolution 3 (GDE3), are extensively compared via statistical measures based on the closeness of solutions to the Pareto front, distribution of solutions in the solution space and the runtime, to select the optimization engine of the planner. Finally, the crane operation trajectories are obtained via the corresponding planned flat output trajectories. Studies simulating real-world lifting scenarios are conducted to verify the effectiveness and reliability of the proposed module of the lift planning system.

AutoCodeRover: Autonomous Program Improvement

Authors: Yuntong Zhang, Haifeng Ruan, Zhiyu Fan, Abhik Roychoudhury

Link: http://arxiv.org/abs/2404.05427v2

Abstract: Researchers have made significant progress in automating the software development process in the past decades. Recent progress in Large Language Models (LLMs) has significantly impacted the development process, where developers can use LLM-based programming assistants to achieve automated coding. Nevertheless, software engineering involves the process of program improvement apart from coding, specifically to enable software maintenance (e.g. bug fixing) and software evolution (e.g. feature additions). In this paper, we propose an automated approach for solving GitHub issues to autonomously achieve program improvement. In our approach, called AutoCodeRover, LLMs are combined with sophisticated code search capabilities, ultimately leading to a program modification or patch. In contrast to recent LLM agent approaches from AI researchers and practitioners, our outlook is more software engineering oriented. We work on a program representation (abstract syntax tree) as opposed to viewing a software project as a mere collection of files. Our code search exploits the program structure in the form of classes/methods to enhance the LLM's understanding of the issue's root cause, and effectively retrieve a context via iterative search. The use of spectrum-based fault localization using tests further sharpens the context, as long as a test suite is available. Experiments on SWE-bench-lite, which consists of 300 real-life GitHub issues, show increased efficacy in solving GitHub issues (22-23% on SWE-bench-lite). On the full SWE-bench consisting of 2294 GitHub issues, AutoCodeRover solved around 16% of issues, which is higher than the efficacy of the recently reported AI software engineer Devin from Cognition Labs, while taking time comparable to Devin. We posit that our workflow enables autonomous software engineering, where, in future, auto-generated code from LLMs can be autonomously improved.

Residual Chain Prediction for Autonomous Driving Path Planning

Authors: Liguo Zhou, Yirui Zhou, Huaming Liu, Alois Knoll

Link: http://arxiv.org/abs/2404.05423v1

Abstract: In the rapidly evolving field of autonomous driving systems, the refinement of path planning algorithms is paramount for navigating vehicles through dynamic environments, particularly in complex urban scenarios. Traditional path planning algorithms, which are heavily reliant on static rules and manually defined parameters, often fall short in such contexts, highlighting the need for more adaptive, learning-based approaches. Among these, behavior cloning emerges as a noteworthy strategy for its simplicity and efficiency, especially within the realm of end-to-end path planning. However, behavior cloning faces challenges, such as covariate shift when employing traditional Manhattan distance as the metric. Addressing this, our study introduces the novel concept of Residual Chain Loss. Residual Chain Loss dynamically adjusts the loss calculation process to enhance the temporal dependency and accuracy of predicted path points, significantly improving the model's performance without additional computational overhead. Through testing on the nuScenes dataset, we underscore the method's substantial advancements in addressing covariate shift, facilitating dynamic loss adjustments, and ensuring seamless integration with end-to-end path planning frameworks. Our findings highlight the potential of Residual Chain Loss to revolutionize the planning component of autonomous driving systems, marking a significant step forward in the quest for level 5 autonomous driving systems.

2024-04-07

AirShot: Efficient Few-Shot Detection for Autonomous Exploration

Authors: Zihan Wang, Bowen Li, Chen Wang, Sebastian Scherer

Link: http://arxiv.org/abs/2404.05069v1

Abstract: Few-shot object detection has drawn increasing attention in the field of robotic exploration, where robots are required to find unseen objects with a few online provided examples. Although recent efforts have been made to yield online processing capabilities, slow inference speeds of low-powered robots fail to meet the demands of real-time detection, making them impractical for autonomous exploration. Existing methods still face performance and efficiency challenges, mainly due to unreliable features and exhaustive class loops. In this work, we propose a new paradigm, AirShot, and discover that, by fully exploiting the valuable correlation map, AirShot can result in a more robust and faster few-shot object detection system, which is more applicable to the robotics community. The core module, Top Prediction Filter (TPF), can operate on multi-scale correlation maps in both the training and inference stages. During training, TPF supervises the generation of a more representative correlation map, while during inference, it reduces looping iterations by selecting top-ranked classes, thus cutting down on computational costs with better performance. Surprisingly, this dual functionality exhibits general effectiveness and efficiency on various off-the-shelf models. Exhaustive experiments on COCO2017, VOC2014, and SubT datasets demonstrate that TPF can significantly boost the efficacy and efficiency of most off-the-shelf models, achieving up to 36.4% precision improvements along with 56.3% faster inference speed. Code and Data are at: https://github.com/ImNotPrepared/AirShot.
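
The inference-time effect described above, looping over only the top-ranked classes instead of every class, can be sketched as follows. This is a minimal illustration under assumptions, not the AirShot implementation: `top_k_classes`, `detect_with_filter`, the per-class scores, and the `run_detection_head` callback are all hypothetical names introduced here.

```python
def top_k_classes(class_scores, k):
    """Return the names of the k highest-scoring classes, best first.

    class_scores -- dict mapping class name to a relevance score
                    (a stand-in for a score derived from a correlation map)
    """
    ranked = sorted(class_scores.items(), key=lambda kv: kv[1], reverse=True)
    return [name for name, _ in ranked[:k]]


def detect_with_filter(class_scores, k, run_detection_head):
    """Run the (expensive) per-class detection head only for the top-k classes."""
    return {name: run_detection_head(name) for name in top_k_classes(class_scores, k)}


if __name__ == "__main__":
    scores = {"drill": 0.91, "helmet": 0.12, "rope": 0.55, "backpack": 0.08}
    calls = []

    def fake_head(name):
        # Placeholder for a real detection head; it only records the call.
        calls.append(name)
        return []

    detect_with_filter(scores, k=2, run_detection_head=fake_head)
    print(calls)  # only two of the four classes were processed
```

With four candidate classes and `k=2`, the per-class loop runs twice instead of four times, which is where the computational saving in this sketch comes from.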

Multi-Type Map Construction via Semantics-Aware Autonomous Exploration in Unknown Indoor Environments

Authors: Jianfang Mao, Yuheng Xie, Si Chen, Zhixiong Nan, Xiao Wang

Link: http://arxiv.org/abs/2404.04879v1

Abstract: This paper proposes a novel semantics-aware autonomous exploration model to handle a long-standing issue: the mainstream RRT (Rapidly-exploring Random Tree) based exploration models usually make the mobile robot switch frequently between different regions, leading to excessively repeated exploration of the same region. Our proposed semantics-aware model encourages a mobile robot to fully explore the current region before moving to the next region, which avoids excessively repeated exploration and makes the exploration faster. The core idea of the semantics-aware autonomous exploration model is to optimize the sampling point selection mechanism and the frontier point evaluation function by considering the semantic information of regions. In addition, compared with existing autonomous exploration methods that usually construct a single type or 2-3 types of maps, our model can construct four kinds of maps: a point cloud map, an occupancy grid map, a topological map, and a semantic map. To test the performance of our model, we conducted experiments in three simulated environments. The experimental results demonstrate that, compared to Improved RRT, our model achieved a 33.0% exploration time reduction and a 39.3% exploration trajectory length reduction while maintaining a >98% exploration rate.

Prompting Multi-Modal Tokens to Enhance End-to-End Autonomous Driving Imitation Learning with LLMs

Authors: Yiqun Duan, Qiang Zhang, Renjing Xu

Link: http://arxiv.org/abs/2404.04869v1

Abstract: The utilization of Large Language Models (LLMs) within the realm of reinforcement learning, particularly as planners, has garnered a significant degree of attention in recent scholarly literature. However, a substantial proportion of existing research predominantly focuses on planning models for robotics that transmute the outputs derived from perception models into linguistic forms, thus adopting a 'pure-language' strategy. In this research, we propose a hybrid End-to-End learning framework for autonomous driving by combining basic driving imitation learning with LLMs based on multi-modality prompt tokens. Instead of simply converting perception results from a separately trained model into pure language input, our novelty lies in two aspects. 1) The end-to-end integration of visual and LiDAR sensory input into learnable multi-modality tokens, thereby intrinsically alleviating description bias introduced by separate pre-trained perception models. 2) Instead of directly letting LLMs drive, this paper explores a hybrid setting of letting LLMs help the driving model correct mistakes and handle complicated scenarios. The results of our experiments suggest that the proposed methodology can attain driving scores of 49.21%, coupled with an impressive route completion rate of 91.34% in the offline evaluation conducted via CARLA. These performance metrics are comparable to the most advanced driving models.

Light the Night: A Multi-Condition Diffusion Framework for Unpaired Low-Light Enhancement in Autonomous Driving

Authors: Jinlong Li, Baolu Li, Zhengzhong Tu, Xinyu Liu, Qing Guo, Felix Juefei-Xu, Runsheng Xu, Hongkai Yu

Link: http://arxiv.org/abs/2404.04804v1

Abstract: Vision-centric perception systems for autonomous driving have gained considerable attention recently due to their cost-effectiveness and scalability, especially compared to LiDAR-based systems. However, these systems often struggle in low-light conditions, potentially compromising their performance and safety. To address this, our paper introduces LightDiff, a domain-tailored framework designed to enhance the low-light image quality for autonomous driving applications. Specifically, we employ a multi-condition controlled diffusion model. LightDiff works without any human-collected paired data, leveraging a dynamic data degradation process instead. It incorporates a novel multi-condition adapter that adaptively controls the input weights from different modalities, including depth maps, RGB images, and text captions, to effectively illuminate dark scenes while maintaining context consistency. Furthermore, to align the enhanced images with the detection model's knowledge, LightDiff employs perception-specific scores as rewards to guide the diffusion training process through reinforcement learning. Extensive experiments on the nuScenes datasets demonstrate that LightDiff can significantly improve the performance of several state-of-the-art 3D detectors in night-time conditions while achieving high visual quality scores, highlighting its potential to safeguard autonomous driving.

2024-04-06

HawkDrive: A Transformer-driven Visual Perception System for Autonomous Driving in Night Scene

Authors: Ziang Guo, Stepan Perminov, Mikhail Konenkov, Dzmitry Tsetserukou

Link: http://arxiv.org/abs/2404.04653v1

Abstract: Many established vision perception systems for autonomous driving scenarios ignore the influence of light conditions, one of the key elements for driving safety. To address this problem, we present HawkDrive, a novel perception system with hardware and software solutions. Hardware that utilizes stereo vision perception, which has been demonstrated to be a more reliable way of estimating depth information than monocular vision, is partnered with the edge computing device Nvidia Jetson Xavier AGX. Our software for low light enhancement, depth estimation, and semantic segmentation tasks, is a transformer-based neural network. Our software stack, which enables fast inference and noise reduction, is packaged into system modules in Robot Operating System 2 (ROS2). Our experimental results have shown that the proposed end-to-end system is effective in improving the depth estimation and semantic segmentation performance. Our dataset and codes will be released at https://github.com/ZionGo6/HawkDrive.

2024-04-05

Exploring Autonomous Agents through the Lens of Large Language Models: A Review

Authors: Saikat Barua

Link: http://arxiv.org/abs/2404.04442v1

Abstract: Large Language Models (LLMs) are transforming artificial intelligence, enabling autonomous agents to perform diverse tasks across various domains. These agents, proficient in human-like text comprehension and generation, have the potential to revolutionize sectors from customer service to healthcare. However, they face challenges such as multimodality, human value alignment, hallucinations, and evaluation. Techniques like prompting, reasoning, tool utilization, and in-context learning are being explored to enhance their capabilities. Evaluation platforms like AgentBench, WebArena, and ToolLLM provide robust methods for assessing these agents in complex scenarios. These advancements are leading to the development of more resilient and capable autonomous agents, anticipated to become integral in our digital lives, assisting in tasks from email responses to disease diagnosis. The future of AI, with LLMs at the forefront, is promising.

A Ground Mobile Robot for Autonomous Terrestrial Laser Scanning-Based Field Phenotyping

Authors: Javier Rodriguez-Sanchez, Kyle Johnsen, Changying Li

Link: http://arxiv.org/abs/2404.04404v1

Abstract: Traditional field phenotyping methods are often manual, time-consuming, and destructive, posing a challenge for breeding progress. To address this bottleneck, robotics and automation technologies offer efficient sensing tools to monitor field evolution and crop development throughout the season. This study aimed to develop an autonomous ground robotic system for LiDAR-based field phenotyping in plant breeding trials. A Husky platform was equipped with a high-resolution three-dimensional (3D) laser scanner to collect in-field terrestrial laser scanning (TLS) data without human intervention. To automate the TLS process, a 3D ray casting analysis was implemented for optimal TLS site planning, and a route optimization algorithm was utilized to minimize travel distance during data collection. The platform was deployed in two cotton breeding fields for evaluation, where it autonomously collected TLS data. The system provided accurate pose information through RTK-GNSS positioning and sensor fusion techniques, with average errors of less than 0.6 cm for location and 0.38° for heading. The achieved localization accuracy allowed point cloud registration with mean point errors of approximately 2 cm, comparable to traditional TLS methods that rely on artificial targets and manual sensor deployment. This work presents an autonomous phenotyping platform that facilitates the quantitative assessment of plant traits under field conditions of both large agricultural fields and small breeding trials to contribute to the advancement of plant phenomics and breeding programs.

2024-04-03

Autonomous Vehicle Networks for More Reliable Truck Tracking in Challenged High Mountain Roads, Tunnels and Bridges Environments

Authors: Junhao Chen, Milena Radenkovic

Link: http://arxiv.org/abs/2404.03033v1

Abstract: The popularity of online shopping has challenged existing express tracking. Providing customers with reliable and stable express tracking has become one of the important issues that express companies now need to solve. Current courier tracking is not ideal in challenging environments such as mountain roads, tunnels and city centres. This project therefore aims to achieve stable express tracking in such environments; we propose the Ya'an scenario and conduct multiple experiments. We show that opportunistic DTN-aware protocols are a feasible solution for trucks to maintain stable communication in challenging environments, with nodes maintaining extremely high message delivery rates and average delays that still allow communication to be maintained.

Leveraging Swarm Intelligence to Drive Autonomously: A Particle Swarm Optimization based Approach to Motion Planning

Authors: Sven Ochs, Jens Doll, Marc Heinrich, Philip Schörner, Sebastian Klemm, Marc René Zofka, J. Marius Zöllner

Link: http://arxiv.org/abs/2404.02644v1

Abstract: Motion planning is an essential part of autonomous mobile platforms. A good pipeline should be modular enough to handle different vehicles, environments, and perception modules. The planning process has to cope with all the different modalities and has to have a modular and flexible design. But most importantly, it has to be safe and robust. In this paper, we present our motion planning pipeline with particle swarm optimization (PSO) at its core. This solution is independent of the vehicle type and has a clear and simple-to-implement interface for perception modules. Moreover, the approach stands out for being easily adaptable to new scenarios. Parallel calculation allows for fast planning cycles. Following the principles of PSO, the trajectory planner first generates a swarm of initial trajectories that are optimized afterward. We present the underlying control space and inner workings. Finally, the application to real-world automated driving is shown in the evaluation with a deeper look at the modeling of the cost function. The approach is used in our automated shuttles that have already driven more than 3,500 km safely and entirely autonomously in suburban everyday traffic.
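
For readers unfamiliar with PSO, the generate-then-optimize loop mentioned in this abstract can be sketched with a textbook particle swarm. This is a generic illustration, not the authors' planner: the function `pso_minimize`, its hyperparameter defaults, and the toy cost function (a short control sequence that should stay close to a target value of 1.0) are all assumptions made here.

```python
import random


def pso_minimize(cost, dim, n_particles=30, n_iters=100,
                 inertia=0.7, c_personal=1.5, c_social=1.5,
                 bounds=(-5.0, 5.0), seed=0):
    """Textbook particle swarm optimization; returns (best_position, best_cost)."""
    rng = random.Random(seed)
    lo, hi = bounds
    # Initial swarm: random candidate solutions with zero velocity.
    pos = [[rng.uniform(lo, hi) for _ in range(dim)] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    best_pos = [p[:] for p in pos]          # each particle's personal best
    best_cost = [cost(p) for p in pos]
    g = min(range(n_particles), key=lambda i: best_cost[i])
    g_pos, g_cost = best_pos[g][:], best_cost[g]  # swarm-wide best

    for _ in range(n_iters):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                # Pull towards the personal best and the global best.
                vel[i][d] = (inertia * vel[i][d]
                             + c_personal * r1 * (best_pos[i][d] - pos[i][d])
                             + c_social * r2 * (g_pos[d] - pos[i][d]))
                pos[i][d] = min(hi, max(lo, pos[i][d] + vel[i][d]))
            c = cost(pos[i])
            if c < best_cost[i]:
                best_cost[i], best_pos[i] = c, pos[i][:]
                if c < g_cost:
                    g_cost, g_pos = c, pos[i][:]
    return g_pos, g_cost


def toy_cost(controls):
    """Hypothetical cost: squared deviation of each control from 1.0."""
    return sum((u - 1.0) ** 2 for u in controls)


if __name__ == "__main__":
    best, value = pso_minimize(toy_cost, dim=3)
    print(best, value)
```

In a planner, each particle would encode a candidate trajectory in some control space and the cost would combine terms such as safety and comfort. Because each particle's cost is evaluated independently, that evaluation is the natural place for the parallel calculation the abstract mentions.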

Fusing Multi-sensor Input with State Information on TinyML Brains for Autonomous Nano-drones

Authors: Luca Crupi, Elia Cereda, Daniele Palossi

Link: http://arxiv.org/abs/2404.02567v1

Abstract: Autonomous nano-drones (~10 cm in diameter), thanks to their ultra-low power TinyML-based brains, are capable of coping with real-world environments. However, due to their simplified sensors and compute units, they are still far from the sense-and-act capabilities shown in their bigger counterparts. This system paper presents a novel deep learning-based pipeline that fuses multi-sensorial input (i.e., low-resolution images and an 8x8 depth map) with the robot's state information to tackle a human pose estimation task. Thanks to our design, the proposed system -- trained in simulation and tested on a real-world dataset -- improves a state-unaware State-of-the-Art baseline by increasing the R^2 regression metric by up to 0.10 on the distance's prediction.

Cultural influence on autonomous vehicles acceptance

Authors: Chowdhury Shahriar Muzammel, Maria Spichkova, James Harland

Link: http://arxiv.org/abs/2404.03694v1

Abstract: Autonomous vehicles and other intelligent transport systems have been evolving rapidly and are being increasingly deployed worldwide. Previous work has shown that perceptions of autonomous vehicles and attitudes towards them depend on various attributes, including the respondent's age, education level and background. These findings with respect to age and educational level are generally uniform, such as showing that younger respondents are typically more accepting of autonomous vehicles, as are those with higher education levels. However, the influence of factors such as culture is much less clear-cut. In this paper we analyse the relationship between acceptance of autonomous vehicles and national culture by means of the well-known Hofstede cultural model.

TCLC-GS: Tightly Coupled LiDAR-Camera Gaussian Splatting for Surrounding Autonomous Driving Scenes

Authors: Cheng Zhao, Su Sun, Ruoyu Wang, Yuliang Guo, Jun-Jun Wan, Zhou Huang, Xinyu Huang, Yingjie Victor Chen, Liu Ren

Link: http://arxiv.org/abs/2404.02410v1

Abstract: Most 3D Gaussian Splatting (3D-GS) based methods for urban scenes initialize 3D Gaussians directly with 3D LiDAR points, which not only underutilizes LiDAR data capabilities but also overlooks the potential advantages of fusing LiDAR with camera data. In this paper, we design a novel tightly coupled LiDAR-Camera Gaussian Splatting (TCLC-GS) to fully leverage the combined strengths of both LiDAR and camera sensors, enabling rapid, high-quality 3D reconstruction and novel view RGB/depth synthesis. TCLC-GS designs a hybrid explicit (colorized 3D mesh) and implicit (hierarchical octree feature) 3D representation derived from LiDAR-camera data, to enrich the properties of 3D Gaussians for splatting. The 3D Gaussians' properties are not only initialized in alignment with the 3D mesh, which provides more complete 3D shape and color information, but are also endowed with broader contextual information through retrieved octree implicit features. During the Gaussian Splatting optimization process, the 3D mesh offers dense depth information as supervision, which enhances the training process by learning a robust geometry. Comprehensive evaluations conducted on the Waymo Open Dataset and nuScenes Dataset validate our method's state-of-the-art (SOTA) performance. Utilizing a single NVIDIA RTX 3090 Ti, our method demonstrates fast training and achieves real-time RGB and depth rendering at 90 FPS at a resolution of 1920x1280 (Waymo), and 120 FPS at a resolution of 1600x900 (nuScenes) in urban scenarios.

2024-04-02

Towards Scalable & Efficient Interaction-Aware Planning in Autonomous Vehicles using Knowledge Distillation

Authors: Piyush Gupta, David Isele, Sangjae Bae

Link: http://arxiv.org/abs/2404.01746v1

Abstract: Real-world driving involves intricate interactions among vehicles navigating through dense traffic scenarios. Recent research focuses on enhancing the interaction awareness of autonomous vehicles to leverage these interactions in decision-making. These interaction-aware planners rely on neural-network-based prediction models to capture inter-vehicle interactions, aiming to integrate these predictions with traditional control techniques such as Model Predictive Control. However, this integration of deep learning-based models with traditional control paradigms often results in computationally demanding optimization problems, relying on heuristic methods. This study introduces a principled and efficient method for combining deep learning with constrained optimization, employing knowledge distillation to train smaller and more efficient networks, thereby mitigating complexity. We demonstrate that these refined networks maintain the problem-solving efficacy of larger models while significantly accelerating optimization. Specifically, in the domain of interaction-aware trajectory planning for autonomous vehicles, we illustrate that training a smaller prediction network using knowledge distillation speeds up optimization without sacrificing accuracy.
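
As a rough illustration of knowledge distillation for a prediction network, the sketch below blends two objectives: matching a larger teacher's output and matching the ground truth. It is a generic regression-style formulation assumed here for illustration, not the loss used in the paper. The names `mse` and `distillation_loss`, the weighting `alpha`, and the example numbers are hypothetical.

```python
def mse(a, b):
    """Mean squared error between two equal-length sequences."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def distillation_loss(student_out, teacher_out, target, alpha=0.5):
    """Blend of imitation and supervised losses (generic sketch).

    alpha = 1.0 -> the student only imitates the teacher
    alpha = 0.0 -> the student only fits the ground-truth target
    """
    imitation = mse(student_out, teacher_out)
    supervised = mse(student_out, target)
    return alpha * imitation + (1.0 - alpha) * supervised


if __name__ == "__main__":
    student = [0.0, 0.0]   # small network's predicted positions (made up)
    teacher = [1.0, 1.0]   # large network's predicted positions (made up)
    truth = [2.0, 2.0]     # recorded future positions (made up)
    print(distillation_loss(student, teacher, truth, alpha=0.5))
```

Minimizing this quantity over the student's parameters trains the smaller network. The appeal in a planning context is that the smaller network is cheaper to evaluate inside the optimization loop.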

Boosting Visual Recognition for Autonomous Driving in Real-world Degradations with Deep Channel Prior

Authors: Zhanwen Liu, Yuhang Li, Yang Wang, Bolin Gao, Yisheng An, Xiangmo Zhao

Link: http://arxiv.org/abs/2404.01703v1

Abstract: The environmental perception of autonomous vehicles in normal conditions has achieved considerable success in the past decade. However, various unfavourable conditions such as fog, low-light, and motion blur will degrade image quality and pose tremendous threats to the safety of autonomous driving. That is, when applied to degraded images, state-of-the-art visual models often suffer performance decline due to the feature content loss and artifact interference caused by the disruption of statistical and structural properties of captured images. To address this problem, this work proposes a novel Deep Channel Prior (DCP) for degraded visual recognition. Specifically, we observe that, in the deep representation space of pre-trained models, the channel correlations of degraded features with the same degradation type have uniform distribution even if they have different content and semantics, which can facilitate the mapping relationship learning between degraded and clear representations in high-sparsity feature space. Based on this, a novel plug-and-play Unsupervised Feature Enhancement Module (UFEM) is proposed to achieve unsupervised feature correction, where the multi-adversarial mechanism is introduced in the first stage of UFEM to achieve the latent content restoration and artifact removal in high-sparsity feature space. Then, the generated features are transferred to the second stage for global correlation modulation under the guidance of DCP to obtain high-quality and recognition-friendly features. Evaluations on three tasks and eight benchmark datasets demonstrate that our proposed method can comprehensively improve the performance of pre-trained models in real degradation conditions. The source code is available at https://github.com/liyuhang166/Deep_Channel_Prior

2024-04-01

QuAD: Query-based Interpretable Neural Motion Planning for Autonomous Driving

Authors: Sourav Biswas, Sergio Casas, Quinlan Sykora, Ben Agro, Abbas Sadat, Raquel Urtasun

Link: http://arxiv.org/abs/2404.01486v1

Abstract: A self-driving vehicle must understand its environment to determine the appropriate action. Traditional autonomy systems rely on object detection to find the agents in the scene. However, object detection assumes a discrete set of objects and loses information about uncertainty, so any errors compound when predicting the future behavior of those agents. Alternatively, dense occupancy grid maps have been utilized to understand free-space. However, predicting a grid for the entire scene is wasteful since only certain spatio-temporal regions are reachable and relevant to the self-driving vehicle. We present a unified, interpretable, and efficient autonomy framework that moves away from cascading modules that first perceive, then predict, and finally plan. Instead, we shift the paradigm to have the planner query occupancy at relevant spatio-temporal points, restricting the computation to those regions of interest. Exploiting this representation, we evaluate candidate trajectories around key factors such as collision avoidance, comfort, and progress for safety and interpretability. Our approach achieves better highway driving quality than the state-of-the-art in high-fidelity closed-loop simulations.

Digital Twins for Supporting AI Research with Autonomous Vehicle Networks

Authors: Anıl Gürses, Gautham Reddy, Saad Masrur, Özgür Özdemir, İsmail Güvenç, Mihail L. Sichitiu, Alphan Şahin, Ahmed Alkhateeb, Rudra Dutta

Link: http://arxiv.org/abs/2404.00954v1

Abstract: Digital twins (DTs), which are virtual environments that simulate, predict, and optimize the performance of their physical counterparts, are envisioned to be essential technologies for advancing next-generation wireless networks. While DTs have been studied extensively for wireless networks, their use in conjunction with autonomous vehicles with programmable mobility remains relatively under-explored. In this paper, we study DTs used as a development environment to design, deploy, and test artificial intelligence (AI) techniques that use real-time observations, e.g. radio key performance indicators, for vehicle trajectory and network optimization decisions in autonomous vehicle networks (AVNs). We first compare and contrast the use of simulation, digital twin (software-in-the-loop (SITL)), sandbox (hardware-in-the-loop (HITL)), and physical testbed environments for their suitability in developing and testing AI algorithms for AVNs. We then review various representative use cases of DTs for AVN scenarios. Finally, we provide an example from the NSF AERPAW platform where a DT is used to develop and test AI-aided solutions for autonomous unmanned aerial vehicles for localizing a signal source based solely on link quality measurements. Our results in the physical testbed show that SITL DTs, when supplemented with data from real-world (RW) measurements and simulations, can serve as an ideal environment for developing and testing innovative AI solutions for AVNs.

2024-03-31

An Active Perception Game for Robust Autonomous Exploration

Authors: Siming He, Yuezhan Tao, Igor Spasojevic, Vijay Kumar, Pratik Chaudhari

Link: http://arxiv.org/abs/2404.00769v1

Abstract: We formulate active perception for an autonomous agent that explores an unknown environment as a two-player zero-sum game: the agent aims to maximize information gained from the environment while the environment aims to minimize the information gained by the agent. In each episode, the environment reveals a set of actions with their potentially erroneous information gain. In order to select the best action, the robot needs to recover the true information gain from the erroneous one. The robot does so by minimizing the discrepancy between its estimate of information gain and the true information gain it observes after taking the action. We propose an online convex optimization algorithm that achieves sub-linear expected regret $O(T^{3/4})$ for estimating the information gain. We also provide a bound on the regret of active perception performed by any (near-)optimal prediction and trajectory selection algorithms. We evaluate this approach using semantic neural radiance fields (NeRFs) in simulated realistic 3D environments to show that the robot can discover up to 12% more objects using the improved estimate of the information gain. On the M3ED dataset, the proposed algorithm reduced the error of information gain prediction in occupancy map by over 67%. In real-world experiments using occupancy maps on a Jackal ground robot, we show that this approach can calculate complicated trajectories that efficiently explore all occluded regions.
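
A short note on why a bound of $O(T^{3/4})$ counts as sub-linear. Writing $R_T$ for the cumulative regret after $T$ episodes and $C$ for the constant hidden by the big-O (both symbols are notation introduced here, not taken from the abstract), the bound implies that the average regret per episode vanishes as $T$ grows:

```latex
\[
  \mathbb{E}[R_T] \le C\,T^{3/4}
  \quad\Longrightarrow\quad
  \frac{\mathbb{E}[R_T]}{T} \le C\,T^{-1/4}
  \xrightarrow[\;T \to \infty\;]{} 0 .
\]
```

For example, going from $T = 10^4$ to $T = 10^8$ episodes reduces the bound on the per-episode regret by a factor of $10$, since $(10^4)^{-1/4} = 10^{-1}$ and $(10^8)^{-1/4} = 10^{-2}$.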

End-to-End Autonomous Driving through V2X Cooperation

Authors: Haibao Yu, Wenxian Yang, Jiaru Zhong, Zhenwei Yang, Siqi Fan, Ping Luo, Zaiqing Nie

Link: http://arxiv.org/abs/2404.00717v1

Abstract: Cooperatively utilizing both ego-vehicle and infrastructure sensor data via V2X communication has emerged as a promising approach for advanced autonomous driving. However, current research mainly focuses on improving individual modules, rather than taking end-to-end learning to optimize final planning performance, resulting in underutilized data potential. In this paper, we introduce UniV2X, a pioneering cooperative autonomous driving framework that seamlessly integrates all key driving modules across diverse views into a unified network. We propose a sparse-dense hybrid data transmission and fusion mechanism for effective vehicle-infrastructure cooperation, offering three advantages: 1) Effective for simultaneously enhancing agent perception, online mapping, and occupancy prediction, ultimately improving planning performance. 2) Transmission-friendly for practical and limited communication conditions. 3) Reliable data fusion with interpretability of this hybrid data. We implement UniV2X, as well as reproducing several benchmark methods, on the challenging DAIR-V2X, the real-world cooperative driving dataset. Experimental results demonstrate the effectiveness of UniV2X in significantly enhancing planning performance, as well as all intermediate output performance. Code is at https://github.com/AIR-THU/UniV2X.

2024-03-30

Zero-shot Safety Prediction for Autonomous Robots with Foundation World Models

Authors: Zhenjiang Mao, Siqi Dai, Yuang Geng, Ivan Ruchkin

Link: http://arxiv.org/abs/2404.00462v2

Abstract: A world model creates a surrogate world to train a controller and predict safety violations by learning the internal dynamic model of systems. However, the existing world models rely solely on statistical learning of how observations change in response to actions, lacking precise quantification of how accurate the surrogate dynamics are, which poses a significant challenge in safety-critical systems. To address this challenge, we propose foundation world models that embed observations into meaningful and causally latent representations. This enables the surrogate dynamics to directly predict causal future states by leveraging a training-free large language model. In two common benchmarks, this novel model outperforms standard world models in the safety prediction task and has a performance comparable to supervised learning despite not using any data. We evaluate its performance with a more specialized and system-relevant metric by comparing estimated states instead of aggregating observation-wide error.

On Accessibility Fairness in Intermodal Autonomous Mobility-on-Demand Systems

Authors: Mauro Salazar, Sara Betancur Giraldo, Fabio Paparella, Leonardo Pedroso

Link: http://arxiv.org/abs/2404.00434v2

Abstract: Research on the operation of mobility systems so far has mostly focused on minimizing cost-centered metrics such as average travel time, distance driven, and operational costs. Whilst capturing economic indicators, such metrics do not account for transportation justice aspects. In this paper, we present an optimization model to plan the operation of Intermodal Autonomous Mobility-on-Demand (I-AMoD) systems, where self-driving vehicles provide on-demand mobility jointly with public transit and active modes, with the goal to minimize the accessibility unfairness experienced by the population. Specifically, we first leverage a previously developed network flow model to compute the I-AMoD system operation in a minimum-time manner. Second, we formally define accessibility unfairness, and use it to frame the minimum-accessibility-unfairness problem and cast it as a linear program. We showcase our framework for a real-world case-study in the city of Eindhoven, NL. Our results show that it is possible to reach an operation that is on average fully fair at the cost of a slight travel time increase compared to a minimum-travel-time solution. Thereby we observe that the accessibility fairness of individual paths is, on average, worse than the average values obtained from flows, setting the stage for a discussion on the definition of accessibility fairness itself.

Continual Learning for Autonomous Robots: A Prototype-based Approach

Authors: Elvin Hajizada, Balachandran Swaminathan, Yulia Sandamirskaya

Link: http://arxiv.org/abs/2404.00418v1

Abstract: Humans and animals learn throughout their lives from limited amounts of sensed data, both with and without supervision. Autonomous, intelligent robots of the future are often expected to do the same. The existing continual learning (CL) methods are usually not directly applicable to robotic settings: they typically require buffering and a balanced replay of training data. A few-shot online continual learning (FS-OCL) setting has been proposed to address more realistic scenarios where robots must learn from a non-repeated sparse data stream. To enable truly autonomous life-long learning, an additional challenge of detecting novelties and learning new items without supervision needs to be addressed. We address this challenge with our new prototype-based approach called Continually Learning Prototypes (CLP). In addition to being capable of FS-OCL learning, CLP also detects novel objects and learns them without supervision. To mitigate forgetting, CLP utilizes a novel metaplasticity mechanism that adapts the learning rate individually per prototype. CLP is rehearsal-free, hence does not require a memory buffer, and is compatible with neuromorphic hardware, characterized by ultra-low power consumption, real-time processing abilities, and on-chip learning. Indeed, we have open-sourced a simple version of CLP in the neuromorphic software framework Lava, targeting Intel's neuromorphic chip Loihi 2. We evaluate CLP on a robotic vision dataset, OpenLORIS. In a low-instance FS-OCL scenario, CLP shows state-of-the-art results. In the open world, CLP detects novelties with superior precision and recall and learns features of the detected novel classes without supervision, achieving a strong baseline of 99% base class and 65%/76% (5-shot/10-shot) novel class accuracy.
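
To make the idea of prototype-based learning with novelty detection and a per-prototype learning rate more concrete, here is a minimal sketch. It is not the CLP algorithm from the paper: the distance threshold used for novelty, the `1 / count` learning-rate schedule, and every name (`PrototypeLearner`, `observe`, `novel_<n>` labels) are assumptions chosen purely for illustration.

```python
import math


class PrototypeLearner:
    """Toy nearest-prototype learner (hypothetical, not the paper's CLP)."""

    def __init__(self, novelty_threshold=1.0):
        # Each prototype is stored as [vector, label, update_count].
        self.prototypes = []
        self.threshold = novelty_threshold

    def _nearest(self, x):
        """Return (index, distance) of the closest prototype, or (None, inf)."""
        best_index, best_dist = None, math.inf
        for i, (vec, _, _) in enumerate(self.prototypes):
            d = math.dist(x, vec)
            if d < best_dist:
                best_index, best_dist = i, d
        return best_index, best_dist

    def observe(self, x, label=None):
        """Process one sample; return (predicted_label, was_novel)."""
        index, dist = self._nearest(x)
        if index is None or dist > self.threshold:
            # Too far from every known prototype: treat the sample as novel
            # and allocate a new prototype, with or without a given label.
            new_label = label if label is not None else f"novel_{len(self.prototypes)}"
            self.prototypes.append([list(x), new_label, 1])
            return new_label, True
        proto = self.prototypes[index]
        proto[2] += 1
        # Assumed schedule: the more often a prototype has been updated,
        # the smaller its individual learning rate (a running mean).
        lr = 1.0 / proto[2]
        proto[0] = [p + lr * (xi - p) for p, xi in zip(proto[0], x)]
        return proto[1], False
```

With a threshold of 1.0, a first labeled sample creates a prototype, a nearby unlabeled sample refines it, and a distant sample is flagged as novel and stored without supervision. No replay buffer is kept, which mirrors the rehearsal-free property described in the abstract, although only in this simplified form.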

Efficient Multi-branch Segmentation Network for Situation Awareness in Autonomous Navigation

Authors: Guan-Cheng Zhou, Chen Cheng, Yan-zhou Chen

Link: http://arxiv.org/abs/2404.00366v1

Abstract: Real-time and high-precision situational awareness technology is critical for autonomous navigation of unmanned surface vehicles (USVs). In particular, robust and fast obstacle semantic segmentation methods are essential. However, distinguishing between the sea and the sky is challenging due to the differences between port and maritime environments. In this study, we built a dataset that captured perspectives from USVs and unmanned aerial vehicles in a maritime port environment and analysed the data features. Statistical analysis revealed a high correlation between the distribution of the sea and sky and row positional information. Based on this finding, a three-branch semantic segmentation network with a row position encoding module (RPEM) was proposed to improve the prediction accuracy between the sea and the sky. The proposed RPEM highlights the effect of row coordinates on feature extraction. Compared to the baseline, the three-branch network with RPEM significantly improved the ability to distinguish between the sea and the sky without significantly reducing the computational speed.

Deep Reinforcement Learning in Autonomous Car Path Planning and Control: A Survey

Authors: Yiyang Chen, Chao Ji, Yunrui Cai, Tong Yan, Bo Su

Link: http://arxiv.org/abs/2404.00340v1

Abstract: Combining data-driven applications with control systems plays a key role in recent Autonomous Car research. This thesis offers a structured review of the latest literature on Deep Reinforcement Learning (DRL) within the realm of autonomous vehicle Path Planning and Control. It collects a series of DRL methodologies and algorithms and their applications in the field, focusing notably on their roles in trajectory planning and dynamic control. In this review, we delve into the application outcomes of DRL technologies in this domain. By summarizing this literature, we highlight potential challenges, aiming to offer insights that might aid researchers engaged in related fields.

2024-03-29

LeGo-Drive: Language-enhanced Goal-oriented Closed-Loop End-to-End Autonomous Driving

Authors: Pranjal Paul, Anant Garg, Tushar Choudhary, Arun Kumar Singh, K. Madhava Krishna

Link: http://arxiv.org/abs/2403.20116v1

Abstract: Existing Vision-Language models (VLMs) estimate either long-term trajectory waypoints or a set of control actions as a reactive solution for closed-loop planning based on their rich scene comprehension. However, these estimations are coarse and are subject to their "world understanding", which may generate sub-optimal decisions due to perception errors. In this paper, we introduce LeGo-Drive, which aims to address this issue by estimating a goal location based on the given language command as an intermediate representation in an end-to-end setting. The estimated goal might fall in a non-desirable region, like on top of a car for a parking-like command, leading to inadequate planning. Hence, we propose to train the architecture in an end-to-end manner, resulting in iterative refinement of both the goal and the trajectory collectively. We validate the effectiveness of our method through comprehensive experiments conducted in diverse simulated environments. We report significant improvements in standard autonomous driving metrics, with a goal reaching Success Rate of 81%. We further showcase the versatility of LeGo-Drive across different driving scenarios and linguistic inputs, underscoring its potential for practical deployment in autonomous vehicles and intelligent transportation systems.

PLoc: A New Evaluation Criterion Based on Physical Location for Autonomous Driving Datasets

Authors: Ruining Yang, Yuqi Peng

Link: http://arxiv.org/abs/2403.19893v1

Abstract: Autonomous driving has garnered significant attention as a key research area within artificial intelligence. In the context of autonomous driving scenarios, the varying physical locations of objects correspond to different levels of danger. However, conventional evaluation criteria for automatic driving object detection often overlook the crucial aspect of an object's physical location, leading to evaluation results that may not accurately reflect the genuine threat posed by the object to the autonomous driving vehicle. To enhance the safety of autonomous driving, this paper introduces a novel evaluation criterion based on physical location information, termed PLoc. This criterion transcends the limitations of traditional criteria by acknowledging that the physical location of pedestrians in autonomous driving scenarios can provide valuable safety-related information. Furthermore, this paper presents a newly re-annotated dataset (ApolloScape-R) derived from ApolloScape. ApolloScape-R involves the relabeling of pedestrians based on the significance of their physical location. The dataset is utilized to assess the performance of various object detection models under the proposed PLoc criterion. Experimental results demonstrate that the average accuracy of all object detection models in identifying a person situated in the travel lane of an autonomous vehicle is lower than that for a person on a sidewalk. The dataset is publicly available at https://github.com/lnyrlyed/ApolloScape-R.git

2024-03-28

Multi-Frame, Lightweight & Efficient Vision-Language Models for Question Answering in Autonomous Driving

Authors: Akshay Gopalkrishnan, Ross Greer, Mohan Trivedi

Link: http://arxiv.org/abs/2403.19838v1

Abstract: Vision-Language Models (VLMs) and Multi-Modal Language models (MMLMs) have become prominent in autonomous driving research, as these models can provide interpretable textual reasoning and responses for end-to-end autonomous driving safety tasks using traffic scene images and other data modalities. However, current approaches to these systems use expensive large language model (LLM) backbones and image encoders, making such systems unsuitable for real-time autonomous driving systems where tight memory constraints exist and fast inference time is necessary. To address these previous issues, we develop EM-VLM4AD, an efficient, lightweight, multi-frame vision language model which performs Visual Question Answering for autonomous driving. In comparison to previous approaches, EM-VLM4AD requires at least 10 times less memory and floating point operations, while also achieving higher BLEU-4, METEOR, CIDEr, and ROUGE scores than the existing baseline on the DriveLM dataset. EM-VLM4AD also exhibits the ability to extract relevant information from traffic views related to prompts and can answer questions for various autonomous driving subtasks. We release our code to train and evaluate our model at https://github.com/akshaygopalkr/EM-VLM4AD.

Learning Sampling Distribution and Safety Filter for Autonomous Driving with VQ-VAE and Differentiable Optimization

Authors: Simon Idoko, Basant Sharma, Arun Kumar Singh

Link: http://arxiv.org/abs/2403.19461v1

Abstract: Sampling trajectories from a distribution followed by ranking them based on a specified cost function is a common approach in autonomous driving. Typically, the sampling distribution is hand-crafted (e.g., a Gaussian or a grid). Recently, there have been efforts towards learning the sampling distribution through generative models such as Conditional Variational Autoencoder (CVAE). However, these approaches fail to capture the multi-modality of the driving behaviour due to the Gaussian latent prior of the CVAE. Thus, in this paper, we re-imagine the distribution learning through vector quantized variational autoencoder (VQ-VAE), whose discrete latent-space is well equipped to capture multi-modal sampling distribution. The VQ-VAE is trained with demonstration data of optimal trajectories. We further propose a differentiable optimization based safety filter to minimally correct the VQ-VAE sampled trajectories to ensure collision avoidance. We use backpropagation through the optimization layers in a self-supervised learning set-up to learn good initialization and optimal parameters of the safety filter. We perform extensive comparisons with state-of-the-art CVAE-based baseline in dense and aggressive traffic scenarios and show a reduction of up to 12 times in collision-rate while being competitive in driving speeds.
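
The discrete latent space mentioned above comes from the vector-quantization step of a VQ-VAE, in which a continuous encoder output is snapped to its nearest codebook entry. The sketch below shows only that lookup, using plain Python lists. The codebook values and the function name `quantize` are hypothetical, and nothing here reflects the paper's actual network, training procedure, or safety filter.

```python
def quantize(latent, codebook):
    """Return (index, code) of the codebook vector closest to `latent`.

    Uses squared Euclidean distance; ties go to the lowest index.
    """
    def squared_distance(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    index = min(range(len(codebook)),
                key=lambda i: squared_distance(latent, codebook[i]))
    return index, codebook[index]


# Hypothetical 2-D codebook with three entries. In a trained model each
# entry could correspond to a distinct mode of driving behaviour.
codebook = [[0.0, 0.0], [1.0, 1.0], [-1.0, 2.0]]
index, code = quantize([0.9, 1.2], codebook)
```

Because every latent maps to one of a finite set of codes, sampling over code indices can represent several separated modes, which a single Gaussian prior cannot do as easily.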

SubjectDrive: Scaling Generative Data in Autonomous Driving via Subject Control

Authors: Binyuan Huang, Yuqing Wen, Yucheng Zhao, Yaosi Hu, Yingfei Liu, Fan Jia, Weixin Mao, Tiancai Wang, Chi Zhang, Chang Wen Chen, Zhenzhong Chen, Xiangyu Zhang

Link: http://arxiv.org/abs/2403.19438v1

Abstract: Autonomous driving progress relies on large-scale annotated datasets. In this work, we explore the potential of generative models to produce vast quantities of freely-labeled data for autonomous driving applications and present SubjectDrive, the first model proven to scale generative data production in a way that could continuously improve autonomous driving applications. We investigate the impact of scaling up the quantity of generative data on the performance of downstream perception models and find that enhancing data diversity plays a crucial role in effectively scaling generative data production. Therefore, we have developed a novel model equipped with a subject control mechanism, which allows the generative model to leverage diverse external data sources for producing varied and useful data. Extensive evaluations confirm SubjectDrive's efficacy in generating scalable autonomous driving training data, marking a significant step toward revolutionizing data production methods in this field.

Exploring Holistic HMI Design for Automated Vehicles: Insights from a Participatory Workshop to Bridge In-Vehicle and External Communication

Authors: Haoyu Dong, Tram Thi Minh Tran, Rutger Verstegen, Silvia Cazacu, Ruolin Gao, Marius Hoggenmüller, Debargha Dey, Mervyn Franssen, Markus Sasalovici, Pavlo Bazilinskyy, Marieke Martens

Link: http://arxiv.org/abs/2403.19153v1

Abstract: Human-Machine Interfaces (HMIs) for automated vehicles (AVs) are typically divided into two categories: internal HMIs for interactions within the vehicle, and external HMIs for communication with other road users. In this work, we examine the prospects of bridging these two seemingly distinct domains. Through a participatory workshop with automotive user interface researchers and practitioners, we facilitated a critical exploration of holistic HMI design by having workshop participants collaboratively develop interaction scenarios involving AVs, in-vehicle users, and external road users. The discussion offers insights into the escalation of interface elements as an HMI design strategy, the direct interactions between different users, and an expanded understanding of holistic HMI design. This work reflects a collaborative effort to understand the practical aspects of this holistic design approach, offering new perspectives and encouraging further investigation into this underexplored aspect of automotive user interfaces.

GraphAD: Interaction Scene Graph for End-to-end Autonomous Driving

Authors: Yunpeng Zhang, Deheng Qian, Ding Li, Yifeng Pan, Yong Chen, Zhenbao Liang, Zhiyao Zhang, Shurui Zhang, Hongxu Li, Maolei Fu, Yun Ye, Zhujin Liang, Yi Shan, Dalong Du

Link: http://arxiv.org/abs/2403.19098v2

Abstract: Modeling complicated interactions among the ego-vehicle, road agents, and map elements has been a crucial part for safety-critical autonomous driving. Previous works on end-to-end autonomous driving rely on the attention mechanism for handling heterogeneous interactions, which fails to capture the geometric priors and is also computationally intensive. In this paper, we propose the Interaction Scene Graph (ISG) as a unified method to model the interactions among the ego-vehicle, road agents, and map elements. With the representation of the ISG, the driving agents aggregate essential information from the most influential elements, including the road agents with potential collisions and the map elements to follow. Since a mass of unnecessary interactions are omitted, the more efficient scene-graph-based framework is able to focus on indispensable connections and leads to better performance. We evaluate the proposed method for end-to-end autonomous driving on the nuScenes dataset. Compared with strong baselines, our method significantly outperforms in the full-stack driving tasks, including perception, prediction, and planning. Code will be released at https://github.com/zhangyp15/GraphAD.

2024-03-27

Nonlinear Model Predictive Control for Enhanced Navigation of Autonomous Surface Vessels

Authors: Daniel Menges, Trym Tengesdal, Adil Rasheed

Link: http://arxiv.org/abs/2403.19028v1

Abstract: This article proposes an approach for collision avoidance, path following, and anti-grounding of autonomous surface vessels under consideration of environmental forces based on Nonlinear Model Predictive Control (NMPC). Artificial Potential Fields (APFs) set the foundation for the cost function of the optimal control problem in terms of collision avoidance and anti-grounding. Depending on the risk of a collision given by the resulting force of the APFs, the controller optimizes regarding an adapted heading and travel speed by additionally following a desired path. For this purpose, nonlinear vessel dynamics are used for the NMPC. To extend the situational awareness concerning environmental disturbances impacted by wind, waves, and sea currents, a nonlinear disturbance observer is coupled to the entire NMPC scheme, allowing for the correction of an incorrect vessel motion due to external forces. In addition, the most essential rules according to the Convention on the International Regulations for Preventing Collisions at Sea (COLREGs) are considered. The results of the simulations show that the proposed framework can control an autonomous surface vessel under various challenging scenarios, including environmental disturbances, to avoid collisions and follow desired paths.

Ensuring Safe Autonomy: Navigating the Future of Autonomous Vehicles

Authors: Patrick Wolf

Link: http://arxiv.org/abs/2403.19006v1

Abstract: Autonomous driving vehicles provide a vast potential for realizing use cases in the on-road and off-road domains. Consequently, remarkable solutions exist for autonomous systems' environmental perception and control. Nevertheless, proof of safety remains an open challenge preventing such machinery from being introduced to markets and deployed in the real world. Traditional approaches for safety assurance of autonomously driving vehicles often lead to underperformance due to conservative safety assumptions that cannot handle the overall complexity. Besides, the more sophisticated safety systems rely on the vehicle's perception systems. However, perception is often unreliable due to uncertainties resulting from disturbances or the lack of context incorporation for data interpretation. Accordingly, this paper illustrates the potential of a modular, self-adaptive autonomy framework with integrated dynamic risk management to overcome the abovementioned drawbacks.

LORD: Large Models based Opposite Reward Design for Autonomous Driving

Authors: Xin Ye, Feng Tao, Abhirup Mallik, Burhaneddin Yaman, Liu Ren

Link: http://arxiv.org/abs/2403.18965v1

Abstract: Reinforcement learning (RL) based autonomous driving has emerged as a promising alternative to data-driven imitation learning approaches. However, crafting effective reward functions for RL poses challenges due to the complexity of defining and quantifying good driving behaviors across diverse scenarios. Recently, large pretrained models have gained significant attention as zero-shot reward models for tasks specified with desired linguistic goals. However, the desired linguistic goals for autonomous driving such as "drive safely" are ambiguous and incomprehensible to pretrained models. On the other hand, undesired linguistic goals like "collision" are more concrete and tractable. In this work, we introduce LORD, a novel large models based opposite reward design through undesired linguistic goals to enable the efficient use of large pretrained models as zero-shot reward models. Through extensive experiments, our proposed framework shows its efficiency in leveraging the power of large pretrained models for achieving safe and enhanced autonomous driving. Moreover, the proposed approach shows improved generalization capabilities as it outperforms counterpart methods across diverse and challenging driving scenarios.

3P-LLM: Probabilistic Path Planning using Large Language Model for Autonomous Robot Navigation

Authors: Ehsan Latif

Link: http://arxiv.org/abs/2403.18778v1

Abstract: Much worldly semantic knowledge can be encoded in large language models (LLMs). Such information could be of great use to robots that want to carry out high-level, temporally extended commands stated in natural language. However, the lack of real-world experience that language models have is a key limitation that makes it challenging to use them for decision-making inside a particular embodiment. This research assesses the feasibility of using LLM (GPT-3.5-turbo chatbot by OpenAI) for robotic path planning. The shortcomings of conventional approaches to managing complex environments and developing trustworthy plans for shifting environmental conditions serve as the driving force behind the research. Due to the sophisticated natural language processing abilities of LLM, the capacity to provide effective and adaptive path-planning algorithms in real-time, great accuracy, and few-shot learning capabilities, GPT-3.5-turbo is well suited for path planning in robotics. In numerous simulated scenarios, the research compares the performance of GPT-3.5-turbo with that of state-of-the-art path planners like Rapidly Exploring Random Tree (RRT) and A*. We observed that GPT-3.5-turbo is able to provide real-time path planning feedback to the robot and outperforms its counterparts. This paper establishes the foundation for LLM-powered path planning for robotic systems.

Sampling-Based Motion Planning with Online Racing Line Generation for Autonomous Driving on Three-Dimensional Race Tracks

Authors: Levent Ögretmen, Matthias Rowold, Boris Lohmann

Link: http://arxiv.org/abs/2403.18643v1

Abstract: Existing approaches to trajectory planning for autonomous racing employ sampling-based methods, generating numerous jerk-optimal trajectories and selecting the most favorable feasible trajectory based on a cost function penalizing deviations from an offline-calculated racing line. While successful on oval tracks, these methods face limitations on complex circuits due to the simplistic geometry of jerk-optimal edges failing to capture the complexity of the racing line. Additionally, they only consider two-dimensional tracks, potentially neglecting or surpassing the actual dynamic potential. In this paper, we present a sampling-based local trajectory planning approach for autonomous racing that can maintain the lap time of the racing line even on complex race tracks and consider the race track's three-dimensional effects. In simulative experiments, we demonstrate that our approach achieves lower lap times and improved utilization of dynamic limits compared to existing approaches. We also investigate the impact of online racing line generation, in which the time-optimal solution is planned from the current vehicle state for a limited spatial horizon, in contrast to a closed racing line calculated offline. We show that combining the sampling-based planner with the online racing line generation can significantly reduce lap times in multi-vehicle scenarios.

From Two-Dimensional to Three-Dimensional Environment with Q-Learning: Modeling Autonomous Navigation with Reinforcement Learning and no Libraries

Authors: Ergon Cugler de Moraes Silva

Link: http://arxiv.org/abs/2403.18219v1

Abstract: Reinforcement learning (RL) algorithms have become indispensable tools in artificial intelligence, empowering agents to acquire optimal decision-making policies through interactions with their environment and feedback mechanisms. This study explores the performance of RL agents in both two-dimensional (2D) and three-dimensional (3D) environments, aiming to research the dynamics of learning across different spatial dimensions. A key aspect of this investigation is the absence of pre-made libraries for learning, with the algorithm developed exclusively through computational mathematics. The methodological framework centers on RL principles, employing a Q-learning agent class and distinct environment classes tailored to each spatial dimension. The research aims to address the question: How do reinforcement learning agents adapt and perform in environments of varying spatial dimensions, particularly in 2D and 3D settings? Through empirical analysis, the study evaluates agents' learning trajectories and adaptation processes, revealing insights into the efficacy of RL algorithms in navigating complex, multi-dimensional spaces. Reflections on the findings prompt considerations for future research, particularly in understanding the dynamics of learning in higher-dimensional environments.
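
As a rough illustration of what a library-free Q-learning setup with an agent class and a separate environment class can look like, the sketch below trains a tabular agent on a small 2D grid using only the Python standard library. The grid size, rewards, hyperparameters, and all names (`GridWorld2D`, `QLearningAgent`, `train`, `greedy_path_length`) are assumptions for illustration and are not taken from the paper.

```python
import random


class GridWorld2D:
    """Hypothetical 2D grid: start at (0, 0), goal at the opposite corner."""

    ACTIONS = [(0, 1), (0, -1), (1, 0), (-1, 0)]

    def __init__(self, size=4):
        self.size = size
        self.goal = (size - 1, size - 1)

    def reset(self):
        return (0, 0)

    def step(self, state, action_index):
        dx, dy = self.ACTIONS[action_index]
        # Clamp coordinates so the agent stays inside the grid.
        x = min(max(state[0] + dx, 0), self.size - 1)
        y = min(max(state[1] + dy, 0), self.size - 1)
        next_state = (x, y)
        done = next_state == self.goal
        reward = 10.0 if done else -1.0  # assumed reward scheme
        return next_state, reward, done


class QLearningAgent:
    """Tabular Q-learning with an epsilon-greedy behaviour policy."""

    def __init__(self, n_actions, alpha=0.5, gamma=0.9, epsilon=0.2, seed=0):
        self.q = {}  # maps state -> list of action values
        self.n_actions = n_actions
        self.alpha, self.gamma, self.epsilon = alpha, gamma, epsilon
        self.rng = random.Random(seed)

    def values(self, state):
        return self.q.setdefault(state, [0.0] * self.n_actions)

    def greedy(self, state):
        v = self.values(state)
        return v.index(max(v))

    def act(self, state):
        if self.rng.random() < self.epsilon:
            return self.rng.randrange(self.n_actions)
        return self.greedy(state)

    def update(self, s, a, r, s_next, done):
        # Q(s,a) <- Q(s,a) + alpha * (r + gamma * max_a' Q(s',a') - Q(s,a))
        target = r if done else r + self.gamma * max(self.values(s_next))
        self.values(s)[a] += self.alpha * (target - self.values(s)[a])


def train(env, agent, episodes=2000, max_steps=100):
    for _ in range(episodes):
        s = env.reset()
        for _ in range(max_steps):
            a = agent.act(s)
            s_next, r, done = env.step(s, a)
            agent.update(s, a, r, s_next, done)
            s = s_next
            if done:
                break


def greedy_path_length(env, agent, max_steps=50):
    """Number of greedy steps from the start until the goal (or max_steps)."""
    s, steps = env.reset(), 0
    while s != env.goal and steps < max_steps:
        s, _, _ = env.step(s, agent.greedy(s))
        steps += 1
    return steps
```

Extending this to three dimensions would mostly mean using three-component states and six actions in a separate environment class, while the agent class stays unchanged. That matches the per-dimension environment classes described in the abstract, although the details here are only a guess.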

Long and Short-Term Constraints Driven Safe Reinforcement Learning for Autonomous Driving

Authors: Xuemin Hu, Pan Chen, Yijun Wen, Bo Tang, Long Chen

Link: http://arxiv.org/abs/2403.18209v1

Abstract: Reinforcement learning (RL) has been widely used in decision-making tasks, but it cannot guarantee the agent's safety in the training process due to the requirements of interaction with the environment, which seriously limits its industrial applications such as autonomous driving. Safe RL methods are developed to handle this issue by constraining the expected safety violation costs as a training objective, but they still permit unsafe state occurrence, which is unacceptable in autonomous driving tasks. Moreover, it is difficult for these methods to achieve a balance between the cost and return expectations, which leads to learning performance degradation for the algorithms. In this paper, we propose a novel algorithm based on the long and short-term constraints (LSTC) for safe RL. The short-term constraint aims to guarantee the short-term state safety that the vehicle explores, while the long-term constraint ensures the overall safety of the vehicle throughout the decision-making process. In addition, we develop a safe RL method with dual-constraint optimization based on the Lagrange multiplier to optimize the training process for end-to-end autonomous driving. Comprehensive experiments were conducted on the MetaDrive simulator. Experimental results demonstrate that the proposed method achieves higher safety in continuous state and action tasks, and exhibits higher exploration performance in long-distance decision-making tasks compared with state-of-the-art methods.
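
Lagrange-multiplier methods for constrained RL generally combine a penalized objective with a projected gradient-ascent update on each multiplier. The sketch below shows that generic pattern for two constraints, standing in for a long-term and a short-term cost. It is not the paper's LSTC algorithm: the function names, the step size, and the numbers are hypothetical.

```python
def update_multiplier(lam, cost, limit, lr=0.1):
    """Projected dual ascent: raise lam when cost exceeds limit, keep lam >= 0."""
    return max(0.0, lam + lr * (cost - limit))


def lagrangian(expected_return, costs, limits, lams):
    """Penalized objective: return minus the weighted constraint violations."""
    penalty = sum(lam * (c - d) for lam, c, d in zip(lams, costs, limits))
    return expected_return - penalty


# Hypothetical estimates for one training iteration:
# index 0 is the long-term constraint, index 1 the short-term one.
costs, limits, lams = [2.0, 0.5], [1.0, 1.0], [1.0, 2.0]
objective = lagrangian(10.0, costs, limits, lams)
lams = [update_multiplier(l, c, d) for l, c, d in zip(lams, costs, limits)]
```

In this generic scheme the policy is updated to increase the penalized objective while the multipliers are updated in the opposite direction, so a constraint that keeps being violated receives a steadily larger weight.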

2024-03-26

A Study on the Use of Simulation in Synthesizing Path-Following Control Policies for Autonomous Ground Robots

Authors: Harry Zhang, Stefan Caldararu, Aaron Young, Alexis Ruiz, Huzaifa Unjhawala, Ishaan Mahajan, Sriram Ashokkumar, Nevindu Batagoda, Zhenhao Zhou, Luning Bakke, Dan Negrut

Link: http://arxiv.org/abs/2403.18021v1

Abstract: We report results obtained and insights gained while answering the following question: how effective is it to use a simulator to establish path following control policies for an autonomous ground robot? While the quality of the simulator conditions the answer to this question, we found that for the simulation platform used herein, producing four control policies for path planning was straightforward once a digital twin of the controlled robot was available. The control policies established in simulation and subsequently demonstrated in the real world are PID control, MPC, and two neural network (NN) based controllers. Training the two NN controllers via imitation learning was accomplished expeditiously using seven simple maneuvers: follow three circles clockwise, follow the same circles counter-clockwise, and drive straight. A test randomization process that employs random micro-simulations is used to rank the "goodness" of the four control policies. The policy ranking noted in simulation correlates well with the ranking observed when the control policies were tested in the real world. The simulation platform used is publicly available and BSD3-released as open source; a public Docker image is available for reproducibility studies. It contains a dynamics engine, a sensor simulator, a ROS2 bridge, and a ROS2 autonomy stack, the latter employed in both the simulator and the real-world experiments.

Scenario-Based Curriculum Generation for Multi-Agent Autonomous Driving

Authors: Axel Brunnbauer, Luigi Berducci, Peter Priller, Dejan Nickovic, Radu Grosu

Link: http://arxiv.org/abs/2403.17805v1

Abstract: The automated generation of diverse and complex training scenarios has been an important ingredient in many complex learning tasks. Especially in real-world application domains, such as autonomous driving, auto-curriculum generation is considered vital for obtaining robust and general policies. However, crafting traffic scenarios with multiple, heterogeneous agents is typically considered as a tedious and time-consuming task, especially in more complex simulation environments. In our work, we introduce MATS-Gym, a Multi-Agent Traffic Scenario framework to train agents in CARLA, a high-fidelity driving simulator. MATS-Gym is a multi-agent training framework for autonomous driving that uses partial scenario specifications to generate traffic scenarios with variable numbers of agents. This paper unifies various existing approaches to traffic scenario description into a single training framework and demonstrates how it can be integrated with techniques from unsupervised environment design to automate the generation of adaptive auto-curricula. The code is available at https://github.com/AutonomousDrivingExaminer/mats-gym.

Optical Flow Based Detection and Tracking of Moving Objects for Autonomous Vehicles

Authors: MReza Alipour Sormoli, Mehrdad Dianati, Sajjad Mozaffari, Roger Woodman

Link: http://arxiv.org/abs/2403.17779v1

Abstract: Accurate velocity estimation of surrounding moving objects and their trajectories are critical elements of perception systems in Automated/Autonomous Vehicles (AVs) with a direct impact on their safety. These are non-trivial problems due to the diverse types and sizes of such objects and their dynamic and random behaviour. Recent point cloud based solutions often use Iterative Closest Point (ICP) techniques, which are known to have certain limitations. For example, their computational costs are high due to their iterative nature, and their estimation error often deteriorates as the relative velocities of the target objects increase (>2 m/sec). Motivated by such shortcomings, this paper first proposes a novel Detection and Tracking of Moving Objects (DATMO) for AVs based on an optical flow technique, which is proven to be computationally efficient and highly accurate for such problems. This is achieved by representing the driving scenario as a vector field and applying vector calculus theories to ensure spatiotemporal continuity. We also report the results of a comprehensive performance evaluation of the proposed DATMO technique, carried out in this study using synthetic and real-world data. The results of this study demonstrate the superiority of the proposed technique, compared to the DATMO techniques in the literature, in terms of estimation accuracy and processing time in a wide range of relative velocities of moving objects. Finally, we evaluate and discuss the sensitivity of the estimation error of the proposed DATMO technique to various system and environmental parameters, as well as the relative velocities of the moving objects.

LiDAR-Based Crop Row Detection Algorithm for Over-Canopy Autonomous Navigation in Agriculture Fields

Authors: Ruiji Liu, Francisco Yandun, George Kantor

Link: http://arxiv.org/abs/2403.17774v1

Abstract: Autonomous navigation is crucial for various robotics applications in agriculture. However, many existing methods depend on RTK-GPS systems, which are expensive and susceptible to poor signal coverage. This paper introduces a state-of-the-art LiDAR-based navigation system that can achieve over-canopy autonomous navigation in row-crop fields, even when the canopy fully blocks the interrow spacing. Our crop row detection algorithm can detect crop rows across diverse scenarios, encompassing various crop types, growth stages, weed presence, and discontinuities within the crop rows. Without utilizing the global localization of the robot, our navigation system can perform autonomous navigation in these challenging scenarios, detect the end of the crop rows, and navigate to the next crop row autonomously, providing a crop-agnostic approach to navigate the whole row-crop field. This navigation system has undergone tests in various simulated agricultural fields, achieving an average of 2.98 cm autonomous driving accuracy without human intervention on the custom Amiga robot. In addition, the qualitative results of our crop row detection algorithm from the actual soybean fields validate our LiDAR-based crop row detection algorithm's potential for practical agricultural applications.

AIDE: An Automatic Data Engine for Object Detection in Autonomous Driving

Authors: Mingfu Liang, Jong-Chyi Su, Samuel Schulter, Sparsh Garg, Shiyu Zhao, Ying Wu, Manmohan Chandraker

Link: http://arxiv.org/abs/2403.17373v1

Abstract: Autonomous vehicle (AV) systems rely on robust perception models as a cornerstone of safety assurance. However, objects encountered on the road exhibit a long-tailed distribution, with rare or unseen categories posing challenges to a deployed perception model. This necessitates an expensive process of continuously curating and annotating data with significant human effort. We propose to leverage recent advances in vision-language and large language models to design an Automatic Data Engine (AIDE) that automatically identifies issues, efficiently curates data, improves the model through auto-labeling, and verifies the model through generation of diverse scenarios. This process operates iteratively, allowing for continuous self-improvement of the model. We further establish a benchmark for open-world detection on AV datasets to comprehensively evaluate various learning paradigms, demonstrating our method's superior performance at a reduced cost.

Staircase Localization for Autonomous Exploration in Urban Environments

Authors: Jinrae Kim, Sunggoo Jung, Sung-Kyun Kim, Youdan Kim, Ali-akbar Agha-mohammadi

Link: http://arxiv.org/abs/2403.17330v1

Abstract: A staircase localization method is proposed for robots to explore urban environments autonomously. The proposed method employs a modular design in the form of a cascade pipeline consisting of three modules: stair detection, line segment detection, and stair localization. The stair detection module utilizes an object detection algorithm based on deep learning to generate a region of interest (ROI). From the ROI, line segment features are extracted using a deep line segment detection algorithm. The extracted line segments are used to localize a staircase in terms of position, orientation, and stair direction. The stair detection and localization are performed only with a single RGB-D camera. Each component of the proposed pipeline does not need to be designed particularly for staircases, which makes it easy to maintain the whole pipeline and replace each component with state-of-the-art deep learning detection techniques. The results of real-world experiments show that the proposed method can perform accurate stair detection and localization during autonomous exploration for various structured and unstructured upstairs and downstairs with shadows, dirt, and occlusions by artificial and natural objects.

Physical 3D Adversarial Attacks against Monocular Depth Estimation in Autonomous Driving

Authors: Junhao Zheng, Chenhao Lin, Jiahao Sun, Zhengyu Zhao, Qian Li, Chao Shen

Link: http://arxiv.org/abs/2403.17301v2

Abstract: Deep learning-based monocular depth estimation (MDE), extensively applied in autonomous driving, is known to be vulnerable to adversarial attacks. Previous physical attacks against MDE models rely on 2D adversarial patches, so they only affect a small, localized region in the MDE map but fail under various viewpoints. To address these limitations, we propose 3D Depth Fool (3D²Fool), the first 3D texture-based adversarial attack against MDE models. 3D²Fool is specifically optimized to generate 3D adversarial textures agnostic to model types of vehicles and to have improved robustness in bad weather conditions, such as rain and fog. Experimental results validate the superior performance of our 3D²Fool across various scenarios, including vehicles, MDE models, weather conditions, and viewpoints. Real-world experiments with printed 3D textures on physical vehicle models further demonstrate that our 3D²Fool can cause an MDE error of over 10 meters.

2024-03-25

Building an Open-Source Community to Enhance Autonomic Nervous System Signal Analysis: DBDP-Autonomic

Authors: Jessilyn Dunn, Varun Mishra, Md Mobashir Hasan Shandhi, Hayoung Jeong, Natasha Yamane, Yuna Watanabe, Bill Chen, Matthew S. Goodwin

Link: http://arxiv.org/abs/2403.17165v3

Abstract: Smartphones and wearable sensors offer an unprecedented ability to collect peripheral psychophysiological signals across diverse timescales, settings, populations, and modalities. However, open-source software development has yet to keep pace with rapid advancements in hardware technology and availability, creating an analytical barrier that limits the scientific usefulness of acquired data. We propose a community-driven, open-source peripheral psychophysiological signal pre-processing and analysis software framework that could advance biobehavioral health by enabling more robust, transparent, and reproducible inferences involving autonomic nervous system data.

RepairAgent: An Autonomous, LLM-Based Agent for Program Repair

Authors: Islem Bouzenia, Premkumar Devanbu, Michael Pradel

Link: http://arxiv.org/abs/2403.17134v1

Abstract: Automated program repair has emerged as a powerful technique to mitigate the impact of software bugs on system reliability and user experience. This paper introduces RepairAgent, the first work to address the program repair challenge through an autonomous agent based on a large language model (LLM). Unlike existing deep learning-based approaches, which prompt a model with a fixed prompt or in a fixed feedback loop, our work treats the LLM as an agent capable of autonomously planning and executing actions to fix bugs by invoking suitable tools. RepairAgent freely interleaves gathering information about the bug, gathering repair ingredients, and validating fixes, while deciding which tools to invoke based on the gathered information and feedback from previous fix attempts. Key contributions that enable RepairAgent include a set of tools that are useful for program repair, a dynamically updated prompt format that allows the LLM to interact with these tools, and a finite state machine that guides the agent in invoking the tools. Our evaluation on the popular Defects4J dataset demonstrates RepairAgent's effectiveness in autonomously repairing 164 bugs, including 39 bugs not fixed by prior techniques. Interacting with the LLM imposes an average cost of 270,000 tokens per bug, which, under the current pricing of OpenAI's GPT-3.5 model, translates to 14 cents of USD per bug. To the best of our knowledge, this work is the first to present an autonomous, LLM-based agent for program repair, paving the way for future agent-based techniques in software engineering.

SynFog: A Photo-realistic Synthetic Fog Dataset based on End-to-end Imaging Simulation for Advancing Real-World Defogging in Autonomous Driving

Authors: Yiming Xie, Henglu Wei, Zhenyi Liu, Xiaoyu Wang, Xiangyang Ji

Link: http://arxiv.org/abs/2403.17094v1

Abstract: To advance research in learning-based defogging algorithms, various synthetic fog datasets have been developed. However, existing datasets created using the Atmospheric Scattering Model (ASM) or real-time rendering engines often struggle to produce photo-realistic foggy images that accurately mimic the actual imaging process. This limitation hinders the effective generalization of models from synthetic to real data. In this paper, we introduce an end-to-end simulation pipeline designed to generate photo-realistic foggy images. This pipeline comprehensively considers the entire physically-based foggy scene imaging process, closely aligning with real-world image capture methods. Based on this pipeline, we present a new synthetic fog dataset named SynFog, which features both sky light and active lighting conditions, as well as three levels of fog density. Experimental results demonstrate that models trained on SynFog exhibit superior performance in visual perception and detection accuracy compared to others when applied to real-world foggy images.

2024-03-22

Blockchain-based Pseudonym Management for Vehicle Twin Migrations in Vehicular Edge Metaverse

Authors: Jiawen Kang, Xiaofeng Luo, Jiangtian Nie, Tianhao Wu, Haibo Zhou, Yonghua Wang, Dusit Niyato, Shiwen Mao, Shengli Xie

Link: http://arxiv.org/abs/2403.15285v1

Abstract: Driven by the great advances in metaverse and edge computing technologies, vehicular edge metaverses are expected to disrupt the current paradigm of intelligent transportation systems. As highly computerized avatars of Vehicular Metaverse Users (VMUs), the Vehicle Twins (VTs) deployed in edge servers can provide valuable metaverse services to improve driving safety and on-board satisfaction for their VMUs throughout journeys. To maintain uninterrupted metaverse experiences, VTs must be migrated among edge servers following the movements of vehicles. This can raise concerns about privacy breaches during the dynamic communications among vehicular edge metaverses. To address these concerns and safeguard location privacy, pseudonyms as temporary identifiers can be leveraged by both VMUs and VTs to realize anonymous communications in the physical space and virtual spaces. However, existing pseudonym management methods fall short in meeting the extensive pseudonym demands in vehicular edge metaverses, thus dramatically diminishing the performance of privacy preservation. To this end, we present a cross-metaverse empowered dual pseudonym management framework. We utilize cross-chain technology to enhance management efficiency and data security for pseudonyms. Furthermore, we propose a metric to assess the privacy level and employ a Multi-Agent Deep Reinforcement Learning (MADRL) approach to obtain an optimal pseudonym generating strategy. Numerical results demonstrate that our proposed schemes are high-efficiency and cost-effective, showcasing their promising applications in vehicular edge metaverses.

2024-03-19

Advancing Explainable Autonomous Vehicle Systems: A Comprehensive Review and Research Roadmap

Authors: Sule Tekkesinoglu, Azra Habibovic, Lars Kunze

Link: http://arxiv.org/abs/2404.00019v1

Abstract: Given the uncertainty surrounding how existing explainability methods for autonomous vehicles (AVs) meet the diverse needs of stakeholders, a thorough investigation is imperative to determine the contexts requiring explanations and suitable interaction strategies. A comprehensive review becomes crucial to assess the alignment of current approaches with the varied interests and expectations within the AV ecosystem. This study presents a review to discuss the complexities associated with explanation generation and presentation to facilitate the development of more effective and inclusive explainable AV systems. Our investigation led to categorising existing literature into three primary topics: explanatory tasks, explanatory information, and explanatory information communication. Drawing upon our insights, we have proposed a comprehensive roadmap for future research centred on (i) knowing the interlocutor, (ii) generating timely explanations, (iii) communicating human-friendly explanations, and (iv) continuous learning. Our roadmap is underpinned by principles of responsible research and innovation, emphasising the significance of diverse explanation requirements. To effectively tackle the challenges associated with implementing explainable AV systems, we have delineated various research directions, including the development of privacy-preserving data integration, ethical frameworks, real-time analytics, human-centric interaction design, and enhanced cross-disciplinary collaborations. By exploring these research directions, the study aims to guide the development and deployment of explainable AVs, informed by a holistic understanding of user needs, technological advancements, regulatory compliance, and ethical considerations, thereby ensuring safer and more trustworthy autonomous driving experiences.

2024-03-18

Holistic HMI Design for Automated Vehicles: Bridging In-Vehicle and External Communication

Authors: Haoyu Dong, Tram Thi Minh Tran, Pavlo Bazilinskyy, Marius Hoggenmüller, Debargha Dey, Silvia Cazacu, Mervyn Franssen, Ruolin Gao

Link: http://arxiv.org/abs/2403.11386v1

Abstract: As the field of automated vehicles (AVs) advances, it has become increasingly critical to develop human-machine interfaces (HMI) for both internal and external communication. Critical dialogue is emerging around the potential necessity for a holistic approach to HMI designs, which promotes the integration of both in-vehicle user and external road user perspectives. This approach aims to create a unified and coherent experience for different stakeholders interacting with AVs. This workshop seeks to bring together designers, engineers, researchers, and other stakeholders to delve into relevant use cases, exploring the potential advantages and challenges of this approach. The insights generated from this workshop aim to inform further design and research in the development of coherent HMIs for AVs, ultimately for more seamless integration of AVs into existing traffic.

A Review of Virtual Reality Studies on Autonomous Vehicle-Pedestrian Interaction

Authors: Tram Thi Minh Tran, Callum Parker, Martin Tomitsch

Link: http://arxiv.org/abs/2403.11378v1

Abstract: An increasing number of studies employ virtual reality (VR) to evaluate interactions between autonomous vehicles (AVs) and pedestrians. VR simulators are valued for their cost-effectiveness, flexibility in developing various traffic scenarios, safe conduct of user studies, and acceptable ecological validity. Reviewing the literature between 2010 and 2020, we found 31 empirical studies using VR as a testing apparatus for both implicit and explicit communication. By performing a systematic analysis, we identified current coverage of critical use cases, obtained a comprehensive account of factors influencing pedestrian behavior in simulated traffic scenarios, and assessed evaluation measures. Based on the findings, we present a set of recommendations for implementing VR pedestrian simulators and propose directions for future research.

2024-03-14

Towards Proactive Interactions for In-Vehicle Conversational Assistants Utilizing Large Language Models

Authors: Huifang Du, Xuejing Feng, Jun Ma, Meng Wang, Shiyu Tao, Yijie Zhong, Yuan-Fang Li, Haofen Wang

Link: http://arxiv.org/abs/2403.09135v1

Abstract: Research demonstrates that the proactivity of in-vehicle conversational assistants (IVCAs) can help to reduce distractions and enhance driving safety, better meeting users' cognitive needs. However, existing IVCAs struggle with user intent recognition and context awareness, which leads to suboptimal proactive interactions. Large language models (LLMs) have shown potential for generalizing to various tasks with prompts, but their application in IVCAs and exploration of proactive interaction remain under-explored. These raise questions about how LLMs improve proactive interactions for IVCAs and influence user perception. To investigate these questions systematically, we establish a framework with five proactivity levels across two dimensions (assumption and autonomy) for IVCAs. According to the framework, we propose a "Rewrite + ReAct + Reflect" strategy, aiming to empower LLMs to fulfill the specific demands of each proactivity level when interacting with users. Both feasibility and subjective experiments are conducted. The LLM outperforms the state-of-the-art model in success rate and achieves satisfactory results for each proactivity level. Subjective experiments with 40 participants validate the effectiveness of our framework and show the proactive level with strong assumptions and user confirmation is most appropriate.

2024-03-11

People Attribute Purpose to Autonomous Vehicles When Explaining Their Behavior

Authors: Balint Gyevnar, Stephanie Droop, Tadeg Quillien

Link: http://arxiv.org/abs/2403.08828v1

Abstract: A hallmark of a good XAI system is explanations that users can understand and act on. In many cases, this requires a system to offer causal or counterfactual explanations that are intelligible. Cognitive science can help us understand what kinds of explanations users might expect, and in which format to frame these explanations. We briefly review relevant literature from the cognitive science of explanation, particularly as it concerns teleology, the tendency to explain a decision in terms of the purpose it was meant to achieve. We then report empirical data on how people generate explanations for the behavior of autonomous vehicles, and how they evaluate these explanations. In a first survey, participants (n=54) were shown videos of a road scene and asked to generate either mechanistic, counterfactual, or teleological verbal explanations for a vehicle's actions. In the second survey, a different set of participants (n=356) rated these explanations along various metrics including quality, trustworthiness, and how much each explanatory mode was emphasized in the explanation. Participants deemed mechanistic and teleological explanations as significantly higher quality than counterfactual explanations. In addition, perceived teleology was the best predictor of perceived quality and trustworthiness. Neither perceived teleology nor quality ratings were affected by whether the car whose actions were being explained was an autonomous vehicle or was being driven by a person. The results show people use and value teleological concepts to evaluate information about both other people and autonomous vehicles, indicating they find the 'intentional stance' a convenient abstraction. We make our dataset of annotated video situations with explanations, called Human Explanations for Autonomous Driving Decisions (HEADD), publicly available, which we hope will prompt further research.

Designing for Projection-based Communication between Autonomous Vehicles and Pedestrians

Authors: Trung Thanh Nguyen, Kai Hollander, Marius Hoggenmueller, Callum Parker, Martin Tomitsch

Link: http://arxiv.org/abs/2403.06429v1

Abstract: Recent studies have investigated new approaches for communicating an autonomous vehicle's (AV) intent and awareness to pedestrians. This paper adds to this body of work by presenting the design and evaluation of in-situ projections on the road. Our design combines common traffic light patterns with aesthetic visual elements. We describe the iterative design process and the prototyping methods used in each stage. The final design concept was represented as a virtual reality simulation and evaluated with 18 participants in four different street crossing scenarios, which included three scenarios that simulated various degrees of system errors. We found that different design elements were able to support participants' confidence in their decision even when the AV failed to correctly detect their presence. We also identified elements in our design that needed to be more clearly communicated. Based on these findings, the paper presents a series of design recommendations for projection-based communication between AVs and pedestrians.

2024-03-08

Designing Wearable Augmented Reality Concepts to Support Scalability in Autonomous Vehicle-Pedestrian Interaction

Authors: Tram Thi Minh Tran, Callum Parker, Yiyuan Wang, Martin Tomitsch

Link: http://arxiv.org/abs/2403.07006v1

Abstract: Wearable augmented reality (AR) offers new ways for supporting the interaction between autonomous vehicles (AVs) and pedestrians due to its ability to integrate timely and contextually relevant data into the user's field of view. This article presents novel wearable AR concepts that assist crossing pedestrians in multi-vehicle scenarios where several AVs frequent the road from both directions. Three concepts with different communication approaches for signaling responses from multiple AVs to a crossing request, as well as a conventional pedestrian push button, were simulated and tested within a virtual reality environment. The results showed that wearable AR is a promising way to reduce crossing pedestrians' cognitive load when the design offers both individual AV responses and a clear signal to cross. The willingness of pedestrians to adopt a wearable AR solution, however, is subject to different factors, including costs, data privacy, technical defects, liability risks, maintenance duties, and form factors. We further found that all participants favored sending a crossing request to AVs rather than waiting for the vehicles to detect their intentions, pointing to an important gap and opportunity in the current AV-pedestrian interaction literature.

Scoping Out the Scalability Issues of Autonomous Vehicle-Pedestrian Interaction

Authors: Tram Thi Minh Tran, Callum Parker, Martin Tomitsch

Link: http://arxiv.org/abs/2403.05727v1

Abstract: Autonomous vehicles (AVs) may use external interfaces, such as LED light bands, to communicate with pedestrians safely and intuitively. While previous research has demonstrated the effectiveness of these interfaces in simple traffic scenarios involving one pedestrian and one vehicle, their performance in more complex scenarios with multiple road users remains unclear. The scalability of AV external communication has therefore attracted increasing attention, prompting the need for further investigation. This scoping review synthesises information from 54 papers to identify seven key scalability issues in multi-vehicle and multi-pedestrian environments, with Clarity of Recipients, Information Overload, and Multi-Lane Safety emerging as the most pressing concerns. To guide future research in scalable AV-pedestrian interactions, we propose high-level design directions focused on three communication loci: vehicle, infrastructure, and pedestrian. Our work contributes the groundwork and a roadmap for designing simplified, coordinated, and targeted external AV communication, ultimately improving safety and efficiency in complex traffic scenarios.

2024-03-07

Pedestrian-Vehicle Interaction in Shared Space: Insights for Autonomous Vehicles

Authors: Yiyuan Wang, Luke Hespanhol, Stewart Worrall, Martin Tomitsch

Link: http://arxiv.org/abs/2403.04933v1

Abstract: Shared space reduces segregation between vehicles and pedestrians and encourages them to share roads without imposed traffic rules. The behaviour of road users (RUs) is then controlled by social norms, and interactions are more versatile than on traditional roads. Autonomous vehicles (AVs) will need to adapt to these norms to become socially acceptable RUs in shared spaces. However, to date, there is not much research into pedestrian-vehicle interaction in shared-space environments, and prior efforts have predominantly focused on traditional roads and crossing scenarios. We present a video observation investigating pedestrian reactions to a small, automation-capable vehicle driven manually in shared spaces based on a long-term naturalistic driving dataset. We report various pedestrian reactions (from movement adjustment to prosocial behaviour) and situations pertinent to shared spaces at this early stage. Insights drawn can serve as a foundation to support future AVs navigating shared spaces, especially those with a high pedestrian focus.

How Can Autonomous Vehicles Convey Emotions to Pedestrians? A Review of Emotionally Expressive Non-Humanoid Robots

Authors: Yiyuan Wang, Luke Hespanhol, Martin Tomitsch

Link: http://arxiv.org/abs/2403.04930v1

Abstract: In recent years, researchers and manufacturers have started to investigate ways to enable autonomous vehicles (AVs) to interact with nearby pedestrians in compensation for the absence of human drivers. The majority of these efforts focuses on external human-machine interfaces (eHMIs), using different modalities, such as light patterns or on-road projections, to communicate the AV's intent and awareness. In this paper, we investigate the potential role of affective interfaces to convey emotions via eHMIs. To date, little is known about the role that affective interfaces can play in supporting AV-pedestrian interaction. However, emotions have been employed in many smaller social robots, from domestic companions to outdoor aerial robots in the form of drones. To develop a foundation for affective AV-pedestrian interfaces, we reviewed the emotional expressions of non-humanoid robots in 25 articles published between 2011 and 2021. Based on findings from the review, we present a set of considerations for designing affective AV-pedestrian interfaces and highlight avenues for investigating these opportunities in future studies.

2024-02-22

"It Must Be Gesturing Towards Me": Gesture-Based Interaction between Autonomous Vehicles and Pedestrians

Authors: Xiang Chang, Zihe Chen, Xiaoyan Dong, Yuxin Cai, Tingmin Yan, Haolin Cai, Zherui Zhou, Guyue Zhou, Jiangtao Gong

Link: http://arxiv.org/abs/2402.14455v1

Abstract: Interacting with pedestrians understandably and efficiently is one of the toughest challenges faced by autonomous vehicles (AVs) due to the limitations of current algorithms and external human-machine interfaces (eHMIs). In this paper, we design eHMIs based on gestures inspired by the most popular method of interaction between pedestrians and human drivers. Eight common gestures were selected to convey AVs' yielding or non-yielding intentions at uncontrolled crosswalks from previous literature. Through a VR experiment (N1 = 31) and a following online survey (N2 = 394), we discovered significant differences in the usability of gesture-based eHMIs compared to current eHMIs. Good gesture-based eHMIs increase the efficiency of pedestrian-AV interaction while ensuring safety. Poor gestures, however, cause misinterpretation. The underlying reasons were explored: ambiguity regarding the recipient of the signal and whether the gestures are precise, polite, and familiar to pedestrians. Based on this empirical evidence, we discuss potential opportunities and provide valuable insights into developing comprehensible gesture-based eHMIs in the future to support better interaction between AVs and other road users.

2024-01-29

A New Framework to Predict and Visualize Technology Acceptance: A Case Study of Shared Autonomous Vehicles

Authors: Lirui Guo, Michael G. Burke, Wynita M. Griggs

Link: http://arxiv.org/abs/2401.15921v1

Abstract: The burgeoning field of Shared Autonomous Vehicles (SAVs) presents transformative potential for the transport sector, subject to public acceptance. Traditional acceptance models, primarily reliant on Structural Equation Modelling (SEM), often fall short of capturing the complex, non-linear dynamics underlying this acceptance. To address these limitations, this paper proposes a Machine Learning (ML) approach to predict public acceptance of SAVs and employs a chord diagram to visualize the influence of different predictors. This approach reveals nuanced, non-linear relationships between factors at both macro and micro levels, and identifies attitude as the primary predictor of SAV usage intention, followed by perceived risk, perceived usefulness, trust, and perceived ease of use. The framework also uncovers divergent perceptions of these factors among SAV adopters and non-adopters, providing granular insights for strategic initiatives to enhance SAV acceptance. Using SAV acceptance as a case study, our findings contribute a novel, machine learning-based perspective to the discourse on technology acceptance, underscoring the importance of nuanced, data-driven approaches in understanding and fostering public acceptance of emerging transport technologies.

2024-01-26

Driving Towards Inclusion: Revisiting In-Vehicle Interaction in Autonomous Vehicles

Authors: Ashish Bastola, Julian Brinkley, Hao Wang, Abolfazl Razi

Link: http://arxiv.org/abs/2401.14571v1

Abstract: This paper presents a comprehensive literature review of the current state of in-vehicle human-computer interaction (HCI) in the context of self-driving vehicles, with a specific focus on inclusion and accessibility. This study's aim is to examine the user-centered design principles for inclusive HCI in self-driving vehicles, evaluate existing HCI systems, and identify emerging technologies that have the potential to enhance the passenger experience. The paper begins by providing an overview of the current state of self-driving vehicle technology, followed by an examination of the importance of HCI in this context. Next, the paper reviews the existing literature on inclusive HCI design principles and evaluates the effectiveness of current HCI systems in self-driving vehicles. The paper also identifies emerging technologies that have the potential to enhance the passenger experience, such as voice-activated interfaces, haptic feedback systems, and augmented reality displays. Finally, the paper proposes an end-to-end design framework for the development of an inclusive in-vehicle experience, which takes into consideration the needs of all passengers, including those with disabilities, or other accessibility requirements. This literature review highlights the importance of user-centered design principles in the development of HCI systems for self-driving vehicles and emphasizes the need for inclusive design to ensure that all passengers can safely and comfortably use these vehicles. The proposed end-to-end design framework provides a practical approach to achieving this goal and can serve as a valuable resource for designers, researchers, and policymakers in this field.

2023-12-15

Beyond Empirical Windowing: An Attention-Based Approach for Trust Prediction in Autonomous Vehicles

Authors: Minxue Niu, Zhaobo Zheng, Kumar Akash, Teruhisa Misu

Link: http://arxiv.org/abs/2312.10209v2

Abstract: Humans' internal states play a key role in human-machine interaction, leading to the rise of human state estimation as a prominent field. Compared to swift state changes such as surprise and irritation, modeling gradual states like trust and satisfaction is further challenged by label sparsity: long time-series signals are usually associated with a single label, making it difficult to identify the critical span of state shifts. Windowing has been one widely-used technique to enable localized analysis of long time-series data. However, the performance of downstream models can be sensitive to the window size, and determining the optimal window size demands domain expertise and extensive search. To address this challenge, we propose a Selective Windowing Attention Network (SWAN), which employs window prompts and masked attention transformation to enable the selection of attended intervals with flexible lengths. We evaluate SWAN on the task of trust prediction on a new multimodal driving simulation dataset. Experiments show that SWAN significantly outperforms an existing empirical window selection baseline and neural network baselines including CNN-LSTM and Transformer. Furthermore, it shows robustness across a wide span of windowing ranges, compared to the traditional windowing approach.

2023-12-14

Acceptance and Trust: Drivers' First Contact with Released Automated Vehicles in Naturalistic Traffic

Authors: Sarah Schwindt-Drews, Kai Storms, Steven Peters, Bettina Abendroth

Link: http://arxiv.org/abs/2312.08957v2

Abstract: This study investigates the impact of initial contact of drivers with an SAE Level 3 Automated Driving System (ADS) under real traffic conditions, focusing on the Mercedes-Benz Drive Pilot in the EQS. It examines Acceptance, Trust, Usability, and User Experience. Although previous studies in simulated environments provided insights into human-automation interaction, real-world experiences can differ significantly. The research was conducted on a segment of German interstate with 30 participants lacking familiarity with Level 3 ADS. Pre- and post-driving questionnaires were used to assess changes in acceptance and confidence. Supplementary metrics included post-driving ratings for usability and user experience. Findings reveal a significant increase in acceptance and trust following the first contact, confirming results from prior simulator studies. Factors such as Performance Expectancy, Effort Expectancy, Facilitating Condition, Self-Efficacy, and Behavioral Intention to use the vehicle were rated higher after initial contact with the ADS. However, inadequate communication from the ADS to the human driver was detected, highlighting the need for improved communication to prevent misuse or confusion about the operating mode. Contrary to prior research, we found no significant impact of general attitudes towards technological innovation on acceptance and trust. However, it's worth noting that most participants already had a high affinity for technology. Although overall reception was positive and showed an upward trend post first contact, the ADS was also perceived as demanding as manual driving. Future research should focus on a more diverse participant sample and include longer or multiple real-traffic trips to understand behavioral adaptations over time.

2023-12-11

Pedestrian and Passenger Interaction with Autonomous Vehicles: Field Study in a Crosswalk Scenario

Authors: Rubén Izquierdo, Javier Alonso, Ola Benderius, Miguel Ángel Sotelo, David Fernández Llorca

Link: http://arxiv.org/abs/2312.07606v1

Abstract: This study presents the outcomes of empirical investigations pertaining to human-vehicle interactions involving an autonomous vehicle equipped with both internal and external Human Machine Interfaces (HMIs) within a crosswalk scenario. The internal and external HMIs were integrated with implicit communication techniques, incorporating a combination of gentle and aggressive braking maneuvers within the crosswalk. Data were collected through a combination of questionnaires and quantifiable metrics, including pedestrian decision to cross related to the vehicle distance and speed. The questionnaire responses reveal that pedestrians experience enhanced safety perceptions when the external HMI and gentle braking maneuvers are used in tandem. In contrast, the measured variables demonstrate that the external HMI proves effective when complemented by the gentle braking maneuver. Furthermore, the questionnaire results highlight that the internal HMI enhances passenger confidence only when paired with the aggressive braking maneuver.

2023-11-28

Creating inclusive mobility systems: towards age and education sensitive interventions for enhancing autonomous vehicle acceptance

Authors: Celina Kacperski, Roberto Ulloa, Jeremy Wautelet, Tobias Vogel, Florian Kutzner

Link: http://arxiv.org/abs/2311.16780v1

Abstract: The familiarity principle posits that acceptance increases with exposure, which has previously been shown with in vivo and simulated experiences with connected and autonomous vehicles (CAVs). We investigate the impact of a simulated video-based first-person drive on CAV acceptance, as well as the impact of information customization, with a particular focus on acceptance by older individuals and those with lower education. Findings from an online experiment with N=799 German residents reveal that the simulated experience improved acceptance across response variables such as intention to use and ease of use, particularly among older individuals. However, the opportunity to customize navigation information decreased acceptance of older individuals and those with university degrees and increased acceptance for younger individuals and those with lower educational levels.

2023-11-21

Comparing autonomous vehicle acceptance of German residents with and without visual impairments

Authors: Celina Kacperski, Florian Kutzner, Tobias Vogel

Link: http://arxiv.org/abs/2311.12900v2

Abstract: Connected and autonomous vehicles (CAVs) will greatly impact the lives of individuals with visual impairments, but how they differ in expectations compared to sighted individuals is not clear. The present research reports results based on survey responses from 114 visually impaired participants and 117 panel recruited participants without visual impairments, from Germany. Their attitudes towards autonomous vehicles and their expectations for consequences of wide-spread adoption of CAVs are assessed. Results indicate significantly more positive CAV attitudes in participants with visual impairments compared to those without visual impairments. Mediation analyses indicate that visually impaired individuals' more positive CAV attitudes (compared to sighted individuals') are largely explained by higher hopes for independence, and more optimistic expectations regarding safety and sustainability. Policy makers should ensure accessibility without sacrificing goals for higher safety and lower ecological impact to make CAVs an acceptable inclusive mobility solution.

2023-11-15

In-vehicle Sensing and Data Analysis for Older Drivers with Mild Cognitive Impairment

Authors: Sonia Moshfeghi, Muhammad Tanveer Jan, Joshua Conniff, Seyedeh Gol Ara Ghoreishi, Jinwoo Jang, Borko Furht, Kwangsoo Yang, Monica Rosselli, David Newman, Ruth Tappen, Dana Smith

Link: http://arxiv.org/abs/2311.09273v1

Abstract: Driving is a complex daily activity indicating age and disease related cognitive declines. Therefore, deficits in driving performance relative to drivers without mild cognitive impairment (MCI) can reflect changes in cognitive functioning. There is increasing evidence that unobtrusive monitoring of older adults' driving performance in a daily-life setting may allow us to detect subtle early changes in cognition. The objectives of this paper include designing low-cost in-vehicle sensing hardware capable of obtaining high-precision positioning and telematics data, identifying important indicators for early changes in cognition, and detecting early-warning signs of cognitive impairment in a truly normal, day-to-day driving condition with machine learning approaches. Our statistical analysis comparing drivers with MCI to those without reveals that those with MCI exhibit smoother and safer driving patterns. This suggests that drivers with MCI are cognizant of their condition and tend to avoid erratic driving behaviors. Furthermore, our Random Forest models identified the number of night trips, number of trips, and education as the most influential factors in our data evaluation.

2023-11-10

Can Machine Learning Uncover Insights into Vehicle Travel Demand from Our Built Environment?

Authors: Zixun Huang, Hao Zheng

Link: http://arxiv.org/abs/2311.06321v1

Abstract: In this paper, we propose a machine learning-based approach to address the lack of ability for designers to optimize urban land use planning from the perspective of vehicle travel demand. Research shows that our computational model can help designers quickly obtain feedback on the vehicle travel demand, which includes its total amount and temporal distribution based on the urban function distribution designed by the designers. It also assists in design optimization and evaluation of the urban function distribution from the perspective of vehicle travel. We obtain the city function distribution information and vehicle hours traveled (VHT) information by collecting the city point-of-interest (POI) data and online vehicle data. The artificial neural networks (ANNs) with the best performance in prediction are selected. By using data sets collected in different regions for mutual prediction and remapping the predictions onto a map for visualization, we evaluate the extent to which the computational model sees use across regions in an attempt to reduce the workload of future urban researchers. Finally, we demonstrate the application of the computational model to help designers obtain feedback on vehicle travel demand in the built environment and combine it with genetic algorithms to optimize the current state of the urban environment to provide recommendations to designers.

2023-11-07

"Tell me about that church": Exploring the Design and User Experience of In-Vehicle Multi-modal Intuitive Interface in the Context of Driving Scenario

Authors: Yueteng Yu, Yan Zhang, Gary Burnett

Link: http://arxiv.org/abs/2311.04160v1

Abstract: Intuitive interaction has long been seen as a highly user-friendly method. There are attempts to implement intuitive interfaces in vehicles in both research and industry, such as voice commands. However, there is a lack of exploration of in-vehicle multi-modal intuitive interaction, especially under a dynamic driving scenario. In this research, we conducted a design workshop (N=6) to understand users' needs and designers' considerations on the in-vehicle multi-modal intuitive interface, based on which we implemented our design on both a simulator and a real autonomous vehicle using Wizard-of-Oz. We conducted a user experiment (N=12) on the simulator to explore determinants of users' acceptance, experience, and behavior. We found that acceptance was significantly influenced by six determinants. Drivers' behavior has an obvious pattern of change. Drivers were shown to have less workload, but distractions were also reported. Our findings offer empirical evidence which could give insights into future vehicle design.

2023-10-29

Social Interaction-Aware Dynamical Models and Decision Making for Autonomous Vehicles

Authors: Luca Crosato, Kai Tian, Hubert P. H Shum, Edmond S. L. Ho, Yafei Wang, Chongfeng Wei

Link: http://arxiv.org/abs/2310.18891v2

Abstract: Interaction-aware Autonomous Driving (IAAD) is a rapidly growing field of research that focuses on the development of autonomous vehicles (AVs) that are capable of interacting safely and efficiently with human road users. This is a challenging task, as it requires the autonomous vehicle to be able to understand and predict the behaviour of human road users. This literature review surveys the current state of IAAD research. Commencing with an examination of terminology, attention is drawn to challenges and existing models employed for modelling the behaviour of drivers and pedestrians. Next, a comprehensive review is conducted on various techniques proposed for interaction modelling, encompassing cognitive methods, machine learning approaches, and game-theoretic methods. The review concludes with a discussion of potential advantages and risks associated with IAAD, along with pivotal research questions that require future exploration.

2023-10-18

HIFuzz: Human Interaction Fuzzing for small Unmanned Aerial Vehicles

Authors: Theodore Chambers, Michael Vierhauser, Ankit Agrawal, Michael Murphy, Jason Matthew Brauer, Salil Purandare, Myra B. Cohen, Jane Cleland-Huang

Link: http://arxiv.org/abs/2310.12058v2

Abstract: Small Unmanned Aerial Systems (sUAS) must meet rigorous safety standards when deployed in high-stress emergency response scenarios; however many reported accidents have involved humans in the loop. In this paper, we, therefore, present the HiFuzz testing framework, which uses fuzz testing to identify system vulnerabilities associated with human interactions. HiFuzz includes three distinct levels that progress from a low-cost, limited-fidelity, large-scale, no-hazard environment, using fully simulated Proxy Human Agents, via an intermediate level, where proxy humans are replaced with real humans, to a high-stakes, high-cost, real-world environment. Through applying HiFuzz to an autonomous multi-sUAS system-under-test, we show that each test level serves a unique purpose in revealing vulnerabilities and making the system more robust with respect to human mistakes. While HiFuzz is designed for testing sUAS systems, we further discuss its potential for use in other Cyber-Physical Systems.

2023-10-17

Classification of Safety Driver Attention During Autonomous Vehicle Operation

Authors: Santiago Gerling Konrad, Julie Stephany Berrio, Mao Shan, Favio Masson, Stewart Worrall

Link: http://arxiv.org/abs/2310.11608v1

Abstract: Despite the continual advances in Advanced Driver Assistance Systems (ADAS) and the development of high-level autonomous vehicles (AV), there is a general consensus that for the short to medium term, there is a requirement for a human supervisor to handle the edge cases that inevitably arise. Given this requirement, it is essential that the state of the vehicle operator is monitored to ensure they are contributing to the vehicle's safe operation. This paper introduces a dual-source approach integrating data from an infrared camera facing the vehicle operator and vehicle perception systems to produce a metric for driver alertness in order to promote and ensure safe operator behaviour. The infrared camera detects the driver's head, enabling the calculation of head orientation, which is relevant as the head typically moves according to the individual's focus of attention. By incorporating environmental data from the perception system, it becomes possible to determine whether the vehicle operator observes objects in the surroundings. Experiments were conducted using data collected in Sydney, Australia, simulating AV operations in an urban environment. Our results demonstrate that the proposed system effectively determines a metric for the attention levels of the vehicle operator, enabling interventions such as warnings or reducing autonomous functionality as appropriate. This comprehensive solution shows promise in contributing to ADAS and AVs' overall safety and efficiency in a real-world setting.

2023-10-12

Would you trust a vehicle merging into your lane? Subjective evaluation of negotiating behaviour in a congested merging scenario

Authors: Akinobu Goto, Kerstin Eder

Link: http://arxiv.org/abs/2310.08361v1

Abstract: Aiming for a society where humans and automated vehicles can coexist cooperatively, understanding what constitutes cooperative and trustworthy behaviour is essential to designing automated vehicle controllers that enable the integration of highly automated vehicles into the real world. This study investigates how merging vehicles can gain trust from human-driven vehicles in a congested merging situation that requires explicit and implicit communication. Specifically, this study examines how the different behaviours of merging vehicles in the preparatory phase of the merge affect perceived trust from the perspective of the host vehicle in the mainstream lane. The findings suggest that transparent longitudinal positioning could improve the chance of successful merging, and cooperative deceleration during merging preparation could enhance the trust perceived by the host vehicle. Furthermore, the results reveal that, in time-sensitive situations where the merging vehicle approaches a lane closing point, prompt and decisive action of the merging vehicle encourages establishing trust with the host vehicle; any delay or hesitation can result in a lower level of trust. The results can provide valuable insights towards developing collaborative automated vehicles that improve safety and efficiency in real-world traffic situations that involve humans.

Receive, Reason, and React: Drive as You Say with Large Language Models in Autonomous Vehicles

Authors: Can Cui, Yunsheng Ma, Xu Cao, Wenqian Ye, Ziran Wang

Link: http://arxiv.org/abs/2310.08034v1

Abstract: The fusion of human-centric design and artificial intelligence (AI) capabilities has opened up new possibilities for next-generation autonomous vehicles that go beyond transportation. These vehicles can dynamically interact with passengers and adapt to their preferences. This paper proposes a novel framework that leverages Large Language Models (LLMs) to enhance the decision-making process in autonomous vehicles. By utilizing LLMs' linguistic and contextual understanding abilities with specialized tools, we aim to integrate the language and reasoning capabilities of LLMs into autonomous vehicles. Our research includes experiments in HighwayEnv, a collection of environments for autonomous driving and tactical decision-making tasks, to explore LLMs' interpretation, interaction, and reasoning in various scenarios. We also examine real-time personalization, demonstrating how LLMs can influence driving behaviors based on verbal commands. Our empirical results highlight the substantial advantages of utilizing chain-of-thought prompting, leading to improved driving decisions, and showing the potential for LLMs to enhance personalized driving experiences through ongoing verbal feedback. The proposed framework aims to transform autonomous vehicle operations, offering personalized support, transparent decision-making, and continuous learning to enhance safety and effectiveness. We achieve user-centric, transparent, and adaptive autonomous driving ecosystems supported by the integration of LLMs into autonomous vehicles.

2023-10-09

Evaluating a VR System for Collecting Safety-Critical Vehicle-Pedestrian Interactions

Authors: Erica Weng, Kenta Mukoya, Deva Ramanan, Kris Kitani

Link: http://arxiv.org/abs/2310.05882v1

Abstract: Autonomous vehicles (AVs) require comprehensive and reliable pedestrian trajectory data to ensure safe operation. However, obtaining data of safety-critical scenarios such as jaywalking and near-collisions, or uncommon agents such as children, disabled pedestrians, and vulnerable road users poses logistical and ethical challenges. This paper evaluates a Virtual Reality (VR) system designed to collect pedestrian trajectory and body pose data in a controlled, low-risk environment. We substantiate the usefulness of such a system through semi-structured interviews with professionals in the AV field, and validate the effectiveness of the system through two empirical studies: a first-person user evaluation involving 62 participants, and a third-person evaluative survey involving 290 respondents. Our findings demonstrate that the VR-based data collection system elicits realistic responses for capturing pedestrian data in safety-critical or uncommon vehicle-pedestrian interaction scenarios.

2023-10-04

Curve Trajectory Model for Human Preferred Path Planning of Automated Vehicles

Authors: Gergo Igneczi, Erno Horvath, Roland Toth, Krisztian Nyilas

Link: http://arxiv.org/abs/2310.02696v1

Abstract: Automated driving systems are often used for lane keeping tasks. By these systems, a local path is planned ahead of the vehicle. However, these paths are often found unnatural by human drivers. We propose a linear driver model, which can calculate node points that reflect the preferences of human drivers and based on these node points a human driver preferred motion path can be designed for autonomous driving. The model input is the road curvature. We apply this model to a self-developed Euler-curve-based curve fitting algorithm. Through a case study, we show that the model based planned path can reproduce the average behavior of human curve path selection. We analyze the performance of the proposed model through statistical analysis that shows the validity of the captured relations.

2023-10-02

It's all about you: Personalized in-Vehicle Gesture Recognition with a Time-of-Flight Camera

Authors: Amr Gomaa, Guillermo Reyes, Michael Feld

Link: http://arxiv.org/abs/2310.01659v1

Abstract: Despite significant advances in gesture recognition technology, recognizing gestures in a driving environment remains challenging due to limited and costly data and its dynamic, ever-changing nature. In this work, we propose a model-adaptation approach to personalize the training of a CNN-LSTM model and improve recognition accuracy while reducing data requirements. Our approach contributes to the field of dynamic hand gesture recognition while driving by providing a more efficient and accurate method that can be customized for individual users, ultimately enhancing the safety and convenience of in-vehicle interactions, as well as driver's experience and system trust. We incorporate hardware enhancement using a time-of-flight camera and algorithmic enhancement through data augmentation, personalized adaptation, and incremental learning techniques. We evaluate the performance of our approach in terms of recognition accuracy, achieving up to 90%, and show the effectiveness of personalized adaptation and incremental learning for a user-centered design.

2023-09-19

Defining, measuring, and modeling passenger's in-vehicle experience and acceptance of automated vehicles

Authors: Neeraja Bhide, Nanami Hashimoto, Kazimierz Dokurno, Chris Van der Hoorn, Sascha Hoogendoorn-Lanser, Sina Nordhoff

Link: http://arxiv.org/abs/2309.10596v1

Abstract: Automated vehicle acceptance (AVA) has been measured mostly subjectively by questionnaires and interviews, with a main focus on drivers inside automated vehicles (AVs). To ensure that AVs are widely accepted by the public, ensuring the acceptance by both drivers and passengers is key. The in-vehicle experience of passengers will determine the extent to which AVs will be accepted by passengers. A comprehensive understanding of potential assessment methods to measure the passenger experience in AVs is needed to improve the in-vehicle experience of passengers and thereby the acceptance. The present work provides an overview of assessment methods that were used to measure a driver's behavior, and cognitive and emotional states during (automated) driving. The results of the review have shown that these assessment methods can be classified by type of data-collection method (e.g., questionnaires, interviews, direct input devices, sensors), object of their measurement (i.e., perception, behavior, state), time of measurement, and degree of objectivity of the data collected. A conceptual model synthesizes the results of the literature review, formulating relationships between the factors constituting the in-vehicle experience and AVA. It is theorized that the in-vehicle experience influences the intention to use, with intention to use serving as predictor of actual use. The model also formulates relationships between actual use and well-being. A combined approach of using both subjective and objective assessment methods is needed to provide more accurate estimates for AVA, and advance the uptake and use of AVs.

Instrument for the assessment of road user automated vehicle acceptance: A pyramid of user needs of automated vehicles

Authors: Sina Nordhoff, Marjan Hagenzieker, Esko Lehtonen, Michael Oehl, Marc Wilbrink, Ibrahim Ozturk, David Maggi, Natacha Métayer, Gaëtan Merlhiot, Natasha Merat

Link: http://arxiv.org/abs/2309.10559v1

Abstract: This study proposed a new methodological approach for the assessment of automated vehicle acceptance (AVA) from the perspective of road users inside and outside of AVs pre- and post- AV experience. Users can be drivers and passengers, but also external road users, such as pedestrians, (motor-)cyclists, and other car drivers, interacting with AVs. A pyramid was developed, which provides a hierarchical representation of user needs. Fundamental user needs are organized at the bottom of the pyramid, while higher-level user needs are at the top of the pyramid. The pyramid distinguishes between six levels of needs, which are safety, trust, efficiency, comfort and pleasure, social influence, and well-being. Some user needs universally exist across users, while some are user-specific needs. These needs are translated into operationalizable indicators representing items of a questionnaire for the assessment of AVA of users inside and outside AVs. The formulation of the questionnaire items was derived from established technology acceptance models. As the instrument was based on the same model for all road users, the comparison of AVA between different road users is now possible. We recommend future research to validate this questionnaire, administering it in studies to contribute to the development of a short, efficient, and standardized metric for the assessment of AVA.

Understanding and addressing the resistance towards autonomous vehicles (AVs)

Authors: Sina Nordhoff

Link: http://arxiv.org/abs/2309.10484v1

Abstract: Autonomous vehicles (AVs) are expected to bring major benefits to transport and society. To exploit this potential, their acceptance by society is a necessary condition. However, AV acceptance is currently at stake: AVs face resistance by bystanders and local communities. Resistance can prevent the implementation and use of AVs, threatening road safety and efficiency. The present study performed a qualitative and quantitative text analysis of comments submitted by locals in San Francisco (SF) to the California Public Utilities Commission (CPUC) on the fared deployment of AVs. The results of the analysis are synthesized, and a conceptual framework explaining and predicting resistance is proposed. The framework posits that the occurrence of resistance is a direct result of the perception of threats, which is determined by individual and system characteristics, direct and indirect consequences of system use, reactions of others, and external events. AVs as threat to safety was associated with their unpredictable, and illegal driving behavior, as well as producing conflict situations. The lack of explicit communication between AVs and other road users due to the absence of a human driver behind the steering wheel negatively contributed to perceived safety and trust, especially for vulnerable populations in crossing situations. Respondents reported a negative impact on road capacity, congestion, and traffic flow, with AVs blocking other road users, such as emergency vehicles. Inaccessible vehicle design contributed to the exclusion of vulnerable groups with disabilities. The scientific dialogue on acceptance of AVs needs to shift towards resistance as the 'other' essential element of acceptance to ensure that we live up to our promise of transitioning towards more sustainable mobility that is inclusive, equitable, fair, just, affordable, and available to all.

Drive as You Speak: Enabling Human-Like Interaction with Large Language Models in Autonomous Vehicles

Authors: Can Cui, Yunsheng Ma, Xu Cao, Wenqian Ye, Ziran Wang

Link: http://arxiv.org/abs/2309.10228v1

Abstract: The future of autonomous vehicles lies in the convergence of human-centric design and advanced AI capabilities. Autonomous vehicles of the future will not only transport passengers but also interact and adapt to their desires, making the journey comfortable, efficient, and pleasant. In this paper, we present a novel framework that leverages Large Language Models (LLMs) to enhance autonomous vehicles' decision-making processes. By integrating LLMs' natural language capabilities and contextual understanding, specialized tools usage, synergizing reasoning, and acting with various modules on autonomous vehicles, this framework aims to seamlessly integrate the advanced language and reasoning capabilities of LLMs into autonomous vehicles. The proposed framework holds the potential to revolutionize the way autonomous vehicles operate, offering personalized assistance, continuous learning, and transparent decision-making, ultimately contributing to safer and more efficient autonomous driving technologies.

2023-09-08

Enabling the Evaluation of Driver Physiology Via Vehicle Dynamics

Authors: Rodrigo Ordonez-Hurtado, Bo Wen, Nicholas Barra, Ryan Vimba, Sergio Cabrero-Barros, Sergiy Zhuk, Jeffrey L. Rogers

Link: http://arxiv.org/abs/2309.04078v1

Abstract: Driving is a daily routine for many individuals across the globe. This paper presents the configuration and methodologies used to transform a vehicle into a connected ecosystem capable of assessing driver physiology. We integrated an array of commercial sensors from the automotive and digital health sectors along with driver inputs from the vehicle itself. This amalgamation of sensors allows for meticulous recording of the external conditions and driving maneuvers. These data streams are processed to extract key parameters, providing insights into driver behavior in relation to their external environment and illuminating vital physiological responses. This innovative driver evaluation system holds the potential to amplify road safety. Moreover, when paired with data from conventional health settings, it may enhance early detection of health-related complications.

2023-08-30

Assessing Drivers' Situation Awareness in Semi-Autonomous Vehicles: ASP based Characterisations of Driving Dynamics for Modelling Scene Interpretation and Projection

Authors: Jakob Suchan, Jan-Patrick Osterloh

Link: http://arxiv.org/abs/2308.15895v1

Abstract: Semi-autonomous driving, as it is already available today and will eventually become even more accessible, implies the need for driver and automation system to reliably work together in order to ensure safe driving. A particular challenge in this endeavour is posed by situations in which the vehicle's automation is no longer able to drive and is thus requesting the human to take over. In these situations the driver has to quickly build awareness for the traffic situation to be able to take over control and safely drive the car. Within this context we present a software and hardware framework to assess how aware the driver is about the situation and to provide human-centred assistance to help in building situation awareness. The framework is developed as a modular system within the Robot Operating System (ROS) with modules for sensing the environment and the driver state, modelling the driver's situation awareness, and for guiding the driver's attention using specialized Human Machine Interfaces (HMIs). A particular focus of this paper is on an Answer Set Programming (ASP) based approach for modelling and reasoning about the driver's interpretation and projection of the scene. This is based on scene data, as well as eye-tracking data reflecting the scene elements observed by the driver. We present the overall application and discuss the role of semantic reasoning and modelling cognitive functions based on logic programming in such applications. Furthermore we present the ASP approach for interpretation and projection of the driver's situation awareness and its integration within the overall system in the context of a real-world use-case in simulated as well as in real driving.

2023-08-11

Pedestrian Trajectory Prediction in Pedestrian-Vehicle Mixed Environments: A Systematic Review

Authors: Mahsa Golchoubian, Moojan Ghafurian, Kerstin Dautenhahn, Nasser Lashgarian Azad

Link: http://arxiv.org/abs/2308.06419v1

Abstract: Planning an autonomous vehicle's (AV) path in a space shared with pedestrians requires reasoning about pedestrians' future trajectories. A practical pedestrian trajectory prediction algorithm for the use of AVs needs to consider the effect of the vehicle's interactions with the pedestrians on pedestrians' future motion behaviours. In this regard, this paper systematically reviews different methods proposed in the literature for modelling pedestrian trajectory prediction in presence of vehicles that can be applied for unstructured environments. This paper also investigates specific considerations for pedestrian-vehicle interaction (compared with pedestrian-pedestrian interaction) and reviews how different variables such as prediction uncertainties and behavioural differences are accounted for in the previously proposed prediction models. PRISMA guidelines were followed. Articles that did not consider vehicle and pedestrian interactions or actual trajectories, and articles that only focused on road crossing were excluded. A total of 1260 unique peer-reviewed articles from ACM Digital Library, IEEE Xplore, and Scopus databases were identified in the search. 64 articles were included in the final review as they met the inclusion and exclusion criteria. An overview of datasets containing trajectory data of both pedestrians and vehicles used by the reviewed papers has been provided. Research gaps and directions for future work, such as having more effective definition of interacting agents in deep learning methods and the need for gathering more datasets of mixed traffic in unstructured environments are discussed.

2023-08-04

Designing for Passengers' Information Needs on Fellow Travelers: A Comparison of Day and Night Rides in Shared Automated Vehicles

Authors: Lukas A. Flohr, Martina Schuß, Dieter P. Wallach, Antonio Krüger, Andreas Riener

Link: http://arxiv.org/abs/2308.02616v1

Abstract: Shared automated mobility-on-demand promises efficient, sustainable, and flexible transportation. Nevertheless, security concerns, resilience, and their mutual influence - especially at night - will likely be the most critical barriers to public adoption since passengers have to share rides with strangers without a human driver on board. As related work points out that information about fellow travelers might mitigate passengers' concerns, we designed two user interface variants to investigate the role of this information in an exploratory within-subjects user study (N = 24). Participants experienced four automated day and night rides with varying personal information about co-passengers in a simulated environment. The results of the mixed-method study indicate that having information about other passengers (e.g., photo, gender, and name) positively affects user experience at night. In contrast, it is less necessary during the day. Considering participants' simultaneously raised privacy demands poses a substantial challenge for resilient system design.

Human-Centered Design and Evaluation of a Workplace for the Remote Assistance of Highly Automated Vehicles

Authors: Andreas Schrank, Fabian Walocha, Stefan Brandenburg, Michael Oehl

Link: http://arxiv.org/abs/2308.02330v1

Abstract: Remotely operating vehicles utilizes the benefits of vehicle automation when fully automated driving is not yet possible. A human operator ensures safety and availability from afar and supports the vehicle automation when its capabilities are exceeded. The remote operator thus fulfills the legal requirements in Germany as a Technical Supervisor to operate highly automated vehicles at SAE 4. To integrate the remote operator into the automated driving system, a novel user-centered human-machine interface (HMI) for remote assistance workplaces was developed and initially evaluated. The insights gained in this process were incorporated into the design of a workplace prototype for remote assistance. This prototype was now tested in the study reported here by 34 participants meeting the professional background criteria for the role of Technical Supervisor according to the German law by using typical remote assistance scenarios created in a simulation environment. Even under elevated cognitive load induced by simultaneously engaging in a secondary task, participants were able to obtain sufficient situation awareness and quickly resolve the scenarios. The HMI also yielded favorable usability and acceptance ratings. The results of the study inform the iterative workplace development and further research on the remote assistance of highly automated vehicles.

2023-07-24

Human-vehicle interaction for autonomous vehicles in crosswalk scenarios: Field experiments with pedestrians and passengers

Authors: R. Izquierdo, S. Martín, J. Alonso, I. Parra, M. A., D. Fernández-Llorca

Link: http://arxiv.org/abs/2307.12708v1

Abstract: This paper presents the results of real-world testing of human-vehicle interactions with an autonomous vehicle equipped with internal and external Human Machine Interfaces (HMIs) in a crosswalk scenario. The internal and external HMIs were combined with implicit communication techniques using gentle and aggressive braking maneuvers in the crosswalk. Results have been collected in the form of questionnaires and measurable variables such as distance or speed when the pedestrian decides to cross. The questionnaires show that pedestrians feel safer when the external HMI or the gentle braking maneuver is used interchangeably, while the measured variables show that external HMI only helps in combination with the gentle braking maneuver. The questionnaires also show that internal HMI only improves passenger confidence in combination with the aggressive braking maneuver.

2023-07-15

Study on the Impacts of Hazardous Behaviors on Autonomous Vehicle Collision Rates Based on Humanoid Scenario Generation in CARLA

Authors: Longfei Mo, Min Hua, Hongyu Sun, Hongming Xu, Bin Shuai, Quan Zhou

Link: http://arxiv.org/abs/2307.10229v1

Abstract: Testing of functional safety and Safety Of The Intended Functionality (SOTIF) is important for autonomous vehicles (AVs). It is hard to test the AV's hazard response in the real world because it would involve hazards to passengers and other road users. This paper studies virtual testing of AVs on the CARLA platform and proposes a Humanoid Scenario Generation (HSG) scheme to investigate the impacts of hazardous behaviors on AV collision rates. The HSG scheme breaks through the current limitations on the rarity and reproducibility of real scenes. By accurately capturing five prominent human driver behaviors that directly contribute to vehicle collisions in the real world, the methodology significantly enhances the realism and diversity of the simulation, as evidenced by collision rate statistics across various traffic scenarios. Thus, the modular framework allows for customization, and its seamless integration within the CARLA platform ensures compatibility with existing tools. Ultimately, the comparison results demonstrate that all vehicles that exhibited hazardous behaviors followed the predefined random speed distribution, and the effectiveness of the HSG was validated by the distinct characteristics displayed by these behaviors.

2023-07-12

Assessing Augmented Reality Selection Techniques for Passengers in Moving Vehicles: A Real-World User Study

Authors: Robin Connor Schramm, Markus Sasalovici, Axel Hildebrand, Ulrich Schwanecke

Link: http://arxiv.org/abs/2307.06173v1

Abstract: Nowadays, cars offer many possibilities to explore the world around you by providing location-based information displayed on a 2D-Map. However, this information is often only available to front-seat passengers while being restricted to in-car displays. To propose a more natural way of interacting with the environment, we implemented an augmented reality head-mounted display to overlay points of interest onto the real world. We aim to compare multiple selection techniques for digital objects located outside a moving car by investigating head gaze with dwell time, head gaze with hardware button, eye gaze with hardware button, and hand pointing with gesture confirmation. Our study was conducted in a moving car under real-world conditions (N=22), with significant results indicating that hand pointing usage led to slower and less precise content selection while eye gaze was preferred by participants and performed on par with the other techniques.

2023-06-21

Seat pan angle optimization for vehicle ride comfort using finite element model of human spine

Authors: Raj Desai, Ankit Vekaria, Anirban Guha, P. Seshu

Link: http://arxiv.org/abs/2306.12354v1

Abstract: Ride comfort of the driver/occupant of a vehicle has been usually analyzed by multibody biodynamic models of human beings. Accurate modeling of critical segments of the human body, e.g. the spine requires these models to have a very high number of segments. The resultant increase in degrees of freedom makes these models difficult to analyze and not able to provide certain details such as seat pressure distribution, the effect of cushion shapes, material, etc. This work presents a finite element based model of a human being seated in a vehicle in which the spine has been modelled in 3-D. It consists of cervical to coccyx vertebrae, ligaments, and discs and has been validated against modal frequencies reported in the literature. It was then subjected to sinusoidal vertical RMS acceleration of 0.1 g for mimicking road induced vibration. The dynamic characteristics of the human body were studied in terms of the seat to head transmissibility and intervertebral disc pressure. The effect of the seat pan angle on these parameters was studied and it was established that the optimum angle should lie between 15 and 19 degrees. This work is expected to be followed up by more simulations of this nature to study other human body comfort and seat design related parameters leading to optimized seat designs for various ride conditions.
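
The two quantities named in this abstract, an RMS sinusoidal input and seat-to-head transmissibility, can be illustrated with a much simpler stand-in than the paper's finite element model. The following is a minimal sketch that converts an RMS amplitude to a peak amplitude and evaluates the textbook transmissibility of a single-degree-of-freedom system under base excitation. The lumped model, the function names, and the natural frequency and damping values in the usage lines are illustrative assumptions, not parameters from the paper.

```python
import math


def rms_to_peak(rms: float) -> float:
    """Peak amplitude of a pure sinusoid with the given RMS value."""
    return rms * math.sqrt(2.0)


def base_excitation_transmissibility(
    freq_hz: float, natural_hz: float, damping_ratio: float
) -> float:
    """Output/input amplitude ratio of a 1-DOF mass-spring-damper
    excited at its base (a crude stand-in for seat-to-head transmissibility)."""
    r = freq_hz / natural_hz  # frequency ratio
    num = 1.0 + (2.0 * damping_ratio * r) ** 2
    den = (1.0 - r * r) ** 2 + (2.0 * damping_ratio * r) ** 2
    return math.sqrt(num / den)


# Usage: 0.1 g RMS input, hypothetical 5 Hz natural frequency and 0.25 damping ratio.
peak_g = rms_to_peak(0.1)
ratio_at_resonance = base_excitation_transmissibility(5.0, 5.0, 0.25)
print(f"peak input: {peak_g:.4f} g, transmissibility at resonance: {ratio_at_resonance:.3f}")
```

In this lumped model the ratio exceeds 1 near the natural frequency and drops below 1 above sqrt(2) times that frequency, which is the usual reason seat parameters are tuned against the dominant excitation band.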

2023-06-14

A new computational perceived risk model for automated vehicles based on potential collision avoidance difficulty (PCAD)

Authors: Xiaolin He, Riender Happee, Meng Wang

Link: http://arxiv.org/abs/2306.08458v1

Abstract: Perceived risk is crucial in designing trustworthy and acceptable vehicle automation systems. However, our understanding of its dynamics is limited, and models for perceived risk dynamics are scarce in the literature. This study formulates a new computational perceived risk model based on potential collision avoidance difficulty (PCAD) for drivers of SAE level 2 driving automation. PCAD uses the 2D safe velocity gap as the potential collision avoidance difficulty, and takes into account collision severity. The safe velocity gap is defined as the 2D gap between the current velocity and the safe velocity region, and represents the amount of braking and steering needed, considering behavioural uncertainty of neighbouring vehicles and imprecise control of the subject vehicle. The PCAD predicts perceived risk both in continuous time and per event. We compare the PCAD model with three state-of-the-art models and analyse the models both theoretically and empirically with two unique datasets: Dataset Merging and Dataset Obstacle Avoidance. The PCAD model generally outperforms the other models in terms of model error, detection rate, and the ability to accurately capture the tendencies of human drivers' perceived risk, albeit at a longer computation time. Additionally, the study shows that the perceived risk is not static and varies with the surrounding traffic conditions. This research advances our understanding of perceived risk in automated driving and paves the way for improved safety and acceptance of driving automation systems.

2023-05-30

Large Car-following Data Based on Lyft level-5 Open Dataset: Following Autonomous Vehicles vs. Human-driven Vehicles

Authors: Guopeng Li, Yiru Jiao, Victor L. Knoop, Simeon C. Calvert, J. W. C. van Lint

Link: http://arxiv.org/abs/2305.18921v2

Abstract: Car-Following (CF), as a fundamental driving behaviour, has significant influences on the safety and efficiency of traffic flow. Investigating how human drivers react differently when following autonomous vs. human-driven vehicles (HV) is thus critical for mixed traffic flow. Research in this field can be expedited with trajectory datasets collected by Autonomous Vehicles (AVs). However, trajectories collected by AVs are noisy and not readily applicable for studying CF behaviour. This paper extracts and enhances two categories of CF data, HV-following-AV (H-A) and HV-following-HV (H-H), from the open Lyft level-5 dataset. First, CF pairs are selected based on specific rules. Next, the quality of raw data is assessed by anomaly analysis. Then, the raw CF data is corrected and enhanced via motion planning, Kalman filtering, and wavelet denoising. As a result, 29k+ H-A and 42k+ H-H car-following segments are obtained, with a total driving distance of 150k+ km. A diversity assessment shows that the processed data cover complete CF regimes for calibrating CF models. This open and ready-to-use dataset provides the opportunity to investigate the CF behaviours of following AVs vs. HVs from real-world data. It can further facilitate studies on exploring the impact of AVs on mixed urban traffic.
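
Of the three enhancement steps listed in this abstract, Kalman filtering is the easiest to show in isolation. The following is a minimal sketch of a generic constant-velocity Kalman filter applied to a noisy one-dimensional position trace. It is not the authors' pipeline, which also involves motion planning and wavelet denoising; the function name, the 0.1 s sampling interval, and the noise variances are assumptions chosen for illustration.

```python
from typing import List, Tuple


def kalman_smooth_positions(
    positions: List[float],
    dt: float = 0.1,          # assumed sampling interval in seconds
    accel_var: float = 1.0,   # assumed variance of unmodelled acceleration
    meas_var: float = 0.25,   # assumed variance of the position measurement
) -> List[Tuple[float, float]]:
    """Filter noisy 1-D positions with a constant-velocity Kalman filter.

    Returns one (position, velocity) estimate per input sample.
    """
    if not positions:
        return []
    # State [p, v] and its 2x2 covariance, stored as four scalars.
    p, v = positions[0], 0.0
    p00, p01, p10, p11 = meas_var, 0.0, 0.0, 100.0  # velocity starts very uncertain
    # Process noise of a white-acceleration model.
    q00 = accel_var * dt**4 / 4
    q01 = accel_var * dt**3 / 2
    q11 = accel_var * dt**2
    estimates = [(p, v)]
    for z in positions[1:]:
        # Predict: x = F x, P = F P F^T + Q, with F = [[1, dt], [0, 1]].
        p = p + dt * v
        n00 = p00 + dt * (p01 + p10) + dt * dt * p11 + q00
        n01 = p01 + dt * p11 + q01
        n10 = p10 + dt * p11 + q01
        n11 = p11 + q11
        # Update with the position measurement z (H = [1, 0]).
        s = n00 + meas_var
        k0, k1 = n00 / s, n10 / s
        innovation = z - p
        p = p + k0 * innovation
        v = v + k1 * innovation
        p00, p01 = (1 - k0) * n00, (1 - k0) * n01
        p10, p11 = n10 - k1 * n00, n11 - k1 * n01
        estimates.append((p, v))
    return estimates


# Usage: a vehicle moving at 10 m/s, sampled every 0.1 s.
track = kalman_smooth_positions([1.0 * i for i in range(50)])
print(track[-1])
```

The filter also yields a velocity estimate at every step, which matters for car-following analysis because speed and spacing are usually derived from the same noisy position signal.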

2023-05-29

Is Silent eHMI Enough? A Passenger-Centric Study on Effective eHMI for Autonomous Personal Mobility Vehicles in the Field

Authors: Hailong Liu, Yang Li, Zhe Zeng, Hao Cheng, Chen Peng, Takahiro Wada

Link: http://arxiv.org/abs/2305.17862v2

Abstract: Autonomous Personal Mobility Vehicle (APMV) is a miniaturized autonomous vehicle designed to provide short-distance mobility to everyone in pedestrian-rich environments. By the characteristic of the open design, passengers on the APMV are exposed to the communication between the eHMI deployed on APMVs and pedestrians. Therefore, to ensure an optimal passenger experience, eHMI designs for APMVs must consider the potential impact of APMV-pedestrian communications on passengers' subjective feelings. To this end, this study discussed three external human-machine interface (eHMI) designs, i.e., 1) graphical user interface (GUI)-based eHMI with text message (eHMI-T), 2) multimodal user interface (MUI)-based eHMI with neutral voice (eHMI-NV), and 3) MUI-based eHMI with affective voice (eHMI-AV), from the perspective of APMV passengers in the communication between APMV and pedestrians. In the riding field experiment (N=24), we found that eHMI-T may be less suitable for APMVs. This conclusion was drawn based on passengers' feedback, as they expressed an awkward feeling during the "silent time" when the eHMI-T provided information only to pedestrians but not to passengers. Additionally, these two MUI-based eHMIs with voice cues had their own advantages, i.e., eHMI-NV has an advantage in pragmatic quality, while eHMI-AV has an advantage in hedonic quality. The study also highlights the necessity of considering passengers' personalities when desig

2023-05-28

Investigating HMIs to Foster Communications between Conventional Vehicles and Autonomous Vehicles in Intersections

Authors: Lilit Avetisyan, Aditya Deshmukh, X. Jessie Yang, Feng Zhou

Link: http://arxiv.org/abs/2305.17769v1

Abstract: In mixed traffic environments that involve conventional vehicles (CVs) and autonomous vehicles (AVs), it is crucial for CV drivers to maintain an appropriate level of situation awareness to ensure safe and efficient interactions with AVs. This study investigates how AV communication through human-machine interfaces (HMIs) affects CV drivers' situation awareness (SA) in mixed traffic environments, especially at intersections. Initially, we designed eight HMI concepts through a human-centered design process. The two highest-rated concepts were selected for implementation as external and internal HMIs (eHMIs and iHMIs). Subsequently, we designed a within-subjects experiment with three conditions, a control condition without any communication HMI, and two treatment conditions utilizing eHMIs and iHMIs as communication means. We investigated the effects of these conditions on 50 participants acting as CV drivers in a virtual environment (VR) driving simulator. Self-reported assessments and eye-tracking measures were employed to evaluate participants' situation awareness, trust, acceptance, and mental workload. Results indicated that the iHMI condition resulted in superior SA among participants and improved trust in AV compared to the control and eHMI conditions. Additionally, iHMI led to a comparatively lower increase in mental workload compared to the other two conditions. Our study contributes to the development of effective AV-CV communications and has the potential to inform the design of future AV systems.

2023-05-25

Multitasking while Driving: How Drivers Self-Regulate their Interaction with In-Vehicle Touchscreens in Automated Driving

Authors: Patrick Ebel, Christoph Lingenfelder, Andreas Vogelsang

Link: http://arxiv.org/abs/2305.16042v1

Abstract: Driver assistance systems are designed to increase comfort and safety by automating parts of the driving task. At the same time, modern in-vehicle information systems with large touchscreens provide the driver with numerous options for entertainment, information, or communication, and are a potential source of distraction. However, little is known about how driving automation affects how drivers interact with the center stack touchscreen, i.e., how drivers self-regulate their behavior in response to different levels of driving automation. To investigate this, we apply multilevel models to a real-world driving dataset consisting of 31,378 sequences. Our results show significant differences in drivers' interaction and glance behavior in response to different levels of driving automation, vehicle speed, and road curvature. During automated driving, drivers perform more interactions per touchscreen sequence and increase the time spent looking at the center stack touchscreen. Specifically, at higher levels of driving automation (level 2), the mean glance duration toward the center stack touchscreen increases by 36% and the mean number of interactions per sequence increases by 17% compared to manual driving. Furthermore, partially automated driving has a strong impact on the use of more complex UI elements (e.g., maps) and touch gestures (e.g., multitouch). We also show that the effect of driving automation on drivers' self-regulation is greater than that of vehicle speed and road curvature. The derived knowledge can inform the design and evaluation of touch-based infotainment systems and the development of context-aware driver monitoring systems.

2023-05-18

A systematic review of safety-critical scenarios between automated vehicles and vulnerable road users

Authors: Aditya Deshmukh, Zifei Wang, Aaron Gunn, Huizhong Guo, Rini Sherony, Fred Feng, Brian Lin, Shan Bao, Feng Zhou

Link: http://arxiv.org/abs/2305.11291v1

Abstract: Automated vehicles (AVs) have great potential to reduce crashes on the road. However, it is still complicated to eliminate all the possible accidents, especially those with vulnerable road users (VRUs), who are at greater risk than vehicle occupants in traffic accidents. Thus, in this paper, we conducted a systematic review of safety-critical scenarios between AVs and VRUs. We identified 39 papers in the literature and typical safety-critical scenarios between AVs and VRUs. They were further divided into three categories, including human factors, environmental factors, and vehicle factors. We then discussed the development, challenges, and possible solutions for each category. In order to further improve the safety of VRUs when interacting with AVs, multiple stakeholders should work together to 1) improve AI and sensor technologies and vehicle automation, 2) redesign the current transportation infrastructure, 3) design effective communication technologies and interfaces between vehicles and between vehicles and VRUs, and 4) design effective simulation and testing methods to support and evaluate both infrastructure and technologies.

2023-04-21

Remote Monitoring and Teleoperation of Autonomous Vehicles - Is Virtual Reality an Option?

Authors: Snehanjali Kalamkar, Verena Biener, Fabian Beck, Jens Grubert

Link: http://arxiv.org/abs/2304.11228v2

Abstract: While the promise of autonomous vehicles has led to significant scientific and industrial progress, fully automated cars conforming to SAE level 5 will likely not see mass adoption anytime soon. Instead, in many applications, human supervision, such as remote monitoring and teleoperation, will be required for the foreseeable future. While Virtual Reality (VR) has been proposed as one potential interface for teleoperation, its benefits and drawbacks over physical monitoring and teleoperation solutions have not been thoroughly investigated. To this end, we contribute three user studies, comparing and quantifying the performance of and subjective feedback for a VR-based system with an existing monitoring and teleoperation system, which is in industrial use today. Through these three user studies, we contribute to a better understanding of future virtual monitoring and teleoperation solutions for autonomous vehicles. The results of our first user study (n=16) indicate that a VR interface replicating the physical interface does not outperform the physical interface. It also quantifies the negative effects that combined monitoring and teleoperating tasks have on users irrespective of the interface being used. The results of the second user study (n=24) indicate that the perceptual and ergonomic issues caused by VR outweigh its benefits, like better concentration through isolation. The third follow-up user study (n=24) specifically targeted the perceptual and ergonomic issues of VR; the subjective feedback of this study indicates that newer-generation VR headsets have the potential to catch up with the current physical displays.

2023-04-13

Towards Prototyping Driverless Vehicle Behaviors, City Design, and Policies Simultaneously

Authors: Hauke Sandhaus, Wendy Ju, Qian Yang

Link: http://arxiv.org/abs/2304.06639v1

Abstract: Autonomous Vehicles (AVs) can potentially improve urban living by reducing accidents, increasing transportation accessibility and equity, and decreasing emissions. Realizing these promises requires the innovations of AV driving behaviors, city plans and infrastructure, and traffic and transportation policies to join forces. However, the complex interdependencies among AV, city, and policy design issues can hinder their innovation. We argue the path towards better AV cities is not a process of matching city designs and policies with AVs' technological innovations, but a process of iterative prototyping of all three simultaneously: Innovations can happen step-wise as the knot of AV, city, and policy design loosens and tightens, unwinds and reties. In this paper, we ask: How can innovators innovate AVs, city environments, and policies simultaneously and productively toward better AV cities? The paper has two parts. First, we map out the interconnections among the many AV, city, and policy design decisions, based on a literature review spanning HCI/HRI, transportation science, urban studies, law and policy, operations research, economy, and philosophy. This map can help innovators identify design constraints and opportunities across the traditional AV/city/policy design disciplinary bounds. Second, we review the respective methods for AV, city, and policy design, and identify key barriers in combining them: (1) Organizational barriers to AV-city-policy design collaboration, (2) computational barriers to multi-granularity AV-city-policy simulation, and (3) different assumptions and goals in joint AV-city-policy optimization. We discuss two broad approaches that can potentially address these challenges, namely, "low-fidelity integrative City-AV-Policy Simulation (iCAPS)" and "participatory design optimization".

2023-04-09

Multimodal Brain-Computer Interface for In-Vehicle Driver Cognitive Load Measurement: Dataset and Baselines

Authors: Prithila Angkan, Behnam Behinaein, Zunayed Mahmud, Anubhav Bhatti, Dirk Rodenburg, Paul Hungler, Ali Etemad

Link: http://arxiv.org/abs/2304.04273v2

Abstract: Through this paper, we introduce a novel driver cognitive load assessment dataset, CL-Drive, which contains Electroencephalogram (EEG) signals along with other physiological signals such as Electrocardiography (ECG) and Electrodermal Activity (EDA) as well as eye tracking data. The data was collected from 21 subjects while driving in an immersive vehicle simulator, in various driving conditions, to induce different levels of cognitive load in the subjects. The tasks consisted of 9 complexity levels for 3 minutes each. Each driver reported their subjective cognitive load every 10 seconds throughout the experiment. The dataset contains the subjective cognitive load recorded as ground truth. In this paper, we also provide benchmark classification results for different machine learning and deep learning models for both binary and ternary label distributions. We followed 2 evaluation criteria namely 10-fold and leave-one-subject-out (LOSO). We have trained our models on both hand-crafted features as well as on raw data.
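
The leave-one-subject-out (LOSO) criterion mentioned in this abstract holds out all samples of one subject per fold, so a model is always tested on a driver it has never seen. The following is a minimal sketch of how such folds could be generated from per-sample subject labels. The function name and the toy subject IDs are hypothetical; this is not the CL-Drive authors' code.

```python
from typing import Dict, Iterator, List, Sequence, Tuple


def leave_one_subject_out(
    subject_ids: Sequence[str],
) -> Iterator[Tuple[str, List[int], List[int]]]:
    """Yield (held_out_subject, train_indices, test_indices), one fold per subject."""
    # Group sample indices by subject, preserving first-seen subject order.
    by_subject: Dict[str, List[int]] = {}
    for idx, subject in enumerate(subject_ids):
        by_subject.setdefault(subject, []).append(idx)
    for held_out, test_idx in by_subject.items():
        # Every sample from any other subject goes into the training set.
        train_idx = [i for i, s in enumerate(subject_ids) if s != held_out]
        yield held_out, train_idx, test_idx


# Usage with toy labels: five samples from three hypothetical subjects.
for subject, train_idx, test_idx in leave_one_subject_out(["s1", "s1", "s2", "s3", "s2"]):
    print(subject, train_idx, test_idx)
```

Unlike a plain 10-fold split, which can place samples from the same driver in both training and test sets, this split measures generalization to unseen subjects, so LOSO scores are typically the more conservative of the two.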

2023-03-31

e-Uber: A Crowdsourcing Platform for Electric Vehicle-based Ride- and Energy-sharing

Authors: Ashutosh Timilsina, Simone Silvestri

Link: http://arxiv.org/abs/2304.04753v1

Abstract: The sharing-economy-based business model has recently seen success in the transportation and accommodation sectors with companies like Uber and Airbnb. There is growing interest in applying this model to energy systems, with modalities like peer-to-peer (P2P) Energy Trading, Electric Vehicles (EV)-based Vehicle-to-Grid (V2G), Vehicle-to-Home (V2H), Vehicle-to-Vehicle (V2V), and Battery Swapping Technology (BST). In this work, we exploit the increasing diffusion of EVs to realize a crowdsourcing platform called e-Uber that jointly enables ride-sharing and energy-sharing through V2G and BST. e-Uber exploits spatial crowdsourcing, reinforcement learning, and reverse auction theory. Specifically, the platform uses reinforcement learning to understand the drivers' preferences towards different ride-sharing and energy-sharing tasks. Based on these preferences, a personalized list is recommended to each driver through CMAB-based Algorithm for task Recommendation System (CARS). Drivers bid on their preferred tasks in their list in a reverse auction fashion. Then e-Uber solves the task assignment optimization problem that minimizes cost and guarantees V2G energy requirement. We prove that this problem is NP-hard and introduce a bipartite matching-inspired heuristic, Bipartite Matching-based Winner selection (BMW), that has polynomial time complexity. Results from experiments using real data from NYC taxi trips and energy consumption show that e-Uber performs close to the optimum and finds better solutions compared to a state-of-the-art approach

2023-03-17

Trust in Shared Automated Vehicles: Study on Two Mobility Platforms

Authors: Shashank Mehrotra, Jacob G Hunter, Matthew Konishi, Kumar Akash, Zhaobo Zheng, Teruhisa Misu, Anil Kumar, Tahira Reid, Neera Jain

Link: http://arxiv.org/abs/2303.09711v1

Abstract: The ever-increasing adoption of shared transportation modalities across the United States has the potential to fundamentally change the preferences and usage of different mobilities. It also raises several challenges with respect to the design and development of automated mobilities that can enable a large population to take advantage of this emergent technology. One such challenge is the lack of understanding of how trust in one automated mobility may impact trust in another. Without this understanding, it is difficult for researchers to determine whether future mobility solutions will have acceptance within different population groups. This study focuses on identifying the differences in trust across different mobility and how trust evolves across their use for participants who preferred an aggressive driving style. A dual mobility simulator study was designed in which 48 participants experienced two different automated mobilities (car and sidewalk). The results found that participants showed increasing levels of trust when they transitioned from the car to the sidewalk mobility. In comparison, participants showed decreasing levels of trust when they transitioned from the sidewalk to the car mobility. The findings from the study help inform and identify how people can develop trust in future mobility platforms and could inform the design of interventions that may help improve the trust and acceptance of future mobility.

2023-03-16

Developing IncidentUI -- A Ride Comfort and Disengagement Evaluation Application for Autonomous Vehicles

Authors: Manas Mehta, Nugzar Chkhaidze, Yizhen Wang

Link: http://arxiv.org/abs/2303.13545v1

Abstract: This report details the design, development, and implementation of IncidentUI, an Android tablet application designed to measure user-experienced ride comfort and record disengagement data for autonomous vehicles (AV) during test drives. The goal of our project was to develop an Android application to run on a peripheral tablet and communicate with the Drive Pegasus AGX, the AI Computing Platform for Nvidia's AV Level 2 Autonomy Solution Architecture [1], to detect AV disengagements and report ride comfort. We designed and developed an Android XML-based intuitive user interface for IncidentUI. The development of IncidentUI required a redesign of the system architecture by redeveloping the system communications protocol in Java and implementing the Protocol Buffers (Protobufs) in Java using the existing system Protobuf definitions. The final iteration of IncidentUI yielded the desired functionality while testing on an AV test drive. We also received positive feedback from Nvidia's AV Platform Team during our final IncidentUI demonstration.

2023-03-15

Pre-instruction for Pedestrians Interacting Autonomous Vehicles with an eHMI: Effects on Their Psychology and Walking Behavior

Authors: Hailong Liu, Takatsugu Hirayama

Link: http://arxiv.org/abs/2303.08380v1

Abstract: eHMIs are a novel and explicit communication method for pedestrian-AV negotiation in interactions, such as in encounter scenarios. However, pedestrians with limited experience in negotiating with AVs could lack a comprehensive and correct understanding of the meaning of the driving intentions conveyed by AVs through eHMI, particularly in the current contexts where AV and eHMI are not yet mainstream. Consequently, pedestrians who misunderstand the driving intention of the AVs during the encounter may feel threatened and perform unpredictable behaviors. To solve this issue, this study proposes using the pre-instruction on the rationale of eHMI to help pedestrians correctly understand driving intentions and predict AV behavior. Consequently, this can improve their subjective feelings (i.e., sense of danger, trust in AV, and sense of relief) and decision-making. In addition, this study suggests that the eHMI could better guide pedestrian behavior through the pre-instruction. The results of interaction experiments in the road crossing scene show that participants found it more difficult to recognize the situation when they encountered an AV without eHMI than when they encountered a manual driving vehicle (MV); in addition, participants' subjective feelings and hesitation in decision-making worsened significantly. After the pre-instruction, the participants could understand the driving intention of an AV with eHMI and predict driving behavior more easily. Furthermore, the participants' subjective feelings and hesitation to make decisions improved, reaching the same criteria used for MV. Moreover, this study found that the information guidance of using eHMI influenced the participants' walking speed, resulting in a small variation over the time horizon via multiple trials when they fully understood the principle of eHMI through the pre-instruction.

2023-03-09

Work with AI and Work for AI: Autonomous Vehicle Safety Drivers' Lived Experiences

Authors: Mengdi Chu, Keyu Zong, Xin Shu, Jiangtao Gong, Zicong Lu, Kaimin Guo, Xinyi Dai, Guyue Zhou

Link: http://arxiv.org/abs/2303.04986v1

Abstract: The development of Autonomous Vehicle (AV) has created a novel job, the safety driver, recruited from experienced drivers to supervise and operate AV in numerous driving missions. Safety drivers usually work with non-perfect AV in high-risk real-world traffic environments for road testing tasks. However, this group of workers is under-explored in the HCI community. To fill this gap, we conducted semi-structured interviews with 26 safety drivers. Our results present how safety drivers cope with defective algorithms and shape and calibrate their perceptions while working with AV. We found that, as front-line workers, safety drivers are forced to take risks accumulated from the AV industry upstream and are also confronting restricted self-development in working for AV development. We contribute the first empirical evidence of the lived experience of safety drivers, the first passengers in the development of AV, and also the grassroots workers for AV, which can shed light on future human-AI interaction research.

2023-02-17

Subjective Vertical Conflict Model with Visual Vertical: Predicting Motion Sickness on Autonomous Personal Mobility Vehicles

Authors: Hailong Liu, Shota Inoue, Takahiro Wada

Link: http://arxiv.org/abs/2302.08642v1

Abstract: Passengers of level 3-5 autonomous personal mobility vehicles (APMV) can perform non-driving tasks, such as reading books and smartphones, while driving. It has been pointed out that such activities may increase motion sickness, especially when frequently avoiding pedestrians or obstacles in shared spaces. Many studies have been conducted to build countermeasures, of which various computational motion sickness models have been developed. Among them, models based on subjective vertical conflict (SVC) theory, which describes vertical changes in direction sensed by human sensory organs vs. those expected by the central nervous system, have been actively developed. However, no current computational model can integrate visual vertical information with vestibular sensations. We proposed a 6 DoF SVC-VV model which added a visually perceived vertical block into a conventional 6 DoF SVC model to predict visual vertical directions from image data simulating the visual input of a human. In a driving experiment, 27 participants experienced an APMV with two visual conditions: looking ahead (LAD) and working with a tablet device (WAD). By simulating the frequent pedestrian avoidance scenarios of the APMV in the experiment, we verified that passengers got motion sickness while riding the APMV, and that the symptoms were more severe when working with the tablet device. In addition, the results of the experiment demonstrated that the proposed 6 DoF SVC-VV model could describe the increased motion sickness experienced when the visual vertical and gravitational acceleration directions were different.

2023-02-16

Drive Right: Promoting Autonomous Vehicle Education Through an Integrated Simulation Platform

Authors: Zhijie Qiao, Helen Loeb, Venkata Gurrla, Matt Lebermann, Johannes Betz, Rahul Mangharam

Link: http://arxiv.org/abs/2302.08613v1

Abstract: Autonomous vehicles (AVs) are being rapidly introduced into our lives. However, public misunderstanding and mistrust have become prominent issues hindering the acceptance of these driverless technologies. The primary objective of this study is to evaluate the effectiveness of a driving simulator to help the public gain an understanding of AVs and build trust in them. To achieve this aim, we built an integrated simulation platform, designed various driving scenarios, and recruited 28 participants for the experiment. The study results indicate that a driving simulator effectively decreases the participants' perceived risk of AVs and increases perceived usefulness. The proposed methodologies and findings of this study can be further explored by auto manufacturers and policymakers to provide user-friendly AV design.

2023-01-21

Leveraging driver vehicle and environment interaction: Machine learning using driver monitoring cameras to detect drunk driving

Authors: Kevin Koch, Martin Maritsch, Eva van Weenen, Stefan Feuerriegel, Matthias Pfäffli, Elgar Fleisch, Wolfgang Weinmann, Felix Wortmann

Link: http://arxiv.org/abs/2301.08978v2

Abstract: Excessive alcohol consumption causes disability and death. Digital interventions are promising means to promote behavioral change and thus prevent alcohol-related harm, especially in critical moments such as driving. This requires real-time information on a person's blood alcohol concentration (BAC). Here, we develop an in-vehicle machine learning system to predict critical BAC levels. Our system leverages driver monitoring cameras mandated in numerous countries worldwide. We evaluate our system with n=30 participants in an interventional simulator study. Our system reliably detects driving under any alcohol influence (area under the receiver operating characteristic curve [AUROC] 0.88) and driving above the WHO recommended limit of 0.05g/dL BAC (AUROC 0.79). Model inspection reveals reliance on pathophysiological effects associated with alcohol consumption. To our knowledge, we are the first to rigorously evaluate the use of driver monitoring cameras for detecting drunk driving. Our results highlight the potential of driver monitoring cameras and enable next-generation drunk driver interaction preventing alcohol-related harm.

2023-01-05

On the Forces of Driver Distraction: Explainable Predictions for the Visual Demand of In-Vehicle Touchscreen Interactions

Authors: Patrick Ebel, Christoph Lingenfelder, Andreas Vogelsang

Link: http://arxiv.org/abs/2301.02065v1

Abstract: With modern infotainment systems, drivers are increasingly tempted to engage in secondary tasks while driving. Since distracted driving is already one of the main causes of fatal accidents, in-vehicle touchscreen Human-Machine Interfaces (HMIs) must be as little distracting as possible. To ensure that these systems are safe to use, they undergo elaborate and expensive empirical testing, requiring fully functional prototypes. Thus, early-stage methods informing designers about the implication their design may have on driver distraction are of great value. This paper presents a machine learning method that, based on anticipated usage scenarios, predicts the visual demand of in-vehicle touchscreen interactions and provides local and global explanations of the factors influencing drivers' visual attention allocation. The approach is based on large-scale natural driving data continuously collected from production line vehicles and employs the SHapley Additive exPlanation (SHAP) method to provide explanations leveraging informed design decisions. Our approach is more accurate than related work and identifies interactions during which long glances occur with 68 % accuracy and predicts the total glance duration with a mean error of 2.4 s. Our explanations replicate the results of various recent studies and provide fast and easily accessible insights into the effect of UI elements, driving automation, and vehicle speed on driver distraction. The system can not only help designers to evaluate current designs but also help them to better anticipate and understand the implications their design decisions might have on future designs.

2022-12-02

Evaluation of Arterial Signal Coordination with Commercial Connected Vehicle Data: Empirical Traffic Flow Visualization and Performance Measurement

Authors: Shoaib Mahmud, Christopher M. Day

Link: http://arxiv.org/abs/2212.02315v2

Abstract: Emerging connected vehicle (CV) data sets have recently become commercially available. This paper presents several tools using CV data to evaluate traffic progression quality along a signalized corridor. These include both performance measures for high-level analysis as well as visualizations to examine details of the coordinated operation. With the use of CV data, it is possible to assess not only the movement of traffic on the corridor but also to consider its origin-destination (OD) path through the corridor. Results for the real-world operation of an eight-intersection signalized arterial are presented. A series of high-level performance measures are used to evaluate overall performance by time of day, with differing results by metric. Next, the details of the operation are examined with the use of two visualization tools: a cyclic time-space diagram (TSD) and an empirical platoon progression diagram (PPD). Comparing flow visualizations developed with different included OD paths reveals several features. In addition, speed heat maps are generated, providing both speed performance along the corridor. The proposed visualization tools portray the corridor performance holistically instead of combining individual signal performance metrics. The techniques exhibited in this study are compelling for identifying locations where engineering solutions are required. The recent progress in infrastructure-free sensing technology has significantly increased the scope of CV data-based traffic management systems. The study demonstrates the utility of CV trajectory data for obtaining high-level details of the corridor performance and drilling down into the minute specifics.

2022-11-22

Predictive Display with Perspective Projection of Surroundings in Vehicle Teleoperation to Account Time-delays

Authors: Jai Prakash, Michele Vignati, Daniele Vignarca, Edoardo Sabbioni, Federico Cheli

Link: http://arxiv.org/abs/2211.11918v1

Abstract: Teleoperation brings a human operator's sophisticated perceptual and cognitive skills into an over-the-network control loop. It gives hope of addressing some challenges related to vehicular autonomy based on artificial intelligence by providing a backup plan. Variable network time delays in data transmission are the major problem in teleoperating a vehicle. On a 4G network, the variability of these delays is high. Due to this, both video streaming and driving commands encounter variable time delays. This paper presents an approach that provides the human operator with a forecast video stream which replicates the future perspective of the vehicle's field of view, accounting for the delay present in the network. For the image transformation, a perspective projection technique is combined with the correction given by a Smith predictor in the control loop. This image transformation accounts for the current time delay and tries to address both issues: time delays as well as their variability. For the sake of the experiment, only the frontward field of view is forecast. Performance is evaluated by performing online vehicle teleoperation on street edge case maneuvers and later comparing the path deviation with and without perspective projection.

2022-11-15

AutoTherm: A Dataset and Benchmark for Thermal Comfort Estimation Indoors and in Vehicles

Authors: Mark Colley, Sebastian Hartwig, Albin Zeqiri, Timo Ropinski, Enrico Rukzio

Link: http://arxiv.org/abs/2211.08257v4

Abstract: Thermal comfort inside buildings is a well-studied field where human judgment for thermal comfort is collected and may be used for automatic thermal comfort estimation. However, indoor scenarios are rather static in terms of thermal state changes and, thus, cannot be applied to dynamic conditions, e.g., inside a vehicle. In this work, we present our findings of a gap between building and in-vehicle scenarios regarding thermal comfort estimation. We provide evidence by comparing deep neural classifiers for thermal comfort estimation for indoor and in-vehicle conditions. Further, we introduce a temporal dataset for indoor predictions incorporating 31 input signals and self-labeled user ratings by 18 subjects in a self-built climatic chamber. For in-vehicle scenarios, we acquired a second dataset featuring human judgments from 20 subjects in a BMW 3 Series. Our experimental results indicate superior performance for estimations from time series data over single vector input. Leveraging modern machine learning architectures enables us to recognize human thermal comfort states and estimate future states automatically. We provide details on training a recurrent network-based classifier and perform an initial performance benchmark of the proposed dataset. Ultimately, we compare our collected dataset to publicly available thermal comfort datasets.

2022-10-27

Vetaverse: A Survey on the Intersection of Metaverse, Vehicles, and Transportation Systems

Authors: Pengyuan Zhou, Jinjing Zhu, Yiting Wang, Yunfan Lu, Zixiang Wei, Haolin Shi, Yuchen Ding, Yu Gao, Qinglong Huang, Yan Shi, Ahmad Alhilal, Lik-Hang Lee, Tristan Braud, Pan Hui, Lin Wang

Link: http://arxiv.org/abs/2210.15109v3

Abstract: Since 2021, the term "Metaverse" has been the most popular one, garnering a lot of interest. Because of its contained environment and built-in computing and networking capabilities, a modern car makes an intriguing location to host its own little metaverse. Additionally, the travellers don't have much to do to pass the time while traveling, making them ideal customers for immersive services. Vetaverse (Vehicular-Metaverse), which we define as the future continuum between vehicular industries and Metaverse, is envisioned as a blended immersive realm that scales up to cities and countries, as digital twins of the intelligent Transportation Systems, referred to as "TS-Metaverse", as well as customized XR services inside each Individual Vehicle, referred to as "IV-Metaverse". The two subcategories serve fundamentally different purposes, namely long-term interconnection, maintenance, monitoring, and management on scale for large transportation systems (TS), and personalized, private, and immersive infotainment services (IV). By outlining the framework of Vetaverse and examining important enabler technologies, we reveal this impending trend. Additionally, we examine unresolved issues and potential routes for future study while highlighting some intriguing Vetaverse services.

2022-10-22

A Design Space for Human Sensor and Actuator Focused In-Vehicle Interaction Based on a Systematic Literature Review

Authors: Pascal Jansen, Mark Colley, Enrico Rukzio

Link: http://arxiv.org/abs/2210.12493v1

Abstract: Automotive user interfaces constantly change due to increasing automation, novel features, additional applications, and user demands. While in-vehicle interaction can utilize numerous promising modalities, no existing overview includes an extensive set of human sensors and actuators and interaction locations throughout the vehicle interior. We conducted a systematic literature review of 327 publications leading to a design space for in-vehicle interaction that outlines existing work and gaps regarding input and output modalities, locations, and multimodal interaction. To investigate user acceptance of possible modalities and locations inferred from existing work and gaps unveiled in our design space, we conducted an online study (N=48). The study revealed users' general acceptance of novel modalities (e.g., brain or thermal activity) and interaction with locations other than the front (e.g., seat or table). Our work helps practitioners evaluate key design decisions, exploit trends, and explore new areas in the domain of in-vehicle interaction.

2022-10-20

In-Vehicle Interface Adaptation to Environment-Induced Cognitive Workload

Authors: Elena Meiser, Alexandra Alles, Samuel Selter, Marco Molz, Amr Gomaa, Guillermo Reyes

Link: http://arxiv.org/abs/2210.11271v1

Abstract: Many car accidents are caused by human distractions, including cognitive distractions. In-vehicle human-machine interfaces (HMIs) have evolved throughout the years, providing more and more functions. Interaction with the HMIs can, however, also lead to further distractions and, as a consequence, accidents. To tackle this problem, we propose using adaptive HMIs that change according to the mental workload of the driver. In this work, we present the current status as well as preliminary results of a user study using naturalistic secondary tasks while driving (i.e., the primary task) that attempt to understand the effects of one such interface.

2022-09-22

Designing an Automated Vehicle: Strategies for Handling Tasks of a Previously Required Accompanying Person

Authors: Tobias Schräder, Robert Graubohm, Nayel Fabian Salem, Markus Maurer

Link: http://arxiv.org/abs/2209.11083v1

Abstract: When using a conventional passenger car, several groups of people are reliant on the assistance of an accompanying person, for example when getting in and out of the car. For the independent use of an automatically driving vehicle by those groups, the absence of a previously required accompanying person needs to be compensated. During the design process of an autonomous family vehicle, we found that a low-barrier vehicle design can only partly contribute to the compensation for the absence of a required human companion. In this paper, we present four strategies we identified for handling the tasks of a previously required accompanying individual. The presented top-down approach supports developers in identifying unresolved problems, in finding, structuring, and selecting solutions as well as in uncovering upcoming problems at an early stage in the development of novel concepts for driverless vehicles. As an example, we consider the hypothetical exit of persons in need of assistance. The application of the four strategies in this example demonstrates the far-reaching impact of consistently considering users in need of support in the development of automated vehicles.

Affective Role of the Future Autonomous Vehicle Interior

Authors: Taesu Kim, Gyunpyo Lee, Jiwoo Hong, Hyeon-Jeong Suk

Link: http://arxiv.org/abs/2209.10764v1

Abstract: Recent advancements in autonomous technology allow for new opportunities in vehicle interior design. Such a shift in in-vehicle activity suggests vehicle interior spaces should provide an adequate manner by considering users' affective desires. Therefore, this study aims to investigate the affective role of future vehicle interiors. Thirty-one participants in ten focus groups were interviewed about challenges they face regarding their current vehicle interior and expectations they have for future vehicles. Results from content analyses revealed the affective role of future vehicle interiors. Advanced exclusiveness and advanced convenience were two primary aspects identified. The identified affective roles of each aspect are a total of eight visceral levels, four visceral levels each, including focused, stimulating, amused, pleasant, safe, comfortable, accommodated, and organized. We expect the results from this study to lead to the development of affective vehicle interiors by providing the fundamental knowledge for developing conceptual direction and evaluating its impact on user experiences.

Affective responses to chromatic ambient light in a vehicle

Authors: Taesu Kim, Kyungah Choi, Hyeon-Jeong Suk

Link: http://arxiv.org/abs/2209.10761v1

Abstract: This study investigates the emotional responses to the color of vehicle interior lighting using self-assessment and electroencephalography (EEG). The study was divided into two sessions: the first session investigated the potential of ambient lighting colors, and the second session was used to develop in-vehicle lighting color guidelines. Every session included thirty subjects. In the first session, four lighting colors were assessed using seventeen adjectives. As a result, Preference, Softness, Brightness, and Uniqueness were found to be the four factors that best characterize the atmospheric properties of interior lighting in vehicles. Ambient illumination, according to EEG data, increased people's arousal and lowered their alpha waves. The following session investigated a wider spectrum of colors using four factors extracted from the previous session. As a result, bluish and purplish lighting colors had the highest preference and uniqueness among ten lighting colors. Green received an intermediate preference and a high uniqueness score. With its great brightness and softness, Neutral White also achieved an intermediate preference rating. Despite receiving a low preference rating, warm colors were considered to be soft. Red was the least preferred color, but its uniqueness and roughness were highly rated. This study is expected to provide a basic theory on emotional lighting guidelines in the vehicle context, providing manufacturers with objective rationale.

2022-09-21

The Interaction Gap: A Step Toward Understanding Trust in Autonomous Vehicles Between Encounters

Authors: Jacob G. Hunter, Matthew Konishi, Neera Jain, Kumar Akash, Xingwei Wu, Teruhisa Misu, Tahira Reid

Link: http://arxiv.org/abs/2209.10640v1

Abstract: Shared autonomous vehicles (SAVs) will be introduced in greater numbers over the coming decade. Due to rapid advances in shared mobility and the slower development of fully autonomous vehicles (AVs), SAVs will likely be deployed before privately-owned AVs. Moreover, existing shared mobility services are transitioning their vehicle fleets toward those with increasingly higher levels of driving automation. Consequently, people who use shared vehicles on an "as needed" basis will have infrequent interactions with automated driving, thereby experiencing interaction gaps. Using human trust data of 25 participants, we show that interaction gaps can affect human trust in automated driving. Participants engaged in a simulator study consisting of two interactions separated by a one-week interaction gap. A moderate, inverse correlation was found between the change in trust during the initial interaction and the interaction gap, suggesting people "forget" some of their gained trust or distrust in automation during an interaction gap.

Identification of Adaptive Driving Style Preference through Implicit Inputs in SAE L2 Vehicles

Authors: Zhaobo K. Zheng, Kumar Akash, Teruhisa Misu, Vidya Krishmoorthy, Miaomiao Dong, Yuni Lee, Gaojian Huang

Link: http://arxiv.org/abs/2209.10536v1

Abstract: A key factor to optimal acceptance and comfort of automated vehicle features is the driving style. Mismatches between the automated and the driver preferred driving styles can make users take over more frequently or even disable the automation features. This work proposes identification of user driving style preference with multimodal signals, so the vehicle could match user preference in a continuous and automatic way. We conducted a driving simulator study with 36 participants and collected extensive multimodal data including behavioral, physiological, and situational data. This includes eye gaze, steering grip force, driving maneuvers, brake and throttle pedal inputs as well as foot distance from pedals, pupil diameter, galvanic skin response, heart rate, and situational drive context. Then, we built machine learning models to identify preferred driving styles, and confirmed that all modalities are important for the identification of user preference. This work paves the road for implicit adaptive driving styles on automated vehicles.

2022-09-12

Driving Safety Prediction and Safe Route Mapping Using In-vehicle and Roadside Data

Authors: Yufei Huang, Mohsen Jafari, Peter Jin

Link: http://arxiv.org/abs/2209.05604v1

Abstract: Risk assessment of roadways is commonly practiced based on historical crash data. Information on driver behaviors and real-time traffic situations is sometimes missing. In this paper, the Safe Route Mapping (SRM) model, a methodology for developing dynamic risk heat maps of roadways, is extended to consider driver behaviors when making predictions. An Android App is designed to gather drivers' information and upload it to a server. On the server, facial recognition extracts drivers' data, such as facial landmarks, gaze directions, and emotions. The driver's drowsiness and distraction are detected, and driving performance is evaluated. Meanwhile, dynamic traffic information is captured by a roadside camera and uploaded to the same server. A longitudinal-scanline-based arterial traffic video analytics is applied to recognize vehicles from the video to build speed and trajectory profiles. Based on these data, a LightGBM model is introduced to predict conflict indices for drivers in the next one or two seconds. Then, multiple data sources, including historical crash counts and predicted traffic conflict indicators, are combined using a Fuzzy logic model to calculate risk scores for road segments. The proposed SRM model is illustrated using data collected from an actual traffic intersection and a driving simulation platform. The prediction results show that the model is accurate, and the added driver behavior features will improve the model's performance. Finally, risk heat maps are generated for visualization purposes. The authorities can use the dynamic heat map to designate safe corridors and dispatch law enforcement and drivers for early warning and trip planning.

2022-08-30

Compensating for the Absence of a Required Accompanying Person: A Draft of a Functional System Architecture for an Automated Vehicle

Authors: Tobias Schräder, Torben Stolte, Inga Jatzkowski, Robert Graubohm, Marcus Nolte, Markus Maurer

Link: http://arxiv.org/abs/2208.14316v1

Abstract: A major challenge in the development of a fully automated vehicle is to enable a large variety of users to use the vehicle independently and safely. Particular demands arise from user groups who rely on human assistance when using conventional cars. For the independent use of a vehicle by such groups, the vehicle must compensate for the absence of an accompanying person, whose actions and decisions ensure the accompanied person's safety even in unknown situations. The resulting requirements cannot be fulfilled only by the geometric design of the vehicle and the nature of its control elements. Special user needs must be taken into account in the entire automation of the vehicle. In this paper, we describe requirements for compensating for the absence of an accompanying person and show how required functions can be located in a hierarchical functional system architecture of an automated vehicle. In addition, we outline the relevance of the vehicle's operational design domain in this context and present a use case for the described functionalities.

2022-08-24

Collaborative Remote Control of Unmanned Ground Vehicles in Virtual Reality

Authors: Ziming Li, Yiming Luo, Jialin Wang, Yushan Pan, Lingyun Yu, Hai-Ning Liang

Link: http://arxiv.org/abs/2208.11294v1

Abstract: Virtual reality (VR) technology is commonly used in entertainment applications; however, it has also been deployed in practical applications in more serious aspects of our lives, such as safety. To support people working in dangerous industries, VR can ensure operators manipulate standardized tasks and work collaboratively to deal with potential risks. Surprisingly, little research has focused on how people can collaboratively work in VR environments. Few studies have paid attention to the cognitive load of operators in their collaborative tasks. Once task demands become complex, many researchers focus on optimizing the design of the interaction interfaces to reduce the cognitive load on the operator. That approach could be of merit; however, it can actually subject operators to a more significant cognitive load and potentially more errors and a failure of collaboration. In this paper, we propose a new collaborative VR system to support two teleoperators working in the VR environment to remotely control an uncrewed ground vehicle. We use a comparative experiment to evaluate the collaborative VR systems, focusing on the time spent on tasks and the total number of operations. Our results show that the total number of processes and the cognitive load during operations were significantly lower in the two-person group than in the single-person group. Our study sheds light on designing VR systems to support collaborative work with respect to the flow of work of teleoperators instead of simply optimizing the design outcomes.

2022-08-17

In-vehicle alertness monitoring for older adults

Authors: Heng Yao, Sanaz Motamedi, Wayne C. W. Giang, Alexandra Kondyli, Eakta Jain

Link: http://arxiv.org/abs/2208.08091v1

Abstract: Alertness monitoring in the context of driving improves safety and saves lives. Computer vision based alertness monitoring is an active area of research. However, the algorithms and datasets that exist for alertness monitoring are primarily aimed at younger adults (18-50 years old). We present a system for in-vehicle alertness monitoring for older adults. Through a design study, we ascertained the variables and parameters that are suitable for older adults traveling independently in Level 5 vehicles. We implemented a prototype traveler monitoring system and evaluated the alertness detection algorithm on ten older adults (70 years and older). We report on the system design and implementation at a level of detail that is suitable for the beginning researcher or practitioner. Our study suggests that dataset development is the foremost challenge for developing alertness monitoring systems targeted at older adults. This study is the first of its kind for a hitherto under-studied population and has implications for future work on algorithm development and system design through participatory methods.

2022-08-10

What's on your mind? A Mental and Perceptual Load Estimation Framework towards Adaptive In-vehicle Interaction while Driving

Authors: Amr Gomaa, Alexandra Alles, Elena Meiser, Lydia Helene Rupp, Marco Molz, Guillermo Reyes

Link: http://arxiv.org/abs/2208.05564v1

Abstract: Several researchers have focused on studying driver cognitive behavior and mental load for in-vehicle interaction while driving. Adaptive interfaces that vary with mental and perceptual load levels could help in reducing accidents and enhancing the driver experience. In this paper, we analyze the effects of mental workload and perceptual load on psychophysiological dimensions and provide a machine learning-based framework for mental and perceptual load estimation in a dual task scenario for in-vehicle interaction (https://github.com/amrgomaaelhady/MWL-PL-estimator). We use off-the-shelf non-intrusive sensors that can be easily integrated into the vehicle's system. Our statistical analysis shows that while mental workload influences some psychophysiological dimensions, perceptual load shows little effect. Furthermore, we classify the mental and perceptual load levels through the fusion of these measurements, moving towards a real-time adaptive in-vehicle interface that is personalized to user behavior and driving conditions. We report up to 89% mental workload classification accuracy and provide a real-time minimally-intrusive solution.

2022-08-09

Vehicle Type Specific Waypoint Generation

Authors: Yunpeng Liu, Jonathan Wilder Lavington, Adam Scibior, Frank Wood

Link: http://arxiv.org/abs/2208.04987v1

Abstract: We develop a generic mechanism for generating vehicle-type specific sequences of waypoints from a probabilistic foundation model of driving behavior. Many foundation behavior models are trained on data that does not include vehicle information, which limits their utility in downstream applications such as planning. Our novel methodology conditionally specializes such a behavior predictive model to a vehicle-type by utilizing byproducts of the reinforcement learning algorithms used to produce vehicle specific controllers. We show how to compose a vehicle specific value function estimate with a generic probabilistic behavior model to generate vehicle-type specific waypoint sequences that are more likely to be physically plausible than their vehicle-agnostic counterparts.

2022-08-05

Drive Right: Shaping Public's Trust, Understanding, and Preference Towards Autonomous Vehicles Using a Virtual Reality Driving Simulator

Authors: Zhijie Qiao, Xiatao Sun, Helen Loeb, Rahul Mangharam

Link: http://arxiv.org/abs/2208.02939v2

Abstract: Autonomous vehicles are increasingly introduced into our lives. Yet, people's misunderstanding and mistrust have become the major obstacles to the use of these technologies. In response to this problem, proper work must be done to increase the public's understanding and awareness and help drivers rationally evaluate the system. The method proposed in this paper is a virtual reality driving simulator which serves as a low-cost platform for autonomous vehicle demonstration and education. To test the validity of the platform, we recruited 36 participants and conducted a test training drive using three different scenarios. The results show that our simulator successfully increased participants' understanding while favorably changing their attitude towards the autonomous system. The methodology and findings presented in this paper can be further explored by driving schools, auto manufacturers, and policy makers, to improve training for autonomous vehicles.

2022-07-13

Connected Vehicles: A Privacy Analysis

Authors: Mark Quinlan, Jun Zhao, Andrew Simpson

Link: http://arxiv.org/abs/2207.06182v1

Abstract: Just as the world of consumer devices was forever changed by the introduction of computer controlled solutions, the introduction of the engine control unit (ECU) gave rise to the automobile's transformation from a transportation product to a technology platform. A modern car is capable of processing, analysing and transmitting data in ways that could not have been foreseen only a few years ago. These cars often incorporate telematics systems, which are used to provide navigation and internet connectivity over cellular networks, as well as data-recording devices for insurance and product development purposes. We examine the telematics system of a production vehicle, and aim to ascertain some of the associated privacy-related threats. We also consider how this analysis might underpin further research.

2022-06-18

Anticipated emotions associated with trust in autonomous vehicles

Authors: Lilit Avetisian, Jackie Ayoub, Feng Zhou

Link: http://arxiv.org/abs/2206.09275v1

Abstract: Trust in automation has been mainly studied from the cognitive perspective, though some researchers have shown that trust is also influenced by emotion. Therefore, it is essential to investigate the relationships between emotions and trust. In this study, we explored the pattern of 19 anticipated emotions associated with two levels of trust (i.e., low vs. high levels of trust) elicited from two levels of autonomous vehicle (AV) performance (i.e., failure and non-failure) among 105 participants recruited from Amazon Mechanical Turk (AMT). Trust was assessed at three layers, i.e., dispositional, initial learned, and situational trust. The study was designed to measure how emotions are affected by low and high levels of trust. Situational trust was significantly correlated with emotions, in that a high level of trust significantly improved participants' positive emotions, and vice versa. We also identified the underlying factors of emotions associated with situational trust. Our results offered important implications on anticipated emotions associated with trust in AVs.

2022-06-10

Vehicle-To-Pedestrian Communication Feedback Module: A Study on Increasing Legibility, Public Acceptance and Trust

Authors: Melanie Schmidt-Wolf, David Feil-Seifer

Link: http://arxiv.org/abs/2206.05312v2

Abstract: Vehicle pedestrian communication is extremely important when developing autonomy for an autonomous vehicle. Enabling bidirectional nonverbal communication between pedestrians and autonomous vehicles will lead to an improvement of pedestrians' safety in autonomous driving. If a pedestrian wants to communicate, the autonomous vehicle should provide feedback to the human about what it is about to do. The user study presented in this paper investigated several possible options for an external vehicle display for effective nonverbal communication between an autonomous vehicle and a human. The result of this study will guide the development of the feedback module in future studies, optimizing for public acceptance and trust in the autonomous vehicle's decision while being legible to the widest range of potential users. The results of this study show that participants prefer symbols over text, lights and road projection. Additionally, participants prefer the combination of symbols and text as interaction modes to be displayed if the autonomous vehicle is not driving. Further, the results show that the text interaction mode option "Safe to cross" should be used combined with the symbol interaction mode option that displays a symbol of a walking person. We plan to elaborate and focus on the selected interaction modes via Virtual Reality and in the real world in ongoing and future studies.

2022-06-06

Effects of Augmented-Reality-Based Assisting Interfaces on Drivers' Object-wise Situational Awareness in Highly Autonomous Vehicles

Authors: Xiaofeng Gao, Xingwei Wu, Samson Ho, Teruhisa Misu, Kumar Akash

Link: http://arxiv.org/abs/2206.02332v1

Abstract: Although partially autonomous driving (AD) systems are already available in production vehicles, drivers are still required to maintain a sufficient level of situational awareness (SA) during driving. Previous studies have shown that providing information about the AD's capability using user interfaces can improve the driver's SA. However, displaying too much information increases the driver's workload and can distract or overwhelm the driver. Therefore, to design an efficient user interface (UI), it is necessary to understand its effect under different circumstances. In this paper, we focus on a UI based on augmented reality (AR), which can highlight potential hazards on the road. To understand the effect of highlighting on drivers' SA for objects with different types and locations under various traffic densities, we conducted an in-person experiment with 20 participants on a driving simulator. Our study results show that the effects of highlighting on drivers' SA varied by traffic densities, object locations and object types. We believe our study can provide guidance in selecting which object to highlight for the AR-based driver-assistance interface to optimize SA for drivers driving and monitoring partially autonomous vehicles.

2022-05-28

Investigating End-user Acceptance of Last-mile Delivery by Autonomous Vehicles in the United States

Authors: Antonios Saravanos, Olivia Verni, Ian Moore, Sall Aboubacar, Jen Arriaza, Sabrina Jivani, Audrey Bennett, Siqi Li, Dongnanzi Zheng, Stavros Zervoudakis

Link: http://arxiv.org/abs/2205.14282v3

Abstract: This paper investigates the end-user acceptance of last-mile delivery carried out by autonomous vehicles within the United States. A total of 296 participants were presented with information on this technology and then asked to complete a questionnaire on their perceptions to gauge their behavioral intention concerning acceptance. Structural equation modeling of the partial least squares flavor (PLS-SEM) was employed to analyze the collected data. The results indicated that the perceived usefulness of the technology played the greatest role in end-user acceptance decisions, followed by the influence of others, and then the enjoyment received by interacting with the technology. Furthermore, the perception of risk associated with using autonomous delivery vehicles for last-mile delivery led to a decrease in acceptance. However, most participants did not perceive the use of this technology to be risky. The paper concludes by summarizing the implications our findings have on the respective stakeholders and proposing the next steps in this area of research.

2022-04-24

Ordered-logit pedestrian stress model for traffic flow with automated vehicles

Authors: Kimia Kamal, Bilal Farooq, Mahwish Mudassar, Arash Kalatian

Link: http://arxiv.org/abs/2204.11367v1

Abstract: An ordered-logit model is developed to study the effects of Automated Vehicles (AVs) in the traffic mix on the average stress level of a pedestrian when crossing an urban street at mid-block. Information collected from a galvanic skin resistance sensor and virtual reality experiments is transformed into a dataset with interpretable average stress levels (low, medium, and high) and geometric, traffic, and environmental conditions. Modelling results indicate a decrease in average stress level with the increase in the percentage of AVs in the traffic mix.
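
For readers unfamiliar with ordered-logit models, the following minimal sketch shows how such a model turns a linear predictor into probabilities for three ordered stress levels (low, medium, high) using the standard formulation P(Y <= k) = logistic(tau_k - utility). The coefficient `beta_av_share` and the thresholds are made-up illustrative values, not estimates from the paper; the negative sign is chosen only to mirror the direction of the reported effect.

```python
import math


def logistic(z):
    """Standard logistic CDF."""
    return 1.0 / (1.0 + math.exp(-z))


def ordered_logit_probs(utility, thresholds):
    """Category probabilities for an ordered-logit model.

    thresholds must be increasing; with K thresholds there are K + 1
    ordered categories. P(Y <= k) = logistic(thresholds[k] - utility).
    """
    cumulative = [logistic(t - utility) for t in thresholds] + [1.0]
    probs = []
    previous = 0.0
    for c in cumulative:
        probs.append(c - previous)
        previous = c
    return probs


# Hypothetical parameters for illustration only.
beta_av_share = -1.5        # assumed effect of AV share on latent stress
thresholds = [-0.5, 1.0]    # assumed cut points: low|medium, medium|high


def stress_probs(av_share):
    """Return [P(low), P(medium), P(high)] for an AV share in [0, 1]."""
    return ordered_logit_probs(beta_av_share * av_share, thresholds)


if __name__ == "__main__":
    for share in (0.0, 0.4, 0.8):
        low, medium, high = stress_probs(share)
        print(f"AV share {share:.1f}: low={low:.3f} medium={medium:.3f} high={high:.3f}")
```

With a negative coefficient, raising the AV share shifts probability mass from the high category towards the low category, which is the qualitative pattern the abstract reports.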

2022-04-19

From Spoken Thoughts to Automated Driving Commentary: Predicting and Explaining Intelligent Vehicles' Actions

Authors: Daniel Omeiza, Sule Anjomshoae, Helena Webb, Marina Jirotka, Lars Kunze

Link: http://arxiv.org/abs/2204.09109v2

Abstract: In commentary driving, drivers verbalise their observations, assessments and intentions. By speaking out their thoughts, both learning and expert drivers are able to create a better understanding and awareness of their surroundings. In the intelligent vehicle context, automated driving commentary can provide intelligible explanations about driving actions, thereby assisting a driver or an end-user during driving operations in challenging and safety-critical scenarios. In this paper, we conducted a field study in which we deployed a research vehicle in an urban environment to obtain data. While collecting sensor data of the vehicle's surroundings, we obtained driving commentary from a driving instructor using the think-aloud protocol. We analysed the driving commentary and uncovered an explanation style; the driver first announces his observations, announces his plans, and then makes general remarks. He also makes counterfactual comments. We successfully demonstrated how factual and counterfactual natural language explanations that follow this style could be automatically generated using a transparent tree-based approach. Generated explanations for longitudinal actions (e.g., stop and move) were deemed more intelligible and plausible by human judges compared to lateral actions, such as lane changes. We discussed how our approach can be built on in the future to realise more robust and effective explainability for driver assistance as well as partial and conditional automation of driving functions.

2022-02-15

Multimodal Driver Referencing: A Comparison of Pointing to Objects Inside and Outside the Vehicle

Authors: Abdul Rafey Aftab, Michael von der Beeck

Link: http://arxiv.org/abs/2202.07360v1

Abstract: Advanced in-cabin sensing technologies, especially vision based approaches, have tremendously progressed user interaction inside the vehicle, paving the way for new applications of natural user interaction. Just as humans use multiple modes to communicate with each other, we follow an approach which is characterized by simultaneously using multiple modalities to achieve natural human-machine interaction for a specific task: pointing to or glancing towards objects inside as well as outside the vehicle for deictic references. By tracking the movements of eye-gaze, head and finger, we design a multimodal fusion architecture using a deep neural network to precisely identify the driver's referencing intent. Additionally, we use a speech command as a trigger to separate each referencing event. We observe differences in driver behavior in the two pointing use cases (i.e. for inside and outside objects), especially when analyzing the preciseness of the three modalities eye, head, and finger. We conclude that there is no single modality that is solely optimal for all cases as each modality reveals certain limitations. Fusion of multiple modalities exploits the relevant characteristics of each modality, hence overcoming the case dependent limitations of each individual modality. Ultimately, we propose a method to identify whether the driver's referenced object lies inside or outside the vehicle, based on the predicted pointing direction.

2022-02-14

BROOK Dataset: A Playground for Exploiting Data-Driven Techniques in Human-Vehicle Interactive Designs

Authors: Wangkai Jin, Yicun Duan, Junyu Liu, Shuchang Huang, Zeyu Xiong, Xiangjun Peng

Link: http://arxiv.org/abs/2202.06494v1

Abstract: Emerging Autonomous Vehicles (AV) offer great potential to exploit data-driven techniques for adaptive and personalized Human-Vehicle Interactions. However, the lack of high-quality and rich data support limits the opportunities to explore the design space of data-driven techniques, and validate the effectiveness of concrete mechanisms. Our goal is to initialize the efforts to deliver the building block for exploring data-driven Human-Vehicle Interaction designs. To this end, we present the BROOK dataset, a multi-modal dataset with facial video records. We first outline our rationale for building the BROOK dataset. Then, we elaborate on how we built the current version of the BROOK dataset via a year-long study, and give an overview of the dataset. Next, we present three example studies using BROOK to justify the applicability of the BROOK dataset. We also identify key lessons learned from building the BROOK dataset, and discuss how the BROOK dataset can foster an extensive amount of follow-up studies.

2022-02-13

Motion Sickness Modeling with Visual Vertical Estimation and Its Application to Autonomous Personal Mobility Vehicles

Authors: Hailong Liu, Shota Inoue, Takahiro Wada

Link: http://arxiv.org/abs/2202.06299v4

Abstract: Passengers (drivers) of level 3-5 autonomous personal mobility vehicles (APMV) and cars can perform non-driving tasks, such as reading books and smartphones, while driving. It has been pointed out that such activities may increase motion sickness. Many studies have been conducted to build countermeasures, of which various computational motion sickness models have been developed. Many of these are based on subjective vertical conflict (SVC) theory, which describes vertical changes in direction sensed by human sensory organs vs. those expected by the central nervous system. Such models are expected to be applied to autonomous driving scenarios. However, no current computational model can integrate visual vertical information with vestibular sensations. We proposed a 6 DoF SVC-VV model which adds a visually perceived vertical block into a conventional six-degrees-of-freedom SVC model to predict VV directions from image data simulating the visual input of a human. Hence, a simple image-based VV estimation method is proposed. As the validation of the proposed model, this paper focuses on describing the fact that the motion sickness increases as a passenger reads a book while using an APMV, assuming that visual vertical (VV) plays an important role. In the static experiment, it is demonstrated that the estimated VV by the proposed method accurately described the gravitational acceleration direction with a low mean absolute deviation. In addition, the results of the driving experiment using an APMV demonstrated that the proposed 6 DoF SVC-VV model could describe the increased motion sickness experienced when the VV and gravitational acceleration directions were different.

2022-01-09

In-Device Feedback in Immersive Head-Mounted Displays for Distance Perception During Teleoperation of Unmanned Ground Vehicles

Authors: Yiming Luo, Jialin Wang, Rongkai Shi, Hai-Ning Liang, Shan Luo

Link: http://arxiv.org/abs/2201.03036v1

Abstract: In recent years, Virtual Reality (VR) Head-Mounted Displays (HMD) have been used to provide an immersive, first-person view in real-time for the remote-control of Unmanned Ground Vehicles (UGV). One critical issue is that it is challenging to perceive the distance of obstacles surrounding the vehicle from 2D views in the HMD, which deteriorates the control of UGV. Conventional distance indicators used in HMD take up screen space, which leads to clutter on the display and can further reduce situation awareness of the physical environment. To address the issue, in this paper we propose off-screen in-device feedback using vibro-tactile and/or light-visual cues to provide real-time distance information for the remote control of UGV. Results from a study show a significantly better performance with either feedback type, reduced workload and improved usability in a driving task that requires continuous perception of the distance between the UGV and its environmental objects or obstacles. Our findings show a solid case for in-device vibro-tactile and/or light-visual feedback to support remote operation of UGVs that relies heavily on distance perception of objects.

2021-11-06

Prediction of Pedestrian Spatiotemporal Risk Levels for Intelligent Vehicles: A Data-driven Approach

Authors: Zheyu Zhang, Boyang Wang, Chao Lu, Jinghang Li, Cheng Gong, Jianwei Gong

Link: http://arxiv.org/abs/2111.03822v1

Abstract: In recent years, road safety has attracted significant attention from researchers and practitioners in the intelligent transport systems domain. As one of the most common and vulnerable groups of road users, pedestrians cause great concerns due to their unpredictable behavior and movement, as subtle misunderstandings in vehicle-pedestrian interaction can easily lead to risky situations or collisions. Existing methods use either predefined collision-based models or human-labeling approaches to estimate the pedestrians' risks. These approaches are usually limited by their poor generalization ability and lack of consideration of interactions between the ego vehicle and a pedestrian. This work tackles the listed problems by proposing a Pedestrian Risk Level Prediction (PRLP) system. The system consists of three modules. Firstly, vehicle-perspective pedestrian data are collected. Since the data contains information regarding the movement of both the ego vehicle and pedestrian, it can simplify the prediction of spatiotemporal features in an interaction-aware fashion. Using the long short-term memory model, the pedestrian trajectory prediction module predicts their spatiotemporal features in the subsequent five frames. As the predicted trajectory follows certain interaction and risk patterns, a hybrid clustering and classification method is adopted to explore the risk patterns in the spatiotemporal features and train a risk level classifier using the learned patterns. Upon predicting the spatiotemporal features of pedestrians and identifying the corresponding risk level, the risk patterns between the ego vehicle and pedestrians are determined. Experimental results verified the capability of the PRLP system to predict the risk level of pedestrians, thus supporting the collision risk assessment of intelligent vehicles and providing safety warnings to both vehicles and pedestrians.

2021-11-03

ML-PersRef: A Machine Learning-based Personalized Multimodal Fusion Approach for Referencing Outside Objects From a Moving Vehicle

Authors: Amr Gomaa, Guillermo Reyes, Michael Feld

Link: http://arxiv.org/abs/2111.02327v1

Abstract: Over the past decades, the addition of hundreds of sensors to modern vehicles has led to an exponential increase in their capabilities. This allows for novel approaches to interaction with the vehicle that go beyond traditional touch-based and voice command approaches, such as emotion recognition, head rotation, eye gaze, and pointing gestures. Although gaze and pointing gestures have been used before for referencing objects inside and outside vehicles, the multimodal interaction and fusion of these gestures have so far not been extensively studied. We propose a novel learning-based multimodal fusion approach for referencing outside-the-vehicle objects while maintaining a long driving route in a simulated environment. The proposed multimodal approaches outperform single-modality approaches in multiple aspects and conditions. Moreover, we also demonstrate possible ways to exploit behavioral differences between users when completing the referencing task to realize an adaptable personalized system for each driver. We propose a personalization technique based on the transfer-of-learning concept for exceedingly small data sizes to enhance prediction and adapt to individualistic referencing behavior. Our code is publicly available at https://github.com/amr-gomaa/ML-PersRef.

2021-10-28

Unmanned Aerial Vehicles Traffic Management Solution Using Crowd-sensing and Blockchain

Authors: Ruba Alkadi, Abdulhadi Shoufan

Link: http://arxiv.org/abs/2110.14979v1

Abstract: Unmanned aerial vehicles (UAVs) are gaining immense attention due to their potential to revolutionize various businesses and industries. However, the adoption of UAV-assisted applications will strongly rely on the provision of reliable systems that allow managing UAV operations at high levels of safety and security. Recently, the concept of UAV traffic management (UTM) has been introduced to support safe, efficient, and fair access to low-altitude airspace for commercial UAVs. A UTM system identifies multiple cooperating parties with different roles and levels of authority to provide real-time services to airspace users. However, current UTM systems are centralized and lack a clear definition of protocols that govern a secure interaction between authorities, service providers, and end-users. The lack of such protocols renders the UTM system unscalable and prone to various cyber attacks. Another limitation of the currently proposed UTM architecture is the absence of an efficient mechanism to enforce airspace rules and regulations. To address this issue, we propose a decentralized UTM protocol that controls access to airspace while ensuring high levels of integrity, availability, and confidentiality of airspace operations. To achieve this, we exploit key features of the blockchain and smart contract technologies. In addition, we employ a mobile crowdsensing (MCS) mechanism to seamlessly enforce airspace rules and regulations that govern the UAV operations. The solution is implemented on top of the Ethereum platform and verified using four different smart contract verification tools. We also provided a security and cost analysis of our solution. For reproducibility, we made our implementation publicly available on GitHub.

2021-09-30

Game and Simulation Design for Studying Pedestrian-Automated Vehicle Interactions

Authors: Georgios Pappas, Joshua E. Siegel, Jacob Rutkowski, Andrea Schaaf

Link: http://arxiv.org/abs/2109.15205v1

Abstract: The present cross-disciplinary research explores pedestrian-autonomous vehicle interactions in a safe, virtual environment. We first present contemporary tools in the field and then propose the design and development of a new application that facilitates pedestrian point of view research. We conduct a three-step user experience experiment where participants answer questions before and after using the application in various scenarios. Behavioral results in virtuality, especially when there were consequences, tend to simulate real life sufficiently well to make design choices, and we received valuable insights into human/vehicle interaction. Our tool seemed to start raising participant awareness of autonomous vehicles and their capabilities and limitations, which is an important step in overcoming public distrust of AVs. Further, studying how users respect or take advantage of AVs may help inform future operating mode indicator design as well as algorithm biases that might support socially-optimal AV operation.