Robot

2024-04-25

Vision-based robot manipulation of transparent liquid containers in a laboratory setting

Authors: Daniel Schober, Ronja Güldenring, James Love, Lazaros Nalpantidis

Link: http://arxiv.org/abs/2404.16529v1

Abstract: Laboratory processes involving small volumes of solutions and active ingredients are often performed manually due to challenges in automation, such as high initial costs, semi-structured environments and protocol variability. In this work, we develop a flexible and cost-effective approach to address this gap by introducing a vision-based system for liquid volume estimation and a simulation-driven pouring method particularly designed for containers with small openings. We evaluate both components individually, followed by an applied real-world integration of cell culture automation using a UR5 robotic arm. Our work is fully reproducible: we share our code at https://github.com/DaniSchober/LabLiquidVision and the newly introduced dataset LabLiquidVolume is available at https://data.dtu.dk/articles/dataset/LabLiquidVision/25103102.

Leveraging Pretrained Latent Representations for Few-Shot Imitation Learning on a Dexterous Robotic Hand

Authors: Davide Liconti, Yasunori Toshimitsu, Robert Katzschmann

Link: http://arxiv.org/abs/2404.16483v1

Abstract: In the context of imitation learning applied to dexterous robotic hands, the high complexity of the systems makes learning complex manipulation tasks challenging. However, the numerous datasets depicting human hands performing various tasks could provide us with better knowledge of human hand motion. We propose a method to leverage multiple large-scale task-agnostic datasets to obtain latent representations that effectively encode motion subtrajectories, which we incorporate into a transformer-based behavior cloning method. Our results demonstrate that employing latent representations yields enhanced performance compared to conventional behavior cloning methods, particularly regarding resilience to errors and noise in perception and proprioception. Furthermore, the proposed approach relies solely on human demonstrations, eliminating the need for teleoperation and therefore accelerating the data acquisition process. Accurate inverse kinematics for fingertip retargeting ensures precise transfer from human hand data to the robot, facilitating effective learning and deployment of manipulation policies. Finally, the trained policies have been successfully transferred to a real-world 23-DoF robotic system.

Neural Assembler: Learning to Generate Fine-Grained Robotic Assembly Instructions from Multi-View Images

Authors: Hongyu Yan, Yadong Mu

Link: http://arxiv.org/abs/2404.16423v1

Abstract: Image-guided object assembly represents a burgeoning research topic in computer vision. This paper introduces a novel task: translating multi-view images of a structural 3D model (for example, one constructed with building blocks drawn from a 3D-object library) into a detailed sequence of assembly instructions executable by a robotic arm. Fed with multi-view images of the target 3D model for replication, the model designed for this task must address several sub-tasks, including recognizing individual components used in constructing the 3D model, estimating the geometric pose of each component, and deducing a feasible assembly order adhering to physical rules. Establishing accurate 2D-3D correspondence between multi-view images and 3D objects is technically challenging. To tackle this, we propose an end-to-end model known as the Neural Assembler. This model learns an object graph where each vertex represents recognized components from the images, and the edges specify the topology of the 3D model, enabling the derivation of an assembly plan. We establish benchmarks for this task and conduct comprehensive empirical evaluations of Neural Assembler and alternative solutions. Our experiments clearly demonstrate the superiority of Neural Assembler.

Robot Swarm Control Based on Smoothed Particle Hydrodynamics for Obstacle-Unaware Navigation

Authors: Michikuni Eguchi, Mai Nishimura, Shigeo Yoshida, Takefumi Hiraki

Link: http://arxiv.org/abs/2404.16309v1

Abstract: Robot swarms hold immense potential for performing complex tasks far beyond the capabilities of individual robots. However, the challenge in unleashing this potential is the robots' limited sensory capabilities, which hinder their ability to detect and adapt to unknown obstacles in real-time. To overcome this limitation, we introduce a novel robot swarm control method with an indirect obstacle detector using a smoothed particle hydrodynamics (SPH) model. The indirect obstacle detector can predict the collision with an obstacle and its collision point solely from the robot's velocity information. This approach enables the swarm to effectively and accurately navigate environments without the need for explicit obstacle detection, significantly enhancing their operational robustness and efficiency. Our method's superiority is quantitatively validated through a comparative analysis, showcasing its significant navigation and pattern formation improvements under obstacle-unaware conditions.
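As a rough illustration of the SPH idea, the toy sketch below derives a velocity command for each robot from a local density estimate, so crowded regions push robots apart. The kernel, smoothing length, and stiffness constants are our assumptions, not the authors' implementation.

```python
# Hypothetical SPH-style swarm control sketch (not the paper's code).
import numpy as np

H = 1.0              # smoothing length (assumed)
RHO0, K = 1.0, 0.5   # rest density and pressure stiffness (assumed)

def kernel(r, h=H):
    """Poly6-style smoothing kernel; zero outside radius h."""
    q = np.clip(1.0 - (r / h) ** 2, 0.0, None)
    return q ** 3

def sph_velocities(pos):
    """Repulsive 'pressure' velocity for each robot from neighbour density."""
    vel = np.zeros_like(pos)
    for i in range(len(pos)):
        diff = pos[i] - pos                      # vectors from neighbours to robot i
        dist = np.linalg.norm(diff, axis=1)
        rho = kernel(dist).sum()                 # local density estimate
        p = K * (rho - RHO0)                     # pressure from density deviation
        mask = (dist > 1e-9) & (dist < H)        # neighbours inside the kernel support
        # push away from crowded regions, weighted by kernel value
        vel[i] = (diff[mask].T / dist[mask] * kernel(dist[mask]) * p).T.sum(axis=0)
    return vel

robots = np.random.rand(20, 2) * 5.0
print(sph_velocities(robots)[:3])
```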

2024-04-24

The Robotic MAAO 0.7m Telescope System: Performance and Standard Photometric System

Authors: Gu Lim, Dohyeong Kim, Seonghun Lim, Myungshin Im, Hyeonho Choi, Jaemin Park, Keun-Hong Park, Junyeong Park, Chaudhary Muskaan, Donghyun Kim, Hayeong Jeong

Link: http://arxiv.org/abs/2404.15884v1

Abstract: We introduce a 0.7m telescope system at the Miryang Arirang Astronomical Observatory (MAAO), a public observatory in Miryang, Korea. System integration and a scheduling program enable the 0.7m telescope system to operate completely robotically during nighttime, eliminating the need for human intervention. Using the 0.7m telescope system, we obtain atmospheric extinction coefficients and zero-point magnitudes by observing standard stars. As a result, we find that atmospheric extinctions are moderate but can sometimes increase depending on weather conditions. The measured 5-sigma limiting magnitudes reach down to BVRI=19.4-19.6 AB mag for a point source with a total integration time of 10 minutes under clear weather conditions, demonstrating performance comparable to that of other observational facilities with similar specifications and sky conditions. We expect that the newly established MAAO 0.7m telescope system will contribute significantly to observational studies in astronomy. In particular, with its capability for robotic observations, this system, although its primary duty is public viewing, can be extensively used for time-series observation of transients.
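The calibration step mentioned above is a linear fit of standard-star photometry against airmass: m_inst - m_std = ZP + kX, where k is the extinction coefficient and ZP the zero point (sign conventions vary between observatories). A minimal illustration with made-up numbers:

```python
# Illustrative zero-point/extinction fit on invented standard-star data.
import numpy as np

airmass = np.array([1.0, 1.2, 1.5, 1.8, 2.1])                          # X
m_inst_minus_std = np.array([-21.02, -20.98, -20.93, -20.88, -20.82])  # made-up values

k, zp = np.polyfit(airmass, m_inst_minus_std, 1)  # slope = extinction, intercept = zero point
print(f"extinction k = {k:.3f} mag/airmass, zero point ZP = {zp:.2f} mag")
```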

2024-04-23

A Rapid Adapting and Continual Learning Spiking Neural Network Path Planning Algorithm for Mobile Robots

Authors: Harrison Espino, Robert Bain, Jeffrey L. Krichmar

Link: http://arxiv.org/abs/2404.15524v1

Abstract: Mapping traversal costs in an environment and planning paths based on this map are important for autonomous navigation. We present a neurobotic navigation system that utilizes a Spiking Neural Network Wavefront Planner and E-prop learning to concurrently map and plan paths in a large and complex environment. We incorporate a novel method for mapping which, when combined with the Spiking Wavefront Planner, allows for adaptive planning by selectively considering any combination of costs. The system is tested on a mobile robot platform in an outdoor environment with obstacles and varying terrain. Results indicate that the system is capable of discerning features in the environment using three measures of cost: (1) energy expenditure by the wheels, (2) time spent in the presence of obstacles, and (3) terrain slope. In just twelve hours of online training, E-prop learns and incorporates traversal costs into the path planning maps by updating the delays in the Spiking Wavefront Planner. On simulated paths, the Spiking Wavefront Planner plans significantly shorter and lower cost paths than A* and RRT*. The spiking wavefront planner is compatible with neuromorphic hardware and could be used for applications requiring low size, weight, and power.
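A wavefront planner propagates arrival times outward from the goal, with each cell's learned delay acting as its traversal cost; the path then follows decreasing arrival times back to the start. The sketch below is a plain, Dijkstra-like stand-in for the spiking implementation, with an assumed toy cost grid:

```python
# Minimal wavefront planner over a grid of per-cell delays (not the SNN version).
import heapq

def wavefront(delays, start, goal):
    """Propagate a wave from the goal; each cell adds its delay. Returns arrival times."""
    rows, cols = len(delays), len(delays[0])
    arrival = {goal: 0.0}
    frontier = [(0.0, goal)]
    while frontier:
        t, (r, c) = heapq.heappop(frontier)
        if (r, c) == start:
            break                                  # earliest arrival at start is final
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nb = (r + dr, c + dc)
            if 0 <= nb[0] < rows and 0 <= nb[1] < cols:
                nt = t + delays[nb[0]][nb[1]]
                if nt < arrival.get(nb, float("inf")):
                    arrival[nb] = nt
                    heapq.heappush(frontier, (nt, nb))
    return arrival

grid = [[1, 1, 5], [1, 9, 5], [1, 1, 1]]           # higher delay = costlier terrain (assumed)
times = wavefront(grid, start=(0, 0), goal=(2, 2))
print(times[(0, 0)])                               # plan by stepping to ever-smaller times
```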

Understanding Robot Minds: Leveraging Machine Teaching for Transparent Human-Robot Collaboration Across Diverse Groups

Authors: Suresh Kumaar Jayaraman, Reid Simmons, Aaron Steinfeld, Henny Admoni

Link: http://arxiv.org/abs/2404.15472v1

Abstract: In this work, we aim to improve transparency and efficacy in human-robot collaboration by developing machine teaching algorithms suitable for groups with varied learning capabilities. While previous approaches focused on tailored methods for teaching individuals, our method teaches teams with various compositions of diverse learners using team belief representations to address personalization challenges within groups. We investigate various group teaching strategies, such as focusing on individual beliefs or the group's collective beliefs, and assess their impact on learning robot policies for different team compositions. Our findings reveal that team belief strategies yield less variation in learning duration and better accommodate diverse teams compared to individual belief strategies, suggesting their suitability in mixed-proficiency settings with limited resources. Conversely, individual belief strategies provide a more uniform knowledge level, particularly effective for homogeneously inexperienced groups. Our study indicates that a teaching strategy's efficacy is significantly influenced by team composition and learner proficiency, highlighting the importance of assessing learner proficiency in real time and adapting teaching approaches accordingly for optimal outcomes.

Planning the path with Reinforcement Learning: Optimal Robot Motion Planning in RoboCup Small Size League Environments

Authors: Mateus G. Machado, João G. Melo, Cleber Zanchettin, Pedro H. M. Braga, Pedro V. Cunha, Edna N. S. Barros, Hansenclever F. Bassani

Link: http://arxiv.org/abs/2404.15410v1

Abstract: This work investigates the potential of Reinforcement Learning (RL) to tackle robot motion planning challenges in the dynamic RoboCup Small Size League (SSL). Using a heuristic control approach, we evaluate RL's effectiveness in obstacle-free and single-obstacle path-planning environments. Ablation studies reveal significant performance improvements. Our method achieved a 60% time gain in obstacle-free environments compared to baseline algorithms. Additionally, our findings demonstrated dynamic obstacle avoidance capabilities, adeptly navigating around moving blocks. These findings highlight the potential of RL to enhance robot motion planning in the challenging and unpredictable SSL environment.

Closed Loop Interactive Embodied Reasoning for Robot Manipulation

Authors: Michal Nazarczuk, Jan Kristof Behrens, Karla Stepanova, Matej Hoffmann, Krystian Mikolajczyk

Link: http://arxiv.org/abs/2404.15194v1

Abstract: Embodied reasoning systems integrate robotic hardware and cognitive processes to perform complex tasks, typically in response to a natural language query about a specific physical environment. This usually involves changing the belief about the scene or physically interacting with and changing the scene (e.g. 'Sort the objects from lightest to heaviest'). In order to facilitate the development of such systems, we introduce a new simulation environment that makes use of the MuJoCo physics engine and the high-quality renderer Blender to provide realistic visual observations that are also accurate to the physical state of the scene. Together with the simulator, we propose a new benchmark composed of 10 classes of multi-step reasoning scenarios that require simultaneous visual and physical measurements. Finally, we develop a new modular Closed Loop Interactive Reasoning (CLIER) approach that takes into account the measurements of non-visual object properties, changes in the scene caused by external disturbances, and uncertain outcomes of robotic actions. We extensively evaluate our reasoning approach in simulation and in real-world manipulation tasks, achieving success rates above 76% and 64%, respectively.
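The closed-loop structure reads as an observe-reason-act cycle that re-measures the scene after every action because action outcomes are uncertain. A minimal sketch; the `perceive`, `reason`, and `act` interfaces are our placeholders, not CLIER's actual API:

```python
# Schematic closed-loop interactive reasoning cycle (placeholder interfaces).
def closed_loop_reasoning(query, perceive, reason, act, max_steps=20):
    belief = {}                               # current belief about the scene
    for _ in range(max_steps):
        obs = perceive()                      # visual + non-visual measurements
        belief = reason(query, belief, obs)   # update belief; absorb external disturbances
        action = belief.get("next_action")
        if action is None:                    # query answered / task complete
            break
        act(action)                           # outcome is uncertain, so loop and re-observe
    return belief
```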

Impedance Matching: Enabling an RL-Based Running Jump in a Quadruped Robot

Authors: Neil Guan, Shangqun Yu, Shifan Zhu, Donghyun Kim

Link: http://arxiv.org/abs/2404.15096v1

Abstract: Replicating the remarkable athleticism seen in animals has long been a challenge in robotics control. Although Reinforcement Learning (RL) has demonstrated significant progress in dynamic legged locomotion control, the substantial sim-to-real gap often hinders the real-world demonstration of truly dynamic movements. We propose a new framework to mitigate this gap through frequency-domain analysis-based impedance matching between simulated and real robots. Our framework offers a structured guideline for parameter selection and the range for dynamics randomization in simulation, thus facilitating a safe sim-to-real transfer. The learned policy using our framework enabled jumps across distances of 55 cm and heights of 38 cm. To the best of our knowledge, this is one of the highest and longest running jumps demonstrated by an RL-based control policy on a real quadruped robot. Note that the achieved jumping height is approximately 85% of that obtained from a state-of-the-art trajectory optimization method, which can be seen as the physical limit for the given robot hardware. In addition, our control policy accomplished stable walking at speeds up to 2 m/s in the forward and backward directions, and 1 m/s in the sideways direction.

Unknown Object Grasping for Assistive Robotics

Authors: Elle Miller, Maximilian Durner, Matthias Humt, Gabriel Quere, Wout Boerdijk, Ashok M. Sundaram, Freek Stulp, Jorn Vogel

Link: http://arxiv.org/abs/2404.15001v1

Abstract: We propose a novel pipeline for unknown object grasping in shared robotic autonomy scenarios. State-of-the-art methods for fully autonomous scenarios are typically learning-based approaches optimised for a specific end-effector that generate grasp poses directly from sensor input. In the domain of assistive robotics, we seek instead to utilise the user's cognitive abilities for enhanced satisfaction, grasping performance, and alignment with their high-level task-specific goals. Given a pair of stereo images, we perform unknown object instance segmentation and generate a 3D reconstruction of the object of interest. In shared control, the user then guides the robot end-effector across a virtual hemisphere centered around the object to their desired approach direction. A physics-based grasp planner finds the most stable local grasp on the reconstruction, and finally the user is guided by shared control to this grasp. In experiments on the DLR EDAN platform, we report a grasp success rate of 87% for 10 unknown objects, and demonstrate the method's capability to grasp objects in structured clutter and from shelves.

Vision Beyond Boundaries: An Initial Design Space of Domain-specific Large Vision Models in Human-robot Interaction

Authors: Yuchong Zhang, Yong Ma, Danica Kragic

Link: http://arxiv.org/abs/2404.14965v1

Abstract: The emergence of Large Vision Models (LVMs) follows in the footsteps of the recent prosperity of Large Language Models (LLMs). However, there is a noticeable gap in structured research applying LVMs to Human-Robot Interaction (HRI), despite extensive evidence supporting the efficacy of vision models in enhancing interactions between humans and robots. Recognizing this vast and anticipated potential, we introduce an initial design space that incorporates domain-specific LVMs, chosen for their superior performance over general-purpose models. We delve into three primary dimensions: HRI contexts, vision-based tasks, and specific domains. We empirically validated the design space with 15 experts across six metrics, showcasing its efficacy in relevant decision-making scenarios. We explore the process of ideation and potential application scenarios, envisioning this design space as a foundational guideline for future HRI system design, emphasizing accurate domain alignment and model selection.

Bi-CL: A Reinforcement Learning Framework for Robots Coordination Through Bi-level Optimization

Authors: Zechen Hu, Daigo Shishika, Xuesu Xiao, Xuan Wang

Link: http://arxiv.org/abs/2404.14649v1

Abstract: In multi-robot systems, achieving coordinated missions remains a significant challenge due to the coupled nature of coordination behaviors and the lack of global information for individual robots. To mitigate these challenges, this paper introduces a novel approach, Bi-level Coordination Learning (Bi-CL), that leverages a bi-level optimization structure within a centralized training and decentralized execution paradigm. Our bi-level reformulation decomposes the original problem into a reinforcement learning level with reduced action space, and an imitation learning level that gains demonstrations from a global optimizer. Both levels contribute to improved learning efficiency and scalability. We note that robots' incomplete information leads to mismatches between the two levels of learning models. To address this, Bi-CL further integrates an alignment penalty mechanism, aiming to minimize the discrepancy between the two levels without degrading their training efficiency. We introduce a running example to conceptualize the problem formulation and apply Bi-CL to two variations of this example: route-based and graph-based scenarios. Simulation results demonstrate that Bi-CL can learn more efficiently and achieve comparable performance with traditional multi-agent reinforcement learning baselines for multi-robot coordination.
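Conceptually, the alignment penalty can be read as a third loss term tying the RL level's policy to the imitation level's policy. The PyTorch sketch below is speculative: the tensor shapes, the KL-based penalty, and the way the terms are combined are our assumptions, not the paper's formulation.

```python
# Speculative combined objective for a bi-level coordination learner.
import torch
import torch.nn.functional as F

def bi_level_loss(rl_logits, il_logits, actions, advantages, expert_actions, lam=0.1):
    """RL term on the reduced action space + imitation term + alignment penalty."""
    rl_logp = F.log_softmax(rl_logits, dim=-1)
    il_logp = F.log_softmax(il_logits, dim=-1)
    # policy-gradient-style term for the reinforcement learning level
    rl_term = -(rl_logp.gather(1, actions[:, None]).squeeze(1) * advantages).mean()
    # imitation of demonstrations produced by the global optimizer
    il_term = F.nll_loss(il_logp, expert_actions)
    # alignment penalty: KL divergence between the two levels' action distributions
    align = F.kl_div(il_logp, rl_logp.exp(), reduction="batchmean")
    return rl_term + il_term + lam * align
```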

2024-04-22

Integrating Disambiguation and User Preferences into Large Language Models for Robot Motion Planning

Authors: Mohammed Abugurain, Shinkyu Park

Link: http://arxiv.org/abs/2404.14547v1

Abstract: This paper presents a framework that can interpret humans' navigation commands containing temporal elements and directly translate their natural language instructions into robot motion planning. Central to our framework is utilizing Large Language Models (LLMs). To enhance the reliability of LLMs in the framework and improve user experience, we propose methods to resolve the ambiguity in natural language instructions and capture user preferences. The process begins with an ambiguity classifier, identifying potential uncertainties in the instructions. Ambiguous statements trigger a GPT-4-based mechanism that generates clarifying questions, incorporating user responses for disambiguation. Also, the framework assesses and records user preferences for non-ambiguous instructions, enhancing future interactions. The last part of this process is the translation of disambiguated instructions into a robot motion plan using Linear Temporal Logic. This paper details the development of this framework and the evaluation of its performance in various test scenarios.
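The pipeline reads naturally as a loop: classify ambiguity, ask a clarifying question, fold the answer back in, and finally translate to LTL. A hedged sketch, where `llm` and `user_io` are placeholder callables and the prompt wording is our invention (the paper uses a dedicated ambiguity classifier and a GPT-4-based question generator):

```python
# Sketch of the disambiguation loop with placeholder LLM and user-I/O callables.
def resolve_instruction(instruction, llm, user_io):
    verdict = llm(f"Is this navigation command ambiguous? Answer YES or NO.\n{instruction}")
    while verdict.strip().upper().startswith("YES"):
        question = llm(f"Ask one clarifying question about: {instruction}")
        answer = user_io(question)                         # put the question to the human
        instruction = f"{instruction}\nClarification: {answer}"
        verdict = llm(f"Is this navigation command ambiguous? Answer YES or NO.\n{instruction}")
    # downstream: translate the disambiguated instruction into an LTL formula
    return llm(f"Translate this command into Linear Temporal Logic: {instruction}")
```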

LLM-Personalize: Aligning LLM Planners with Human Preferences via Reinforced Self-Training for Housekeeping Robots

Authors: Dongge Han, Trevor McInroe, Adam Jelley, Stefano V. Albrecht, Peter Bell, Amos Storkey

Link: http://arxiv.org/abs/2404.14285v1

Abstract: Large language models (LLMs) have shown significant potential for robotics applications, particularly task planning, by harnessing their language comprehension and text generation capabilities. However, in applications such as household robotics, a critical gap remains in the personalization of these models to individual user preferences. We introduce LLM-Personalize, a novel framework with an optimization pipeline designed to personalize LLM planners for household robotics. Our LLM-Personalize framework features an LLM planner that performs iterative planning in multi-room, partially-observable household scenarios, making use of a scene graph constructed with local observations. The generated plan consists of a sequence of high-level actions which are subsequently executed by a controller. Central to our approach is the optimization pipeline, which combines imitation learning and iterative self-training to personalize the LLM planner. In particular, the imitation learning phase performs initial LLM alignment from demonstrations, and bootstraps the model to facilitate effective iterative self-training, which further explores and aligns the model to user preferences. We evaluate LLM-Personalize on Housekeep, a challenging simulated real-world 3D benchmark for household rearrangements, and show that LLM-Personalize achieves more than a 30 percent increase in success rate over existing LLM planners, showcasing significantly improved alignment with human preferences. Project page: https://donggehan.github.io/projectllmpersonalize/.

A multi-robot system for the detection of explosive devices

Authors: Ken Hasselmann, Mario Malizia, Rafael Caballero, Fabio Polisano, Shashank Govindaraj, Jakob Stigler, Oleksii Ilchenko, Milan Bajic, Geert De Cubber

Link: http://arxiv.org/abs/2404.14167v1

Abstract: In order to clear the world of the threat posed by landmines and other explosive devices, robotic systems can play an important role. However, the development of such field robots that need to operate in hazardous conditions requires the careful consideration of multiple aspects related to the perception, mobility, and collaboration capabilities of the system. In the framework of a European challenge, the Artificial Intelligence for Detection of Explosive Devices - eXtended (AIDEDeX) project proposes to design a heterogeneous multi-robot system with advanced sensor fusion algorithms. This system is specifically designed to detect and classify improvised explosive devices, explosive ordnances, and landmines. This project integrates specialised sensors, including electromagnetic induction, ground penetrating radar, X-Ray backscatter imaging, Raman spectrometers, and multimodal cameras, to achieve comprehensive threat identification and localisation. The proposed system comprises a fleet of unmanned ground vehicles and unmanned aerial vehicles. This article details the operational phases of the AIDEDeX system, from rapid terrain exploration using unmanned aerial vehicles to specialised detection and classification by unmanned ground vehicles equipped with a robotic manipulator. Initially focusing on a centralised approach, the project will also explore the potential of a decentralised control architecture, taking inspiration from swarm robotics to provide a robust, adaptable, and scalable solution for explosive detection.

Autonomous Forest Inventory with Legged Robots: System Design and Field Deployment

Authors: Matías Mattamala, Nived Chebrolu, Benoit Casseau, Leonard Freißmuth, Jonas Frey, Turcan Tuna, Marco Hutter, Maurice Fallon

Link: http://arxiv.org/abs/2404.14157v1

Abstract: We present a solution for autonomous forest inventory with a legged robotic platform. Compared to their wheeled and aerial counterparts, legged platforms offer an attractive balance of endurance and low soil impact for forest applications. In this paper, we present the complete system architecture of our forest inventory solution, which includes state estimation, navigation, mission planning, and real-time tree segmentation and trait estimation. We present preliminary results for three campaigns in forests in Finland and the UK and summarize the main outcomes, lessons, and challenges. Our UK experiment at the Forest of Dean with the ANYmal D legged platform achieved an autonomous survey of a 0.96-hectare plot in 20 min, identifying over 100 trees with a typical DBH accuracy of 2 cm.

A participatory design approach to using social robots for elderly care

Authors: Barbara Sienkiewicz, Zuzanna Radosz-Knawa, Bipin Indurkhya

Link: http://arxiv.org/abs/2404.14134v1

Abstract: We present our ongoing research on applying a participatory design approach to using social robots for elderly care. Our approach involves four different groups of stakeholders: the elderly, (non-professional) caregivers, medical professionals, and psychologists. We focus on card sorting and storyboarding techniques to elicit the concerns of the stakeholders towards deploying social robots for elderly care. This is followed by semi-structured interviews to assess their attitudes towards social robots individually. Then we are conducting two-stage workshops with different elderly groups to understand how to engage them with the technology and to identify the challenges in this task.

2024-04-18

RoboDreamer: Learning Compositional World Models for Robot Imagination

Authors: Siyuan Zhou, Yilun Du, Jiaben Chen, Yandong Li, Dit-Yan Yeung, Chuang Gan

Link: http://arxiv.org/abs/2404.12377v1

Abstract: Text-to-video models have demonstrated substantial potential in robotic decision-making, enabling the imagination of realistic plans of future actions as well as accurate environment simulation. However, one major issue in such models is generalization -- models are limited to synthesizing videos subject to language instructions similar to those seen at training time. This is heavily limiting in decision-making, where we seek a powerful world model to synthesize plans of unseen combinations of objects and actions in order to solve previously unseen tasks in new environments. To resolve this issue, we introduce RoboDreamer, an innovative approach for learning a compositional world model by factorizing the video generation. We leverage the natural compositionality of language to parse instructions into a set of lower-level primitives, on which we condition a set of models to generate videos. We illustrate how this factorization naturally enables compositional generalization by allowing us to formulate a new natural language instruction as a combination of previously seen components. We further show how such a factorization enables us to add additional multimodal goals, allowing us to specify a video we wish to generate given both natural language instructions and a goal image. Our approach can successfully synthesize video plans for unseen goals on RT-X, enables successful robot execution in simulation, and substantially outperforms monolithic baseline approaches to video generation.
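One way to picture the factorization is that each parsed primitive conditions its own prediction, and the predictions are then composed. The sketch below reduces composition to an average in prediction space; the string-splitting parser and the `model(video, prompt=...)` interface are our placeholders, not RoboDreamer's components.

```python
# Toy compositional conditioning: one model pass per parsed primitive.
import torch

def parse_primitives(instruction):
    # stand-in parser: "pick up the cup and place it on the shelf"
    # -> ["pick up the cup", "place it on the shelf"]
    return [p.strip() for p in instruction.split(" and ")]

def composed_prediction(model, noisy_video, instruction):
    primitives = parse_primitives(instruction)
    preds = [model(noisy_video, prompt=p) for p in primitives]  # condition on each primitive
    return torch.stack(preds).mean(dim=0)                       # compose the predictions
```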

ASID: Active Exploration for System Identification in Robotic Manipulation

Authors: Marius Memmel, Andrew Wagenmaker, Chuning Zhu, Patrick Yin, Dieter Fox, Abhishek Gupta

Link: http://arxiv.org/abs/2404.12308v1

Abstract: Model-free control strategies such as reinforcement learning have shown the ability to learn control strategies without requiring an accurate model or simulator of the world. While this is appealing due to the lack of modeling requirements, such methods can be sample inefficient, making them impractical in many real-world domains. On the other hand, model-based control techniques leveraging accurate simulators can circumvent these challenges and use a large amount of cheap simulation data to learn controllers that can effectively transfer to the real world. The challenge with such model-based techniques is the requirement for an extremely accurate simulation, requiring both the specification of appropriate simulation assets and physical parameters. This requires considerable human effort to design for every environment being considered. In this work, we propose a learning system that can leverage a small amount of real-world data to autonomously refine a simulation model and then plan an accurate control strategy that can be deployed in the real world. Our approach critically relies on utilizing an initial (possibly inaccurate) simulator to design effective exploration policies that, when deployed in the real world, collect high-quality data. We demonstrate the efficacy of this paradigm in identifying articulation, mass, and other physical parameters in several challenging robotic manipulation tasks, and illustrate that only a small amount of real-world data can allow for effective sim-to-real transfer. Project website at https://weirdlabuw.github.io/asid
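The system-identification step boils down to fitting simulator parameters so that simulated rollouts reproduce the real trajectories collected by the exploration policy. A self-contained toy version, with a 1-D point mass standing in for the simulator and all names and constants assumed:

```python
# Toy system identification: fit sim parameters to match a "real" trajectory.
import numpy as np
from scipy.optimize import minimize

def sim_rollout(params, actions, dt=0.01):
    """1-D point mass with params = (mass, friction); stand-in for a full simulator."""
    mass, friction = params
    x, v, traj = 0.0, 0.0, []
    for a in actions:
        v += (a - friction * v) / mass * dt
        x += v * dt
        traj.append(x)
    return np.array(traj)

def identify(real_traj, actions, init=(1.0, 0.1)):
    loss = lambda p: np.mean((sim_rollout(p, actions) - real_traj) ** 2)
    return minimize(loss, init, method="Nelder-Mead").x

actions = np.sin(np.linspace(0, 3, 100))
real = sim_rollout((2.0, 0.3), actions)   # pretend this came from the physical robot
print(identify(real, actions))            # recovers roughly (2.0, 0.3)
```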

RISE: 3D Perception Makes Real-World Robot Imitation Simple and Effective

Authors: Chenxi Wang, Hongjie Fang, Hao-Shu Fang, Cewu Lu

Link: http://arxiv.org/abs/2404.12281v1

Abstract: Precise robot manipulations require rich spatial information in imitation learning. Image-based policies model object positions from fixed cameras, which are sensitive to camera view changes. Policies utilizing 3D point clouds usually predict keyframes rather than continuous actions, posing difficulty in dynamic and contact-rich scenarios. To utilize 3D perception efficiently, we present RISE, an end-to-end baseline for real-world imitation learning, which predicts continuous actions directly from single-view point clouds. It compresses the point cloud to tokens with a sparse 3D encoder. After adding sparse positional encoding, the tokens are featurized using a transformer. Finally, the features are decoded into robot actions by a diffusion head. Trained with 50 demonstrations for each real-world task, RISE surpasses currently representative 2D and 3D policies by a large margin, showcasing significant advantages in both accuracy and efficiency. Experiments also demonstrate that RISE is more general and robust to environmental change compared with previous baselines. Project website: rise-policy.github.io.

Hybrid Dynamics Modeling and Trajectory Planning for a Cable-Trailer System with a Quadruped Robot

Authors: Wentao Zhang, Shaohang Xu, Gewei Zuo, Lijun Zhu

Link: http://arxiv.org/abs/2404.12220v1

Abstract: Inspired by the utilization of dogs in sled-pulling for transportation, we introduce a cable-trailer system with a quadruped robot. The motion planning of the proposed robot system presents challenges arising from the nonholonomic constraints of the trailer, system underactuation, and hybrid interaction through the cable. To tackle these challenges, we develop a hybrid dynamics model that accounts for the cable's taut/slack status. Since it is computationally intensive to directly optimize the trajectory, we first propose a search algorithm to compute a sub-optimal trajectory as the initial solution. Then, a novel collision avoidance constraint based on the geometric shapes of objects is proposed to formulate the trajectory optimization problem for the hybrid system. The proposed trajectory planning method is implemented on a Unitree A1 quadruped robot with a customized cable-trailer and validated through experiments.
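The taut/slack cable model is the key modeling choice here: tension exists only when the cable is stretched past its rest length, which is what makes the dynamics hybrid. A toy two-mode force function with assumed constants:

```python
# Toy hybrid cable force: slack mode transmits nothing, taut mode acts like a spring.
import numpy as np

CABLE_LEN, STIFFNESS = 1.0, 200.0   # rest length [m] and stiffness [N/m] (assumed)

def cable_force_on_robot(p_robot, p_trailer):
    d = p_trailer - p_robot
    dist = np.linalg.norm(d)
    stretch = dist - CABLE_LEN
    if stretch <= 0.0:                       # slack: no force through the cable
        return np.zeros(2)
    return STIFFNESS * stretch * d / dist    # taut: tension pulls the robot toward the trailer

print(cable_force_on_robot(np.array([0.0, 0.0]), np.array([1.3, 0.0])))  # -> [60.  0.]
```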

Automated Real-Time Inspection in Indoor and Outdoor 3D Environments with Cooperative Aerial Robots

Authors: Andreas Anastasiou, Angelos Zacharia, Savvas Papaioannou, Panayiotis Kolios, Christos G. Panayiotou, Marios M. Polycarpou

Link: http://arxiv.org/abs/2404.12018v1

Abstract: This work introduces a cooperative inspection system designed to efficiently control and coordinate a team of distributed heterogeneous UAV agents for the inspection of 3D structures in cluttered, unknown spaces. Our proposed approach employs a two-stage innovative methodology. Initially, it leverages the complementary sensing capabilities of the robots to cooperatively map the unknown environment. It then generates optimized, collision-free inspection paths, thereby ensuring comprehensive coverage of the structure's surface area. The effectiveness of our system is demonstrated through qualitative and quantitative results from extensive Gazebo-based simulations that closely replicate real-world inspection scenarios, highlighting its ability to thoroughly inspect real-world-like 3D structures.

CelluloTactix: Towards Empowering Collaborative Online Learning through Tangible Haptic Interaction with Cellulo Robots

Authors: Hasaru Kariyawasam, Wafa Johal

Link: http://arxiv.org/abs/2404.11876v1

Abstract: Online learning has soared in popularity in the educational landscape of COVID-19 and carries the benefits of increased flexibility and access to far-away training resources. However, it also restricts communication between peers and teachers, limits physical interactions, and confines learning to the computer screen and keyboard. In this project, we designed a novel way to engage students in collaborative online learning by using haptic-enabled tangible robots, Cellulo. We built a library which connects two robots remotely for a learning activity based around the structure of a biological cell. To discover how separate modes of haptic feedback might differentially affect collaboration, two modes of haptic force-feedback were implemented (haptic co-location and haptic consensus). In a case study, we found that the haptic co-location mode seemed to stimulate collectivist behaviour to a greater extent than the haptic consensus mode, which was associated with individualism and less interaction. While the haptic co-location mode seemed to encourage information pooling, participants using the haptic consensus mode tended to focus more on technical co-ordination. This work introduces a novel system that can provide interesting insights on how to integrate haptic feedback into collaborative remote learning activities in the future.

Reinforcement Learning of Multi-robot Task Allocation for Multi-object Transportation with Infeasible Tasks

Authors: Yuma Shida, Tomohiko Jimbo, Tadashi Odashima, Takamitsu Matsubara

Link: http://arxiv.org/abs/2404.11817v1

Abstract: Multi-object transport using multi-robot systems has the potential for diverse practical applications such as delivery services owing to its efficient individual and scalable cooperative transport. However, allocating transportation tasks of objects with unknown weights remains challenging. Moreover, the presence of infeasible tasks (untransportable objects) can lead to robot stoppage (deadlock). This paper proposes a framework for dynamic task allocation that involves storing task experiences for each task in a scalable manner with respect to the number of robots. First, these experiences are broadcasted from the cloud server to the entire robot system. Subsequently, each robot learns the exclusion levels for each task based on those task experiences, enabling it to exclude infeasible tasks and reset its task priorities. Finally, individual transportation, cooperative transportation, and the temporary exclusion of tasks considered infeasible are achieved. The scalability and versatility of the proposed method were confirmed through numerical experiments with an increased number of robots and objects, including unlearned weight objects. The effectiveness of the temporary deadlock avoidance was also confirmed by introducing additional robots within an episode. The proposed method enables the implementation of task allocation strategies that are feasible for different numbers of robots and various transport tasks without prior consideration of feasibility.
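The exclusion-level mechanism can be pictured as a per-task counter that rises with broadcast failure experiences and decays over time, so infeasible tasks are only temporarily excluded. A schematic sketch; the threshold and decay rate are our assumptions:

```python
# Schematic exclusion levels for dynamic task allocation.
from collections import defaultdict

class TaskAllocator:
    def __init__(self, threshold=3.0, decay=0.99):
        self.exclusion = defaultdict(float)   # task id -> exclusion level
        self.threshold = threshold
        self.decay = decay

    def report(self, task, succeeded):
        if succeeded:
            self.exclusion[task] = 0.0
        else:
            self.exclusion[task] += 1.0       # broadcast failure experiences raise the level

    def feasible(self, tasks):
        for t in tasks:
            self.exclusion[t] *= self.decay   # decay makes the exclusion temporary
        return [t for t in tasks if self.exclusion[t] < self.threshold]

alloc = TaskAllocator()
for _ in range(4):
    alloc.report("heavy_box", succeeded=False)     # deadlock-prone, untransportable object
print(alloc.feasible(["heavy_box", "light_box"]))  # -> ['light_box']
```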

2024-04-17

Spatio-Temporal Motion Retargeting for Quadruped Robots

Authors: Taerim Yoon, Dongho Kang, Seungmin Kim, Minsung Ahn, Stelian Coros, Sungjoon Choi

Link: http://arxiv.org/abs/2404.11557v1

Abstract: This work introduces a motion retargeting approach for legged robots, which aims to create motion controllers that imitate the fine behavior of animals. Our approach, namely spatio-temporal motion retargeting (STMR), guides imitation learning procedures by transferring motion from source to target, effectively bridging the morphological disparities by ensuring the feasibility of imitation on the target system. Our STMR method comprises two components: spatial motion retargeting (SMR) and temporal motion retargeting (TMR). On the one hand, SMR tackles motion retargeting at the kinematic level by generating kinematically feasible whole-body motions from keypoint trajectories. On the other hand, TMR aims to retarget motion at the dynamic level by optimizing motion in the temporal domain. We showcase the effectiveness of our method in facilitating Imitation Learning (IL) for complex animal movements through a series of simulation and hardware experiments. In these experiments, our STMR method successfully tailored complex animal motions from various media, including video captured by a hand-held camera, to fit the morphology and physical properties of the target robots. This enabled RL policy training for precise motion tracking, while baseline methods struggled with highly dynamic motion involving flying phases. Moreover, we validated that the control policy can successfully imitate six different motions in two quadruped robots with different dimensions and physical properties in real-world settings.

Runtime Verification and Field Testing for ROS-Based Robotic Systems

Authors: Ricardo Caldas, Juan Antonio Piñera García, Matei Schiopu, Patrizio Pelliccione, Genaína Rodrigues, Thorsten Berger

Link: http://arxiv.org/abs/2404.11498v1

Abstract: Robotic systems are becoming pervasive and adopted in increasingly many domains, such as manufacturing, healthcare, and space exploration. To this end, engineering software has emerged as a crucial discipline for building maintainable and reusable robotic systems. Robotics software engineering research has received increasing attention, fostering autonomy as a fundamental goal. However, robotics developers still struggle to achieve this goal because simulation cannot realistically emulate real-world phenomena. Robots also need to operate in unpredictable and uncontrollable environments, which require safe and trustworthy self-adaptation capabilities implemented in software. Typical techniques to address these challenges are runtime verification, field-based testing, and mitigation techniques that enable fail-safe solutions. However, there is no clear guidance on architecting ROS-based systems to enable and facilitate runtime verification and field-based testing. This paper aims to fill this gap by providing guidelines that can help developers and QA teams when developing, verifying, or testing their robots in the field. These guidelines are carefully tailored to address the challenges and requirements of testing robotic systems in real-world scenarios. We conducted a literature review on studies addressing runtime verification and field-based testing for robotic systems, mined ROS-based application repositories, and validated the applicability, clarity, and usefulness of the guidelines via two questionnaires with 55 responses. We contribute 20 guidelines formulated for researchers and practitioners in robotic software engineering. Finally, we map our guidelines to open challenges in runtime verification and field-based testing for ROS-based systems and outline promising research directions in the field.
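In the spirit of these guidelines, a runtime monitor is often just another ROS node that checks a property online and reports violations so a fail-safe can take over. The sketch below is our own example, not one of the paper's 20 guidelines: it watches a laser topic's freshness with rospy and publishes an alert when the rate property is violated.

```python
#!/usr/bin/env python
# Example runtime monitor: flag a stale sensor topic and request a fail-safe.
import rospy
from sensor_msgs.msg import LaserScan
from std_msgs.msg import String

class RateMonitor:
    def __init__(self, topic="/scan", min_hz=5.0):
        self.max_gap = 1.0 / min_hz
        self.last_stamp = rospy.get_time()
        rospy.Subscriber(topic, LaserScan, self.on_msg)
        self.alert = rospy.Publisher("/monitor/violations", String, queue_size=1)
        rospy.Timer(rospy.Duration(0.5), self.check)

    def on_msg(self, _msg):
        self.last_stamp = rospy.get_time()

    def check(self, _event):
        gap = rospy.get_time() - self.last_stamp
        if gap > self.max_gap:
            self.alert.publish(String(data="scan stale for %.2fs" % gap))
            rospy.logwarn("runtime monitor: rate property violated, requesting fail-safe")

if __name__ == "__main__":
    rospy.init_node("rate_monitor")
    RateMonitor()
    rospy.spin()
```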

Milling using two mechatronically coupled robots

Authors: Max Goebels, Jan Baumgärtner, Tobias Fuchs, Edgar Mühlbeier, Alexander Puchta, Jürgen Fleischer

Link: http://arxiv.org/abs/2404.11271v1

Abstract: Industrial robots are commonly used in various industries due to their flexibility. However, their adoption for machining tasks is minimal because of the low dynamic stiffness characteristic of serial kinematic chains. To overcome this problem, we propose coupling two industrial robots at the flanges to form a parallel kinematic machining system. Although parallel kinematic chains are inherently stiffer, one possible disadvantage of the proposed system is that it is heavily overactuated. We perform a modal analysis to show that this may be an advantage, as the redundant degrees of freedom can be used to shift the natural frequencies by applying tension to the coupling module. To demonstrate the validity of our approach, we perform a milling experiment using our coupled system. An external measurement system is used to show that tensioning the coupling module causes a deformation of the system. We further show that this deformation is static over the tool path and can be compensated for.

Towards Human Awareness in Robot Task Planning with Large Language Models

Authors: Yuchen Liu, Luigi Palmieri, Sebastian Koch, Ilche Georgievski, Marco Aiello

Link: http://arxiv.org/abs/2404.11267v1

Abstract: The recent breakthroughs in the research on Large Language Models (LLMs) have triggered a transformation across several research domains. Notably, the integration of LLMs has greatly enhanced performance in robot Task And Motion Planning (TAMP). However, previous approaches often neglect the consideration of dynamic environments, i.e., the presence of dynamic objects such as humans. In this paper, we propose a novel approach to address this gap by incorporating human awareness into LLM-based robot task planning. To obtain an effective representation of the dynamic environment, our approach integrates humans' information into a hierarchical scene graph. To ensure the plan's executability, we leverage LLMs to ground the environmental topology and actionable knowledge into formal planning language. Most importantly, we use LLMs to predict future human activities and plan tasks for the robot considering the predictions. Our contribution facilitates the integration of human awareness into LLM-driven robot task planning and paves the way for proactive robot decision-making in dynamic environments.

"That's our game!" : Reflections on co-designing a robotic game with neurodiverse children

Authors: Patricia Piedade, Isabel Neto, Ana Pires, Rui Prada, Hugo Nicolau

Link: http://arxiv.org/abs/2404.11252v1

Abstract: Many neurodivergent (ND) children are integrated into mainstream schools alongside their neurotypical (NT) peers. However, they often face social exclusion, which may have lifelong effects. Inclusive play activities can be a strong driver of inclusion. Unfortunately, games designed for the specific needs of neurodiverse groups, those that include neurodivergent and neurotypical individuals, are scarce. Given the potential of robots as engaging devices, we led a 6-month co-design process to build an inclusive and entertaining robotic game for neurodiverse classrooms. We first interviewed neurodivergent adults and educators to identify the barriers and facilitators for including neurodivergent children in mainstream classrooms. Then, we conducted five co-design sessions, engaging four neurodiverse classrooms with 81 children (19 neurodivergent). We present a reflection on our co-design process and the resulting robotic game through the lens of Self-Determination Theory, discussing how our methodology supported the intrinsic motivations of neurodivergent children.

Accuracy and repeatability of a parallel robot for personalised minimally invasive surgery

Authors: Doina Pisla, Paul Tucan, Damien Chablat, Nadim Al Hajjar, Andra Ciocan, Adrian Pisla, Alexandru Pusca, Corina Radu, Grigore Pop, Bogdan Gherman

Link: http://arxiv.org/abs/2404.11140v1

Abstract: The paper presents the methodology used for accuracy and repeatability measurements of the experimental model of a parallel robot developed for surgical applications. The experimental setup uses a motion tracking system (for accuracy) and a high-precision position measuring arm (for repeatability). The accuracy was obtained by comparing the trajectory data from the experimental measurement with a baseline trajectory defined with the kinematic models of the parallel robotic system. The repeatability was experimentally determined by repeatedly moving the robot platform to predefined points.

Empowering Large Language Models on Robotic Manipulation with Affordance Prompting

Authors: Guangran Cheng, Chuheng Zhang, Wenzhe Cai, Li Zhao, Changyin Sun, Jiang Bian

Link: http://arxiv.org/abs/2404.11027v1

Abstract: While large language models (LLMs) are successful in completing various language processing tasks, they easily fail to interact with the physical world by generating control sequences properly. We find that the main reason is that LLMs are not grounded in the physical world. Existing LLM-based approaches circumvent this problem by relying on additional pre-defined skills or pre-trained sub-policies, making it hard to adapt to new tasks. In contrast, we aim to address this problem and explore the possibility of prompting pre-trained LLMs to accomplish a series of robotic manipulation tasks in a training-free paradigm. Accordingly, we propose a framework called LLM+A(ffordance), where the LLM serves as both the sub-task planner (generating high-level plans) and the motion controller (generating low-level control sequences). To ground these plans and control sequences in the physical world, we develop an affordance prompting technique that stimulates the LLM to 1) predict the consequences of generated plans and 2) generate affordance values for relevant objects. Empirically, we evaluate the effectiveness of LLM+A on various language-conditioned robotic manipulation tasks; the results show that our approach substantially improves performance by enhancing the feasibility of the generated plans and control sequences, and can easily generalize to different environments.
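At its core, affordance prompting asks the LLM to predict consequences and score object affordances alongside the plan, so the low-level controller can be grounded on those values. A rough prompt-construction sketch; the template wording is our assumption and `llm` is any text-completion callable:

```python
# Sketch of an affordance-style prompt (wording is illustrative only).
def affordance_prompt(task, objects):
    return (
        f"Task: {task}\n"
        f"Objects in the scene: {', '.join(objects)}\n"
        "1) Propose a high-level plan as numbered sub-tasks.\n"
        "2) For each sub-task, predict its physical consequence.\n"
        "3) For the current sub-task, assign each object an affordance value in [0, 1],\n"
        "   formatted as 'object: value', so low-level control can be grounded on it.\n"
    )

def plan_with_affordances(llm, task, objects):
    reply = llm(affordance_prompt(task, objects))
    # parsing of the plan and affordance values would follow here
    return reply
```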

OVAL-Prompt: Open-Vocabulary Affordance Localization for Robot Manipulation through LLM Affordance-Grounding

Authors: Edmond Tong, Anthony Opipari, Stanley Lewis, Zhen Zeng, Odest Chadwicke Jenkins

Link: http://arxiv.org/abs/2404.11000v1

Abstract: In order for robots to interact with objects effectively, they must understand the form and function of each object they encounter. Essentially, robots need to understand which actions each object affords, and where those affordances can be acted on. Robots are ultimately expected to operate in unstructured human environments, where the set of objects and affordances is not known to the robot before deployment (i.e. the open-vocabulary setting). In this work, we introduce OVAL-Prompt, a prompt-based approach for open-vocabulary affordance localization in RGB-D images. By leveraging a Vision Language Model (VLM) for open-vocabulary object part segmentation and a Large Language Model (LLM) to ground each part-segment-affordance, OVAL-Prompt demonstrates generalizability to novel object instances, categories, and affordances without domain-specific finetuning. Quantitative experiments demonstrate that without any finetuning, OVAL-Prompt achieves localization accuracy that is competitive with supervised baseline models. Moreover, qualitative experiments show that OVAL-Prompt enables affordance-based robot manipulation of open-vocabulary object instances and categories.

Machine-Learning-Enhanced Soft Robotic System Inspired by Rectal Functions for Investigating Fecal Incontinence

Authors: Zebing Mao, Sota Suzuki, Hiroyuki Nabae, Shoko Miyagawa, Koichi Suzumori, Shingo Maeda

Link: http://arxiv.org/abs/2404.10999v1

Abstract: Fecal incontinence, arising from a myriad of pathogenic mechanisms, has attracted considerable global attention. Despite its significance, the replication of the defecatory system for studying fecal incontinence mechanisms remains limited, largely due to social stigma and taboos. Inspired by the rectum's functionalities, we have developed a soft robotic system encompassing a power supply, pressure sensing, data acquisition systems, a flushing mechanism, a stage, and a rectal module. The innovative soft rectal module includes actuators inspired by sphincter muscles, both soft and rigid covers, and a soft rectum mold. The rectal mold, fabricated from materials that closely mimic human rectal tissue, is produced using the mold replication fabrication method. Both the soft and rigid components of the mold are realized through the application of 3D-printing technology. The sphincter-muscle-inspired actuators, featuring double-layer pouch structures, are modeled and optimized using multilayer perceptron methods, aiming to obtain high contraction ratios (100%), high generated pressure (9.8 kPa), and short recovery time (3 s). Upon assembly, this defecation robot is capable of smoothly expelling liquid faeces, performing controlled solid fecal cutting, and defecating extremely solid long faeces, thus closely replicating the functions of the human rectum and anal canal. This defecation robot has the potential to assist humans in understanding the complex defecation system and contribute to the development of well-being devices related to defecation.

2024-04-16

Safety-critical Autonomous Inspection of Distillation Columns using Quadrupedal Robots Equipped with Roller Arms

Authors: Jaemin Lee, Jeeseop Kim, Aaron D. Ames

Link: http://arxiv.org/abs/2404.10938v1

Abstract: This paper proposes a comprehensive framework designed for the autonomous inspection of complex environments, with a specific focus on multi-tiered settings such as distillation column trays. Leveraging quadruped robots equipped with roller arms, and through the use of onboard perception, we integrate essential motion components including: locomotion, safe and dynamic transitions between trays, and intermediate motions that bridge a variety of motion primitives. Given the slippery and confined nature of column trays, it is critical to ensure safety of the robot during inspection, therefore we employ a safety filter and footstep re-planning based upon control barrier function representations of the environment. Our framework integrates all system components into a state machine encoding the developed safety-critical planning and control elements to guarantee safety-critical autonomy, enabling autonomous and safe navigation and inspection of distillation columns. Experimental validation in an environment, consisting of industrial-grade chemical distillation trays, highlights the effectiveness of our multi-layered architecture.

SPONGE: Open-Source Designs of Modular Articulated Soft Robots

Authors: Tim-Lukas Habich, Jonas Haack, Mehdi Belhadj, Dustin Lehmann, Thomas Seel, Moritz Schappler

Link: http://arxiv.org/abs/2404.10734v1

Abstract: Soft-robot designs are manifold, but only a few are publicly available. Often, these are only briefly described in their publications. This complicates reproduction and hinders the reproducibility and comparability of research results. If the designs were uniform and open source, validating researched methods on real benchmark systems would be possible. To address this, we present two variants of a soft pneumatic robot with antagonistic bellows as open source. Starting from a semi-modular design with multiple cables and tubes routed through the robot body, the transition to a fully modular robot with integrated microvalves and serial communication is highlighted. Modularity in terms of stackability, actuation, and communication is achieved, which is the crucial requirement for building soft robots with many degrees of freedom and high dexterity for real-world tasks. Both systems are compared regarding their respective advantages and disadvantages. The robots' functionality is demonstrated in experiments on airtightness, gravitational influence, position control with mean tracking errors of <3 deg, and long-term operation of cast and printed bellows. All soft- and hardware files required for reproduction are provided.

SCALE: Self-Correcting Visual Navigation for Mobile Robots via Anti-Novelty Estimation

Authors: Chang Chen, Yuecheng Liu, Yuzheng Zhuang, Sitong Mao, Kaiwen Xue, Shunbo Zhou

Link: http://arxiv.org/abs/2404.10675v1

Abstract: Although visual navigation has been extensively studied using deep reinforcement learning, online learning for real-world robots remains a challenging task. Recent work directly learned from offline datasets to achieve broader generalization in real-world tasks, which, however, faces the out-of-distribution (OOD) issue and potential robot localization failures in a given map for unseen observations. This significantly lowers success rates and can even induce collisions. In this paper, we present a self-correcting visual navigation method, SCALE, that can autonomously prevent the robot from entering OOD situations without human intervention. Specifically, we develop an image-goal-conditioned offline reinforcement learning method based on implicit Q-learning (IQL). When facing an OOD observation, our novel localization recovery method generates potential future trajectories by learning from the navigation affordance, and estimates the future novelty via random network distillation (RND). A tailored cost function searches for the candidates with the least novelty that can lead the robot to familiar places. We collect offline data and conduct evaluation experiments in three real-world urban scenarios. Experiment results show that SCALE outperforms the previous state-of-the-art methods for open-world navigation with a unique capability of localization recovery, significantly reducing the need for human intervention. Code is available at https://github.com/KubeEdge4Robotics/ScaleNav.
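Random network distillation, which the method uses to estimate future novelty, scores an observation by how poorly a trained predictor matches a fixed, randomly initialized target network; familiar places yield low error. A compact PyTorch sketch with assumed layer sizes:

```python
# Compact RND novelty estimator (layer sizes are assumptions).
import torch
import torch.nn as nn

class RND(nn.Module):
    def __init__(self, obs_dim=64, feat_dim=32):
        super().__init__()
        self.target = nn.Sequential(nn.Linear(obs_dim, 128), nn.ReLU(), nn.Linear(128, feat_dim))
        self.predictor = nn.Sequential(nn.Linear(obs_dim, 128), nn.ReLU(), nn.Linear(128, feat_dim))
        for p in self.target.parameters():
            p.requires_grad_(False)           # the target net stays fixed and random

    def novelty(self, obs):
        # large prediction error = unfamiliar (likely out-of-distribution) observation
        return (self.predictor(obs) - self.target(obs)).pow(2).mean(dim=-1)

rnd = RND()
print(rnd.novelty(torch.randn(4, 64)))        # pick candidate trajectories minimizing this
```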

A Longitudinal Study of Child Wellbeing Assessment via Online Interactions with a Social Robot

Authors: Nida Itrat Abbasi, Guy Laban, Tasmin Ford, Peter B. Jones, Hatice Gunes

Link: http://arxiv.org/abs/2404.10593v1

Abstract: Socially Assistive Robots are studied in different Child-Robot Interaction settings. However, logistical constraints limit accessibility, particularly affecting timely support for mental wellbeing. In this work, we have investigated whether online interactions with a robot can be used for the assessment of mental wellbeing in children. The children (N=40, 20 girls and 20 boys; 8-13 years) interacted with the Nao robot (30-45 mins) over three sessions, at least a week apart. Audio-visual recordings were collected throughout the sessions, which concluded with the children answering user perception questionnaires pertaining to their anxiety towards the robot and the robot's abilities. We divided the participants into three wellbeing clusters (low, medium, and high tertiles) using their responses to the Short Moods and Feelings Questionnaire (SMFQ) and further analysed how their wellbeing and their perceptions of the robot changed across the wellbeing tertiles, across sessions, and across participants' gender. Our primary findings suggest that (I) online mediated interactions with robots can be effective in assessing children's mental wellbeing over time, and (II) children's overall perception of the robot either improved or remained consistent across time. Supplementary exploratory analyses also revealed that gender affected the children's wellbeing assessments as well as their perceptions of the robot.

MPCOM: Robotic Data Gathering with Radio Mapping and Model Predictive Communication

Authors: Zhiyou Ji, Guoliang Li, Ruihua Han, Shuai Wang, Bing Bai, Wei Xu, Kejiang Ye, Chengzhong Xu

Link: http://arxiv.org/abs/2404.10541v1

Abstract: Robotic data gathering (RDG) is an emerging paradigm that navigates a robot to harvest data from remote sensors. However, motion planning in this paradigm needs to maximize the RDG efficiency instead of the navigation efficiency, for which existing motion planning methods become inefficient, as they plan robot trajectories merely according to motion factors. This paper proposes radio-map-guided model predictive communication (MPCOM), which navigates the robot with both grid and radio maps for shape-aware collision avoidance and communication-aware trajectory generation in a dynamic environment. The proposed MPCOM is able to trade off the time spent reaching the goal, avoiding collisions, and improving communication. MPCOM captures high-order signal propagation characteristics using radio maps and incorporates the map-guided communication regularizer into the motion planning block. Experiments in the IRSIM and CARLA simulators show that the proposed MPCOM outperforms other benchmarks in both LOS and NLOS cases. Real-world testing based on car-like robots is also provided to demonstrate the effectiveness of MPCOM in indoor environments.
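The trade-off described above can be pictured as a trajectory score that adds a radio-map regularizer to the usual motion terms. A toy scoring function; the weights and radio-map interface are our assumptions, and the real method additionally handles shape-aware collision constraints:

```python
# Toy communication-aware trajectory scoring with a radio-map regularizer.
import numpy as np

def trajectory_cost(traj, goal, radio_map, w_goal=1.0, w_comm=0.5):
    """traj: (T, 2) candidate positions; radio_map(p) -> predicted signal quality."""
    goal_cost = np.linalg.norm(traj[-1] - goal)          # reach the goal
    comm_cost = -np.mean([radio_map(p) for p in traj])   # prefer good-signal regions
    return w_goal * goal_cost + w_comm * comm_cost

radio_map = lambda p: np.exp(-np.linalg.norm(p - np.array([2.0, 2.0])))  # toy hotspot
candidates = [np.linspace([0, 0], [4, 4], 20), np.linspace([0, 0], [4, 0], 20)]
best = min(candidates, key=lambda t: trajectory_cost(t, np.array([4.0, 4.0]), radio_map))
print(best[-1])  # the path passing near the hotspot and ending at the goal wins
```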

2024-04-15

Debunking Robot Rights Metaphysically, Ethically, and Legally

Authors: Abeba Birhane, Jelle van Dijk, Frank Pasquale

Link: http://arxiv.org/abs/2404.10072v1

Abstract: In this work we challenge arguments for robot rights on metaphysical, ethical and legal grounds. Metaphysically, we argue that machines are not the kinds of things that may be denied or granted rights. Building on theories of phenomenology and post-Cartesian approaches to cognitive science, we ground our position in the lived reality of actual humans in an increasingly ubiquitously connected, controlled, digitized, and surveilled society. Ethically, we argue that, given machines' current and potential harms to the most marginalized in society, limits on (rather than rights for) machines should be at the centre of the current AI ethics debate. From a legal perspective, the best analogy to robot rights is not human rights but corporate rights, a highly controversial concept whose most important effect has been the undermining of worker, consumer, and voter rights by advancing the power of capital to exercise outsized influence on politics and law. The idea of robot rights, we conclude, acts as a smoke screen, allowing theorists and futurists to fantasize about benevolently sentient machines with unalterable needs and desires protected by law. While such fantasies have motivated fascinating fiction and art, once they influence legal theory and practice articulating the scope of rights claims, they threaten to immunize from legal accountability the current AI and robotics that are fuelling surveillance capitalism, accelerating environmental destruction, and entrenching injustice and human suffering.

Robot Positioning Using Torus Packing for Multisets

Authors: Chung Shue Chen, Peter Keevash, Sean Kennedy, Élie de Panafieu, Adrian Vetta

Link: http://arxiv.org/abs/2404.09981v1

Abstract: We consider the design of a positioning system where a robot determines its position from local observations. This is a well-studied problem of considerable practical importance and mathematical interest. The dominant paradigm derives from the classical theory of de Bruijn sequences, where the robot has access to a window within a larger code and can determine its position if these windows are distinct. We propose an alternative model in which the robot has more limited observational powers, which we argue is more realistic in terms of engineering: the robot does not have access to the full pattern of colours (or letters) in the window, but only to the intensity of each colour (or the number of occurrences of each letter). This leads to a mathematically interesting problem with a different flavour to that arising in the classical paradigm, requiring new construction techniques. The parameters of our construction are optimal up to a constant factor, and computing the position requires only a constant number of arithmetic operations.
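
A small worked example clarifies the observation model: the robot sees only the number of occurrences of each letter in its window, so localization works exactly when all window multisets are distinct. The toy cyclic code below is hand-picked to satisfy that property; constructing such codes at scale, with near-optimal parameters, is the paper's contribution.

```python
# Toy multiset-window positioning: the robot observes letter counts, not order.
from collections import Counter

code = "aabcc"  # toy cyclic code over {a, b, c}; real constructions are far longer
W = 3           # window length

def window(i):
    return "".join(code[(i + j) % len(code)] for j in range(W))

table = {}
for i in range(len(code)):
    key = frozenset(Counter(window(i)).items())
    assert key not in table, "all window multisets must be distinct"
    table[key] = i

def locate(observed):
    """observed: mapping letter -> count (colour -> intensity) in the window."""
    return table[frozenset(observed.items())]

print(locate({"a": 2, "c": 1}))  # -> 4: the window "caa", read in any order
```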

Autonomous Path Planning for Intercostal Robotic Ultrasound Imaging Using Reinforcement Learning

Authors: Yuan Bi, Cheng Qian, Zhicheng Zhang, Nassir Navab, Zhongliang Jiang

Link: http://arxiv.org/abs/2404.09927v1

Abstract: Ultrasound (US) has been widely used in daily clinical practice for screening internal organs and guiding interventions. However, due to the acoustic shadow cast by the subcutaneous rib cage, the US examination for thoracic application is still challenging. To fully cover and reconstruct the region of interest in US for diagnosis, an intercostal scanning path is necessary. To tackle this challenge, we present a reinforcement learning (RL) approach for planning scanning paths between ribs to monitor changes in lesions on internal organs, such as the liver and heart, which are covered by rib cages. Structured anatomical information of the human skeleton is crucial for planning these intercostal paths. To obtain such anatomical insight, an RL agent is trained in a virtual environment constructed using computed tomography (CT) templates with randomly initialized tumors of various shapes and locations. In addition, task-specific state representation and reward functions are introduced to ensure the convergence of the training process while minimizing the effects of acoustic attenuation and shadows during scanning. To validate the effectiveness of the proposed approach, experiments have been carried out on unseen CTs with randomly defined single or multiple scanning targets. The results demonstrate the efficiency of the proposed RL framework in planning non-shadowed US scanning trajectories in areas with limited acoustic access.

Facial Features Integration in Last Mile Delivery Robots

Authors: Delgermaa Gankhuyag, Stephanie Groiß, Lena Schwamberger, Özge Talay, Cristina Olaverri-Monreal

Link: http://arxiv.org/abs/2404.09844v1

Abstract: Delivery services have undergone technological advancements, with robots now directly delivering packages to recipients. While these robots are designed for efficient functionality, they have not been specifically designed for interactions with humans. Building on the premise that incorporating human-like characteristics into a robot has the potential to positively impact technology acceptance, this study explores human reactions to a robot featuring facial expressions. The findings indicate a correlation between anthropomorphic features and the observed responses.

Enhancing Robot Explanation Capabilities through Vision-Language Models: a Preliminary Study by Interpreting Visual Inputs for Improved Human-Robot Interaction

Authors: David Sobrín-Hidalgo, Miguel Ángel González-Santamarta, Ángel Manuel Guerrero-Higueras, Francisco Javier Rodríguez-Lera, Vicente Matellán-Olivera

Link: http://arxiv.org/abs/2404.09705v1

Abstract: This paper presents an improved system based on our prior work, designed to create explanations for autonomous robot actions during Human-Robot Interaction (HRI). Previously, we developed a system that used Large Language Models (LLMs) to interpret logs and produce natural language explanations. In this study, we expand our approach by incorporating Vision-Language Models (VLMs), enabling the system to analyze textual logs with the added context of visual input. This method allows for generating explanations that combine data from the robot's logs and the images it captures. We tested this enhanced system on a basic navigation task where the robot needs to avoid a human obstacle. The findings from this preliminary study indicate that adding visual interpretation improves our system's explanations by precisely identifying obstacles and increasing the accuracy of the explanations provided.

A Generic Trajectory Planning Method for Constrained All-Wheel-Steering Robots

Authors: Ren Xin, Hongji Liu, Yingbing Chen, Sheng Wang, Ming Liu

Link: http://arxiv.org/abs/2404.09677v1

Abstract: This paper presents a trajectory planning method for wheeled robots with fixed steering axes while the steering angle of each wheel is constrained. In the past, All-Wheel-Steering (AWS) robots, incorporating modes such as rotation-free translation maneuvers, in-situ rotational maneuvers, and proportional steering, exhibited inefficient performance due to time-consuming mode switches. This inefficiency arises from wheel rotation constraints and inter-wheel cooperation requirements. The direct application of a holonomic moving strategy can lead to significant slip angles or even structural failure. Additionally, the limited steering range of AWS wheeled robots exacerbates nonlinearity issues, thereby complicating control processes. To address these challenges, we developed a novel planning method termed Constrained AWS (C-AWS), which integrates second-order discrete search with predictive control techniques. Experimental results demonstrate that our method adeptly generates feasible and smooth trajectories for C-AWS while adhering to steering angle constraints.

Stiffness-Tuneable Limb Segment with Flexible Spine for Malleable Robots

Authors: Angus B. Clark, Nicolas Rojas

Link: http://arxiv.org/abs/2404.09653v1

Abstract: Robotic arms built from stiffness-adjustable, continuously bending segments serially connected with revolute joints have the ability to change their mechanical architecture and workspace, thus allowing high flexibility and adaptation to different tasks with less than six degrees of freedom, a concept that we call malleable robots. Known stiffening mechanisms may be used to implement suitable links for these novel robotic manipulators; however, these solutions usually show a reduced performance when bending due to structural deformation. By including an inner support structure this deformation can be minimised, resulting in an increased stiffening performance. This paper presents a new multi-material spine-inspired flexible structure for providing support in stiffness-controllable layer-jamming-based robotic links of large diameter. The proposed spine mechanism is highly movable with type and range of motions that match those of a robotic link using solely layer jamming, whilst maintaining a hollow and light structure. The mechanics and design of the flexible spine are explored, and a prototype of a link utilising it is developed and compared with limb segments based on granular jamming and layer jamming without support structure. Results of experiments verify the advantages of the proposed design, demonstrating that it maintains a constant central diameter across bending angles and shows an improvement of more than 203% in resisting force at 180 degrees.

Object Instance Retrieval in Assistive Robotics: Leveraging Fine-Tuned SimSiam with Multi-View Images Based on 3D Semantic Map

Authors: Taichi Sakaguchi, Akira Taniguchi, Yoshinobu Hagiwara, Lotfi El Hafi, Shoichi Hasegawa, Tadahiro Taniguchi

Link: http://arxiv.org/abs/2404.09647v1

Abstract: Robots that assist in daily life are required to locate specific instances of objects that match the user's desired object in the environment. This task is known as Instance-Specific Image Goal Navigation (InstanceImageNav), which requires a model capable of distinguishing between different instances within the same class. One significant challenge in robotics is that when a robot observes the same object from various 3D viewpoints, its appearance may differ greatly, making it difficult to recognize and locate the object accurately. In this study, we introduce a method, SimView, that leverages multi-view images based on a 3D semantic map of the environment and self-supervised learning by SimSiam to train an instance identification model on-site. The effectiveness of our approach is validated using a photorealistic simulator, Habitat Matterport 3D, created by scanning real home environments. Our results demonstrate a 1.7-fold improvement in task accuracy compared to CLIP, a model pre-trained with multimodal contrastive learning, for object search. This improvement highlights the benefits of our proposed fine-tuning method in enhancing the performance of assistive robots in InstanceImageNav tasks. The project website is https://emergentsystemlabstudent.github.io/MultiViewRetrieve/.

Real-world Instance-specific Image Goal Navigation for Service Robots: Bridging the Domain Gap with Contrastive Learning

Authors: Taichi Sakaguchi, Akira Taniguchi, Yoshinobu Hagiwara, Lotfi El Hafi, Shoichi Hasegawa, Tadahiro Taniguchi

Link: http://arxiv.org/abs/2404.09645v1

Abstract: Improving instance-specific image goal navigation (InstanceImageNav), which locates the identical object in a real-world environment from a query image, is essential for robotic systems to assist users in finding desired objects. The challenge lies in the domain gap between low-quality images observed by the moving robot, characterized by motion blur and low resolution, and high-quality query images provided by the user. Such domain gaps could significantly reduce the task success rate but have not been the focus of previous work. To address this, we propose a novel method called Few-shot Cross-quality Instance-aware Adaptation (CrossIA), which employs contrastive learning with an instance classifier to align features between a large set of low-quality images and a few high-quality ones. This approach effectively reduces the domain gap by bringing the latent representations of cross-quality images closer on an instance basis. Additionally, the system integrates an object image collection with a pre-trained deblurring model to enhance the observed image quality. Our method fine-tunes the SimSiam model, pre-trained on ImageNet, using CrossIA. We evaluated our method's effectiveness through an InstanceImageNav task with 20 different types of instances, where the robot identifies the same instance in a real-world environment as a high-quality query image. Our experiments showed that our method improves the task success rate by up to three times compared to the baseline, a conventional approach based on SuperGlue. These findings highlight the potential of leveraging contrastive learning and image enhancement techniques to bridge the domain gap and improve object localization in robotic applications. The project website is https://emergentsystemlabstudent.github.io/DomainBridgingNav/.

An Origami-Inspired Variable Friction Surface for Increasing the Dexterity of Robotic Grippers

Authors: Qiujie Lu, Angus B. Clark, Matthew Shen, Nicolas Rojas

Link: http://arxiv.org/abs/2404.09644v1

Abstract: While the grasping capability of robotic grippers has shown significant development, the ability to manipulate objects within the hand is still limited. One explanation for this limitation is the lack of controlled contact variation between the grasped object and the gripper. For instance, human hands have the ability to firmly grip object surfaces, as well as slide over object faces, an aspect that aids the enhanced manipulation of objects within the hand without losing contact. In this letter, we present a parametric, origami-inspired thin surface capable of transitioning between a high friction and a low friction state, suitable for implementation as an epidermis in robotic fingers. A numerical analysis of the proposed surface based on its design parameters, force analysis, and performance in in-hand manipulation tasks is presented. Through the development of a simple two-fingered two-degree-of-freedom gripper utilizing the proposed variable-friction surfaces with different parameters, we experimentally demonstrate the improved manipulation capabilities of the hand when compared to the same gripper without changeable friction. Results show that the pattern density and valley gap are the main parameters that affect the in-hand manipulation performance. The origami-inspired thin surface with a higher pattern density generated a smaller valley gap and smaller height change, producing a more stable improvement of the manipulation capabilities of the hand.

Towards Robotised Palpation for Cancer Detection through Online Tissue Viscoelastic Characterisation with a Collaborative Robotic Arm

Authors: Luca Beber, Edoardo Lamon, Giacomo Moretti, Daniele Fontanelli, Matteo Saveriano, Luigi Palopoli

Link: http://arxiv.org/abs/2404.09542v1

Abstract: This paper introduces a new method for estimating the penetration of the end effector and the parameters of a soft body using a collaborative robotic arm. This is possible using a dimensionality reduction method that simplifies the Hunt-Crossley model. The parameters can be found without a force sensor thanks to the information of the robotic arm controller. To achieve an online estimation, an extended Kalman filter is employed, which embeds the contact dynamic model. The algorithm is tested with various types of silicone, including samples with hard intrusions to simulate cancerous cells within a soft tissue. The results indicate that this technique can accurately determine the parameters and estimate the penetration of the end effector into the soft body. These promising preliminary results demonstrate the potential for robots to serve as an effective tool for early-stage cancer diagnostics.
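
For readers unfamiliar with the contact model: the Hunt-Crossley force in its common form is F = k·x^n + λ·x^n·ẋ, i.e. nonlinear stiffness plus penetration-dependent damping. The sketch below only illustrates the model structure by fitting its parameters to synthetic indentation data offline; the paper instead estimates them online with an extended Kalman filter and without a force sensor.

```python
# Offline least-squares fit of Hunt-Crossley parameters on synthetic data.
import numpy as np
from scipy.optimize import curve_fit

def hunt_crossley(x, xdot, k, lam, n):
    """F = k x^n + lam x^n xdot."""
    return k * x**n + lam * x**n * xdot

rng = np.random.default_rng(0)
t = np.linspace(0.0, 1.0, 200)
x = 0.001 + 0.004 * (1.0 - np.cos(2.0 * np.pi * t))  # penetration [m], kept > 0
xdot = np.gradient(x, t)
F = hunt_crossley(x, xdot, k=800.0, lam=300.0, n=1.5)
F += rng.normal(0.0, 0.01, F.shape)                  # measurement noise

def model(X, k, lam, n):
    return hunt_crossley(X[0], X[1], k, lam, n)

(k_hat, lam_hat, n_hat), _ = curve_fit(model, (x, xdot), F, p0=[500.0, 100.0, 1.2])
print(k_hat, lam_hat, n_hat)  # recovers roughly (800, 300, 1.5)
```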

2024-04-14

A Linear MPC with Control Barrier Functions for Differential Drive Robots

Authors: Ali Mohamed Ali, Chao Shen, Hashim A. Hashim

Link: http://arxiv.org/abs/2404.10018v1

Abstract: The need for fully autonomous mobile robots has surged over the past decade, with the imperative of ensuring safe navigation in a dynamic setting emerging as a primary challenge impeding advancements in this domain. In this paper, a Safety Critical Model Predictive Control based on Dynamic Feedback Linearization tailored to the application of differential drive robots with two wheels is proposed to generate control signals that result in obstacle-free paths. A barrier function introduces a safety constraint to the optimization problem of the Model Predictive Control (MPC) to prevent collisions. Due to the intrinsic nonlinearities of differential drive robots, computational complexity arises when implementing a Nonlinear Model Predictive Control (NMPC). To facilitate the real-time implementation of the optimization problem and to accommodate the underactuated nature of the robot, a combination of Linear Model Predictive Control (LMPC) and Dynamic Feedback Linearization (DFL) is proposed. The MPC problem is formulated on a linear equivalent model of the differential drive robot rendered by the DFL controller. The analysis of the closed-loop stability and recursive feasibility of the proposed control design is discussed. Numerical experiments illustrate the robustness and effectiveness of the proposed control synthesis in avoiding obstacles with respect to the benchmark of using Euclidean distance constraints. Keywords: Model Predictive Control, MPC, Autonomous Ground Vehicles, Nonlinearity, Dynamic Feedback Linearization, Optimal Control, Differential Robots.
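
The barrier constraint can be illustrated with the standard discrete-time CBF condition h(x_{k+1}) ≥ (1 − γ)·h(x_k), with h(x) = ‖p − p_obs‖² − r². The sketch below applies it as a filter on candidate velocities for a single-integrator stand-in for the feedback-linearized dynamics; the paper embeds this kind of condition as a constraint inside the linear MPC rather than filtering afterwards.

```python
# Discrete-time CBF safety check on candidate velocities (single integrator).
import numpy as np

def h(p, p_obs, r):
    """Barrier value: positive outside the obstacle's safety radius."""
    return float(np.dot(p - p_obs, p - p_obs) - r**2)

def cbf_filter(p, candidates, p_obs, r, dt=0.1, gamma=0.3):
    """Keep velocities whose one-step successor satisfies h_next >= (1-gamma)*h."""
    safe = []
    for v in candidates:
        p_next = p + dt * np.asarray(v)
        if h(p_next, p_obs, r) >= (1.0 - gamma) * h(p, p_obs, r):
            safe.append(v)
    return safe

p, p_obs, r = np.array([0.0, 0.0]), np.array([0.6, 0.0]), 0.5
candidates = [(1.0, 0.0), (1.0, 1.0), (0.0, 1.0)]
print(cbf_filter(p, candidates, p_obs, r))  # only the tangential (0, 1) survives
```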

Dynamics of spherical telescopic linear driven rotation robots

Authors: Jasper Zevering, Dorit Borrmann, Anton Bredenbeck, Andreas Nuechter

Link: http://arxiv.org/abs/2404.09230v1

Abstract: Lunar caves are promising features for long-term and permanent human presence on the moon. However, given their inaccessibility to imaging from survey satellites, the concrete environment within the underground cavities is not well known. Thus, to further the efforts of human presence on the moon, these caves are to be explored by robotic systems. However, a set of environmental factors make this exploration particularly challenging. Among those are the very fine lunar dust that damages exposed sensors and actuators and the unknown composition of the surface and obstacles within the cavity. One robotic system that is particularly fit to meet these challenges is that of a spherical robot, as the exterior shell completely separates the sensors and actuators from the hazardous environment. This work introduces the mathematical description in the form of a dynamic model of a novel locomotion approach for this form factor that adds additional functionality. A set of telescopic linearly extending rods moves the robot using a combination of pushing away from the ground and leveraging the gravitational torque. The approach allows the system to locomote, overcome objects by hoisting its center of gravity on top, and transform into a terrestrial laser scanner by using the rods as a tripod.

A Survey on Integration of Large Language Models with Intelligent Robots

Authors: Yeseung Kim, Dohyun Kim, Jieun Choi, Jisang Park, Nayoung Oh, Daehyung Park

Link: http://arxiv.org/abs/2404.09228v1

Abstract: In recent years, the integration of large language models (LLMs) has revolutionized the field of robotics, enabling robots to communicate, understand, and reason with human-like proficiency. This paper explores the multifaceted impact of LLMs on robotics, addressing key challenges and opportunities for leveraging these models across various domains. By categorizing and analyzing LLM applications within core robotics elements -- communication, perception, planning, and control -- we aim to provide actionable insights for researchers seeking to integrate LLMs into their robotic systems. Our investigation focuses on LLMs developed post-GPT-3.5, primarily in text-based modalities while also considering multimodal approaches for perception and control. We offer comprehensive guidelines and examples for prompt engineering, facilitating beginners' access to LLM-based robotics solutions. Through tutorial-level examples and structured prompt construction, we illustrate how LLM-guided enhancements can be seamlessly integrated into robotics applications. This survey serves as a roadmap for researchers navigating the evolving landscape of LLM-driven robotics, offering a comprehensive overview and practical guidance for harnessing the power of language models in robotics development.
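
As a flavor of the structured prompt construction the survey discusses, here is a minimal planner prompt that assembles a skill vocabulary, output rules, scene state, and task into one string. The skill names and the `query_llm` call are hypothetical placeholders, not an API taken from the survey.

```python
# Hypothetical structured prompt for LLM-based task planning.
SKILLS = ["move_to(x, y)", "pick(object)", "place(object, x, y)"]

def build_prompt(task, scene):
    return "\n".join([
        "You are a planner for a mobile manipulator.",
        "Available skills: " + "; ".join(SKILLS),
        "Rules: output one skill call per line, using only the listed skills.",
        f"Scene: {scene}",
        f"Task: {task}",
        "Plan:",
    ])

prompt = build_prompt(
    task="put the mug on the shelf",
    scene="mug at (0.4, 0.2); shelf at (0.9, 0.5)",
)
# plan = query_llm(prompt)  # placeholder call; parse and execute line by line
```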

Design and Fabrication of String-driven Origami Robots

Authors: Peiwen Yang, Shuguang Li

Link: http://arxiv.org/abs/2404.09222v1

Abstract: Origami designs and structures have been widely used in many fields, such as morphing structures, robotics, and metamaterials. However, the design and fabrication of origami structures rely on human experience and skill, which makes the process both time- and labor-consuming. In this paper, we present a rapid design and fabrication method for string-driven origami structures and robots. We developed an origami design software to generate desired crease patterns based on analytical models and Evolution Strategies (ES). Additionally, the software can automatically produce 3D models of origami designs. We then used a dual-material 3D printer to fabricate those wrapping-based origami structures with the required mechanical properties. We utilized Twisted String Actuators (TSAs) to fold the target 3D structures from flat plates. To demonstrate the capability of these techniques, we built and tested an origami crawling robot and an origami robotic arm using 3D-printed origami structures driven by TSAs.

Tube-RRT*: Efficient Homotopic Path Planning for Swarm Robotics Passing-Through Large-Scale Obstacle Environments

Authors: Pengda Mao, Quan Quan

Link: http://arxiv.org/abs/2404.09200v1

Abstract: Recently, the concept of optimal virtual tube has emerged as a novel solution to the challenging task of navigating obstacle-dense environments for swarm robotics, offering a wide range of applications. However, it lacks an efficient homotopic path planning method in obstacle-dense environments. This paper introduces Tube-RRT*, an innovative homotopic path planning method that builds upon and improves the Rapidly-exploring Random Tree (RRT) algorithm. Tube-RRT* is specifically designed to generate homotopic paths for the trajectories in the virtual tube, strategically considering opening volume and tube length to mitigate swarm congestion and ensure agile navigation. Through comprehensive comparative simulations conducted within complex, large-scale obstacle environments, we demonstrate the effectiveness of Tube-RRT*.

BEATLE - Self-Reconfigurable Aerial Robot: Design, Control and Experimental Validation

Authors: Junichiro Sugihara, Moju Zhao, Takuzumi Nishio, Kei Okada, Masayuki Inaba

Link: http://arxiv.org/abs/2404.09153v2

Abstract: Modular self-reconfigurable robots (MSRRs) offer enhanced task flexibility by constructing various structures suitable for each task. However, conventional terrestrial MSRRs equipped with wheels face critical challenges, including limitations in the size of constructible structures and system robustness due to elevated wrench loads applied to each module. In this work, we introduce an Aerial MSRR (A-MSRR) system named BEATLE, capable of merging and separating in-flight. BEATLE can merge without applying wrench loads to adjacent modules, thereby expanding the scalability and robustness of conventional terrestrial MSRRs. In this article, we propose a system configuration for BEATLE, including mechanical design, a control framework for multi-connected flight, and a motion planner for reconfiguration motion. The design of a docking mechanism and housing structure aims to balance the durability of the constructed structure with ease of separation. Furthermore, the proposed flight control framework achieves stable multi-connected flight based on contact wrench control. Moreover, the proposed motion planner based on a finite state machine (FSM) achieves precise and robust reconfiguration motion. We also introduce the actual implementation of the prototype and validate the robustness and scalability of the proposed system design through experiments and simulation studies.

2024-04-13

Smart Help: Strategic Opponent Modeling for Proactive and Adaptive Robot Assistance in Households

Authors: Zhihao Cao, Zidong Wang, Siwen Xie, Anji Liu, Lifeng Fan

Link: http://arxiv.org/abs/2404.09001v1

Abstract: Despite the significant demand for assistive technology among vulnerable groups (e.g., the elderly, children, and the disabled) in daily tasks, research into advanced AI-driven assistive solutions that genuinely accommodate their diverse needs remains sparse. Traditional human-machine interaction tasks often require machines to simply help without nuanced consideration of human abilities and feelings, such as their opportunity for practice and learning, sense of self-improvement, and self-esteem. Addressing this gap, we define a pivotal and novel challenge, Smart Help, which aims to provide proactive yet adaptive support to human agents with diverse disabilities and dynamic goals in various tasks and environments. To establish this challenge, we leverage AI2-THOR to build a new interactive 3D realistic household environment for the Smart Help task. We introduce an innovative opponent modeling module that provides a nuanced understanding of the main agent's capabilities and goals, in order to optimize the assisting agent's helping policy. Rigorous experiments validate the efficacy of our model components and show the superiority of our holistic approach against established baselines. Our findings illustrate the potential of AI-imbued assistive robots in improving the well-being of vulnerable groups.

NeurIT: Pushing the Limit of Neural Inertial Tracking for Indoor Robotic IoT

Authors: Xinzhe Zheng, Sijie Ji, Yipeng Pan, Kaiwen Zhang, Chenshu Wu

Link: http://arxiv.org/abs/2404.08939v1

Abstract: Inertial tracking is vital for robotic IoT and has gained popularity thanks to the ubiquity of low-cost Inertial Measurement Units (IMUs) and deep learning-powered tracking algorithms. Existing works, however, have not fully utilized IMU measurements, particularly magnetometers, nor maximized the potential of deep learning to achieve the desired accuracy. To enhance the tracking accuracy for indoor robotic applications, we introduce NeurIT, a sequence-to-sequence framework that elevates tracking accuracy to a new level. NeurIT employs a Time-Frequency Block-recurrent Transformer (TF-BRT) at its core, combining the power of recurrent neural network (RNN) and Transformer to learn representative features in both time and frequency domains. To fully utilize IMU information, we strategically employ body-frame differentiation of the magnetometer, which considerably reduces the tracking error. NeurIT is implemented on a customized robotic platform and evaluated in various indoor environments. Experimental results demonstrate that NeurIT achieves a mere 1-meter tracking error over a 300-meter distance. Notably, it significantly outperforms state-of-the-art baselines by 48.21% on unseen data. NeurIT also performs comparably to the visual-inertial approach (Tango Phone) in vision-favored conditions and surpasses it in plain environments. We believe NeurIT takes an important step forward toward practical neural inertial tracking for ubiquitous and scalable tracking of robotic things. NeurIT, including the source code and the dataset, is open-sourced here: https://github.com/NeurIT-Project/NeurIT.
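
The body-frame magnetometer differentiation is worth spelling out: rotate the previous reading into the current body frame using the gyro-integrated incremental rotation, then difference, so a static field cancels and only genuine field changes (and drift) remain as features. A minimal sketch; the paper's exact feature construction may differ.

```python
# Body-frame differentiation of magnetometer readings via gyro integration.
import numpy as np
from scipy.spatial.transform import Rotation as R

def mag_body_diff(m_prev, m_curr, gyro, dt):
    """m_*: body-frame magnetometer readings [uT]; gyro: angular rate [rad/s]."""
    dR = R.from_rotvec(gyro * dt)            # body rotation over the interval
    m_prev_in_curr = dR.inv().apply(m_prev)  # previous reading, current frame
    return m_curr - m_prev_in_curr           # ~0 in a static field

# sanity check: a pure rotation in a constant field differences to ~zero
m_world = np.array([22.0, 5.0, -43.0])
gyro, dt = np.array([0.0, 0.0, 0.5]), 0.01   # yawing at 0.5 rad/s
m_prev = m_world                             # body frame == world frame at t-1
m_curr = R.from_rotvec(gyro * dt).inv().apply(m_world)
print(mag_body_diff(m_prev, m_curr, gyro, dt))  # ~[0, 0, 0]
```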

2024-04-12

Multi-fingered Robotic Hand Grasping in Cluttered Environments through Hand-object Contact Semantic Mapping

Authors: Lei Zhang, Kaixin Bai, Guowen Huang, Zhaopeng Chen, Jianwei Zhang

Link: http://arxiv.org/abs/2404.08844v1

Abstract: The integration of optimization methods and generative models has significantly advanced dexterous manipulation techniques for five-fingered hand grasping. Yet, the application of these techniques in cluttered environments is a relatively unexplored area. To address this research gap, we have developed a novel method for generating five-fingered hand grasp samples in cluttered settings. This method emphasizes simulated grasp quality and the nuanced interaction between the hand and surrounding objects. A key aspect of our approach is our data generation method, capable of estimating contact spatial and semantic representations and affordance grasps based on object affordance information. Furthermore, our Contact Semantic Conditional Variational Autoencoder (CoSe-CVAE) network is adept at creating comprehensive contact maps from point clouds, incorporating both spatial and semantic data. We introduce a unique grasp detection technique that efficiently formulates mechanical hand grasp poses from these maps. Additionally, our evaluation model is designed to assess grasp quality and collision probability, significantly improving the practicality of five-fingered hand grasping in complex scenarios. Our data generation method outperforms previous datasets in grasp diversity, scene diversity, and modality diversity. Our grasp generation method has demonstrated remarkable success, outperforming established baselines with an 81.0% average success rate in real-world single-object grasping and a 75.3% success rate in multi-object grasping. The dataset and supplementary materials can be found at https://sites.google.com/view/ffh-clutteredgrasping, and we will release the code upon publication.

Inverse Kinematics for Neuro-Robotic Grasping with Humanoid Embodied Agents

Authors: Jan-Gerrit Habekost, Connor Gäde, Philipp Allgeuer, Stefan Wermter

Link: http://arxiv.org/abs/2404.08825v1

Abstract: This paper introduces a novel zero-shot motion planning method that allows users to quickly design smooth robot motions in Cartesian space. A Bézier-curve-based Cartesian plan is transformed into a joint space trajectory by our neuro-inspired inverse kinematics (IK) method CycleIK, for which we enable platform independence by scaling it to arbitrary robot designs. The motion planner is evaluated on the physical hardware of the two humanoid robots NICO and NICOL in a human-in-the-loop grasping scenario. Our method is deployed with an embodied agent that is a large language model (LLM) at its core. We generalize the embodied agent, which was introduced for NICOL, to also be embodied by NICO. The agent can execute a discrete set of physical actions and allows the user to verbally instruct various different robots. We contribute a grasping primitive to its action space that allows for precise manipulation of household objects. The new CycleIK method is compared to popular numerical IK solvers and state-of-the-art neural IK methods in simulation and is shown to be competitive with or outperform all evaluated methods when the algorithm runtime is very short. The grasping primitive is evaluated on both NICOL and NICO robots with reported grasp success rates of 72% and 82%, respectively.
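
The Cartesian side of the planner is easy to picture: sample a cubic Bézier curve between start and goal and hand each waypoint to the IK solver. In the sketch below, `solve_ik` is a placeholder for CycleIK or any numerical solver; positions and control points are illustrative and orientations are omitted.

```python
# Cubic Bezier Cartesian plan, converted waypoint-wise by an IK solver.
import numpy as np

def cubic_bezier(p0, p1, p2, p3, n=50):
    t = np.linspace(0.0, 1.0, n)[:, None]
    return ((1 - t)**3 * p0 + 3 * (1 - t)**2 * t * p1
            + 3 * (1 - t) * t**2 * p2 + t**3 * p3)

start = np.array([0.3, -0.2, 0.4])  # illustrative Cartesian positions [m]
goal = np.array([0.5, 0.3, 0.2])
c1 = start + [0.0, 0.0, 0.2]        # control points lifting the mid-motion
c2 = goal + [0.0, 0.0, 0.2]
waypoints = cubic_bezier(start, c1, c2, goal)

# joint_traj = [solve_ik(wp) for wp in waypoints]  # placeholder IK call
```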

Collective Bayesian Decision-Making in a Swarm of Miniaturized Robots for Surface Inspection

Authors: Thiemen Siemensma, Darren Chiu, Sneha Ramshanker, Radhika Nagpal, Bahar Haghighat

Link: http://arxiv.org/abs/2404.08390v1

Abstract: Robot swarms can effectively serve a variety of sensing and inspection applications. Certain inspection tasks require a binary classification decision. This work presents an experimental setup for a surface inspection task based on vibration sensing and studies a Bayesian two-outcome decision-making algorithm in a swarm of miniaturized wheeled robots. The robots are tasked with individually inspecting and collectively classifying a 1 m x 1 m tiled surface consisting of vibrating and non-vibrating tiles based on the majority type of tiles. The robots sense vibrations using onboard IMUs and perform collision avoidance using a set of IR sensors. We develop a simulation and optimization framework leveraging the Webots robotic simulator and a Particle Swarm Optimization (PSO) method. We consider two existing information sharing strategies and propose a new one that allows the swarm to rapidly reach accurate classification decisions. We first find optimal parameters that allow efficient sampling in simulation and then evaluate our proposed strategy against the two existing ones using 100 randomized simulations and 10 real experiments. We find that our proposed method compels the swarm to make decisions at an accelerated rate, with an improvement of up to 20.52% in mean decision time at only 0.78% loss in accuracy.
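
A one-robot miniature of the two-outcome decision rule: maintain a Beta posterior over the fraction of vibrating tiles and commit once the posterior mass on one side of 1/2 exceeds a credibility threshold. Priors, thresholds, and observations below are illustrative; the paper's information-sharing strategies would additionally fuse peers' counts into the posterior.

```python
# Beta-Bernoulli two-outcome classification, single-robot sketch.
from scipy.stats import beta

def update(a, b, vibrating):
    """Posterior update after inspecting one tile."""
    return (a + 1, b) if vibrating else (a, b + 1)

def decide(a, b, p_c=0.95):
    """Commit when one hypothesis about the majority is credible enough."""
    p_vib_majority = 1.0 - beta.cdf(0.5, a, b)
    if p_vib_majority > p_c:
        return "vibrating majority"
    if p_vib_majority < 1.0 - p_c:
        return "non-vibrating majority"
    return None  # undecided: keep sampling (or fuse a peer's counts)

a, b = 1, 1                                 # uniform prior on the fraction
for obs in [1, 1, 0, 1, 1, 1, 0, 1, 1, 1]:  # simulated tile inspections
    a, b = update(a, b, obs)
print(decide(a, b))                         # -> "vibrating majority"
```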

Optimization-Based System Identification and Moving Horizon Estimation Using Low-Cost Sensors for a Miniature Car-Like Robot

Authors: Sabrina Bodmer, Lukas Vogel, Simon Muntwiler, Alexander Hansson, Tobias Bodewig, Jonas Wahlen, Melanie N. Zeilinger, Andrea Carron

Link: http://arxiv.org/abs/2404.08362v1

Abstract: This paper presents an open-source miniature car-like robot with low-cost sensing and a pipeline for optimization-based system identification, state estimation, and control. The overall robotics platform comes at a cost of less than $700 and thus significantly simplifies the verification of advanced algorithms in a realistic setting. We present a modified bicycle model with Pacejka tire forces to model the dynamics of the considered all-wheel drive vehicle and to prevent singularities of the model at low velocities. Furthermore, we provide an optimization-based system identification approach and a moving horizon estimation (MHE) scheme. In extensive hardware experiments, we show that the presented system identification approach results in a model with high prediction accuracy, while the MHE results in accurate state estimates. Finally, the overall closed-loop system is shown to perform well even in the presence of sensor failure for limited time intervals. All hardware, firmware, and control and estimation software is released under a BSD 2-clause license to promote widespread adoption and collaboration within the community.
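
The Pacejka model in the modified bicycle model maps slip angle to lateral tire force via the "magic formula". A normalized version with generic ballpark coefficients (not the identified values from the paper) looks like this:

```python
# Pacejka magic formula for normalized lateral tire force.
import numpy as np

def pacejka_lateral(alpha, B=6.0, C=1.4, D=1.0, E=-0.2):
    """alpha: slip angle [rad]; returns F_y / F_z."""
    Ba = B * alpha
    return D * np.sin(C * np.arctan(Ba - E * (Ba - np.arctan(Ba))))

alphas = np.deg2rad(np.linspace(-15.0, 15.0, 7))
print(np.round(pacejka_lateral(alphas), 3))  # near-linear around zero, then saturating
```

Forces of this shape enter the lateral dynamics at the front and rear axles, with coefficients recovered by the optimization-based system identification.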

Agile and versatile bipedal robot tracking control through reinforcement learning

Authors: Jiayi Li, Linqi Ye, Yi Cheng, Houde Liu, Bin Liang

Link: http://arxiv.org/abs/2404.08246v1

Abstract: The remarkable athletic intelligence displayed by humans in complex dynamic movements such as dancing and gymnastics suggests that the balance mechanism in biological beings is decoupled from specific movement patterns. This decoupling allows for the execution of both learned and unlearned movements under certain constraints while maintaining balance through minor whole-body coordination. To replicate this balance ability and body agility, this paper proposes a versatile controller for bipedal robots. This controller achieves ankle and body trajectory tracking across a wide range of gaits using a single small-scale neural network, which is based on a model-based IK solver and reinforcement learning. We consider a single step as the smallest control unit and design a universally applicable control input form suitable for any single-step variation. Highly flexible gait control can be achieved by combining these minimal control units with high-level policy through our extensible control interface. To enhance the trajectory-tracking capability of our controller, we utilize a three-stage training curriculum. After training, the robot can move freely between target footholds at varying distances and heights. The robot can also maintain static balance without repeated stepping to adjust posture. Finally, we evaluate the tracking accuracy of our controller on various bipedal tasks, and the effectiveness of our control framework is verified in the simulation environment.

A Passively Bendable, Compliant Tactile Palm with RObotic Modular Endoskeleton Optical (ROMEO) Fingers

Authors: Sandra Q. Liu, Edward H. Adelson

Link: http://arxiv.org/abs/2404.08227v1

Abstract: Many robotic hands currently rely on extremely dexterous robotic fingers and a thumb joint to envelop themselves around an object. Few hands focus on the palm even though human hands greatly benefit from their central fold and soft surface. As such, we develop a novel structurally compliant soft palm, which enables more surface area contact for the objects that are pressed into it. Moreover, this design, along with the development of a new low-cost, flexible illumination system, is able to incorporate a high-resolution tactile sensing system inspired by the GelSight sensors. Concurrently, we design RObotic Modular Endoskeleton Optical (ROMEO) fingers, which are underactuated two-segment soft fingers that are able to house the new illumination system, and we integrate them into these various palm configurations. The resulting robotic hand is slightly bigger than a baseball and represents one of the first soft robotic hands with actuated fingers and a passively compliant palm, all of which have high-resolution tactile sensing. This design also potentially helps researchers discover and explore more soft-rigid tactile robotic hand designs with greater capabilities in the future. The supplementary video can be found here: https://youtu.be/RKfIFiewqsg

Loco-Manipulation with Nonimpulsive Contact-Implicit Planning in a Slithering Robot

Authors: Adarsh Salagame, Kruthika Gangaraju, Harin Kumar Nallaguntla, Eric Sihite, Gunar Schirner, Alireza Ramezani

Link: http://arxiv.org/abs/2404.08174v1

Abstract: Object manipulation has been extensively studied in the context of fixed base and mobile manipulators. However, the overactuated locomotion modality employed by snake robots allows for a unique blend of object manipulation through locomotion, referred to as loco-manipulation. The following work presents an optimization approach to solving the loco-manipulation problem based on non-impulsive implicit contact path planning for our snake robot COBRA. We present the mathematical framework and show high-fidelity simulation results and experiments to demonstrate the effectiveness of our approach.

2024-04-11

Towards a Robust Soft Baby Robot With Rich Interaction Ability for Advanced Machine Learning Algorithms

Authors: Mohannad Alhakami, Dylan R. Ashley, Joel Dunham, Francesco Faccio, Eric Feron, Jürgen Schmidhuber

Link: http://arxiv.org/abs/2404.08093v1

Abstract: Artificial intelligence has made great strides in many areas lately, yet it has had comparatively little success in general-use robotics. We believe one of the reasons for this is the disconnect between traditional robotic design and the properties needed for open-ended, creativity-based AI systems. To that end, we, taking selective inspiration from nature, build a robust, partially soft robotic limb with a large action space, rich sensory data stream from multiple cameras, and the ability to connect with others to enhance the action space and data stream. As a proof of concept, we train two contemporary machine learning algorithms to perform a simple target-finding task. Altogether, we believe that this design serves as a first step to building a robot tailor-made for achieving artificial general intelligence.

Multi-Robot Target Tracking with Sensing and Communication Danger Zones

Authors: Jiazhen Li, Peihan Li, Yuwei Wu, Gaurav S. Sukhatme, Vijay Kumar, Lifeng Zhou

Link: http://arxiv.org/abs/2404.07880v1

Abstract: Multi-robot target tracking finds extensive applications in different scenarios, such as environmental surveillance and wildfire management, which require robust practical deployment of multi-robot systems in uncertain and dangerous environments. Traditional approaches often focus on the performance of tracking accuracy with no modeling and assumption of the environments, neglecting potential environmental hazards which result in system failures in real-world deployments. To address this challenge, we investigate multi-robot target tracking in adversarial environments considering sensing and communication attacks with uncertainty. We design specific strategies to avoid different danger zones and propose a multi-agent tracking framework for such perilous environments. We approximate the probabilistic constraints and formulate practical optimization strategies to address computational challenges efficiently. We evaluate the performance of our proposed methods in simulations to demonstrate the ability of robots to adjust their risk-aware behaviors under different levels of environmental uncertainty and risk confidence. The proposed method is further validated via real-world robot experiments where a team of drones successfully tracks dynamic ground robots while being risk-aware of the sensing and/or communication danger zones.

From the Lab to the Theater: An Unconventional Field Robotics Journey

Authors: Ali Imran, Vivek Shankar Varadharajan, Rafael Gomes Braga, Yann Bouteiller, Abdalwhab Bakheet Mohamed Abdalwhab, Matthis Di-Giacomo, Alexandra Mercader, Giovanni Beltrame, David St-Onge

Link: http://arxiv.org/abs/2404.07795v1

Abstract: Artistic performances involving robotic systems present unique technical challenges akin to those encountered in other field deployments. In this paper, we delve into the orchestration of robotic artistic performances, focusing on the complexities inherent in communication protocols and localization methods. Through our case studies and experimental insights, we demonstrate the breadth of technical requirements for this type of deployment, and, most importantly, the significant contributions of working closely with non-experts.

Sketch-Plan-Generalize: Continual Few-Shot Learning of Inductively Generalizable Spatial Concepts for Language-Guided Robot Manipulation

Authors: Namasivayam Kalithasan, Sachit Sachdeva, Himanshu Gaurav Singh, Divyanshu Aggarwal, Gurarmaan Singh Panjeta, Vishal Bindal, Arnav Tuli, Rohan Paul, Parag Singla

Link: http://arxiv.org/abs/2404.07774v1

Abstract: Our goal is to build embodied agents that can learn inductively generalizable spatial concepts in a continual manner, e.g., constructing a tower of a given height. Existing work suffers from certain limitations: (a) approaches such as (Liang et al., 2023) and their multi-modal extensions rely heavily on prior knowledge and are not grounded in the demonstrations; (b) approaches such as (Liu et al., 2023) lack the ability to generalize due to their purely neural approach. A key challenge is to achieve a fine balance between symbolic representations, which have the capability to generalize, and neural representations, which are physically grounded. In response, we propose a neuro-symbolic approach by expressing inductive concepts as symbolic compositions over grounded neural concepts. Our key insight is to decompose the concept learning problem into the following steps: 1) Sketch: obtain a programmatic representation for the given instruction; 2) Plan: perform model-based RL over the sequence of grounded neural action concepts to learn a grounded plan; 3) Generalize: abstract out a generic (lifted) Python program to facilitate generalizability. Continual learning is achieved by interspersing learning of grounded neural concepts with higher-level symbolic constructs. Our experiments demonstrate that our approach significantly outperforms existing baselines in terms of its ability to learn novel concepts and generalize inductively.

Diffusing in Someone Else's Shoes: Robotic Perspective Taking with Diffusion

Authors: Josua Spisak, Matthias Kerzel, Stefan Wermter

Link: http://arxiv.org/abs/2404.07735v1

Abstract: Humanoid robots can benefit from their similarity to the human shape by learning from humans. When humans teach other humans how to perform actions, they often demonstrate the actions and the learning human can try to imitate the demonstration. Being able to mentally transfer from a demonstration seen from a third-person perspective to how it should look from a first-person perspective is fundamental for this ability in humans. As this is a challenging task, it is often simplified for robots by creating a demonstration in the first-person perspective. Creating these demonstrations requires more effort but allows for an easier imitation. We introduce a novel diffusion model aimed at enabling the robot to directly learn from the third-person demonstrations. Our model is capable of learning and generating the first-person perspective from the third-person perspective by translating the size and rotations of objects and the environment between two perspectives. This allows us to utilise the benefits of easy-to-produce third-person demonstrations and easy-to-imitate first-person demonstrations. The model can either represent the first-person perspective in an RGB image or calculate the joint values. Our approach significantly outperforms other image-to-image models in this task.

Reflectance Estimation for Proximity Sensing by Vision-Language Models: Utilizing Distributional Semantics for Low-Level Cognition in Robotics

Authors: Masashi Osada, Gustavo A. Garcia Ricardez, Yosuke Suzuki, Tadahiro Taniguchi

Link: http://arxiv.org/abs/2404.07717v1

Abstract: Large language models (LLMs) and vision-language models (VLMs) have been increasingly used in robotics for high-level cognition, but their use for low-level cognition, such as interpreting sensor information, remains underexplored. In robotic grasping, estimating the reflectance of objects is crucial for successful grasping, as it significantly impacts the distance measured by proximity sensors. We investigate whether LLMs can estimate reflectance from object names alone, leveraging the embedded human knowledge in distributional semantics, and if the latent structure of language in VLMs positively affects image-based reflectance estimation. In this paper, we verify that 1) LLMs such as GPT-3.5 and GPT-4 can estimate an object's reflectance using only text as input; and 2) VLMs such as CLIP can increase their generalization capabilities in reflectance estimation from images. Our experiments show that GPT-4 can estimate an object's reflectance using only text input with a mean error of 14.7%, lower than the image-only ResNet. Moreover, CLIP achieved the lowest mean error of 11.8%, while GPT-3.5 obtained a competitive 19.9% compared to ResNet's 17.8%. These results suggest that the distributional semantics in LLMs and VLMs increases their generalization capabilities, and the knowledge acquired by VLMs benefits from the latent structure of language.

Safe haptic teleoperations of admittance controlled robots with virtualization of the force feedback

Authors: Lorenzo Pagliara, Enrico Ferrentino, Andrea Chiacchio, Giovanni Russo

Link: http://arxiv.org/abs/2404.07672v1

Abstract: Haptic teleoperations play a key role in extending human capabilities to perform complex tasks remotely, employing a robotic system. The impact of haptics is far-reaching and can improve the sensory awareness and motor accuracy of the operator. In this context, a key challenge is attaining a natural, stable and safe haptic human-robot interaction. Achieving these conflicting requirements is particularly crucial for complex procedures, e.g. medical ones. To address this challenge, in this work we develop a novel haptic bilateral teleoperation system (HBTS), featuring a virtualized force feedback, based on the motion error generated by an admittance controlled robot. This approach allows decoupling the force rendering system from the control of the interaction: the rendered force is assigned with the desired dynamics, while the admittance control parameters are separately tuned to maximize interaction performance. Furthermore, recognizing the necessity to limit the forces exerted by the robot on the environment to ensure a safe interaction, we embed a saturation strategy for the motion references that the haptic device provides to the admittance controller. We validate the different aspects of the proposed HBTS, through a teleoperated blackboard writing experiment, against two other architectures. The results indicate that the proposed HBTS improves the naturalness of teleoperation, as well as the safety and accuracy of the interaction.
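
The interplay between admittance control and reference saturation can be sketched in one dimension: integrate a mass-damper-spring response to the measured interaction force, and clamp how far the haptic reference may lead the robot each cycle so the exerted force stays bounded. Gains and limits are illustrative, not the tuned values from the paper.

```python
# 1-DoF admittance step plus saturation of the haptic motion reference.
import numpy as np

M, D, K = 2.0, 25.0, 100.0  # illustrative admittance parameters

def admittance_step(x, xd, f_ext, x_ref, dt=0.001):
    """Render a mass-damper-spring response around the (saturated) reference."""
    xdd = (f_ext - D * xd - K * (x - x_ref)) / M
    xd = xd + dt * xdd
    x = x + dt * xd
    return x, xd

def saturate_reference(x_ref_haptic, x_curr, max_step=0.002):
    """Limit how far the reference may lead the robot per cycle [m]."""
    return x_curr + np.clip(x_ref_haptic - x_curr, -max_step, max_step)
```

The force rendered back to the operator would then be derived from the motion error between commanded and actual robot motion, which is the virtualized force feedback of the title.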

Weakly-Supervised Learning via Multi-Lateral Decoder Branching for Guidewire Segmentation in Robot-Assisted Cardiovascular Catheterization

Authors: Olatunji Mumini Omisore, Toluwanimi Akinyemi, Anh Nguyen, Lei Wang

Link: http://arxiv.org/abs/2404.07594v1

Abstract: Although robot-assisted cardiovascular catheterization is commonly performed for intervention of cardiovascular diseases, more studies are needed to support the procedure with automated tool segmentation. This can aid surgeons in tool tracking and visualization during intervention. Learning-based segmentation has recently offered state-of-the-art segmentation performance; however, generating ground-truth signals for fully-supervised methods is labor-intensive and time-consuming for the interventionists. In this study, a weakly-supervised learning method with multi-lateral pseudo labeling is proposed for tool segmentation in cardiac angiograms. The method includes a modified U-Net model with one encoder and multiple lateral-branched decoders that produce pseudo labels as supervision signals under different perturbations. The pseudo labels are self-generated through a mixed loss function and shared consistency in the decoders. We trained the model end-to-end with weakly-annotated data obtained during robotic cardiac catheterization. Experiments show that the model trained on weakly annotated data achieves performance close to that obtained with fully annotated data. Compared to three existing weakly-supervised methods, our approach yielded higher segmentation performance across three different cardiac angiogram datasets. An ablation study showed consistent performance under different parameters. Thus, we offer a less expensive method for real-time tool segmentation and tracking during robot-assisted cardiac catheterization.

Differentiable Rendering as a Way to Program Cable-Driven Soft Robots

Authors: Kasra Arnavaz, Kenny Erleben

Link: http://arxiv.org/abs/2404.07590v1

Abstract: Soft robots have gained increased popularity in recent years due to their adaptability and compliance. In this paper, we use a digital twin model of cable-driven soft robots to learn control parameters in simulation. In doing so, we take advantage of differentiable rendering as a way to instruct robots to complete tasks such as point reach, gripping an object, and obstacle avoidance. This approach simplifies the mathematical description of such complicated tasks and removes the need for landmark points and their tracking. Our experiments demonstrate the applicability of our method.

Socially Pertinent Robots in Gerontological Healthcare

Authors: Xavier Alameda-Pineda, Angus Addlesee, Daniel Hernández García, Chris Reinke, Soraya Arias, Federica Arrigoni, Alex Auternaud, Lauriane Blavette, Cigdem Beyan, Luis Gomez Camara, Ohad Cohen, Alessandro Conti, Sébastien Dacunha, Christian Dondrup, Yoav Ellinson, Francesco Ferro, Sharon Gannot, Florian Gras, Nancie Gunson, Radu Horaud, Moreno D'Incà, Imad Kimouche, Séverin Lemaignan, Oliver Lemon, Cyril Liotard, Luca Marchionni, Mordehay Moradi, Tomas Pajdla, Maribel Pino, Michal Polic, Matthieu Py, Ariel Rado, Bin Ren, Elisa Ricci, Anne-Sophie Rigaud, Paolo Rota, Marta Romeo, Nicu Sebe, Weronika Sieińska, Pinchas Tandeitnik, Francesco Tonini, Nicolas Turro, Timothée Wintz, Yanchao Yu

Link: http://arxiv.org/abs/2404.07560v1

Abstract: Despite the many recent achievements in developing and deploying social robotics, there are still many underexplored environments and applications for which systematic evaluation of such systems by end-users is necessary. While several robotic platforms have been used in gerontological healthcare, the question of whether or not a social interactive robot with multi-modal conversational capabilities will be useful and accepted in real-life facilities is yet to be answered. This paper is an attempt to partially answer this question, via two waves of experiments with patients and companions in a day-care gerontological facility in Paris with a full-sized humanoid robot endowed with social and conversational interaction capabilities. The software architecture, developed during the H2020 SPRING project, together with the experimental protocol, allowed us to evaluate the acceptability (AES) and usability (SUS) with more than 60 end-users. Overall, the users are receptive to this technology, especially when the robot perception and action skills are robust to environmental clutter and flexible to handle a plethora of different interactions.

Model Predictive Trajectory Planning for Human-Robot Handovers

Authors: Thies Oelerich, Christian Hartl-Nesic, Andreas Kugi

Link: http://arxiv.org/abs/2404.07505v1

Abstract: This work develops a novel trajectory planner for human-robot handovers. The handover requirements can naturally be handled by a path-following-based model predictive controller, where the path progress serves as a progress measure of the handover. Moreover, the deviations from the path are used to follow human motion by adapting the path deviation bounds with a handover location prediction. A Gaussian process regression model, which is trained on known handover trajectories, is employed for this prediction. Experiments with a collaborative 7-DoF robotic manipulator show the effectiveness and versatility of the proposed approach.
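
The handover-location predictor is standard Gaussian process regression, so a minimal sketch is enough to fix ideas. Training data here is synthetic and one-dimensional; the paper trains on recorded handover trajectories.

```python
# GP regression of the handover location from (synthetic) progress data.
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, WhiteKernel

rng = np.random.default_rng(0)
t = np.linspace(0.0, 1.0, 30)[:, None]            # normalized handover progress
y = 0.6 * t.ravel() + 0.05 * rng.normal(size=30)  # hand x-position [m]

gp = GaussianProcessRegressor(kernel=RBF(length_scale=0.2) + WhiteKernel(1e-3))
gp.fit(t, y)

mean, std = gp.predict([[1.0]], return_std=True)  # predicted terminal location
print(mean[0], std[0])  # mean/uncertainty used to adapt path deviation bounds
```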

AdaDemo: Data-Efficient Demonstration Expansion for Generalist Robotic Agent

Authors: Tongzhou Mu, Yijie Guo, Jie Xu, Ankit Goyal, Hao Su, Dieter Fox, Animesh Garg

Link: http://arxiv.org/abs/2404.07428v1

Abstract: Encouraged by the remarkable achievements of language and vision foundation models, developing generalist robotic agents through imitation learning, using large demonstration datasets, has become a prominent area of interest in robot learning. The efficacy of imitation learning is heavily reliant on the quantity and quality of the demonstration datasets. In this study, we aim to scale up demonstrations in a data-efficient way to facilitate the learning of generalist robotic agents. We introduce AdaDemo (Adaptive Online Demonstration Expansion), a general framework designed to improve multi-task policy learning by actively and continually expanding the demonstration dataset. AdaDemo strategically collects new demonstrations to address the identified weakness in the existing policy, ensuring data efficiency is maximized. Through a comprehensive evaluation on a total of 22 tasks across two robotic manipulation benchmarks (RLBench and Adroit), we demonstrate AdaDemo's capability to progressively improve policy performance by guiding the generation of high-quality demonstration datasets in a data-efficient manner.

Too good to be true: People reject free gifts from robots because they infer bad intentions

Authors: Benjamin Lebrun, Andrew Vonasch, Christoph Bartneck

Link: http://arxiv.org/abs/2404.07409v1open in new window

Abstract: A recent psychology study found that people sometimes reject overly generous offers from people because they imagine hidden "phantom costs" must be part of the transaction. Phantom costs occur when a person seems overly generous for no apparent reason. This study aims to explore whether people can imagine phantom costs when interacting with a robot. To this end, screen- or physically-embodied agents (human or robot) offered people either a cookie or a cookie + $2. Participants were then asked to choose whether to accept or decline the offer. Results showed that people did perceive phantom costs in the cookie + $2 conditions when interacting with a human, but also with a robot, across both embodiment levels, leading to the characteristic behavioral effect that offering more money made people less likely to accept the offer. While people were more likely to accept offers from a robot than from a human, people more often accepted offers from humans when they were physically rather than screen embodied, but were equally likely to accept the offer from a robot whether it was screen or physically embodied. This suggests that people can treat robots (and humans) as social agents with hidden intentions and knowledge, and that this influences their behavior toward them. This provides not only new insights into how people make decisions when interacting with a robot but also how robot embodiment impacts HRI research.

2024-04-10

Enhancing Accessibility in Soft Robotics: Exploring Magnet-Embedded Paper-Based Interactions

Authors: Ruhan Yang, Ellen Yi-Luen Do

Link: http://arxiv.org/abs/2404.07360v1open in new window

Abstract: This paper explores the implementation of embedded magnets to enhance paper-based interactions. The integration of magnets in paper-based interactions simplifies the fabrication process, making it more accessible for building soft robotics systems. We discuss various interaction patterns achievable through this approach and highlight their potential applications.

Interactive Learning of Physical Object Properties Through Robot Manipulation and Database of Object Measurements

Authors: Andrej Kruzliak, Jiri Hartvich, Shubhan P. Patni, Lukas Rustler, Jan Kristof Behrens, Fares J. Abu-Dakka, Krystian Mikolajczyk, Ville Kyrki, Matej Hoffmann

Link: http://arxiv.org/abs/2404.07344v1open in new window

Abstract: This work presents a framework for automatically extracting physical object properties, such as material composition, mass, volume, and stiffness, through robot manipulation and a database of object measurements. The framework involves exploratory action selection to maximize learning about objects on a table. A Bayesian network models conditional dependencies between object properties, incorporating prior probability distributions and uncertainty associated with measurement actions. The algorithm selects optimal exploratory actions based on expected information gain and updates object properties through Bayesian inference. Experimental evaluation demonstrates effective action selection compared to a baseline and correct termination of the experiments if there is nothing more to be learned. The algorithm proved to behave intelligently when presented with trick objects with material properties in conflict with their appearance. The robot pipeline integrates with a logging module and an online database of objects, containing over 24,000 measurements of 63 objects with different grippers. All code and data are publicly available, facilitating automatic digitization of objects and their physical properties through exploratory manipulations.
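
Selecting the action with the highest expected information gain has a compact form for a discrete belief. The toy sketch below assumes a single property (material) with made-up binary measurement likelihoods; the paper's full Bayesian network over several properties is more involved.

```python
import numpy as np

def entropy(p):
    p = p[p > 0]
    return -np.sum(p * np.log2(p))

def expected_info_gain(belief, likelihood):
    """belief: P(property); likelihood[z, x] = P(measurement z | property x)."""
    p_z = likelihood @ belief                  # predictive measurement distribution
    gain = entropy(belief)
    for z, pz in enumerate(p_z):
        if pz > 0:
            posterior = likelihood[z] * belief / pz   # Bayes update for outcome z
            gain -= pz * entropy(posterior)
    return gain

belief = np.array([0.5, 0.3, 0.2])             # e.g. {plastic, wood, metal}
squeeze = np.array([[0.8, 0.2, 0.1],           # P(z | x) for a 'squeeze' action
                    [0.2, 0.8, 0.9]])
poke = np.array([[0.5, 0.5, 0.4],              # a less informative action
                 [0.5, 0.5, 0.6]])
scores = {name: expected_info_gain(belief, L)
          for name, L in [("squeeze", squeeze), ("poke", poke)]}
print(scores)                                  # pick the argmax action
```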

Using Neural Networks to Model Hysteretic Kinematics in Tendon-Actuated Continuum Robots

Authors: Yuan Wang, Max McCandless, Abdulhamit Donder, Giovanni Pittiglio, Behnam Moradkhani, Yash Chitalia, Pierre E. Dupont

Link: http://arxiv.org/abs/2404.07168v1open in new window

Abstract: The ability to accurately model mechanical hysteretic behavior in tendon-actuated continuum robots using deep learning approaches is a growing area of interest. In this paper, we investigate the hysteretic response of two types of tendon-actuated continuum robots and, ultimately, compare three types of neural network modeling approaches with both forward and inverse kinematic mappings: feedforward neural network (FNN), FNN with a history input buffer, and long short-term memory (LSTM) network. We seek to determine which model best captures temporal dependent behavior. We find that, depending on the robot's design, choosing different kinematic inputs can alter whether hysteresis is exhibited by the system. Furthermore, we present the results of the model fittings, revealing that, in contrast to the standard FNN, both FNN with a history input buffer and the LSTM model exhibit the capacity to model historical dependence with comparable performance in capturing rate-dependent hysteresis.
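
As a rough picture of the LSTM variant compared in the paper, the sketch below maps a window of tendon displacements to a tip configuration so that history-dependent (hysteretic) effects can be captured; the dimensions and names are illustrative assumptions.

```python
import torch
import torch.nn as nn

class HysteresisLSTM(nn.Module):
    """Map a window of tendon displacements to the current tip configuration."""
    def __init__(self, n_tendons=2, n_outputs=2, hidden=64):
        super().__init__()
        self.lstm = nn.LSTM(n_tendons, hidden, batch_first=True)
        self.head = nn.Linear(hidden, n_outputs)

    def forward(self, u_seq):              # u_seq: (batch, T, n_tendons)
        h, _ = self.lstm(u_seq)
        return self.head(h[:, -1])         # predict pose from the last hidden state

model = HysteresisLSTM()
u = torch.randn(8, 50, 2)                  # 8 windows of 50 timesteps each
print(model(u).shape)                      # torch.Size([8, 2])
```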

CBFKIT: A Control Barrier Function Toolbox for Robotics Applications

Authors: Mitchell Black, Georgios Fainekos, Bardh Hoxha, Hideki Okamoto, Danil Prokhorov

Link: http://arxiv.org/abs/2404.07158v1open in new window

Abstract: This paper introduces CBFKit, a Python/ROS toolbox for safe robotics planning and control under uncertainty. The toolbox provides a general framework for designing control barrier functions for mobility systems within both deterministic and stochastic environments. It can be connected to the ROS open-source robotics middleware, allowing for the setup of multi-robot applications, encoding of environments and maps, and integration with predictive motion planning algorithms. Additionally, it offers multiple CBF variations and algorithms for robot control. CBFKit is demonstrated on the Toyota Human Support Robot (HSR) in both simulation and physical experiments.
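
The core operation such a toolbox automates is the CBF safety filter: minimally altering a nominal control so the barrier condition holds. Here is a self-contained sketch for a single-integrator model and one circular obstacle, where the one-constraint QP has a closed-form projection; this is an illustration, not CBFKit's API.

```python
import numpy as np

def cbf_filter(x, u_nom, obstacle, radius, alpha=1.0):
    """Minimally modify u_nom so that h_dot >= -alpha*h for h(x) = |x-o|^2 - r^2."""
    d = x - obstacle
    h = d @ d - radius**2          # h(x) >= 0 encodes safety
    grad_h = 2 * d                 # single integrator x_dot = u: h_dot = grad_h @ u
    if grad_h @ u_nom >= -alpha * h:
        return u_nom               # nominal input already satisfies the CBF condition
    # Closed-form QP solution: project u_nom onto the half-space grad_h@u >= -alpha*h.
    lam = (-alpha * h - grad_h @ u_nom) / (grad_h @ grad_h)
    return u_nom + lam * grad_h

x = np.array([1.0, 0.2])
u_safe = cbf_filter(x, u_nom=np.array([-1.0, 0.0]),
                    obstacle=np.zeros(2), radius=0.5)
print(u_safe)                      # steers away from the obstacle boundary
```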

Deep Reinforcement Learning for Mobile Robot Path Planning

Authors: Hao Liu, Yi Shen, Shuangjiang Yu, Zijun Gao, Tong Wu

Link: http://arxiv.org/abs/2404.06974v1open in new window

Abstract: Path planning is an important problem with applications in many areas, such as video games and robotics. This paper proposes a novel method to address the problem of Deep Reinforcement Learning (DRL) based path planning for a mobile robot. We design DRL-based algorithms, including reward functions and parameter optimization, to avoid time-consuming work in a 2D environment. We also design a two-way search hybrid A* algorithm to improve the quality of local path planning. We transferred the designed algorithm to a simple embedded environment to test the computational load of the algorithm when running on a mobile robot. Experiments show that when deployed on a robot platform, the DRL-based algorithm in this article can achieve better planning results and consume fewer computing resources.

Robotic Learning for Adaptive Informative Path Planning

Authors: Marija Popović, Joshua Ott, Julius Rückin, Mykel J. Kochenderfer

Link: http://arxiv.org/abs/2404.06940v1open in new window

Abstract: Adaptive informative path planning (AIPP) is important to many robotics applications, enabling mobile robots to efficiently collect useful data about initially unknown environments. In addition, learning-based methods are increasingly used in robotics to enhance adaptability, versatility, and robustness across diverse and complex tasks. Our survey explores research on applying robotic learning to AIPP, bridging the gap between these two research fields. We begin by providing a unified mathematical framework for general AIPP problems. Next, we establish two complementary taxonomies of current work from the perspectives of (i) learning algorithms and (ii) robotic applications. We explore synergies, recent trends, and highlight the benefits of learning-based methods in AIPP frameworks. Finally, we discuss key challenges and promising future directions to enable more generally applicable and robust robotic data-gathering systems through learning. We provide a comprehensive catalogue of papers reviewed in our survey, including publicly available repositories, to facilitate future studies in the field.

Vision-Language Model-based Physical Reasoning for Robot Liquid Perception

Authors: Wenqiang Lai, Yuan Gao, Tin Lun Lam

Link: http://arxiv.org/abs/2404.06904v1open in new window

Abstract: There is a growing interest in applying large language models (LLMs) in robotic tasks, due to their remarkable reasoning ability and extensive knowledge learned from vast training corpora. Grounding LLMs in the physical world remains an open challenge as they can only process textual input. Recent advancements in large vision-language models (LVLMs) have enabled a more comprehensive understanding of the physical world by incorporating visual input, which provides richer contextual information than language alone. In this work, we proposed a novel paradigm that leveraged GPT-4V(ision), the state-of-the-art LVLM by OpenAI, to enable embodied agents to perceive liquid objects via image-based environmental feedback. Specifically, we exploited the physical understanding of GPT-4V to interpret the visual representation (e.g., time-series plot) of non-visual feedback (e.g., F/T sensor data), indirectly enabling multimodal perception beyond vision and language using images as proxies. We evaluated our method using 10 common household liquids with containers of various geometry and material. Without any training or fine-tuning, we demonstrated that our method can enable the robot to indirectly perceive the physical response of liquids and estimate their viscosity. We also showed that by jointly reasoning over the visual and physical attributes learned through interactions, our method could recognize liquid objects in the absence of strong visual cues (e.g., container labels with legible text or symbols), increasing the accuracy from 69.0% -- achieved by the best-performing vision-only variant -- to 86.0%.
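
The "images as proxies" idea is straightforward to sketch: render the non-visual signal as a plot and pass that image to the LVLM with a physical-reasoning prompt. Below, the force trace is synthetic and `query_vlm` is a hypothetical stand-in for a GPT-4V client, not the authors' code.

```python
import base64
import io

import matplotlib
matplotlib.use("Agg")
import matplotlib.pyplot as plt
import numpy as np

# Synthetic stand-in for an F/T trace recorded while shaking a container.
t = np.linspace(0.0, 5.0, 500)
force_z = 2.0 * np.exp(-t) * np.sin(8.0 * t)

fig, ax = plt.subplots()
ax.plot(t, force_z)
ax.set_xlabel("time [s]")
ax.set_ylabel("F_z [N]")
buf = io.BytesIO()
fig.savefig(buf, format="png")
image_b64 = base64.b64encode(buf.getvalue()).decode()

prompt = ("This plot shows the vertical force on a container after shaking. "
          "Based on how quickly the oscillation decays, is the liquid inside "
          "more likely water-like or honey-like?")
# response = query_vlm(prompt, image_b64)   # hypothetical GPT-4V client call
```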

Sound Matters: Auditory Detectability of Mobile Robots

Authors: Subham Agrawal, Marlene Wessels, Jorge de Heuvel, Johannes Kraus, Maren Bennewitz

Link: http://arxiv.org/abs/2404.06807v1open in new window

Abstract: Mobile robots are increasingly being used in noisy environments for social purposes, e.g. to provide support in healthcare or public spaces. Since these robots also operate beyond human sight, the question arises as to how different robot types, ambient noise, or cognitive engagement impact the detection of the robots by their sound. To address this research gap, we conducted a user study measuring auditory detection distances for a wheeled (Turtlebot 2i) and a quadruped robot (Unitree Go 1), which emit different consequential sounds when moving. Additionally, we manipulated background noise levels and participants' engagement in a secondary task during the study. Our results showed that the quadruped robot's sound was detected significantly better (i.e., at a larger distance) than the wheeled one, which demonstrates that the movement mechanism has a meaningful impact on auditory detectability. The detectability of both robots diminished significantly as background noise increased, but even in high background noise, participants detected the quadruped robot at a significantly larger distance. Engagement in the secondary task had hardly any impact. In essence, these findings highlight the critical role of distinguishing the auditory characteristics of different robots to improve the smooth human-centered navigation of mobile robots in noisy environments.

Designing Fluid-Exuding Cartilage for Biomimetic Robots Mimicking Human Joint Lubrication Function

Authors: Akihiro Miki, Yuta Sahara, Kazuhiro Miyama, Shunnosuke Yoshimura, Yoshimoto Ribayashi, Shun Hasegawa, Kento Kawaharazuka, Kei Okada, Masayuki Inaba

Link: http://arxiv.org/abs/2404.06740v1open in new window

Abstract: The human joint is an open-type joint composed of bones, cartilage, ligaments, synovial fluid, and joint capsule, having advantages of flexibility and impact resistance. However, replicating this structure in robots introduces friction challenges due to the absence of bearings. To address this, our study focuses on mimicking the fluid-exuding function of human cartilage. We employ a rubber-based 3D printing technique combined with absorbent materials to create a versatile and easily designed cartilage sheet for biomimetic robots. We evaluate both the fluid-exuding function and friction coefficient of the fabricated flat cartilage sheet. Furthermore, we practically create a piece of curved cartilage and an open-type biomimetic ball joint in combination with bones, ligaments, synovial fluid, and joint capsule to demonstrate the utility of the proposed cartilage sheet in the construction of such joints.

Fast and Accurate Relative Motion Tracking for Two Industrial Robots

Authors: Honglu He, Chen-lung Lu, Glenn Saunders, Pinghai Yang, Jeffrey Schoonover, John Wason, Santiago Paternain, Agung Julius, John T. Wen

Link: http://arxiv.org/abs/2404.06687v1open in new window

Abstract: Industrial robotic applications such as spraying, welding, and additive manufacturing frequently require fast, accurate, and uniform motion along a 3D spatial curve. To increase process throughput, some manufacturers propose a dual-robot setup to overcome the speed limitation of a single robot. Industrial robot motion is programmed through waypoints connected by motion primitives (Cartesian linear and circular paths and linear joint paths at constant Cartesian speed). The actual robot motion is affected by the blending between these motion primitives and the pose of the robot (an outstretched/close to singularity pose tends to have larger path-tracking errors). Choosing the waypoints and the speed along each motion segment to achieve the performance requirement is challenging. At present, there is no automated solution, and laborious manual tuning by robot experts is needed to approach the desired performance. In this paper, we present a systematic three-step approach to designing and programming a dual-robot system to optimize system performance. The first step is to select the relative placement between the two robots based on the specified relative motion path. The second step is to select the relative waypoints and the motion primitives. The final step is to update the waypoints iteratively based on the actual relative motion. Waypoint iteration is first executed in simulation and then completed using the actual robots. For performance measures, we use the mean path speed subject to the relative position and orientation constraints and the path speed uniformity constraint. We have demonstrated the effectiveness of this method with ABB and FANUC robots on two challenging test curves. The performance improvement over the current industrial practice baseline is over 300%. Compared to the optimized single-arm case that we have previously reported, the improvement is over 14%.

2024-04-09

GenCHiP: Generating Robot Policy Code for High-Precision and Contact-Rich Manipulation Tasks

Authors: Kaylee Burns, Ajinkya Jain, Keegan Go, Fei Xia, Michael Stark, Stefan Schaal, Karol Hausman

Link: http://arxiv.org/abs/2404.06645v1open in new window

Abstract: Large Language Models (LLMs) have been successful at generating robot policy code, but so far these results have been limited to high-level tasks that do not require precise movement. It is an open question how well such approaches work for tasks that require reasoning over contact forces and working within tight success tolerances. We find that, with the right action space, LLMs are capable of successfully generating policies for a variety of contact-rich and high-precision manipulation tasks, even under noisy conditions, such as perceptual errors or grasping inaccuracies. Specifically, we reparameterize the action space to include compliance with constraints on the interaction forces and stiffnesses involved in reaching a target pose. We validate this approach on subtasks derived from the Functional Manipulation Benchmark (FMB) and NIST Task Board Benchmarks. Exposing this action space alongside methods for estimating object poses improves policy generation with an LLM by greater than 3x and 4x when compared to non-compliant action spaces.
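
One way to picture a compliance-parameterized action space of the kind described above is an action type that carries stiffness and force limits alongside the target pose. The field names and values below are illustrative assumptions, not the paper's interface.

```python
from dataclasses import dataclass
from typing import Tuple

@dataclass
class CompliantMove:
    target_pose: Tuple[float, ...]                                  # x, y, z, qx, qy, qz, qw
    stiffness: Tuple[float, float, float] = (400.0, 400.0, 400.0)   # N/m per axis
    max_force: Tuple[float, float, float] = (10.0, 10.0, 10.0)      # N per axis

# A generated policy can then express "press down gently until contact" as
# low stiffness and a low force cap along z:
insert = CompliantMove(target_pose=(0.4, 0.0, 0.05, 0.0, 0.0, 0.0, 1.0),
                       stiffness=(800.0, 800.0, 50.0),
                       max_force=(5.0, 5.0, 15.0))
print(insert)
```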

Counting Objects in a Robotic Hand

Authors: Francis Tsow, Tianze Chen, Yu Sun

Link: http://arxiv.org/abs/2404.06631v1open in new window

Abstract: A robot performing multi-object grasping needs to sense the number of objects in the hand after grasping. The count plays an important role in determining the robot's next move and the outcome and efficiency of the whole pick-place process. This paper presents a data-driven contrastive learning-based counting classifier with a modified loss function as a simple and effective approach for object counting despite significant occlusion challenges caused by robotic fingers and objects. The model was validated against other models with three different common shapes (spheres, cylinders, and cubes) in simulation and in a real setup. The proposed contrastive learning-based counting approach achieved above 96% accuracy for all three objects in the real setup.

MORPHeus: a Multimodal One-armed Robot-assisted Peeling System with Human Users In-the-loop

Authors: Ruolin Ye, Yifei Hu, Yuhan Bian, Luke Kulm, Tapomayukh Bhattacharjee

Link: http://arxiv.org/abs/2404.06570v1open in new window

Abstract: Meal preparation is an important instrumental activity of daily living (IADL). While existing research has explored robotic assistance in meal preparation tasks such as cutting and cooking, the crucial task of peeling has received less attention. Robot-assisted peeling, conventionally a bimanual task, is challenging to deploy in the homes of care recipients using two wheelchair-mounted robot arms due to ergonomic and transferring challenges. This paper introduces a robot-assisted peeling system utilizing a single robotic arm and an assistive cutting board, inspired by the way individuals with one functional hand prepare meals. Our system incorporates a multimodal active perception module to determine whether an area on the food is peeled, a human-in-the-loop long-horizon planner to perform task planning while catering to a user's preference for peeling coverage, and a compliant controller to peel the food items. We demonstrate the system on 12 food items representing the extremes of different shapes, sizes, skin thickness, surface textures, skin vs flesh colors, and deformability.

Large Language Models to the Rescue: Deadlock Resolution in Multi-Robot Systems

Authors: Kunal Garg, Jacob Arkin, Songyuan Zhang, Nicholas Roy, Chuchu Fan

Link: http://arxiv.org/abs/2404.06413v1open in new window

Abstract: Multi-agent robotic systems are prone to deadlocks in an obstacle environment where the system can get stuck away from its desired location under a smooth low-level control policy. Without an external intervention, often in terms of a high-level command, it is not possible to guarantee that just a low-level control policy can resolve such deadlocks. Utilizing the generalizability and low data requirements of large language models (LLMs), this paper explores the possibility of using LLMs for deadlock resolution. We propose a hierarchical control framework where an LLM resolves deadlocks by assigning a leader and direction for the leader to move along. A graph neural network (GNN) based low-level distributed control policy executes the assigned plan. We systematically study various prompting techniques to improve LLM's performance in resolving deadlocks. In particular, as part of prompt engineering, we provide in-context examples for LLMs. We conducted extensive experiments on various multi-robot environments with up to 15 agents and 40 obstacles. Our results demonstrate that LLM-based high-level planners are effective in resolving deadlocks in MRS.

Learning Embeddings with Centroid Triplet Loss for Object Identification in Robotic Grasping

Authors: Anas Gouda, Max Schwarz, Christopher Reining, Sven Behnke, Alice Kirchheim

Link: http://arxiv.org/abs/2404.06277v1open in new window

Abstract: Foundation models are a strong trend in deep learning and computer vision. These models serve as a base for applications as they require minor or no further fine-tuning by developers to integrate into their applications. Foundation models for zero-shot object segmentation such as Segment Anything (SAM) output segmentation masks from images without any further object information. When they are followed in a pipeline by an object identification model, they can perform object detection without training. Here, we focus on training such an object identification model. A crucial practical aspect for an object identification model is to be flexible in input size. As object identification is an image retrieval problem, a suitable method should handle multi-query multi-gallery situations without constraining the number of input images (e.g. by having fixed-size aggregation layers). The key solution to train such a model is the centroid triplet loss (CTL), which aggregates image features to their centroids. CTL yields high accuracy, avoids misleading training signals and keeps the model input size flexible. In our experiments, we establish a new state of the art on the ArmBench object identification task, which shows general applicability of our model. We furthermore demonstrate an integrated unseen object detection pipeline on the challenging HOPE dataset, which requires fine-grained detection. There, our pipeline matches and surpasses related methods which have been trained on dataset-specific data.
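
A minimal sketch of the centroid triplet loss described above: embeddings are pulled toward their own class centroid and pushed away from the nearest other-class centroid. The shapes and margin value are illustrative, not the paper's exact settings.

```python
import torch
import torch.nn.functional as F

def centroid_triplet_loss(emb, labels, margin=0.3):
    """emb: (N, D) embeddings; labels: (N,) integer class ids."""
    classes = labels.unique()
    centroids = torch.stack([emb[labels == c].mean(dim=0) for c in classes])
    d = torch.cdist(emb, centroids)                    # (N, n_classes)
    own = (labels.unsqueeze(1) == classes.unsqueeze(0)).float()
    d_pos = (d * own).sum(dim=1)                       # distance to own centroid
    d_neg = (d + own * 1e9).min(dim=1).values          # nearest other-class centroid
    return F.relu(d_pos - d_neg + margin).mean()

emb = F.normalize(torch.randn(16, 128), dim=1)         # stand-in image embeddings
labels = torch.randint(0, 4, (16,))
print(centroid_triplet_loss(emb, labels))
```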

Learning Efficient and Fair Policies for Uncertainty-Aware Collaborative Human-Robot Order Picking

Authors: Igor G. Smit, Zaharah Bukhsh, Mykola Pechenizkiy, Kostas Alogariastos, Kasper Hendriks, Yingqian Zhang

Link: http://arxiv.org/abs/2404.08006v1open in new window

Abstract: In collaborative human-robot order picking systems, human pickers and Autonomous Mobile Robots (AMRs) travel independently through a warehouse and meet at pick locations where pickers load items onto the AMRs. In this paper, we consider an optimization problem in such systems where we allocate pickers to AMRs in a stochastic environment. We propose a novel multi-objective Deep Reinforcement Learning (DRL) approach to learn effective allocation policies to maximize pick efficiency while also aiming to improve workload fairness amongst human pickers. In our approach, we model the warehouse states using a graph, and define a neural network architecture that captures regional information and effectively extracts representations related to efficiency and workload. We develop a discrete-event simulation model, which we use to train and evaluate the proposed DRL approach. In the experiments, we demonstrate that our approach can find non-dominated policy sets that outline good trade-offs between fairness and efficiency objectives. The trained policies outperform the benchmarks in terms of both efficiency and fairness. Moreover, they show good transferability properties when tested on scenarios with different warehouse sizes. The implementation of the simulation model, proposed approach, and experiments are published.

Resilient Movement Planning for Continuum Robots

Authors: Oxana Shamilyan, Ievgen Kabin, Zoya Dyka, Peter Langendoerfer

Link: http://arxiv.org/abs/2404.06178v1open in new window

Abstract: The paper presents an experimental study of resilient path planning for continuum robots that takes the multi-objective optimisation problem into account. To do this, we used two well-known algorithms, namely the Genetic algorithm and the A* algorithm, for path planning, and the Analytical Hierarchy Process algorithm for path evaluation. In our experiment, the Analytical Hierarchy Process algorithm considers four different criteria, i.e. distance, motor damage, mechanical damage and accuracy, each of which contributes to the resilience of a continuum robot. The use of different criteria is necessary to increase the time between maintenance operations of the robot. The experiment shows that both algorithms can be used in combination with the Analytical Hierarchy Process algorithm for multi-criteria path planning, while the Genetic algorithm shows superior performance in the comparison of the two algorithms.

Intelligence and Motion Models of Continuum Robots: an Overview

Authors: Oxana Shamilyan, Ievgen Kabin, Zoya Dyka, Oleksandr Sudakov, Andrii Cherninskyi, Marcin Brzozowski, Peter Langendoerfer

Link: http://arxiv.org/abs/2404.06171v1open in new window

Abstract: Many technical solutions are bio-inspired. Octopus-inspired robotic arms belong to continuum robots, which are used in minimally invasive surgery or for technical system restoration in difficult-to-access areas. Continuum robot missions are bounded by their motions, whereby the motion of the robots is controlled by humans via wireless communication. In case of a lost connection, robot autonomy is required. Distributed control and distributed decision-making mechanisms based on artificial intelligence approaches can be a promising solution to achieve autonomy of technical systems and to increase their resilience. However, these methods are not yet well investigated. Octopuses are the living example of natural distributed intelligence, but their learning and decision-making mechanisms are also not yet fully investigated and understood. Our major interest is investigating mechanisms of Distributed Artificial Intelligence as a basis for improving the resilience of complex systems. We decided to use a physical continuum robot prototype that is able to perform some basic movements for our research. The idea is to investigate how a technical system can be empowered to combine movements into sequences of motions by itself. For the experimental investigations, a suitable physical prototype has to be selected, and its motion control has to be implemented and automated. In this paper, we give an overview combining different fields of research, such as Distributed Artificial Intelligence and continuum robots, based on 98 publications. We provide a detailed description of the basic motion control models of continuum robots based on the literature reviewed, discuss different aspects of autonomy, and give an overview of physical prototypes of continuum robots.

Adaptable Recovery Behaviors in Robotics: A Behavior Trees and Motion Generators (BTMG) Approach for Failure Management

Authors: Faseeh Ahmad, Matthias Mayr, Sulthan Suresh-Fazeela, Volker Krueger

Link: http://arxiv.org/abs/2404.06129v1open in new window

Abstract: In dynamic operational environments, particularly in collaborative robotics, the inevitability of failures necessitates robust and adaptable recovery strategies. Traditional automated recovery strategies, while effective for predefined scenarios, often lack the flexibility required for on-the-fly task management and adaptation to unexpected failures. Addressing this gap, we propose a novel approach that models recovery behaviors as adaptable robotic skills, leveraging the Behavior Trees and Motion Generators (BTMG) framework for policy representation. This approach distinguishes itself by employing reinforcement learning (RL) to dynamically refine recovery behavior parameters, enabling a tailored response to a wide array of failure scenarios with minimal human intervention. We assess our methodology through a series of progressively challenging scenarios within a peg-in-a-hole task, demonstrating the approach's effectiveness in enhancing operational efficiency and task success rates in collaborative robotics settings. We validate our approach using a dual-arm KUKA robot.

EVE: Enabling Anyone to Train Robot using Augmented Reality

Authors: Jun Wang, Chun-Cheng Chang, Jiafei Duan, Dieter Fox, Ranjay Krishna

Link: http://arxiv.org/abs/2404.06089v1open in new window

Abstract: The increasing affordability of robot hardware is accelerating the integration of robots into everyday activities. However, training a robot to automate a task typically requires physical robots and expensive demonstration data from trained human annotators. Consequently, only those with access to physical robots produce demonstrations to train robots. To mitigate this issue, we introduce EVE, an iOS app that enables everyday users to train robots using intuitive augmented reality visualizations without needing a physical robot. With EVE, users can collect demonstrations by specifying waypoints with their hands, visually inspecting the environment for obstacles, modifying existing waypoints, and verifying collected trajectories. In a user study ($N=14$, $D=30$) consisting of three common tabletop tasks, EVE outperformed three state-of-the-art interfaces in success rate and was comparable to kinesthetic teaching (physically moving a real robot) in completion time, usability, motion intent communication, enjoyment, and preference ($mean_{p}=0.30$). We conclude by enumerating limitations and design considerations for future AR-based demonstration collection systems for robotics.

3D Branch Point Cloud Completion for Robotic Pruning in Apple Orchards

Authors: Tian Qiu, Alan Zoubi, Nikolai Spine, Lailiang Cheng, Yu Jiang

Link: http://arxiv.org/abs/2404.05953v1open in new window

Abstract: Robotic branch pruning is a rapidly growing research area aimed at coping with labor shortages in agriculture. One fundamental requirement in robotic pruning is the perception of the detailed geometry and topology of branches. However, the point clouds obtained in agricultural settings often exhibit incompleteness due to several constraints, thereby restricting the accuracy of downstream robotic pruning. In this work, we addressed the issue of point cloud quality through a simulation-based deep neural network, leveraging a Real-to-Simulation (Real2Sim) data generation pipeline that not only eliminates the need for manual parameterization but also guarantees the realism of simulated data. The simulation-based neural network was applied to jointly perform point cloud completion and skeletonization on real-world partial branches, without additional real-world training. The Sim2Real qualitative completion and skeletonization results showed the model's remarkable capability for geometry reconstruction and topology prediction. Additionally, we quantitatively evaluated the Sim2Real performance by comparing branch-level trait characterization errors using raw incomplete data and complete data. The Mean Absolute Error (MAE) was reduced by 75% and 8% for branch diameter and branch angle estimation, respectively, using the best complete data, which indicates the effectiveness of the Real2Sim data in a zero-shot generalization setting. The characterization improvements contributed to the precision and efficacy of robotic branch pruning.

Robot Safe Planning In Dynamic Environments Based On Model Predictive Control Using Control Barrier Function

Authors: Zetao Lu, Kaijun Feng, Jun Xu, Haoyao Chen, Yunjiang Lou

Link: http://arxiv.org/abs/2404.05952v1open in new window

Abstract: Implementing obstacle avoidance in dynamic environments is a challenging problem for robots. Model predictive control (MPC) is a popular strategy for dealing with this type of problem, and recent work mainly uses control barrier functions (CBFs) as hard constraints to ensure that the system state remains in the safe set. However, in crowded scenarios, effective solutions may not be obtained due to infeasibility problems, resulting in degraded controller performance. We propose a new MPC framework that integrates CBFs to tackle the issue of obstacle avoidance in dynamic environments, in which the infeasibility problem induced by hard constraints operating over the whole prediction horizon is solved by softening the constraints and introducing an exact penalty, prompting the robot to actively seek out new paths. At the same time, a generalized CBF is extended as a single-step safety constraint of the controller to enhance the safety of the robot during navigation. The efficacy of the proposed method is first shown through simulation experiments, in which a double-integrator system and a unicycle system are employed, and the proposed method outperforms other controllers in terms of safety, feasibility, and navigation efficiency. Furthermore, a real-world experiment on an MR1000 robot demonstrates the effectiveness of the proposed method.
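
A compact way to see the softened-constraint idea: add a slack variable to a (here linearized) discrete-time CBF condition under an L1 exact penalty, so the MPC program stays feasible in clutter. The sketch below, for a 2D double integrator in cvxpy, is illustrative and not the authors' exact formulation.

```python
import cvxpy as cp
import numpy as np

dt, N = 0.1, 10
A = np.block([[np.eye(2), dt * np.eye(2)],
              [np.zeros((2, 2)), np.eye(2)]])
B = np.block([[0.5 * dt**2 * np.eye(2)],
              [dt * np.eye(2)]])

x0 = np.array([0.0, 0.0, 1.0, 0.0])          # [px, py, vx, vy]
goal = np.array([2.0, 0.0, 0.0, 0.0])
obs, r, gamma, rho = np.array([1.0, 0.05]), 0.3, 0.5, 1e3

# Barrier linearized around the current position p0: h(p) ~ h0 + g.(p - p0).
p0 = x0[:2]
h0 = (p0 - obs) @ (p0 - obs) - r**2
g = 2 * (p0 - obs)

x = cp.Variable((4, N + 1))
u = cp.Variable((2, N))
s = cp.Variable(N, nonneg=True)              # slack on the CBF condition
cost = rho * cp.sum(s)                       # L1 exact penalty keeps the QP feasible
cons = [x[:, 0] == x0]
for k in range(N):
    cons += [x[:, k + 1] == A @ x[:, k] + B @ u[:, k],
             cp.norm(u[:, k], "inf") <= 2.0]
    h_k = h0 + g @ (x[:2, k] - p0)
    h_k1 = h0 + g @ (x[:2, k + 1] - p0)
    cons += [h_k1 >= (1 - gamma) * h_k - s[k]]   # softened discrete-time CBF
    cost += cp.sum_squares(x[:, k + 1] - goal) + 0.1 * cp.sum_squares(u[:, k])

cp.Problem(cp.Minimize(cost), cons).solve()
print("first input:", u.value[:, 0], " max slack:", float(s.value.max()))
```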

Body Design and Gait Generation of Chair-Type Asymmetrical Tripedal Low-rigidity Robot

Authors: Shintaro Inoue, Kento Kawaharazuka, Kei Okada, Masayuki Inaba

Link: http://arxiv.org/abs/2404.05932v1open in new window

Abstract: In this study, a chair-type asymmetric tripedal low-rigidity robot was designed based on the three-legged chair character in the movie "Suzume" and its gait was generated. Its body structure consists of three legs that are asymmetric to the body, so it cannot be easily balanced. In addition, the actuator is a servo motor that can only feed-forward rotational angle commands, and the sensor can only sense the robot's posture quaternion. With such an asymmetric and imperfect body structure, we analyzed how gait is generated for walking and stand-up motions using two different methods: a method using linear interpolation to connect the postures necessary for the gait, discovered through trial and error on the actual robot, and a method using a gait generated by reinforcement learning in simulation and transferred to the actual robot. Both methods were able to generate gaits that realized walking and stand-up motions, and interesting gait patterns, which differed depending on the method, were observed and confirmed on the actual robot. Our code and demonstration videos are available here: https://github.com/shin0805/Chair-TypeAsymmetricalTripedalRobot.git

2024-04-08

On the Fly Robotic-Assisted Medical Instrument Planning and Execution Using Mixed Reality

Authors: Letian Ai, Yihao Liu, Mehran Armand, Amir Kheradmand, Alejandro Martin-Gomez

Link: http://arxiv.org/abs/2404.05887v1open in new window

Abstract: Robotic-assisted medical systems (RAMS) have gained significant attention for their advantages in alleviating surgeons' fatigue and improving patients' outcomes. These systems comprise a range of human-computer interactions, including medical scene monitoring, anatomical target planning, and robot manipulation. However, despite its versatility and effectiveness, RAMS demands expertise in robotics, leading to a high learning cost for the operator. In this work, we introduce a novel framework using mixed reality technologies to ease the use of RAMS. The proposed framework achieves real-time planning and execution of medical instruments by providing 3D anatomical image overlay, human-robot collision detection, and robot programming interface. These features, integrated with an easy-to-use calibration method for head-mounted display, improve the effectiveness of human-robot interactions. To assess the feasibility of the framework, two medical applications are presented in this work: 1) coil placement during transcranial magnetic stimulation and 2) drill and injector device positioning during femoroplasty. Results from these use cases demonstrate its potential to extend to a wider range of medical scenarios.

CoBT: Collaborative Programming of Behaviour Trees from One Demonstration for Robot Manipulation

Authors: Aayush Jain, Philip Long, Valeria Villani, John D. Kelleher, Maria Chiara Leva

Link: http://arxiv.org/abs/2404.05870v2open in new window

Abstract: Mass customization and shorter manufacturing cycles are becoming more important among small and medium-sized companies. However, classical industrial robots struggle to cope with product variation and dynamic environments. In this paper, we present CoBT, a collaborative programming by demonstration framework for generating reactive and modular behavior trees. CoBT relies on a single demonstration and a combination of data-driven machine learning methods with logic-based declarative learning to learn a task, thus eliminating the need for programming expertise or long development times. The proposed framework is experimentally validated on 7 manipulation tasks and we show that CoBT achieves approx. 93% success rate overall with an average of 7.5s programming time. We conduct a pilot study with non-expert users to provide feedback regarding the usability of CoBT.

A Neuromorphic Approach to Obstacle Avoidance in Robot Manipulation

Authors: Ahmed Faisal Abdelrahman, Matias Valdenegro-Toro, Maren Bennewitz, Paul G. Plöger

Link: http://arxiv.org/abs/2404.05858v1open in new window

Abstract: Neuromorphic computing mimics computational principles of the brain in $\textit{silico}$ and motivates research into event-based vision and spiking neural networks (SNNs). Event cameras (ECs) exclusively capture local intensity changes and offer superior power consumption, response latencies, and dynamic ranges. SNNs replicate biological neuronal dynamics and have demonstrated potential as alternatives to conventional artificial neural networks (ANNs), such as in reducing energy expenditure and inference time in visual classification. Nevertheless, these novel paradigms remain scarcely explored outside the domain of aerial robots. To investigate the utility of brain-inspired sensing and data processing, we developed a neuromorphic approach to obstacle avoidance on a camera-equipped manipulator. Our approach adapts high-level trajectory plans with reactive maneuvers by processing emulated event data in a convolutional SNN, decoding neural activations into avoidance motions, and adjusting plans using a dynamic motion primitive. We conducted experiments with a Kinova Gen3 arm performing simple reaching tasks that involve obstacles in sets of distinct task scenarios and in comparison to a non-adaptive baseline. Our neuromorphic approach facilitated reliable avoidance of imminent collisions in simulated and real-world experiments, where the baseline consistently failed. Trajectory adaptations had low impacts on safety and predictability criteria. Among the notable SNN properties were the correlation of computations with the magnitude of perceived motions and a robustness to different event emulation methods. Tests with a DAVIS346 EC showed similar performance, validating our experimental event emulation. Our results motivate incorporating SNN learning, utilizing neuromorphic processors, and further exploring the potential of neuromorphic methods.

Unveiling Latent Topics in Robotic Process Automation -- an Approach based on Latent Dirichlet Allocation Smart Review

Authors: Petr Prucha, Peter Madzik, Lukas Falat, Hajo A. Reijers

Link: http://arxiv.org/abs/2404.05836v1open in new window

Abstract: Robotic process automation (RPA) is a software technology that in recent years has gained a lot of attention and popularity. By now, research on RPA has spread into multiple research streams. This study aims to create a science map of RPA and its aspects by revealing latent topics related to RPA, their research interest, impact, and time development. We provide a systematic framework that is helpful to develop further research into this technology. By using an unsupervised machine learning method based on Latent Dirichlet Allocation, we were able to analyse over 2000 paper abstracts. Among these, we found 100 distinct study topics, 15 of which have been included in the science map we provide.

Humanoid-Gym: Reinforcement Learning for Humanoid Robot with Zero-Shot Sim2Real Transfer

Authors: Xinyang Gu, Yen-Jen Wang, Jianyu Chen

Link: http://arxiv.org/abs/2404.05695v1open in new window

Abstract: Humanoid-Gym is an easy-to-use reinforcement learning (RL) framework based on Nvidia Isaac Gym, designed to train locomotion skills for humanoid robots, emphasizing zero-shot transfer from simulation to the real-world environment. Humanoid-Gym also integrates a sim-to-sim framework from Isaac Gym to Mujoco that allows users to verify the trained policies in different physical simulations to ensure the robustness and generalization of the policies. This framework is verified by RobotEra's XBot-S (1.2-meter tall humanoid robot) and XBot-L (1.65-meter tall humanoid robot) in a real-world environment with zero-shot sim-to-real transfer. The project website and source code can be found at: https://sites.google.com/view/humanoid-gym/.

OtterROS: Picking and Programming an Uncrewed Surface Vessel for Experimental Field Robotics Research with ROS 2

Authors: Thomas M. C. Sears, M. Riley Cooper, Sabrina R. Button, Joshua A. Marshall

Link: http://arxiv.org/abs/2404.05627v1open in new window

Abstract: There exists a wide range of options for field robotics research using ground and aerial mobile robots, but there are comparatively few robust and research-ready uncrewed surface vessels (USVs). This workshop paper starts with a snapshot of USVs currently available to the research community and then describes "OtterROS", an open source ROS 2 solution for the Otter USV. Field experiments using OtterROS are described, which highlight the utility of the Otter USV and the benefits of using ROS 2 in aquatic robotics research. For those interested in USV research, the paper details recommended hardware to run OtterROS and includes an example ROS 2 package using OtterROS, removing unnecessary non-recurring engineering from field robotics research activities.

Stochastic Online Optimization for Cyber-Physical and Robotic Systems

Authors: Hao Ma, Melanie Zeilinger, Michael Muehlebach

Link: http://arxiv.org/abs/2404.05318v1open in new window

Abstract: We propose a novel gradient-based online optimization framework for solving stochastic programming problems that frequently arise in the context of cyber-physical and robotic systems. Our problem formulation accommodates constraints that model the evolution of a cyber-physical system, which has, in general, a continuous state and action space, is nonlinear, and where the state is only partially observed. We also incorporate an approximate model of the dynamics as prior knowledge into the learning process and show that even rough estimates of the dynamics can significantly improve the convergence of our algorithms. Our online optimization framework encompasses both gradient descent and quasi-Newton methods, and we provide a unified convergence analysis of our algorithms in a non-convex setting. We also characterize the impact of modeling errors in the system dynamics on the convergence rate of the algorithms. Finally, we evaluate our algorithms in simulations of a flexible beam, a four-legged walking robot, and in real-world experiments with a ping-pong playing robot.
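
The theme that even a rough dynamics model improves convergence can be seen in one dimension: a learner that only knows an approximate input gain still computes gradients that point the right way. The toy sketch below is an illustration of that idea, not the paper's algorithm.

```python
import numpy as np

# True (unknown) scalar dynamics x+ = a*x + b*u; the learner only has a rough
# estimate b_hat of the input gain, yet the model-based gradient it computes
# still points the right way, so the feedback gain converges toward a/b.
a, b, b_hat = 0.9, 0.5, 0.7
theta, lr = 0.0, 0.05                  # feedback gain, u = -theta * x
rng = np.random.default_rng(0)

x = 1.0
for _ in range(300):
    u = -theta * x
    x_next = a * x + b * u + 0.01 * rng.normal()   # measured on the real system
    # Stage cost c = x_next^2; approximate gradient dc/dtheta = 2*x_next*(-b_hat*x).
    theta -= lr * 2.0 * x_next * (-b_hat * x)
    x = x_next if abs(x_next) > 1e-2 else 1.0      # re-excite once regulated
print(f"learned gain {theta:.3f} (dead-beat gain a/b = {a/b:.3f})")
```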

Long-horizon Locomotion and Manipulation on a Quadrupedal Robot with Large Language Models

Authors: Yutao Ouyang, Jinhan Li, Yunfei Li, Zhongyu Li, Chao Yu, Koushil Sreenath, Yi Wu

Link: http://arxiv.org/abs/2404.05291v1open in new window

Abstract: We present a large language model (LLM) based system to empower quadrupedal robots with problem-solving abilities for long-horizon tasks beyond short-term motions. Long-horizon tasks for quadrupeds are challenging since they require both a high-level understanding of the semantics of the problem for task planning and a broad range of locomotion and manipulation skills to interact with the environment. Our system builds a high-level reasoning layer with large language models, which generates hybrid discrete-continuous plans as robot code from task descriptions. It comprises multiple LLM agents: a semantic planner for sketching a plan, a parameter calculator for predicting arguments in the plan, and a code generator to convert the plan into executable robot code. At the low level, we adopt reinforcement learning to train a set of motion planning and control skills to unleash the flexibility of quadrupeds for rich environment interactions. Our system is tested on long-horizon tasks that are infeasible to complete with one single skill. Simulation and real-world experiments show that it successfully figures out multi-step strategies and demonstrates non-trivial behaviors, including building tools or notifying a human for help.

Robust Anthropomorphic Robotic Manipulation through Biomimetic Distributed Compliance

Authors: Kai Junge, Josie Hughes

Link: http://arxiv.org/abs/2404.05262v1open in new window

Abstract: The impressive capability of humans to robustly perform manipulation relies on compliant interactions, enabled through the structure and materials spatially distributed in our hands. We propose that by mimicking this distributed compliance in an anthropomorphic robotic hand, open-loop manipulation robustness increases, and we observe the emergence of human-like behaviours. To achieve this, we introduce the ADAPT Hand, equipped with tunable compliance throughout the skin, fingers, and wrist. Through extensive automated pick-and-place tests, we show that the grasping robustness closely mirrors an estimated geometric theoretical limit, while 'stress-testing' the robot hand to perform 800+ grasps. Finally, 24 items with largely varying geometries are grasped in a constrained environment with a success rate of 93%. We demonstrate that the hand-object self-organization behavior underlies this extreme robustness, where the hand automatically exhibits different grasp types depending on object geometries. Furthermore, the robot grasp type mimics a natural human grasp with a direct similarity of 68%.

MeSA-DRL: Memory-Enhanced Deep Reinforcement Learning for Advanced Socially Aware Robot Navigation in Crowded Environments

Authors: Mannan Saeed Muhammad, Estrella Montero

Link: http://arxiv.org/abs/2404.05203v1open in new window

Abstract: Autonomous navigation capabilities play a critical role in service robots operating in environments where human interactions are pivotal, due to the dynamic and unpredictable nature of these environments. However, the variability in human behavior presents a substantial challenge for robots in predicting and anticipating movements, particularly in crowded scenarios. To address this issue, a memory-enabled deep reinforcement learning framework is proposed for autonomous robot navigation in diverse pedestrian scenarios. The proposed framework leverages long-term memory to retain essential information about the surroundings and model sequential dependencies effectively. The importance of human-robot interactions is also encoded to assign higher attention to these interactions. A global planning mechanism is incorporated into the memory-enabled architecture. Additionally, a multi-term reward system is designed to prioritize and encourage long-sighted robot behaviors by incorporating dynamic warning zones. Simultaneously, it promotes smooth trajectories and minimizes the time taken to reach the robot's desired goal. Extensive simulation experiments show that the suggested approach outperforms representative state-of-the-art methods, showcasing its ability to enhance navigation efficiency and safety in real-world scenarios.

LLM-BT: Performing Robotic Adaptive Tasks based on Large Language Models and Behavior Trees

Authors: Haotian Zhou, Yunhan Lin, Longwu Yan, Jihong Zhu, Huasong Min

Link: http://arxiv.org/abs/2404.05134v1open in new window

Abstract: Large Language Models (LLMs) have been widely utilized to perform complex robotic tasks. However, handling external disturbances during tasks is still an open challenge. This paper proposes a novel method to achieve robotic adaptive tasks based on LLMs and Behavior Trees (BTs). It utilizes ChatGPT to reason about the descriptive steps of tasks. In order to enable ChatGPT to understand the environment, semantic maps are constructed by an object recognition algorithm. Then, we design a Parser module based on Bidirectional Encoder Representations from Transformers (BERT) to parse these steps into initial BTs. Subsequently, a BTs Update algorithm is proposed to expand the initial BTs dynamically to control robots to perform adaptive tasks. Different from other LLM-based methods for complex robotic tasks, our method outputs variable BTs that can add and execute new actions according to environmental changes, making it robust to external disturbances. Our method is validated in simulation across different practical scenarios.
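
The dynamic-expansion idea can be pictured with a toy behavior tree: when a node fails, insert a recovery action before it and re-tick. The Sequence semantics below are the standard ones; everything else (node names, the recovery mapping) is an illustrative assumption, not the paper's algorithm.

```python
class Action:
    def __init__(self, name, fn):
        self.name, self.fn = name, fn
    def tick(self, world):
        return self.fn(world)                     # returns "SUCCESS" or "FAILURE"

class Sequence:
    def __init__(self, children):
        self.children = children
    def tick(self, world):
        for child in self.children:
            if child.tick(world) != "SUCCESS":
                return "FAILURE"
        return "SUCCESS"

def update_tree(seq, world, recovery_for):
    """On failure, insert a recovery action before the failing node and re-tick."""
    for i, child in enumerate(seq.children):
        if child.tick(world) != "SUCCESS":
            seq.children.insert(i, recovery_for(child))
            return seq.tick(world)
    return "SUCCESS"

world = {"door_open": False}
pass_door = Action("pass_door",
                   lambda w: "SUCCESS" if w["door_open"] else "FAILURE")

def open_door(w):
    w["door_open"] = True
    return "SUCCESS"

tree = Sequence([pass_door])
print(update_tree(tree, world, lambda failed: Action("open_door", open_door)))
```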

Rollbot: a Spherical Robot Driven by a Single Actuator

Authors: Jingxian Wang, Michael Rubenstein

Link: http://arxiv.org/abs/2404.05120v1open in new window

Abstract: Here we present Rollbot, the first spherical robot capable of controllably maneuvering on a 2D plane with a single actuator. Rollbot rolls on the ground in a circular pattern and controls its motion by changing the curvature of its trajectory through accelerating and decelerating its single motor and attached mass. We present the theoretical analysis, design, and control of Rollbot, and demonstrate its ability to move in controllable circular patterns and follow waypoints.

2024-04-07

Legibot: Generating Legible Motions for Service Robots Using Cost-Based Local Planners

Authors: Javad Amirian, Mouad Abrini, Mohamed Chetouani

Link: http://arxiv.org/abs/2404.05100v1open in new window

Abstract: With the increasing presence of social robots in various environments and applications, there is an increasing need for these robots to exhibit socially-compliant behaviors. Legible motion, characterized by the ability of a robot to clearly and quickly convey its intentions and goals to individuals in its vicinity through its motion, holds significant importance in this context. This will improve the overall user experience and acceptance of robots in human environments. In this paper, we introduce a novel approach for incorporating legibility into local motion planning for mobile robots, enabling them to generate legible motions in real-time and dynamic environments. To demonstrate the effectiveness of our proposed methodology, we also provide a robotic stack designed for deploying legibility-aware motion planning in a social robot, by integrating perception and localization components.

PCBot: a Minimalist Robot Designed for Swarm Applications

Authors: Jingxian Wang, Michael Rubenstein

Link: http://arxiv.org/abs/2404.05087v1open in new window

Abstract: Complexity, cost, and power requirements for the actuation of individual robots can play a large factor in limiting the size of robotic swarms. Here we present PCBot, a minimalist robot that can precisely move on an orbital shake table using a bi-stable solenoid actuator built directly into its PCB. This allows the actuator to be built as part of the automated PCB manufacturing process, greatly reducing the impact it has on manual assembly. Thanks to this novel actuator design, PCBot has merely five major components and can be assembled in under 20 seconds, potentially enabling them to be easily mass-manufactured. Here we present the electro-magnetic and mechanical design of PCBot. Additionally, a prototype robot is used to demonstrate its ability to move in a straight line as well as follow given paths.

Adaptive Anchor Pairs Selection in a TDOA-based System Through Robot Localization Error Minimization

Authors: Marcin Kolakowski

Link: http://arxiv.org/abs/2404.05067v1open in new window

Abstract: The following paper presents an adaptive anchor pairs selection method for ultra-wideband (UWB) Time Difference of Arrival (TDOA) based positioning systems. The method divides the area covered by the system into several zones and assigns them anchor pair sets. The pair sets are determined during calibration based on localization root mean square error (RMSE). The calibration assumes driving a mobile platform equipped with a LiDAR sensor and a UWB tag through the specified zones. The robot is localized separately based on a large set of different TDOA pairs and using a LiDAR, which acts as the reference. For each zone, the TDOA pairs set for which the registered RMSE is lowest is selected and used for localization in the routine system work. The proposed method has been tested with simulations and experiments. The results for both simulated static and experimental dynamic scenarios have proven that the adaptive selection of the anchor nodes leads to an increase in localization accuracy. In the experiment, the median trajectory error for a moving person localization was at a level of 25 cm.
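
The calibration procedure reduces to a per-zone argmin over candidate anchor pairs by RMSE against the LiDAR reference. A sketch with synthetic data follows; the data layout and names are assumptions, not the paper's implementation.

```python
import numpy as np

rng = np.random.default_rng(1)
zones = ["zone_A", "zone_B"]
pairs = ["(1,2)", "(1,3)", "(2,3)"]

# ref[zone]: (T, 2) LiDAR reference poses; est[zone][pair]: (T, 2) TDOA estimates.
ref = {z: rng.uniform(0.0, 10.0, size=(100, 2)) for z in zones}
est = {z: {p: ref[z] + rng.normal(scale=s, size=(100, 2))
           for p, s in zip(pairs, (0.10, 0.40, 0.25))}
       for z in zones}

def rmse(a, b):
    return float(np.sqrt(np.mean(np.sum((a - b) ** 2, axis=1))))

# Calibration: keep, for each zone, the anchor pair with the lowest RMSE.
best = {z: min(pairs, key=lambda p, z=z: rmse(est[z][p], ref[z])) for z in zones}
print(best)   # pair set used for localization in that zone at run time
```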

Co-design Accessible Public Robots: Insights from People with Mobility Disability, Robotic Practitioners and Their Collaborations

Authors: Howard Ziyu Han, Franklin Mingzhe Li, Alesandra Baca Vazquez, Daragh Byrne, Nikolas Martelaro, Sarah E Fox

Link: http://arxiv.org/abs/2404.05050v1open in new window

Abstract: Sidewalk robots are increasingly common across the globe. Yet, their operation on public paths poses challenges for people with mobility disabilities (PwMD) who face barriers to accessibility, such as insufficient curb cuts. We interviewed 15 PwMD to understand how they perceive sidewalk robots. Findings indicated that PwMD feel they have to compete for space on the sidewalk when robots are introduced. We next interviewed eight robotics practitioners to learn about their attitudes towards accessibility. Practitioners described how issues often stem from robotic companies addressing accessibility only after problems arise. Both interview groups underscored the importance of integrating accessibility from the outset. Building on this finding, we held four co-design workshops with PwMD and practitioners in pairs. These convenings brought to bear accessibility needs around robots operating in public spaces and in the public interest. Our study aims to set the stage for a more inclusive future around public service robots.

StaccaToe: A Single-Leg Robot that Mimics the Human Leg and Toe

Authors: Nisal Perera, Shangqun Yu, Daniel Marew, Mack Tang, Ken Suzuki, Aidan McCormack, Shifan Zhu, Yong-Jae Kim, Donghyun Kim

Link: http://arxiv.org/abs/2404.05039v1open in new window

Abstract: We introduce StaccaToe, a human-scale, electric motor-powered single-leg robot designed to rival the agility of human locomotion through two distinctive attributes: an actuated toe and a co-actuation configuration inspired by the human leg. Leveraging the foundational design of HyperLeg's lower leg mechanism, we develop a stand-alone robot by incorporating new link designs, custom-designed power electronics, and a refined control system. Unlike previous jumping robots that rely on either special mechanisms (e.g., springs and clutches) or hydraulic/pneumatic actuators, StaccaToe employs electric motors without energy storage mechanisms. This choice underscores our ultimate goal of developing a practical, high-performance humanoid robot capable of human-like, stable walking as well as explosive dynamic movements. In this paper, we aim to empirically evaluate the balance capability and the exertion of explosive ground reaction forces of our toe and co-actuation mechanisms. Throughout extensive hardware and controller development, StaccaToe showcases its control fidelity by demonstrating a balanced tip-toe stance and dynamic jump. This study is significant for three key reasons: 1) StaccaToe represents the first human-scale, electric motor-driven single-leg robot to execute dynamic maneuvers without relying on specialized mechanisms; 2) our research provides empirical evidence of the benefits of replicating critical human leg attributes in robotic design; and 3) we explain the design process for creating agile legged robots, the details that have been scantily covered in academic literature.

PathFinder: Attention-Driven Dynamic Non-Line-of-Sight Tracking with a Mobile Robot

Authors: Shenbagaraj Kannapiran, Sreenithy Chandran, Suren Jayasuriya, Spring Berman

Link: http://arxiv.org/abs/2404.05024v1open in new window

Abstract: The study of non-line-of-sight (NLOS) imaging is growing due to its many potential applications, including rescue operations and pedestrian detection by self-driving cars. However, implementing NLOS imaging on a moving camera remains an open area of research. Existing NLOS imaging methods rely on time-resolved detectors and laser configurations that require precise optical alignment, making it difficult to deploy them in dynamic environments. This work proposes a data-driven approach to NLOS imaging, PathFinder, that can be used with a standard RGB camera mounted on a small, power-constrained mobile robot, such as an aerial drone. Our experimental pipeline is designed to accurately estimate the 2D trajectory of a person who moves in a Manhattan-world environment while remaining hidden from the camera's field-of-view. We introduce a novel approach to process a sequence of dynamic successive frames in a line-of-sight (LOS) video using an attention-based neural network that performs inference in real-time. The method also includes a preprocessing selection metric that analyzes images from a moving camera which contain multiple vertical planar surfaces, such as walls and building facades, and extracts planes that return maximum NLOS information. We validate the approach on in-the-wild scenes using a drone for video capture, thus demonstrating low-cost NLOS imaging in dynamic capture environments.

RoboMP$^2$: A Robotic Multimodal Perception-Planning Framework with Multimodal Large Language Models

Authors: Qi Lv, Hao Li, Xiang Deng, Rui Shao, Michael Yu Wang, Liqiang Nie

Link: http://arxiv.org/abs/2404.04929v1

Abstract: Multimodal Large Language Models (MLLMs) have shown impressive reasoning abilities and general intelligence in various domains. This has inspired researchers to train end-to-end MLLMs or utilize large models to generate policies with human-selected prompts for embodied agents. However, these methods exhibit limited generalization capabilities on unseen tasks or scenarios, and overlook the multimodal environment information which is critical for robots to make decisions. In this paper, we introduce a novel Robotic Multimodal Perception-Planning (RoboMP$^2$) framework for robotic manipulation which consists of a Goal-Conditioned Multimodal Preceptor (GCMP) and a Retrieval-Augmented Multimodal Planner (RAMP). Specifically, GCMP captures environment states by employing a tailored MLLM for embodied agents with the abilities of semantic reasoning and localization. RAMP utilizes a coarse-to-fine retrieval method to find the $k$ most-relevant policies as in-context demonstrations to enhance the planner. Extensive experiments demonstrate the superiority of RoboMP$^2$ on both the VIMA benchmark and real-world tasks, with around 10% improvement over the baselines.

Learning Adaptive Multi-Objective Robot Navigation with Demonstrations

Authors: Jorge de Heuvel, Tharun Sethuraman, Maren Bennewitz

Link: http://arxiv.org/abs/2404.04857v1

Abstract: Preference-aligned robot navigation in human environments is typically achieved through learning-based approaches, utilizing demonstrations and user feedback for personalization. However, personal preferences are subject to change and might even be context-dependent. Yet traditional reinforcement learning (RL) approaches with a static reward function often fall short in adapting to these varying user preferences. This paper introduces a framework that combines multi-objective reinforcement learning (MORL) with demonstration-based learning. Our approach allows for dynamic adaptation to changing user preferences without retraining. Through rigorous evaluations, including sim-to-real and robot-to-robot transfers, we demonstrate our framework's capability to reflect user preferences accurately while achieving high navigational performance in terms of collision avoidance and goal pursuit.
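
The mechanism behind such preference adaptation can be made concrete with linear scalarization: the agent optimizes a weighted sum of per-objective rewards, and because the policy conditions on the weight vector, changing user preferences at deployment requires no retraining. The sketch below is a minimal illustration with invented objectives and weights, not the authors' implementation:

```python
import numpy as np

# Hypothetical per-objective rewards for one navigation step:
# progress toward the goal, obstacle clearance, and motion smoothness.
def vector_reward(progress, clearance, jerk):
    return np.array([progress, clearance, -jerk])

def scalarize(r_vec, weights):
    """Linear scalarization: a policy conditioned on `weights`
    can reflect new preferences without retraining."""
    w = np.asarray(weights, dtype=float)
    w = w / w.sum()                 # keep preferences on the simplex
    return float(w @ r_vec)

r = vector_reward(progress=0.8, clearance=0.3, jerk=0.1)
print(scalarize(r, [0.6, 0.3, 0.1]))   # goal-focused user
print(scalarize(r, [0.2, 0.7, 0.1]))   # safety-focused user
```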

EnQuery: Ensemble Policies for Diverse Query-Generation in Preference Alignment of Robot Navigation

Authors: Jorge de Heuvel, Florian Seiler, Maren Bennewitz

Link: http://arxiv.org/abs/2404.04852v1

Abstract: To align mobile robot navigation policies with user preferences through reinforcement learning from human feedback (RLHF), reliable and behavior-diverse user queries are required. However, deterministic policies fail to generate a variety of navigation trajectory suggestions for a given navigation task configuration. We introduce EnQuery, a query generation approach using an ensemble of policies that achieve behavioral diversity through a regularization term. For a given navigation task, EnQuery produces multiple navigation trajectory suggestions, thereby optimizing the efficiency of preference data collection with fewer queries. Our methodology demonstrates superior performance in aligning navigation policies with user preferences in low-query regimes, offering enhanced policy convergence from sparse preference queries. The evaluation is complemented with a novel explainability representation, capturing full scene navigation behavior of the mobile robot in a single plot.
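
The regularization idea is straightforward to make concrete: train ensemble members on a shared task loss plus a term that rewards disagreement between their actions on the same states, so each member proposes a distinct trajectory for the same task. The PyTorch sketch below uses an assumed architecture and a simple pairwise-distance penalty; it illustrates the mechanism only and is not the paper's code:

```python
import torch
import torch.nn as nn

class PolicyEnsemble(nn.Module):
    """Ensemble of small policy heads queried on the same observation."""
    def __init__(self, obs_dim=16, act_dim=2, n_members=4):
        super().__init__()
        self.members = nn.ModuleList(
            nn.Sequential(nn.Linear(obs_dim, 64), nn.Tanh(),
                          nn.Linear(64, act_dim))
            for _ in range(n_members))

    def forward(self, obs):
        return torch.stack([m(obs) for m in self.members])  # (M, B, act)

def diversity_penalty(actions):
    """Negative mean pairwise distance between member actions; adding
    this to the task loss pushes members toward distinct behaviors."""
    m = actions.shape[0]
    dists = [torch.norm(actions[i] - actions[j], dim=-1).mean()
             for i in range(m) for j in range(i + 1, m)]
    return -torch.stack(dists).mean()

ens = PolicyEnsemble()
acts = ens(torch.randn(32, 16))
task_loss = acts.pow(2).mean()           # stand-in for the actual RL loss
loss = task_loss + 0.1 * diversity_penalty(acts)
loss.backward()
```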

Robotic Sorting Systems: Robot Management and Layout Design Optimization

Authors: Tong Zhao, Xi Lin, Fang He, Hanwen Dai

Link: http://arxiv.org/abs/2404.04832v1

Abstract: In the contemporary logistics industry, automation plays a pivotal role in enhancing production efficiency and expanding industrial scale. Autonomous mobile robots, in particular, have become integral to the modernization efforts in warehouses. One noteworthy application in robotic warehousing is the robotic sorting system (RSS), distinguished by its characteristics such as cost-effectiveness, simplicity, scalability, and adaptable throughput control. While previous research has focused on analyzing the efficiency of RSS, it often assumed an ideal robot management system ignoring potential queuing delays by assuming constant travel times. This study relaxes this assumption and explores the quantitative relationship between RSS configuration parameters and system throughput. We introduce a novel robot traffic management method, named the rhythmic control for sorting scenario (RC-S), for RSS operations, equipped with an estimation formula establishing the relationship between system performance and configurations. Simulations validate that RC-S reduces average service time by 10.3% compared to the classical cooperative A* algorithm, while also improving throughput and runtime. Based on the performance analysis of RC-S, we further develop a layout optimization model for RSS, considering RSS configuration, desired throughput, and costs, to minimize expenses and determine the best layout. Numerical studies show that at lower throughput levels, facility costs dominate, while at higher throughput levels, labor costs prevail. Additionally, due to traffic efficiency limitations, RSS is well-suited for small-scale operations like end-of-supply-chain distribution centers.

Efficient Reinforcement Learning of Task Planners for Robotic Palletization through Iterative Action Masking Learning

Authors: Zheng Wu, Yichuan Li, Wei Zhan, Changliu Liu, Yun-Hui Liu, Masayoshi Tomizuka

Link: http://arxiv.org/abs/2404.04772v1

Abstract: The development of robotic systems for palletization in logistics scenarios is of paramount importance, addressing critical efficiency and precision demands in supply chain management. This paper investigates the application of Reinforcement Learning (RL) in enhancing task planning for such robotic systems. Confronted with the substantial challenge of a vast action space, which significantly impedes the efficient application of off-the-shelf RL methods, our study introduces a novel method of utilizing supervised learning to iteratively prune and manage the action space effectively. By reducing the complexity of the action space, our approach not only accelerates the learning phase but also ensures the effectiveness and reliability of the task planning in robotic palletization. The experimental results underscore the efficacy of this method, highlighting its potential in improving the performance of RL applications in complex and high-dimensional environments like logistics palletization.
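
The core trick of action masking can be shown in a few lines: actions predicted to be infeasible have their logits set to negative infinity, so the policy can never sample them and exploration is confined to the pruned space. In this sketch the learned feasibility predictor is replaced by a random stand-in; the sizes are invented for illustration:

```python
import torch

def masked_action_distribution(logits, valid_mask):
    """Invalid actions get probability ~0 by receiving -inf logits
    before the softmax inside Categorical."""
    masked = logits.masked_fill(~valid_mask, float("-inf"))
    return torch.distributions.Categorical(logits=masked)

logits = torch.randn(1, 100)             # e.g., 100 box-placement actions
valid_mask = torch.rand(1, 100) > 0.7    # stand-in for a learned mask network
dist = masked_action_distribution(logits, valid_mask)
action = dist.sample()
assert valid_mask[0, action].item()      # a masked action is never sampled
```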

2024-04-06

EAGLE: The First Event Camera Dataset Gathered by an Agile Quadruped Robot

Authors: Shifan Zhu, Zixun Xiong, Donghyun Kim

Link: http://arxiv.org/abs/2404.04698v1

Abstract: When legged robots perform agile movements, traditional RGB cameras often produce blurred images, posing a challenge for accurate state estimation. Event cameras, inspired by biological vision mechanisms, have emerged as a promising solution for capturing high-speed movements and coping with challenging lighting conditions, owing to their significant advantages, such as low latency, high temporal resolution, and a high dynamic range. However, the integration of event cameras into agile-legged robots is still largely unexplored. Notably, no event camera-based dataset has yet been specifically developed for dynamic legged robots. To bridge this gap, we introduce EAGLE (Event dataset of an AGile LEgged robot), a new dataset comprising data from an event camera, an RGB-D camera, an IMU, a LiDAR, and joint angle encoders, all mounted on a quadruped robotic platform. This dataset features more than 100 sequences from real-world environments, encompassing various indoor and outdoor environments, different lighting conditions, a range of robot gaits (e.g., trotting, bounding, pronking), as well as acrobatic movements such as backflipping. To our knowledge, this is the first event camera dataset to include multi-sensory data collected by an agile quadruped robot.

TeleAware Robot: Designing Awareness-augmented Telepresence Robot for Remote Collaborative Locomotion

Authors: Ruyi Li, Yaxin Zhu, Min Liu, Yihang Zeng, Shanning Zhuang, Jiayi Fu, Yi Lu, Guyue Zhou, Can Liu, Jiangtao Gong

Link: http://arxiv.org/abs/2404.04579v1

Abstract: Telepresence robots can be used to support users in navigating an environment remotely and sharing the visiting experience with their social partners. Although such systems allow users to see and hear the remote environment and communicate with their partners via live video feed, this does not provide enough awareness of the environment and their remote partner's activities. In this paper, we introduce an awareness framework for collaborative locomotion in scenarios of onsite and remote users visiting a place together. From an observational study of small groups of people visiting exhibitions, we derived four design goals for enhancing the environmental and social awareness between social partners, and developed a set of awareness-enhancing techniques to add to a standard telepresence robot - named TeleAware robot. Through a controlled experiment simulating a guided exhibition visiting task, the TeleAware robot showed the ability to lower the workload, facilitate closer social proximity, and improve mutual awareness and social presence compared with the standard one. We discuss the impact of mobility and the roles of local and remote users, and provide insights for the future design of awareness-enhancing telepresence robot systems that facilitate collaborative locomotion.

JRDB-Social: A Multifaceted Robotic Dataset for Understanding of Context and Dynamics of Human Interactions Within Social Groups

Authors: Simindokht Jahangard, Zhixi Cai, Shiki Wen, Hamid Rezatofighi

Link: http://arxiv.org/abs/2404.04458v1

Abstract: Understanding human social behaviour is crucial in computer vision and robotics. Micro-level observations like individual actions fall short, necessitating a comprehensive approach that considers individual behaviour, intra-group dynamics, and social group levels for a thorough understanding. To address dataset limitations, this paper introduces JRDB-Social, an extension of JRDB. Designed to fill gaps in human understanding across diverse indoor and outdoor social contexts, JRDB-Social provides annotations at three levels: individual attributes, intra-group interactions, and social group context. This dataset aims to enhance our grasp of human social dynamics for robotic applications. Utilizing the recent cutting-edge multi-modal large language models, we evaluated our benchmark to explore their capacity to decipher social human behaviour.

2024-04-05

Admittance Control for Adaptive Remote Center of Motion in Robotic Laparoscopic Surgery

Authors: Ehsan Nasiri, Long Wang

Link: http://arxiv.org/abs/2404.04416v1

Abstract: In laparoscopic robot-assisted minimally invasive surgery, the kinematic control of the robot is subject to the remote center of motion (RCM) constraint at the port of entry (e.g., trocar) into the patient's body. During surgery, after the instrument is inserted through the trocar, intrinsic physiological movements such as the patient's heartbeat, breathing process, and/or other purposeful body repositioning may deviate the position of the port of entry. This can cause a conflict between the registered RCM and the moved port of entry. To mitigate this conflict, we seek to utilize the interaction forces at the RCM. We develop a novel framework that integrates admittance control into a redundancy resolution method for the RCM kinematic constraint. Using the force/torque sensory feedback at the base of the instrument driving mechanism (IDM), the proposed framework estimates the forces at RCM, rejects forces applied on other locations along the instrument, and uses them in the admittance controller. In this paper, we report analysis from kinematic simulations to validate the proposed framework. In addition, a hardware platform has been completed, and future work is planned for experimental validation.

A Ground Mobile Robot for Autonomous Terrestrial Laser Scanning-Based Field Phenotyping

Authors: Javier Rodriguez-Sanchez, Kyle Johnsen, Changying Li

Link: http://arxiv.org/abs/2404.04404v1

Abstract: Traditional field phenotyping methods are often manual, time-consuming, and destructive, posing a challenge for breeding progress. To address this bottleneck, robotics and automation technologies offer efficient sensing tools to monitor field evolution and crop development throughout the season. This study aimed to develop an autonomous ground robotic system for LiDAR-based field phenotyping in plant breeding trials. A Husky platform was equipped with a high-resolution three-dimensional (3D) laser scanner to collect in-field terrestrial laser scanning (TLS) data without human intervention. To automate the TLS process, a 3D ray casting analysis was implemented for optimal TLS site planning, and a route optimization algorithm was utilized to minimize travel distance during data collection. The platform was deployed in two cotton breeding fields for evaluation, where it autonomously collected TLS data. The system provided accurate pose information through RTK-GNSS positioning and sensor fusion techniques, with average errors of less than 0.6 cm for location and 0.38$^{\circ}$ for heading. The achieved localization accuracy allowed point cloud registration with mean point errors of approximately 2 cm, comparable to traditional TLS methods that rely on artificial targets and manual sensor deployment. This work presents an autonomous phenotyping platform that facilitates the quantitative assessment of plant traits under field conditions of both large agricultural fields and small breeding trials to contribute to the advancement of plant phenomics and breeding programs.

Humanoid Robots at work: where are we ?

Authors: Fabrice R. Noreils

Link: http://arxiv.org/abs/2404.04249v1

Abstract: Sparked by Elon Musk and his Optimus, we are witnessing a new race in which many companies have already engaged. The objective is to put to work a new generation of humanoid robots in demanding industrial environments within 2 or 3 years. Is this objective realistic? The aim of this document, and its main contribution, is to provide some hints by covering the following topics: first, an analysis of 12 companies based on eight criteria that help distinguish companies by their maturity and approach to the market; second, since these humanoids are very complex systems, an overview of the technological challenges to be addressed; third, when humanoids are deployed at scale, operation and maintenance become critical, so we explore what is new with these complex machines; finally, pilots are the last step in testing the feasibility of a new system before mass deployment, an important step for testing the maturity of a product and the strategy of the humanoid supplier in addressing a market, and two pragmatic approaches are discussed.

Modeling Kinematic Uncertainty of Tendon-Driven Continuum Robots via Mixture Density Networks

Authors: Jordan Thompson, Brian Y. Cho, Daniel S. Brown, Alan Kuntz

Link: http://arxiv.org/abs/2404.04241v1

Abstract: Tendon-driven continuum robot kinematic models are frequently computationally expensive, inaccurate due to unmodeled effects, or both. In particular, unmodeled effects produce uncertainties that arise during the robot's operation that lead to variability in the resulting geometry. We propose a novel solution to these issues through the development of a Gaussian mixture kinematic model. We train a mixture density network to output a Gaussian mixture model representation of the robot geometry given the current tendon displacements. This model computes a probability distribution that is more representative of the true distribution of geometries at a given configuration than a model that outputs a single geometry, while also reducing the computation time. We demonstrate one use of this model through a trajectory optimization method that explicitly reasons about the workspace uncertainty to minimize the probability of collision.
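
A mixture density network of this kind is compact to sketch: a shared trunk maps tendon displacements to mixture weights, means, and scales, and training minimizes the negative log-likelihood of observed geometry points under the resulting Gaussian mixture. The architecture and sizes below are assumptions for illustration, not the paper's:

```python
import torch
import torch.nn as nn

class MDN(nn.Module):
    """Map tendon displacements to a Gaussian mixture over 3D points."""
    def __init__(self, n_tendons=4, n_components=5, point_dim=3):
        super().__init__()
        self.trunk = nn.Sequential(nn.Linear(n_tendons, 128), nn.ReLU())
        self.pi = nn.Linear(128, n_components)              # mixture weights
        self.mu = nn.Linear(128, n_components * point_dim)  # component means
        self.log_sigma = nn.Linear(128, n_components * point_dim)
        self.k, self.d = n_components, point_dim

    def forward(self, q):
        h = self.trunk(q)
        log_pi = torch.log_softmax(self.pi(h), dim=-1)
        mu = self.mu(h).view(-1, self.k, self.d)
        sigma = self.log_sigma(h).view(-1, self.k, self.d).exp()
        return log_pi, mu, sigma

def mdn_nll(log_pi, mu, sigma, y):
    """Negative log-likelihood of observed geometry points y."""
    comp = torch.distributions.Independent(
        torch.distributions.Normal(mu, sigma), 1)
    log_prob = comp.log_prob(y.unsqueeze(1))    # (batch, K)
    return -torch.logsumexp(log_pi + log_prob, dim=-1).mean()

model = MDN()
q = torch.randn(8, 4)     # tendon displacements
y = torch.randn(8, 3)     # one observed backbone point per sample
loss = mdn_nll(*model(q), y)
loss.backward()
```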

Multi-modal perception for soft robotic interactions using generative models

Authors: Enrico Donato, Egidio Falotico, Thomas George Thuruthel

Link: http://arxiv.org/abs/2404.04220v1

Abstract: Perception is essential for the active interaction of physical agents with the external environment. The integration of multiple sensory modalities, such as touch and vision, enhances this perceptual process, creating a more comprehensive and robust understanding of the world. Such fusion is particularly useful for highly deformable bodies such as soft robots. Developing a compact, yet comprehensive state representation from multi-sensory inputs can pave the way for the development of complex control strategies. This paper introduces a perception model that harmonizes data from diverse modalities to build a holistic state representation and assimilate essential information. The model relies on the causality between sensory input and robotic actions, employing a generative model to efficiently compress fused information and predict the next observation. We present, for the first time, a study on how touch can be predicted from vision and proprioception on soft robots, the importance of cross-modal generation, and why this is essential for soft robotic interactions in unstructured environments.

Continual Policy Distillation of Reinforcement Learning-based Controllers for Soft Robotic In-Hand Manipulation

Authors: Lanpei Li, Enrico Donato, Vincenzo Lomonaco, Egidio Falotico

Link: http://arxiv.org/abs/2404.04219v1

Abstract: Dexterous manipulation, often facilitated by multi-fingered robotic hands, holds great promise for real-world applications. Soft robotic hands, due to their compliant nature, offer flexibility and adaptability during object grasping and manipulation. Yet these benefits come with challenges, particularly in the control development for finger coordination. Reinforcement Learning (RL) can be employed to train object-specific in-hand manipulation policies, but this limits adaptability and generalizability. We introduce a Continual Policy Distillation (CPD) framework to acquire a versatile controller for in-hand manipulation, able to rotate objects of different shapes and sizes within a four-fingered soft gripper. The framework leverages Policy Distillation (PD) to transfer knowledge from expert policies to a continually evolving student policy network. Exemplar-based rehearsal methods are then integrated to mitigate catastrophic forgetting and enhance generalization. The performance of the CPD framework over various replay strategies demonstrates its effectiveness in consolidating knowledge from multiple experts and achieving versatile and adaptive behaviours for in-hand manipulation tasks.
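
The two ingredients, distillation and exemplar rehearsal, combine simply: the student regresses onto expert actions for the current object while replaying a small buffer of (state, action) exemplars kept from earlier objects to counter forgetting. The sketch below is a hedged illustration with invented dimensions and a plain MSE distillation loss rather than the paper's exact objective:

```python
import random
import torch
import torch.nn as nn

student = nn.Sequential(nn.Linear(32, 128), nn.ReLU(), nn.Linear(128, 12))
opt = torch.optim.Adam(student.parameters(), lr=1e-3)
exemplars = []     # (state, expert_action) pairs kept from past objects

def distill_on_task(expert, states, keep=64):
    """Distill one expert into the student, rehearsing old exemplars."""
    with torch.no_grad():
        targets = expert(states)
    for _ in range(200):
        batch = [(states, targets)]           # current-task data
        if exemplars:                         # plus rehearsal data
            s_old, a_old = zip(*random.sample(
                exemplars, min(64, len(exemplars))))
            batch.append((torch.stack(s_old), torch.stack(a_old)))
        loss = sum(nn.functional.mse_loss(student(s), a) for s, a in batch)
        opt.zero_grad(); loss.backward(); opt.step()
    idx = torch.randperm(len(states))[:keep]  # store new exemplars
    exemplars.extend((states[i], targets[i]) for i in idx)

# usage: for each new object, call distill_on_task(object_expert, states)
```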

Probabilistically Informed Robot Object Search with Multiple Regions

Authors: Matthew Collins, Jared J. Beard, Nicholas Ohi, Yu Gu

Link: http://arxiv.org/abs/2404.04186v1

Abstract: The increasing use of autonomous robot systems in hazardous environments underscores the need for efficient search and rescue operations. Despite significant advancements, existing literature on object search often falls short in overcoming the difficulty of long planning horizons and dealing with sensor limitations, such as noise. This study introduces a novel approach that formulates the search problem as a belief Markov decision process with options (BMDP-O) to make Monte Carlo tree search (MCTS) a viable tool for overcoming these challenges in large-scale environments. The proposed formulation incorporates sequences of actions (options) to move between regions of interest, enabling the algorithm to efficiently scale to large environments. This approach also enables customizable fields of view for use with multiple types of sensors. Experimental results demonstrate the superiority of this approach in large environments when compared to the problem without options and alternative tools such as receding horizon planners. Since the compute time for the proposed formulation is relatively high, a further approximated "lite" formulation is proposed. The lite formulation finds objects in a comparable number of steps with faster computation.
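
The benefit of options is that the search tree branches over region-level macro-actions rather than primitive motions, keeping the effective horizon short. The toy below uses a plain depth-limited search instead of MCTS, over a hypothetical three-region world, purely to show how a "travel to region" option and a "search here" option compose:

```python
REGIONS = ["A", "B", "C"]
TRAVEL = {("A", "B"): 12, ("B", "C"): 7, ("A", "C"): 18}  # steps per option

def travel_cost(a, b):
    return TRAVEL.get((a, b)) or TRAVEL.get((b, a))

def best_cost(belief, region, steps=0, depth=3):
    """Rough expected-steps-to-find estimate over options only.

    belief: P(object in each region). Depth counts options taken,
    not primitive steps, which is what keeps the tree shallow."""
    if depth == 0:
        return steps + 50                          # pessimistic tail estimate
    best = steps + 5 / max(belief[region], 1e-3)   # "search here" option
    for nxt in REGIONS:                            # "travel" options
        if nxt != region:
            best = min(best, best_cost(
                belief, nxt, steps + travel_cost(region, nxt), depth - 1))
    return best

print(best_cost({"A": 0.1, "B": 0.6, "C": 0.3}, "A"))
```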

Designing Robots to Help Women

Authors: Martin Cooney, Lena Klasén, Fernando Alonso-Fernandez

Link: http://arxiv.org/abs/2404.04123v1

Abstract: Robots are being designed to help people in an increasing variety of settings--but seemingly little attention has been given so far to the specific needs of women, who represent roughly half of the world's population but are highly underrepresented in robotics. Here we used a speculative prototyping approach to explore this expansive design space: First, we identified some potential challenges of interest, including crimes and illnesses that disproportionately affect women, as well as potential opportunities for designers, which were visualized in five sketches. Then, one of the sketched scenarios was further explored by developing a prototype, of a robotic helper drone equipped with computer vision to detect hidden cameras that could be used to spy on women. While object detection introduced some errors, hidden cameras were identified with a reasonable accuracy of 80% (Intersection over Union (IoU) score: 0.40). Our aim is that the identified challenges and opportunities could help spark discussion and inspire designers, toward realizing a safer, more inclusive future through responsible use of technology.

Self-Sensing Feedback Control of an Electrohydraulic Robotic Shoulder

Authors: Clemens C. Christoph, Amirhossein Kazemipour, Michel R. Vogt, Yu Zhang, Robert K. Katzschmann

Link: http://arxiv.org/abs/2404.04079v1

Abstract: The human shoulder, with its glenohumeral joint, tendons, ligaments, and muscles, allows for the execution of complex tasks with precision and efficiency. However, current robotic shoulder designs lack the compliance and compactness inherent in their biological counterparts. A major limitation of these designs is their reliance on external sensors like rotary encoders, which restrict mechanical joint design and introduce bulk to the system. To address this constraint, we present a bio-inspired antagonistic robotic shoulder with two degrees of freedom powered by self-sensing hydraulically amplified self-healing electrostatic actuators. Our artificial muscle design decouples the high-voltage electrostatic actuation from the pair of low-voltage self-sensing electrodes. This approach allows for proprioceptive feedback control of trajectories in the task space while eliminating the necessity for any additional sensors. We assess the platform's efficacy by comparing it to a feedback control based on position data provided by a motion capture system. The study demonstrates closed-loop controllable robotic manipulators based on an inherent self-sensing capability of electrohydraulic actuators. The proposed architecture can serve as a basis for complex musculoskeletal joint arrangements.

Bidirectional Human Interactive AI Framework for Social Robot Navigation

Authors: Tuba Girgin, Emre Girgin, Yigit Yildirim, Emre Ugur, Mehmet Haklidir

Link: http://arxiv.org/abs/2404.04069v1

Abstract: Trustworthiness is a crucial concept in the context of human-robot interaction. Cooperative robots must be transparent regarding their decision-making process, especially when operating in a human-oriented environment. This paper presents a comprehensive end-to-end framework aimed at fostering trustworthy bidirectional human-robot interaction in collaborative environments for the social navigation of mobile robots. Our method enables a mobile robot to predict the trajectory of people and adjust its route in a socially-aware manner. In case of conflict between human and robot decisions, detected through visual examination, the route is dynamically modified based on human preference while verbal communication is maintained. We present our pipeline, framework design, and preliminary experiments that form the foundation of our proposition.

VoicePilot: Harnessing LLMs as Speech Interfaces for Physically Assistive Robots

Authors: Akhil Padmanabha, Jessie Yuan, Janavi Gupta, Zulekha Karachiwalla, Carmel Majidi, Henny Admoni, Zackory Erickson

Link: http://arxiv.org/abs/2404.04066v1

Abstract: Physically assistive robots present an opportunity to significantly increase the well-being and independence of individuals with motor impairments or other forms of disability who are unable to complete activities of daily living. Speech interfaces, especially ones that utilize Large Language Models (LLMs), can enable individuals to effectively and naturally communicate high-level commands and nuanced preferences to robots. Frameworks for integrating LLMs as interfaces to robots for high level task planning and code generation have been proposed, but fail to incorporate human-centric considerations which are essential while developing assistive interfaces. In this work, we present a framework for incorporating LLMs as speech interfaces for physically assistive robots, constructed iteratively with 3 stages of testing involving a feeding robot, culminating in an evaluation with 11 older adults at an independent living facility. We use both quantitative and qualitative data from the final study to validate our framework and additionally provide design guidelines for using LLMs as speech interfaces for assistive robots. Videos and supporting files are located on our project website: https://sites.google.com/andrew.cmu.edu/voicepilot/

Towards Safe Robot Use with Edged or Pointed Objects: A Surrogate Study Assembling a Human Hand Injury Protection Database

Authors: Robin Jeanne Kirschner, Carina M. Micheler, Yangcan Zhou, Sebastian Siegner, Mazin Hamad, Claudio Glowalla, Jan Neumann, Nader Rajaei, Rainer Burgkart, Sami Haddadin

Link: http://arxiv.org/abs/2404.04004v1

Abstract: The use of pointed or edged tools or objects is one of the most challenging aspects of today's application of physical human-robot interaction (pHRI). One reason for this is that the severity of harm caused by such edged or pointed impactors is less well studied than for blunt impactors. Consequently, the standards specify well-reasoned force and pressure thresholds for blunt impactors and advise avoiding any edges and corners in contacts. Nevertheless, pointed or edged impactor geometries cannot be completely ruled out in real pHRI applications. For example, to allow edged or pointed tools such as screwdrivers near human operators, the knowledge of injury severity needs to be extended so that robot integrators can perform well-reasoned, time-efficient risk assessments. In this paper, we provide the initial datasets on injury prevention for the human hand based on drop tests with surrogates for the human hand, namely pig claws and chicken drumsticks. We then demonstrate, on two examples, how the dataset can be used for easy and efficient risk assessment of robot contact. Finally, our experiments provide a set of injuries that may also be expected for human subjects under certain robot mass-velocity constellations in collisions. To extend this work, testing on human samples and a collaborative effort from research institutes worldwide is needed to create a comprehensive human injury avoidance database for any pHRI scenario and thus for safe pHRI applications including edged and pointed geometries.

POMDP-Guided Active Force-Based Search for Robotic Insertion

Authors: Chen Wang, Haoxiang Luo, Kun Zhang, Hua Chen, Jia Pan, Wei Zhang

Link: http://arxiv.org/abs/2404.03943v1

Abstract: In robotic insertion tasks where the uncertainty exceeds the allowable tolerance, a good search strategy is essential for successful insertion and significantly influences efficiency. The commonly used blind search method is time-consuming and does not exploit the rich contact information. In this paper, we propose a novel search strategy that actively utilizes the information contained in the contact configuration and shows high efficiency. In particular, we formulate this problem as a Partially Observable Markov Decision Process (POMDP) with carefully designed primitives based on an in-depth analysis of the contact configuration's static stability. From the formulated POMDP, we can derive a novel search strategy. Thanks to its simplicity, this search strategy can be incorporated into a Finite-State-Machine (FSM) controller. The behaviors of the FSM controller are realized through a low-level Cartesian Impedance Controller. Our method is based purely on the robot's proprioceptive sensing and does not need visual or tactile sensors. To evaluate the effectiveness of our proposed strategy and control framework, we conduct extensive comparison experiments in simulation, where we compare our method with the baseline approach. The results demonstrate that our proposed method achieves a higher success rate with a shorter search time and search trajectory length compared to the baseline method. Additionally, we show that our method is robust to various initial displacement errors.
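
A force-driven FSM of the kind described can be skeletonized as a transition function over wrench readings, where each phase commands a different Cartesian impedance target (for example, a spiral search motion versus a compliant downward push). The state names and thresholds below are illustrative assumptions, not values from the paper:

```python
from enum import Enum, auto

class Phase(Enum):
    APPROACH = auto()
    SEARCH = auto()
    ALIGN = auto()
    INSERT = auto()
    DONE = auto()

def next_phase(phase, wrench, depth):
    """One FSM tick driven purely by proprioceptive wrench feedback."""
    fz, mx, my = wrench["fz"], wrench["mx"], wrench["my"]
    if phase is Phase.APPROACH and fz < -2.0:       # surface contact felt
        return Phase.SEARCH
    if phase is Phase.SEARCH and abs(mx) + abs(my) > 0.5:
        return Phase.ALIGN                          # hole edge caught
    if phase is Phase.ALIGN and abs(mx) + abs(my) < 0.1:
        return Phase.INSERT
    if phase is Phase.INSERT and depth > 0.02:      # 2 cm inserted
        return Phase.DONE
    return phase

print(next_phase(Phase.APPROACH, {"fz": -3.1, "mx": 0.0, "my": 0.0}, 0.0))
```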

2024-04-04

Fast k-connectivity Restoration in Multi-Robot Systems for Robust Communication Maintenance

Authors: Md Ishat-E-Rabban, Guangyao Shi, Griffin Bonner, Pratap Tokekar

Link: http://arxiv.org/abs/2404.03834v1

Abstract: Maintaining a robust communication network plays an important role in the success of a multi-robot team jointly performing an optimization task. A key characteristic of a robust cooperative multi-robot system is the ability to repair the communication topology in the case of robot failure. In this paper, we focus on the Fast k-connectivity Restoration (FCR) problem, which aims to repair a network to make it k-connected with minimum robot movement. We develop a Quadratically Constrained Program (QCP) formulation of the FCR problem, which provides a way to optimally solve the problem, but cannot handle large instances due to high computational overhead. We therefore present a scalable algorithm, called EA-SCR, for the FCR problem using graph theoretic concepts. By conducting empirical studies, we demonstrate that the EA-SCR algorithm performs within 10 percent of the optimal while being orders of magnitude faster. We also show that EA-SCR outperforms existing solutions by 30 percent in terms of the FCR distance metric.
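
The property that EA-SCR restores is easy to check with standard graph tooling: build the communication graph induced by robot positions and a communication range, then query its vertex connectivity. A minimal sketch with networkx follows; positions and range are made up:

```python
import networkx as nx

def comm_graph(positions, comm_range):
    """Edges connect robots within communication range of each other."""
    g = nx.Graph()
    g.add_nodes_from(range(len(positions)))
    for i, p in enumerate(positions):
        for j, q in enumerate(positions[i + 1:], start=i + 1):
            if sum((a - b) ** 2 for a, b in zip(p, q)) <= comm_range ** 2:
                g.add_edge(i, j)
    return g

robots = [(0, 0), (1, 0), (2, 0), (1, 1)]
g = comm_graph(robots, comm_range=1.2)
print(nx.node_connectivity(g))   # largest k for which the team is k-connected
```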

Accounting for Hysteresis in the Forward Kinematics of Nonlinearly-Routed Tendon-Driven Continuum Robots via a Learned Deep Decoder Network

Authors: Brian Y. Cho, Daniel S. Esser, Jordan Thompson, Bao Thach, Robert J. Webster III, Alan Kuntz

Link: http://arxiv.org/abs/2404.03816v1

Abstract: Tendon-driven continuum robots have been gaining popularity in medical applications due to their ability to curve around complex anatomical structures, potentially reducing the invasiveness of surgery. However, accurate modeling is required to plan and control the movements of these flexible robots. Physics-based models have limitations due to unmodeled effects, leading to mismatches between model prediction and actual robot shape. Recently proposed learning-based methods have been shown to overcome some of these limitations but do not account for hysteresis, a significant source of error for these robots. To overcome these challenges, we propose a novel deep decoder neural network that predicts the complete shape of tendon-driven robots using point clouds as the shape representation, conditioned on prior configurations to account for hysteresis. We evaluate our method on a physical tendon-driven robot and show that our network model accurately predicts the robot's shape, significantly outperforming a state-of-the-art physics-based model and a learning-based model that does not account for hysteresis.

Legible and Proactive Robot Planning for Prosocial Human-Robot Interactions

Authors: Jasper Geldenbott, Karen Leung

Link: http://arxiv.org/abs/2404.03734v1

Abstract: Humans have a remarkable ability to fluently engage in joint collision avoidance in crowded navigation tasks despite the complexities and uncertainties inherent in human behavior. Underlying these interactions is a mutual understanding that (i) individuals are prosocial, that is, there is equitable responsibility in avoiding collisions, and (ii) individuals should behave legibly, that is, move in a way that clearly conveys their intent to reduce ambiguity in how they intend to avoid others. Toward building robots that can safely and seamlessly interact with humans, we propose a general robot trajectory planning framework for synthesizing legible and proactive behaviors and demonstrate that our robot planner naturally leads to prosocial interactions. Specifically, we introduce the notion of a markup factor to incentivize legible and proactive behaviors and an inconvenience budget constraint to ensure equitable collision avoidance responsibility. We evaluate our approach against well-established multi-agent planning algorithms and show that using our approach produces safe, fluent, and prosocial interactions. We demonstrate the real-time feasibility of our approach with human-in-the-loop simulations. Project page can be found at https://uw-ctrl.github.io/phri/.

JUICER: Data-Efficient Imitation Learning for Robotic Assembly

Authors: Lars Ankile, Anthony Simeonov, Idan Shenfeld, Pulkit Agrawal

Link: http://arxiv.org/abs/2404.03729v2

Abstract: While learning from demonstrations is powerful for acquiring visuomotor policies, high-performance imitation without large demonstration datasets remains challenging for tasks requiring precise, long-horizon manipulation. This paper proposes a pipeline for improving imitation learning performance with a small human demonstration budget. We apply our approach to assembly tasks that require precisely grasping, reorienting, and inserting multiple parts over long horizons and multiple task phases. Our pipeline combines expressive policy architectures and various techniques for dataset expansion and simulation-based data augmentation. These help expand dataset support and supervise the model with locally corrective actions near bottleneck regions requiring high precision. We demonstrate our pipeline on four furniture assembly tasks in simulation, enabling a manipulator to assemble up to five parts over nearly 2500 time steps directly from RGB images, outperforming imitation and data augmentation baselines. Project website: https://imitation-juicer.github.io/.

ROBUST: 221 Bugs in the Robot Operating System

Authors: Christopher S. Timperley, Gijs van der Hoorn, André Santos, Harshavardhan Deshpande, Andrzej Wąsowski

Link: http://arxiv.org/abs/2404.03629v1

Abstract: As robotic systems such as autonomous cars and delivery drones assume greater roles and responsibilities within society, the likelihood and impact of catastrophic software failure within those systems is increased. To aid researchers in the development of new methods to measure and assure the safety and quality of robotics software, we systematically curated a dataset of 221 bugs across 7 popular and diverse software systems implemented via the Robot Operating System (ROS). We produce historically accurate recreations of each of the 221 defective software versions in the form of Docker images, and use a grounded theory approach to examine and categorize their corresponding faults, failures, and fixes. Finally, we reflect on the implications of our findings and outline future research directions for the community.

Anticipate & Collab: Data-driven Task Anticipation and Knowledge-driven Planning for Human-robot Collaboration

Authors: Shivam Singh, Karthik Swaminathan, Raghav Arora, Ramandeep Singh, Ahana Datta, Dipanjan Das, Snehasis Banerjee, Mohan Sridharan, Madhava Krishna

Link: http://arxiv.org/abs/2404.03587v1

Abstract: An agent assisting humans in daily living activities can collaborate more effectively by anticipating upcoming tasks. Data-driven methods represent the state of the art in task anticipation, planning, and related problems, but these methods are resource-hungry and opaque. Our prior work introduced a proof of concept framework that used an LLM to anticipate 3 high-level tasks that served as goals for a classical planning system that computed a sequence of low-level actions for the agent to achieve these goals. This paper describes DaTAPlan, our framework that significantly extends our prior work toward human-robot collaboration. Specifically, DaTAPlan's planner computes actions for an agent and a human to collaboratively and jointly achieve the tasks anticipated by the LLM, and the agent automatically adapts to unexpected changes in human action outcomes and preferences. We evaluate DaTAPlan's capabilities in a realistic simulation environment, demonstrating accurate task anticipation, effective human-robot collaboration, and the ability to adapt to unexpected changes. Project website: https://dataplan-hrc.github.io

Robot Safety Monitoring using Programmable Light Curtains

Authors: Karnik Ram, Shobhit Aggarwal, Robert Tamburo, Siddharth Ancha, Srinivasa Narasimhan

Link: http://arxiv.org/abs/2404.03556v1

Abstract: As factories continue to evolve into collaborative spaces with multiple robots working together with human supervisors in the loop, ensuring safety for all actors involved becomes critical. Currently, laser-based light curtain sensors are widely used in factories for safety monitoring. While these conventional safety sensors meet high accuracy standards, they are difficult to reconfigure and can only monitor a fixed user-defined region of space. Furthermore, they are typically expensive. Instead, we leverage a controllable depth sensor, programmable light curtains (PLC), to develop an inexpensive and flexible real-time safety monitoring system for collaborative robot workspaces. Our system projects virtual dynamic safety envelopes that tightly envelop the moving robot at all times and detect any objects that intrude the envelope. Furthermore, we develop an instrumentation algorithm that optimally places (multiple) PLCs in a workspace to maximize the visibility coverage of robots. Our work enables fence-less human-robot collaboration, while scaling to monitor multiple robots with few sensors. We analyze our system in a real manufacturing testbed with four robot arms and demonstrate its capabilities as a fast, accurate, and inexpensive safety monitoring solution.
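
The intrusion test behind such a safety envelope can be approximated in a few lines: model the robot's links as spheres at their current poses, inflate them by a margin to form the envelope, and flag any measured point that falls inside. This is a toy geometric check with invented geometry, not the PLC-based system itself:

```python
import numpy as np

def intrusions(points, centers, radii, margin=0.15):
    """True for each point inside the inflated robot envelope."""
    d = np.linalg.norm(points[:, None, :] - centers[None], axis=-1)
    return (d < (radii[None] + margin)).any(axis=1)

points = np.random.uniform(-1, 1, size=(5000, 3))        # depth measurements
centers = np.array([[0.0, 0.0, 0.5], [0.2, 0.0, 0.8]])   # link spheres
radii = np.array([0.15, 0.10])
mask = intrusions(points, centers, radii)
if mask.any():
    print(f"stop: {mask.sum()} points intrude the safety envelope")
```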

Integrating Large Language Models with Multimodal Virtual Reality Interfaces to Support Collaborative Human-Robot Construction Work

Authors: Somin Park, Carol C. Menassa, Vineet R. Kamat

Link: http://arxiv.org/abs/2404.03498v1

Abstract: In the construction industry, where work environments are complex, unstructured and often dangerous, the implementation of Human-Robot Collaboration (HRC) is emerging as a promising advancement. This underlines the critical need for intuitive communication interfaces that enable construction workers to collaborate seamlessly with robotic assistants. This study introduces a conversational Virtual Reality (VR) interface integrating multimodal interaction to enhance intuitive communication between construction workers and robots. By integrating voice and controller inputs with the Robot Operating System (ROS), Building Information Modeling (BIM), and a game engine featuring a chat interface powered by a Large Language Model (LLM), the proposed system enables intuitive and precise interaction within a VR setting. Evaluated by twelve construction workers through a drywall installation case study, the proposed system demonstrated its low workload and high usability with succinct command inputs. The proposed multimodal interaction system suggests that such technological integration can substantially advance the integration of robotic assistants in the construction industry.

Design of Stickbug: a Six-Armed Precision Pollination Robot

Authors: Trevor Smith, Madhav Rijal, Christopher Tatsch, R. Michael Butts, Jared Beard, R. Tyler Cook, Andy Chu, Jason Gross, Yu Gu

Link: http://arxiv.org/abs/2404.03489v1

Abstract: This work presents the design of Stickbug, a six-armed, multi-agent, precision pollination robot that combines the accuracy of single-agent systems with swarm parallelization in greenhouses. Precision pollination robots have often been proposed to offset the effects of a decreasing population of natural pollinators, but they frequently lack the required parallelization and scalability. Stickbug achieves this by allowing each arm and drive base to act as an individual agent, significantly reducing planning complexity. Stickbug uses a compact holonomic Kiwi drive to navigate narrow greenhouse rows, a tall mast to support multiple manipulators and reach plant heights, a detection model and classifier to identify Bramble flowers, and a felt-tipped end-effector for contact-based pollination. Initial experimental validation demonstrates that Stickbug can attempt over 1.5 pollinations per minute with a 50% success rate. Additionally, a Bramble flower perception dataset was created and is publicly available alongside Stickbug's software and design files.

You Only Scan Once: A Dynamic Scene Reconstruction Pipeline for 6-DoF Robotic Grasping of Novel Objects

Authors: Lei Zhou, Haozhe Wang, Zhengshen Zhang, Zhiyang Liu, Francis EH Tay, Marcelo H. Ang Jr.

Link: http://arxiv.org/abs/2404.03462v1

Abstract: In the realm of robotic grasping, achieving accurate and reliable interactions with the environment is a pivotal challenge. Traditional grasp planning methods utilizing partial point clouds derived from depth images often suffer from reduced scene understanding due to occlusion, ultimately impeding their grasping accuracy. Furthermore, scene reconstruction methods have primarily relied upon static techniques, whose susceptibility to environmental changes during the manipulation process limits their efficacy in real-time grasping tasks. To address these limitations, this paper introduces a novel two-stage pipeline for dynamic scene reconstruction. In the first stage, our approach takes scene scanning as input to register each target object with mesh reconstruction and novel object pose tracking. In the second stage, pose tracking is still performed to provide object poses in real-time, enabling our approach to transform the reconstructed object point clouds back into the scene. Unlike conventional methodologies, which rely on static scene snapshots, our method continuously captures the evolving scene geometry, resulting in a comprehensive and up-to-date point cloud representation. By circumventing the constraints posed by occlusion, our method enhances the overall grasp planning process and empowers state-of-the-art 6-DoF robotic grasping algorithms to exhibit markedly improved accuracy.

Simultaneous State Estimation and Contact Detection for Legged Robots by Multiple-Model Kalman Filtering

Authors: Marcel Menner, Karl Berntorp

Link: http://arxiv.org/abs/2404.03444v1

Abstract: This paper proposes an algorithm for combined contact detection and state estimation for legged robots. The proposed algorithm models the robot's movement as a switched system, in which different modes relate to different feet being in contact with the ground. The key element in the proposed algorithm is an interacting multiple-model Kalman filter, which identifies the currently-active mode defining contacts, while estimating the state. The rationale for the proposed estimation framework is that contacts (and contact forces) impact the robot's state and vice versa. This paper presents validation studies with a quadruped using (i) the high-fidelity simulator Gazebo for a comparison with ground truth values and a baseline estimator, and (ii) hardware experiments with the Unitree A1 robot. The simulation study shows that the proposed algorithm outperforms the baseline estimator, which does not simultaneously detect contacts. The hardware experiments showcase the applicability of the proposed algorithm and highlight the ability to detect contacts.
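
The multiple-model idea can be conveyed with a toy one-dimensional example: run one Kalman filter per contact hypothesis and reweight the hypotheses by how well each explains the measurement. The sketch below omits the mixing step of a full interacting multiple-model filter and uses invented noise parameters, so it is a simplification rather than the paper's estimator:

```python
import numpy as np

MODES = {"contact": 0.0, "swing": 0.05}   # expected foot-height drift per step
mode_prob = {"contact": 0.5, "swing": 0.5}
x, P = 0.0, 0.1                           # state: foot height and covariance
Q, R = 1e-4, 1e-3                         # process / measurement noise

def mm_kf_step(z):
    """One filter-bank update; returns the most likely contact mode."""
    global x, P
    likelihood, estimate = {}, {}
    for m, drift in MODES.items():
        x_pred, P_pred = x + drift, P + Q            # mode-specific predict
        innov, S = z - x_pred, P_pred + R
        K = P_pred / S
        estimate[m] = (x_pred + K * innov, (1 - K) * P_pred)
        likelihood[m] = np.exp(-0.5 * innov**2 / S) / np.sqrt(2 * np.pi * S)
    for m in mode_prob:                              # Bayes mode update
        mode_prob[m] *= likelihood[m] + 1e-12
    norm = sum(mode_prob.values())
    for m in mode_prob:
        mode_prob[m] /= norm
    best = max(mode_prob, key=mode_prob.get)
    x, P = estimate[best]
    return best, x

print(mm_kf_step(0.04))   # a rising foot suggests the "swing" mode
```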

Future Predictive Success-or-Failure Classification for Long-Horizon Robotic Tasks

Authors: Naoya Sogi, Hiroyuki Oyama, Takashi Shibata, Makoto Terao

Link: http://arxiv.org/abs/2404.03415v1

Abstract: Automating long-horizon tasks with a robotic arm has been a central research topic in robotics. Optimization-based action planning is an efficient approach for creating an action plan to complete a given task. Construction of a reliable planning method requires a design process of conditions, e.g., to avoid collision between objects. The design process, however, has two critical issues: 1) iterative trials--the design process is time-consuming due to the trial-and-error process of modifying conditions, and 2) manual redesign--it is difficult to cover all the necessary conditions manually. To tackle these issues, this paper proposes a future-predictive success-or-failure-classification method to obtain conditions automatically. The key idea behind the proposed method is an end-to-end approach for determining whether the action plan can complete a given task instead of manually redesigning the conditions. The proposed method uses a long-horizon future-prediction method to enable success-or-failure classification without the execution of an action plan. This paper also proposes a regularization term called transition consistency regularization to provide easy-to-predict feature distribution. The regularization term improves future prediction and classification performance. The effectiveness of our method is demonstrated through classification and robotic-manipulation experiments.

RADIUM: Predicting and Repairing End-to-End Robot Failures using Gradient-Accelerated Sampling

Authors: Charles Dawson, Anjali Parashar, Chuchu Fan

Link: http://arxiv.org/abs/2404.03412v1

Abstract: Before autonomous systems can be deployed in safety-critical applications, we must be able to understand and verify the safety of these systems. For cases where the risk or cost of real-world testing is prohibitive, we propose a simulation-based framework for a) predicting ways in which an autonomous system is likely to fail and b) automatically adjusting the system's design and control policy to preemptively mitigate those failures. Existing tools for failure prediction struggle to search over high-dimensional environmental parameters, cannot efficiently handle end-to-end testing for systems with vision in the loop, and provide little guidance on how to mitigate failures once they are discovered. We approach this problem through the lens of approximate Bayesian inference and use differentiable simulation and rendering for efficient failure case prediction and repair. For cases where a differentiable simulator is not available, we provide a gradient-free version of our algorithm, and we include a theoretical and empirical evaluation of the trade-offs between gradient-based and gradient-free methods. We apply our approach on a range of robotics and control problems, including optimizing search patterns for robot swarms, UAV formation control, and robust network control. Compared to optimization-based falsification methods, our method predicts a more diverse, representative set of failure modes, and we find that our use of differentiable simulation yields solutions that have up to 10x lower cost and requires up to 2x fewer iterations to converge relative to gradient-free techniques. In hardware experiments, we find that repairing control policies using our method leads to a 5x robustness improvement. Accompanying code and video can be found at https://mit-realm.github.io/radium/
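
When a differentiable simulator is available, the gradient-accelerated part of this search amounts to adjusting environment parameters against the system's safety margin. The sketch below substitutes a toy quadratic "simulator" for a real rollout and performs plain gradient descent; the paper itself frames the search as sampling-based approximate inference rather than this bare optimization:

```python
import torch

def simulate(env_params):
    """Stand-in for a differentiable rollout; returns a safety margin,
    where margin < 0 would mean the system has failed."""
    return 1.0 - (env_params * torch.tensor([0.8, -0.5])).sum() ** 2

env = torch.zeros(2, requires_grad=True)   # e.g., wind speed, obstacle offset
opt = torch.optim.Adam([env], lr=0.05)
for _ in range(100):
    margin = simulate(env)
    opt.zero_grad()
    margin.backward()      # descending the margin drives toward failure
    opt.step()
print(env.detach(), simulate(env).item())  # candidate failure environment
```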

Space Physiology and Technology: Musculoskeletal Adaptations, Countermeasures, and the Opportunity for Wearable Robotics

Authors: Shamas Ul Ebad Khan, Rejin John Varghese, Panagiotis Kassanos, Dario Farina, Etienne Burdet

Link: http://arxiv.org/abs/2404.03363v1

Abstract: Space poses significant challenges for human physiology, leading to physiological adaptations in response to an environment vastly different from Earth. While these adaptations can be beneficial, they may not fully counteract the adverse impact of space-related stressors. A comprehensive understanding of these physiological adaptations is needed to devise effective countermeasures to support human life in space. This review focuses on the impact of the environment in space on the musculoskeletal system. It highlights the complex interplay between bone and muscle adaptation, the underlying physiological mechanisms, and their implications on astronaut health. Furthermore, the review delves into the deployed and current advances in countermeasures and proposes, as a perspective for future developments, wearable sensing and robotic technologies, such as exoskeletons, as a fitting alternative.

Embodied Neuromorphic Artificial Intelligence for Robotics: Perspectives, Challenges, and Research Development Stack

Authors: Rachmad Vidya Wicaksana Putra, Alberto Marchisio, Fakhreddine Zayer, Jorge Dias, Muhammad Shafique

Link: http://arxiv.org/abs/2404.03325v1

Abstract: Robotic technologies have become indispensable for improving human productivity, helping humans complete diverse, complex, and intensive tasks in a fast yet accurate and efficient way. Therefore, robotic technologies have been deployed in a wide range of applications, ranging from personal to industrial use-cases. However, current robotic technologies and their computing paradigm still lack embodied intelligence to efficiently interact with operational environments, respond with correct/expected actions, and adapt to changes in the environments. Toward this, recent advances in neuromorphic computing with Spiking Neural Networks (SNN) have demonstrated the potential to enable embodied intelligence for robotics through a bio-plausible computing paradigm that mimics how the biological brain works, known as "neuromorphic artificial intelligence (AI)". However, the field of neuromorphic AI-based robotics is still at an early stage; therefore, its development and deployment for solving real-world problems expose new challenges in different design aspects, such as accuracy, adaptability, efficiency, reliability, and security. To address these challenges, this paper will discuss how we can enable embodied neuromorphic AI for robotic systems through our perspectives: (P1) Embodied intelligence based on effective learning rule, training mechanism, and adaptability; (P2) Cross-layer optimizations for energy-efficient neuromorphic computing; (P3) Representative and fair benchmarks; (P4) Low-cost reliability and safety enhancements; (P5) Security and privacy for neuromorphic computing; and (P6) A synergistic development for energy-efficient and robust neuromorphic-based robotics. Furthermore, this paper identifies research challenges and opportunities, as well as elaborates our vision for future research development toward embodied neuromorphic AI for robotics.

DELTA: Decomposed Efficient Long-Term Robot Task Planning using Large Language Models

Authors: Yuchen Liu, Luigi Palmieri, Sebastian Koch, Ilche Georgievski, Marco Aiello

Link: http://arxiv.org/abs/2404.03275v1

Abstract: Recent advancements in Large Language Models (LLMs) have sparked a revolution across various research fields. In particular, the integration of common-sense knowledge from LLMs into robot task and motion planning has been proven to be a game-changer, elevating performance in terms of explainability and downstream task efficiency to unprecedented heights. However, managing the vast knowledge encapsulated within these large models has posed challenges, often resulting in infeasible plans generated by LLM-based planning systems due to hallucinations or missing domain information. To overcome these challenges and obtain even greater planning feasibility and computational efficiency, we propose a novel LLM-driven task planning approach called DELTA. For achieving better grounding from environmental topology into actionable knowledge, DELTA leverages the power of scene graphs as environment representations within LLMs, enabling the fast generation of precise planning problem descriptions. For obtaining higher planning performance, we use LLMs to decompose the long-term task goals into an autoregressive sequence of sub-goals for an automated task planner to solve. Our contribution enables a more efficient and fully automatic task planning pipeline, achieving higher planning success rates and significantly shorter planning times compared to the state of the art.
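
The decomposition step can be pictured as one prompt that serializes the scene graph and asks for an ordered list of sub-goals that a classical planner solves one at a time. In this sketch, `call_llm`, the prompt wording, and the JSON formats are all placeholders, not the DELTA implementation:

```python
import json

def decompose(goal, scene_graph, call_llm):
    """Ask an LLM (client supplied by the caller) for ordered sub-goals."""
    prompt = (
        "Scene graph (JSON): " + json.dumps(scene_graph) + "\n"
        f"Long-term goal: {goal}\n"
        "Reply with an ordered JSON list of sub-goals, each small enough "
        "for a classical task planner to solve."
    )
    return json.loads(call_llm(prompt))   # assumes the reply is pure JSON

scene = {"rooms": ["kitchen", "hall"], "objects": {"cup": "kitchen"}}
# sub_goals = decompose("tidy the apartment", scene, call_llm)
# for sg in sub_goals:
#     plan_and_execute(sg)               # autoregressive sub-goal solving
```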

Design and Evaluation of a Compact 3D End-effector Assistive Robot for Adaptive Arm Support

Authors: Sibo Yang, Lincong Luo, Wei Chuan Law, Youlong Wang, Lei Li, Wei Tech Ang

Link: http://arxiv.org/abs/2404.03149v1

Abstract: We developed a 3D end-effector type of upper limb assistive robot, named the Assistive Robotic Arm Extender (ARAE), that provides transparent movement and adaptive arm support control to achieve home-based therapy and training in real environments. The proposed system comprises five degrees of freedom, including three active motors and two passive joints at the end-effector module. The core structure of the system is based on a parallel mechanism. The kinematic and dynamic modeling are illustrated in detail. The proposed adaptive arm support control framework calculates the compensated force based on the estimated human arm posture in 3D space. It first estimates human arm joint angles using two proposed methods: fixed torso and sagittal plane models, without using external sensors such as IMUs, magnetic sensors, or depth cameras. Experiments were carried out to evaluate the performance of the two proposed angle estimation methods. Then, the estimated human joint angles were input into the human upper limb dynamics model to derive the required support force generated by the robot. Muscular activities were measured to evaluate the effects of the proposed framework. A clear reduction in muscular activity was observed when participants were tested with the ARAE under the adaptive arm gravity compensation control framework. The overall results suggest that the ARAE system, when combined with the proposed control framework, has the potential to offer adaptive arm support. This integration could enable effective training with Activities of Daily Living (ADLs) and interaction with real environments.

2024-04-03

Multi-Robot Planning for Filming Groups of Moving Actors Leveraging Submodularity and Pixel Density

Authors: Skyler Hughes, Rebecca Martin, Micah Corah, Sebastian Scherer

Link: http://arxiv.org/abs/2404.03103v1

Abstract: Observing and filming a group of moving actors with a team of aerial robots is a challenging problem that combines elements of multi-robot coordination, coverage, and view planning. A single camera may observe multiple actors at once, and the robot team may observe individual actors from multiple views. As actors move about, groups may split, merge, and reform, and robots filming these actors should be able to adapt smoothly to such changes in actor formations. Rather than adopt an approach based on explicit formations or assignments, we propose an approach based on optimizing views directly. We model actors as moving polyhedra and compute approximate pixel densities for each face and camera view. Then, we propose an objective that exhibits diminishing returns as pixel densities increase from repeated observation. This gives rise to a multi-robot perception planning problem which we solve via a combination of value iteration and greedy submodular maximization. We evaluate our approach on challenging scenarios modeled after various kinds of social behaviors and featuring different numbers of robots and actors, and observe that robot assignments and formations arise implicitly based on the movements of groups of actors. Simulation results demonstrate that our approach consistently outperforms baselines, and in addition to performing well with the planner's approximation of pixel densities, our approach also performs comparably for evaluation based on rendered views. Overall, the multi-round variant of the sequential planner we propose meets (within 1%) or exceeds the formation and assignment baselines in all scenarios we consider.
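
The diminishing-returns objective can be encoded by scoring a view set with a concave function of accumulated pixel density per actor face; greedy selection then carries the usual near-optimality guarantee for monotone submodular objectives. All numbers below are invented for illustration:

```python
import math

FACES = ["actor1_front", "actor1_left", "actor2_front"]
VIEWS = {                        # pixel density each candidate view adds
    "cam_pose_1": {"actor1_front": 900, "actor1_left": 100, "actor2_front": 0},
    "cam_pose_2": {"actor1_front": 800, "actor1_left": 0,   "actor2_front": 200},
    "cam_pose_3": {"actor1_front": 0,   "actor1_left": 50,  "actor2_front": 950},
}

def value(selected):
    """Concave (log) utility of accumulated density: repeated
    observation of the same face yields smaller and smaller gains."""
    return sum(math.log1p(sum(VIEWS[v][f] for v in selected))
               for f in FACES)

selected = []
for _ in range(2):               # say, one view per robot for two robots
    gain = lambda v: value(selected + [v]) - value(selected)
    selected.append(max(set(VIEWS) - set(selected), key=gain))
print(selected)                  # greedy is near-optimal for submodular value
```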

Unsupervised, Bottom-up Category Discovery for Symbol Grounding with a Curious Robot

Authors: Catherine Henry, Casey Kennington

Link: http://arxiv.org/abs/2404.03092v1

Abstract: Towards addressing the Symbol Grounding Problem and motivated by early childhood language development, we leverage a robot equipped with an approximate model of curiosity, with particular focus on the bottom-up building of unsupervised categories grounded in the physical world. That is, rather than starting with a top-down symbol (e.g., a word referring to an object) and providing meaning through the application of predetermined samples, the robot autonomously and gradually breaks up its exploration space into a series of increasingly specific unlabeled categories, at which point an external expert may optionally provide a symbol association. We extend prior work by using a robot that can observe the visual world, introducing a higher-dimensional sensory space, and using a more generalizable method of category building. Our experiments show that the robot learns categories based on actions and what it visually observes, and that those categories can be symbolically grounded.
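
A toy sketch of what bottom-up category building can look like: recursively split a cloud of sensory feature vectors into finer unlabeled categories while they remain heterogeneous. The variance criterion, thresholds, and 2-means splitting below are illustrative choices, not the authors' method.

```python
import numpy as np

def split_category(samples, var_threshold=0.5, min_size=10, depth=0, max_depth=4):
    """Recursively split sensory feature vectors into increasingly specific
    unlabeled categories while within-category variance stays high.
    A toy stand-in for bottom-up category discovery; thresholds are arbitrary."""
    samples = np.asarray(samples, float)
    if depth >= max_depth or len(samples) < min_size:
        return [samples]
    if samples.var(axis=0).mean() < var_threshold:
        return [samples]  # homogeneous enough to keep as one category
    # Split in two with a few Lloyd (k-means) iterations
    centers = samples[np.random.choice(len(samples), 2, replace=False)]
    for _ in range(20):
        labels = np.linalg.norm(samples[:, None] - centers[None], axis=2).argmin(1)
        for k in range(2):
            if (labels == k).any():
                centers[k] = samples[labels == k].mean(axis=0)
    return [cat for k in range(2) if (labels == k).any()
            for cat in split_category(samples[labels == k], var_threshold,
                                      min_size, depth + 1, max_depth)]
```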

Self-supervised 6-DoF Robot Grasping by Demonstration via Augmented Reality Teleoperation System

Authors: Xiwen Dengxiong, Xueting Wang, Shi Bai, Yunbo Zhang

Link: http://arxiv.org/abs/2404.03067v1

Abstract: Most existing 6-DoF robot grasping solutions depend on strong supervision of grasp poses to ensure satisfactory performance, which can be laborious and impractical when the robot works in restricted areas. To this end, we propose a self-supervised 6-DoF grasp pose detection framework via an Augmented Reality (AR) teleoperation system that can efficiently learn from human demonstrations and provide 6-DoF grasp poses without grasp pose annotations. Specifically, the system collects human demonstrations in the AR environment and contrastively learns the grasping strategy from them. In real-world experiments, the proposed system achieves satisfactory grasping performance and learns to grasp unknown objects within three demonstrations.
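
The contrastive-learning step might resemble an InfoNCE-style loss, where the demonstrated grasp serves as the positive and random grasp poses as negatives; the embeddings, similarity measure, and temperature below are assumptions for illustration, not the paper's formulation.

```python
import numpy as np

def info_nce_loss(anchor, positive, negatives, temperature=0.1):
    """InfoNCE-style contrastive loss: pull the embedding of a demonstrated
    grasp (positive) toward the embedding of the observed scene (anchor),
    push embeddings of random grasp poses (negatives) away."""
    def cos(a, b):
        return float(a @ b) / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-9)

    logits = np.array([cos(anchor, positive)] +
                      [cos(anchor, n) for n in negatives]) / temperature
    logits -= logits.max()                      # numerical stability
    probs = np.exp(logits) / np.exp(logits).sum()
    return -np.log(probs[0])                    # positive sits at index 0
```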

Language, Environment, and Robotic Navigation

Authors: Johnathan E. Avery

Link: http://arxiv.org/abs/2404.03049v1

Abstract: This paper explores the integration of linguistic inputs within robotic navigation systems, drawing upon the symbol interdependency hypothesis to bridge the divide between symbolic and embodied cognition. It examines previous work incorporating language and semantics into Neural Network (NN) and Simultaneous Localization and Mapping (SLAM) approaches, highlighting how these integrations have advanced the field. By contrasting abstract symbol manipulation with sensory-motor grounding, we propose a unified framework where language functions both as an abstract communicative system and as a grounded representation of perceptual experiences. Our review of cognitive models of distributional semantics and their application to autonomous agents underscores the transformative potential of language-integrated systems.

Forming Large Patterns with Local Robots in the OBLOT Model

Authors: Christopher Hahn, Jonas Harbig, Peter Kling

Link: http://arxiv.org/abs/2404.02771v2

Abstract: In the arbitrary pattern formation problem, $n$ autonomous, mobile robots must form an arbitrary pattern $P \subseteq \mathbb{R}^2$. The (deterministic) robots are typically assumed to be indistinguishable, disoriented, and unable to communicate. An important distinction is whether robots have memory and/or a limited viewing range. Previous work managed to form $P$ under a natural symmetry condition if robots have no memory but an unlimited viewing range [22] or if robots have a limited viewing range but memory [25]. In the latter case, $P$ is only formed in a shrunk version that has constant diameter. Without memory and with limited viewing range, forming arbitrary patterns remains an open problem. We provide a partial solution by showing that $P$ can be formed under the same symmetry condition if the robots' initial diameter is $\leq 1$. Our protocol partitions $P$ into rotation-symmetric components and exploits the initial mutual visibility to form one cluster per component. Using a careful placement of the clusters and their robots, we show that a cluster can move in a coordinated way through its component while drawing $P$ by dropping one robot per pattern coordinate.
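
The protocol itself is beyond a short snippet, but the symmetry condition at its heart can be illustrated: the toy check below finds the rotation orders under which a point set is invariant about its centroid. The greedy matching and tolerance are illustrative simplifications.

```python
import numpy as np

def rotational_symmetries(points, tol=1e-6):
    """Return the orders k for which the point set maps onto itself under
    rotation by 2*pi/k about its centroid -- the kind of rotational symmetry
    a pattern-formation protocol must respect. Toy check, not from the paper."""
    pts = np.asarray(points, float)
    pts = pts - pts.mean(axis=0)
    orders = []
    for k in range(2, len(pts) + 1):
        a = 2 * np.pi / k
        R = np.array([[np.cos(a), -np.sin(a)], [np.sin(a), np.cos(a)]])
        rotated = pts @ R.T
        used = np.zeros(len(pts), bool)
        ok = True
        for p in rotated:  # greedily match each rotated point to an original
            d = np.linalg.norm(pts - p, axis=1)
            d[used] = np.inf
            i = d.argmin()
            if d[i] > tol:
                ok = False
                break
            used[i] = True
        if ok:
            orders.append(k)
    return orders

print(rotational_symmetries([(1, 0), (-1, 0), (0, 1), (0, -1)]))  # -> [2, 4]
```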

Unsupervised Learning of Effective Actions in Robotics

Authors: Marko Zaric, Jakob Hollenstein, Justus Piater, Erwan Renaudo

Link: http://arxiv.org/abs/2404.02728v1

Abstract: Learning actions that are relevant to decision-making and can be executed effectively is a key problem in autonomous robotics. Current state-of-the-art action representations in robotics lack proper effect-driven learning of the robot's actions. Although successful in solving manipulation tasks, deep learning methods also lack this ability, in addition to their high cost in terms of memory or training data. In this paper, we propose an unsupervised algorithm to discretize a continuous motion space and generate "action prototypes", each producing different effects in the environment. After an exploration phase, the algorithm automatically builds a representation of the effects and groups motions into action prototypes, in which motions that are more likely to produce an effect are represented more densely than those that lead to negligible changes. We evaluate our method on a simulated stair-climbing reinforcement learning task, and the preliminary results show that our effect-driven discretization outperforms uniformly and randomly sampled discretizations in both convergence speed and maximum reward.
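
One way to picture the discretization is a weighted clustering in effect space, where motions with larger effects carry more weight in shaping the prototypes. The sketch below uses weighted k-means as a stand-in; this is an illustrative choice, not the authors' algorithm.

```python
import numpy as np

def build_action_prototypes(motions, effects, n_prototypes=5, iters=30):
    """Cluster explored motions by the effect vectors they produced and return
    one prototype motion per effect cluster. High-magnitude effects dominate,
    mirroring the idea that effective motions should be represented more.
    `motions` and `effects` are 2D arrays (one row per exploration sample)."""
    motions = np.asarray(motions, float)
    effects = np.asarray(effects, float)
    weights = np.linalg.norm(effects, axis=1) + 1e-9   # effect magnitude
    rng = np.random.default_rng(0)
    centers = effects[rng.choice(len(effects), n_prototypes, replace=False)]
    for _ in range(iters):  # weighted k-means in effect space
        labels = np.linalg.norm(effects[:, None] - centers[None], axis=2).argmin(1)
        for k in range(n_prototypes):
            m = labels == k
            if m.any():
                centers[k] = np.average(effects[m], axis=0, weights=weights[m])
    # Prototype = effect-weighted mean motion of each cluster
    return [np.average(motions[labels == k], axis=0, weights=weights[labels == k])
            for k in range(n_prototypes) if (labels == k).any()]
```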

SliceIt! -- A Dual Simulator Framework for Learning Robot Food Slicing

Authors: Cristian C. Beltran-Hernandez, Nicolas Erbetti, Masashi Hamaya

Link: http://arxiv.org/abs/2404.02569v1

Abstract: Cooking robots can enhance the home experience by reducing the burden of daily chores. However, these robots must perform their tasks dexterously and safely in shared human environments, especially when handling dangerous tools such as kitchen knives. This study focuses on enabling a robot to autonomously and safely learn food-cutting tasks. More specifically, our goal is to enable a collaborative robot or industrial robot arm to perform food-slicing tasks by adapting to varying material properties using compliance control. Our approach involves using Reinforcement Learning (RL) to train a robot to compliantly manipulate a knife by reducing the contact forces exerted by the food items and the cutting board. However, training the robot in the real world can be inefficient and dangerous, and can result in substantial food waste. Therefore, we propose SliceIt!, a framework for safely and efficiently learning robot food-slicing tasks in simulation. Following a real2sim2real approach, our framework consists of collecting a small amount of real food-slicing data, calibrating our dual simulation environment (a high-fidelity cutting simulator and a robotic simulator), learning compliant control policies in the calibrated simulation environment, and finally deploying the policies on the real robot.
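
Compliance control of the kind described is commonly realized with an admittance law, in which the commanded motion yields to contact-force error. The 1-DoF sketch below shows the idea; the gains, time step, and interfaces are illustrative, not SliceIt!'s.

```python
def admittance_step(x, v, f_measured, f_desired,
                    M=1.0, D=20.0, K=100.0, dt=0.002):
    """One step of a 1-DoF admittance controller along the cutting axis:
    a virtual mass-spring-damper (M*a + D*v + K*x = f_err) converts contact
    force error into a compliant position offset. Gains are placeholders."""
    f_err = f_measured - f_desired
    a = (f_err - D * v - K * x) / M
    v_next = v + a * dt
    x_next = x + v_next * dt
    return x_next, v_next  # offsets added to the nominal slicing trajectory

# Example: the offset grows while measured force exceeds the 2 N target
x, v = 0.0, 0.0
for f in [5.0, 4.0, 3.0]:          # measured contact forces [N]
    x, v = admittance_step(x, v, f, f_desired=2.0)
print(x, v)
```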

On-the-Go Tree Detection and Geometric Traits Estimation with Ground Mobile Robots in Fruit Tree Groves

Authors: Dimitrios Chatziparaschis, Hanzhe Teng, Yipeng Wang, Pamodya Peiris, Elia Scudiero, Konstantinos Karydis

Link: http://arxiv.org/abs/2404.02516v1

Abstract: By-tree information gathering is an essential task in precision agriculture achieved by ground mobile sensors, but it can be time- and labor-intensive. In this paper we present an algorithmic framework to perform real-time and on-the-go detection of trees and key geometric characteristics (namely, width and height) with wheeled mobile robots in the field. Our method is based on the fusion of 2D domain-specific data (normalized difference vegetation index [NDVI], acquired via a red-green-near-infrared [RGN] camera) and 3D LiDAR point clouds, via a customized tree landmark association and parameter estimation algorithm. The proposed system features a multi-modal, entropy-based landmark correspondence approach, integrated into an underlying Kalman filter system to recognize the surrounding trees and jointly estimate their spatial and vegetation-based characteristics. Realistic simulated tests are used to evaluate our proposed algorithm's behavior in a variety of settings. Physical experiments in agricultural fields help validate our method's efficacy in acquiring accurate by-tree information on the go and in real time using only onboard computational and sensing resources.
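
The NDVI component is a standard computation, NDVI = (NIR - R) / (NIR + R). A minimal sketch, assuming an (R, G, NIR) channel ordering for the RGN image (the actual camera layout may differ):

```python
import numpy as np

def ndvi(rgn_image):
    """Normalized Difference Vegetation Index from a red-green-near-infrared
    image: NDVI = (NIR - R) / (NIR + R), in [-1, 1]. Channel order assumed."""
    r = rgn_image[..., 0].astype(float)
    nir = rgn_image[..., 2].astype(float)
    return (nir - r) / np.clip(nir + r, 1e-6, None)

# A crude vegetation mask via a common rule-of-thumb threshold
img = np.random.randint(0, 256, (4, 4, 3))
mask = ndvi(img) > 0.3
```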

Tightly-Coupled LiDAR-IMU-Wheel Odometry with Online Calibration of a Kinematic Model for Skid-Steering Robots

Authors: Taku Okawara, Kenji Koide, Shuji Oishi, Masashi Yokozuka, Atsuhiko Banno, Kentaro Uno, Kazuya Yoshida

Link: http://arxiv.org/abs/2404.02515v1

Abstract: Tunnels and long corridors are challenging environments for mobile robots because LiDAR point clouds degenerate in them. To tackle point cloud degeneration, this study presents a tightly-coupled LiDAR-IMU-wheel odometry algorithm with online calibration for skid-steering robots. We propose a full linear wheel odometry factor, which not only serves as a motion constraint but also performs online calibration of the kinematic model of skid-steering robots. Despite dynamically changing kinematic parameters (e.g., wheel radii changing with tire pressure) and terrain conditions, our method can address the model error via online calibration. Moreover, our method enables accurate localization in degenerate environments, such as long straight corridors, by relying on the calibration performed while the LiDAR-IMU fusion still operates reliably. Furthermore, we estimate the uncertainty (i.e., covariance matrix) of the wheel odometry online to create a reasonable constraint. The proposed method is validated through three experiments. The first, indoor, experiment shows that the proposed method is robust to severe degeneracy (long corridors) and changes in the wheel radii. The second, outdoor, experiment demonstrates that our method accurately estimates the sensor trajectory on rough outdoor terrain, owing to the online uncertainty estimation of the wheel odometry. The third experiment shows that the proposed online calibration enables robust odometry estimation across changing terrains.
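
A "full linear" wheel odometry model can be pictured as a linear map from wheel rates to the body twist. In the paper that map is calibrated online inside the factor graph; the sketch below simply applies a given matrix, seeded with nominal differential-drive values. All numbers and the matrix structure are illustrative.

```python
import numpy as np

def body_twist(omega_l, omega_r, J):
    """Full linear skid-steering model: body twist (v_x, v_y, w_z) as a
    linear function of left/right wheel rates. The 3x2 matrix J absorbs
    wheel radii, track width, and slip; here it is simply given."""
    return J @ np.array([omega_l, omega_r])

# Nominal differential-drive J as an illustrative starting point
r, b = 0.1, 0.4                     # wheel radius [m], track width [m]
J0 = np.array([[r / 2,  r / 2],     # v_x
               [0.0,    0.0],       # v_y (becomes nonzero once slip is calibrated)
               [-r / b, r / b]])    # w_z
print(body_twist(2.0, 3.0, J0))
```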

PromptRPA: Generating Robotic Process Automation on Smartphones from Textual Prompts

Authors: Tian Huang, Chun Yu, Weinan Shi, Zijian Peng, David Yang, Weiqi Sun, Yuanchun Shi

Link: http://arxiv.org/abs/2404.02475v1

Abstract: Robotic Process Automation (RPA) offers a valuable solution for efficiently automating tasks on the graphical user interface (GUI), by emulating human interactions, without modifying existing code. However, its broader adoption is constrained by the need for expertise in both scripting languages and workflow design. To address this challenge, we present PromptRPA, a system designed to comprehend various task-related textual prompts (e.g., goals, procedures), thereby generating and performing corresponding RPA tasks. PromptRPA incorporates a suite of intelligent agents that mimic human cognitive functions, specializing in interpreting user intent, managing external information for RPA generation, and executing operations on smartphones. The agents can learn from user feedback and continuously improve their performance based on the accumulated knowledge. Experimental results indicated a performance jump from a 22.28% success rate in the baseline to 95.21% with PromptRPA, requiring an average of 1.66 user interventions for each new task. PromptRPA presents promising applications in fields such as tutorial creation, smart assistance, and customer service.
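
At a structural level, such a system can be pictured as a pipeline of cooperating stages: interpret the textual prompt, ground each step in the current GUI, execute, and accumulate knowledge from completed tasks. The toy skeleton below shows that shape only; every interface is hypothetical, with placeholders where PromptRPA would invoke its LLM-based agents and smartphone execution.

```python
from dataclasses import dataclass, field

@dataclass
class RPAAgentPipeline:
    """Hypothetical skeleton of a prompt-to-RPA pipeline, in the spirit of
    (but not copied from) PromptRPA. All interfaces are invented."""
    knowledge: list = field(default_factory=list)

    def interpret(self, prompt: str) -> list[str]:
        # Placeholder: a real system would call an LLM to parse intent here.
        return [s.strip() for s in prompt.split(",") if s.strip()]

    def ground(self, step: str, gui_tree: dict) -> str:
        # Placeholder heuristic: match the step against on-screen widgets.
        for widget in gui_tree.get("widgets", []):
            if widget.lower() in step.lower():
                return f"tap({widget})"
        return f"search({step})"

    def run(self, prompt: str, gui_tree: dict) -> list[str]:
        actions = [self.ground(s, gui_tree) for s in self.interpret(prompt)]
        self.knowledge.append((prompt, actions))  # retain for future tasks
        return actions

pipeline = RPAAgentPipeline()
print(pipeline.run("open Settings, tap Wi-Fi", {"widgets": ["Settings", "Wi-Fi"]}))
```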