Robotics – Page 3 – The AI Sector

Skip to content Skip to sidebar Skip to footer

This AI Paper Introduces an LLM+FOON Framework: A Graph-Validated Approach for Robotic Cooking Task Planning from Video Instructions

RoboticsApril 16, 202573Views 0Likes 0Comments

Robots are increasingly being developed for home environments, specifically to enable them to perform daily activities like cooking. These tasks involve a combination of visual interpretation, manipulation, and decision-making across a series of actions. Cooking, in particular, is complex for robots due to the diversity in utensils, varying visual perspectives, and frequent omissions of intermediate…

Sensor-Invariant Tactile Representation for Zero-Shot Transfer Across Vision-Based Tactile Sensors

RoboticsApril 11, 202549Views 0Likes 0Comments

Tactile sensing is a crucial modality for intelligent systems to perceive and interact with the physical world. The GelSight sensor and its variants have emerged as influential tactile technologies, providing detailed information about contact surfaces by transforming tactile data into visual images. However, vision-based tactile sensing lacks transferability between sensors due to design and manufacturing…

NVIDIA AI Releases HOVER: A Breakthrough AI for Versatile Humanoid Control in Robotics

RoboticsApril 6, 202571Views 0Likes 0Comments

The future of robotics has advanced significantly. For many years, there have been expectations of human-like robots that can navigate our environments, perform complex tasks, and work alongside humans. Examples include robots conducting precise surgical procedures, building intricate structures, assisting in disaster response, and cooperating efficiently with humans in various settings such as factories, offices,…

Google DeepMind’s Gemini Robotics: Unleashing Embodied AI with Zero-Shot Control and Enhanced Spatial Reasoning

RoboticsMarch 17, 202586Views 0Likes 0Comments

Google DeepMind has shattered conventional boundaries in robotics AI with the unveiling of Gemini Robotics, a suite of models built upon the formidable foundation of Gemini 2.0. This isn’t just an incremental upgrade; it’s a paradigm shift, propelling AI from the digital realm into the tangible world with unprecedented “embodied reasoning” capabilities. Gemini Robotics: Bridging…

Optimizing Imitation Learning: How X‑IL is Shaping the Future of Robotics

RoboticsMarch 2, 2025109Views 0Likes 0Comments

Designing imitation learning (IL) policies involves many choices, such as selecting features, architecture, and policy representation. The field is advancing quickly, introducing many new techniques and increasing complexity, making it difficult to explore all possible designs and understand their impact. IL enables agents to learn through demonstrations rather than reward-based approaches. The increasing number of…

π0 Released and Open Sourced: A General-Purpose Robotic Foundation Model that could be Fine-Tuned to a Diverse Range of Tasks

RoboticsFebruary 10, 2025140Views 0Likes 0Comments

Robots are usually unsuitable for altering different tasks and environments. General-purpose models of robots are devised to circumvent this problem. They allow fine-tuning these general-purpose models for a wide scope of robotic tasks. However, it is challenging to maintain the consistency of shared open resources across various platforms. Success in real-world environments is far from…

Researchers from Stanford and Cornell Introduce APRICOT: A Novel AI Approach that Merges LLM-based Bayesian Active Preference Learning with Constraint-Aware Task Planning

RoboticsNovember 16, 2024104Views 0Likes 0Comments

In the rapidly evolving field of household robotics, a significant challenge has emerged in executing personalized organizational tasks, such as arranging groceries in a refrigerator. These tasks require robots to balance user preferences with physical constraints while avoiding collisions and maintaining stability. While Large Language Models (LLMs) enable natural language communication of user preferences, this…

Google DeepMind Researchers Propose RT-Affordance: A Hierarchical Method that Uses Affordances as an Intermediate Representation for Policies

RoboticsNovember 11, 2024115Views 0Likes 0Comments

In recent years, there has been significant development in the field of large pre-trained models for learning robot policies. The term “policy representation” here refers to the different ways of interfacing with the decision-making mechanisms of robots, which can potentially facilitate generalization to new tasks and environments. Vision-language-action (VLA) models are pre-trained with large-scale robot…

Latent Action Pretraining for General Action models (LAPA): An Unsupervised Method for Pretraining Vision-Language-Action (VLA) Models without Ground-Truth Robot Action Labels

RoboticsOctober 22, 2024142Views 0Likes 0Comments

Vision-Language-Action Models (VLA) for robotics are trained by combining large language models with vision encoders and then fine-tuning them on various robot datasets; this allows generalization to new instructions, unseen objects, and distribution shifts. However, various real-world robot datasets mostly require human control, which makes scaling difficult. On the other hand, Internet video data offers…

Theia: A Robot Vision Foundation Model that Simultaneously Distills Off-the-Shelf VFMs such as CLIP, DINOv2, and ViT

RoboticsAugust 3, 2024152Views 0Likes 0Comments

Visual understanding is the abstracting of high-dimensional visual signals like images and videos. Many problems are involved in this process, ranging from depth prediction and vision-language correspondence to classification and object grounding, which include tasks defined along spatial and temporal axes and tasks defined along coarse to fine granularity, like object grounding. In light of…