Hiwi (Student Research Assistant) at KI Fabrik

1. Multi-Modal Demonstration Collection

I collected multi-modal human demonstration data to enable skill planning for robotic manipulation with Large Language Models (LLMs). This work covers:

  • Data Collection: Capturing human motion, force feedback, and vision-based interactions.
  • Skill Representation: Translating the collected data into a structured format for robot skill learning (a minimal sketch follows this list).
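
To illustrate the second point, the sketch below shows one possible structured representation of a demonstration segment and its serialization into JSON for an LLM planner. The schema, field names, and the `to_llm_prompt` helper are hypothetical, not the project's actual format.

```python
from dataclasses import dataclass, field
import json

@dataclass
class DemoSegment:
    """One segment of a multi-modal human demonstration (hypothetical schema)."""
    skill_name: str   # e.g. "grasp", "insert"
    ee_poses: list    # end-effector poses, each [x, y, z, qx, qy, qz, qw]
    wrenches: list    # force-torque samples, each [fx, fy, fz, tx, ty, tz]
    rgb_frames: list = field(default_factory=list)  # paths to synchronized images

def to_llm_prompt(segments: list) -> str:
    """Summarize segments as JSON that an LLM planner can reason over."""
    summary = [
        {
            "skill": s.skill_name,
            "num_poses": len(s.ee_poses),
            "peak_force_N": max(
                (w[0] ** 2 + w[1] ** 2 + w[2] ** 2) ** 0.5 for w in s.wrenches
            ),
        }
        for s in segments
    ]
    return json.dumps(summary, indent=2)

seg = DemoSegment(
    skill_name="insert",
    ee_poses=[[0.40, 0.00, 0.20, 0.0, 0.0, 0.0, 1.0]],
    wrenches=[[0.1, -0.2, -5.2, 0.0, 0.0, 0.0]],
)
print(to_llm_prompt([seg]))
```

Summarizing each segment rather than dumping raw trajectories keeps the prompt compact enough for an LLM context window while preserving the task-relevant cues (skill label, contact forces).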

2. Skill-Based Robot Control (MIOS)

I extended MIOS, a skill-based robot control platform, by programming new skills and controllers:

  • Motion Primitives: Developing reusable skill modules for diverse robotic tasks (an illustrative skeleton follows this list).
  • New Controllers: Designing controllers for robust execution of skills.
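
The following is a minimal sketch of what a reusable skill module can look like: a primitive with a `step`/`done` interface and a proportional Cartesian controller. The class and method names are illustrative and do not reflect the actual MIOS API.

```python
from abc import ABC, abstractmethod

class MotionPrimitive(ABC):
    """Illustrative skill-module base class (not the actual MIOS API)."""

    def __init__(self, params: dict):
        self.params = params

    @abstractmethod
    def step(self, state: dict) -> dict:
        """Compute one control command from the current robot state."""

    @abstractmethod
    def done(self, state: dict) -> bool:
        """Report whether the primitive's termination condition is met."""

class CartesianMove(MotionPrimitive):
    """Drive the end effector toward a target position with a P-controller."""

    def step(self, state):
        gain = self.params.get("gain", 1.0)
        # Proportional velocity command toward the target position
        return {"ee_velocity": [
            gain * (t - p)
            for t, p in zip(self.params["target_position"], state["ee_position"])
        ]}

    def done(self, state):
        err = sum(
            (t - p) ** 2
            for t, p in zip(self.params["target_position"], state["ee_position"])
        ) ** 0.5
        return err < self.params.get("tolerance", 1e-3)

move = CartesianMove({"target_position": [0.4, 0.0, 0.2], "gain": 0.5})
print(move.step({"ee_position": [0.3, 0.0, 0.2]}))
```

Separating the termination condition from the control law is what makes primitives composable: a planner can chain skills by waiting for `done` before dispatching the next one.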

3. Haptic Sensing Integration

To enhance manipulation capabilities, I integrated external force sensing into MIOS, enabling:

  • Force Feedback Processing: Using external force-torque sensors to improve control precision (a filtering sketch follows below).
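
A minimal sketch of the kind of force-feedback processing involved, assuming a generic 6-axis force-torque sensor: bias calibration in free space, exponential low-pass filtering, and a simple contact check. The interface, gains, and thresholds are assumptions, not the integrated sensor's driver API.

```python
import numpy as np

class ForceProcessor:
    """Illustrative force-torque preprocessing (hypothetical interface)."""

    def __init__(self, alpha: float = 0.2, contact_threshold: float = 3.0):
        self.alpha = alpha                          # low-pass smoothing factor
        self.contact_threshold = contact_threshold  # contact force threshold [N]
        self.bias = np.zeros(6)
        self.filtered = np.zeros(6)

    def calibrate(self, free_space_samples: np.ndarray) -> None:
        """Estimate the sensor bias from wrenches recorded without contact."""
        self.bias = free_space_samples.mean(axis=0)

    def update(self, wrench: np.ndarray) -> np.ndarray:
        """Remove the bias and apply an exponential low-pass filter."""
        self.filtered = (1 - self.alpha) * self.filtered \
            + self.alpha * (wrench - self.bias)
        return self.filtered

    def in_contact(self) -> bool:
        """Declare contact when the filtered force magnitude exceeds the threshold."""
        return float(np.linalg.norm(self.filtered[:3])) > self.contact_threshold

proc = ForceProcessor()
proc.calibrate(np.random.normal(0.0, 0.05, size=(100, 6)))
for _ in range(20):  # stream a constant contact wrench through the filter
    wrench = proc.update(np.array([0.1, 0.0, -8.0, 0.0, 0.0, 0.0]))
print(wrench, proc.in_contact())
```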

4. Robot Vision System Implementation

Several vision-based techniques were implemented for perception tasks:

  • Hand-Eye Calibration: Establishing an accurate transform between the camera and the robot (see the OpenCV sketch after this list).
  • 3D Reconstruction: Utilizing BundleSDF for environment mapping.
  • 6D Pose Estimation: Leveraging FoundationPose for object localization.
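
For the hand-eye calibration step, OpenCV provides `cv2.calibrateHandEye`, which solves the classic AX = XB formulation. The sketch below uses random placeholder poses only so it runs standalone; in practice the inputs are recorded gripper-to-base transforms and calibration-board detections captured at the same instants.

```python
import cv2
import numpy as np

rng = np.random.default_rng(0)

def random_pose():
    """Placeholder pose; replace with real calibration data."""
    R, _ = cv2.Rodrigues(rng.uniform(-0.5, 0.5, (3, 1)))  # random axis-angle rotation
    t = rng.uniform(-0.1, 0.1, (3, 1))                    # translation in meters
    return R, t

poses = [random_pose() for _ in range(10)]    # gripper-to-base (forward kinematics)
boards = [random_pose() for _ in range(10)]   # target-to-camera (board detection)

R_cam2gripper, t_cam2gripper = cv2.calibrateHandEye(
    [R for R, _ in poses], [t for _, t in poses],
    [R for R, _ in boards], [t for _, t in boards],
    method=cv2.CALIB_HAND_EYE_TSAI,
)
print("camera-to-gripper rotation:\n", R_cam2gripper)
print("camera-to-gripper translation:\n", t_cam2gripper)
```

The resulting camera-to-gripper transform is what lets BundleSDF reconstructions and FoundationPose estimates be expressed in the robot's frame.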

5. Web Development

I contributed to web development for Project LEMMo-Plan: LLM-Enhanced Learning from Multi-Modal Demonstration for Planning Sequential Contact-Rich Manipulation Tasks:

  • Project Website: Developed and maintained the LEMMo-Plan project website.

6. Contribution to Deformable Linear Object Assembly

I continued my work on deformable linear object assembly by implementing:

  • Virtual-Tactile-Based Local Position Correction: Correcting the robot's position during assembly using virtual tactile feedback (a minimal sketch follows this list).
  • Real Robot Experiments: Conducting hardware experiments to validate the proposed approach.
  • Publication: Submitted to IROS 2025.
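
A minimal sketch of the idea behind tactile-based local correction, under assumed gains and axes: the lateral component of the measured contact force drives a small, clamped offset of the commanded end-effector position. The function and its parameters are illustrative, not the submitted method.

```python
import numpy as np

def correct_position(ee_position: np.ndarray,
                     contact_force: np.ndarray,
                     gain: float = 2e-4,
                     max_step: float = 1e-3) -> np.ndarray:
    """Shift the end effector away from the lateral contact force (illustrative)."""
    lateral = contact_force.copy()
    lateral[2] = 0.0              # keep the assumed insertion axis (z) unchanged
    step = -gain * lateral        # move opposite to the lateral contact
    norm = np.linalg.norm(step)
    if norm > max_step:           # clamp the per-cycle correction for safety
        step *= max_step / norm
    return ee_position + step

pos = np.array([0.40, 0.00, 0.15])
force = np.array([1.5, -0.8, -4.0])  # sample wrench with a lateral contact component
print(correct_position(pos, force))
```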