Korea Robotics Society
The 17th Korea Robotics Society Annual Conference (KRoC 2022) - General Papers/Special Sessions (OS)
Flagship Conferences
In this session, authors who presented outstanding papers at flagship conferences in robotics and AI (ICRA, IROS, ICML, etc.) introduce their papers and share the behind-the-scenes story of the submission process they went through while preparing them. Through the authors' talks, we hope to offer useful guidance to students preparing their first submission to an international conference. We also hope the session gives researchers in industry, academia, and research institutes who are interested in these papers an easy way to become familiar with the work.
[Flagship Conferences Schedule]
- Date and time: Friday, May 13, 2022, 10:30~12:00
- Venue: Timber Hall
- Chair: Younggun Cho (조영근; Inha University, Professor)
[Program]
Speaker
Kimin Lee (이기민; UC Berkeley, Ph.D.)
Title
PEBBLE: Feedback-Efficient Interactive Reinforcement Learning via Relabeling Experience and Unsupervised Pre-training
Conference
2021, ICML (International Conference on Machine Learning), Oral (3% Acceptance Rate)
Abstract
Conveying complex objectives to reinforcement learning (RL) agents can often be difficult, involving meticulous design of reward functions that are sufficiently informative yet easy enough to provide. Human-in-the-loop RL methods allow practitioners to instead interactively teach agents through tailored feedback; however, such approaches have been challenging to scale since human feedback is very expensive. In this work, we aim to make this process more sample- and feedback-efficient. We present an off-policy, interactive RL algorithm that capitalizes on the strengths of both feedback and off-policy learning. Specifically, we learn a reward model by actively querying a teacher's preferences between two clips of behavior and use it to train an agent. To enable off-policy learning, we relabel all the agent's past experience when its reward model changes. We additionally show that pre-training our agents with unsupervised exploration substantially increases the mileage of its queries. We demonstrate that our approach is capable of learning tasks of higher complexity than previously considered by human-in-the-loop methods, including a variety of locomotion and robotic manipulation skills. We also show that our method is able to utilize real-time human feedback to effectively prevent reward exploitation and learn new behaviors that are difficult to specify with standard reward functions.
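The reward-learning and relabeling steps described in this abstract can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: it assumes a Bradley-Terry style preference model over the summed predicted rewards of each clip, and the names `preference_prob`, `preference_loss`, `relabel`, and the `(state, action, reward, next_state)` buffer layout are hypothetical.

```python
import math


def preference_prob(rewards_a, rewards_b):
    """Probability that clip A is preferred over clip B.

    Assumption: a Bradley-Terry model, i.e. a logistic function of the
    difference between the summed predicted rewards of the two clips.
    """
    diff = sum(rewards_a) - sum(rewards_b)
    # Numerically stable logistic function.
    if diff >= 0:
        return 1.0 / (1.0 + math.exp(-diff))
    e = math.exp(diff)
    return e / (1.0 + e)


def preference_loss(rewards_a, rewards_b, label):
    """Cross-entropy loss for one teacher query.

    label is 1.0 if the teacher preferred clip A and 0.0 if clip B.
    """
    p = preference_prob(rewards_a, rewards_b)
    eps = 1e-12  # guards against log(0)
    return -(label * math.log(p + eps) + (1.0 - label) * math.log(1.0 - p + eps))


def relabel(buffer, reward_model):
    """Overwrite every stored reward with the current reward model's prediction.

    buffer is a list of (state, action, reward, next_state) tuples
    (hypothetical layout); reward_model is any callable (state, action) -> float.
    """
    return [(s, a, reward_model(s, a), s_next) for (s, a, _old, s_next) in buffer]
```

In this sketch, `relabel` would be called each time the reward model is updated, so that an off-policy learner never trains on rewards produced by an outdated model.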
Speaker
Seungwoo Hong (홍승우; KAIST, Ph.D. candidate)
Title
Real-time constrained nonlinear model predictive control on SO(3) for dynamic legged locomotion
Conference
2020, IROS (International Conference on Intelligent Robots and Systems), Best RoboCup Paper Award
Abstract
This work proposes a constrained nonlinear model predictive control (NMPC) method for dynamic locomotion of legged robots. The legged robot is modeled as a single rigid body with a floating base, and the ground reaction forces applied to the body are treated as external forces. In particular, the 3D rotational state of the robot body is represented not with local parameterizations such as Euler angles but with rotation matrices defined on the special orthogonal group SO(3). The difficulty with rotation matrices on the SO(3) manifold is that general-purpose optimization methods cannot be applied to them directly. To address this, the work uses the exponential map as a retraction on the SO(3) manifold, expressing the rotation matrix variable as a variation in the tangent space at the current point, and from this derives the analytic Jacobians needed to compute the gradient of the cost function and the Gauss-Newton approximation matrix. Written out, the NMPC problem takes the form of a constrained nonlinear least-squares problem, whose optimal solution can be computed in real time with an efficient Gauss-Newton algorithm. The proposed NMPC was applied to a variety of dynamic gaits, including walking on a wall, and to several types of quadruped robots to demonstrate the algorithm's performance.
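The idea of updating a rotation matrix through a tangent-space variation and the exponential map can be illustrated with a short Python sketch. It uses the standard Rodrigues formula for the SO(3) exponential map; the right-multiplied update `R * exp(hat(delta))` and the helper names `hat`, `exp_so3`, and `retract` are assumptions made for illustration, not the formulation used in the paper.

```python
import math


def hat(w):
    """Map a 3-vector to the corresponding 3x3 skew-symmetric matrix."""
    x, y, z = w
    return [[0.0, -z, y],
            [z, 0.0, -x],
            [-y, x, 0.0]]


def matmul(A, B):
    """Product of two 3x3 matrices stored as nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def exp_so3(w):
    """Exponential map from a tangent vector w to a rotation matrix.

    Rodrigues formula: exp(K) = I + (sin t / t) K + ((1 - cos t) / t^2) K^2,
    where K = hat(w) and t = |w|.
    """
    theta = math.sqrt(sum(c * c for c in w))
    K = hat(w)
    K2 = matmul(K, K)
    if theta < 1e-8:
        a, b = 1.0, 0.5  # limits of the two coefficients as theta -> 0
    else:
        a = math.sin(theta) / theta
        b = (1.0 - math.cos(theta)) / (theta * theta)
    return [[(1.0 if i == j else 0.0) + a * K[i][j] + b * K2[i][j]
             for j in range(3)] for i in range(3)]


def retract(R, delta):
    """Move from rotation R along tangent vector delta, staying on SO(3).

    Assumption: the variation is applied on the right (body frame).
    """
    return matmul(R, exp_so3(delta))
```

An optimizer such as Gauss-Newton can then search over the unconstrained 3-vector `delta` at each iteration and apply `retract` to obtain the next rotation matrix, which remains a valid rotation up to floating-point error.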
Speaker
Jeongmin Lee (이정민; Seoul National University, Ph.D. candidate)
Title
Subsystem-Based Parallel and Modular Dynamics Simulation (A Parallelized Iterative Algorithm for Real-Time Simulation of Long Flexible Cable Manipulation)
Conference
2021, ICRA (International Conference on Robotics and Automation), Best Manipulation Paper Award Finalist
Abstract
Advances in AI and virtual reality have increased interest in dynamics simulation. Simulation for robotics in particular remains a considerably difficult problem, because it involves multiple contacts and multibody systems (flexible bodies, rigid bodies, etc.) while demanding both speed and accuracy. This work presents a subsystem-based, parallel, and modular dynamics simulation framework. The framework is well suited to recent advances in parallel computing and also makes it possible to integrate algorithms developed for specific systems. The key element is solving the coupling between subsystems stably and efficiently, and the talk presents a methodology for doing so.
Speaker
Woojong Kim (김우종; KAIST, Ph.D. candidate)
Title
Flat Fabric Pneumatic Artificial Muscle for Soft Wearable Robotic Devices (Compact Flat Fabric Pneumatic Artificial Muscle (ffPAM) for Soft Wearable Robotic Devices)
Conference
2021, ICRA (International Conference on Robotics and Automation), Best Paper Award in Service Robotics
Abstract
This work, which won the Best Paper Award in Service Robotics at the International Conference on Robotics and Automation (ICRA 2021) held in June 2021, proposes a flat fabric pneumatic artificial muscle for soft wearable robotic devices. By using a special fabric material and a compact flat design, the work demonstrated high practical utility, including miniaturization, fast response time, and low hysteresis error. The actuator also embeds a capacitive contraction sensor for closed-loop length control. A range of experimental results clearly showed that the actuator could have a broad impact on wearable service robotics, and the award was given in recognition of this.