Academia Sinica, Institute of Information Science

Events

Seminar

TIGP (AIoT) -- Challenges of Virtual-to-Real Learning for Deep Learning Based Intelligent Robotics

  • Speaker: Prof. Chun-Yi Lee (李濬屹), Department of Computer Science, National Tsing Hua University
    Host: TIGP (AIoT)
  • Time: 2023-12-01 (Fri.) 14:00 ~ 16:00
  • Venue: Auditorium 106, IIS New Building
Abstract
Collecting data on a large scale is vital for developing cutting-edge artificial intelligence (AI) technologies when they involve machine learning (ML) models, such as deep neural networks, that must be trained on relevant data. On the one hand, collecting real-world data with cameras or microphones would allow AIs to better understand our everyday lives and, ultimately, to behave as naturally as humans do or to assist us in a natural fashion. On the other hand, growing concerns about security and privacy are making such real data increasingly difficult to collect.

This presentation discusses a computational framework for real-data collection and learning that effectively leverages a collection of AI models for self-navigating mobile robots. We focus in particular on developing visual perception models that can see the real world through a camera, as these models play a pivotal role in a variety of AI-powered products and services, such as autonomous vehicles and smart cities, and are the main research area to which Elsa Lab has been contributing. Visual perception models based on deep neural networks have achieved unprecedented accuracy on benchmark datasets, and deploying them would enable edge AIs to better perceive and understand their surrounding environments and act intelligently in the real world. In practice, however, they suffer from accuracy drops and a shortage of effective real-world data samples, leading to unsatisfactory performance and safety concerns in deployment.

To address these problems, we explore and incorporate three key technologies into a single framework: virtual-to-real transfer, semantic-segmentation-based unsupervised domain adaptation (UDA), and mid-level representations. Virtual-to-real transfer allows ML models to be trained first in simulated environments and then migrated to real-world settings with ease. Semantic-segmentation-based UDA makes this migration feasible even in practical, challenging scenarios where collecting real-world data would entail labor-intensive manual preprocessing costs. Finally, mid-level representations carry various types of information from the perception module to the control module and form the basis of modular frameworks for many learning-based systems. The main scientific challenge of this research direction is to integrate these technologies into a unified solution and to improve the adaptation ability of AI models in the real world.
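
As a rough illustration of how these three ingredients can fit together, the sketch below wires a toy segmentation network, an adversarial domain discriminator, and a small control policy into one training step. It is a minimal sketch under our own assumptions, not the speaker's or Elsa Lab's actual system: the architectures, the class count NUM_CLASSES, the 0.01 adversarial weight, and the random tensors standing in for simulator and real camera frames are all invented for illustration, and the output-space adversarial alignment shown is just one standard UDA recipe.

```python
# Minimal toy sketch (our assumptions, not the speaker's system) of a modular
# virtual-to-real pipeline: train perception on labeled simulator frames,
# adapt it to unlabeled real frames with adversarial UDA on the segmentation
# outputs, and feed the segmentation map -- a mid-level representation --
# to a separate control policy.
import torch
import torch.nn as nn
import torch.nn.functional as F

NUM_CLASSES = 8  # assumed number of semantic classes

class SegNet(nn.Module):
    """Toy perception module: RGB image -> per-pixel class logits."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(3, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, NUM_CLASSES, 1),
        )

    def forward(self, x):
        return self.net(x)

class Discriminator(nn.Module):
    """Guesses whether a softmax segmentation map came from sim or real."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(NUM_CLASSES, 32, 4, stride=2, padding=1), nn.LeakyReLU(0.2),
            nn.Conv2d(32, 1, 4, stride=2, padding=1),
        )

    def forward(self, p):
        return self.net(p)

class Policy(nn.Module):
    """Toy control module: mid-level representation -> action logits."""
    def __init__(self, num_actions=4):
        super().__init__()
        self.head = nn.Sequential(nn.AdaptiveAvgPool2d(1), nn.Flatten(),
                                  nn.Linear(NUM_CLASSES, num_actions))

    def forward(self, seg_probs):
        return self.head(seg_probs)

seg, disc, policy = SegNet(), Discriminator(), Policy()
opt_seg = torch.optim.Adam(seg.parameters(), lr=1e-4)
opt_disc = torch.optim.Adam(disc.parameters(), lr=1e-4)

# Random tensors stand in for one batch: labeled simulator frames (virtual
# domain) and unlabeled real camera frames (real domain).
sim_img = torch.rand(2, 3, 64, 64)
sim_lbl = torch.randint(0, NUM_CLASSES, (2, 64, 64))
real_img = torch.rand(2, 3, 64, 64)

# (1) Virtual-to-real transfer: supervised training on cheap simulator labels.
loss_task = F.cross_entropy(seg(sim_img), sim_lbl)

# (2) UDA: make real-domain outputs indistinguishable from sim-domain ones,
#     so no labor-intensive real-world pixel annotation is needed.
real_probs = F.softmax(seg(real_img), dim=1)
d_real = disc(real_probs)
loss_adv = F.binary_cross_entropy_with_logits(d_real, torch.ones_like(d_real))
opt_seg.zero_grad()
(loss_task + 0.01 * loss_adv).backward()  # 0.01 is an assumed loss weight
opt_seg.step()

# (3) Train the discriminator to tell the two domains apart.
d_sim = disc(F.softmax(seg(sim_img), dim=1).detach())
d_rel = disc(real_probs.detach())
loss_d = (F.binary_cross_entropy_with_logits(d_sim, torch.ones_like(d_sim)) +
          F.binary_cross_entropy_with_logits(d_rel, torch.zeros_like(d_rel)))
opt_disc.zero_grad()
loss_d.backward()
opt_disc.step()

# (4) Mid-level representation: the control policy consumes the segmentation
#     map rather than raw pixels, keeping perception and control modular.
with torch.no_grad():
    action_logits = policy(F.softmax(seg(real_img), dim=1))
print(action_logits.shape)  # torch.Size([2, 4])
```

The design point to notice is the modularity the abstract describes: the control policy never sees raw pixels, only the segmentation map, so the perception module can be adapted from simulation to the real world without retraining the controller.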