Hank Chi-Hsi Kung
I am a visiting researcher at Indiana University Bloomington, where I work with Prof. David Crandall and Prof. Linda Smith. My current research focuses on 3D representation learning in both human and machine intelligence, aiming to reverse-engineer the process by which children develop 3D representations.
Previously, I was a research assistant at National Chiao Tung University in Taiwan, where I focused on visual compositional representation and self-supervised video representation learning under the guidance of Prof. Yi-Ting Chen and Dr. Yi-Hsuan Tsai.
I was a research intern at the IBM Thomas J. Watson Research Center. I received my M.Sc. from National Tsing-Hua University, where I was supervised by Prof. Che-Rung Lee, and my B.Sc. from National Taipei University.
I am actively seeking a Ph.D. position starting in Fall 2025.
Email / Google Scholar / X / GitHub / CV
Research
My research goal is to reverse-engineer human cognition. Intuitive Physics World Models, a fundamental component that enables humans to imagine physical interactions and causality, can help machines adapt to novel situations. I aim to build Intuitive Physics World Models to facilitate human-like intelligence. Toward this goal, learning compositional and augmentable representations is crucial, because physical interactions and object properties are compositional concepts that can be "reused" to form new patterns and concepts.
Moreover, I am fascinated by how humans rapidly learn novel compositions from minimal experience. This motivates me to study compositional learning through the lens of human learning, integrating insights from child development and cognitive science.
Publications
What Changed and What Could Have Changed? State-Change Counterfactuals for Procedure-Aware Video Representation Learning
Chi-Hsi Kung*, Frangil Ramirez*, Juhyung Ha, Yi-Hsuan Tsai, Yi-Ting Chen, David Crandall
(* Equal Contribution)
Under review
arXiv / Code coming soon
Procedure-aware video representation learning with state changes and their counterfactuals.
ATARS: An Aerial Traffic Atomic Activity Recognition and Temporal Segmentation Dataset
Zihao Chen, Hsuanyu Wu, Chi-Hsi Kung*, Yi-Ting Chen*, Yan-Tsung Peng*
(* Equal Advising)
Under review
arXiv / Code
The first drone-view traffic dataset for compositional action recognition.
Controllable Scenario-based Collision Generation for Safety Assessment
Pin-Lun Chen, Chi-Hsi Kung, Che-Han Chang, Wei-Chen Chiu, Yi-Ting Chen
Under review
Paper coming soon / Code coming soon
Action-slot: Visual Action-centric Representations for Atomic Activity Recognition in Traffic Scenes
Chi-Hsi Kung, Shu-Wei Lu, Yi-Hsuan Tsai, Yi-Ting Chen
CVPR, 2024
project page / paper / arXiv / code / TACO dataset
We use Action-slot to represent atomic activities. The learned attention can discover and localize atomic activities with only weak video labels and without using any perception module (e.g., object detector).
RiskBench: A Scenario-based Benchmark for Risk Identification
Chi-Hsi Kung, Chieh-Chi Yang, Pang-Yuan Pao, Shu-Wei Lu, Pin-Lun Chen, Hsin-Cheng Lu, Yi-Ting Chen
ICRA, 2024
project page / video / paper / code / dataset
The first benchmark that enables evaluation of various types of risk identification algorithms, namely rule-based, trajectory-prediction-based, collision-prediction-based, and behavior-change-based. We also assess the influence of risk identification on the downstream driving task.
ADD: A Fine-grained Dynamic Inference Architecture for Semantic Image Segmentation
Chi-Hsi Kung and Che-Rung Lee
IROS, 2021 & ACML 2021 MRVC workshop
paper / code
We use Neural Architecture Search (NAS) to find an optimal structure for dynamic inference on semantic segmentation.
Conference Reviewer
IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2023-2025
International Conference on Machine Learning (ICML), 2025
International Conference on Computer Vision (ICCV), 2025
Advances in Neural Information Processing Systems (NeurIPS), 2024
IEEE International Conference on Development and Learning (ICDL), 2024
IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 2025