Papers

This page introduces major publications of the Center of Human-centered Interaction for Coexistence (실감교류인체감응솔루션연구단).

Robust Visual Speakingness detection using bi-level HMM

Journal: Pattern Recognition (ISSN 0031-3203)
Indexing: SCI · Published: 2012-02 · Vol. 45, No. 2
Standardized rank (impact index): 88.38 · IF: 2.632 · Citations: -

Authors: P Tiawongsombat, MunHo Jeong, JooSeop Yun, BumJae You, SangRok Oh
Abstract:
Visual voice activity detection (V-VAD) plays an important role in both HCI and HRI, affecting both the conversation strategy and synchronization between humans and robots/computers. The typical speakingness decision in V-VAD consists of post-processing for signal smoothing and classification by thresholding. Several parameters, chosen to ensure a good trade-off between hit rate and false-alarm rate, are usually defined heuristically. This makes V-VAD approaches vulnerable to noisy observations and changing environmental conditions, resulting in poor performance and undesired frequent speaking-state changes. To overcome these difficulties, this paper proposes a new probabilistic approach, named the bi-level HMM, which analyzes lip-activity energy for V-VAD in HRI. The design is based on lip-movement and speaking assumptions, combining two essential procedures into a single model.
A bi-level HMM is an HMM with two state variables at different levels, where state occurrence at the lower level conditionally depends on the state at the upper level. The approach works online with low-resolution images and under various lighting conditions, and has been successfully tested on 21 image sequences (22,927 frames). It achieved a detection probability of over 90%, an improvement of almost 20% over four other V-VAD approaches.
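The two-level structure described in the abstract can be sketched as a forward pass over a joint (upper, lower) state. The sketch below is purely illustrative, with made-up parameters and a binary lip-energy observation; it is not the paper's actual formulation, only a minimal example of a lower-level state whose dynamics are conditioned on an upper-level speaking state:

```python
import numpy as np

# Upper level: 0 = not speaking, 1 = speaking (illustrative numbers)
A_upper = np.array([[0.95, 0.05],
                    [0.05, 0.95]])   # speaking state changes slowly

# Lower level: 0 = lips still, 1 = lips moving; its transition matrix
# is conditioned on the upper-level state (the "bi-level" idea).
A_lower = np.array([
    [[0.9, 0.1],    # upper = not speaking: lips tend to stay still
     [0.7, 0.3]],
    [[0.3, 0.7],    # upper = speaking: lips tend to keep moving
     [0.2, 0.8]],
])

# Emission: P(high lip-energy observation | lower state)
B = np.array([[0.8, 0.2],   # lips still  -> mostly low energy
              [0.2, 0.8]])  # lips moving -> mostly high energy

def speaking_posterior(obs):
    """Normalized forward pass over the joint (upper, lower) state.
    obs: sequence of 0 (low lip energy) / 1 (high lip energy).
    Returns P(speaking | o_1..o_t) for each frame t."""
    alpha = np.full((2, 2), 0.25)            # uniform joint prior
    out = []
    for o in obs:
        new = np.zeros((2, 2))
        for u in range(2):                   # next upper state
            for l in range(2):               # next lower state
                # sum over previous joint states
                s = sum(alpha[pu, pl] * A_upper[pu, u] * A_lower[u][pl, l]
                        for pu in range(2) for pl in range(2))
                new[u, l] = s * B[l, o]
        alpha = new / new.sum()              # normalize to avoid underflow
        out.append(alpha[1].sum())           # marginal P(speaking)
    return out

# Sustained high lip energy pushes the speaking posterior up;
# sustained low energy pushes it down, without per-frame thresholds.
post = speaking_posterior([1, 1, 1, 0, 1, 1])
```

Because the upper-level transitions are sticky, the posterior changes smoothly, which is how a model like this can avoid the frequent speaking-state flips that heuristic smoothing-plus-thresholding pipelines suffer from.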
Keywords: Visual voice activity detection; Mouth image energy; Speakingness detection; Bi-level HMM

Project: Software platform technology for an extended space of immersive interaction
Research institution: Center of Human-centered Interaction for Coexistence · Principal investigator: BumJae You