TY - JOUR
T1 - Data generation and representation method for 3D video conferencing using programming by demonstration
AU - Sung, Yunsick
AU - Cho, Kyungeun
PY - 2013/11
Y1 - 2013/11
N2 - Video conferencing provides an environment for multiple users linked on a network to have meetings. Since a large quantity of audio and video data are transferred to multiple users in real time, research into reducing the quantity of data to be transferred has been drawing attention. Such methods extract and transfer only the features of a user from video data and then reconstruct a video conference using virtual humans. The disadvantage of such an approach is that only the positions and features of hands and heads are extracted and reconstructed, whilst the other virtual body parts do not follow the user. In order to enable a virtual human to accurately mimic the entire body of the user in a 3D virtual conference, we examined what features should be extracted to express a user more clearly and how they can be reproduced by a virtual human. This 3D video conferencing estimates the user's pose by comparing predefined images with a photographed user's image and generates a virtual human that takes the estimated pose. However, this requires predefining a diverse set of images for pose estimation and, moreover, it is difficult to define behaviors that can express poses correctly. This paper proposes a framework to automatically generate the pose-images used to estimate a user's pose and the behaviors required to present a user using a virtual human in a 3D video conference. The method for applying this framework to a 3D video conference on the basis of the automatically generated data is also described. In the experiment, the framework proposed in this paper was implemented in a mobile device. The generation process of poses and behaviors of virtual human was verified. Finally, by applying programming by demonstration, we developed a system that can automatically collect the various data necessary for a video conference directly without any prior knowledge of the video conference system.
AB - Video conferencing provides an environment for multiple users linked on a network to have meetings. Since a large quantity of audio and video data are transferred to multiple users in real time, research into reducing the quantity of data to be transferred has been drawing attention. Such methods extract and transfer only the features of a user from video data and then reconstruct a video conference using virtual humans. The disadvantage of such an approach is that only the positions and features of hands and heads are extracted and reconstructed, whilst the other virtual body parts do not follow the user. In order to enable a virtual human to accurately mimic the entire body of the user in a 3D virtual conference, we examined what features should be extracted to express a user more clearly and how they can be reproduced by a virtual human. This 3D video conferencing estimates the user's pose by comparing predefined images with a photographed user's image and generates a virtual human that takes the estimated pose. However, this requires predefining a diverse set of images for pose estimation and, moreover, it is difficult to define behaviors that can express poses correctly. This paper proposes a framework to automatically generate the pose-images used to estimate a user's pose and the behaviors required to present a user using a virtual human in a 3D video conference. The method for applying this framework to a 3D video conference on the basis of the automatically generated data is also described. In the experiment, the framework proposed in this paper was implemented in a mobile device. The generation process of poses and behaviors of virtual human was verified. Finally, by applying programming by demonstration, we developed a system that can automatically collect the various data necessary for a video conference directly without any prior knowledge of the video conference system.
KW - Behavior generation
KW - Maximin Selection Algorithm
KW - Programming by demonstration
KW - Video conferencing
KW - Virtual human
UR - http://www.scopus.com/inward/record.url?scp=84881122620&partnerID=8YFLogxK
U2 - 10.1007/s11042-011-0941-8
DO - 10.1007/s11042-011-0941-8
M3 - Article
AN - SCOPUS:84881122620
SN - 1380-7501
VL - 67
SP - 71
EP - 95
JO - Multimedia Tools and Applications
JF - Multimedia Tools and Applications
IS - 1
ER -