KR101768980B1

KR101768980B1 - Virtual video call method and terminal

Info

Publication number: KR101768980B1
Application number: KR1020157036602A
Authority: KR
Inventors: 강 리
Original assignee: Baidu Online Network Technology (Beijing) Co., Ltd.
Priority date: 2013-12-20
Filing date: 2014-12-05
Publication date: 2017-08-17
Also published as: KR20160021146A; WO2015090147A1; JP2016537922A; CN103647922A

Abstract

The present invention provides a virtual video call method and a terminal. The method comprises: capturing a video image of a user of a first terminal; performing face recognition on the video image to obtain facial expression information; and sending the facial expression information to a second terminal that has established a call with the first terminal, so that the second terminal synthesizes and displays a video image based on the facial expression information and a face image model preset on the second terminal. The method uses face recognition technology to extract facial expression information at the sending end (for example, the first terminal), and at the receiving end (for example, the second terminal) synthesizes and restores the face image from the received facial expression information and the preset face image model. Because the transmitted facial expression data is very small, the amount of data transmitted during a video call is greatly reduced, which makes the video call smoother and lessens the impact of limited network bandwidth or traffic caps on the call.

Description

Virtual video call method and terminal

This application claims priority to Chinese patent application No. 201310714667.3, entitled "Virtual video call method and terminal", filed by Baidu Online Network Technology (Beijing) Co., Ltd. on December 20, 2013.

The present invention relates to the field of communication technology, and more particularly to a virtual video call method and a terminal.

With the rapid upgrading of network broadband and the development and spread of hardware, the video call market has entered a stage of rapid growth. At present, the main approach to virtual video calls is as follows: the sending end captures an image, determines the face region in the image, extracts facial feature information within that region, and sends the extracted facial feature information to the receiving end, which uses it to reproduce the facial expression of the corresponding user.

The drawback of this approach is that, because facial features differ from person to person, the extracted facial feature information is still very large, and the method must also reconstruct a specific target face model (for example, a face model of the sending-end user) from the facial feature information. As a result, the amount of video data transmitted in the prior art is very large: it consumes a large amount of data traffic, can make the video call stutter, and is unsuitable for bandwidth-limited mobile networks or settings with traffic limits, which seriously hinders the spread and adoption of video calls.

The present invention seeks to solve at least one of the above technical problems.

To this end, a first object of the present invention is to provide a virtual video call method. The method greatly reduces the amount of data transmitted during a video call and saves data traffic, which makes the video call smoother, lessens the impact of limited network bandwidth or traffic caps on the call, and improves the user experience.

A second object of the present invention is to provide another virtual video call method.

A third object of the present invention is to provide a terminal.

A fourth object of the present invention is to provide another terminal.

A fifth object of the present invention is to provide a terminal device.

A sixth object of the present invention is to provide another terminal device.

To achieve the above objects, the virtual video call method of an embodiment of the first aspect of the present invention comprises: capturing a video image of a user of a first terminal; performing face recognition on the video image to obtain facial expression information; and sending the facial expression information to a second terminal that has established a call with the first terminal, the facial expression information being used by the second terminal to synthesize and display a video image based on the facial expression information and a face image model preset on the second terminal.

The virtual video call method of this embodiment uses face recognition technology to extract facial expression information at the sending end (for example, the first terminal), and at the receiving end (for example, the second terminal) performs a simple synthesis and restoration of the face image based on the received facial expression information and the preset face image model. The information transmitted between the sending end and the receiving end is limited to facial expression information, and because this information does not need to be sufficient to synthesize a complete face image, it carries little information; once encoded, it occupies only a few bits or bytes. Compared with the information transmitted in the background art, this greatly reduces the amount of data transmitted during a video call and saves data traffic, which makes the video call smoother, lessens the impact of limited network bandwidth or traffic caps on the call, is particularly suitable for transmission over mobile networks, and improves the user experience. In addition, the second terminal does not need to reconstruct a face image model of the first terminal user; it only needs to display the corresponding facial expression on the preset face image model according to the facial expression information, which makes adjustment on the second terminal easy.

To achieve the above objects, the virtual video call method of an embodiment of the second aspect of the present invention comprises: receiving facial expression information of a video image sent by a first terminal that has established a call with a second terminal; and synthesizing and displaying a video image based on the facial expression information and a face image model preset on the second terminal.

The virtual video call method of this embodiment uses face recognition technology to extract facial expression information at the sending end (for example, the first terminal), and at the receiving end (for example, the second terminal) performs a simple synthesis and restoration of the face image based on the received facial expression information and the preset face image model. The information transmitted between the sending end and the receiving end is limited to facial expression information, and because this information does not need to be sufficient to synthesize a complete face image, it carries little information; once encoded, it occupies only a few bits or bytes. Compared with the information transmitted in the background art, this greatly reduces the amount of data transmitted during a video call and saves data traffic, which makes the video call smoother, lessens the impact of limited network bandwidth or traffic caps on the call, is particularly suitable for transmission over mobile networks, and improves the user experience. In addition, the second terminal does not need to reconstruct a face image model of the first terminal user; it only needs to display the corresponding facial expression on the preset face image model according to the facial expression information, which makes adjustment on the second terminal easy.

To achieve the above objects, the terminal of an embodiment of the third aspect of the present invention comprises: a capture module for capturing a video image of a user; a recognition module for performing face recognition on the video image to obtain facial expression information; and a sending module for sending the facial expression information to a second terminal that has established a call with the terminal, the facial expression information being used by the second terminal to synthesize and display a video image based on the facial expression information and a face image model preset on the second terminal.

The terminal of this embodiment uses face recognition technology to extract facial expression information, so that the second terminal that has established a call with the terminal can perform a simple synthesis and restoration of the face image based on the received facial expression information and the preset face image model. The transmitted information is limited to facial expression information, and because this information does not need to be sufficient to synthesize a complete face image, it carries little information; once encoded, it occupies only a few bits or bytes. Compared with the information transmitted in the background art, this greatly reduces the amount of data transmitted during a video call and saves data traffic, which makes the video call smoother, lessens the impact of limited network bandwidth or traffic caps on the call, is particularly suitable for transmission over mobile networks, and improves the user experience. In addition, the second terminal does not need to reconstruct a face image model of the user; it only needs to display the corresponding facial expression on the preset face image model according to the facial expression information, which makes adjustment on the second terminal easy.

To achieve the above objects, the terminal of an embodiment of the fourth aspect of the present invention comprises: a receiving module for receiving facial expression information of a video image sent by a first terminal that has established a call with the terminal; and a synthesis module for synthesizing and displaying a video image based on the facial expression information and a face image model preset on the terminal.

With the terminal of this embodiment, facial expression information is extracted using face recognition technology, and the terminal performs a simple synthesis and restoration of the face image based on the facial expression information sent by the first terminal that has established a call with it and the preset face image model. The information transmitted between the sending end and the receiving end is limited to facial expression information, and because this information does not need to be sufficient to synthesize a complete face image, it carries little information; once encoded, it occupies only a few bits or bytes. Compared with the information transmitted in the background art, this greatly reduces the amount of data transmitted during a video call and saves data traffic, which makes the video call smoother, lessens the impact of limited network bandwidth or traffic caps on the call, is particularly suitable for transmission over mobile networks, and improves the user experience. In addition, the terminal does not need to reconstruct a face image model; it only needs to display the corresponding facial expression on the preset face image model according to the facial expression information, which makes adjustment on the terminal easy.

To achieve the above objects, the terminal device of an embodiment of the fifth aspect of the present invention comprises: one or more processors; a memory; and one or more programs, the one or more programs being stored in the memory and, when executed by the one or more processors, performing the following operations: capturing a video image of a user of the terminal device; performing face recognition on the video image to obtain facial expression information; and sending the facial expression information to a second terminal that has established a call with the terminal device, the facial expression information being used by the second terminal to synthesize and display a video image based on the facial expression information and a face image model preset on the second terminal.

To achieve the above objects, the terminal device of an embodiment of the sixth aspect of the present invention comprises: one or more processors; a memory; and one or more programs, the one or more programs being stored in the memory and, when executed by the one or more processors, performing the following operations: receiving facial expression information of a video image sent by a first terminal that has established a call with the terminal device; and synthesizing and displaying a video image based on the facial expression information and a face image model preset on the terminal device.

Additional aspects and advantages of the present invention will be set forth in part in the description that follows, and in part will become apparent from that description or be learned through practice of the invention.

The above and/or additional aspects and advantages of the present invention will become apparent and readily understood from the following description of embodiments taken in conjunction with the accompanying drawings, in which:
FIG. 1 is a flowchart of a virtual video call method according to one embodiment of the present invention;
FIG. 2 is a flowchart of a virtual video call method according to another embodiment of the present invention;
FIG. 3 is a flowchart of a virtual video call method according to yet another embodiment of the present invention;
FIG. 4 is a structural diagram of a terminal according to one embodiment of the present invention;
FIG. 5 is a structural diagram of a terminal according to another embodiment of the present invention; and
FIG. 6 is a structural diagram of a terminal according to yet another embodiment of the present invention.

Embodiments of the present invention are described in detail below. Examples of the embodiments are shown in the accompanying drawings, in which the same or similar reference numerals throughout denote the same or similar elements, or elements having the same or similar functions. The embodiments described below with reference to the drawings are illustrative; they serve only to explain the present invention and should not be construed as limiting it. On the contrary, the embodiments of the present invention include all changes, modifications, and equivalents that fall within the spirit and scope of the claims.

It should be noted that, in the description of the present invention, the terms "first" and "second" are used for descriptive purposes only and should not be understood as indicating or implying relative importance. Unless otherwise expressly specified and limited, the terms "connected" and "coupled" should be understood broadly: a connection may be fixed, detachable, or integral; it may be mechanical or electrical; and it may be direct, or indirect through an intermediary. Those of ordinary skill in the art will understand the specific meaning of these terms in the present invention according to the circumstances. In addition, unless otherwise stated, "a plurality" in the description of the present invention means two or more.

Any process or method description in a flowchart, or otherwise described herein, may be understood as representing a module, segment, or portion of code that includes one or more executable instructions for implementing specific logical functions or steps of the process. The scope of the preferred embodiments of the present invention also includes alternative implementations in which functions are performed out of the order shown or discussed, including substantially concurrently or in reverse order depending on the functions involved, as will be understood by those skilled in the art to which the embodiments of the present invention belong.

To solve the problem that the amount of video data transmitted during a video call is too large, the present invention provides a virtual video call method and a terminal. The virtual video call method and terminal according to embodiments of the present invention are described below with reference to the accompanying drawings.

A virtual video call method comprises: capturing a video image of a user of a first terminal; performing face recognition on the video image to obtain facial expression information; and sending the facial expression information to a second terminal that has established a call with the first terminal, the facial expression information being used by the second terminal to synthesize and display a video image based on the facial expression information and a face image model preset on the second terminal.

FIG. 1 is a flowchart of a virtual video call method according to one embodiment of the present invention.

As shown in FIG. 1, the virtual video call method includes the following steps S101 to S103.

S101: Capture a video image of the user of the first terminal.

Specifically, the first terminal may capture the video image of its user by shooting with its built-in camera or an external camera.

S102: Perform face recognition on the video image to obtain facial expression information.

Specifically, the first terminal may perform face recognition on the video image using any of various existing computer image processing techniques, for example face recognition based on genetic algorithms or on neural networks, to obtain the facial expression information. The amount of facial expression data is very small. The process of obtaining the facial expression is described in detail in a later embodiment.

S103: Send the facial expression information to the second terminal that has established a call with the first terminal. The facial expression information is used by the second terminal to synthesize and display a video image based on the facial expression information and a face image model preset on the second terminal.

Here, the first terminal sends a video call request to the second terminal through a server, or the second terminal sends a video call request to the first terminal through the server. If the second terminal accepts the first terminal's video call request, or the first terminal accepts the second terminal's video call request, the server can establish a video call between the first terminal and the second terminal.
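The request-and-accept exchange above can be sketched as a small in-memory relay. This is only an illustration under stated assumptions: the `Server` class, its method names, and the terminal identifiers are hypothetical and do not come from the patent, which does not specify a signalling protocol.

```python
from dataclasses import dataclass, field


@dataclass
class Server:
    """Hypothetical relay that tracks pending requests and established calls."""
    pending: set = field(default_factory=set)  # (caller, callee) pairs awaiting an answer
    calls: set = field(default_factory=set)    # frozenset({a, b}) per established call

    def request_call(self, caller: str, callee: str) -> None:
        # Either terminal may initiate; the server only records the request.
        self.pending.add((caller, callee))

    def answer(self, callee: str, caller: str, accept: bool) -> bool:
        # A call is established only if a matching request exists and is accepted.
        if (caller, callee) not in self.pending:
            return False
        self.pending.discard((caller, callee))
        if accept:
            self.calls.add(frozenset((caller, callee)))
        return accept

    def in_call(self, a: str, b: str) -> bool:
        return frozenset((a, b)) in self.calls


server = Server()
server.request_call("terminal_1", "terminal_2")
established = server.answer("terminal_2", "terminal_1", accept=True)
```

Storing each call as an unordered pair reflects that the call is symmetric once established, regardless of which terminal sent the request.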

Specifically, the first terminal may encode the facial expression information of its user into a digital representation and send it to the second terminal over the video call established through the server.

After the first terminal sends the facial expression information of its user to the second terminal, the second terminal may perform synthesis based on that facial expression information and the preset face image model to reproduce the face image of the first terminal user, and display it in the video call interface of the second terminal. The preset face image model may be one set by the user, or a default set by the server. In addition, the user of the second terminal may apply their own photo, or a photo of the first terminal user, together with the facial expression information to synthesize and reproduce the face image of the first terminal user.
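The receiving side's choice of model and the synthesis step can be sketched as follows. The function names, the model identifiers, and the dictionary representation of an expression are assumptions made for illustration; the patent leaves the actual rendering to existing techniques.

```python
from typing import Optional

SERVER_DEFAULT_MODEL = "server_default_avatar"  # hypothetical identifier


def choose_face_model(user_model: Optional[str]) -> str:
    """Prefer the model the user configured; otherwise use the server default."""
    return user_model if user_model else SERVER_DEFAULT_MODEL


def synthesize_frame(model: str, expression: dict) -> dict:
    """Stand-in for rendering: pair the preset model with the received expression.

    A real terminal would drive an avatar or warp a photo here; the source
    treats that step as existing technology and does not specify it.
    """
    return {"model": model, **expression}


frame = synthesize_frame(choose_face_model(None),
                         {"mouth_open": True, "eyes_open": True})
```

Because the model lives on the second terminal, only the small `expression` value has to cross the network.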

In addition, a video can be regarded as a sequence of frame images: the first terminal obtains the facial expression information of each frame, and the second terminal likewise performs synthesis for each frame from that facial expression information, thereby realizing the virtual video call. The synthesis process itself is existing technology and is not described further here.
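The frame-by-frame flow can be sketched as a loop that extracts one expression payload per frame and synthesizes one displayed image per payload. The `run_call` helper and the toy extract and synthesize functions are hypothetical stand-ins; real extraction would use face recognition and real synthesis would render the preset model.

```python
from typing import Callable, Iterable, Iterator, List


def run_call(frames: Iterable[object],
             extract: Callable[[object], bytes],
             synthesize: Callable[[bytes], str]) -> Iterator[str]:
    """Process a video as a sequence of frames, one expression payload per frame."""
    for frame in frames:
        payload = extract(frame)    # first terminal: frame -> expression info
        yield synthesize(payload)   # second terminal: expression info -> shown image


# Toy stand-ins: each "frame" is already labelled with whether the mouth is open.
def toy_extract(frame: object) -> bytes:
    return b"\x01" if frame == "open" else b"\x00"


def toy_synthesize(payload: bytes) -> str:
    return "avatar:mouth_open" if payload == b"\x01" else "avatar:mouth_closed"


shown: List[str] = list(run_call(["closed", "open", "open"],
                                 toy_extract, toy_synthesize))
```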

The virtual video call method of this embodiment uses face recognition technology to extract facial expression information at the sending end (for example, the first terminal), and at the receiving end (for example, the second terminal) performs a simple synthesis and restoration of the face image based on the received facial expression information and the preset face image model. The information transmitted between the sending end and the receiving end is limited to facial expression information, and because this information does not need to be sufficient to synthesize a complete face image, it carries little information; once encoded, it occupies only a few bits or bytes. Compared with the information transmitted in the background art, this greatly reduces the amount of data transmitted during a video call and saves data traffic, which makes the video call smoother, lessens the impact of limited network bandwidth or traffic caps on the call, is particularly suitable for transmission over mobile networks, and improves the user experience. In addition, the second terminal does not need to reconstruct a face image model of the first terminal user; it only needs to display the corresponding facial expression on the preset face image model according to the facial expression information, which makes adjustment on the second terminal easy.
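A back-of-the-envelope calculation gives a feel for the scale of the saving. All figures below are assumptions chosen for illustration, not values from the patent: a 3-byte expression payload, 25 frames per second, and a 500 kbit/s conventional video stream. Protocol overhead is ignored.

```python
def expression_stream_bps(payload_bytes: int, fps: int) -> int:
    """Bits per second needed to send one expression payload per frame."""
    return payload_bytes * 8 * fps


# Assumed figures, not from the source.
expr_bps = expression_stream_bps(payload_bytes=3, fps=25)  # 3 * 8 * 25 bits/s
video_bps = 500_000                                        # hypothetical video bitrate
ratio = video_bps / expr_bps                               # how many times smaller
```

Under these assumptions the expression stream needs 600 bit/s, several hundred times less than the assumed video stream; in practice packet headers would dominate such a small payload.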

In one embodiment of this aspect, the step of performing face recognition on the video image to obtain facial expression information (that is, S102) comprises: performing face recognition on the video image to obtain facial features, and extracting the facial expression information from the facial features.

Specifically, facial features are first extracted from the video image. The facial features may include, but are not limited to, geometric information about facial parts (such as the eyes, nose, mouth, and ears), for example the position of the eyebrows, the angle of the mouth, and the size of the eyes. It should be understood that facial features may also be obtained by other methods, and that the first terminal of this embodiment may use any face recognition technology developed in the future to perform face recognition on the video image and obtain the facial features. The facial expression information is then extracted from the facial features; that is, the first terminal obtains the facial expression information of the first terminal user by analyzing the facial features.
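As a rough illustration of what such geometric information could look like, the sketch below derives a few measurements from 2D landmark points. The landmark names, the coordinate layout, and the normalization by mouth width are all assumptions made for this example; the patent does not prescribe a landmark format or a detector.

```python
import math

# Hypothetical landmark layout: the names and (x, y) pixel coordinates are
# assumptions for this sketch. The image y axis points downward.
Landmarks = dict[str, tuple[float, float]]


def geometric_features(lm: Landmarks) -> dict[str, float]:
    """Derive a few geometric measurements from 2D facial landmarks."""
    left, right = lm["mouth_left"], lm["mouth_right"]
    top, bottom = lm["mouth_top"], lm["mouth_bottom"]
    mouth_center_y = (top[1] + bottom[1]) / 2
    mouth_width = math.dist(left, right)
    # Positive when the mouth corners sit above the mouth center (smile-like).
    corner_lift = mouth_center_y - (left[1] + right[1]) / 2
    return {
        "mouth_open": math.dist(top, bottom) / mouth_width,
        "mouth_corner_angle": math.atan2(corner_lift, mouth_width / 2),  # radians
        "eye_open": math.dist(lm["eye_top"], lm["eye_bottom"]) / mouth_width,
        "brow_height": (lm["eye_top"][1] - lm["brow"][1]) / mouth_width,
    }


sample = {
    "mouth_left": (40.0, 98.0), "mouth_right": (80.0, 98.0),
    "mouth_top": (60.0, 95.0), "mouth_bottom": (60.0, 115.0),
    "eye_top": (45.0, 50.0), "eye_bottom": (45.0, 58.0),
    "brow": (45.0, 40.0),
}
features = geometric_features(sample)
```

Dividing by the mouth width makes the measurements independent of how large the face appears in the frame, which is one simple design choice among many.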

In one embodiment of this aspect, the facial expression information includes one or more of the following: whether the user is frowning, whether the mouth is open or closed, the curvature (in radians) of the mouth corners, whether the eyes are open or closed, the size of the eyes, and whether there are tears.

In addition, facial expression information mainly reflects a person's emotional state. For example, by analyzing the position of the eyebrows, the angle of the mouth, the size of the eyes, and so on, it can be determined whether the user is smiling, laughing out loud, wailing, gloomy, excited, or angry. Likewise, any of various existing facial expression analysis techniques (for example, machine learning algorithms) may be applied to perform the analysis. Furthermore, the first terminal of this embodiment may use any algorithm with a similar function developed in the future to analyze the facial features and obtain the facial expression information.
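The mapping from expression fields to an emotional label can be sketched with a few threshold rules. This is a deliberately simple stand-in for the machine learning techniques the text mentions; the `ExpressionInfo` record, its value ranges, and every threshold are assumptions introduced for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ExpressionInfo:
    """Hypothetical record of the expression fields listed in the text."""
    frowning: bool
    mouth_open: bool
    mouth_corner_radian: float  # assumed convention: positive = corners curve upward
    eyes_open: bool
    eye_size: float             # assumed scale: 0.0 (closed) to 1.0 (wide open)
    tears: bool


def classify_emotion(info: ExpressionInfo) -> str:
    """Toy threshold rules; the thresholds are illustrative assumptions."""
    if info.tears and info.mouth_open:
        return "wailing"
    if info.mouth_corner_radian > 0.3 and info.mouth_open:
        return "laughing"
    if info.mouth_corner_radian > 0.1:
        return "smiling"
    if info.frowning and info.mouth_corner_radian < -0.1:
        return "angry"
    if info.mouth_corner_radian < -0.1:
        return "gloomy"
    return "neutral"


laughing = ExpressionInfo(False, True, 0.5, True, 0.4, False)
print(classify_emotion(laughing))
```

A trained classifier would replace `classify_emotion` without changing the shape of its input or output.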

The first terminal encodes the facial expression information of the first terminal user into a digital representation and sends it to the second terminal over the video call established through the server. The representation may be, for example, a few simple character codes occupying only a few bits; for instance, "laughing out loud" may be transmitted directly as the character code "D:". The encoding scheme may of course be richer; this example is given only to aid understanding.
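One possible richer encoding is a fixed binary layout. The sketch below packs the expression fields into three bytes with Python's standard `struct` module. The layout (one flag byte, one signed byte for mouth-corner curvature in hundredths of a radian, one byte for eye size) and the field names are assumptions for this example, not the encoding used by the patent.

```python
import struct

# Assumed wire layout (not from the patent): 1 flag byte, 1 signed byte for the
# mouth-corner curvature in hundredths of a radian, 1 byte for eye size (0-255).
_LAYOUT = struct.Struct("!BbB")
_FLAGS = ("frowning", "mouth_open", "eyes_open", "tears")


def encode_expression(info: dict) -> bytes:
    """Pack the expression fields into a 3-byte payload."""
    flags = sum(1 << bit for bit, name in enumerate(_FLAGS) if info[name])
    curve = max(-127, min(127, round(info["mouth_corner_radian"] * 100)))
    eye = max(0, min(255, round(info["eye_size"] * 255)))
    return _LAYOUT.pack(flags, curve, eye)


def decode_expression(payload: bytes) -> dict:
    """Inverse of encode_expression, up to quantization error."""
    flags, curve, eye = _LAYOUT.unpack(payload)
    info = {name: bool((flags >> bit) & 1) for bit, name in enumerate(_FLAGS)}
    info["mouth_corner_radian"] = curve / 100
    info["eye_size"] = eye / 255
    return info


original = {
    "frowning": False, "mouth_open": True, "eyes_open": True, "tears": False,
    "mouth_corner_radian": 0.45, "eye_size": 0.4,
}
payload = encode_expression(original)
restored = decode_expression(payload)
```

Quantizing the two continuous values keeps the payload at a fixed size at the cost of a small rounding error on decode.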

It should be noted that the preset facial image model may take many forms. In one embodiment of the present invention, the facial image models preset on the second terminal include real facial image models and cartoon character facial image models. A photograph stored on the second terminal may also be used.

To make the video call more personalized and enjoyable, the second terminal user can choose a favorite cartoon character facial image model as desired. In one embodiment of the present invention, the virtual video call method further includes: the second terminal providing at least one cartoon character facial image model to the user of the second terminal; and the second terminal receiving the cartoon character facial image model selected by the user of the second terminal, and performing synthesis and display according to the facial expression information and the selected facial image model. Specifically, after the user of the second terminal selects a favorite cartoon character facial image model for the first terminal user, the second terminal receives the selection, synthesizes the first terminal user's facial expression information with the selected cartoon character facial image model to reproduce the first terminal user's facial image, and displays it on the video call interface of the second terminal. For example, suppose the first terminal user's facial expression information indicates an open mouth, strongly curved mouth corners, and slightly narrowed eyes, meaning that the first terminal user is laughing out loud, and the second terminal user has selected a Superman facial image model. The second terminal then combines the first terminal user's facial expression information with the Superman cartoon character image and reproduces the first terminal user's expression as a laughing image.
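The offer, select, and synthesize sequence described above can be sketched as a lookup. In this simplified example, "synthesis" is reduced to choosing a pre-drawn asset for the selected model; a real implementation would render or deform the model. The asset table, the model names, and the file names are hypothetical placeholders.

```python
# Hypothetical asset table; the model names and file names are placeholders.
MODEL_ASSETS = {
    "superman": {
        "laughing": "superman_laughing.png",
        "smiling": "superman_smiling.png",
        "neutral": "superman_neutral.png",
    },
    "robot": {
        "laughing": "robot_laughing.png",
        "neutral": "robot_neutral.png",
    },
}


def offer_models() -> list[str]:
    """Models the second terminal presents to its user."""
    return sorted(MODEL_ASSETS)


def compose_frame(selected_model: str, expression: str) -> str:
    """Pick the asset that shows `expression` on the selected model.

    Falls back to the model's neutral face when no matching asset exists.
    """
    assets = MODEL_ASSETS[selected_model]
    return assets.get(expression, assets["neutral"])


print(offer_models())
print(compose_frame("superman", "laughing"))
```

The fallback to a neutral face is a design choice of this sketch, so that an unknown expression code never leaves the display empty.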

An embodiment of the present invention further provides another virtual video call method.

FIG. 2 is a flowchart of a virtual video call method according to another embodiment of the present invention.

As shown in FIG. 2, the virtual video call method includes the following steps S201 and S202.

S201: Receive the facial expression information of a video image sent by the first terminal, which has established a call with the second terminal.

Specifically, the first terminal first sends a video call request to the second terminal through the server, or the second terminal sends a video call request to the first terminal through the server. If the second terminal accepts the first terminal's request, or the first terminal accepts the second terminal's request, the server can establish the video call between the first terminal and the second terminal.
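The request and accept exchange through the server can be sketched as a small in-memory state machine. The `CallServer` class and its method names are hypothetical and exist only to illustrate the sequence; no real signaling protocol is implied.

```python
class CallServer:
    """Hypothetical in-memory stand-in for the server that sets up calls."""

    def __init__(self) -> None:
        self._pending: dict[str, str] = {}      # callee -> caller
        self._calls: set[frozenset[str]] = set()

    def request_call(self, caller: str, callee: str) -> None:
        """Either terminal may ask the server to call the other."""
        self._pending[callee] = caller

    def answer(self, callee: str, accept: bool) -> bool:
        """The called terminal accepts or rejects; returns True if a call forms."""
        caller = self._pending.pop(callee, None)
        if caller is None or not accept:
            return False
        self._calls.add(frozenset((caller, callee)))
        return True

    def in_call(self, a: str, b: str) -> bool:
        return frozenset((a, b)) in self._calls


server = CallServer()
server.request_call("terminal-1", "terminal-2")
established = server.answer("terminal-2", accept=True)
```

Storing each call as an unordered pair reflects that the call is symmetric regardless of which terminal initiated it.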

Here, the first terminal may capture the video image of the first terminal user with its built-in camera or an external camera, obtain the facial expression information by the method of any of the embodiments above, and send it to the second terminal.

S202: Synthesize and display the video image according to the facial expression information and the facial image model preset on the second terminal.

Specifically, the second terminal may synthesize the first terminal user's facial expression information with the preset facial image model to reproduce the first terminal user's facial image and display it on the video call interface of the second terminal. The preset facial image model may be one set by the user or a default set by the server. In addition, the user of the second terminal may use his or her own photograph, or a photograph of the first terminal user, as the facial image model for reproducing the first terminal user's facial image.

The virtual video call method of this embodiment uses face recognition technology to extract facial expression information at the sending end (for example, the first terminal), and at the receiving end (for example, the second terminal) performs simple synthesis and restoration of the facial image from the transmitted facial expression information and a preset facial image model. The information transmitted between the sending end and the receiving end is limited to facial expression information. Because that information is not required to reconstruct a complete facial image on its own, it carries little information, and after encoding it occupies only a few bits or bytes. Compared with the information transmitted in the background art, this greatly reduces the amount of data transmitted during the video call and saves data traffic, so the video call runs more smoothly and is less affected by limited network bandwidth or traffic caps. The method is therefore particularly suitable for transmission over mobile networks and improves the user experience. In addition, the second terminal does not need to reconstruct a facial image model of the first terminal user; it only needs to display, according to the facial expression information, the corresponding facial expression on the preset facial image model, which makes adjustment at the second terminal easy.

FIG. 3 is a flowchart of a virtual video call method according to yet another embodiment of the present invention.

As shown in FIG. 3, the virtual video call method includes the following steps S301 to S303.

S301: Receive the facial expression information of a video image sent by the first terminal, which has established a call with the second terminal.

S302: Select a real or cartoon character facial image model. The selected model is used, together with the facial expression information, to synthesize and display the video image.

Specifically, to make the video call more personalized and enjoyable, the second terminal may offer the user several real or cartoon character facial image models, for example several cartoon character facial image models, photographs, or real facial image models. The second terminal user can choose a preferred facial image model as desired. For example, suppose the first terminal user's facial expression information indicates an open mouth, strongly curved mouth corners, and slightly narrowed eyes, meaning that the first terminal user is laughing out loud, and the second terminal user has selected a Superman facial image model. The second terminal then combines the first terminal user's facial expression information with the Superman cartoon character image and reproduces the first terminal user's expression as a laughing image.

S303: Synthesize and display the video image according to the selected real or cartoon character facial image model and the facial expression information.

With the virtual video call method of this embodiment, the user of the second terminal can select a real or cartoon character facial image model, and the video image is synthesized and displayed according to the selected model and the facial expression information, which makes the call more enjoyable and improves the user experience.

In an embodiment of the present invention, to make the reproduced facial image more realistic, the second terminal may obtain a real facial image model of the first terminal user and use it to reproduce the facial expression. Specifically, the first terminal may capture a video image with the camera and analyze it to obtain the real facial image model. Alternatively, without capturing anything, the first terminal may analyze a facial image chosen by the user to obtain the real facial image model. The model is then sent to the second terminal and stored there.

In addition, the second terminal may itself obtain a facial image of the first terminal user and analyze it to obtain the real facial image model; that is, the real facial image model may be generated at the second terminal. The second terminal can then synthesize the first terminal user's facial image from the real facial image model and the first terminal user's facial expression information and reproduce it on the video call interface of the second terminal. The reproduced facial image is thus more realistic.

It should be understood that the real facial image model needs to be built only once, sent to the second terminal, and stored there; in subsequent data transmission only the facial expression information needs to be sent. In addition, the second terminal may provide a selection button with which the second terminal user chooses either to display a reproduction of the first terminal user's real facial image or to reproduce the facial image with a cartoon character facial image model. More specifically, the user of the second terminal can choose according to the actual network environment and terminal performance: for example, on a mobile terminal the user may select a cartoon character facial image model, so that the video call is carried by facial expression information alone, while on a personal computer the user may select the real facial image model for greater realism.
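The "send the model once, then send only expressions" behavior, together with a choice that depends on network and device, might be sketched as follows. The class, the message tags, and the bandwidth threshold are assumptions for illustration; in the text the choice is made by the second terminal user, and `choose_model_kind` only shows one heuristic such a choice could follow.

```python
class SenderSession:
    """Hypothetical sender-side session that transmits the real model at most once."""

    def __init__(self, real_model: bytes) -> None:
        self._real_model = real_model
        self._model_sent = False

    def next_messages(self, expression: bytes,
                      peer_wants_real_model: bool) -> list[tuple[str, bytes]]:
        """Messages to send for one frame: the model (first time only), then the expression."""
        messages: list[tuple[str, bytes]] = []
        if peer_wants_real_model and not self._model_sent:
            messages.append(("model", self._real_model))
            self._model_sent = True
        messages.append(("expression", expression))
        return messages


def choose_model_kind(is_mobile: bool, bandwidth_kbps: float,
                      threshold_kbps: float = 500.0) -> str:
    """Assumed heuristic: prefer the cartoon model on mobile or slow links."""
    return "cartoon" if is_mobile or bandwidth_kbps < threshold_kbps else "real"


session = SenderSession(real_model=b"<model bytes>")
first = session.next_messages(b"D:", peer_wants_real_model=True)
second = session.next_messages(b"D:", peer_wants_real_model=True)
```

After the first frame, every later frame carries only the small expression payload, which is the point the paragraph makes.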

The virtual video call method of this embodiment reproduces the first terminal user's facial image from the real facial image model of the first terminal user and the facial expression information, so the reproduced facial image is more realistic. Moreover, once transmitted, the real facial image model can be reused many times, so the receiving end does not need to rebuild it in real time during the call, which simplifies processing at the receiving end and improves the user experience.

To implement the above embodiments, the present invention further provides a terminal.

The terminal includes: a collection module for capturing a video image of a user; a recognition module for performing face recognition on the video image to obtain facial expression information; and a sending module for sending the facial expression information to a second terminal that has established a call with the terminal. The facial expression information enables the second terminal to synthesize and display a video image according to the facial expression information and a facial image model preset on the second terminal.

FIG. 4 is a schematic structural diagram of a terminal according to an embodiment of the present invention.

As shown in FIG. 4, the terminal includes a collection module 110, a recognition module 120, and a sending module 130.

Specifically, the collection module 110 is for capturing a video image of the user. More specifically, the collection module 110 may capture the user's video image with the terminal's built-in camera or an external camera.

The recognition module 120 is for performing face recognition on the video image to obtain facial expression information. More specifically, the recognition module 120 may use any of various existing computer image processing techniques, for example face recognition based on genetic algorithms or on neural networks, to perform face recognition on the video image and obtain the facial expression information. The amount of data in the facial expression information is very small. The process of obtaining the facial expression is described in detail in the embodiments that follow.

The sending module 130 is for sending the facial expression information to a second terminal that has established a call with the terminal. The facial expression information enables the second terminal to synthesize and display a video image according to the facial expression information and a facial image model preset on the second terminal.

Here, the terminal sends a video call request to the second terminal through the server, or the second terminal sends a video call request to the terminal through the server. If the second terminal accepts the terminal's request, or the terminal accepts the second terminal's request, the server can establish the video call between the terminal and the second terminal.

More specifically, the sending module 130 may encode the facial expression information into a digital representation and send it to the second terminal over the video call established through the server.
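The division of the sending terminal into three modules can be sketched as a small pipeline. Each module is passed in as a plain callable so that the camera, the recognizer, and the network layer can be replaced independently; the class name, the callable signatures, and the ASCII encoding are assumptions made for this sketch.

```python
from typing import Callable


class SendingTerminal:
    """Hypothetical wiring of modules 110, 120, and 130 as plain callables."""

    def __init__(self,
                 capture: Callable[[], bytes],
                 recognize: Callable[[bytes], str],
                 send: Callable[[bytes], None]) -> None:
        self._capture = capture      # collection module 110
        self._recognize = recognize  # recognition module 120
        self._send = send            # sending module 130

    def step(self) -> None:
        """Process one frame: capture, recognize, encode, send."""
        frame = self._capture()
        expression = self._recognize(frame)
        self._send(expression.encode("ascii"))


# Stub modules for demonstration; a real terminal would use a camera,
# a face recognizer, and a network connection instead.
sent: list[bytes] = []
terminal = SendingTerminal(
    capture=lambda: b"<raw frame>",
    recognize=lambda frame: "D:",
    send=sent.append,
)
terminal.step()
```

Only the short encoded expression reaches `send`; the raw frame never leaves the terminal.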

After the facial expression information has been sent to the second terminal, the second terminal may synthesize it with the preset facial image model to reproduce the terminal user's facial image and display it on the video call interface of the second terminal. The preset facial image model may be one set by the user or a default set by the server. In addition, the user of the second terminal may use his or her own photograph, or a photograph of the terminal user, together with the facial expression information to perform the synthesis and reproduce the terminal user's facial image.

The terminal of this embodiment uses face recognition technology to extract facial expression information, and enables the second terminal, with which it has established a call, to perform simple synthesis and restoration of the facial image from the transmitted facial expression information and a preset facial image model. The transmitted information is limited to facial expression information. Because that information is not required to reconstruct a complete facial image on its own, it carries little information, and after encoding it occupies only a few bits or bytes. Compared with the information transmitted in the background art, this greatly reduces the amount of data transmitted during the video call and saves data traffic, so the video call runs more smoothly and is less affected by limited network bandwidth or traffic caps. The terminal is therefore particularly suitable for transmission over mobile networks and improves the user experience. In addition, the second terminal does not need to reconstruct a facial image model of the user; it only needs to display, according to the facial expression information, the corresponding facial expression on the preset facial image model, which makes adjustment at the second terminal easy.

In one embodiment of the present invention, the recognition module 120 is further for performing face recognition on the video image to obtain facial features, and extracting the facial expression information from the facial features.

Specifically, the recognition module 120 first extracts facial features from the video image. The facial features may include, but are not limited to, geometric information about facial parts (such as the eyes, nose, mouth, and ears), for example the position of the eyebrows, the angle of the mouth, and the size of the eyes. It should be understood that facial feature information may also be obtained by other methods, and that any face recognition technology developed in the future may be used to perform face recognition on the video image and obtain the facial feature information. The recognition module 120 then extracts the facial expression information from the facial features; that is, it analyzes the facial feature information to obtain the user's facial expression information.

In one embodiment of this aspect, the facial expression information includes one or more of the following: whether the user is frowning, whether the mouth is open or closed, the curvature (in radians) of the mouth corners, whether the eyes are open or closed, the size of the eyes, and whether there are tears.

In addition, facial expression information mainly reflects a person's emotional state. For example, by analyzing the position of the eyebrows, the angle of the mouth, the size of the eyes, and so on, it can be determined whether the user is smiling, laughing out loud, wailing, gloomy, excited, or angry. Likewise, any of various existing facial expression analysis techniques (for example, machine learning algorithms) may be applied to perform the analysis. Furthermore, any algorithm with a similar function developed in the future may be used to analyze the facial feature information and obtain the facial expression information.

In addition, the sending module 130 may encode the facial expression information into a digital representation and send it to the second terminal over the video call established through the server. The representation may be, for example, a few simple character codes occupying only a few bits; for instance, "laughing out loud" may be transmitted directly as the character code "D:". The encoding scheme may of course be richer; this example is given only to aid understanding.

To implement the above embodiments, the present invention further provides another terminal.

FIG. 5 is a schematic structural diagram of a terminal according to another embodiment of the present invention.

As shown in FIG. 5, the terminal includes a receiving module 210 and a synthesis module 220.

Specifically, the receiving module 210 is for receiving the facial expression information of a video image sent by a first terminal that has established a call with the terminal. The synthesis module 220 is for synthesizing and displaying the video image according to the facial expression information and a facial image model preset on the terminal.

More specifically, the synthesis module 220 may synthesize the first terminal user's facial expression information with the preset facial image model to reproduce the first terminal user's facial image and display it on the terminal's video call interface. The preset facial image model may be one set by the user or a default. In addition, the user of the terminal may use his or her own photograph, or a photograph of the first terminal user, as the facial image model for reproducing the first terminal user's facial image.

With the terminal of this embodiment, facial expression information is extracted using face recognition technology, and the terminal performs simple synthesis and restoration of the facial image from the facial expression information sent by the first terminal, with which it has established a call, and a preset facial image model. The information transmitted between the sending end and the receiving end is limited to facial expression information. Because that information is not required to reconstruct a complete facial image on its own, it carries little information, and after encoding it occupies only a few bits or bytes. Compared with the information transmitted in the background art, this greatly reduces the amount of data transmitted during the video call and saves data traffic, so the video call runs more smoothly and is less affected by limited network bandwidth or traffic caps. The terminal is therefore particularly suitable for transmission over mobile networks and improves the user experience. In addition, the terminal does not need to reconstruct a facial image model; it only needs to display, according to the facial expression information, the corresponding facial expression on the preset facial image model, which makes adjustment at the terminal easy.

FIG. 6 is a schematic structural diagram of a terminal according to another embodiment of the present invention.

As shown in FIG. 6, the terminal further includes a selection module 230 in addition to the structure shown in FIG. 5.

Specifically, the selection module 230 is configured to select a real or cartoon-character facial image model after the receiving module 210 receives the facial expression information of the video image sent by the first terminal that forms a call with the second terminal. The selected real or cartoon-character facial image model is combined with the facial expression information to synthesize and display the video image.

More specifically, to make the video call more personal and more fun, the terminal may offer the user several real or cartoon-character facial image models, for example several cartoon-character facial image models, or photographs, real facial image models, and the like, and the user may select a preferred facial image model as desired. For example, if the facial expression information of the first terminal user indicates a broad laugh and the terminal user has selected a Superman facial image model, the terminal combines the first terminal user's facial expression information with the Superman cartoon-character image and reproduces the other party's expression as an image laughing broadly.

Accordingly, the user can select a real or cartoon-character facial image model, and the video image is synthesized and displayed based on the selected model and the facial expression information, which makes the call more fun and improves the user experience.
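The selection step described above can be sketched as a small registry of facial image models from which the user picks one before synthesis. This is a minimal sketch under stated assumptions: the class and function names are hypothetical, and `synthesize` only returns a text description in place of actual image rendering.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, Optional


@dataclass(frozen=True)
class FaceModel:
    """Hypothetical facial image model offered to the user."""
    name: str
    kind: str  # "real" or "cartoon"


class SelectionModule:
    """Keeps the available models and remembers the user's choice."""

    def __init__(self, models: Iterable[FaceModel]) -> None:
        self._models: Dict[str, FaceModel] = {m.name: m for m in models}
        self.selected: Optional[FaceModel] = None

    def select(self, name: str) -> FaceModel:
        if name not in self._models:
            raise KeyError(f"unknown face model: {name}")
        self.selected = self._models[name]
        return self.selected


def synthesize(model: FaceModel, expression: str) -> str:
    """Stand-in for rendering: describe the frame that would be drawn."""
    return f"{model.name} ({model.kind}) showing expression '{expression}'"
```

With a Superman model registered, `synthesize(module.select("Superman"), "laughing broadly")` describes the frame from the example in the text.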

To achieve the above object, the present invention further provides terminal equipment.

The terminal equipment of an embodiment of the present invention includes one or more processors, a memory, and one or more programs. The one or more programs are stored in the memory and, when executed by the one or more processors, perform the following operations: collecting a video image of the user of the terminal equipment; performing face recognition on the video image to obtain facial expression information; and sending the facial expression information to a second terminal that forms a call with the terminal equipment. The facial expression information is for causing the second terminal to synthesize and display a video image based on the facial expression information and a facial image model preset on the second terminal.

To achieve the above object, the present invention further provides another terminal equipment.

The terminal equipment of an embodiment of the present invention includes one or more processors, a memory, and one or more programs. The one or more programs are stored in the memory and, when executed by the one or more processors, perform the following operations: receiving facial expression information of a video image sent by a first terminal that forms a call with the terminal equipment; and synthesizing and displaying a video image based on the facial expression information and a facial image model preset on the terminal equipment.
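The operations of the two kinds of terminal equipment can be sketched end to end as follows. Everything here is a simplification made for illustration: a frame is modeled as a dictionary of hypothetical facial measurements, face recognition is reduced to two assumed thresholds, the call connection is an in-memory queue, and "display" returns a text description instead of drawing an image.

```python
from collections import deque
from typing import Deque, Dict

Frame = Dict[str, float]        # hypothetical measurements taken from one frame
Expression = Dict[str, bool]    # facial expression information sent over the call


def recognize_expression(frame: Frame) -> Expression:
    """Toy stand-in for face recognition; the thresholds are assumptions."""
    return {
        "mouth_open": frame["mouth_gap"] > 0.3,
        "eyes_open": frame["eye_gap"] > 0.2,
    }


class SenderTerminal:
    """Collects a frame, extracts the expression, and sends only that."""

    def __init__(self, channel: Deque[Expression]) -> None:
        self.channel = channel

    def process(self, frame: Frame) -> None:
        self.channel.append(recognize_expression(frame))


class ReceiverTerminal:
    """Receives expressions and applies them to a preset face model."""

    def __init__(self, channel: Deque[Expression], face_model: str) -> None:
        self.channel = channel
        self.face_model = face_model

    def render_next(self) -> str:
        expression = self.channel.popleft()
        active = sorted(name for name, on in expression.items() if on)
        return f"{self.face_model}: {', '.join(active) or 'neutral'}"
```

Note that the frame itself never enters the channel; only the small expression dictionary does, which mirrors the division of work between the sending and receiving terminal equipment described above.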

It should be understood that each part of the present invention may be implemented in hardware, software, firmware, or a combination thereof. In the above embodiments, multiple steps or methods may be implemented by software or firmware that is stored in a memory and executed by a suitable instruction execution system. For example, if implemented in hardware, as in another embodiment, any one or a combination of the following techniques known in the art may be used: a discrete logic circuit having logic gate circuits for implementing logic functions on data signals, an application-specific integrated circuit having suitable combinational logic gate circuits, a programmable gate array (PGA), a field programmable gate array (FPGA), and so on.

In the description of this specification, reference terms such as "one embodiment", "some embodiments", "an example", "a specific example", or "some examples" mean that a specific feature, structure, material, or characteristic described in connection with that embodiment or example is included in at least one embodiment or example of the present invention. In this specification, such expressions do not necessarily refer to the same embodiment or example. Moreover, the specific features, structures, materials, or characteristics described may be combined in any suitable manner in any one or more embodiments or examples.

Although embodiments of the present invention have been shown and described above, those of ordinary skill in the art will understand that changes, modifications, substitutions, and variations may be made to these embodiments within the scope of the present invention, and that the scope of the present invention is defined by the claims and their equivalents.

Claims

1. A virtual video call method, comprising:
collecting a video image of a first terminal user;
performing face recognition on the video image to obtain facial expression information;
sending only the facial expression information to a second terminal that forms a call with the first terminal; and
acquiring a facial image of the user, analyzing the facial image to obtain a real facial image model, and sending the real facial image model to the second terminal that forms a call with the first terminal,
wherein the facial expression information is for causing the second terminal to synthesize and display a video image based on the facial expression information and the real facial image model.

2. The method according to claim 1,
wherein performing face recognition on the video image to obtain facial expression information comprises:
performing face recognition on the video image to obtain facial features, and extracting the facial expression information from the facial features.

3. The method according to claim 1 or 2,
wherein the facial expression information includes one or more of the following: frowning, the mouth being open or closed, the curvature of the mouth, the eyes being open or closed, the size of the eyes.

4. A virtual video call method, comprising:
receiving only facial expression information of a video image sent by a first terminal that forms a call with a second terminal;
receiving a real facial image model of the user sent by the first terminal that forms a call with the second terminal; and
synthesizing and displaying a video image based on the facial expression information and the real facial image model.

5. A terminal, comprising:
a collection module for collecting a video image of a user and acquiring a facial image of the user;
a recognition module for performing face recognition on the video image to obtain facial expression information, and for analyzing the facial image to obtain a real facial image model; and
a sending module for sending the facial expression information and the real facial image model to a second terminal that forms a call with the terminal,
wherein the facial expression information is for causing the second terminal to synthesize and display a video image based on the facial expression information and the real facial image model.

6. The terminal according to claim 5,
wherein the recognition module further comprises a face recognition unit for performing face recognition on the video image to obtain facial features and extracting the facial expression information from the facial features.

7. The terminal according to claim 5 or 6,
wherein the facial expression information includes one or more of the following: frowning, the mouth being open or closed, the curvature of the mouth, the eyes being open or closed, the size of the eyes.

8. A terminal, comprising:
a receiving module for receiving facial expression information of a video image sent by a first terminal that forms a call with the terminal, and a real facial image model of the user; and
a synthesizing module for synthesizing and displaying a video image based on the facial expression information and the real facial image model of the user.

9. Terminal equipment, comprising:
one or more processors;
a memory; and
one or more programs,
wherein the one or more programs are stored in the memory and, when executed by the one or more processors, perform:
an operation of collecting a video image of the user of the terminal equipment;
an operation of performing face recognition on the video image to obtain facial expression information;
an operation of sending only the facial expression information to a second terminal that forms a call with the terminal equipment; and
an operation of acquiring a facial image of the user, analyzing the facial image to obtain a real facial image model, and sending the real facial image model to the second terminal that forms a call with the terminal equipment,
wherein the facial expression information is for causing the second terminal to synthesize and display a video image based on the facial expression information and the real facial image model.

10. Terminal equipment, comprising:
one or more processors;
a memory; and
one or more programs,
wherein the one or more programs are stored in the memory and, when executed by the one or more processors, perform:
an operation of receiving facial expression information of a video image sent by a first terminal that forms a call with the terminal equipment, and a real facial image model of the user; and
an operation of synthesizing and displaying a video image based on the facial expression information and the real facial image model of the user.

delete