WO2018103220A1 - Method and apparatus for image processing - Google Patents

Method and apparatus for image processing

Info

Publication number
WO2018103220A1
WO2018103220A1 (PCT/CN2017/075742)
Authority
WO
WIPO (PCT)
Prior art keywords
user
facial expression
dimensional model
eye
anthropomorphic
Prior art date
Application number
PCT/CN2017/075742
Other languages
English (en)
French (fr)
Inventor
张威 (Zhang Wei)
Original Assignee
武汉斗鱼网络科技有限公司 (Wuhan Douyu Network Technology Co., Ltd.)
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 武汉斗鱼网络科技有限公司 (Wuhan Douyu Network Technology Co., Ltd.)
Publication of WO2018103220A1 publication Critical patent/WO2018103220A1/zh

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16Human faces, e.g. facial parts, sketches or expressions
    • G06V40/174Facial expression recognition
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T13/00Animation
    • G06T13/203D [Three Dimensional] animation
    • G06T13/403D [Three Dimensional] animation of characters, e.g. humans, animals or virtual beings
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T15/003D [Three Dimensional] image rendering
    • G06T15/04Texture mapping
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T17/00Three dimensional [3D] modelling, e.g. data description of 3D objects
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/70Determining position or orientation of objects or cameras
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16Human faces, e.g. facial parts, sketches or expressions
    • G06V40/161Detection; Localisation; Normalisation
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/43Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
    • H04N21/431Generation of visual interfaces for content selection or interaction; Content or additional data rendering
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/43Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
    • H04N21/431Generation of visual interfaces for content selection or interaction; Content or additional data rendering
    • H04N21/4312Generation of visual interfaces for content selection or interaction; Content or additional data rendering involving specific graphical features, e.g. screen layout, special fonts or colors, blinking icons, highlights or animations
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/43Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
    • H04N21/431Generation of visual interfaces for content selection or interaction; Content or additional data rendering
    • H04N21/4312Generation of visual interfaces for content selection or interaction; Content or additional data rendering involving specific graphical features, e.g. screen layout, special fonts or colors, blinking icons, highlights or animations
    • H04N21/4314Generation of visual interfaces for content selection or interaction; Content or additional data rendering involving specific graphical features, e.g. screen layout, special fonts or colors, blinking icons, highlights or animations for fitting data in a restricted space on the screen, e.g. EPG data in a rectangular grid
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/30Subject of image; Context of image processing
    • G06T2207/30196Human being; Person
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/30Subject of image; Context of image processing
    • G06T2207/30196Human being; Person
    • G06T2207/30201Face

Definitions

  • the present invention relates to the field of image processing technologies, and in particular, to a method and an apparatus for image processing.
  • Face recognition is a biometric identification technology that identifies a person based on facial feature information. A camera captures images or video streams containing faces, automatically detects and tracks the faces in the images, and then applies a series of face-related techniques to the detected faces; this is also commonly called portrait recognition or facial recognition.
  • the embodiments of the invention provide a method and a device for image processing, which use a face recognition algorithm to make the facial expression of an anthropomorphic three-dimensional model follow changes in the user's facial expression, enhancing the entertainment value of the display effect during live video/video recording and improving the user experience.
  • the application provides a method for image processing, the method comprising:
  • a facial recognition algorithm is used to obtain facial expression data of the user
  • the step of acquiring the facial expression data of the user by using the face recognition algorithm comprises:
  • the positions of specific key points on the user's face are marked;
  • the user's facial expression data includes the states of the specific key point positions within a preset time, the orientation information of the user's face in three-dimensional space, and the gaze direction of the user's eyes.
  • the specific key points include eye key points, eyebrow key points, and mouth key points;
  • the step of detecting the state of the specific key point position at a preset time according to the specific key point location includes:
  • the opening size of the user's mouth is calculated according to the mouth key points.
  • the step of adjusting the facial expression of the anthropomorphic three-dimensional model according to the user's facial expression data, so that the facial expression of the anthropomorphic three-dimensional model changes following the user's facial expression, specifically includes:
  • small-amplitude motions and subtle expressions are generated by randomly applying preset pre-made skeletal animations, and are applied to the face of the anthropomorphic three-dimensional model.
  • the present application provides an apparatus for image processing, the apparatus comprising:
  • a user expression obtaining module configured to acquire facial expression data of the user by using a face recognition algorithm in a live video or video recording scene
  • a model expression obtaining module configured to acquire a facial expression of a preset anthropomorphic three-dimensional model in the live video scene
  • an adjustment module configured to adjust a facial expression of the personification three-dimensional model according to the user facial expression data, so that the facial expression of the anthropomorphic three-dimensional model changes according to the facial expression of the user.
  • the user expression obtaining module specifically includes:
  • a marking unit configured to mark the positions of specific key points on the user's face after recognizing the user's face by using a face recognition algorithm;
  • a detecting unit configured to detect a state of the specific key point position at a preset time according to the specific key point position
  • an acquiring unit configured to acquire, by using a face recognition algorithm, the orientation information of the user's face in three-dimensional space and the gaze direction of the user's eyes;
  • the user's facial expression data includes the states of the specific key point positions within a preset time, the orientation information of the user's face in three-dimensional space, and the gaze direction of the user's eyes.
  • the specific key points include eye key points, eyebrow key points, and mouth key points;
  • the detecting unit is specifically configured to:
  • the opening size of the user's mouth is calculated according to the mouth key points.
  • the adjustment module is specifically configured to:
  • the adjustment module is further used to:
  • small-amplitude motions and subtle expressions are generated by randomly applying preset pre-made skeletal animations, and are applied to the face of the anthropomorphic three-dimensional model.
  • in a live video or video recording scene, a face recognition algorithm is used to acquire the user's facial expression data; the facial expression of the anthropomorphic three-dimensional model preset in the live video scene is acquired; and the facial expression of the anthropomorphic three-dimensional model is adjusted according to the user's facial expression data,
  • so that the facial expression of the anthropomorphic three-dimensional model changes following the user's facial expression.
  • the face recognition algorithm thus makes the facial expression of the anthropomorphic three-dimensional model follow changes in the user's facial expression, which makes the display effect during live video/video recording more entertaining and improves the user experience.
  • FIG. 1 is a schematic diagram of an embodiment of a method for image processing in an embodiment of the present invention
  • FIG. 2 is a schematic diagram of an embodiment of step S101 in the embodiment shown in FIG. 1;
  • FIG. 3 is a schematic diagram of 68 face key points marked by the OpenFace face recognition algorithm
  • FIG. 4 is a schematic diagram of an embodiment of a virtual three-dimensional square constructed according to orientation information of a face in a three-dimensional space according to an embodiment of the present invention
  • FIG. 5 is a schematic diagram of an embodiment of recognizing a gaze direction of a user's eyes according to a face recognition algorithm according to an embodiment of the present invention
  • FIG. 6 is a schematic diagram of an embodiment of step S1022 in the embodiment shown in FIG. 2;
  • FIG. 7 is a schematic diagram of an embodiment of step S103 in the embodiment shown in Figure 1;
  • FIG. 8 is a schematic diagram of one embodiment of processing the eye texture and mouth texture of the anthropomorphic three-dimensional model;
  • FIG. 9 is a schematic diagram of an embodiment of an apparatus for image processing according to an embodiment of the present invention.
  • FIG. 10 is a schematic diagram of another embodiment of an apparatus for image processing in an embodiment of the present invention.
  • the method of image processing in the embodiments of the present invention is applied to an image processing apparatus.
  • the apparatus may be located in a fixed terminal, such as a desktop computer or a server, or in a mobile terminal, such as a mobile phone or a tablet computer.
  • an embodiment of a method for image processing in an embodiment of the present invention includes:
  • the face recognition algorithm may be an OpenFace face recognition algorithm.
  • the OpenFace face recognition algorithm is an open-source face recognition and facial key point tracking algorithm. It is mainly used to detect the face region and then mark the positions of facial feature key points; OpenFace marks 68 facial key points and can track eyeball orientation and face orientation.
  • the anthropomorphic three-dimensional model is not limited to a virtual animal or a virtual pet; it may also be a natural object, such as an anthropomorphic cabbage or an anthropomorphic table, or a virtual three-dimensional character or virtual three-dimensional animal from animation, which is not limited herein.
  • obtaining the facial expression of the anthropomorphic three-dimensional model preset in the live video scene may be done by directly acquiring an image frame of the current facial expression of the anthropomorphic three-dimensional model, where the image frame includes the facial expression of the model.
  • the user's expression data and the expression data of the anthropomorphic three-dimensional model may be acquired frame by frame, and the subsequent adjustment may likewise be made frame by frame.
  • in a live video or video recording scene, a face recognition algorithm is used to acquire the user's facial expression data; the facial expression of the anthropomorphic three-dimensional model preset in the live video scene is acquired; and the facial expression of the anthropomorphic three-dimensional model is adjusted according to the user's facial expression data,
  • so that the facial expression of the anthropomorphic three-dimensional model changes following the user's facial expression.
  • the face recognition algorithm thus makes the facial expression of the anthropomorphic three-dimensional model follow changes in the user's facial expression, which makes the display effect during live video/video recording more entertaining and improves the user experience.
  • the step S101 may specifically include:
  • S1021 After identifying a user's face by using a face recognition algorithm, marking a location of a specific key point of the user's face;
  • the OpenFace face recognition algorithm is taken as an example. After a face is detected using OpenFace face recognition, the positions of the facial key points are marked and tracked. The feature points to be used are recorded from these points; the three facial features of the eyes, eyebrows, and mouth are used as examples. FIG. 3 shows the 68 facial key points marked by OpenFace.
  • in FIG. 3, 68 facial feature points are marked and numbered 1 to 68. Taking the three facial features of the eyes, eyebrows, and mouth as examples, the key point numbers to be used are as follows:
  • Eye (left): 37, 38, 39, 40, 41, 42; Eye (right): 43, 44, 45, 46, 47, 48
  • Eyebrow (left): 18, 19, 20, 21, 22; Eyebrow (right): 23, 24, 25, 26, 27
  • Mouth: 49, 55, 61, 62, 63, 64, 65, 66, 67, 68
  • the OpenFace face recognition algorithm returns the pixel coordinates of these 68 facial key points.
  • S1022 Detect, according to the specific key point positions, the states of the specific key point positions at a preset time;
  • based on the specific key point positions, the states of the specific key point positions at a preset time can be calculated, such as the eye open/closed state, eye size, eyebrow raising amplitude, and mouth opening size.
  • the user's facial expression data includes the states of the specific key point positions within a preset time, the orientation information of the user's face in three-dimensional space, and the gaze direction of the user's eyes.
  • the orientation information of the user's face in three-dimensional space is obtained by using the OpenFace face recognition algorithm. The orientation information includes three steering angles: yaw, pitch, and roll. A virtual three-dimensional box is constructed from the three steering angles to indicate the orientation, specifically the rectangular box shown in FIG. 4.
  • the gaze direction of the user's eyes can be recognized directly by the OpenFace face recognition algorithm; the white lines over the eyes in FIG. 5 indicate the recognized gaze direction.
  • the specific key points include eye key points, eyebrow key points, and mouth key points, each of which includes one or more key points.
  • the step S1022 may specifically include:
  • S10221 Calculate a user's eye opening/closing state and an eye size according to the eye key point;
  • the distance formula used in this calculation is the Euclidean pixel distance d = √((x2 − x1)² + (y2 − y1)²), where:
  • a: key point a, with pixel coordinates (x1, y1);
  • b: key point b, with pixel coordinates (x2, y2);
  • d: the length of the distance from key point a to key point b.
  • taking the left eye as an example, the pixel distance a between key points 38 and 42 and the pixel distance b between key points 39 and 41 are calculated; their average c = (a + b)/2 is the eye height, and the pixel distance d between key points 37 and 40 is the eye width. When a/d < 0.15 (an empirical value), the eye is judged to be closed; the right eye is handled in the same way. From c (the eye height) and d (the eye width), the height and width of the eye's rectangular region are obtained.
  • the rectangular region of the eye is used to represent the eye size.
  • the pixel distance e between key point 20, at the highest point of the eyebrow arch, and eye key point 38 is calculated. Since raising or lowering the head and swinging it left and right affect this value, the calculation is normalized by the face width: the face width f is the distance between key point 3 and key point 15, and the eyebrow raising amplitude is e/f. When the eyebrows are raised, e/f changes accordingly, so the eyebrow raising amplitude is measured relative to the minimum value of e/f; using this minimum as the baseline allows eyebrow raises to be detected quickly and reliably.
  • S10223 Calculate the opening size of the user's mouth according to the mouth key points.
  • the pixel distance g between key point 63 and key point 67 and the pixel distance h between key point 61 and key point 65 are calculated.
  • the opening size of the user's mouth is then g/h.
  • the step S103 may specifically include:
  • S1031 Render the eye parts of the anthropomorphic three-dimensional model transparent; render a transparent gap between the upper and lower lips of the anthropomorphic three-dimensional model so that teeth can be drawn;
  • the previously obtained orientation information of the user's face in three-dimensional space is denoted by the yaw, pitch, and roll angles θ, φ, and ψ, respectively.
  • the rotation transformation matrix M corresponding to rotation by these Euler angles is then constructed from the three angles.
  • applying this rotation transformation matrix to a three-dimensional object changes the orientation of the three-dimensional object.
  • S1033 Obtain a pre-made eye texture and a mouth texture, and attach the eye texture and the mouth texture to the anthropomorphic three-dimensional model face;
  • the preset eye texture and the mouth texture may be preset reference eye textures and reference mouth textures of the anthropomorphic three-dimensional model.
  • attaching the eye texture and the mouth texture to the face of the anthropomorphic three-dimensional model may be done by aligning the facial key points recognized by the OpenFace face recognition algorithm with the eye openings and the mouth opening of the anthropomorphic three-dimensional model for texture mapping;
  • the texture around the eye openings and the mouth opening is stretched according to the user's eye open/closed state and eye size, and the aspect ratios of the eye-opening and mouth-opening rectangles are then constrained according to the eye size and the mouth opening size, respectively.
  • as shown in FIG. 8, the eye texture position is calculated according to the user's gaze direction to handle the rotation and orientation of the model's eyeballs; the eyeball orientation changes only the position of the eye texture and does not affect its size.
  • posion denotes the coordinates of the vertices of the 3D model created with the 3DS MAX 3D modeling software;
  • inputTextureCoordinate is the texture coordinate corresponding to the vertex coordinates of the 3D model created with 3DS MAX; textureCoordinate is the coordinate to be passed to the fragment shader; matrixM is the transformation matrix M, used to handle the rotation of the model; and gl_Position is the vertex coordinate output to OpenGL for processing. matrixM*posion applies a rotation transformation to the vertex coordinates; assigning matrixM*posion to gl_Position yields the coordinates after the model's rotation, and gl_Position is finally handed to OpenGL's internal automatic processing to produce the picture of the model's head rotating.
  • in order to make the motion of the simulated three-dimensional animal natural, small-amplitude motions and subtle expressions need to be generated randomly. The motions here use several sets of skeletal animations pre-made in 3D modeling software such as 3DS MAX, and these sets of animations are randomly applied, for example a natural ear swing or a slight natural shake of the head.
  • the step of adjusting the facial expression of the anthropomorphic three-dimensional model according to the facial expression data of the user, so that the facial expression of the anthropomorphic three-dimensional model changes according to the facial expression of the user may further include:
  • in 3D modeling software (for example, 3DS MAX), small-amplitude motions and subtle expressions are generated by randomly applying preset pre-made skeletal animations, and are applied to the face of the anthropomorphic three-dimensional model.
  • when the method of the present invention is applied in a live video scene, while the anchor or video recorder is showing his or her face, a small window is opened in a corner of the live or recorded picture to display the virtual anthropomorphic three-dimensional model;
  • when the anchor or video recorder is unwilling to show his or her face, only the anthropomorphic three-dimensional model is shown in the small window, imitating the anchor's or recorder's expressions and movements so that sound and picture stay synchronized.
  • FIG. 9 is a schematic diagram of an embodiment of an apparatus for image processing according to an embodiment of the present invention, where the apparatus includes:
  • the user expression acquisition module 901 is configured to acquire facial expression data of the user in a live video or video recording scene by using a face recognition algorithm;
  • the model expression acquisition module 902 is configured to acquire the facial expression of the anthropomorphic three-dimensional model preset in the live video scene;
  • the adjusting module 903 is configured to adjust the facial expression of the personification three-dimensional model according to the facial expression data of the user, so that the facial expression of the anthropomorphic three-dimensional model changes according to the facial expression of the user.
  • the user expression obtaining module 901 may specifically include:
  • a marking unit 9011, configured to mark the positions of specific key points on the user's face after recognizing the user's face by using a face recognition algorithm;
  • the detecting unit 9012 is configured to detect a state of the specific key point position at a preset time according to the specific key point position;
  • the obtaining unit 9013 is configured to acquire orientation information of the user's face in the three-dimensional space and a gaze direction of the user's eyes by using a face recognition algorithm;
  • the user's facial expression data includes the states of the specific key point positions within a preset time, the orientation information of the user's face in three-dimensional space, and the gaze direction of the user's eyes.
  • the specific key points include eye key points, eyebrow key points, and mouth key points;
  • the detecting unit 9012 is specifically configured to:
  • the opening size of the user's mouth is calculated according to the mouth key points.
  • the adjustment module 903 is specifically configured to:
  • the adjustment module 903 is further configured to:
  • small-amplitude motions and subtle expressions are generated by randomly applying preset pre-made skeletal animations, and are applied to the face of the anthropomorphic three-dimensional model.
  • the disclosed system, apparatus, and method may be implemented in other manners.
  • the device embodiments described above are merely illustrative.
  • the division of units is merely a logical functional division; in actual implementation there may be other division manners.
  • for example, multiple units or components may be combined or integrated into another system, or some features may be ignored or not executed.
  • the mutual coupling or direct coupling or communication connection shown or discussed may be an indirect coupling or communication connection through some interface, device or unit, and may be in an electrical, mechanical or other form.
  • the units described as separate components may or may not be physically separated, and the components displayed as units may or may not be physical units, that is, may be located in one place, or may be distributed to multiple network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of the embodiment.
  • each functional unit in each embodiment of the present invention may be integrated into one processing unit, or each unit may exist physically separately, or two or more units may be integrated into one unit.
  • the above integrated unit can be implemented in the form of hardware or in the form of a software functional unit.
  • the integrated unit if implemented in the form of a software functional unit and sold or used as a standalone product, may be stored in a computer readable storage medium.
  • the part of the technical solution of the present invention that is essential or that contributes to the prior art, or all or part of the technical solution, may be embodied in the form of a software product stored in a storage medium.
  • a number of instructions are included to cause a computer device (which may be a personal computer, server, or network device, etc.) to perform all or part of the steps of the methods described in various embodiments of the present invention.
  • the foregoing storage medium includes various media that can store program code, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disc.

Abstract

Embodiments of the present invention disclose a method and an apparatus for image processing, for use in the field of image processing technologies. The method of the embodiments comprises: in a live video or video recording scene, acquiring facial expression data of a user by using a face recognition algorithm; acquiring a facial expression of an anthropomorphic three-dimensional model preset in the live video scene; and adjusting the facial expression of the anthropomorphic three-dimensional model according to the user's facial expression data, so that the facial expression of the anthropomorphic three-dimensional model changes following the user's facial expression. In the embodiments of the present invention, a face recognition algorithm is used to make the facial expression of the anthropomorphic three-dimensional model follow changes in the user's facial expression, which makes the display effect during live video streaming or video recording more entertaining and improves the user experience.

Description

Method and Apparatus for Image Processing
Technical Field
The present invention relates to the field of image processing technologies, and in particular to a method and an apparatus for image processing.
Background
Face recognition is a biometric identification technology that identifies a person based on facial feature information. A camera captures images or video streams containing faces, automatically detects and tracks the faces in the images, and then applies a series of face-related techniques to the detected faces; this is also commonly called portrait recognition or facial recognition.
Although face recognition technology has been developing and finding its way into more and more aspects of daily life, its application in certain fields remains to be explored.
Summary of the Invention
Embodiments of the present invention provide a method and an apparatus for image processing, which use a face recognition algorithm to make the facial expression of an anthropomorphic three-dimensional model follow changes in a user's facial expression, making the display effect during live video streaming or video recording more entertaining and improving the user experience.
In a first aspect, the present application provides a method for image processing, the method comprising:
in a live video or video recording scene, acquiring facial expression data of a user by using a face recognition algorithm;
acquiring a facial expression of an anthropomorphic three-dimensional model preset in the live video scene;
adjusting the facial expression of the anthropomorphic three-dimensional model according to the user's facial expression data, so that the facial expression of the anthropomorphic three-dimensional model changes following the user's facial expression.
Preferably, the step of acquiring the user's facial expression data by using a face recognition algorithm specifically includes:
after recognizing the user's face by using a face recognition algorithm, marking the positions of specific key points on the user's face;
detecting, according to the specific key point positions, the states of the specific key point positions at a preset time;
acquiring, by using a face recognition algorithm, the orientation information of the user's face in three-dimensional space and the gaze direction of the user's eyes;
wherein the user's facial expression data includes the states of the specific key point positions within the preset time, the orientation information of the user's face in three-dimensional space, and the gaze direction of the user's eyes.
Preferably, the specific key points include eye key points, eyebrow key points, and mouth key points;
the step of detecting, according to the specific key point positions, the states of the specific key point positions at a preset time specifically includes:
calculating the open/closed state and the size of the user's eyes according to the eye key points;
calculating the raising amplitude of the user's eyebrows according to the eyebrow key points;
calculating the opening size of the user's mouth according to the mouth key points.
Preferably, the step of adjusting the facial expression of the anthropomorphic three-dimensional model according to the user's facial expression data, so that the facial expression of the anthropomorphic three-dimensional model changes following the user's facial expression, specifically includes:
rendering the eye parts of the anthropomorphic three-dimensional model transparent, and rendering a transparent gap between the upper and lower lips of the model's mouth so that teeth can be drawn;
rotating by the Euler angles of the orientation information of the user's face in three-dimensional space to obtain a rotation transformation matrix;
acquiring pre-made eye textures and mouth textures, and attaching the eye textures and mouth textures to the face of the anthropomorphic three-dimensional model;
adjusting the eye textures according to the open/closed state and size of the user's eyes and the gaze direction of the user's eyes; adjusting the mouth textures according to the mouth opening size;
applying the rotation transformation matrix to the anthropomorphic three-dimensional model to change the orientation of the anthropomorphic three-dimensional model, so that the facial expression of the anthropomorphic three-dimensional model follows the changes in the user's facial expression.
Preferably, the step of adjusting the facial expression of the anthropomorphic three-dimensional model according to the user's facial expression data, so that the facial expression of the anthropomorphic three-dimensional model changes following the user's facial expression, specifically further includes:
in 3D modeling software, randomly applying preset pre-made skeletal animations to generate small-amplitude motions and subtle expressions, and applying them to the face of the anthropomorphic three-dimensional model.
In a second aspect, the present application provides an apparatus for image processing, the apparatus comprising:
a user expression acquisition module, configured to acquire facial expression data of a user in a live video or video recording scene by using a face recognition algorithm;
a model expression acquisition module, configured to acquire a facial expression of an anthropomorphic three-dimensional model preset in the live video scene;
an adjustment module, configured to adjust the facial expression of the anthropomorphic three-dimensional model according to the user's facial expression data, so that the facial expression of the anthropomorphic three-dimensional model changes following the user's facial expression.
Preferably, the user expression acquisition module specifically includes:
a marking unit, configured to mark the positions of specific key points on the user's face after recognizing the user's face by using a face recognition algorithm;
a detecting unit, configured to detect, according to the specific key point positions, the states of the specific key point positions at a preset time;
an acquiring unit, configured to acquire, by using a face recognition algorithm, the orientation information of the user's face in three-dimensional space and the gaze direction of the user's eyes;
wherein the user's facial expression data includes the states of the specific key point positions within the preset time, the orientation information of the user's face in three-dimensional space, and the gaze direction of the user's eyes.
Preferably, the specific key points include eye key points, eyebrow key points, and mouth key points;
the detecting unit is specifically configured to:
calculate the open/closed state and the size of the user's eyes according to the eye key points;
calculate the raising amplitude of the user's eyebrows according to the eyebrow key points;
calculate the opening size of the user's mouth according to the mouth key points.
Preferably, the adjustment module is specifically configured to:
render the eye parts of the anthropomorphic three-dimensional model transparent, and render a transparent gap between the upper and lower lips of the model's mouth so that teeth can be drawn;
rotate by the Euler angles of the orientation information of the user's face in three-dimensional space to obtain a rotation transformation matrix;
acquire pre-made eye textures and mouth textures, and attach the eye textures and mouth textures to the face of the anthropomorphic three-dimensional model;
adjust the eye textures according to the open/closed state and size of the user's eyes and the gaze direction of the user's eyes, and adjust the mouth textures according to the mouth opening size;
apply the rotation transformation matrix to the anthropomorphic three-dimensional model to change the orientation of the anthropomorphic three-dimensional model, so that the facial expression of the anthropomorphic three-dimensional model follows the changes in the user's facial expression.
Preferably, the adjustment module is further specifically configured to:
in 3D modeling software, randomly apply preset pre-made skeletal animations to generate small-amplitude motions and subtle expressions, and apply them to the face of the anthropomorphic three-dimensional model.
It can be seen from the above technical solutions that the embodiments of the present invention have the following advantages:
In the embodiments of the present invention, in a live video or video recording scene, a face recognition algorithm is used to acquire facial expression data of a user; the facial expression of an anthropomorphic three-dimensional model preset in the live video scene is acquired; and the facial expression of the anthropomorphic three-dimensional model is adjusted according to the user's facial expression data, so that the facial expression of the anthropomorphic three-dimensional model changes following the user's facial expression. The face recognition algorithm thus makes the facial expression of the anthropomorphic three-dimensional model follow changes in the user's facial expression, making the display effect during live video streaming or video recording more entertaining and improving the user experience.
Brief Description of the Drawings
FIG. 1 is a schematic diagram of an embodiment of the method of image processing in an embodiment of the present invention;
FIG. 2 is a schematic diagram of an embodiment of step S101 in the embodiment shown in FIG. 1;
FIG. 3 is a schematic diagram of the 68 facial key points marked by the OpenFace face recognition algorithm;
FIG. 4 is a schematic diagram of an embodiment of a virtual three-dimensional box constructed according to the orientation information of the face in three-dimensional space in an embodiment of the present invention;
FIG. 5 is a schematic diagram of an embodiment of recognizing the gaze direction of the user's eyes with a face recognition algorithm in an embodiment of the present invention;
FIG. 6 is a schematic diagram of an embodiment of step S1022 in the embodiment shown in FIG. 2;
FIG. 7 is a schematic diagram of an embodiment of step S103 in the embodiment shown in FIG. 1;
FIG. 8 is a schematic diagram of an embodiment of processing the eye texture and mouth texture of the anthropomorphic three-dimensional model;
FIG. 9 is a schematic diagram of an embodiment of the apparatus for image processing in an embodiment of the present invention;
FIG. 10 is a schematic diagram of another embodiment of the apparatus for image processing in an embodiment of the present invention.
Detailed Description of the Embodiments
To enable those skilled in the art to better understand the solutions of the present invention, the technical solutions in the embodiments of the present invention are described below clearly and completely with reference to the accompanying drawings. Apparently, the described embodiments are merely some rather than all of the embodiments of the present invention. All other embodiments obtained by those of ordinary skill in the art based on the embodiments of the present invention without creative effort shall fall within the protection scope of the present invention.
The terms "first", "second", and the like (if present) in the specification, the claims, and the above drawings of the present invention are used to distinguish similar objects and are not necessarily used to describe a particular order or sequence. It should be understood that data so used are interchangeable where appropriate, so that the embodiments described herein can be implemented in orders other than those illustrated or described herein. In addition, the terms "include" and "have" and any variants thereof are intended to cover non-exclusive inclusion; for example, a process, method, system, product, or device that includes a series of steps or units is not necessarily limited to the steps or units expressly listed, but may include other steps or units that are not expressly listed or that are inherent to such process, method, product, or device.
The following first describes the method of image processing in the embodiments of the present invention. The method is applied to an image processing apparatus, which may be located in a fixed terminal, such as a desktop computer or a server, or in a mobile terminal, such as a mobile phone or a tablet computer.
Referring to FIG. 1, an embodiment of the method of image processing in an embodiment of the present invention includes:
S101. In a live video or video recording scene, acquire facial expression data of a user by using a face recognition algorithm.
In this embodiment of the present invention, the face recognition algorithm may be the OpenFace face recognition algorithm. OpenFace is an open-source face recognition and facial key point tracking algorithm. It is mainly used to detect the face region and then mark the positions of facial feature key points; OpenFace marks 68 facial key points and can track eyeball orientation and face orientation.
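As an illustrative sketch only (the patent uses OpenFace, whose exact interface is not reproduced here), the same 68-point facial markup is available in the open-source dlib library. A minimal Python example of detecting a face and reading out the 68 key points could look like the following; the predictor file name is dlib's published 68-landmark model, and dlib indexes points from 0, so key point n in FIG. 3 corresponds to index n - 1:

    # Minimal sketch (not from the patent): detect a face and read 68 landmarks
    # with dlib, which uses the same 68-point markup as FIG. 3 (0-based indices).
    import dlib
    import numpy as np

    detector = dlib.get_frontal_face_detector()
    predictor = dlib.shape_predictor("shape_predictor_68_face_landmarks.dat")

    def get_landmarks(image):
        """Return a (68, 2) array of pixel coordinates, or None if no face is found."""
        faces = detector(image, 1)          # upsample once to catch small faces
        if len(faces) == 0:
            return None
        shape = predictor(image, faces[0])  # track the first detected face
        return np.array([(p.x, p.y) for p in shape.parts()], dtype=np.float64)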
S102. Acquire a facial expression of an anthropomorphic three-dimensional model preset in the live video scene.
In this embodiment of the present invention, the anthropomorphic three-dimensional model is not limited to a virtual animal or a virtual pet; it may also be a natural object, for example an anthropomorphic cabbage or an anthropomorphic table, or a virtual three-dimensional character or virtual three-dimensional animal from animation, which is not specifically limited herein.
Acquiring the facial expression of the anthropomorphic three-dimensional model preset in the live video scene may be done by directly acquiring an image frame of the current facial expression of the anthropomorphic three-dimensional model, where the image frame includes the facial expression of the anthropomorphic three-dimensional model.
S103. Adjust the facial expression of the anthropomorphic three-dimensional model according to the user's facial expression data, so that the facial expression of the anthropomorphic three-dimensional model changes following the user's facial expression.
It should be noted that in this embodiment of the present invention, the user's expression data and the expression data of the anthropomorphic three-dimensional model may be acquired frame by frame, and the subsequent adjustment may likewise be made frame by frame.
In the embodiments of the present invention, in a live video or video recording scene, a face recognition algorithm is used to acquire facial expression data of a user; the facial expression of an anthropomorphic three-dimensional model preset in the live video scene is acquired; and the facial expression of the anthropomorphic three-dimensional model is adjusted according to the user's facial expression data, so that it changes following the user's facial expression. The face recognition algorithm thus makes the facial expression of the anthropomorphic three-dimensional model follow changes in the user's facial expression, making the display effect during live video streaming or video recording more entertaining and improving the user experience.
Preferably, as shown in FIG. 2, step S101 may specifically include:
S1021. After recognizing the user's face by using a face recognition algorithm, mark the positions of specific key points on the user's face.
This embodiment of the present invention takes the OpenFace face recognition algorithm as an example. After a face is detected with OpenFace face recognition, the positions of the facial key points are marked and tracked. The feature points to be used are recorded from these points; the three facial features of eyes, eyebrows, and mouth are taken as examples. FIG. 3 shows the 68 facial key points marked by OpenFace.
In FIG. 3, 68 facial feature points are marked and numbered 1 to 68. Taking the eyes, eyebrows, and mouth as examples, the key point numbers to be used are as follows:
Eye (left): 37, 38, 39, 40, 41, 42
Eye (right): 43, 44, 45, 46, 47, 48
Eyebrow (left): 18, 19, 20, 21, 22
Eyebrow (right): 23, 24, 25, 26, 27
Mouth: 49, 55, 61, 62, 63, 64, 65, 66, 67, 68
In this embodiment of the present invention, the OpenFace face recognition algorithm returns the pixel coordinates of the 68 facial key points.
S1022. Detect, according to the specific key point positions, the states of the specific key point positions at a preset time.
Based on the above specific key point positions, the states of the specific key point positions at a preset time can be calculated, for example the eye open/closed state, eye size, eyebrow raising amplitude, and mouth opening size.
S1023. Acquire, by using a face recognition algorithm, the orientation information of the user's face in three-dimensional space and the gaze direction of the user's eyes.
The user's facial expression data includes the states of the specific key point positions within the preset time, the orientation information of the user's face in three-dimensional space, and the gaze direction of the user's eyes.
In this embodiment of the present invention, the OpenFace face recognition algorithm is used to obtain the orientation information of the user's face in three-dimensional space. The orientation information includes three steering angles: yaw, pitch, and roll. A virtual three-dimensional box is constructed from the three steering angles to indicate the orientation, specifically the rectangular box shown in FIG. 4. Meanwhile, as shown in FIG. 5, the gaze direction of the user's eyes can be recognized directly by the OpenFace face recognition algorithm; the white lines over the eyes in FIG. 5 indicate the recognized gaze direction.
Preferably, in this embodiment of the present invention, the specific key points include eye key points, eyebrow key points, and mouth key points, each of which includes one or more key points.
As shown in FIG. 6, step S1022 may specifically include:
S10221. Calculate the open/closed state and the size of the user's eyes according to the eye key points.
The distance formula used in this calculation is the Euclidean pixel distance d = √((x2 − x1)² + (y2 − y1)²), where:
a: key point a, with pixel coordinates (x1, y1);
b: key point b, with pixel coordinates (x2, y2);
d: the length of the distance from key point a to key point b.
The details of calculating the eye open/closed state are as follows:
Taking the left eye as an example, calculate the pixel distance a between key point 38 and key point 42 in FIG. 3 and the pixel distance b between key points 39 and 41, and take their average c = (a + b)/2; c is the eye height. Calculate the pixel distance d between key points 37 and 40; d is the eye width. When a/d < 0.15 (0.15 is an empirical value), the eye is judged to be closed. The open/closed state of the right eye is calculated in the same way.
The details of calculating the eye size are as follows:
Using the above results c (the eye height) and d (the eye width), the height and width of the eye's rectangular region are obtained. This rectangular region represents the eye size.
S10222. Calculate the raising amplitude of the user's eyebrows according to the eyebrow key points.
In this embodiment of the present invention, the details of calculating the eyebrow raising amplitude are as follows:
Taking the left eye as an example, calculate the pixel distance e between key point 20, at the highest point of the eyebrow arch, and eye key point 38. Since raising or lowering the head and swinging it left and right affect this value, the calculation is normalized by the face width: the face width f is the distance between key point 3 and key point 15, and the eyebrow raising amplitude is e/f. When the eyebrows are raised, the value of e/f changes accordingly; the eyebrow raising amplitude is therefore calculated relative to the minimum value of e/f, and using this minimum as the baseline allows eyebrow raises to be detected quickly and reliably.
S10223. Calculate the opening size of the user's mouth according to the mouth key points.
In this embodiment of the present invention, the details of calculating the mouth opening size are as follows:
Calculate the pixel distance g between key point 63 and key point 67, and the pixel distance h between key point 61 and key point 65. The mouth opening size is g/h.
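The calculations in S10221 to S10223 are simple ratios of distances between landmarks. A minimal Python sketch of these three computations, assuming a (68, 2) landmark array in which row n - 1 holds key point n of FIG. 3 (the 0.15 threshold is the empirical value stated above), is:

    import numpy as np

    def dist(pts, i, j):
        """Euclidean pixel distance between key points i and j (1-based, as in FIG. 3)."""
        return float(np.linalg.norm(pts[i - 1] - pts[j - 1]))

    def eye_state(pts):
        """Left-eye open/closed state and size (S10221)."""
        a = dist(pts, 38, 42)            # upper-to-lower lid distance
        b = dist(pts, 39, 41)
        c = (a + b) / 2.0                # eye height
        d = dist(pts, 37, 40)            # eye width
        closed = (a / d) < 0.15          # 0.15 is the stated empirical value
        return closed, (c, d)            # height and width of the eye rectangle

    def eyebrow_amplitude(pts):
        """Eyebrow raising amplitude e/f (S10222), normalized by face width."""
        e = dist(pts, 20, 38)            # brow peak to eye key point
        f = dist(pts, 3, 15)             # face width baseline
        return e / f                     # compared against its running minimum

    def mouth_opening(pts):
        """Mouth opening ratio g/h (S10223)."""
        g = dist(pts, 63, 67)            # vertical inner-lip distance
        h = dist(pts, 61, 65)            # horizontal inner-lip distance
        return g / h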
Preferably, as shown in FIG. 7, step S103 may specifically include:
S1031. Render the eye parts of the anthropomorphic three-dimensional model transparent, and render a transparent gap between the upper and lower lips of the model's mouth so that teeth can be drawn.
S1032. Rotate by the Euler angles of the orientation information of the user's face in three-dimensional space to obtain a rotation transformation matrix.
Let the previously obtained orientation of the user's face in three-dimensional space be given by the yaw, pitch, and roll angles θ, φ, and ψ, respectively. The rotation transformation matrix M corresponding to rotation by these Euler angles is then:
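The matrix itself appears only as a figure in the original publication. The patent does not state the composition order of the three elemental rotations, so the reconstruction below assumes one common yaw-pitch-roll convention (yaw about the y-axis, pitch about the x-axis, roll about the z-axis), written in LaTeX:

    M = R_{yaw}(\theta)\, R_{pitch}(\varphi)\, R_{roll}(\psi)
      = \begin{pmatrix} \cos\theta & 0 & \sin\theta \\ 0 & 1 & 0 \\ -\sin\theta & 0 & \cos\theta \end{pmatrix}
        \begin{pmatrix} 1 & 0 & 0 \\ 0 & \cos\varphi & -\sin\varphi \\ 0 & \sin\varphi & \cos\varphi \end{pmatrix}
        \begin{pmatrix} \cos\psi & -\sin\psi & 0 \\ \sin\psi & \cos\psi & 0 \\ 0 & 0 & 1 \end{pmatrix}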
Applying this rotation transformation matrix to a three-dimensional object changes the orientation of the three-dimensional object.
S1033. Acquire pre-made eye textures and mouth textures, and attach the eye textures and mouth textures to the face of the anthropomorphic three-dimensional model.
The preset eye textures and mouth textures may be preset reference eye textures and reference mouth textures of the anthropomorphic three-dimensional model.
Attaching the eye textures and mouth textures to the face of the anthropomorphic three-dimensional model may be done by aligning the facial key points recognized by the OpenFace face recognition algorithm with the eye openings and the mouth opening of the anthropomorphic three-dimensional model for texture mapping.
S1034. Adjust the eye textures according to the open/closed state and size of the user's eyes and the gaze direction of the user's eyes, and adjust the mouth textures according to the mouth opening size.
Specifically, the texture around the eye openings and the mouth opening is stretched according to the user's eye open/closed state and eye size, and the aspect ratios of the eye-opening and mouth-opening rectangles are then constrained according to the eye size and the mouth opening size, respectively. As shown in FIG. 8, the eye texture position is calculated from the user's gaze direction to handle the rotation and orientation of the model's eyeballs; the eyeball orientation changes only the position of the eye texture and does not affect its size.
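As a loose illustration of this step (nothing below appears in the patent, and the shift scale is an arbitrary assumption), mapping the gaze direction to a positional offset of the eye texture while leaving its size untouched might look like:

    def eye_texture_offset(gaze_dir, max_shift=0.1):
        """Map a unit gaze direction (x, y, z) to a UV offset for the eye texture.

        Only the texture position follows the gaze; the texture size is unchanged.
        max_shift (in UV units) is an arbitrary tuning constant, not from the patent.
        """
        gx, gy, _ = gaze_dir
        return (gx * max_shift, gy * max_shift)

    # Example: gaze slightly to the left and upward.
    u_off, v_off = eye_texture_offset((0.3, -0.2, 0.93))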
S1035. Apply the rotation transformation matrix to the anthropomorphic three-dimensional model to change the orientation of the anthropomorphic three-dimensional model, so that the facial expression of the anthropomorphic three-dimensional model follows the changes in the user's facial expression.
Taking OpenGL 2.0 GPU programming as an example, the code that applies the transformation matrix M to the three-dimensional model is as follows:
Vertex shader code:
(The vertex shader listing is published as an image, Figure PCTCN2017075742-appb-000001, and is not reproduced here.)
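Based on the variable descriptions that follow, a plausible reconstruction of that vertex shader is given below (OpenGL ES 2.0 GLSL; the exact original wording is unavailable, and the patent's identifier spelling 'posion' is kept as-is):

    // Reconstructed sketch of the vertex shader (the original is published only as an image).
    // 'posion' is the vertex position of the 3DS MAX model (spelling as in the patent text).
    attribute vec4 posion;                 // model vertex coordinates
    attribute vec2 inputTextureCoordinate; // texture coordinates for this vertex
    varying vec2 textureCoordinate;        // passed on to the fragment shader
    uniform mat4 matrixM;                  // rotation transformation matrix M

    void main()
    {
        textureCoordinate = inputTextureCoordinate;
        gl_Position = matrixM * posion;    // rotate the vertex, then hand it to OpenGL
    }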
Here, posion is the coordinate of a vertex of the three-dimensional model created with the 3DS MAX three-dimensional modeling software;
inputTextureCoordinate is the texture coordinate corresponding to the vertex coordinates of the model created with 3DS MAX; textureCoordinate is the coordinate to be passed to the fragment shader; matrixM is the transformation matrix M, used to handle the model's rotation; and gl_Position is the vertex coordinate output to OpenGL for processing. matrixM*posion applies the rotation transformation to the vertex coordinates; assigning matrixM*posion to gl_Position gives the final coordinates after the model's rotation, and gl_Position is then handed to OpenGL's internal automatic processing to produce the picture of the model's head rotating.
Preferably, to make the motions of the simulated three-dimensional animal natural, small-amplitude motions and subtle expressions need to be generated randomly. The motions here use several sets of skeletal animations pre-made in 3D modeling software such as 3DS MAX, which are randomly applied, for example a natural ear swing or a slight natural shake of the head. Therefore, the step of adjusting the facial expression of the anthropomorphic three-dimensional model according to the user's facial expression data, so that the facial expression of the anthropomorphic three-dimensional model changes following the user's facial expression, may specifically further include:
In 3D modeling software (for example, 3DS MAX), randomly applying preset pre-made skeletal animations to generate small-amplitude motions and subtle expressions, and applying them to the face of the anthropomorphic three-dimensional model.
When the method of the present invention is applied in a live video scene: while the anchor or video recorder is showing his or her face, a small window is opened in a corner of the live or recorded picture to display the virtual anthropomorphic three-dimensional model; when the anchor or video recorder is unwilling to show his or her face, only the anthropomorphic three-dimensional model is shown in the small window to imitate the anchor's or recorder's expressions and movements, achieving audio-video synchronization.
The following describes embodiments of the apparatus for image processing in the embodiments of the present invention.
Referring to FIG. 9, which is a schematic diagram of an embodiment of the apparatus for image processing in an embodiment of the present invention, the apparatus includes:
a user expression acquisition module 901, configured to acquire facial expression data of a user in a live video or video recording scene by using a face recognition algorithm;
a model expression acquisition module 902, configured to acquire a facial expression of an anthropomorphic three-dimensional model preset in the live video scene;
an adjustment module 903, configured to adjust the facial expression of the anthropomorphic three-dimensional model according to the user's facial expression data, so that the facial expression of the anthropomorphic three-dimensional model changes following the user's facial expression.
Preferably, as shown in FIG. 10, the user expression acquisition module 901 may specifically include:
a marking unit 9011, configured to mark the positions of specific key points on the user's face after recognizing the user's face by using a face recognition algorithm;
a detecting unit 9012, configured to detect, according to the specific key point positions, the states of the specific key point positions at a preset time;
an acquiring unit 9013, configured to acquire, by using a face recognition algorithm, the orientation information of the user's face in three-dimensional space and the gaze direction of the user's eyes;
wherein the user's facial expression data includes the states of the specific key point positions within the preset time, the orientation information of the user's face in three-dimensional space, and the gaze direction of the user's eyes.
Preferably, the specific key points include eye key points, eyebrow key points, and mouth key points;
the detecting unit 9012 is specifically configured to:
calculate the open/closed state and the size of the user's eyes according to the eye key points;
calculate the raising amplitude of the user's eyebrows according to the eyebrow key points;
calculate the opening size of the user's mouth according to the mouth key points.
Preferably, the adjustment module 903 is specifically configured to:
render the eye parts of the anthropomorphic three-dimensional model transparent, and render a transparent gap between the upper and lower lips of the model's mouth so that teeth can be drawn;
rotate by the Euler angles of the orientation information of the user's face in three-dimensional space to obtain a rotation transformation matrix;
acquire pre-made eye textures and mouth textures, and attach the eye textures and mouth textures to the face of the anthropomorphic three-dimensional model;
adjust the eye textures according to the open/closed state and size of the user's eyes and the gaze direction of the user's eyes, and adjust the mouth textures according to the mouth opening size;
apply the rotation transformation matrix to the anthropomorphic three-dimensional model to change the orientation of the anthropomorphic three-dimensional model, so that the facial expression of the anthropomorphic three-dimensional model follows the changes in the user's facial expression.
Preferably, the adjustment module 903 is further specifically configured to:
in 3D modeling software, randomly apply preset pre-made skeletal animations to generate small-amplitude motions and subtle expressions, and apply them to the face of the anthropomorphic three-dimensional model.
Those skilled in the art can clearly understand that, for convenience and brevity of description, for the specific working processes of the systems, apparatuses, and units described above, reference may be made to the corresponding processes in the foregoing method embodiments, and details are not repeated here.
In the several embodiments provided in this application, it should be understood that the disclosed systems, apparatuses, and methods may be implemented in other manners. For example, the apparatus embodiments described above are merely illustrative; the division of units is merely a logical functional division, and in actual implementation there may be other division manners: multiple units or components may be combined or integrated into another system, or some features may be ignored or not executed. Moreover, the mutual couplings, direct couplings, or communication connections shown or discussed may be indirect couplings or communication connections through some interfaces, apparatuses, or units, and may be electrical, mechanical, or in other forms.
The units described as separate components may or may not be physically separated, and the components displayed as units may or may not be physical units; that is, they may be located in one place or distributed over multiple network units. Some or all of the units may be selected according to actual needs to achieve the objectives of the solutions of the embodiments.
In addition, the functional units in the embodiments of the present invention may be integrated into one processing unit, or each unit may exist physically on its own, or two or more units may be integrated into one unit. The above integrated unit may be implemented in the form of hardware or in the form of a software functional unit.
If the integrated unit is implemented in the form of a software functional unit and sold or used as an independent product, it may be stored in a computer-readable storage medium. Based on such an understanding, the part of the technical solutions of the present invention that is essential or that contributes to the prior art, or all or part of the technical solutions, may be embodied in the form of a software product. The computer software product is stored in a storage medium and includes a number of instructions for causing a computer device (which may be a personal computer, a server, a network device, or the like) to perform all or some of the steps of the methods described in the embodiments of the present invention. The foregoing storage medium includes various media that can store program code, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disc.
The above embodiments are merely intended to describe the technical solutions of the present invention, not to limit them. Although the present invention has been described in detail with reference to the foregoing embodiments, those of ordinary skill in the art should understand that they may still modify the technical solutions described in the foregoing embodiments or make equivalent replacements for some of the technical features therein, and such modifications or replacements do not make the essence of the corresponding technical solutions depart from the spirit and scope of the technical solutions of the embodiments of the present invention.

Claims (10)

  1. A method for image processing, characterized in that the method comprises:
    in a live video or video recording scene, acquiring facial expression data of a user by using a face recognition algorithm;
    acquiring a facial expression of an anthropomorphic three-dimensional model preset in the live video scene; and
    adjusting the facial expression of the anthropomorphic three-dimensional model according to the user's facial expression data, so that the facial expression of the anthropomorphic three-dimensional model changes following the user's facial expression.
  2. The method according to claim 1, characterized in that the step of acquiring the user's facial expression data by using a face recognition algorithm specifically comprises:
    after recognizing the user's face by using a face recognition algorithm, marking the positions of specific key points on the user's face;
    detecting, according to the specific key point positions, the states of the specific key point positions at a preset time; and
    acquiring, by using a face recognition algorithm, the orientation information of the user's face in three-dimensional space and the gaze direction of the user's eyes;
    wherein the user's facial expression data comprises the states of the specific key point positions within the preset time, the orientation information of the user's face in three-dimensional space, and the gaze direction of the user's eyes.
  3. The method according to claim 2, characterized in that the specific key points comprise eye key points, eyebrow key points, and mouth key points;
    the step of detecting, according to the specific key point positions, the states of the specific key point positions at a preset time specifically comprises:
    calculating the open/closed state and the size of the user's eyes according to the eye key points;
    calculating the raising amplitude of the user's eyebrows according to the eyebrow key points; and
    calculating the opening size of the user's mouth according to the mouth key points.
  4. The method according to claim 3, characterized in that the step of adjusting the facial expression of the anthropomorphic three-dimensional model according to the user's facial expression data, so that the facial expression of the anthropomorphic three-dimensional model changes following the user's facial expression, specifically comprises:
    rendering the eye parts of the anthropomorphic three-dimensional model transparent; rendering a transparent gap between the upper and lower lips of the mouth of the anthropomorphic three-dimensional model so that teeth can be drawn;
    rotating by the Euler angles of the orientation information of the user's face in three-dimensional space to obtain a rotation transformation matrix;
    acquiring pre-made eye textures and mouth textures, and attaching the eye textures and mouth textures to the face of the anthropomorphic three-dimensional model;
    adjusting the eye textures according to the open/closed state and size of the user's eyes and the gaze direction of the user's eyes; adjusting the mouth textures according to the mouth opening size; and
    applying the rotation transformation matrix to the anthropomorphic three-dimensional model to change the orientation of the anthropomorphic three-dimensional model, so that the facial expression of the anthropomorphic three-dimensional model follows the changes in the user's facial expression.
  5. The method according to any one of claims 1 to 4, characterized in that the step of adjusting the facial expression of the anthropomorphic three-dimensional model according to the user's facial expression data, so that the facial expression of the anthropomorphic three-dimensional model changes following the user's facial expression, specifically further comprises:
    in 3D modeling software, randomly applying preset pre-made skeletal animations to generate small-amplitude motions and subtle expressions, and applying them to the face of the anthropomorphic three-dimensional model.
  6. An apparatus for image processing, characterized in that the apparatus comprises:
    a user expression acquisition module, configured to acquire facial expression data of a user in a live video or video recording scene by using a face recognition algorithm;
    a model expression acquisition module, configured to acquire a facial expression of an anthropomorphic three-dimensional model preset in the live video scene; and
    an adjustment module, configured to adjust the facial expression of the anthropomorphic three-dimensional model according to the user's facial expression data, so that the facial expression of the anthropomorphic three-dimensional model changes following the user's facial expression.
  7. The apparatus according to claim 6, characterized in that the user expression acquisition module specifically comprises:
    a marking unit, configured to mark the positions of specific key points on the user's face after recognizing the user's face by using a face recognition algorithm;
    a detecting unit, configured to detect, according to the specific key point positions, the states of the specific key point positions at a preset time; and
    an acquiring unit, configured to acquire, by using a face recognition algorithm, the orientation information of the user's face in three-dimensional space and the gaze direction of the user's eyes;
    wherein the user's facial expression data comprises the states of the specific key point positions within the preset time, the orientation information of the user's face in three-dimensional space, and the gaze direction of the user's eyes.
  8. The apparatus according to claim 7, characterized in that the specific key points comprise eye key points, eyebrow key points, and mouth key points;
    the detecting unit is specifically configured to:
    calculate the open/closed state and the size of the user's eyes according to the eye key points;
    calculate the raising amplitude of the user's eyebrows according to the eyebrow key points; and
    calculate the opening size of the user's mouth according to the mouth key points.
  9. The apparatus according to claim 8, characterized in that the adjustment module is specifically configured to:
    render the eye parts of the anthropomorphic three-dimensional model transparent, and render a transparent gap between the upper and lower lips of the mouth of the anthropomorphic three-dimensional model so that teeth can be drawn;
    rotate by the Euler angles of the orientation information of the user's face in three-dimensional space to obtain a rotation transformation matrix;
    acquire pre-made eye textures and mouth textures, and attach the eye textures and mouth textures to the face of the anthropomorphic three-dimensional model;
    adjust the eye textures according to the open/closed state and size of the user's eyes and the gaze direction of the user's eyes, and adjust the mouth textures according to the mouth opening size; and
    apply the rotation transformation matrix to the anthropomorphic three-dimensional model to change the orientation of the anthropomorphic three-dimensional model, so that the facial expression of the anthropomorphic three-dimensional model follows the changes in the user's facial expression.
  10. The apparatus according to any one of claims 6 to 9, characterized in that the adjustment module is further specifically configured to:
    in 3D modeling software, randomly apply preset pre-made skeletal animations to generate small-amplitude motions and subtle expressions, and apply them to the face of the anthropomorphic three-dimensional model.
PCT/CN2017/075742 2016-12-09 2017-03-06 Method and apparatus for image processing WO2018103220A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201611129431.3A CN108229239B (zh) 2016-12-09 2016-12-09 Method and apparatus for image processing
CN201611129431.3 2016-12-09

Publications (1)

Publication Number Publication Date
WO2018103220A1 true WO2018103220A1 (zh) 2018-06-14

Family

ID=62490579

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2017/075742 WO2018103220A1 (zh) 2016-12-09 2017-03-06 一种图像处理的方法及装置

Country Status (2)

Country Link
CN (1) CN108229239B (zh)
WO (1) WO2018103220A1 (zh)

Cited By (19)

Publication number Priority date Publication date Assignee Title
CN109064548A (zh) * 2018-07-03 2018-12-21 百度在线网络技术(北京)有限公司 视频生成方法、装置、设备及存储介质
CN109308731A (zh) * 2018-08-24 2019-02-05 浙江大学 级联卷积lstm的语音驱动唇形同步人脸视频合成算法
CN110335194A (zh) * 2019-06-28 2019-10-15 广州久邦世纪科技有限公司 一种人脸变老图像处理方法
CN110458751A (zh) * 2019-06-28 2019-11-15 广东智媒云图科技股份有限公司 一种基于粤剧图片的面部替换方法、设备及介质
CN110610546A (zh) * 2018-06-15 2019-12-24 Oppo广东移动通信有限公司 视频画面显示方法、装置、终端及存储介质
CN110782529A (zh) * 2019-10-24 2020-02-11 重庆灵翎互娱科技有限公司 一种基于三维人脸实现眼球转动效果的方法和设备
CN110969673A (zh) * 2018-09-30 2020-04-07 武汉斗鱼网络科技有限公司 一种直播换脸交互实现方法、存储介质、设备及系统
CN111161418A (zh) * 2019-11-25 2020-05-15 西安夏光网络科技有限责任公司 面部美容整形仿真模拟方法
CN111444743A (zh) * 2018-12-27 2020-07-24 北京奇虎科技有限公司 一种视频人像替换方法及装置
CN111563465A (zh) * 2020-05-12 2020-08-21 淮北师范大学 一种动物行为学自动分析系统
CN111638784A (zh) * 2020-05-26 2020-09-08 浙江商汤科技开发有限公司 人脸表情互动方法、互动装置以及计算机存储介质
CN112434578A (zh) * 2020-11-13 2021-03-02 浙江大华技术股份有限公司 口罩佩戴规范性检测方法、装置、计算机设备和存储介质
CN112614213A (zh) * 2020-12-14 2021-04-06 杭州网易云音乐科技有限公司 人脸表情确定方法、表情参数确定模型、介质及设备
CN112652041A (zh) * 2020-12-18 2021-04-13 北京大米科技有限公司 虚拟形象的生成方法、装置、存储介质及电子设备
CN112862859A (zh) * 2020-08-21 2021-05-28 海信视像科技股份有限公司 一种人脸特征值创建方法、人物锁定追踪方法及显示设备
CN112906494A (zh) * 2021-01-27 2021-06-04 浙江大学 一种面部捕捉方法、装置、电子设备及存储介质
CN113436301A (zh) * 2020-03-20 2021-09-24 华为技术有限公司 拟人化3d模型生成的方法和装置
WO2021209042A1 (zh) * 2020-04-16 2021-10-21 广州虎牙科技有限公司 三维模型驱动方法、装置、电子设备及存储介质
CN113946221A (zh) * 2021-11-03 2022-01-18 广州繁星互娱信息科技有限公司 眼部驱动控制方法和装置、存储介质及电子设备

Families Citing this family (20)

Publication number Priority date Publication date Assignee Title
CN108985241B (zh) * 2018-07-23 2023-05-02 腾讯科技(深圳)有限公司 图像处理方法、装置、计算机设备及存储介质
CN109165578A (zh) * 2018-08-08 2019-01-08 盎锐(上海)信息科技有限公司 基于拍摄装置的表情检测装置及数据处理方法
CN109147024A (zh) * 2018-08-16 2019-01-04 Oppo广东移动通信有限公司 基于三维模型的表情更换方法和装置
CN111200747A (zh) * 2018-10-31 2020-05-26 百度在线网络技术(北京)有限公司 基于虚拟形象的直播方法和装置
CN111144169A (zh) * 2018-11-02 2020-05-12 深圳比亚迪微电子有限公司 人脸识别方法、装置和电子设备
CN109509242B (zh) * 2018-11-05 2023-12-29 网易(杭州)网络有限公司 虚拟对象面部表情生成方法及装置、存储介质、电子设备
CN109621418B (zh) * 2018-12-03 2022-09-30 网易(杭州)网络有限公司 一种游戏中虚拟角色的表情调整及制作方法、装置
CN109784175A (zh) * 2018-12-14 2019-05-21 深圳壹账通智能科技有限公司 基于微表情识别的异常行为人识别方法、设备和存储介质
CN109727303B (zh) * 2018-12-29 2023-07-25 广州方硅信息技术有限公司 视频展示方法、系统、计算机设备、存储介质和终端
CN111435546A (zh) * 2019-01-15 2020-07-21 北京字节跳动网络技术有限公司 模型动作方法、装置、带屏音箱、电子设备及存储介质
CN111460870A (zh) 2019-01-18 2020-07-28 北京市商汤科技开发有限公司 目标的朝向确定方法及装置、电子设备及存储介质
WO2020147794A1 (zh) * 2019-01-18 2020-07-23 北京市商汤科技开发有限公司 图像处理方法及装置、图像设备及存储介质
CN111507143B (zh) * 2019-01-31 2023-06-02 北京字节跳动网络技术有限公司 表情图像效果生成方法、装置和电子设备
CN110035271B (zh) * 2019-03-21 2020-06-02 北京字节跳动网络技术有限公司 保真图像生成方法、装置及电子设备
CN111178294A (zh) * 2019-12-31 2020-05-19 北京市商汤科技开发有限公司 状态识别方法、装置、设备及存储介质
CN111986301A (zh) * 2020-09-04 2020-11-24 网易(杭州)网络有限公司 直播中数据处理的方法及装置、电子设备、存储介质
CN112258382A (zh) * 2020-10-23 2021-01-22 北京中科深智科技有限公司 一种基于图像到图像的面部风格转移方法和系统
CN112528835B (zh) * 2020-12-08 2023-07-04 北京百度网讯科技有限公司 表情预测模型的训练方法、识别方法、装置及电子设备
CN115334325A (zh) * 2022-06-23 2022-11-11 联通沃音乐文化有限公司 基于可编辑三维虚拟形象生成直播视频流的方法和系统
CN115797523B (zh) * 2023-01-05 2023-04-18 武汉创研时代科技有限公司 一种基于人脸动作捕捉技术的虚拟角色处理系统及方法

Citations (5)

Publication number Priority date Publication date Assignee Title
CN1920886A (zh) * 2006-09-14 2007-02-28 浙江大学 基于视频流的三维动态人脸表情建模方法
US20130215113A1 (en) * 2012-02-21 2013-08-22 Mixamo, Inc. Systems and methods for animating the faces of 3d characters using images of human faces
WO2014205239A1 (en) * 2013-06-20 2014-12-24 Elwha Llc Systems and methods for enhancement of facial expressions
US9479736B1 (en) * 2013-03-12 2016-10-25 Amazon Technologies, Inc. Rendered audiovisual communication
CN106060572A (zh) * 2016-06-08 2016-10-26 乐视控股(北京)有限公司 视频播放方法及装置

Family Cites Families (3)

Publication number Priority date Publication date Assignee Title
CN103116902A (zh) * 2011-11-16 2013-05-22 华为软件技术有限公司 三维虚拟人头像生成方法、人头像运动跟踪方法和装置
CN103389798A (zh) * 2013-07-23 2013-11-13 深圳市欧珀通信软件有限公司 一种操作移动终端的方法及装置
US9898849B2 (en) * 2014-11-05 2018-02-20 Intel Corporation Facial expression based avatar rendering in video animation and method

Patent Citations (5)

Publication number Priority date Publication date Assignee Title
CN1920886A (zh) * 2006-09-14 2007-02-28 浙江大学 基于视频流的三维动态人脸表情建模方法
US20130215113A1 (en) * 2012-02-21 2013-08-22 Mixamo, Inc. Systems and methods for animating the faces of 3d characters using images of human faces
US9479736B1 (en) * 2013-03-12 2016-10-25 Amazon Technologies, Inc. Rendered audiovisual communication
WO2014205239A1 (en) * 2013-06-20 2014-12-24 Elwha Llc Systems and methods for enhancement of facial expressions
CN106060572A (zh) * 2016-06-08 2016-10-26 乐视控股(北京)有限公司 视频播放方法及装置

Cited By (33)

Publication number Priority date Publication date Assignee Title
CN110610546A (zh) * 2018-06-15 2019-12-24 Oppo广东移动通信有限公司 视频画面显示方法、装置、终端及存储介质
CN110610546B (zh) * 2018-06-15 2023-03-28 Oppo广东移动通信有限公司 视频画面显示方法、装置、终端及存储介质
CN109064548B (zh) * 2018-07-03 2023-11-03 百度在线网络技术(北京)有限公司 视频生成方法、装置、设备及存储介质
CN109064548A (zh) * 2018-07-03 2018-12-21 百度在线网络技术(北京)有限公司 视频生成方法、装置、设备及存储介质
CN109308731A (zh) * 2018-08-24 2019-02-05 浙江大学 级联卷积lstm的语音驱动唇形同步人脸视频合成算法
CN109308731B (zh) * 2018-08-24 2023-04-25 浙江大学 级联卷积lstm的语音驱动唇形同步人脸视频合成算法
CN110969673B (zh) * 2018-09-30 2023-12-15 西藏博今文化传媒有限公司 一种直播换脸交互实现方法、存储介质、设备及系统
CN110969673A (zh) * 2018-09-30 2020-04-07 武汉斗鱼网络科技有限公司 一种直播换脸交互实现方法、存储介质、设备及系统
CN111444743A (zh) * 2018-12-27 2020-07-24 北京奇虎科技有限公司 一种视频人像替换方法及装置
CN110458751B (zh) * 2019-06-28 2023-03-24 广东智媒云图科技股份有限公司 一种基于粤剧图片的面部替换方法、设备及介质
CN110335194B (zh) * 2019-06-28 2023-11-10 广州久邦世纪科技有限公司 一种人脸变老图像处理方法
CN110458751A (zh) * 2019-06-28 2019-11-15 广东智媒云图科技股份有限公司 一种基于粤剧图片的面部替换方法、设备及介质
CN110335194A (zh) * 2019-06-28 2019-10-15 广州久邦世纪科技有限公司 一种人脸变老图像处理方法
CN110782529B (zh) * 2019-10-24 2024-04-05 重庆灵翎互娱科技有限公司 一种基于三维人脸实现眼球转动效果的方法和设备
CN110782529A (zh) * 2019-10-24 2020-02-11 重庆灵翎互娱科技有限公司 一种基于三维人脸实现眼球转动效果的方法和设备
CN111161418A (zh) * 2019-11-25 2020-05-15 西安夏光网络科技有限责任公司 面部美容整形仿真模拟方法
CN111161418B (zh) * 2019-11-25 2023-04-25 西安夏光网络科技有限责任公司 面部美容整形仿真模拟方法
CN113436301A (zh) * 2020-03-20 2021-09-24 华为技术有限公司 拟人化3d模型生成的方法和装置
CN113436301B (zh) * 2020-03-20 2024-04-09 华为技术有限公司 拟人化3d模型生成的方法和装置
WO2021209042A1 (zh) * 2020-04-16 2021-10-21 广州虎牙科技有限公司 三维模型驱动方法、装置、电子设备及存储介质
CN111563465B (zh) * 2020-05-12 2023-02-07 淮北师范大学 一种动物行为学自动分析系统
CN111563465A (zh) * 2020-05-12 2020-08-21 淮北师范大学 一种动物行为学自动分析系统
CN111638784A (zh) * 2020-05-26 2020-09-08 浙江商汤科技开发有限公司 人脸表情互动方法、互动装置以及计算机存储介质
CN111638784B (zh) * 2020-05-26 2023-07-18 浙江商汤科技开发有限公司 人脸表情互动方法、互动装置以及计算机存储介质
CN112862859A (zh) * 2020-08-21 2021-05-28 海信视像科技股份有限公司 一种人脸特征值创建方法、人物锁定追踪方法及显示设备
CN112862859B (zh) * 2020-08-21 2023-10-31 海信视像科技股份有限公司 一种人脸特征值创建方法、人物锁定追踪方法及显示设备
CN112434578A (zh) * 2020-11-13 2021-03-02 浙江大华技术股份有限公司 口罩佩戴规范性检测方法、装置、计算机设备和存储介质
CN112614213A (zh) * 2020-12-14 2021-04-06 杭州网易云音乐科技有限公司 人脸表情确定方法、表情参数确定模型、介质及设备
CN112614213B (zh) * 2020-12-14 2024-01-23 杭州网易云音乐科技有限公司 人脸表情确定方法、表情参数确定模型、介质及设备
CN112652041B (zh) * 2020-12-18 2024-04-02 北京大米科技有限公司 虚拟形象的生成方法、装置、存储介质及电子设备
CN112652041A (zh) * 2020-12-18 2021-04-13 北京大米科技有限公司 虚拟形象的生成方法、装置、存储介质及电子设备
CN112906494A (zh) * 2021-01-27 2021-06-04 浙江大学 一种面部捕捉方法、装置、电子设备及存储介质
CN113946221A (zh) * 2021-11-03 2022-01-18 广州繁星互娱信息科技有限公司 眼部驱动控制方法和装置、存储介质及电子设备

Also Published As

Publication number Publication date
CN108229239B (zh) 2020-07-10
CN108229239A (zh) 2018-06-29

Similar Documents

Publication Publication Date Title
WO2018103220A1 (zh) Method and apparatus for image processing
CN109325437B (zh) 图像处理方法、装置和系统
US10489959B2 (en) Generating a layered animatable puppet using a content stream
KR102045695B1 (ko) 안면 이미지 처리 방법 및 장치, 및 저장 매체
US9697635B2 (en) Generating an avatar from real time image data
Ichim et al. Dynamic 3D avatar creation from hand-held video input
WO2021093453A1 (zh) 三维表情基的生成方法、语音互动方法、装置及介质
Blanz et al. Reanimating faces in images and video
US9094576B1 (en) Rendered audiovisual communication
US9314692B2 (en) Method of creating avatar from user submitted image
US20190213773A1 (en) 4d hologram: real-time remote avatar creation and animation control
CN108335345B (zh) 面部动画模型的控制方法及装置、计算设备
US20190082211A1 (en) Producing realistic body movement using body Images
US20170069124A1 (en) Avatar generation and animations
CN110363133B (zh) 一种视线检测和视频处理的方法、装置、设备和存储介质
WO2019075666A1 (zh) 图像处理方法、装置、终端及存储介质
EP3992919B1 (en) Three-dimensional facial model generation method and apparatus, device, and medium
CN108876886B (zh) 图像处理方法、装置和计算机设备
KR101743763B1 (ko) 감성 아바타 이모티콘 기반의 스마트 러닝 학습 제공 방법, 그리고 이를 구현하기 위한 스마트 러닝 학습 단말장치
CN109035415B (zh) 虚拟模型的处理方法、装置、设备和计算机可读存储介质
JP2023513980A (ja) 画面上の話者のフューショット合成
TW202244852A (zh) 用於擷取臉部表情且產生網格資料之人工智慧
WO2022267653A1 (zh) 图像处理方法、电子设备及计算机可读存储介质
US10976829B1 (en) Systems and methods for displaying augmented-reality objects
US20140306953A1 (en) 3D Rendering for Training Computer Vision Recognition

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 17879139

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 17879139

Country of ref document: EP

Kind code of ref document: A1