WO2018018957A1 - Real-time control method and system for a three-dimensional model - Google Patents
Real-time control method and system for a three-dimensional model
- Publication number
- WO2018018957A1 (PCT/CN2017/081376)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- face
- real
- image
- calibration
- key point
- Prior art date
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T17/00—Three dimensional [3D] modelling, e.g. data description of 3D objects
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T13/00—Animation
- G06T13/20—3D [Three Dimensional] animation
- G06T13/40—3D [Three Dimensional] animation of characters, e.g. humans, animals or virtual beings
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T19/00—Manipulating 3D models or images for computer graphics
- G06T19/003—Navigation within 3D models or images
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/20—Image preprocessing
- G06V10/25—Determination of region of interest [ROI] or a volume of interest [VOI]
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V20/00—Scenes; Scene-specific elements
- G06V20/60—Type of objects
- G06V20/64—Three-dimensional objects
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V40/00—Recognition of biometric, human-related or animal-related patterns in image or video data
- G06V40/10—Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
- G06V40/16—Human faces, e.g. facial parts, sketches or expressions
- G06V40/161—Detection; Localisation; Normalisation
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V40/00—Recognition of biometric, human-related or animal-related patterns in image or video data
- G06V40/10—Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
- G06V40/16—Human faces, e.g. facial parts, sketches or expressions
- G06V40/161—Detection; Localisation; Normalisation
- G06V40/165—Detection; Localisation; Normalisation using facial parts and geometric relationships
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V40/00—Recognition of biometric, human-related or animal-related patterns in image or video data
- G06V40/10—Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
- G06V40/16—Human faces, e.g. facial parts, sketches or expressions
- G06V40/174—Facial expression recognition
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N21/00—Selective content distribution, e.g. interactive television or video on demand [VOD]
- H04N21/20—Servers specifically adapted for the distribution of content, e.g. VOD servers; Operations thereof
- H04N21/21—Server components or server architectures
- H04N21/218—Source of audio or video content, e.g. local disk arrays
- H04N21/2187—Live feed
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N21/00—Selective content distribution, e.g. interactive television or video on demand [VOD]
- H04N21/20—Servers specifically adapted for the distribution of content, e.g. VOD servers; Operations thereof
- H04N21/23—Processing of content or additional data; Elementary server operations; Server middleware
- H04N21/234—Processing of video elementary streams, e.g. splicing of video streams or manipulating encoded video stream scene graphs
- H04N21/23418—Processing of video elementary streams, e.g. splicing of video streams or manipulating encoded video stream scene graphs involving operations for analysing video streams, e.g. detecting features or characteristics
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N21/00—Selective content distribution, e.g. interactive television or video on demand [VOD]
- H04N21/40—Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
- H04N21/41—Structure of client; Structure of client peripherals
- H04N21/422—Input-only peripherals, i.e. input devices connected to specially adapted client devices, e.g. global positioning system [GPS]
- H04N21/4223—Cameras
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N21/00—Selective content distribution, e.g. interactive television or video on demand [VOD]
- H04N21/40—Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
- H04N21/43—Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
- H04N21/44—Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream or rendering scenes according to encoded video stream scene graphs
- H04N21/44008—Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream or rendering scenes according to encoded video stream scene graphs involving operations for analysing video streams, e.g. detecting features or characteristics in the video stream
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N21/00—Selective content distribution, e.g. interactive television or video on demand [VOD]
- H04N21/40—Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
- H04N21/47—End-user applications
- H04N21/478—Supplemental services, e.g. displaying phone caller identification, shopping application
- H04N21/4788—Supplemental services, e.g. displaying phone caller identification, shopping application communicating with other users, e.g. chatting
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2200/00—Indexing scheme for image data processing or generation, in general
- G06T2200/08—Indexing scheme for image data processing or generation, in general involving all processing steps from image acquisition to 3D model generation
Definitions
- Embodiments of the present invention relate to a method and system for controlling a three-dimensional model, and in particular to a method and system for real-time control of a three-dimensional model.
- Video playback and video interaction on audiovisual equipment and mobile communication equipment are very common, and the parties communicating in the video usually appear as their real images.
- With advances in communications, sensor, and modeling technology, real-time interaction with 3D character models is emerging worldwide.
- The prior-art solution can replace a person's real image with a virtual cartoon image in real time, form real-time interaction between the cartoon characters replacing the real person images, and convey emotional expression such as mood, crying, and laughing. For example, a live-action storyteller can become a cartoon character telling the story, or a real teacher can appear as a famous scientist lecturing on physics. Two strangers can interact with each other while playing different roles; for example, Snow White can video chat with Prince Charming.
- The computer science department of Stanford University achieves a similar function with an RGBD camera, using the depth information the camera provides.
- However, most mobile devices today are equipped only with RGB cameras; without depth information, the algorithm cannot be extended to the wider mobile Internet scenario.
- FaceRig and Adobe have implemented similar functions based on RGB cameras on PC computers.
- An embodiment of the present invention provides a real-time control method for a three-dimensional model, which provides real-time feedback on a real object using the limited computing resources of a terminal in a mobile Internet environment, so as to control the formation of actions of the 3D model.
- An embodiment of the present invention further provides a real-time control system for a three-dimensional model, which addresses hardware resource constraints such as the mobile Internet environment, the processing capability of the mobile terminal, and the performance of the camera, under which real-time motion control of the 3D model of a real object could not otherwise be achieved.
- The real-time control method of the three-dimensional model of the invention comprises: acquiring a real-time video of a real object; identifying the actions of the real object in the real-time video images; and forming an action control instruction of the corresponding 3D model according to the change of the identified action.
- The real-time control system of the three-dimensional model of the invention comprises:
- a video acquiring device configured to acquire a real-time video of a real object;
- an image identification device configured to identify the actions of a real object in the real-time video images;
- a motion command generating means configured to form an action control command of the corresponding 3D model according to the change of the marked action.
- The real-time control method of the three-dimensional model of the present invention forms the action control commands that control the 3D model by recognizing a real object in the acquired real-time video and the action changes of that object.
- The action control commands have a small data volume and require little bandwidth for real-time transmission, which ensures real-time delivery in the mobile Internet environment.
- The real-time control method of the three-dimensional model of the invention avoids the transmission delay of the large amount of video data that would result from real-time rendering of the 3D model and formation of VR video playback in the mobile Internet environment, so that the rendering of the 3D model and the generation of its control can be completed at the two ends of the mobile Internet environment.
- One end uses a mobile terminal with limited hardware resources to recognize and capture the motion changes of the real object and form the instructions; the other end uses the mobile Internet to download, load, and activate the necessary 3D model and scene. Through real-time transmission of the control instructions, the 3D model performs the corresponding actions of the real object, and a VR live broadcast with corresponding model and scene rendering is formed.
- The real-time control system of the three-dimensional model of the invention can be deployed on a resource-limited mobile terminal in a mobile Internet environment, using the terminal's limited processing and camera capabilities to process the action changes of the real object centrally and efficiently obtain its accurate action state.
- The resulting control commands can perform accurate real-time motion control of any matched 3D model, so that the real-time actions of the real object are faithfully expressed by the 3D model.
- Motion control of the 3D model no longer needs to be fused into the video of the real object, and motion simulation of the real object is no longer limited by the restricted bandwidth of the mobile Internet environment.
- FIG. 1a is a process flow diagram of an embodiment of a real-time control method for a three-dimensional model of the present invention.
- FIG. 1b is a processing flowchart of an embodiment of a real-time control method for a three-dimensional model according to the present invention.
- FIG. 2 is a flow chart of motion recognition of an embodiment of a real-time control method for a three-dimensional model of the present invention.
- FIG. 3 is a flow chart of an embodiment of facial expression recognition in an embodiment of a real-time control method for a three-dimensional model of the present invention.
- FIG. 4 is a flow chart of another embodiment of facial expression recognition in an embodiment of a real-time control method for a three-dimensional model of the present invention.
- FIG. 5 is a flow chart of another embodiment of facial expression recognition according to an embodiment of a real-time control method for a three-dimensional model of the present invention.
- FIG. 6 is a flow chart of an embodiment of head motion recognition and facial expression recognition according to an embodiment of a real-time control method for a three-dimensional model according to the present invention.
- FIG. 7 is a flowchart of control commands and audio data synchronization in an embodiment of a real-time control method for a three-dimensional model according to the present invention.
- FIG. 8 is a schematic diagram of control effects of an embodiment of a real-time control method for a three-dimensional model according to the present invention.
- FIG. 9 is a schematic structural diagram of an embodiment of a real-time control system of a three-dimensional model according to the present invention.
- FIG. 10 is a schematic structural diagram of image recognition of an embodiment of a real-time control system of a three-dimensional model according to the present invention.
- FIG. 11 is a schematic structural diagram of single frame object and key point recognition according to an embodiment of a real-time control system of a three-dimensional model of the present invention.
- FIG. 12 is a schematic structural diagram of object recognition in consecutive frames according to an embodiment of a real-time control system for a three-dimensional model of the present invention.
- FIG. 13 is a schematic structural diagram of head and face motion recognition according to an embodiment of a real-time control system for a three-dimensional model of the present invention.
- FIG. 1a is a flowchart of a real-time control method of a three-dimensional model according to an embodiment of the present invention, which is a control process independently completed by a content production end. As shown in Figure 1a, the method includes:
- Step 100 Acquire a real-time video of a real object
- The above real objects include a complete human body, or a limb, the head, or the face of a human body; the corresponding actions include limb motions, head motions, and facial motions (expressions).
- Step 200 Identify an action of a real object in the real-time video image
- The above identification includes recognition of a real object, locating the recognized real object, locating the recognized object's action, and locating the changes in that action. Examples include the capture (marking) and analysis (recognition) of limb or head movements, or the capture (marking) and analysis (recognition) of facial expressions.
- Step 300 Form an action control instruction of the corresponding 3D model according to the change of the identification action.
- The change in the above identified action is the change between the positioning states at the start and end of the recognized action of the real object; the change is measurable and quantifiable.
- the corresponding 3D model described above is a 3D model of a VR object forming a real object, such as a limb model, a head model, or a face model.
- The real-time control method of the three-dimensional model of the present invention forms the action control commands that control the 3D model by recognizing a real object in the acquired real-time video and the action changes of that object.
- The action control commands have a small data volume and require little bandwidth for real-time transmission, which ensures real-time delivery in the mobile Internet environment.
- The above steps are performed independently by the content production end, and the action control instructions formed can be buffered or saved as data.
- On the content consumption side, only the corresponding 3D model needs to be called; it is then controlled according to the received motion control instructions so that the 3D model performs the corresponding actions.
- When the system also needs to transmit audio data, as shown in FIG. 1a, the method further includes:
- Step 400 Synchronize the audio data and the motion control instruction, and output.
- The above synchronization means that the action control instructions and the audio data within a unit of time are given the same reference point, reference tag, or timestamp, so that execution of the action control instructions and output of the audio data can be combined synchronously.
- The above step synchronizes the audio data accompanying the real object's motion with the continuous motion control commands on the time axis, overcoming the desynchronization caused by processing delays during data processing.
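As an illustrative sketch (not taken from the patent), the shared-reference-point synchronization can be modeled by stamping both streams against one monotonic clock and merging them on the time axis; `Packet`, `stamp`, and `interleave` are hypothetical names:

```python
import time
from dataclasses import dataclass

@dataclass
class Packet:
    kind: str         # "action" or "audio"
    timestamp: float  # seconds since the shared reference point t0
    payload: bytes

def stamp(kind, payload, t0):
    # Both streams share the same reference point t0, so the
    # consumer can align them on a single time axis.
    return Packet(kind, time.monotonic() - t0, payload)

def interleave(actions, audio):
    """Merge two stamped streams into one output ordered by timestamp."""
    return sorted(actions + audio, key=lambda p: p.timestamp)
```

On the consumption side, playing the merged stream in timestamp order keeps instruction execution and audio output synchronized regardless of which stream arrived first.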
- FIG. 1b illustrates a method for real-time control of a three-dimensional model according to an embodiment of the present invention.
- the method is a method for controlling a 3D model by using a motion control instruction, as shown in FIG. 1b, the method includes:
- Step 500 Call the corresponding 3D model obtained
- Step 600 Control the corresponding 3D model to complete the action according to the received action control instruction.
- Step 600 may include buffering the received action control instructions before execution; the buffering overcomes data delays caused by multi-path transmission over the mobile Internet.
- the real-time control method of the three-dimensional model in the embodiment can capture continuous real-time video by using the mobile terminal device at the content production end, perform object recognition on the main real object, locate the action of the real object, and mark the action change.
- the marker data of the motion change is formed into a continuous motion control command.
- the action control of the corresponding 3D model is completed by the action control instruction at the content consumption end.
- the amount of data of the action control instruction formed by the content production end is greatly reduced compared with the amount of VR video data formed after the 3D model is rendered, which is more conducive to real-time transmission in the mobile Internet environment and guarantees the quality of the VR live broadcast.
- The content production end and the content consumption end may be deployed on different devices or multimedia terminals of a local network, or on different devices or multimedia terminals at the two ends of the mobile Internet; one content production end may serve multiple content consumption ends, whether on the local network or at the remote end of the mobile Internet.
- FIG. 2 is a flow chart showing motion recognition in a real-time control method of a three-dimensional model according to an embodiment of the present invention. As shown in FIG. 2, the step 200 shown in FIG. 1a includes the following steps:
- Step 201 Identify a real object in an image of the real-time video according to the preset object recognition policy
- Step 202 Identify a key point of a real object in the image according to a preset key point identification strategy
- The position (coordinate) changes of the above key points reflect subtle movement changes of a specific object:
- the position changes of head key points reflect head movements;
- the position changes of limb-joint key points reflect torso and limb movements;
- the movements of face key points such as the eyebrows and the mouth shape reflect facial expressions.
- Step 203 forming a plane coordinate space of a key point and a stereo coordinate space of the corresponding 3D model
- Step 204 Measure coordinate changes of key points in the plane coordinate space in the continuous image, and record corresponding coordinate changes of the key points in the continuous image in the three-dimensional coordinate space.
- The real-time control method of the three-dimensional model of this embodiment uses an object recognition strategy to identify a specific object in the image, such as a limb, a head, or a face, and uses a key point recognition strategy to identify the key points of that object that are closely related to action changes.
- the coordinate changes of the key points form an action control command of the corresponding 3D model of the real object.
- The coordinate differences of the key points of the same real object in consecutive images may be used as parameters carried in the motion control instruction of the corresponding 3D model, forming a description of the real object's action.
- The abstract narrow-band coordinate data is used to form the control commands, and the 3D model is controlled to form the corresponding actions, so that the rendered broadband VR video is produced directly at the content consumption end in real time and the VR live broadcast is no longer limited by transmission bandwidth.
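The narrow-band control data described above can be sketched as follows, under the assumption of a simple JSON command format (the patent does not specify an encoding); `action_command` is a hypothetical name:

```python
import json

def action_command(prev_pts, curr_pts, frame_id):
    """Form a compact action control command from the coordinate
    differences of the same key points in two consecutive frames."""
    deltas = [(round(cx - px, 2), round(cy - py, 2))
              for (px, py), (cx, cy) in zip(prev_pts, curr_pts)]
    return json.dumps({"frame": frame_id, "deltas": deltas})
```

A command for a few dozen key points is on the order of hundreds of bytes per frame, versus the megabits per second a rendered VR video stream would require.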
- FIG. 3 is a flow chart of an embodiment of facial expression recognition in an embodiment of a real-time control method for a three-dimensional model of the present invention.
- The method of recognizing the face and the face key points in one frame image, as shown in FIG. 3, includes:
- Step 221 Acquire a frame of the original image M0 of the real-time video
- Step 222 Generate a set of original image copies with correspondingly decreasing resolution according to the decreasing sampling rate: M1, M2...Mm-i, ... Mm-1, Mm;
- Step 223 Using the number of original image copies m as the loop count, start from the lowest-resolution original image copy Mm and perform face region calibration on each original image copy in turn (using the face object recognition strategy);
- Step 224 Determine whether the face region calibration is completed in the current original image copy; if not, return to step 223 and continue with the face region calibration of the next original image copy; if completed, execute step 225; if all m original image copies have been looped through and the face region calibration is still not completed, perform step 227;
- Step 225 Mark the corresponding original image copy Mm-i and form the face region calibration data;
- Step 226 Using the face region calibration data combined with the corresponding sampling rate, complete the face region calibration in the subsequent original image copies (Mm-i...M2, M1) and the original image M0;
- Step 227 Perform face region calibration using the original image M0.
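Steps 221 to 227 can be sketched as a coarse-to-fine loop; `downsample` and `detect_face` below are placeholders for the sampling and face object recognition strategies, which the patent does not specify:

```python
def calibrate_face_region(m0, downsample, detect_face, m=4):
    """Coarse-to-fine face region calibration (sketch of steps 221-227).

    m0          -- original frame M0 (2D array-like)
    downsample  -- function(image, rate) -> lower-resolution copy
    detect_face -- function(image) -> (x, y, w, h) or None
    Returns the face box in original-image coordinates.
    """
    # Step 222: copies M1..Mm with decreasing resolution.
    rates = [2 ** k for k in range(1, m + 1)]
    copies = [(downsample(m0, r), r) for r in rates]
    # Steps 223-225: try the lowest-resolution copy first; detection
    # there is cheapest, and the box maps back via the sampling rate.
    for img, rate in reversed(copies):
        box = detect_face(img)
        if box is not None:
            x, y, w, h = box
            # Step 226: the sampling rate acts as the scaling factor
            # that maps the calibration data back onto M0.
            return (x * rate, y * rate, w * rate, h * rate)
    # Step 227: fall back to detecting on the full-resolution image.
    return detect_face(m0)
```

Running the expensive detector on a copy downsampled by a factor of 8 or 16 touches only a small fraction of the pixels, which is what makes the loop fast enough for real-time use.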
- The above face region calibration steps can be further optimized: a set of original image copies with correspondingly decreasing resolution is generated according to the decreasing sampling rate, and the face region calibration data is obtained from the lowest-resolution original image copy on which the calibration succeeds.
- The steps for face key point calibration include:
- Step 228 Perform face key point calibration on the face regions calibrated in the original image copy Mm-i, or/and the subsequent original image copies (Mm-i...M2, M1), or/and the original image M0, forming face key point calibration data of differing accuracy.
- the face key point identification strategy can be used to perform face key point calibration.
- In this embodiment, the original image is downsampled at decreasing rates to obtain a set of original image copies of progressively lower resolution, so that the face region recognition strategy, which consumes the most processing time, runs as fast as possible on a lower-resolution image copy, saving processing resources. The resulting face region calibration data, combined with the sampling rate of each original image copy, then quickly completes the face region calibration on the higher-resolution original image copies and the original image, yielding high-precision face region calibration and the corresponding calibration data. Key point calibration, which consumes few processing resources, is then performed on each face-region-calibrated original image copy and the original image, obtaining more accurate face key point calibration data.
- The face region calibration data of an original image copy is coordinate data; using the corresponding sampling rate as the scaling ratio relative to the original image, the face region calibration data of one original image copy can be quickly and accurately mapped to the corresponding positions of different original image copies or the original image to complete the face region calibration there.
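With the sampling rate as the scaling ratio, mapping calibration coordinates between copies reduces to a multiplication; a minimal sketch with a hypothetical `map_box` helper (boxes as (x, y, w, h), rate 1 meaning the original image M0):

```python
def map_box(box, src_rate, dst_rate):
    """Map face region calibration coordinates between image copies
    sampled at different rates; the sampling rate is the scaling
    ratio relative to the original image (rate 1)."""
    s = src_rate / dst_rate
    x, y, w, h = box
    return (x * s, y * s, w * s, h * s)
```

For example, a box found on the copy sampled at rate 8 maps to M0 by multiplying every coordinate by 8, and from M0 to the rate-4 copy by dividing by 4.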
- Performing face key point calibration directly on the face region calibrated in the original image copy Mm-i in step 228 determines the face key point calibration data, and yields the optimal processing rate for face region calibration and face key point calibration of one frame image.
- The face region calibration data and face key point calibration data of the original image M0 help improve the stability of face key point calibration, and are applied in the high-precision mode.
- Because the camera of a mobile device such as the iPhone produces slight differences from frame to frame, the downsampled image, obtained by averaging, is more stable, and the frame-to-frame differences are smaller.
- The face region calibration data and face key point calibration data of the original image copy Mm-i help improve the stability of the algorithm, and are applied in the stability mode.
- The processing speed of face region calibration and face key point calibration is very high, meeting the real-time requirement of 25 frames per second (25 fps) and enabling real-time recognition of actions or expressions on a mobile device.
- Real-time face detection and alignment are realized by exploiting features of the real object in the video image, such as area and displacement.
- the method balances processing speed and processing accuracy. Under the premise of ensuring a certain precision, the real-time control method of the three-dimensional model of the present embodiment significantly improves the processing speed of continuous face region recognition.
- FIG. 4 is a flow chart of another embodiment of facial expression recognition in an embodiment of a real-time control method for a three-dimensional model of the present invention. It shows a flow chart of a method for recognizing facial key points in successive frame images based on a method of recognizing face and face key points in a frame image. As shown in FIG. 4, the method includes:
- Step 231 Acquire the face region calibration data of the corresponding original image copy Mm-i and of the original image M0 according to the face region calibration of one frame image of the real-time video; this step may follow the execution process of steps 221 to 226;
- Step 232 Acquire an original image M0 of the subsequent consecutive time frame image and a corresponding original image copy Mm-i; then perform step 233 and step 234 respectively;
- Step 233 Perform face area calibration of the original image copy Mm-i of the subsequent continuous time frame image by using the face region calibration data of the original image copy Mm-i;
- Step 234 Perform calibration of the face region of the original image M0 of the frame image of the subsequent continuous duration by using the face region calibration data of the original image M0;
- step 234 may be performed first and then step 233 may be performed, or both may be performed synchronously.
- Step 235 Perform face key point calibration on the face image of the original image copy Mm-i and the original image M0 of the subsequent frames to form face key point calibration data with different precision.
- Exploiting the fact that a real object does not undergo large displacement in a specific scene of the real-time video, the real-time control method of this embodiment applies the face region calibration data of a previous frame to the face region calibration of a limited number of subsequent images. This further improves the speed of face region calibration while maintaining its stability, and further reduces the processing resources consumed by the face region calibration process.
- FIG. 5 is a flow chart of another embodiment of facial expression recognition according to an embodiment of the real-time control method for a three-dimensional model of the present invention. It shows how face key points are recognized and tracked across successive frame images, building on the single-frame key point recognition method. As shown in FIG. 5, the method includes:
- Step 241 Acquire the face region calibration data of the corresponding original image copy Mm-i or of the original image M0 according to the face region calibration of one frame of the real-time video; this step may follow the execution process of steps 221 to 226;
- Step 242 calibrate the face key point in the calibrated face area
- Step 243 forming a bounding box range by using a face key point contour
- Step 244 using the expanded bounding box range as the face area of the next frame, and performing face key point calibration in the bounding box range;
- Step 245 determining whether the face key point calibration is successful; if successful, executing step 246; if not, proceeding to step 241;
- Step 246 Forming an updated bounding box range using the face keypoint contour and scaling up the updated bounding box range; and proceeding to step 244 to obtain data for the next frame.
- In this embodiment, the bounding box of the face key points determined in the previous frame serves as the face region calibration data of the next frame image; that is, the result of the previous frame is used as the initial value for predicting the next frame.
- Because the calibration range of the face region is enlarged, time-consuming per-frame face region detection is avoided when the face is not moving severely, improving the real-time performance of the overall algorithm. If the face key point calibration of this embodiment cannot obtain a correct result, indicating that the face may have moved abruptly between the two frames, face detection is performed again to obtain the new face location, and key point calibration is then redone.
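The bounding-box tracking loop of steps 241 to 246 can be sketched as below. `detect_face` and `calibrate_keypoints` are hypothetical stand-ins for the face detector and key point calibrator, and the 1.25 expansion factor is an assumed value; the patent does not specify one.

```python
def bounding_box(points):
    """Axis-aligned (x, y, w, h) box around a list of (x, y) key points."""
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    return (min(xs), min(ys), max(xs) - min(xs), max(ys) - min(ys))


def expand_box(box, factor=1.25):
    """Scale a box about its center (step 246's enlargement)."""
    x, y, w, h = box
    cx, cy = x + w / 2, y + h / 2
    w2, h2 = w * factor, h * factor
    return (cx - w2 / 2, cy - h2 / 2, w2, h2)


def track_keypoints(frames, detect_face, calibrate_keypoints):
    results, box = [], None
    for frame in frames:
        if box is None:
            box = detect_face(frame)              # step 241: full detection
        points = calibrate_keypoints(frame, box)  # step 244: within the box
        if points is None:                        # step 245: failed ->
            box = None                            # re-detect on next frame
            results.append(None)
            continue
        results.append(points)
        box = expand_box(bounding_box(points))    # step 246: updated box
    return results
```

Full detection runs only on the first frame and after a calibration failure; every other frame reuses the expanded box from its predecessor.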
- Facial expression capture in video images includes face region recognition and calibration, face key point (e.g., facial feature) calibration, and general image processing for video frames, including, for example, image replication, sub-sampling to form image copies, image scaling, establishing coordinate mappings between similar images, alignment and translation of identical or similar parts between different images, and coordinate-based two-dimensional or three-dimensional angular transformation and distortion; these are not described in detail in this embodiment.
- FIG. 6 is a flow chart of an embodiment of head motion recognition and facial expression recognition according to an embodiment of the real-time control method for a three-dimensional model of the present invention. Treating the head and face as a whole, it shows how head motion is recognized in successive frame images, building on the single-frame key point recognition method, when the real object in the image is a head. As shown in FIG. 6, the method includes:
- Step 251 According to the face region calibration of the front-facing face in the real-time video image, calibrate the 2D key points of the front-facing face, and use the key points having relatively fixed positions to form the head orientation reference pattern; then proceed to step 254;
- Step 253 Form a face reference plane and a face reference pattern on the face reference plane according to the 2D key points of the front-facing face that have relatively fixed positions; then perform step 254;
- Step 254 Form a perspective projection, on the face reference plane, of the 2D face whose key points were calibrated in adjacent frame images of the real-time video, and obtain the Euler rotation data or quaternion rotation data of the head from the deformation of the head orientation reference pattern of the 2D face (from step 251) relative to the face reference pattern of the face reference plane.
- the Euler rotation data described above includes the angle of rotation of the head with respect to the three axial directions of x, y, and z.
- the Euler rotation data can be converted to quaternion rotation data for higher rotation state processing efficiency and smoothing difference during rotation.
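The Euler-to-quaternion conversion mentioned above can be sketched as follows. The axis order and intrinsic z-y-x convention are assumptions; the patent only states that rotation is measured about the x, y, and z axes.

```python
import math

def euler_to_quaternion(yaw, pitch, roll):
    """Convert intrinsic z-y-x Euler angles (radians) to a unit quaternion
    (w, x, y, z). Quaternions avoid gimbal lock and interpolate smoothly,
    which is why they suit the smoothing described in the text."""
    cy, sy = math.cos(yaw / 2), math.sin(yaw / 2)
    cp, sp = math.cos(pitch / 2), math.sin(pitch / 2)
    cr, sr = math.cos(roll / 2), math.sin(roll / 2)
    w = cr * cp * cy + sr * sp * sy
    x = sr * cp * cy - cr * sp * sy
    y = cr * sp * cy + sr * cp * sy
    z = cr * cp * sy - sr * sp * cy
    return (w, x, y, z)
```

For example, zero rotation maps to the identity quaternion, and a half-turn of yaw maps to a pure z rotation.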
- The real-time control method of this embodiment uses key points with relatively fixed positions among the front-facing 2D (planar) face key points in the image (for example, both eyes and the nose tip) to form a head orientation reference pattern (for example, a polygon with the eyes and nose tip as vertices), forms the face reference plane and face reference pattern at the same time, and uses the projection coincidence of the 2D (planar) face key points with the 3D (stereo) face key points to establish a mapping between 2D face key point coordinates and 3D face key point coordinates. With the 3D face key point coordinates mapped through the 2D face key point coordinates, position changes of the 2D face key points can be accurately reflected in the 3D face model (including an integrated head model).
- Forming the action control instructions of the corresponding 3D model of the head and face includes:
- Step 226 using the face region calibration data in combination with the corresponding sampling rate, and completing the face region calibration in the subsequent original image copy (Mm-i...M2, M1) and the original image M0;
- Step 242 calibrate the face key point in the calibrated face area
- Step 252 According to the 2D key points of the face in the real-time video image, the front view triangle mesh of the corresponding 3D head model is formed, and the coordinate mapping between the 2D key point of the face and the 3D key point of the 3D head model is formed;
- Step 311 Using the acquired face key points, rotation angles, and coordinate mapping, use the coordinate changes of each 2D face key point across consecutive frame images of the real-time video acquired in step 254 and the head Euler rotation data or quaternion rotation data to form inter-frame face key point movement parameters and the rotation direction of the head;
- Step 312 Encapsulate the key point movement parameter and the direction of rotation of the head into control instructions of the 3D model head and face of the corresponding frame.
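Steps 311 and 312 amount to packaging per-frame movement parameters and the head rotation into a control instruction. A minimal sketch follows; the field names and JSON encoding are illustrative assumptions, since the patent defines no wire format.

```python
import json

def make_control_instruction(frame_index, keypoint_deltas, rotation_quat, time_ms):
    """Package inter-frame key point movement and head rotation (step 312).

    keypoint_deltas: {key_point_id: [dx, dy]} in 2D face coordinates.
    rotation_quat:   (w, x, y, z) head rotation for this frame.
    time_ms:         the per-frame time label used later for audio sync.
    """
    return {
        "frame": frame_index,
        "time_ms": time_ms,
        "keypoints": keypoint_deltas,
        "rotation": list(rotation_quat),
    }

# Serialize one frame's instruction for transmission over the link.
packet = json.dumps(
    make_control_instruction(42, {"30": [1.5, -0.5]}, (1.0, 0.0, 0.0, 0.0), 1400)
)
```

The consumer end decodes the packet and applies the deltas and rotation to the corresponding key points of the 3D head and face model.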
- This control method, in which the 2D key points are first upgraded to 3D key points and the dimensionality is then reduced back to 2D to generate 2D control points, is effective in practice.
- The modeling process follows general modeling rules using modeling tools, including establishing the three-dimensional model, establishing the three-dimensional scene, the transmission, storage, and download of the three-dimensional model, and the deployment of 3D models in 3D scenes; these are not described in detail.
- A 3D model of a cartoon character usually includes a 3D model of the torso and the head, and the 3D model of the head in turn includes a 3D model of the face; these can be separately stored, transmitted, or controlled.
- FIG. 7 is a flowchart of control commands and audio data synchronization in an embodiment of a real-time control method for a three-dimensional model according to the present invention.
- the step 400 shown in FIG. 1a may include:
- Step 421 Add a time label (or time stamp), in units of frames, to the control instructions of the 3D model head;
- Step 422 Add a corresponding time label (or time stamp) to the audio data according to the time label of the control instruction;
- Step 423 Adapt the control command and the audio data signal to the transmission link, and output in real time.
- Affected by the mobile Internet transmission mechanism, the control instructions and the audio data may not arrive accurately synchronized at the content consumption end.
- An appropriate buffer can be used to relax the requirement for synchronous signal reception, so that the synchronized output of the control instructions and the audio data can be restored through their shared time labels, ensuring the audio and video synchronization quality of the VR live broadcast.
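The consumer-side buffering described above can be sketched as a merge of the two buffered streams by their shared time labels. This is an illustrative sketch only; stream shapes and names are assumptions.

```python
import heapq

def merge_by_time_label(control_instructions, audio_chunks):
    """Re-align buffered control instructions and audio chunks by time label.

    Both inputs are lists of (time_ms, payload) pairs, possibly arriving
    out of order per stream; the output interleaves them in time order.
    """
    heap = [(t, "audio", p) for t, p in audio_chunks]
    heap += [(t, "ctrl", p) for t, p in control_instructions]
    heapq.heapify(heap)
    return [heapq.heappop(heap) for _ in range(len(heap))]
```

Items with equal time labels come out adjacent, so the player can emit the audio chunk and apply the matching control instruction in the same playback tick.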
- FIG. 8 is a schematic diagram showing the control effect of an embodiment of a real-time control method for a three-dimensional model according to the present invention.
- Taking a person's face as the example real object, changes in the face region and in key positions within the face region are recognized across consecutive video images, and change parameters of facial action expressions are formed from the amounts of change. The resulting continuous motion control instructions drive the corresponding key points on the face of the corresponding cartoon 3D model, forming real-time facial expressions on the cartoon face 3D model.
- the basic steps of face region identification in the real-time control method of the three-dimensional model mainly include:
- locating the face region using low-resolution copies of the frame images, and further improving the face region identification speed by applying the face region directly to the corresponding copies of adjacent frame images;
- identifying the face key points in the face region of the frame image or of its corresponding copy, to suit different application modes.
- the basic steps of the head rotation identification in the real-time control method of the three-dimensional model mainly include:
- establishing the head orientation reference pattern, the face reference plane, and the face reference pattern on the face reference plane from the fixed key points of the front-facing 2D face in the frame or corresponding copy image, so as to form a coordinate mapping relationship between the face key points of the front-facing 3D head model and the 2D face key points;
- Obtaining a head rotation angle by measuring deformation of the head toward the reference pattern relative to the face reference pattern when the head of the adjacent frame image is rotated;
- the control command of the head and face action expression is formed by combining the position change of the 2D face key point of the adjacent frame and the change of the head rotation angle.
- FIG. 9 is a schematic structural diagram of an embodiment of a real-time control system of a three-dimensional model according to the present invention. As shown in FIG. 9, a video acquisition device 10, an image identification device 20, and an action instruction generation device 30 are included, wherein:
- a video acquiring device 10 configured to acquire a real-time video of a real object
- An image identifying device 20 configured to identify an action of a real object in the real-time video image
- the motion command generating device 30 is configured to form the action control instructions of the corresponding 3D model according to changes in the identified actions.
- a real-time control system for a three-dimensional model further includes a synchronization output device 40 for synchronizing audio data and motion control commands and outputting them.
- a real-time control system for a three-dimensional model further includes an activation device 80 and a playback device 90, wherein:
- the activation device 80 is configured to invoke the acquired corresponding 3D model
- the playing device 90 is configured to control the corresponding 3D model to complete the action according to the received motion control instruction.
- the playback device 90 further includes a receiving device 91, a buffer device 92, a synchronization device 93, and an audio playback device 94, wherein:
- a receiving device 91 configured to receive audio data and an action control instruction
- a buffering device 92 configured to cache audio data and motion control instructions
- the playing device 94 is configured to control the corresponding 3D model to complete the action and synchronously play the audio.
- FIG. 10 is a schematic structural diagram of image recognition of an embodiment of a real-time control system of a three-dimensional model according to the present invention.
- the image identification device 20 includes an object recognition device 21, an object key point recognition device 22, an object position coordinate establishing device 23, and an object motion change recording device 24, wherein:
- the object recognition device 21 is configured to identify a real object in an image of the real-time video according to the preset object recognition policy
- the object key point identifying device 22 is configured to identify a key point of the real object in the image according to the preset key point identification strategy
- the object position coordinate establishing means 23 is configured to form a plane coordinate space of the key point and a stereo coordinate space of the corresponding 3D model;
- the object motion change recording device 24 is configured to measure coordinate changes of key points in the plane coordinate space in the continuous image, and record corresponding coordinate changes of the key points in the continuous image in the three-dimensional coordinate space.
- the motion command generating device 30 includes a motion converting device 31 configured to convert the coordinate changes of the key points into action control instructions for the 3D model corresponding to the real object.
- FIG. 11 is a schematic structural diagram of single frame object and key point recognition according to an embodiment of a real-time control system of a three-dimensional model of the present invention.
- As shown in FIG. 11, an original image capturing device 41, an image copy generating device 42, a copy cycle calibration device 43, a region calibration determining device 44, a copy region calibration device 45, a universal region calibration device 46, a universal region calibration device 47, and a key point calibration device 48 are included, wherein:
- the original image capturing device 41 is configured to acquire a frame of the original image M0 of the real-time video
- the image copy generating means 42 is configured to generate a set of original image copies with correspondingly decreasing resolution according to the decreasing sampling rate: M1, M2...Mm-i, ... Mm-1, Mm;
- the copy cycle calibration device 43 is configured to perform face region calibration (using the face object recognition strategy) sequentially in the original image copies, starting from the lowest-resolution original image copy Mm;
- the region calibration determining device 44 is configured to determine whether face region calibration has been completed in an original image copy: if not completed, the copy cycle calibration device 43 is called to continue with the next cycle; when completed, the copy region calibration device 45 is called; if the cycle ends without completing face region calibration, the universal region calibration device 47 is called;
- a copy area calibration device 45 for marking the corresponding original image copy Mm-i and forming face area calibration data
- the universal area calibration device 46 is configured to perform face area calibration on the subsequent original image copies (Mm-i...M2, M1) and the original image M0 by using the face area calibration data in combination with the corresponding sampling rate;
- the universal area calibration device 47 is configured to perform face area calibration by using the original image M0 when the end of the cycle is not completed;
- the key point calibration device 48 is configured to perform face key point calibration (using the face key point recognition strategy) in the face regions of the original image copy Mm-i, the subsequent original image copies (Mm-i...M2, M1), and the original image M0, forming face key point calibration data of differing precision.
- FIG. 12 is a schematic structural diagram of object recognition in consecutive frames according to an embodiment of a real-time control system for a three-dimensional model of the present invention.
- As shown in FIG. 12, a face region calibration device 51, a continuous frame processing device 52, a continuous frame region calibration device 53, a copy region calibration determining device 54, and an original region calibration device 55 are included, wherein:
- the face area calibration device 51 is configured to acquire (by the universal area calibration device 46) the corresponding original image copy Mm-i and the face area calibration data of the original image M0;
- the continuous frame processing device 52 is configured to acquire the original image M0 of the frame image of the subsequent continuous duration and the corresponding original image copy Mm-i;
- the continuous frame area calibration device 53 is configured to perform face area calibration of the original image M0 of the subsequent continuous time frame image by using the face area calibration data of the original image M0;
- the copy area calibration determining means 54 is configured to perform face area calibration of the original image copy Mm-i of the subsequent continuous time frame image by using the face area calibration data of the original image copy Mm-i;
- the original area calibration device 55 is configured to perform face key point calibration on the face image of the original image copy Mm-i and/or the original image M0 of subsequent frames to form face key point calibration data with different precision.
- As shown in FIG. 12, a face key point calibration device 62, a key point contour generating device 63, an adjacent frame key point calibration device 64, an adjacent frame calibration determining device 65, and a key point contour updating device 66 are also included, wherein:
- the face key point calibration device 62 is configured to calibrate the face key point in the face area obtained by acquiring the corresponding original image copy Mm-i or the original image M0;
- a key point contour generating device 63 configured to form a bounding box range by using a face key point contour
- the adjacent frame key point calibration device 64 is configured to perform the face key point calibration in the expanded bounding box range by using the expanded bounding box range as the face area of the next frame;
- the adjacent frame calibration determining means 65 is configured to determine whether the face key point calibration is successful, and if successful, the key point contour updating means 66 is invoked; if not, the face key point calibration means 62 is invoked;
- the keypoint contour updating device 66 is configured to form an updated bounding box range by using the face keypoint contour, and scale the updated bounding box range to call the adjacent frame keypoint calibration device 64.
- FIG. 13 is a schematic structural diagram of head and face motion recognition according to an embodiment of a real-time control system for a three-dimensional model of the present invention.
- A head orientation reference generating device 71, a coordinate map generating device 72, a face reference generating device 73, and a rotation angle measuring device 74 are included, wherein:
- a head-facing reference generating device 71 configured to calibrate a 2D key point of the face of the face according to the face region calibration of the face in the real-time video image, and use the key point having a relatively fixed position to form the head toward the reference pattern;
- the coordinate map generating device 72 is configured to form the coordinate mapping between the 2D key points of the face and the 3D key points of the 3D head model according to the front-view triangle mesh of the corresponding 3D head model formed from the 2D key points of the front-facing face in the real-time video image;
- a face reference generating device 73 configured to form a face reference pattern of the face reference plane and the face reference plane according to the 2D key points of the front face having relatively fixed positions;
- the rotation angle measuring device 74 is configured to form a perspective projection, on the face reference plane, of the 2D face whose key points were calibrated in adjacent frame images of the real-time video, and to obtain the Euler rotation data or quaternion rotation data of the head from the deformation of the head orientation reference pattern of the 2D face relative to the face reference pattern of the face reference plane.
- a real-time control system for a three-dimensional model includes a head and face motion parameter generating device 32 and a control command generating device 33 for forming control instructions for head and face object motion across consecutive video frames, wherein:
- the head and face motion parameter generating means 32 is configured to use the coordinate change of each key point of the 2D face of the real-time video continuous frame image and the head Euler rotation data or the quaternion rotation data to form a face key point movement between frames Parameters and the direction of rotation of the head;
- the control command generating means 33 is configured to encapsulate the key point movement parameter and the rotation direction of the head into control instructions of the 3D model head and face of the corresponding frame.
- The synchronization output device of the real-time control system for a three-dimensional model includes an audio data synchronization device 35, a control instruction synchronization device 36, and a real-time output device 37.
- the audio data synchronization device 35 is configured to add a corresponding time label to the audio data according to the time label of the control instruction;
- the control instruction synchronization device 36 is configured to add a time label, in units of frames, to the control instructions of the 3D model head;
- the real-time output device 37 is configured to adapt the control command and the audio data signal to the transmission link for real-time output.
Abstract
Description
Claims (28)
- A real-time control method for a three-dimensional model, comprising: acquiring a real-time video of a real object; identifying actions of the real object in the real-time video images; and forming action control instructions for a corresponding 3D model according to changes in the identified actions.
- The real-time control method for a three-dimensional model of claim 1, wherein identifying actions of the real object in the real-time video images comprises: recognizing the real object in the real-time video images according to a preset object recognition strategy; recognizing key points of the real object in the images according to a preset key point recognition strategy; forming a planar coordinate space of the key points and a stereoscopic coordinate space of the corresponding 3D model; and measuring coordinate changes of the key points in the planar coordinate space across consecutive images and recording the corresponding coordinate changes of the key points in the stereoscopic coordinate space.
- The real-time control method for a three-dimensional model of claim 2, wherein forming the action control instructions of the corresponding 3D model according to changes in the identified actions comprises: converting the coordinate changes of the key points into action control instructions for the 3D model corresponding to the real object.
- The real-time control method for a three-dimensional model of claim 2, wherein recognizing the real object in the real-time video images according to the preset object recognition strategy comprises: acquiring one frame of original image M0 of the real-time video; generating, according to decreasing sampling rates, a set of original image copies of correspondingly decreasing resolution; and obtaining from them a low-resolution original image copy in which face region calibration is completed, forming face region calibration data.
- The real-time control method for a three-dimensional model of claim 4, wherein recognizing the real object in the real-time video images according to the preset object recognition strategy further comprises: when face region calibration is not completed in any of the original image copies, completing face region calibration using the original image M0.
- The real-time control method for a three-dimensional model of claim 4, wherein recognizing key points of the real object in the images according to the preset key point recognition strategy comprises: performing face key point calibration in the face regions calibrated in the original image copy Mm-i, and/or the subsequent original image copies (Mm-i...M2, M1), and/or the original image M0, forming face key point calibration data of differing precision.
- The real-time control method for a three-dimensional model of claim 1, wherein the real object comprises a limb, a head, or a face, and the identifying comprises capturing and analyzing limb or head motion, or capturing and analyzing facial expressions.
- The real-time control method for a three-dimensional model of claim 2, wherein recognizing the real object in the real-time video images according to the preset object recognition strategy comprises: acquiring, according to the face region calibration of one frame of the real-time video, the face region calibration data of the corresponding original image copy Mm-i and of the original image M0; acquiring the original image M0 and the corresponding original image copy Mm-i of the frame images over the subsequent continuous duration; completing the face region calibration of the original image copy Mm-i of the subsequent frames using the face region calibration data of the original image copy Mm-i; or completing the face region calibration of the original image M0 of the subsequent frames using the face region calibration data of the original image M0.
- The real-time control method for a three-dimensional model of claim 2, wherein recognizing the real object in the real-time video images according to the preset object recognition strategy comprises: acquiring, according to the face region calibration of one frame of the real-time video, the face region calibration data of the corresponding original image copy Mm-i or of the original image M0; calibrating face key points in the calibrated face region; forming a bounding box range from the face key point contour; and using the enlarged bounding box range as the face region of the next frame, performing face key point calibration within the bounding box range.
- The real-time control method for a three-dimensional model of claim 9, wherein recognizing the real object in the real-time video images according to the preset object recognition strategy further comprises: when the face key point calibration is determined to be successful, forming an updated bounding box range from the face key point contour and scaling it up; when the face key point calibration is determined to be unsuccessful, acquiring the face region calibration data of the corresponding original image copy Mm-i or of the original image M0.
- The real-time control method for a three-dimensional model of claim 2, wherein forming the planar coordinate space of the key points and the stereoscopic coordinate space of the corresponding 3D model comprises: calibrating the 2D key points of the front-facing face according to the face region calibration of the front-facing face in the real-time video images, and forming a head orientation reference pattern from the key points having relatively fixed positions; forming a coordinate mapping between the 2D key points of the face and the 3D key points of the 3D head model according to a front-view triangle mesh of the corresponding 3D head model formed from the 2D key points of the front-facing face in the real-time video images; forming a face reference plane and a face reference pattern on the face reference plane according to the relatively fixed-position 2D key points of the front-facing face; and forming a perspective projection, on the face reference plane, of the 2D face whose key points are calibrated in adjacent frame images of the real-time video, and obtaining Euler rotation data or quaternion rotation data of the head from the deformation of the head orientation reference pattern of the 2D face relative to the face reference pattern of the face reference plane.
- The real-time control method for a three-dimensional model of claim 11, wherein measuring the coordinate changes of the key points in the planar coordinate space across consecutive images and recording the corresponding coordinate changes of the key points in the stereoscopic coordinate space comprises: using the coordinate changes of the key points of the 2D face across consecutive frame images of the real-time video and the head Euler rotation data or quaternion rotation data to form inter-frame face key point movement parameters and the rotation direction of the head.
- The real-time control method for a three-dimensional model of any one of claims 1 to 12, further comprising: synchronizing the audio data and the action control instructions and outputting them.
- The real-time control method for a three-dimensional model of claim 13, wherein synchronizing the audio data and the action control instructions and outputting them comprises: adding time labels, in units of frames, to the control instructions of the 3D model head; adding corresponding time labels to the audio data according to the time labels of the control instructions; and adapting the control instructions and audio data signals to the transmission link for real-time output.
- The real-time control method for a three-dimensional model of any one of claims 1 to 12, further comprising: invoking the acquired corresponding 3D model; and controlling the corresponding 3D model to complete the actions according to the received action control instructions.
- A real-time control method for a three-dimensional model, comprising: acquiring a real-time video of the head and face of a real object; locating the face region using low-resolution copies of the frame images in the video; applying the face region directly to corresponding copies of adjacent frame images; identifying face key points in the face region of the frame image or of its corresponding copy; establishing, from the fixed-position key points of the front-facing 2D face in the images, a head orientation reference pattern, a face reference plane, and a face reference pattern on the face reference plane, and forming a coordinate mapping relationship with the front-facing 3D head model; obtaining head rotation data by measuring the deformation of the head orientation reference pattern relative to the face reference pattern when the head rotates between adjacent frame images; and combining the position changes of the 2D face key points of adjacent frames with the head rotation data to form control instructions for head and face action expressions.
- A real-time control system for a three-dimensional model, comprising: a video acquisition device (10) configured to acquire a real-time video of a real object; an image identification device (20) configured to identify actions of the real object in the real-time video images; and an action instruction generation device (30) configured to form action control instructions for a corresponding 3D model according to changes in the identified actions.
- The real-time control system for a three-dimensional model of claim 17, wherein the image identification device (20) comprises an object recognition device (21), an object key point recognition device (22), an object position coordinate establishing device (23), and an object motion change recording device (24), wherein: the object recognition device (21) is configured to recognize the real object in the real-time video images according to a preset object recognition strategy; the object key point recognition device (22) is configured to recognize key points of the real object in the images according to a preset key point recognition strategy; the object position coordinate establishing device (23) is configured to form a planar coordinate space of the key points and a stereoscopic coordinate space of the corresponding 3D model; and the object motion change recording device (24) is configured to measure coordinate changes of the key points in the planar coordinate space across consecutive images and record the corresponding coordinate changes of the key points in the stereoscopic coordinate space.
- The real-time control system for a three-dimensional model of claim 17, wherein the action instruction generation device (30) comprises an action conversion device (31) configured to convert the coordinate changes of the key points into action control instructions for the 3D model corresponding to the real object.
- The real-time control system for a three-dimensional model of claim 18, wherein the object recognition device (21) comprises an original image capture device (41), an image copy generation device (42), and a copy cycle calibration device (43), wherein: the original image capture device (41) is configured to acquire one frame of original image M0 of the real-time video; the image copy generation device (42) is configured to generate, according to decreasing sampling rates, a set of original image copies of correspondingly decreasing resolution: M1, M2...Mm-i, ...Mm-1, Mm; and the copy cycle calibration device (43) is configured to perform face region calibration sequentially in the original image copies, starting from the lowest-resolution original image copy Mm, with the number m of original image copies as the number of cycles, forming face region calibration data.
- The real-time control system for a three-dimensional model of claim 18, wherein the object key point recognition device (22) comprises a key point calibration device (48) configured to perform face key point calibration in the face regions calibrated in the original image copy Mm-i, the subsequent original image copies (Mm-i...M2, M1), and the original image M0, forming face key point calibration data of differing precision.
- The real-time control system for a three-dimensional model of claim 18, wherein the object recognition device (21) comprises a face region calibration device (51), a continuous frame processing device (52), a continuous frame region calibration device (53), a copy region calibration determining device (54), and an original region calibration device (55), wherein: the face region calibration device (51) is configured to acquire the face region calibration data of the corresponding original image copy Mm-i and of the original image M0; the continuous frame processing device (52) is configured to acquire the original image M0 and the corresponding original image copy Mm-i of the frame images over the subsequent continuous duration; the continuous frame region calibration device (53) is configured to complete the face region calibration of the original image M0 of the subsequent frames using the face region calibration data of the original image M0; the copy region calibration determining device (54) is configured to complete the face region calibration of the original image copy Mm-i of the subsequent frames using the face region calibration data of the original image copy Mm-i; and the original region calibration device (55) is configured to perform face key point calibration in the face regions calibrated in the original image copy Mm-i and/or the original image M0 of the subsequent frames, forming face key point calibration data of differing precision.
- The real-time control system for a three-dimensional model of claim 18, wherein the object recognition device (21) comprises a face key point calibration device (62), a key point contour generation device (63), an adjacent frame key point calibration device (64), an adjacent frame calibration determining device (65), and a key point contour updating device (66), wherein: the face key point calibration device (62) is configured to calibrate face key points in the face region calibrated in the acquired corresponding original image copy Mm-i or original image M0; the key point contour generation device (63) is configured to form a bounding box range from the face key point contour; the adjacent frame key point calibration device (64) is configured to use the enlarged bounding box range as the face region of the next frame and perform face key point calibration within the bounding box range; the adjacent frame calibration determining device (65) is configured to determine whether the face key point calibration succeeded, calling the key point contour updating device (66) on success and the face key point calibration device (62) on failure; and the key point contour updating device (66) is configured to form an updated bounding box range from the face key point contour, scale up the updated bounding box range, and then call the adjacent frame key point calibration device (64).
- The real-time control system for a three-dimensional model of claim 18, wherein the object position coordinate establishing device (23) comprises a head orientation reference generation device (71), a coordinate map generation device (72), a face reference generation device (73), and a rotation angle measurement device (74), wherein: the head orientation reference generation device (71) is configured to calibrate the 2D key points of the front-facing face according to the face region calibration of the front-facing face in the real-time video images, and to form the head orientation reference pattern from the key points having relatively fixed positions; the coordinate map generation device (72) is configured to form the coordinate mapping between the 2D key points of the face and the 3D key points of the 3D head model according to the front-view triangle mesh of the corresponding 3D head model formed from the 2D key points of the front-facing face in the real-time video images; the face reference generation device (73) is configured to form the face reference plane and the face reference pattern on the face reference plane according to the relatively fixed-position 2D key points of the front-facing face; and the rotation angle measurement device (74) is configured to form a perspective projection, on the face reference plane, of the 2D face whose key points are calibrated in adjacent frame images of the real-time video, and to obtain the head Euler rotation data or quaternion rotation data from the deformation of the head orientation reference pattern of the 2D face relative to the face reference pattern of the face reference plane.
- The real-time control system for a three-dimensional model of claim 18, wherein the object position coordinate establishing device (23) comprises a head and face motion parameter generation device (32) and a control instruction generation device (33), wherein: the head and face motion parameter generation device (32) is configured to use the coordinate changes of the key points of the 2D face across consecutive frame images of the real-time video and the head Euler rotation data or quaternion rotation data to form inter-frame face key point movement parameters and the rotation direction of the head; and the control instruction generation device (33) is configured to encapsulate the key point movement parameters and the rotation direction of the head into control instructions for the 3D model head and face of the corresponding frame.
- The real-time control system for a three-dimensional model of any one of claims 16 to 25, further comprising a synchronization output device (40) configured to synchronize the audio data and the action control instructions and output them.
- The real-time control system for a three-dimensional model of claim 26, wherein the synchronization output device (40) comprises an audio data synchronization device (35), a control instruction synchronization device (36), and a real-time output device (37), wherein: the audio data synchronization device (35) is configured to add corresponding time labels to the audio data according to the time labels of the control instructions; the control instruction synchronization device (36) is configured to add time labels, in units of frames, to the control instructions of the 3D model head; and the real-time output device (37) is configured to adapt the control instructions and audio data signals to the transmission link for real-time output.
- The real-time control system for a three-dimensional model of any one of claims 17 to 27, further comprising an activation device (80) and a playback device (90), wherein: the activation device (80) is configured to invoke the acquired corresponding 3D model; and the playback device (90) is configured to control the corresponding 3D model to complete the actions according to the received action control instructions.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US16/261,482 US10930074B2 (en) | 2016-07-29 | 2019-01-29 | Method and system for real-time control of three-dimensional models |
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201610619560.4A CN106251396B (zh) | 2016-07-29 | 2016-07-29 | 三维模型的实时控制方法和系统 |
CN201610619560.4 | 2016-07-29 |
Related Child Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
US16/261,482 Continuation US10930074B2 (en) | 2016-07-29 | 2019-01-29 | Method and system for real-time control of three-dimensional models |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2018018957A1 true WO2018018957A1 (zh) | 2018-02-01 |
Family
ID=57606112
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/CN2017/081376 WO2018018957A1 (zh) | 2016-07-29 | 2017-04-21 | 三维模型的实时控制方法和系统 |
Country Status (3)
Country | Link |
---|---|
US (1) | US10930074B2 (zh) |
CN (1) | CN106251396B (zh) |
WO (1) | WO2018018957A1 (zh) |
Cited By (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111405361A (zh) * | 2020-03-27 | 2020-07-10 | 咪咕文化科技有限公司 | 一种视频获取方法、电子设备及计算机可读存储介质 |
CN113427486A (zh) * | 2021-06-18 | 2021-09-24 | 上海非夕机器人科技有限公司 | 机械臂控制方法、装置、计算机设备、存储介质和机械臂 |
CN115442519A (zh) * | 2022-08-08 | 2022-12-06 | 珠海普罗米修斯视觉技术有限公司 | 视频处理方法、装置及计算机可读存储介质 |
Families Citing this family (29)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN106251396B (zh) | 2016-07-29 | 2021-08-13 | 迈吉客科技(北京)有限公司 | 三维模型的实时控制方法和系统 |
CN106993195A (zh) * | 2017-03-24 | 2017-07-28 | 广州创幻数码科技有限公司 | 虚拟人物角色直播方法及系统 |
CN107172040A (zh) * | 2017-05-11 | 2017-09-15 | 上海微漫网络科技有限公司 | 一种虚拟角色的播放方法及系统 |
CN109309866B (zh) * | 2017-07-27 | 2022-03-08 | 腾讯科技(深圳)有限公司 | Image processing method and apparatus, and storage medium |
CN107705365A (zh) * | 2017-09-08 | 2018-02-16 | 郭睿 | Editable three-dimensional human body model creation method, apparatus, electronic device, and computer program product |
CN107613240A (zh) * | 2017-09-11 | 2018-01-19 | 广东欧珀移动通信有限公司 | Video picture processing method, apparatus, and mobile terminal |
CN107750014B (zh) * | 2017-09-25 | 2020-10-16 | 迈吉客科技(北京)有限公司 | Co-streaming live broadcast method and system |
CN108109189A (zh) * | 2017-12-05 | 2018-06-01 | 北京像素软件科技股份有限公司 | Action sharing method and apparatus |
CN108769802A (zh) * | 2018-06-21 | 2018-11-06 | 北京密境和风科技有限公司 | Method, apparatus, and system for implementing an online performance |
JP2022500795A (ja) * | 2018-07-04 | 2022-01-04 | ウェブ アシスタンツ ゲーエムベーハー | Avatar animation |
TWI704501B (zh) * | 2018-08-09 | 2020-09-11 | 宏碁股份有限公司 | Electronic device controllable by head movement and operating method thereof |
CN109191593A (zh) * | 2018-08-27 | 2019-01-11 | 百度在线网络技术(北京)有限公司 | Motion control method, apparatus, and device for a virtual three-dimensional model |
CN110942479B (zh) * | 2018-09-25 | 2023-06-02 | Oppo广东移动通信有限公司 | Virtual object control method, storage medium, and electronic device |
CN113498530A (zh) * | 2018-12-20 | 2021-10-12 | 艾奎菲股份有限公司 | Systems and methods for object dimensioning based on partial visual information |
WO2020147791A1 (zh) * | 2019-01-18 | 2020-07-23 | 北京市商汤科技开发有限公司 | Image processing method and apparatus, image device, and storage medium |
WO2020147794A1 (zh) * | 2019-01-18 | 2020-07-23 | 北京市商汤科技开发有限公司 | Image processing method and apparatus, image device, and storage medium |
CN111460870A (zh) | 2019-01-18 | 2020-07-28 | 北京市商汤科技开发有限公司 | Method and apparatus for determining target orientation, electronic device, and storage medium |
CN110264499A (zh) * | 2019-06-26 | 2019-09-20 | 北京字节跳动网络技术有限公司 | Interaction position control method and apparatus based on human body key points, and electronic device |
CN110536095A (zh) * | 2019-08-30 | 2019-12-03 | Oppo广东移动通信有限公司 | Call method, apparatus, terminal, and storage medium |
US11532093B2 (en) | 2019-10-10 | 2022-12-20 | Intermap Technologies, Inc. | First floor height estimation from optical images |
CN111476871B (zh) * | 2020-04-02 | 2023-10-03 | 百度在线网络技术(北京)有限公司 | Method and apparatus for generating video |
CN111541932B (zh) * | 2020-04-30 | 2022-04-12 | 广州方硅信息技术有限公司 | Method, apparatus, device, and storage medium for displaying a user's image in a live-streaming room |
CN112019921A (zh) * | 2020-09-01 | 2020-12-01 | 北京德火科技有限责任公司 | Body movement data processing method for a virtual studio |
CN112019922A (zh) * | 2020-09-01 | 2020-12-01 | 北京德火科技有限责任公司 | Facial expression data processing method for a virtual studio |
US11551366B2 (en) * | 2021-03-05 | 2023-01-10 | Intermap Technologies, Inc. | System and methods for correcting terrain elevations under forest canopy |
CN113507627B (zh) * | 2021-07-08 | 2022-03-25 | 北京的卢深视科技有限公司 | Video generation method, apparatus, electronic device, and storage medium |
CN113989928B (zh) * | 2021-10-27 | 2023-09-05 | 南京硅基智能科技有限公司 | Motion capture and retargeting method |
CN113965773A (zh) * | 2021-11-03 | 2022-01-21 | 广州繁星互娱信息科技有限公司 | Live broadcast display method and apparatus, storage medium, and electronic device |
CN114554267B (zh) * | 2022-02-22 | 2024-04-02 | 上海艾融软件股份有限公司 | Audio-video synchronization method and apparatus based on digital twin technology |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN101086681A (zh) * | 2006-06-09 | 2007-12-12 | 中国科学院自动化研究所 | Game control system and method based on stereo vision |
CN101452582A (zh) * | 2008-12-18 | 2009-06-10 | 北京中星微电子有限公司 | Method and apparatus for implementing three-dimensional video special effects |
CN105338369A (zh) * | 2015-10-28 | 2016-02-17 | 北京七维视觉科技有限公司 | Method and apparatus for real-time animation compositing in video |
CN105528805A (zh) * | 2015-12-25 | 2016-04-27 | 苏州丽多数字科技有限公司 | Virtual face animation synthesis method |
CN106251396A (zh) * | 2016-07-29 | 2016-12-21 | 迈吉客科技(北京)有限公司 | Real-time control method and system for a three-dimensional model |
Family Cites Families (9)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US6363380B1 (en) * | 1998-01-13 | 2002-03-26 | U.S. Philips Corporation | Multimedia computer system with story segmentation capability and operating program therefor including finite automation video parser |
CN101047434B (zh) * | 2007-04-10 | 2010-09-29 | 华为技术有限公司 | Time stamp synchronization method, system, and apparatus |
CN101271520A (zh) * | 2008-04-01 | 2008-09-24 | 北京中星微电子有限公司 | Method and apparatus for determining the positions of feature points in an image |
CN101763636B (zh) * | 2009-09-23 | 2012-07-04 | 中国科学院自动化研究所 | Method for tracking three-dimensional face position and pose in a video sequence |
US9747495B2 (en) * | 2012-03-06 | 2017-08-29 | Adobe Systems Incorporated | Systems and methods for creating and distributing modifiable animated video messages |
US9600742B2 (en) * | 2015-05-05 | 2017-03-21 | Lucasfilm Entertainment Company Ltd. | Determining control values of an animation model using performance capture |
CN105518714A (zh) * | 2015-06-30 | 2016-04-20 | 北京旷视科技有限公司 | Liveness detection method and device, and computer program product |
US9865072B2 (en) * | 2015-07-23 | 2018-01-09 | Disney Enterprises, Inc. | Real-time high-quality facial performance capture |
CN105069830A (zh) * | 2015-08-14 | 2015-11-18 | 广州市百果园网络科技有限公司 | Expression animation generation method and apparatus |
- 2016-07-29 CN: application CN201610619560.4A, patent CN106251396B/zh, status Active
- 2017-04-21 WO: application PCT/CN2017/081376, patent WO2018018957A1/zh, status Application Filing
- 2019-01-29 US: application US16/261,482, patent US10930074B2/en, status Active
Cited By (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111405361A (zh) * | 2020-03-27 | 2020-07-10 | 咪咕文化科技有限公司 | Video acquisition method, electronic device, and computer-readable storage medium |
CN111405361B (zh) * | 2020-03-27 | 2022-06-14 | 咪咕文化科技有限公司 | Video acquisition method, electronic device, and computer-readable storage medium |
CN113427486A (zh) * | 2021-06-18 | 2021-09-24 | 上海非夕机器人科技有限公司 | Robotic arm control method, apparatus, computer device, storage medium, and robotic arm |
CN113427486B (zh) * | 2021-06-18 | 2022-10-28 | 上海非夕机器人科技有限公司 | Robotic arm control method, apparatus, computer device, storage medium, and robotic arm |
CN115442519A (zh) * | 2022-08-08 | 2022-12-06 | 珠海普罗米修斯视觉技术有限公司 | Video processing method, apparatus, and computer-readable storage medium |
CN115442519B (zh) * | 2022-08-08 | 2023-12-15 | 珠海普罗米修斯视觉技术有限公司 | Video processing method, apparatus, and computer-readable storage medium |
Also Published As
Publication number | Publication date |
---|---|
CN106251396B (zh) | 2021-08-13 |
US10930074B2 (en) | 2021-02-23 |
US20190156574A1 (en) | 2019-05-23 |
CN106251396A (zh) | 2016-12-21 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
WO2018018957A1 (zh) | Real-time control method and system for a three-dimensional model | |
CN111738220B (zh) | Three-dimensional human pose estimation method, apparatus, device, and medium | |
KR20180121494A (ko) | Method and system for real-time 3D capture and live feedback with monocular cameras | |
US11386633B2 (en) | Image augmentation for analytics | |
CN106710003B (zh) | Three-dimensional photographing method and system based on OpenGL ES | |
WO2019100932A1 (zh) | Motion control method and device, storage medium, and terminal | |
WO2010038693A1 (ja) | Information processing apparatus, information processing method, program, and information storage medium | |
JP7483301B2 (ja) | Image processing and image synthesis method, apparatus, and computer program | |
TWI752419B (zh) | Image processing method and apparatus, image device, and storage medium | |
WO2019200719A1 (zh) | Three-dimensional face model generation method, apparatus, and electronic device | |
CN103999455B (zh) | Collaborative cross-platform video capture | |
WO2019019927A1 (zh) | Video processing method, network device, and storage medium | |
JP2023514289A (ja) | Three-dimensional face model construction method and apparatus, computer device, and computer program | |
CN107707899B (zh) | Multi-view image processing method and apparatus for moving targets, and electronic device | |
CN112348937A (zh) | Face image processing method and electronic device | |
WO2023066120A1 (zh) | Image processing method and apparatus, electronic device, and storage medium | |
KR20150068895A (ko) | Apparatus and method for generating three-dimensional output data | |
CN110152293A (zh) | Positioning method and apparatus for a controlled object, and positioning method and apparatus for a game object | |
CN111064981B (zh) | Video streaming system and method | |
KR20150025462A (ko) | Method and apparatus for modeling an interactive character | |
JP5066047B2 (ja) | Information processing apparatus, information processing method, program, and information storage medium | |
WO2024131204A1 (zh) | Virtual scene device interaction method and related products | |
US11675195B2 (en) | Alignment of 3D representations for hologram/avatar control | |
Smolska et al. | Reconstruction of the Face Shape using the Motion Capture System in the Blender Environment. | |
KR20240048207A (ko) | Video streaming method and apparatus for an extended reality device based on prediction of the user's context information | |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
121 | Ep: the epo has been informed by wipo that ep was designated in this application |
Ref document number: 17833263 Country of ref document: EP Kind code of ref document: A1 |
|
NENP | Non-entry into the national phase |
Ref country code: DE |
|
122 | Ep: pct application non-entry in european phase |
Ref document number: 17833263 Country of ref document: EP Kind code of ref document: A1 |
|
32PN | Ep: public notification in the ep bulletin as address of the addressee cannot be established |
Free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 112(1) EPC (EPO FORM 1205A DATED 02.07.2019) |
|