WO2018018957A1 - Real-time control method and system for a three-dimensional model (三维模型的实时控制方法和系统) - Google Patents

Real-time control method and system for a three-dimensional model (三维模型的实时控制方法和系统)

Info

Publication number
WO2018018957A1
WO2018018957A1 (PCT/CN2017/081376)
Authority
WO
WIPO (PCT)
Prior art keywords
face
real
image
calibration
key point
Prior art date
Application number
PCT/CN2017/081376
Other languages
English (en)
French (fr)
Inventor
伏英娜
金宇林
Original Assignee
迈吉客科技(北京)有限公司
Priority date
Filing date
Publication date
Application filed by 迈吉客科技(北京)有限公司
Publication of WO2018018957A1
Priority to US16/261,482 (US10930074B2)

Classifications

    • G06T 17/00 Three dimensional [3D] modelling, e.g. data description of 3D objects
    • G06T 13/40 3D [Three Dimensional] animation of characters, e.g. humans, animals or virtual beings
    • G06T 19/003 Manipulating 3D models or images for computer graphics: navigation within 3D models or images
    • G06T 7/00 Image analysis
    • G06V 10/25 Image preprocessing: determination of region of interest [ROI] or a volume of interest [VOI]
    • G06V 20/64 Scenes; scene-specific elements: three-dimensional objects
    • G06V 40/161 Human faces, e.g. facial parts, sketches or expressions: detection; localisation; normalisation
    • G06V 40/165 Detection; localisation; normalisation using facial parts and geometric relationships
    • G06V 40/174 Facial expression recognition
    • H04N 21/2187 Source of audio or video content: live feed
    • H04N 21/23418 Processing of video elementary streams involving operations for analysing video streams, e.g. detecting features or characteristics
    • H04N 21/4223 Input-only peripherals connected to client devices: cameras
    • H04N 21/44008 Processing of video elementary streams at the client involving operations for analysing video streams, e.g. detecting features or characteristics in the video stream
    • H04N 21/4788 Supplemental services communicating with other users, e.g. chatting
    • G06T 2200/08 Indexing scheme involving all processing steps from image acquisition to 3D model generation

Definitions

  • Embodiments of the present invention relate to a control method and system for a stereoscopic model, and in particular to a real-time control method and system for a three-dimensional model.
  • Video playback and video interaction on audiovisual and mobile communication devices are now commonplace, and the parties communicating in such video usually appear as their real images. With advances in communications, sensor, and modeling technology, real-time interaction through 3D character models is emerging worldwide.
  • Prior art solutions can replace a person's real image with a virtual cartoon avatar in real time, enabling real-time interaction between cartoon avatars standing in for real people while conveying emotional expressions such as joy, anger, crying, and laughter. For example, a live-streamed storyteller can appear as a cartoon character, and a real teacher lecturing on physics can appear as a famous scientist. Two strangers can interact by playing different roles; for example, Snow White can video-chat with Prince Charming.
  • In one prior technique, the computer science department of Stanford University uses an RGBD camera, relying on the depth information the camera provides, to achieve similar functionality. However, most mobile devices are now equipped only with RGB cameras; without depth information, that algorithm cannot be extended to the broader mobile Internet setting.
  • In another prior technique, FaceRig and Adobe implement similar functionality on PC computers based on RGB cameras; however, the weaker computing power of mobile devices makes real-time performance difficult to achieve there.
  • In view of this, embodiments of the present invention provide a real-time control method for a three-dimensional model, to solve the technical problem that, in a mobile Internet environment, the limited computing resources of a terminal cannot produce real-time feedback on a real object to control the motion of a three-dimensional model and form smooth video.
  • Embodiments of the present invention further provide a real-time control system for a three-dimensional model, to solve the technical problem that, constrained by hardware resources such as the mobile Internet environment, the processing power of mobile terminals, and camera performance, the motion of a 3D model of a real object cannot be controlled in real time to form smooth video.
  • The real-time control method for a three-dimensional model of the invention comprises: acquiring a real-time video of a real object; identifying actions of the real object in the real-time video images; and forming action control instructions for the corresponding 3D model according to changes in the identified actions.
  • The real-time control method for a three-dimensional model of the invention, in a head-and-face embodiment, comprises: acquiring real-time video of the head and face of a real object; locating the face region using a low-resolution copy of a video frame; applying the face region directly on corresponding copies of adjacent frame images; identifying face key points in the face region of the frame image or its corresponding copy; using the position-fixed key points of the front-view 2D face in the image to establish a head-orientation reference pattern, a face reference plane, and a face reference pattern on that plane, forming a coordinate mapping with a front-view 3D head model; measuring the deformation of the head-orientation reference pattern relative to the face reference pattern as the head rotates between adjacent frame images, to obtain head rotation data; and combining the position changes of 2D face key points in adjacent frames with the head rotation data to form control instructions for head and facial actions and expressions.
  • The real-time control system for a three-dimensional model of the invention comprises:
  • a video acquisition device, configured to acquire a real-time video of a real object;
  • an image identification device, configured to identify actions of the real object in the real-time video images;
  • an action instruction generation device, configured to form action control instructions for the corresponding 3D model according to changes in the identified actions.
  • The real-time control method for a three-dimensional model of the present invention forms action control instructions for controlling a 3D model by recognizing, in the acquired real-time video, a real object and the changes in its actions. As abstract data with concrete meaning, the action control instructions are small in volume and require little bandwidth for real-time transmission, which ensures real-time delivery in a mobile Internet environment.
  • The method avoids the latency of transmitting over the mobile Internet the large volume of video data produced by real-time rendering of the 3D model, and the resulting stuttering of VR video playback, by letting the 3D model's rendering and the generation of its control be completed at the two ends of the mobile Internet environment: one end uses a mobile terminal with limited hardware resources to recognize and capture the real object's action changes and form instructions, while the other end uses the mobile Internet environment to download, load, and activate the necessary 3D models and scenes; the 3D model performs the real object's corresponding actions via control instructions transmitted in real time, producing the corresponding model rendering and scene rendering for a VR live broadcast.
  • The real-time control system for a three-dimensional model of the invention can be deployed on resource-limited mobile terminals in a mobile Internet environment, using the terminal's limited processing and camera capabilities to concentrate on processing the real object's action changes, efficiently obtaining its accurate action states, and forming control instructions based on those changes. The control instructions can exert accurate real-time motion control over any matching 3D model, faithfully expressing the real object's real-time actions in the 3D model. Motion control of the 3D model thus need not be fused into the video of the real object, and motion simulation of the real object is no longer limited by the restricted bandwidth of the mobile Internet environment.
  • FIG. 1a is a process flow diagram of an embodiment of a real-time control method for a three-dimensional model of the present invention.
  • FIG. 1b is a processing flowchart of an embodiment of a real-time control method for a three-dimensional model according to the present invention.
  • FIG. 2 is a flow chart of motion recognition of an embodiment of a real-time control method for a three-dimensional model of the present invention.
  • FIG. 3 is a flow chart of an embodiment of facial expression recognition in an embodiment of a real-time control method for a three-dimensional model of the present invention.
  • FIG. 4 is a flow chart of another embodiment of facial expression recognition in an embodiment of a real-time control method for a three-dimensional model of the present invention.
  • FIG. 5 is a flow chart of another embodiment of facial expression recognition according to an embodiment of a real-time control method for a three-dimensional model of the present invention.
  • FIG. 6 is a flow chart of an embodiment of head motion recognition and facial expression recognition according to an embodiment of a real-time control method for a three-dimensional model according to the present invention.
  • FIG. 7 is a flowchart of control commands and audio data synchronization in an embodiment of a real-time control method for a three-dimensional model according to the present invention.
  • FIG. 8 is a schematic diagram of control effects of an embodiment of a real-time control method for a three-dimensional model according to the present invention.
  • FIG. 9 is a schematic structural diagram of an embodiment of a real-time control system of a three-dimensional model according to the present invention.
  • FIG. 10 is a schematic structural diagram of image recognition of an embodiment of a real-time control system of a three-dimensional model according to the present invention.
  • FIG. 11 is a schematic structural diagram of single frame object and key point recognition according to an embodiment of a real-time control system of a three-dimensional model of the present invention.
  • FIG. 12 is a schematic structural diagram of object recognition in consecutive frames according to an embodiment of a real-time control system for a three-dimensional model of the present invention.
  • FIG. 13 is a schematic structural diagram of head and face motion recognition according to an embodiment of a real-time control system for a three-dimensional model of the present invention.
  • FIG. 1a is a flowchart of a real-time control method of a three-dimensional model according to an embodiment of the present invention, which is a control process independently completed by a content production end. As shown in Figure 1a, the method includes:
  • Step 100 Acquire a real-time video of a real object
  • The above real object includes a complete human body, or a limb, the head, or the face of a human body, and correspondingly includes limb actions, head actions, and facial actions (expressions).
  • Step 200 Identify an action of a real object in the real-time video image
  • The above identification includes recognition of the real object, locating the recognized real object, locating the recognized real object's actions, and locating changes in those actions. Examples include the capture (marking) and analysis (recognition) of limb or head movements, or the capture (marking) and analysis (recognition) of facial expressions.
  • Step 300 Form an action control instruction of the corresponding 3D model according to the change of the identification action.
  • The above change (in an identified action) is a change in the located states of the start and end of the recognized real object's action; the change is measurable or quantifiable.
  • The above corresponding 3D model is a 3D model forming the VR counterpart of the real object, for example a limb model, a head model, or a face model.
  • the real-time control method of the three-dimensional model of the present invention forms an action control command for controlling the 3D model by recognizing a real object in the acquired real-time video and an action change of the real object.
  • the action control instruction has small data volume and low data bandwidth requirement for real-time transmission, which can ensure real-time transmission in the mobile Internet environment.
  • The above steps are performed independently by the content production end, and the resulting action control instructions can be buffered or saved as data.
  • On the content consumption side, only the acquired corresponding 3D model is called and controlled according to the received action control instructions, so that the 3D model performs the corresponding actions.
  • When the system also needs to transmit audio data, the method further includes, as shown in FIG. 1a:
  • Step 400: Synchronize the audio data with the action control instructions, and output them.
  • The above synchronization means that the action control instructions and the audio data within a unit of time are given the same reference point, reference tag, or time stamp, so that execution of the action control instructions and output of the audio data can be combined to form synchronized playback.
  • This step synchronizes the audio data accompanying the real object's actions with the continuous action control instructions on the time axis, overcoming the desynchronization caused by processing delays during data processing.
  • FIG. 1b illustrates a real-time control method for a three-dimensional model according to an embodiment of the present invention. The method controls a 3D model through the action control instructions at the content consumption end; as shown in FIG. 1b, the method includes:
  • Step 500: Call the acquired corresponding 3D model;
  • Step 600: Control the corresponding 3D model to complete actions according to the received action control instructions.
  • When the received information includes accompanying audio data in addition to the action control instructions, in order to accurately match the 3D model actions formed by the instructions with the accompanying audio, Step 600 may include: a step of receiving the audio data and the action control instructions; a step of buffering the audio data and the action control instructions; a step of matching the audio data with the action control instructions; and a step of playing the audio synchronously while the corresponding 3D model completes the actions, as sketched below.
  • The above buffering overcomes the data delays caused by multi-path transmission over the mobile Internet.
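  • As an illustration of the consumption-end steps just listed, the following minimal sketch buffers time-labelled packets and releases instruction/audio pairs in time order; `apply_to_model` and `play_audio` are hypothetical callbacks, and the packet layout and 200 ms jitter window are assumptions rather than anything specified by the patent.

```python
import heapq

def play_synced(packets, apply_to_model, play_audio, buffer_ms=200):
    """Buffer time-labelled packets (which may arrive out of order over
    multi-path mobile links) and release them in time order once they are
    older than the jitter window."""
    heap = []
    for seq, pkt in enumerate(packets):  # pkt: {"t": ms, "instr": ..., "audio": ...}
        heapq.heappush(heap, (pkt["t"], seq, pkt))
        while heap and heap[0][0] <= pkt["t"] - buffer_ms:
            _, _, p = heapq.heappop(heap)
            apply_to_model(p["instr"])  # the 3D model performs the action...
            play_audio(p["audio"])      # ...while the matching audio plays
    while heap:  # flush whatever remains once the stream ends
        _, _, p = heapq.heappop(heap)
        apply_to_model(p["instr"])
        play_audio(p["audio"])
```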
  • With the real-time control method of this embodiment, the content production end can use a mobile terminal device to capture continuous real-time video, perform object recognition on the main real object in it, locate the real object's actions, and mark their changes.
  • The marked data of the action changes is formed into continuous action control instructions.
  • Motion control of the corresponding 3D model is then completed through the action control instructions at the content consumption end.
  • The data volume of the action control instructions formed by the content production end is greatly reduced compared with the volume of VR video data formed after the 3D model is rendered, which is more conducive to real-time transmission in a mobile Internet environment and safeguards the quality of the VR live broadcast.
  • The content production end and the content consumption end may be deployed on different devices or multimedia terminals of a local network, or on different devices or multimedia terminals at the two ends of the mobile Internet; one content production end may serve multiple content consumption ends deployed on the local network or at remote ends of the mobile Internet.
  • FIG. 2 is a flow chart showing motion recognition in a real-time control method of a three-dimensional model according to an embodiment of the present invention. As shown in FIG. 2, the step 200 shown in FIG. 1a includes the following steps:
  • Step 201 Identify a real object in an image of the real-time video according to the preset object recognition policy
  • Step 202 Identify a key point of a real object in the image according to a preset key point identification strategy
  • Position (coordinate) changes of the above key points reflect subtle action changes of a specific object: for example, position changes of the facial-feature key points of the head reflect head movements; position changes of the joint key points of the limbs reflect torso movements; and position changes of the mouth-corner, eyebrow-tip, and mouth-shape key points of the face reflect facial expressions.
  • Step 203 forming a plane coordinate space of a key point and a stereo coordinate space of the corresponding 3D model
  • Step 204 Measure coordinate changes of key points in the plane coordinate space in the continuous image, and record corresponding coordinate changes of the key points in the continuous image in the three-dimensional coordinate space.
  • The real-time control method of this embodiment uses an object recognition strategy to identify a specific object in the image, such as a limb, head, or face, and a key point recognition strategy to identify the key points of that specific object that are closely tied to its action changes.
  • By establishing an initial mapping between the plane coordinate system of the 2D real object in the image and the stereo coordinate system of the corresponding 3D model, key point position changes observed in the 2D image can be converted into key point position changes of the corresponding 3D model.
  • In this case, the coordinate changes of the key points are formed into action control instructions for the real object's corresponding 3D model.
  • Specifically, the coordinate differences of the same real object's key points across consecutive images can serve as parameters included in the corresponding 3D model's action control instructions, forming a description of the real object's actions. In this way, abstract narrow-band coordinate data forms the control instructions, which drive the 3D model to perform the corresponding actions and so produce rendered broadband VR video; the VR live broadcast is no longer limited by transmission bandwidth and is formed directly and in real time at the content consumption end.
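  • To make the narrow-band nature of this data concrete, the following minimal sketch measures per-frame key point displacements in the plane coordinate space and maps them into the model's stereo coordinate space; the per-keypoint 3x2 mapping matrices are an assumed representation of the initial 2D-to-3D mapping, not a structure the patent prescribes.

```python
import numpy as np

def plane_deltas(prev_pts, curr_pts):
    """Step 204: coordinate changes of key points in the plane coordinate
    space between two consecutive images; both arrays have shape (N, 2)."""
    return np.asarray(curr_pts, dtype=float) - np.asarray(prev_pts, dtype=float)

def to_model_space(deltas_2d, mapping):
    """Apply an initial 2D-to-3D mapping (here one 3x2 matrix per key point,
    shape (N, 3, 2)) to express each plane delta as the corresponding
    displacement in the 3D model's stereo coordinate space, shape (N, 3)."""
    return np.einsum('nij,nj->ni', mapping, deltas_2d)
```

A second of such deltas for a few dozen key points at 25 fps is on the order of kilobytes, which is why the control stream tolerates mobile bandwidth far better than rendered video frames would.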
  • FIG. 3 is a flowchart of one embodiment of facial expression recognition in an embodiment of the real-time control method for a three-dimensional model of the present invention.
  • When the real object in the image is a face, the method of recognizing the face and the face key points in one frame image is as shown in FIG. 3 and includes:
  • Step 221 Acquire a frame of the original image M0 of the real-time video
  • Step 222 Generate a set of original image copies with correspondingly decreasing resolution according to the decreasing sampling rate: M1, M2...Mm-i, ... Mm-1, Mm;
  • Step 223 using the number of original image copies m as the number of loops, starting from the original image copy Mm with the lowest resolution, and sequentially performing face region calibration in the original image copy (using the face object recognition strategy);
  • Step 224: Determine whether face region calibration is completed in an original image copy; if not, return to step 223 and continue face region calibration on the next original image copy; if completed, execute step 225; when all m original image copies have been looped through and face region calibration is still not completed, execute step 227;
  • Step 225 Mark the corresponding original image copy Mm-i, and form face area calibration data
  • Step 226 using the face region calibration data in combination with the corresponding sampling rate, and completing the face region calibration in the subsequent original image copy (Mm-i...M2, M1) and the original image M0;
  • Step 227 Perform face region calibration using the original image M0;
  • The above face region calibration steps can be further optimized: a set of original image copies with correspondingly decreasing resolution is generated according to decreasing sampling rates, and the lowest-resolution original image copy on which face region calibration succeeds is taken from it to form the face region calibration data.
  • The steps of face key point calibration include:
  • Step 228: Perform face key point calibration on the face regions calibrated in the original image copy Mm-i, or/and the subsequent original image copies (Mm-i...M2, M1), or/and the original image M0, forming face key point calibration data of differing precision.
  • In an embodiment of the invention, a face key point identification strategy can be used to perform the face key point calibration.
  • In the real-time control method of this embodiment, as a normal mode, a set of original image copies of gradually decreasing resolution is obtained by sampled attenuation of the original image, so that the face region recognition strategy, which consumes the most processing resources and causes processing delay, completes as quickly as possible on a lower-precision image copy, saving processing resources. The obtained face region calibration data is then combined with each original image copy's sampling rate to quickly complete face region calibration on the higher-resolution original image copies and the original image, yielding high-precision face region calibration and the corresponding face region calibration data; meanwhile key point calibration, which does not consume significant processing resources, is performed on each face-region-calibrated original image copy and on the original image, yielding face key point calibration data of higher precision.
  • The face region calibration data of an original image copy is coordinate data; with the corresponding sampling rate as the scaling ratio relative to the original image, the face region calibration data of one original image copy can be quickly and accurately mapped to the corresponding positions of different original image copies or the original image to complete their face region calibration.
  • A person skilled in the art will understand that, as a fast mode, once face region calibration of the original image copy Mm-i is completed, face key point calibration in step 228 can be performed directly on that calibrated face region to determine the face key point calibration data, which yields the optimal processing rate for face region calibration and face key point calibration of one frame image.
  • The face region calibration data and face key point calibration data of the original image M0 help improve the stability of face key point calibration and are applied in a high-precision mode.
  • On the other hand, since successive frames captured by the camera of a mobile device such as an iPhone differ slightly from one another, images downsampled by averaging are more stable and differ even less from frame to frame; the face region calibration data and face key point calibration data of the original image copy Mm-i therefore help improve the stability of the algorithm and are applied in a stability mode.
  • In the real-time control method of this embodiment, face region calibration and face key point calibration are processed very fast, meeting the real-time requirement of 25 frames per second (25 fps) and enabling real-time recognition of actions or expressions on a mobile device.
  • By analyzing application scenarios such as live streaming, video calls, and fast motion, and by exploiting characteristics of the real object in the video images such as area, region, and displacement, a highly real-time face detection and alignment method is realized that balances processing speed against processing precision. While guaranteeing a given precision, the real-time control method of this embodiment significantly improves the processing speed of continuous face region recognition.
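  • A minimal sketch of the coarse-to-fine calibration of steps 221 to 227 follows. `detect_face` stands in for whatever face object recognition strategy is used (any single-image detector returning a box or None); the pyramid depth of four copies and the 0.5 sampling rate are illustrative assumptions.

```python
import cv2  # OpenCV, used here only for averaging downsamples

def build_pyramid(m0, num_copies=4, rate=0.5):
    """Step 222: copies M1..Mm with successively decreasing resolution."""
    copies, img = [], m0
    for _ in range(num_copies):
        img = cv2.resize(img, None, fx=rate, fy=rate,
                         interpolation=cv2.INTER_AREA)  # averaging downsample
        copies.append(img)
    return copies  # copies[-1] is Mm, the lowest-resolution copy

def calibrate_face_region(m0, detect_face, num_copies=4, rate=0.5):
    """Steps 223-227: try the cheapest copies first, fall back to M0."""
    copies = build_pyramid(m0, num_copies, rate)
    for i, img in reversed(list(enumerate(copies))):  # Mm, Mm-1, ..., M1
        box = detect_face(img)  # (x, y, w, h) in copy coordinates, or None
        if box is not None:
            # Step 226: the sampling rate acts as the scaling ratio that maps
            # the calibration data back onto higher-resolution copies and M0.
            scale = (1.0 / rate) ** (i + 1)
            return tuple(int(v * scale) for v in box)
    return detect_face(m0)  # Step 227: calibrate on the original image
```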
  • FIG. 4 is a flowchart of another embodiment of facial expression recognition in an embodiment of the real-time control method for a three-dimensional model of the present invention. It shows, building on the method of recognizing the face and face key points in one frame image, a method of recognizing face key points in consecutive frame images. As shown in FIG. 4, the method includes:
  • Step 231: Acquire the face region calibration data of the corresponding original image copy Mm-i and of the original image M0 from the face region calibration of one frame image of the real-time video; this step may follow the execution process of steps 221 to 226;
  • Step 232: Acquire the original image M0 and the corresponding original image copy Mm-i of the subsequent consecutive frame images; then perform steps 233 and 234 respectively;
  • Step 233: Perform face region calibration of the original image copies Mm-i of the subsequent consecutive frame images using the face region calibration data of the original image copy Mm-i;
  • Step 234: Perform face region calibration of the original images M0 of the subsequent consecutive frame images using the face region calibration data of the original image M0;
  • Step 234 may also be performed before step 233, or the two may be performed synchronously.
  • Step 235: Perform face key point calibration on the calibrated face regions of the original image copies Mm-i and the original images M0 of the subsequent frames, forming face key point calibration data of differing precision.
  • Exploiting the fact that, in a given scene, the real object in real-time video does not undergo large displacement, the real-time control method of this embodiment applies the face region calibration data of the previous frame to the face region calibration of a limited number of subsequent images, further increasing the speed of face region calibration while keeping the face region stable, and further reducing the processing resources consumed by the face region calibration process.
  • FIG. 5 is a flowchart of another embodiment of facial expression recognition in an embodiment of the real-time control method for a three-dimensional model of the present invention. It shows, building on the method of recognizing face key points in one frame image, a method of recognizing the face and face key points in consecutive frame images. As shown in FIG. 5, the method includes:
  • Step 241 Acquire a face region calibration data of the corresponding original image copy Mm-i or the original image M0 according to the face region calibration of the real-time video one-frame image; this step may take the execution process of steps 221 to 226;
  • Step 242 calibrate the face key point in the calibrated face area
  • Step 243 forming a bounding box range by using a face key point contour
  • Step 244 using the expanded bounding box range as the face area of the next frame, and performing face key point calibration in the bounding box range;
  • Step 245 determining whether the face key point calibration is successful; if successful, executing step 246; if not, proceeding to step 241;
  • Step 246 Forming an updated bounding box range using the face keypoint contour and scaling up the updated bounding box range; and proceeding to step 244 to obtain data for the next frame.
  • In this embodiment, the contour (bounding box) of the face key points determined in the previous frame is used as the face region calibration data of the next frame image; that is, the result of the previous frame serves as the initial value for predicting the next frame.
  • Because the calibration range of the face region is enlarged, the time-consuming face region detection need not run on every frame while the face is not moving violently, which improves the real-time performance of the overall algorithm. If the face key point calibration of this embodiment cannot obtain a correct result, indicating that the face may have moved sharply between the two frames, face detection is performed once more to obtain the new face location, and key point calibration is then redone.
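  • A sketch of this expanded-bounding-box tracking loop (steps 241 to 246) follows; `detect_face` and `calibrate_keypoints` are stand-ins for the face region and face key point calibration strategies, and the 1.2 expansion factor is an assumed value.

```python
import numpy as np

def expand_box(pts, scale=1.2):
    """Steps 243/246: bounding box of the key point contour, enlarged by
    `scale`; pts is an (N, 2) array of face key point coordinates."""
    x0, y0 = pts.min(axis=0)
    x1, y1 = pts.max(axis=0)
    cx, cy = (x0 + x1) / 2, (y0 + y1) / 2
    w, h = (x1 - x0) * scale, (y1 - y0) * scale
    return (cx - w / 2, cy - h / 2, w, h)

def track_keypoints(frames, detect_face, calibrate_keypoints):
    """Yield per-frame face key points, re-detecting only when tracking fails."""
    region = None
    for frame in frames:
        if region is None:
            region = detect_face(frame)           # step 241: full detection
        pts = calibrate_keypoints(frame, region)  # step 244
        if pts is None:                           # step 245 failed: likely
            region = None                         # fast motion, so re-detect
            continue
        yield pts
        region = expand_box(np.asarray(pts))      # predict the next frame
```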
  • Facial expression capture in video images includes face region recognition and calibration procedures and face key point (e.g., facial features) calibration. General processing of video images, including image copying, sub-sampling to form image copies, image scaling, establishing coordinate mappings between similar images, aligning and translating identical or similar parts between different images, and coordinate-based two- or three-dimensional angular transformation and warping, is not described in detail in this embodiment.
  • FIG. 6 is a flowchart of an embodiment of head action recognition and facial expression recognition in an embodiment of the real-time control method for a three-dimensional model of the present invention. Given the integral connection of head and face, it shows, for the case where the real object in the image is a head, a method of recognizing head motion in consecutive frame images building on the method of recognizing face key points in one frame image. As shown in FIG. 6, the method includes:
  • Step 251: According to the face region calibration of the face in the real-time video image, calibrate the 2D key points of the front-view face, and use the key points with relatively fixed positions to form the head-orientation reference pattern; proceed to step 254;
  • Step 253: According to the 2D key points of the front-view face with relatively fixed positions, form the face reference plane and the face reference pattern on that plane; proceed to step 254;
  • Step 254: Form a perspective projection of the 2D face of adjacent frame images of the real-time video onto the face reference plane, and obtain the Euler rotation data or quaternion rotation data of the head from the deformation of the head-orientation reference pattern of step 251 relative to the face reference pattern on the face reference plane.
  • The above Euler rotation data includes the head's rotation angles about the three axes x, y, and z.
  • The Euler rotation data can be converted into quaternion rotation data for more efficient rotation-state processing and smoother interpolation during rotation.
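  • This conversion is standard mathematics; a sketch follows for the common yaw-pitch-roll composition (rotation about z, then y, then x), the axis convention being an assumption since the patent does not fix one.

```python
import math

def euler_to_quaternion(rx, ry, rz):
    """Convert head rotation angles about x, y, z (radians) to a unit
    quaternion (w, x, y, z). Quaternions avoid gimbal lock and interpolate
    smoothly (e.g., via slerp), which suits rotation-state processing."""
    cx, sx = math.cos(rx / 2), math.sin(rx / 2)
    cy, sy = math.cos(ry / 2), math.sin(ry / 2)
    cz, sz = math.cos(rz / 2), math.sin(rz / 2)
    return (
        cx * cy * cz + sx * sy * sz,  # w
        sx * cy * cz - cx * sy * sz,  # x
        cx * sy * cz + sx * cy * sz,  # y
        cx * cy * sz - sx * sy * cz,  # z
    )
```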
  • the real-time control method of the three-dimensional model of the present embodiment utilizes a key point (for example, both eyes and a nose tip) in a face-up 2D (planar) face key point in the image to form a head-oriented reference pattern (for example, in both eyes and The polygon pattern with the tip of the nose as the apex, and the face reference plane and the face reference pattern are formed at the same time, and the 2D (planar) face key point and the projection coincidence of the face 3D (stereo) face key point are used to establish 2D.
  • the mapping relationship between face key point coordinates and 3D face key point coordinates It realizes the coordinate of 3D face key points and forms a mapping relationship through the coordinates of 2D face key points, so that the position change of 2D face key points can be accurately reflected in the 3D face model (including the integrated head model).
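  • The patent derives rotation from the deformation of the reference patterns without fixing a numerical algorithm. One common way to recover comparable rotation data from such 2D/3D key point correspondences is perspective-n-point estimation, sketched below with OpenCV; the six generic model points and the pinhole camera intrinsics are illustrative assumptions, not values from the patent.

```python
import cv2
import numpy as np

# Assumed 3D key points of a generic front-view head model (eyes, nose tip,
# mouth corners, chin) in model coordinates; units are arbitrary.
MODEL_POINTS = np.array([
    (-30.0,  40.0, -30.0),   # left eye
    ( 30.0,  40.0, -30.0),   # right eye
    (  0.0,   0.0,   0.0),   # nose tip
    (-25.0, -40.0, -30.0),   # left mouth corner
    ( 25.0, -40.0, -30.0),   # right mouth corner
    (  0.0, -75.0, -25.0),   # chin
], dtype=np.float64)

def head_rotation(image_points, frame_w, frame_h):
    """Rotation matrix of the head from six 2D face key points (pixels).

    image_points: (6, 2) float64 array ordered like MODEL_POINTS.
    """
    focal = frame_w  # rough pinhole approximation, an assumption
    camera = np.array([[focal, 0, frame_w / 2],
                       [0, focal, frame_h / 2],
                       [0, 0, 1]], dtype=np.float64)
    ok, rvec, tvec = cv2.solvePnP(MODEL_POINTS, image_points, camera, None)
    rot, _ = cv2.Rodrigues(rvec)  # 3x3 rotation; Euler angles or a
    return rot                    # quaternion can be extracted from it
```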
  • In an embodiment of the present invention, forming the action control instructions of the corresponding 3D model of the head and face includes:
  • Step 226: Using the face region calibration data combined with the corresponding sampling rates, complete face region calibration on the subsequent original image copies (Mm-i...M2, M1) and the original image M0;
  • Step 242: Calibrate the face key points in the calibrated face region;
  • Step 252: According to the 2D key points of the face in the real-time video image, form the front-view triangle mesh of the corresponding 3D head model, and form the coordinate mapping between the 2D face key points and the 3D key points of the 3D head model;
  • Step 311: From the obtained face key points, rotation angles, and coordinate mapping, use the coordinate changes of each 2D face key point in the consecutive frame images of the real-time video acquired in step 254, together with the head's Euler rotation data or quaternion rotation data, to form the inter-frame face key point movement parameters and the head's rotation direction;
  • Step 312: Encapsulate the key point movement parameters and the head's rotation direction into the control instructions of the 3D model head and face for the corresponding frame.
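  • A sketch of this per-frame packaging follows; the field names and JSON encoding are hypothetical, since the patent only requires a compact, frame-aligned instruction structure.

```python
import json

def pack_control_instruction(frame_idx, keypoint_moves, quaternion):
    """Step 312: seal the inter-frame key point movement parameters and the
    head rotation into one control instruction for the corresponding frame.

    keypoint_moves: list of (dx, dy) per face key point for this frame;
    quaternion: (w, x, y, z) head rotation for this frame.
    """
    return json.dumps({
        "frame": frame_idx,
        "kp": [[round(dx, 2), round(dy, 2)] for dx, dy in keypoint_moves],
        "rot": [round(v, 4) for v in quaternion],
    })
```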
  • In this way, a 2D key point is first lifted into a 3D key point, and the result is then reduced back to 2D to generate 2D control points, which makes for an effective control method.
  • The modeling process with modeling tools follows general modeling rules, including establishing the three-dimensional model, establishing the three-dimensional scene, the transmission, storage, and download of the three-dimensional model, and the deployment of 3D models in 3D scenes, and is not described in detail.
  • A 3D model of a cartoon avatar usually includes 3D models of the torso and the head, and the 3D model of the head in turn includes a 3D model of the face; these can be stored, transmitted, or controlled separately.
  • FIG. 7 is a flowchart of control commands and audio data synchronization in an embodiment of a real-time control method for a three-dimensional model according to the present invention.
  • the step 400 shown in FIG. 1a may include:
  • Step 421: Add a time label (or time stamp), in units of frames, to the control instructions of the 3D model head;
  • Step 422 Add a corresponding time label (or time stamp) to the audio data according to the time label of the control instruction;
  • Step 423 Adapt the control command and the audio data signal to the transmission link, and output in real time.
  • The transmission mechanism of the mobile Internet can prevent the control instructions and the audio data from arriving in exact synchrony at the content consumption end.
  • At the content consumption end, an appropriate buffer can be used to relax the requirement for synchronous reception of the signals, so that the synchronized output of the control instructions and the audio data is restored via the shared time labels, ensuring the audio-video synchronization quality of the VR live broadcast.
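  • A minimal sketch of the production-side tagging of steps 421 to 423, assuming the 25 fps frame clock discussed earlier; the packet layout is hypothetical.

```python
FPS = 25  # frame rate assumed from the real-time requirement above

def tag_streams(instructions, audio_chunks):
    """Steps 421-422: give each control instruction and its audio chunk the
    same time label so the consumer can realign them after jittery
    transport; both streams can then be output in real time (step 423)."""
    for frame_idx, (instr, audio) in enumerate(zip(instructions, audio_chunks)):
        t_ms = frame_idx * 1000 // FPS
        yield {"t": t_ms, "instr": instr}, {"t": t_ms, "audio": audio}
```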
  • FIG. 8 is a schematic diagram showing the control effect of an embodiment of a real-time control method for a three-dimensional model according to the present invention.
  • Taking a person's face as the real object, the face region and the changes of the key point positions within it are recognized in consecutive video images, and change parameters of the facial actions and expressions are formed from the measured changes, yielding continuous action control instructions for the facial expression.
  • The continuous action control instructions drive the corresponding key points on the face 3D model of the matching cartoon 3D model, forming the facial expressions of the cartoon face 3D model in real time.
  • In summary, the basic steps of face region identification in the real-time control method of the three-dimensional model mainly include: locating the face region using a low-resolution copy of a frame image; further increasing face region identification speed by applying the face region directly on the corresponding copies of adjacent frame images; and identifying the face key points in the face region of the frame image or of the corresponding copy, to suit different application modes.
  • The basic steps of head rotation identification in the real-time control method of the three-dimensional model mainly include:
  • establishing, from the position-fixed key points of the front-view 2D face in a frame or corresponding copy image, the head-orientation reference pattern, the face reference plane, and the face reference pattern on that plane, so as to form a coordinate mapping between the face key points of the front-view 3D head model and the 2D face key points;
  • obtaining the head rotation angle by measuring the deformation of the head-orientation reference pattern relative to the face reference pattern as the head rotates between adjacent frame images;
  • combining the position changes of the 2D face key points of adjacent frames with the changes in head rotation angle to form the control instructions for head and facial actions and expressions.
  • FIG. 9 is a schematic structural diagram of an embodiment of a real-time control system of a three-dimensional model according to the present invention. As shown in FIG. 9, the system includes a video acquisition device 10, an image identification device 20, and an action instruction generation device 30, wherein:
  • a video acquiring device 10 configured to acquire a real-time video of a real object
  • An image identifying device 20 configured to identify an action of a real object in the real-time video image
  • the action instruction generation device 30 is configured to form action control instructions for the corresponding 3D model according to changes in the identified actions.
  • a real-time control system for a three-dimensional model further includes a synchronization output device 40 for synchronizing audio data and motion control commands and outputting them.
  • a real-time control system for a three-dimensional model further includes an activation device 80 and a playback device 90, wherein:
  • the activation device 80 is configured to invoke the acquired corresponding 3D model
  • the playing device 90 is configured to control the corresponding 3D model to complete the action according to the received motion control instruction.
  • In an embodiment of the real-time control system of the present invention, the playback device 90 further includes a receiving device 91, a buffering device 92, a synchronization device 93, and an audio playback device 94, wherein:
  • the receiving device 91 is configured to receive the audio data and the action control instructions;
  • the buffering device 92 is configured to buffer the audio data and the action control instructions;
  • the synchronization device 93 is configured to match the audio data with the action control instructions;
  • the audio playback device 94 is configured to play the audio synchronously while the corresponding 3D model completes the actions.
  • FIG. 10 is a schematic structural diagram of image recognition of an embodiment of a real-time control system of a three-dimensional model according to the present invention.
  • As shown in FIG. 10, the image identification device 20 includes an object recognition device 21, an object key point recognition device 22, an object position coordinate establishing device 23, and an object action change recording device 24, wherein:
  • the object recognition device 21 is configured to identify a real object in an image of the real-time video according to the preset object recognition policy
  • the object key point identifying device 22 is configured to identify a key point of the real object in the image according to the preset key point identification strategy
  • the object position coordinate establishing means 23 is configured to form a plane coordinate space of the key point and a stereo coordinate space of the corresponding 3D model;
  • the object motion change recording device 24 is configured to measure coordinate changes of key points in the plane coordinate space in the continuous image, and record corresponding coordinate changes of the key points in the continuous image in the three-dimensional coordinate space.
  • In this case, the action instruction generation device 30 includes an action conversion device 31, configured to form the action control instructions of the real object's corresponding 3D model from the coordinate changes of the key points.
  • FIG. 11 is a schematic structural diagram of single frame object and key point recognition according to an embodiment of a real-time control system of a three-dimensional model of the present invention.
  • As shown in FIG. 11, the system includes an original image capturing device 41, an image copy generating device 42, a copy cycle calibration device 43, a region calibration determining device 44, a copy region calibration device 45, a universal region calibration device 46, a universal region calibration device 47, and a key point calibration device 48, wherein:
  • the original image capturing device 41 is configured to acquire a frame of the original image M0 of the real-time video
  • the image copy generating means 42 is configured to generate a set of original image copies with correspondingly decreasing resolution according to the decreasing sampling rate: M1, M2...Mm-i, ... Mm-1, Mm;
  • the copy cycle calibration device 43 is configured to perform face region calibration in the original image copies in sequence (using the face object recognition strategy), starting from the lowest-resolution original image copy Mm;
  • the region calibration determining device 44 is configured to determine whether face region calibration is completed in an original image copy; if not completed, the copy cycle calibration device 43 is called to continue calibration on the next copy; when completed, the copy region calibration device 45 is called; if the loop ends with face region calibration still not completed, the universal region calibration device 47 is called;
  • a copy area calibration device 45 for marking the corresponding original image copy Mm-i and forming face area calibration data
  • the universal area calibration device 46 is configured to perform face area calibration on the subsequent original image copies (Mm-i...M2, M1) and the original image M0 by using the face area calibration data in combination with the corresponding sampling rate;
  • the universal region calibration device 47 is configured to perform face region calibration using the original image M0 when the loop ends without completion;
  • the key point calibration device 48 is configured to perform face key point calibration (using the face key point recognition strategy) on the calibrated face regions of the original image copy Mm-i, the subsequent original image copies (Mm-i...M2, M1), and the original image M0, forming face key point calibration data of differing precision.
  • FIG. 12 is a schematic structural diagram of object recognition in consecutive frames according to an embodiment of a real-time control system for a three-dimensional model of the present invention.
  • As shown in FIG. 12, the system includes a face region calibration device 51, a continuous frame processing device 52, a continuous frame region calibration device 53, a copy region calibration determining device 54, and an original region calibration device 55, wherein:
  • the face area calibration device 51 is configured to acquire (by the universal area calibration device 46) the corresponding original image copy Mm-i and the face area calibration data of the original image M0;
  • the continuous frame processing device 52 is configured to acquire the original image M0 of the frame image of the subsequent continuous duration and the corresponding original image copy Mm-i;
  • the continuous frame area calibration device 53 is configured to perform face area calibration of the original image M0 of the subsequent continuous time frame image by using the face area calibration data of the original image M0;
  • the copy area calibration determining means 54 is configured to perform face area calibration of the original image copy Mm-i of the subsequent continuous time frame image by using the face area calibration data of the original image copy Mm-i;
  • the original region calibration device 55 is configured to perform face key point calibration on the calibrated face regions of the original image copies Mm-i and/or the original images M0 of subsequent frames, forming face key point calibration data of differing precision.
  • As shown in FIG. 12, the system further includes a face key point calibration device 62, a key point contour generating device 63, an adjacent frame key point calibration device 64, an adjacent frame calibration determining device 65, and a key point contour updating device 66, wherein:
  • the face key point calibration device 62 is configured to calibrate the face key point in the face area obtained by acquiring the corresponding original image copy Mm-i or the original image M0;
  • a key point contour generating device 63 configured to form a bounding box range by using a face key point contour
  • the adjacent frame key point calibration device 64 is configured to perform the face key point calibration in the expanded bounding box range by using the expanded bounding box range as the face area of the next frame;
  • the adjacent frame calibration determining means 65 is configured to determine whether the face key point calibration is successful, and if successful, the key point contour updating means 66 is invoked; if not, the face key point calibration means 62 is invoked;
  • the keypoint contour updating device 66 is configured to form an updated bounding box range by using the face keypoint contour, and scale the updated bounding box range to call the adjacent frame keypoint calibration device 64.
  • FIG. 13 is a schematic structural diagram of head and face motion recognition according to an embodiment of a real-time control system for a three-dimensional model of the present invention.
  • As shown in FIG. 13, the system includes a head-orientation reference generating device 71, a coordinate mapping generating device 72, a face reference generating device 73, and a rotation angle measuring device 74, wherein:
  • a head-facing reference generating device 71 configured to calibrate a 2D key point of the face of the face according to the face region calibration of the face in the real-time video image, and use the key point having a relatively fixed position to form the head toward the reference pattern;
  • the coordinate mapping generating device 72 is configured to form the coordinate mapping between the 2D face key points and the 3D key points of the 3D head model, according to the front-view triangle mesh of the corresponding 3D head model formed from the 2D key points of the face in the real-time video image;
  • a face reference generating device 73 configured to form a face reference pattern of the face reference plane and the face reference plane according to the 2D key points of the front face having relatively fixed positions;
  • the rotation angle measuring device 74 is configured to form a perspective projection of the 2D face of adjacent frame images of the real-time video onto the face reference plane, and to obtain the Euler rotation data or quaternion rotation data of the head from the deformation of the 2D face's head-orientation reference pattern relative to the face reference pattern on the face reference plane.
  • In an embodiment of the present invention, the real-time control system includes a head-and-face action parameter generating device 32 and a control instruction generating device 33 for forming the control instructions of head-and-face object actions across consecutive video frames, wherein:
  • the head-and-face action parameter generating device 32 is configured to use the coordinate changes of each 2D face key point in the consecutive frame images of the real-time video, together with the head's Euler rotation data or quaternion rotation data, to form the inter-frame face key point movement parameters and the head's rotation direction;
  • the control instruction generating device 33 is configured to encapsulate the key point movement parameters and the head's rotation direction into the control instructions of the 3D model head and face for the corresponding frame.
  • In an embodiment of the present invention, the real-time control system further includes an audio data synchronization device 35, a control instruction synchronization device 36, and a real-time output device 37, wherein:
  • the audio data synchronization device 35 is configured to add corresponding time labels to the audio data according to the time labels of the control instructions;
  • the control instruction synchronization device 36 is configured to add time labels, in units of frames, to the control instructions of the 3D model head;
  • the real-time output device 37 is configured to adapt the control instruction and audio data signals to the transmission link for real-time output.

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Signal Processing (AREA)
  • Oral & Maxillofacial Surgery (AREA)
  • Health & Medical Sciences (AREA)
  • Human Computer Interaction (AREA)
  • General Health & Medical Sciences (AREA)
  • Software Systems (AREA)
  • Geometry (AREA)
  • Computer Graphics (AREA)
  • General Engineering & Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Computer Hardware Design (AREA)
  • Remote Sensing (AREA)
  • Radar, Positioning & Navigation (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Processing Or Creating Images (AREA)

Abstract

A real-time control method for a three-dimensional model, addressing the technical problem that, in a mobile Internet environment, real-time feedback on a real object cannot be formed with limited resources to control the motion of a three-dimensional model and produce smooth video. The method includes: acquiring a real-time video of a real object (100); identifying actions of the real object in the real-time video images (200); forming action control instructions for the corresponding 3D model according to changes in the identified actions (300); and synchronizing the audio data with the action control instructions and outputting them (400).

Description

Real-time control method and system for a three-dimensional model

Technical Field

Embodiments of the present invention relate to a control method and system for a stereoscopic model, and in particular to a real-time control method and system for a three-dimensional model.

Background Art

Video playback and video interaction on audiovisual and mobile communication devices are now commonplace, and the parties communicating in such video usually appear as their real images. With advances in communications, sensor, and modeling technology, real-time interaction through three-dimensional character models is emerging worldwide. Existing solutions can replace a person's real image with a virtual cartoon avatar in real time, enabling real-time interaction between cartoon avatars standing in for real people while conveying emotional expressions such as joy, anger, crying, and laughter. For example, a live-streamed storyteller can appear as a cartoon character, and a real teacher lecturing on physics can appear as a famous scientist. Two strangers can interact on video by playing different roles; for example, Snow White can video-chat with Prince Charming.

To achieve this goal, the body of a real person in the real world, and especially facial expressions and movements, must drive the expressions and movements of a three-dimensional model in the virtual world, so that the two move in concert.

However, in this globally popular and novel field, existing schemes for controlling three-dimensional models from real human body movements, and in particular from real facial expressions and movements, have clear technical shortcomings when applied to the mobile Internet.

For example, one prior technique for head and face objects requires the high-definition camera of professional equipment, held in a fixed position relative to the face and combined with markers attached to the face, to achieve high-precision expression control; the camera is kept perpendicular to the face at a fixed position. By fixing the relative position of camera and face, this scheme avoids camera-to-face motion when the person turns the head. When the camera of a mobile terminal captures the face instead, turning the head leaves the camera no longer perpendicular to the front of the face, so facial actions and expressions cannot be captured accurately.

In another prior technique, the computer science department of Stanford University uses an RGBD camera, relying on the depth information the camera provides, to achieve similar functionality. However, most mobile devices today are equipped with RGB cameras; without depth information, that algorithm cannot be extended to the broader mobile Internet setting.

In another prior technique, FaceRig and Adobe implement similar functionality on PC computers based on RGB cameras. However, the weaker computing power of mobile devices makes real-time performance difficult to achieve.

It can thus be seen that existing schemes for controlling three-dimensional models from real expressions and movements depend either on special video-capture equipment or on the strong computing power of a computer; none achieves real-time control of a three-dimensional model using only an ordinary mobile device (a mobile phone).
Summary of the Invention

In view of this, embodiments of the present invention provide a real-time control method for a three-dimensional model, to solve the technical problem that, in a mobile Internet environment, the limited computing resources of a terminal cannot produce real-time feedback on a real object to control the motion of a three-dimensional model and form smooth video.

Embodiments of the present invention also provide a real-time control system for a three-dimensional model, to solve the technical problem that, constrained by hardware resources such as the mobile Internet environment, the processing power of mobile terminals, and camera performance, the motion of a three-dimensional model of a real object cannot be controlled in real time to form smooth video.

The real-time control method for a three-dimensional model of the invention comprises:

acquiring a real-time video of a real object;

identifying actions of the real object in the real-time video images;

forming action control instructions for the corresponding 3D model according to changes in the identified actions.

The real-time control method for a three-dimensional model of the invention, in a head-and-face embodiment, comprises:

acquiring real-time video of the head and face of a real object;

locating the face region using a low-resolution copy of a video frame image;

applying the face region directly on corresponding copies of adjacent frame images;

identifying face key points in the face region of the frame image or its corresponding copy;

using the position-fixed key points of the front-view 2D face in the image to establish a head-orientation reference pattern, a face reference plane, and a face reference pattern on that plane, forming a coordinate mapping with a front-view 3D head model;

measuring the deformation of the head-orientation reference pattern relative to the face reference pattern as the head rotates between adjacent frame images, to obtain head rotation data;

combining the position changes of the 2D face key points in adjacent frames with the head rotation data to form control instructions for head and facial actions and expressions.

The real-time control system for a three-dimensional model of the invention comprises:

a video acquisition device, configured to acquire a real-time video of a real object;

an image identification device, configured to identify actions of the real object in the real-time video images;

an action instruction generation device, configured to form action control instructions for the corresponding 3D model according to changes in the identified actions.

The real-time control method for a three-dimensional model of the invention forms action control instructions for controlling a 3D model by recognizing, in the acquired real-time video, a real object and the changes in its actions. As abstract data with concrete meaning, the action control instructions are small in volume and require little bandwidth for real-time transmission, which ensures real-time delivery in a mobile Internet environment. The method avoids the latency of transmitting over the mobile Internet the large volume of video data produced by real-time rendering of the 3D model, and the resulting stuttering of VR video playback, by letting the 3D model's rendering and the generation of its control be completed at the two ends of the mobile Internet environment: one end uses a mobile terminal with limited hardware resources to recognize and capture the real object's action changes and form instructions, while the other end uses the mobile Internet environment to download, load, and activate the necessary 3D models and scenes; the 3D model performs the real object's corresponding actions via control instructions transmitted in real time, producing the corresponding model rendering and scene rendering for a VR live broadcast.

The real-time control system for a three-dimensional model of the invention can be deployed on resource-limited mobile terminals in a mobile Internet environment, using the terminal's limited processing and camera capabilities to concentrate on processing the real object's action changes, efficiently obtaining its accurate action states, and forming control instructions based on those changes. The control instructions can exert accurate real-time motion control over any matching 3D model, faithfully expressing the real object's real-time actions in the 3D model. Motion control of the 3D model thus need not be fused into the video of the real object, and motion simulation of the real object is no longer limited by the restricted bandwidth of the mobile Internet environment.
Brief Description of the Drawings

FIG. 1a is a process flowchart of an embodiment of the real-time control method for a three-dimensional model of the present invention.

FIG. 1b is a process flowchart of an embodiment of the real-time control method for a three-dimensional model of the present invention.

FIG. 2 is a flowchart of action recognition in an embodiment of the real-time control method for a three-dimensional model of the present invention.

FIG. 3 is a flowchart of one embodiment of facial expression recognition in an embodiment of the real-time control method for a three-dimensional model of the present invention.

FIG. 4 is a flowchart of another embodiment of facial expression recognition in an embodiment of the real-time control method for a three-dimensional model of the present invention.

FIG. 5 is a flowchart of another embodiment of facial expression recognition in an embodiment of the real-time control method for a three-dimensional model of the present invention.

FIG. 6 is a flowchart of an embodiment of head action recognition and facial expression recognition in an embodiment of the real-time control method for a three-dimensional model of the present invention.

FIG. 7 is a flowchart of the synchronization of control instructions and audio data in an embodiment of the real-time control method for a three-dimensional model of the present invention.

FIG. 8 is a schematic diagram of the control effect of an embodiment of the real-time control method for a three-dimensional model of the present invention.

FIG. 9 is a schematic structural diagram of an embodiment of the real-time control system for a three-dimensional model of the present invention.

FIG. 10 is a schematic structural diagram of image recognition in an embodiment of the real-time control system for a three-dimensional model of the present invention.

FIG. 11 is a schematic structural diagram of single-frame object and key point recognition in an embodiment of the real-time control system for a three-dimensional model of the present invention.

FIG. 12 is a schematic structural diagram of object recognition in consecutive frames in an embodiment of the real-time control system for a three-dimensional model of the present invention.

FIG. 13 is a schematic structural diagram of head and facial action recognition in an embodiment of the real-time control system for a three-dimensional model of the present invention.
Detailed Description of Embodiments

The technical solutions in the embodiments of the present invention are described below clearly and completely with reference to the accompanying drawings. The described embodiments are only some, not all, of the embodiments of the present invention. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments of the present invention without creative effort fall within the protection scope of the present invention.

The step numbers in the drawings serve only as reference signs for the steps and do not indicate an order of execution.
FIG. 1a is a flowchart of a real-time control method for a three-dimensional model according to an embodiment of the present invention; the method is a control process completed independently by the content production end. As shown in FIG. 1a, the method includes:

Step 100: Acquire a real-time video of a real object.

The above real object includes a complete human body, or a limb, the head, or the face of a human body, and correspondingly includes limb actions, head actions, and facial actions (expressions).

Step 200: Identify actions of the real object in the real-time video images.

The above identification includes recognition of the real object, locating the recognized real object, locating the recognized real object's actions, and locating changes in those actions, for example the capture (marking) and analysis (recognition) of limb or head movements, or the capture (marking) and analysis (recognition) of facial expressions.

Step 300: Form action control instructions for the corresponding 3D model according to changes in the identified actions.

The above change (in an identified action) is a change in the located states of the start and end of the recognized real object's action; the change is measurable or quantifiable.

The above corresponding 3D model is a 3D model forming the VR counterpart of the real object, for example a limb model, a head model, or a face model.

The real-time control method for a three-dimensional model of the present invention forms action control instructions for controlling a 3D model by recognizing, in the acquired real-time video, a real object and the changes in its actions. As abstract data with concrete meaning, the action control instructions are small in volume and require little bandwidth for real-time transmission, which ensures real-time delivery in a mobile Internet environment.

The above steps are completed independently by the content production end, and the resulting action control instructions can be buffered or saved as data. The content consumption end only needs to call the acquired corresponding 3D model and control it according to the received action control instructions for the 3D model to perform the corresponding actions.

In the real-time control method for a three-dimensional model of another embodiment of the present invention, when the system also has audio data that must be transmitted simultaneously, the method may further include, as shown in FIG. 1a:

Step 400: Synchronize the audio data with the action control instructions, and output them.

The above synchronization means that the action control instructions and the audio data within a unit of time are given the same reference point, reference tag, or time stamp, so that execution of the action control instructions and output of the audio data can be combined to form synchronized playback.

The above step synchronizes the audio data accompanying the real object's actions with the continuous action control instructions on the time axis, overcoming the desynchronization caused by processing delays during data processing.
Fig. 1b shows the real-time control method of an embodiment of the present invention in which the content consumption end uses motion control instructions to control the 3D model. As shown in Fig. 1b, the method includes:
Step 500: invoke the obtained corresponding 3D model;
Step 600: control the corresponding 3D model to perform movements according to the received motion control instructions.
When the received information includes, besides the motion control instructions, accompanying audio data, then in order to match the 3D model movements formed by the motion control instructions accurately with that audio, step 600 may include:
a step of receiving the audio data and the motion control instructions;
a step of buffering the audio data and the motion control instructions;
a step of pairing the audio data with the motion control instructions;
a step of playing the audio in sync while the corresponding 3D model performs the movements.
The buffering overcomes the data delays caused by multipath transmission over the mobile Internet.
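As a minimal sketch of such a consumer-side buffer, the following assumes each control instruction and each audio chunk arrives as a record carrying a shared time label "ts" in milliseconds (the labeling is described later in connection with Fig. 7); the class and field names are illustrative assumptions, not the patent's concrete format.

    import collections

    class SyncBuffer:
        """Buffers control instructions and audio chunks, releasing matched pairs."""
        def __init__(self, tolerance_ms=40):
            self.instructions = collections.deque()
            self.audio = collections.deque()
            self.tolerance_ms = tolerance_ms  # one frame at 25 fps

        def push_instruction(self, instr):
            self.instructions.append(instr)

        def push_audio(self, chunk):
            self.audio.append(chunk)

        def pop_matched(self):
            # Release the oldest instruction/audio pair whose time labels agree
            # within the tolerance; drop whichever stream has fallen behind.
            while self.instructions and self.audio:
                dt = self.instructions[0]["ts"] - self.audio[0]["ts"]
                if abs(dt) <= self.tolerance_ms:
                    return self.instructions.popleft(), self.audio.popleft()
                if dt > 0:
                    self.audio.popleft()          # stale audio chunk
                else:
                    self.instructions.popleft()   # stale instruction
            return None

The buffer absorbs the arrival jitter of the two streams; playback simply polls pop_matched() once per frame.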
With the real-time control method of this embodiment, the content production end can capture continuous real-time video with a mobile terminal device, recognize the principal real-world object in it, locate the object's motion, mark the motion changes, and turn the marked motion-change data into a continuous stream of motion control instructions.
The content consumption end then completes the motion control of the corresponding 3D model through those motion control instructions. Compared with the VR video data produced by rendering the 3D model, the volume of instruction data generated at the production end is greatly reduced, which better suits real-time transmission in the mobile Internet environment and safeguards the quality of VR live broadcast.
The content production end and content consumption end may be deployed on different devices or multimedia terminals in a local network, or on different devices or multimedia terminals at the two ends of the mobile Internet; one production end may serve multiple consumption ends deployed in the local network or at the far end of the mobile Internet.
Fig. 2 is a flowchart of motion identification in the real-time control method of an embodiment of the present invention. As shown in Fig. 2, step 200 of Fig. 1a includes the following steps:
Step 201: recognize the real-world object in the images of the real-time video according to a preset object recognition strategy;
Step 202: identify key points of the real-world object in the images according to a preset key point identification strategy.
Changes in the positions (coordinates) of the key points can reflect fine-grained motion changes of a specific object: positional changes of the facial features (the key points of the head) reflect head movement, positional changes of the joints (the key points of the limbs) reflect torso movement, and positional changes of the mouth corners, eyebrow tips and mouth shape (the key points of the face) reflect facial expression.
Step 203: establish the planar coordinate space of the key points and the solid coordinate space of the corresponding 3D model;
Step 204: measure the coordinate changes of the key points in the planar coordinate space across consecutive images, and record the corresponding coordinate changes of the key points in the solid coordinate space.
The real-time control method of this embodiment uses the object recognition strategy to recognize specific objects in the image, such as limbs, a head or a face, and the key point identification strategy to identify those key points of the specific object that are most closely tied to motion change. By establishing an initial mapping between the planar coordinate system of the 2D real object in the image and the solid coordinate system of the corresponding 3D model, key point position changes observed in the 2D image can be converted into key point position changes of the corresponding 3D model.
In this case, the coordinate changes of the key points are formed into the motion control instructions of the real object's corresponding 3D model.
Specifically, the coordinate differences of the same real object's key points across consecutive images can serve as the parameters carried by the 3D model's motion control instructions, describing the real object's motion. Abstract, narrow-band coordinate data thus forms the control instructions that drive the 3D model's movements, which in turn produce the rendered wide-band VR video; VR live broadcast is no longer limited by transmission bandwidth and is generated in real time directly at the content consumption end.
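As an illustration of how narrow such instruction data can be, the sketch below packs the per-key-point deltas between two consecutive frames into a compact record; the function and field names are illustrative assumptions, not the patent's concrete instruction format.

    import numpy as np

    def make_instruction(frame_index, prev_pts, curr_pts):
        """Pack per-key-point 2D deltas into a compact control instruction.

        prev_pts, curr_pts: (N, 2) arrays of key point coordinates in the
        planar coordinate space of two consecutive frames.
        """
        deltas = (np.asarray(curr_pts, dtype=np.float32)
                  - np.asarray(prev_pts, dtype=np.float32))
        return {
            "frame": frame_index,       # reference label for synchronization
            "deltas": deltas.tolist(),  # the measurable motion change
        }

For a typical set of 68 face key points this is on the order of 68 x 2 x 4 bytes of payload per frame, orders of magnitude less than a rendered video frame.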
Fig. 3 is a flowchart of one embodiment of facial expression identification in the real-time control method of the present invention. When the real-world object in the image is a face, the method for identifying the face and the face key points within one frame image, shown in Fig. 3, includes:
Step 221: acquire one original frame image M0 of the real-time video;
Step 222: generate, at decreasing sampling rates, a set of original-image copies of correspondingly decreasing resolution: M1, M2, ..., Mm-i, ..., Mm-1, Mm;
Step 223: with the number of copies m as the loop count, starting from the lowest-resolution copy Mm, perform face region calibration in the copies in order (using the face object recognition strategy);
Step 224: judge whether face region calibration has been completed in a copy; if not, return to step 223 and continue calibration in the next copy; if so, execute step 225; if the loop over all m copies ends without completing face region calibration, execute step 227;
Step 225: mark the corresponding copy Mm-i and form the face region calibration data;
Step 226: using the face region calibration data combined with the corresponding sampling rates, complete face region calibration in the subsequent copies (Mm-i, ..., M2, M1) and the original image M0;
Step 227: complete face region calibration using the original image M0.
Face region calibration is completed through the above steps.
As a further refinement of the above calibration steps, a set of original-image copies of correspondingly decreasing resolution can be generated at decreasing sampling rates, the lowest-resolution copy in which face region calibration succeeds taken from among them, and the face region calibration data formed from it.
The face key point calibration step includes:
Step 228: perform face key point calibration in the face regions calibrated in copy Mm-i, and/or the subsequent copies (Mm-i, ..., M2, M1), and/or the original image M0, forming face key point calibration data of differing precision. In one embodiment of the present invention, the face key point identification strategy may be used for the calibration.
In the real-time control method of this embodiment, as the normal mode, subsampling the original image yields a set of copies of progressively lower resolution, so that face region recognition, the strategy that consumes the most processing resources and causes the most processing delay, completes as quickly as possible in a lower-precision copy, saving processing resources. The face region calibration data so obtained is then combined with each copy's sampling rate to complete face region calibration quickly in the higher-resolution copies and the original image, yielding high-precision face region calibration and the corresponding calibration data; meanwhile key point calibration, which consumes few processing resources, is performed in every face-calibrated copy and in the original image, yielding face key point calibration data of higher precision. The method of this embodiment thus provides face region and face key point calibration for different precision requirements.
The face region calibration data of a copy is coordinate data; with the corresponding sampling rate serving as the scale factor relative to the original image, the calibration data of one copy can be mapped quickly and accurately to the corresponding positions of another copy or of the original image to complete face region calibration there.
Those skilled in the art will appreciate that, as a fast mode, once step 224 completes face region calibration in copy Mm-i, step 228's face key point calibration can be run directly on the face region calibrated in Mm-i to form the face key point calibration data, achieving the optimal processing rate for face region and key point calibration of one frame image.
The face region and key point calibration data of the original image M0 help stabilize face key point calibration and suit a high-precision mode. On the other hand, since consecutive frames captured by the camera of a mobile device such as an iPhone differ subtly from one another, images subsampled by averaging are more stable, with smaller frame-to-frame differences; the face region and key point calibration data of copy Mm-i therefore improve the algorithm's stability and suit a stability mode.
In the real-time control method of this embodiment, face region and face key point calibration are processed fast enough to satisfy a real-time requirement of 25 frames per second (25 fps), enabling real-time recognition of motion or expression on mobile devices. By analyzing application scenarios such as anchor live broadcast, video calls and fast motion, and exploiting the area, region and displacement characteristics of real-world objects in video images, a highly real-time face detection and alignment method is achieved that can trade processing speed against processing precision. Under a given precision guarantee, the real-time control method of this embodiment significantly raises the processing speed of continuous face region recognition.
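A minimal sketch of the coarse-to-fine loop of steps 221-228 follows, using OpenCV's Haar cascade as a stand-in detector (the patent leaves the concrete face recognition strategy open); the level count, sampling ratio and all names are assumptions for illustration.

    import cv2

    def calibrate_face_region(frame, levels=4, ratio=0.5):
        """Coarse-to-fine face region calibration on a pyramid of copies.

        Detection runs on the lowest-resolution copy on which it succeeds
        (cheap), and the resulting rectangle is mapped back to the original
        image through the known scale factor (step 226).
        """
        detector = cv2.CascadeClassifier(
            cv2.data.haarcascades + "haarcascade_frontalface_default.xml")

        # M1..Mm: copies at decreasing sampling rates (decreasing resolution).
        copies, scales, img, scale = [], [], frame, 1.0
        for _ in range(levels):
            img = cv2.resize(img, None, fx=ratio, fy=ratio)
            scale *= ratio
            copies.append(img)
            scales.append(scale)

        # Steps 223-224: loop from the lowest-resolution copy Mm upward.
        for img, s in zip(reversed(copies), reversed(scales)):
            faces = detector.detectMultiScale(
                cv2.cvtColor(img, cv2.COLOR_BGR2GRAY))
            if len(faces) > 0:
                x, y, w, h = faces[0]
                # Step 226: map the region to original coordinates via 1/s.
                return tuple(int(v / s) for v in (x, y, w, h))

        # Step 227: fall back to the original image M0.
        faces = detector.detectMultiScale(cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY))
        return tuple(faces[0]) if len(faces) > 0 else None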
Fig. 4 is a flowchart of another embodiment of facial expression identification in the real-time control method of the present invention. It shows, built on the method for identifying the face and face key points within one frame image, a method for identifying face key points across consecutive frame images. As shown in Fig. 4, the method includes:
Step 231: from the face region calibration of one frame image of the real-time video, obtain the face region calibration data of the corresponding copy Mm-i and of the original image M0; this step may follow the procedure of steps 221 to 226;
Step 232: acquire the original images M0 and the corresponding copies Mm-i of the frame images over a subsequent continuous interval; then execute steps 233 and 234 respectively;
Step 233: use the face region calibration data of the copy Mm-i to complete face region calibration in the copies Mm-i of the subsequent frames;
Step 234: use the face region calibration data of the original image M0 to complete face region calibration in the original images M0 of the subsequent frames.
Those skilled in the art will appreciate that steps 233 and 234 have no fixed order of execution: step 234 may precede step 233, or the two may run in parallel.
Step 235: perform face key point calibration in the face regions calibrated in the copies Mm-i and the original images M0 of the subsequent frames, forming face key point calibration data of differing precision.
Exploiting the fact that in particular scenes the real-world object in real-time video does not move far, the real-time control method of this embodiment applies the previous frame's face region calibration data to the face region calibration of a limited number of subsequent images; while keeping the calibration stable, it further raises the speed of face region calibration and further reduces the processing resources the calibration consumes.
Fig. 5 is a flowchart of a further embodiment of facial expression identification in the real-time control method of the present invention. It shows, built on the method for identifying face key points within one frame image, another method for identifying the face and face key points across consecutive frame images. As shown in Fig. 5, the method includes:
Step 241: from the face region calibration of one frame image of the real-time video, obtain the face region calibration data of the corresponding copy Mm-i or of the original image M0; this step may follow the procedure of steps 221 to 226;
Step 242: calibrate the face key points within the calibrated face region;
Step 243: form a bounding box from the contour of the face key points;
Step 244: use the enlarged bounding box as the next frame's face region, and perform face key point calibration within the bounding box;
Step 245: judge whether face key point calibration succeeded; if so, execute step 246; if not, go to step 241;
Step 246: form an updated bounding box from the face key point contour, scale the updated bounding box up proportionally, and return to step 244 to obtain the next frame's data.
The real-time control method of this embodiment uses the contour (bounding box) of the key points determined in the previous frame as the face region calibration data of the next frame image, i.e., the previous frame's result serves as the next frame's initial value and predicts the next frame. When the face is not moving violently, this algorithm runs very fast and consumes minimal processing resources. When the face is moving violently, for example when a live-broadcast anchor is dancing or whipping the head around, it runs at roughly the speed of the normal algorithm.
Appropriately enlarging the bounding box widens the face region calibration range, so that while the face is not moving violently the time-consuming face region detection need not run on every frame, which improves the overall real-time performance of the algorithm. If the face key point calibration of this embodiment fails to produce a correct result, the face has probably moved violently between the two frames; face detection is then rerun to obtain the face's new position, and key point calibration is redone.
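A minimal sketch of the tracking loop of steps 241-246 follows; detect_face and fit_keypoints are placeholders for the detection and alignment strategies the patent leaves open, and the expansion factor is an assumption.

    def track_keypoints(frames, detect_face, fit_keypoints, expand=1.5):
        """Frame-to-frame key point tracking via an expanded bounding box.

        detect_face(frame) -> (x, y, w, h) or None        (slow, runs rarely)
        fit_keypoints(frame, region) -> (N, 2) array or None (fast, every frame)
        """
        region = None
        for frame in frames:
            if region is None:                    # step 241: full detection
                region = detect_face(frame)
                if region is None:
                    yield None
                    continue
            pts = fit_keypoints(frame, region)    # step 244
            if pts is None:                       # step 245: violent motion
                region = None                     # redo detection next frame
                yield None
                continue
            # Steps 243/246: bounding box of the key point contour, scaled up.
            x0, y0 = pts.min(axis=0)
            x1, y1 = pts.max(axis=0)
            cx, cy = (x0 + x1) / 2, (y0 + y1) / 2
            w, h = (x1 - x0) * expand, (y1 - y0) * expand
            region = (int(cx - w / 2), int(cy - h / 2), int(w), int(h))
            yield pts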
Facial expression capture in video images involves the face region recognition and calibration process and the calibration of face key point positions (such as the facial features). Generic image processing for video, such as image copying, subsampled image formation, image scaling, building coordinate mappings between similar images, aligning and translating identical or similar regions of different images, and other coordinate-based two- or three-dimensional angular transforms and warps, is not described in detail in this embodiment.
Fig. 6 is a flowchart of one embodiment of head motion identification together with facial expression identification in the real-time control method of the present invention. Given the unity of head and face, it shows, for the case where the real-world object in the image is a head, a method for identifying head motion across consecutive frame images built on the single-frame face key point identification method. As shown in Fig. 6, the method includes:
Step 251: from the face region calibration of the frontal face in the real-time video images, calibrate the 2D key points of the frontal face, and form the head-orientation reference pattern from those key points whose relative positions are fixed; go to step 254;
Step 253: from the fixed-relative-position 2D key points of the frontal face, form the face reference plane and the face reference pattern on that plane; execute step 254;
Step 254: in adjacent frame images of the real-time video, the key-point-calibrated 2D face forms a perspective projection on the face reference plane; from the deformation of the head-orientation reference pattern on the 2D face (obtained in step 251) relative to the face reference pattern on the face reference plane (obtained in step 253), obtain the head's Euler rotation data or quaternion rotation data.
The Euler rotation data comprises the head's rotation angles about the three axes x, y and z.
The Euler rotation data can be converted into quaternion rotation data, for more efficient processing of rotation states and smooth interpolation during rotation.
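As an illustration of that conversion, the sketch below maps the three rotation angles to a unit quaternion, assuming the common yaw-pitch-roll (z-y-x) axis convention; the patent does not fix a particular convention.

    import math

    def euler_to_quaternion(yaw, pitch, roll):
        """Convert head rotation about z (yaw), y (pitch) and x (roll),
        in radians, to a unit quaternion (w, x, y, z).

        Quaternions avoid gimbal lock and allow smooth spherical
        interpolation (slerp) between successive head poses.
        """
        cy, sy = math.cos(yaw / 2), math.sin(yaw / 2)
        cp, sp = math.cos(pitch / 2), math.sin(pitch / 2)
        cr, sr = math.cos(roll / 2), math.sin(roll / 2)
        w = cr * cp * cy + sr * sp * sy
        x = sr * cp * cy - cr * sp * sy
        y = cr * sp * cy + sr * cp * sy
        z = cr * cp * sy - sr * sp * cy
        return (w, x, y, z)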
The real-time control method of this embodiment uses those key points of the frontal 2D (planar) face that keep a fixed spacing (for example, the two eyes and the nose tip) to form the head-orientation reference pattern (for example, a polygon whose vertices are the eyes and the nose tip), forms the face reference plane and the face reference pattern at the same time, and exploits the projective coincidence of the frontal 2D (planar) face key points with the frontal 3D (solid) face key points to establish the mapping between 2D and 3D face key point coordinates. Lifting the 2D face key point coordinates into 3D and forming the mapping lets positional changes of the 2D face key points be reflected accurately in the 3D face model (including the integral head model).
By comparing the deformation angles and deformation lengths of the lines of the head-orientation reference pattern, relative to the face reference pattern on the face reference plane, as the head turns, the head's rotation angles about the three axes x, y and z are obtained for Euler rotation or quaternion rotation.
This means that the coordinate changes of the face key points capture both the key point movement of changing facial expression and the coordinate change of head rotation across coordinate spaces. Through the real-time control method of this embodiment, such coordinate changes become the basis for controlling the 3D model.
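The embodiment measures the deformation of the reference pattern directly; an equivalent and widely used formulation recovers the same three rotation angles by fitting a perspective projection of fixed 3D reference points to their observed 2D positions, e.g. with OpenCV's solvePnP. The sketch below takes that route; the 3D reference coordinates and the pinhole intrinsics are rough illustrative assumptions.

    import math
    import cv2
    import numpy as np

    # Rough frontal 3D positions of four expression-invariant key points,
    # in arbitrary face-model units (assumed values, not from the patent).
    MODEL_3D = np.array([
        [  0.0,  0.0,   0.0],   # nose tip
        [  0.0, 50.0, -30.0],   # nose bridge
        [-65.0, 70.0, -50.0],   # left eye outer corner
        [ 65.0, 70.0, -50.0],   # right eye outer corner
    ], dtype=np.float64)

    def head_euler_from_face(pts_2d, frame_w, frame_h):
        """Recover head rotation about x, y, z (degrees) from the observed
        2D positions of the four rigid points above."""
        cam = np.array([[frame_w, 0, frame_w / 2],
                        [0, frame_w, frame_h / 2],
                        [0, 0, 1]], dtype=np.float64)
        ok, rvec, tvec = cv2.solvePnP(MODEL_3D,
                                      np.asarray(pts_2d, np.float64),
                                      cam, None, flags=cv2.SOLVEPNP_EPNP)
        if not ok:
            return None
        r, _ = cv2.Rodrigues(rvec)            # rotation vector -> 3x3 matrix
        pitch = math.atan2(r[2, 1], r[2, 2])  # about x
        yaw = math.atan2(-r[2, 0], math.hypot(r[2, 1], r[2, 2]))  # about y
        roll = math.atan2(r[1, 0], r[0, 0])   # about z
        return tuple(math.degrees(a) for a in (pitch, yaw, roll))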
As shown in Fig. 6, when the real-world object is a head, forming the motion control instructions of the corresponding head and face 3D model includes:
Step 226: using the face region calibration data combined with the corresponding sampling rates, complete face region calibration in the subsequent copies (Mm-i, ..., M2, M1) and the original image M0;
Step 242: calibrate the face key points within the calibrated face region;
Step 252: from the 2D key points of the frontal face in the real-time video images, form the frontal triangular mesh of the corresponding 3D head model, establishing the coordinate mapping between the face's 2D key points and the 3D head model's 3D key points;
Step 311: from the obtained face key points, rotation angles and coordinate mapping, use the per-key-point coordinate changes of the 2D face across consecutive frame images and the head's Euler or quaternion rotation data obtained in step 254 to form the inter-frame face key point movement parameters and the head's direction of rotation;
Step 312: package the key point movement parameters and the head's direction of rotation into the frame's control instructions for the 3D model's head and face.
In one embodiment of the present invention, to handle the deforming effect of head rotation on facial expression, the 2D key points are first lifted into 3D key points and then projected back down to 2D to generate the 2D control points; this control method effectively solves the recognition and expression of fine facial expressions while the head is turned at an angle. When the real-world object faces the camera without turning the head, the rotation angle can be taken as 0 degrees and the same processing applied uniformly.
The three-dimensional (3D) modeling process reflected in the prior art, i.e., modeling with modeling tools under generic modeling rules, including the building of 3D models and 3D scenes, the transmission, storage and downloading of 3D models, and the deployment of 3D models within 3D scenes, is not described in detail here. A cartoon avatar's 3D model usually comprises 3D models of the torso and the head, the head's 3D model further including a 3D model of the face; these 3D models can be stored, transmitted and controlled separately. Likewise, the process of forming a fine 3D mesh expressing relief texture on a model surface in a 3D scene and changing the model's local shape by adjusting the spatial coordinates of the corresponding mesh vertices is not described in detail.
Fig. 7 is a flowchart of the synchronization of control instructions and audio data in an embodiment of the real-time control method of the present invention. As shown in Fig. 7, step 400 of Fig. 1a may include:
Step 421: add time labels (or timestamps), frame by frame, to the control instructions of the 3D model's head;
Step 422: according to the control instructions' time labels, add corresponding time labels (or timestamps) to the audio data;
Step 423: adapt the control instruction and audio data signals to the transmission link and output them in real time.
In one embodiment of the present invention, the transmission mechanism of the mobile Internet prevents the content consumption end from receiving the control instructions and audio data in exact synchrony; in that case, an appropriately sized buffer can relax the requirement on synchronized reception, so that the matching time labels restore the synchronized output of control instructions and audio data, safeguarding the audio-video synchronization quality of the VR live broadcast.
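A minimal sketch of steps 421-423 follows, assuming audio is chunked one chunk per video frame and that a "ts" field carries the time label; both are illustrative assumptions.

    def label_streams(instructions, audio_chunks, fps=25):
        """Stamp each per-frame control instruction with a frame-derived
        time label (step 421) and give the co-generated audio chunk the
        same label (step 422), so the consumer can realign both streams
        after network jitter."""
        frame_ms = 1000 // fps                 # 40 ms per frame at 25 fps
        for i, (instr, chunk) in enumerate(zip(instructions, audio_chunks)):
            instr["ts"] = i * frame_ms         # step 421
            chunk["ts"] = i * frame_ms         # step 422
            yield instr, chunk                 # step 423: hand to the link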
Fig. 8 is a schematic diagram of the control effect of an embodiment of the real-time control method of the present invention. As shown in Fig. 8, with a human face as the real-world object, the face region and the positional changes of the key points inside it are identified across consecutive video images; the change amounts form the variation parameters of facial movement and expression, and thence the continuous motion control instructions of facial expression, which drive the corresponding key points on the face 3D model of the matching cartoon 3D model, producing the cartoon face 3D model's facial expression in real time.
In summary, in one embodiment of the present invention, the basic steps of face region identification in a real-time control method for three-dimensional models mainly include:
locating the face region through low-resolution copies of the video's frame images, to raise the speed of face region identification;
applying the face region directly to the corresponding copies of adjacent frame images, to raise the speed of face region identification further;
identifying face key points in the face region of the frame image or its copies, to suit different application modes.
In summary, in one embodiment of the present invention, the basic steps of head rotation identification in a real-time control method for three-dimensional models mainly include:
using the fixed-position key points of the frontal 2D face in the frame image or its copies to build the head-orientation reference pattern, the face reference plane and the face reference pattern on that plane, so that the face key points of the frontal 3D head model can be placed in coordinate mapping with the 2D face key points;
obtaining the head rotation angle by measuring the deformation of the head-orientation reference pattern relative to the face reference pattern as the head turns between adjacent frame images;
combining the positional changes of the 2D face key points across adjacent frames with the change in head rotation angle to form the control instructions of head and facial movements and expressions.
Fig. 9 is a schematic structural diagram of an embodiment of the real-time control system of the present invention. As shown in Fig. 9, it includes a video acquisition device 10, an image identification device 20 and a motion instruction generation device 30, in which:
the video acquisition device 10 acquires real-time video of a real-world object;
the image identification device 20 identifies the motion of the real-world object in the real-time video images;
the motion instruction generation device 30 generates motion control instructions for the corresponding 3D model according to changes in the identified motion.
The real-time control system of one embodiment of the present invention further includes a synchronized output device 40 for synchronizing the audio data with the motion control instructions and outputting them.
The real-time control system of one embodiment of the present invention further includes an activation device 80 and a playback device 90, in which:
the activation device 80 invokes the obtained corresponding 3D model;
the playback device 90 controls the corresponding 3D model to perform movements according to the received motion control instructions.
In the real-time control system of one embodiment of the present invention, the playback device 90 further includes a receiving device 91, a buffering device 92, a synchronizing device 93 and an audio playback device 94, in which:
the receiving device 91 receives the audio data and the motion control instructions;
the buffering device 92 buffers the audio data and the motion control instructions;
the synchronizing device 93 pairs the audio data with the motion control instructions;
the audio playback device 94 controls the corresponding 3D model to perform movements while playing the audio in sync.
Fig. 10 is a schematic structural diagram of image identification in an embodiment of the real-time control system of the present invention. As shown in Fig. 10, the image identification device 20 includes an object recognition device 21, an object key point identification device 22, an object position coordinate establishment device 23 and an object motion change recording device 24, in which:
the object recognition device 21 recognizes the real-world object in the images of the real-time video according to a preset object recognition strategy;
the object key point identification device 22 identifies key points of the real-world object in the images according to a preset key point identification strategy;
the object position coordinate establishment device 23 establishes the planar coordinate space of the key points and the solid coordinate space of the corresponding 3D model;
the object motion change recording device 24 measures the coordinate changes of the key points in the planar coordinate space across consecutive images and records the corresponding coordinate changes of the key points in the solid coordinate space.
As shown in Fig. 10, the motion instruction generation device 30 includes a motion conversion device 31 for forming the coordinate changes of the key points into the motion control instructions of the real-world object's corresponding 3D model.
Fig. 11 is a schematic structural diagram of single-frame object and key point identification in an embodiment of the real-time control system of the present invention. As shown in Fig. 11, it includes an original image capture device 41, an image copy generation device 42, a copy loop calibration device 43, a region calibration judgment device 44, a copy region calibration device 45, a general region calibration device 46, a fallback region calibration device 47 and a key point calibration device 48, in which:
the original image capture device 41 acquires one original frame image M0 of the real-time video;
the image copy generation device 42 generates, at decreasing sampling rates, a set of original-image copies of correspondingly decreasing resolution: M1, M2, ..., Mm-i, ..., Mm-1, Mm;
the copy loop calibration device 43, with the number of copies m as the loop count and starting from the lowest-resolution copy Mm, performs face region calibration in the copies in order (using the face object recognition strategy);
the region calibration judgment device 44 judges whether face region calibration has been completed in a copy; if not, it invokes the copy loop calibration device 43 to continue the next calibration loop; if so, it invokes the copy region calibration device 45; if the loop ends without completing face region calibration, it invokes the fallback region calibration device 47;
the copy region calibration device 45 marks the corresponding copy Mm-i and forms the face region calibration data;
the general region calibration device 46 uses the face region calibration data combined with the corresponding sampling rates to complete face region calibration in the subsequent copies (Mm-i, ..., M2, M1) and the original image M0;
the fallback region calibration device 47 completes face region calibration with the original image M0 when the loop ends without completing it;
the key point calibration device 48 performs face key point calibration (using the face key point identification strategy) in the face regions calibrated in copy Mm-i, the subsequent copies (Mm-i, ..., M2, M1) and the original image M0, forming face key point calibration data of differing precision.
Fig. 12 is a schematic structural diagram of object identification across consecutive frames in an embodiment of the real-time control system of the present invention. As shown in Fig. 12, it includes a face region calibration device 51, a consecutive-frame processing device 52, a consecutive-frame region calibration device 53, a copy region calibration judgment device 54 and an original region calibration device 55, in which:
the face region calibration device 51 obtains (through the general region calibration device 46) the face region calibration data of the corresponding copy Mm-i and of the original image M0;
the consecutive-frame processing device 52 acquires the original images M0 and the corresponding copies Mm-i of the frame images over a subsequent continuous interval;
the consecutive-frame region calibration device 53 uses the face region calibration data of the original image M0 to complete face region calibration in the original images M0 of the subsequent frames;
the copy region calibration judgment device 54 uses the face region calibration data of the copy Mm-i to complete face region calibration in the copies Mm-i of the subsequent frames;
the original region calibration device 55 performs face key point calibration in the face regions calibrated in the copies Mm-i and/or the original images M0 of the subsequent frames, forming face key point calibration data of differing precision.
As shown in Fig. 12, it further includes a face key point calibration device 62, a key point contour generation device 63, an adjacent-frame key point calibration device 64, an adjacent-frame calibration judgment device 65 and a key point contour update device 66, in which:
the face key point calibration device 62 calibrates the face key points in the face region calibrated in the obtained corresponding copy Mm-i or original image M0;
the key point contour generation device 63 forms a bounding box from the contour of the face key points;
the adjacent-frame key point calibration device 64 uses the enlarged bounding box as the next frame's face region and performs face key point calibration within the enlarged bounding box;
the adjacent-frame calibration judgment device 65 judges whether face key point calibration succeeded; if so, it invokes the key point contour update device 66; if not, it invokes the face key point calibration device 62;
the key point contour update device 66 forms an updated bounding box from the face key point contour, scales the updated bounding box up proportionally, and then invokes the adjacent-frame key point calibration device 64.
Fig. 13 is a schematic structural diagram of head and facial motion identification in an embodiment of the real-time control system of the present invention. As shown in Fig. 13, it includes a head-orientation reference generation device 71, a coordinate mapping generation device 72, a face reference generation device 73 and a rotation angle measurement device 74, in which:
the head-orientation reference generation device 71, from the face region calibration of the frontal face in the real-time video images, calibrates the 2D key points of the frontal face and forms the head-orientation reference pattern from those key points whose relative positions are fixed;
the coordinate mapping generation device 72, from the 2D key points of the frontal face in the real-time video images, forms the frontal triangular mesh of the corresponding 3D head model, establishing the coordinate mapping between the face's 2D key points and the 3D head model's 3D key points;
the face reference generation device 73, from the fixed-relative-position 2D key points of the frontal face, forms the face reference plane and the face reference pattern on that plane;
the rotation angle measurement device 74 forms, in adjacent frame images of the real-time video, the perspective projection of the key-point-calibrated 2D face onto the face reference plane, and obtains the head's Euler or quaternion rotation data from the deformation of the head-orientation reference pattern on the 2D face relative to the face reference pattern on the face reference plane.
As shown in Fig. 13, in the real-time control system of one embodiment of the present invention, the structure that forms control instructions from head and face motion across consecutive video frames includes a head-face motion parameter generation device 32 and a control instruction generation device 33, in which:
the head-face motion parameter generation device 32 uses the per-key-point coordinate changes of the 2D face across consecutive frame images and the head's Euler or quaternion rotation data to form the inter-frame face key point movement parameters and the head's direction of rotation;
the control instruction generation device 33 packages the key point movement parameters and the head's direction of rotation into the frame's control instructions for the 3D model's head and face.
As shown in Fig. 13, in the real-time control system of one embodiment of the present invention, the structure for synchronizing the head-face motion control instructions of consecutive video frames with the audio data (the synchronized output device 40) includes an audio data synchronization device 35, a control instruction synchronization device 36 and a real-time output device 37, in which:
the audio data synchronization device 35 adds, according to the control instructions' time labels, corresponding time labels to the audio data;
the control instruction synchronization device 36 adds time labels, frame by frame, to the control instructions of the 3D model's head;
the real-time output device 37 adapts the control instruction and audio data signals to the transmission link and outputs them in real time.
The foregoing are only preferred embodiments of the present invention and are not intended to limit it; any modification, equivalent substitution or the like made within the spirit and principles of the present invention shall fall within its scope of protection.

Claims (28)

  1. A real-time control method for three-dimensional models, comprising:
    acquiring real-time video of a real-world object;
    identifying the motion of the real-world object in the real-time video images;
    generating motion control instructions for a corresponding 3D model according to changes in the identified motion.
  2. The real-time control method for three-dimensional models of claim 1, wherein identifying the motion of the real-world object in the real-time video images comprises:
    recognizing the real-world object in the images of the real-time video according to a preset object recognition strategy;
    identifying key points of the real-world object in the images according to a preset key point identification strategy;
    establishing a planar coordinate space for the key points and a solid coordinate space for the corresponding 3D model;
    measuring the coordinate changes of the key points in the planar coordinate space across consecutive images, and recording the corresponding coordinate changes of the key points in the solid coordinate space.
  3. The real-time control method for three-dimensional models of claim 2, wherein generating motion control instructions for the corresponding 3D model according to changes in the identified motion comprises:
    forming the coordinate changes of the key points into the motion control instructions of the real-world object's corresponding 3D model.
  4. The real-time control method for three-dimensional models of claim 2, wherein recognizing the real-world object in the images of the real-time video according to the preset object recognition strategy comprises:
    acquiring one original frame image M0 of the real-time video;
    generating, at decreasing sampling rates, a set of original-image copies of correspondingly decreasing resolution, obtaining from them the low-resolution copy in which face region calibration is completed, and forming the face region calibration data.
  5. The real-time control method for three-dimensional models of claim 4, wherein recognizing the real-world object in the images of the real-time video according to the preset object recognition strategy further comprises:
    when face region calibration is not completed in any of the original-image copies, completing face region calibration with the original image M0.
  6. The real-time control method for three-dimensional models of claim 4, wherein identifying key points of the real-world object in the images according to the preset key point identification strategy comprises:
    performing face key point calibration in the face regions calibrated in copy Mm-i, and/or the subsequent copies (Mm-i, ..., M2, M1), and/or the original image M0, forming face key point calibration data of differing precision.
  7. The real-time control method for three-dimensional models of claim 1, wherein the real-world object comprises limbs, a head or a face, and the identifying comprises capturing and analyzing limb or head movements, or capturing and analyzing facial expressions.
  8. The real-time control method for three-dimensional models of claim 2, wherein recognizing the real-world object in the images of the real-time video according to the preset object recognition strategy comprises:
    from the face region calibration of one frame image of the real-time video, obtaining the face region calibration data of the corresponding copy Mm-i and of the original image M0;
    acquiring the original images M0 and the corresponding copies Mm-i of the frame images over a subsequent continuous interval;
    using the face region calibration data of the copy Mm-i to complete face region calibration in the copies Mm-i of the subsequent frames; or,
    using the face region calibration data of the original image M0 to complete face region calibration in the original images M0 of the subsequent frames.
  9. The real-time control method for three-dimensional models of claim 2, wherein recognizing the real-world object in the images of the real-time video according to the preset object recognition strategy comprises:
    from the face region calibration of one frame image of the real-time video, obtaining the face region calibration data of the corresponding copy Mm-i or of the original image M0;
    calibrating the face key points within the calibrated face region;
    forming a bounding box from the contour of the face key points;
    using the enlarged bounding box as the next frame's face region, and performing face key point calibration within the bounding box.
  10. The real-time control method for three-dimensional models of claim 9, wherein recognizing the real-world object in the images of the real-time video according to the preset object recognition strategy further comprises:
    when face key point calibration is judged successful, forming an updated bounding box from the face key point contour and scaling it up proportionally;
    when face key point calibration is judged unsuccessful, obtaining the face region calibration data of the corresponding copy Mm-i or of the original image M0.
  11. The real-time control method for three-dimensional models of claim 2, wherein establishing the planar coordinate space of the key points and the solid coordinate space of the corresponding 3D model comprises:
    from the face region calibration of the frontal face in the real-time video images, calibrating the 2D key points of the frontal face, and forming a head-orientation reference pattern from those key points whose relative positions are fixed;
    from the 2D key points of the frontal face in the real-time video images, forming the frontal triangular mesh of the corresponding 3D head model, establishing the coordinate mapping between the face's 2D key points and the 3D head model's 3D key points;
    from the fixed-relative-position 2D key points of the frontal face, forming a face reference plane and the face reference pattern on that plane;
    in adjacent frame images of the real-time video, the key-point-calibrated 2D face forming a perspective projection on the face reference plane, and obtaining the head's Euler rotation data or quaternion rotation data from the deformation of the head-orientation reference pattern on the 2D face relative to the face reference pattern on the face reference plane.
  12. The real-time control method for three-dimensional models of claim 11, wherein measuring the coordinate changes of the key points in the planar coordinate space across consecutive images and recording the corresponding coordinate changes of the key points in the solid coordinate space comprises:
    using the per-key-point coordinate changes of the 2D face across consecutive frame images and the head's Euler or quaternion rotation data to form the inter-frame face key point movement parameters and the head's direction of rotation.
  13. The real-time control method for three-dimensional models of any one of claims 1 to 12, further comprising: synchronizing audio data with the motion control instructions and outputting them.
  14. The real-time control method for three-dimensional models of claim 13, wherein synchronizing the audio data with the motion control instructions and outputting them comprises:
    adding time labels, frame by frame, to the control instructions of the 3D model's head;
    according to the control instructions' time labels, adding corresponding time labels to the audio data;
    adapting the control instruction and audio data signals to the transmission link and outputting them in real time.
  15. The real-time control method for three-dimensional models of any one of claims 1 to 12, further comprising:
    invoking the obtained corresponding 3D model;
    controlling the corresponding 3D model to perform movements according to the received motion control instructions.
  16. A real-time control method for three-dimensional models, comprising:
    acquiring real-time video of the head and face of a real-world object;
    locating the face region using low-resolution copies of the video's frame images;
    applying the face region directly to the corresponding copies of adjacent frame images;
    identifying face key points in the face region of the frame image or its copies;
    using the fixed-position key points of the frontal 2D face in the image to build a head-orientation reference pattern, a face reference plane and the face reference pattern on that plane, and establishing a coordinate mapping to the frontal 3D head model;
    obtaining head rotation data by measuring, as the head turns between adjacent frame images, the deformation of the head-orientation reference pattern relative to the face reference pattern;
    combining the positional changes of the 2D face key points across adjacent frames with the head rotation data to form control instructions for head and facial movements and expressions.
  17. A real-time control system for three-dimensional models, comprising:
    a video acquisition device (10) for acquiring real-time video of a real-world object;
    an image identification device (20) for identifying the motion of the real-world object in the real-time video images;
    a motion instruction generation device (30) for generating motion control instructions for a corresponding 3D model according to changes in the identified motion.
  18. The real-time control system for three-dimensional models of claim 17, wherein the image identification device (20) comprises an object recognition device (21), an object key point identification device (22), an object position coordinate establishment device (23) and an object motion change recording device (24), in which:
    the object recognition device (21) is configured to recognize the real-world object in the images of the real-time video according to a preset object recognition strategy;
    the object key point identification device (22) is configured to identify key points of the real-world object in the images according to a preset key point identification strategy;
    the object position coordinate establishment device (23) is configured to establish the planar coordinate space of the key points and the solid coordinate space of the corresponding 3D model;
    the object motion change recording device (24) is configured to measure the coordinate changes of the key points in the planar coordinate space across consecutive images and to record the corresponding coordinate changes of the key points in the solid coordinate space.
  19. The real-time control system for three-dimensional models of claim 17, wherein the motion instruction generation device (30) comprises a motion conversion device (31) for forming the coordinate changes of the key points into the motion control instructions of the real-world object's corresponding 3D model.
  20. The real-time control system for three-dimensional models of claim 18, wherein the object recognition device (21) comprises an original image capture device (41), an image copy generation device (42) and a copy loop calibration device (43), in which:
    the original image capture device (41) is configured to acquire one original frame image M0 of the real-time video;
    the image copy generation device (42) is configured to generate, at decreasing sampling rates, a set of original-image copies of correspondingly decreasing resolution: M1, M2, ..., Mm-i, ..., Mm-1, Mm;
    the copy loop calibration device (43) is configured to perform, with the number of copies m as the loop count and starting from the lowest-resolution copy Mm, face region calibration in the copies in order, forming the face region calibration data.
  21. The real-time control system for three-dimensional models of claim 18, wherein the object key point identification device (22) comprises a key point calibration device (48) for performing face key point calibration in the face regions calibrated in copy Mm-i, the subsequent copies (Mm-i, ..., M2, M1) and the original image M0, forming face key point calibration data of differing precision.
  22. The real-time control system for three-dimensional models of claim 18, wherein the object recognition device (21) comprises a face region calibration device (51), a consecutive-frame processing device (52), a consecutive-frame region calibration device (53), a copy region calibration judgment device (54) and an original region calibration device (55), in which:
    the face region calibration device (51) is configured to obtain the face region calibration data of the corresponding copy Mm-i and of the original image M0;
    the consecutive-frame processing device (52) is configured to acquire the original images M0 and the corresponding copies Mm-i of the frame images over a subsequent continuous interval;
    the consecutive-frame region calibration device (53) is configured to use the face region calibration data of the original image M0 to complete face region calibration in the original images M0 of the subsequent frames;
    the copy region calibration judgment device (54) is configured to use the face region calibration data of the copy Mm-i to complete face region calibration in the copies Mm-i of the subsequent frames;
    the original region calibration device (55) is configured to perform face key point calibration in the face regions calibrated in the copies Mm-i and/or the original images M0 of the subsequent frames, forming face key point calibration data of differing precision.
  23. The real-time control system for three-dimensional models of claim 18, wherein the object recognition device (21) comprises a face key point calibration device (62), a key point contour generation device (63), an adjacent-frame key point calibration device (64), an adjacent-frame calibration judgment device (65) and a key point contour update device (66), in which:
    the face key point calibration device (62) is configured to calibrate the face key points in the face region calibrated in the obtained corresponding copy Mm-i or original image M0;
    the key point contour generation device (63) is configured to form a bounding box from the contour of the face key points;
    the adjacent-frame key point calibration device (64) is configured to use the enlarged bounding box as the next frame's face region and to perform face key point calibration within the bounding box;
    the adjacent-frame calibration judgment device (65) is configured to judge whether face key point calibration succeeded, invoking the key point contour update device (66) on success and the face key point calibration device (62) on failure;
    the key point contour update device (66) is configured to form an updated bounding box from the face key point contour, scale the updated bounding box up proportionally, and then invoke the adjacent-frame key point calibration device (64).
  24. The real-time control system for three-dimensional models of claim 18, wherein the object position coordinate establishment device (23) comprises a head-orientation reference generation device (71), a coordinate mapping generation device (72), a face reference generation device (73) and a rotation angle measurement device (74), in which:
    the head-orientation reference generation device (71) is configured to calibrate, from the face region calibration of the frontal face in the real-time video images, the 2D key points of the frontal face, and to form the head-orientation reference pattern from those key points whose relative positions are fixed;
    the coordinate mapping generation device (72) is configured to form, from the 2D key points of the frontal face in the real-time video images, the frontal triangular mesh of the corresponding 3D head model, establishing the coordinate mapping between the face's 2D key points and the 3D head model's 3D key points;
    the face reference generation device (73) is configured to form, from the fixed-relative-position 2D key points of the frontal face, the face reference plane and the face reference pattern on that plane;
    the rotation angle measurement device (74) is configured to form, in adjacent frame images of the real-time video, the perspective projection of the key-point-calibrated 2D face onto the face reference plane, and to obtain the head's Euler or quaternion rotation data from the deformation of the head-orientation reference pattern on the 2D face relative to the face reference pattern on the face reference plane.
  25. The real-time control system for three-dimensional models of claim 18, wherein the object position coordinate establishment device (23) comprises a head-face motion parameter generation device (32) and a control instruction generation device (33), in which:
    the head-face motion parameter generation device (32) is configured to use the per-key-point coordinate changes of the 2D face across consecutive frame images and the head's Euler or quaternion rotation data to form the inter-frame face key point movement parameters and the head's direction of rotation;
    the control instruction generation device (33) is configured to package the key point movement parameters and the head's direction of rotation into the frame's control instructions for the 3D model's head and face.
  26. The real-time control system for three-dimensional models of any one of claims 17 to 25, further comprising a synchronized output device (40) for synchronizing the audio data with the motion control instructions and outputting them.
  27. The real-time control system for three-dimensional models of claim 26, wherein the synchronized output device (40) comprises an audio data synchronization device (35), a control instruction synchronization device (36) and a real-time output device (37), in which:
    the audio data synchronization device (35) is configured to add, according to the control instructions' time labels, corresponding time labels to the audio data;
    the control instruction synchronization device (36) is configured to add time labels, frame by frame, to the control instructions of the 3D model's head;
    the real-time output device (37) is configured to adapt the control instruction and audio data signals to the transmission link and to output them in real time.
  28. The real-time control system for three-dimensional models of any one of claims 17 to 27, further comprising an activation device (80) and a playback device (90), in which:
    the activation device (80) is configured to invoke the obtained corresponding 3D model;
    the playback device (90) is configured to control the corresponding 3D model to perform movements according to the received motion control instructions.
PCT/CN2017/081376 2016-07-29 2017-04-21 Real-time control method and system for three-dimensional models WO2018018957A1 (zh)

Priority Applications (1)

Application Number Priority Date Filing Date Title
US16/261,482 US10930074B2 (en) 2016-07-29 2019-01-29 Method and system for real-time control of three-dimensional models

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201610619560.4A CN106251396B (zh) 2016-07-29 2016-07-29 三维模型的实时控制方法和系统
CN201610619560.4 2016-07-29

Related Child Applications (1)

Application Number Title Priority Date Filing Date
US16/261,482 Continuation US10930074B2 (en) 2016-07-29 2019-01-29 Method and system for real-time control of three-dimensional models

Publications (1)

Publication Number Publication Date
WO2018018957A1 true WO2018018957A1 (zh) 2018-02-01

Family

ID=57606112

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2017/081376 WO2018018957A1 (zh) 2016-07-29 2017-04-21 三维模型的实时控制方法和系统

Country Status (3)

Country Link
US (1) US10930074B2 (zh)
CN (1) CN106251396B (zh)
WO (1) WO2018018957A1 (zh)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111405361A * 2020-03-27 2020-07-10 咪咕文化科技有限公司 Video acquisition method, electronic device and computer-readable storage medium
CN113427486A * 2021-06-18 2021-09-24 上海非夕机器人科技有限公司 Robot arm control method and apparatus, computer device, storage medium and robot arm
CN115442519A * 2022-08-08 2022-12-06 珠海普罗米修斯视觉技术有限公司 Video processing method and apparatus, and computer-readable storage medium

Families Citing this family (29)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106251396B (zh) 2016-07-29 2021-08-13 迈吉客科技(北京)有限公司 Real-time control method and system for three-dimensional models
CN106993195A (zh) * 2017-03-24 2017-07-28 广州创幻数码科技有限公司 Virtual character live-broadcast method and system
CN107172040A (zh) * 2017-05-11 2017-09-15 上海微漫网络科技有限公司 Virtual character playback method and system
CN109309866B (zh) * 2017-07-27 2022-03-08 腾讯科技(深圳)有限公司 Image processing method and apparatus, and storage medium
CN107705365A (zh) * 2017-09-08 2018-02-16 郭睿 Editable three-dimensional human model creation method and apparatus, electronic device and computer program product
CN107613240A (zh) * 2017-09-11 2018-01-19 广东欧珀移动通信有限公司 Video picture processing method and apparatus, and mobile terminal
CN107750014B (zh) * 2017-09-25 2020-10-16 迈吉客科技(北京)有限公司 Co-streaming live broadcast method and system
CN108109189A (zh) * 2017-12-05 2018-06-01 北京像素软件科技股份有限公司 Motion sharing method and apparatus
CN108769802A (zh) * 2018-06-21 2018-11-06 北京密境和风科技有限公司 Method, apparatus and system for realizing online performance
JP2022500795A (ja) * 2018-07-04 2022-01-04 ウェブ アシスタンツ ゲーエムベーハー Avatar animation
TWI704501B (zh) * 2018-08-09 2020-09-11 宏碁股份有限公司 Electronic device operable by head movement and operation method thereof
CN109191593A (zh) * 2018-08-27 2019-01-11 百度在线网络技术(北京)有限公司 Motion control method, apparatus and device for virtual three-dimensional models
CN110942479B (zh) * 2018-09-25 2023-06-02 Oppo广东移动通信有限公司 Virtual object control method, storage medium and electronic device
CN113498530A (zh) * 2018-12-20 2021-10-12 艾奎菲股份有限公司 System and method for object dimensioning based on partial visual information
WO2020147791A1 (zh) * 2019-01-18 2020-07-23 北京市商汤科技开发有限公司 Image processing method and apparatus, image device and storage medium
WO2020147794A1 (zh) * 2019-01-18 2020-07-23 北京市商汤科技开发有限公司 Image processing method and apparatus, image device and storage medium
CN111460870A (zh) 2019-01-18 2020-07-28 北京市商汤科技开发有限公司 Target orientation determination method and apparatus, electronic device and storage medium
CN110264499A (zh) * 2019-06-26 2019-09-20 北京字节跳动网络技术有限公司 Interaction position control method and apparatus based on human key points, and electronic device
CN110536095A (zh) * 2019-08-30 2019-12-03 Oppo广东移动通信有限公司 Call method and apparatus, terminal and storage medium
US11532093B2 (en) 2019-10-10 2022-12-20 Intermap Technologies, Inc. First floor height estimation from optical images
CN111476871B (zh) * 2020-04-02 2023-10-03 百度在线网络技术(北京)有限公司 Method and apparatus for generating video
CN111541932B (zh) * 2020-04-30 2022-04-12 广州方硅信息技术有限公司 User image display method, apparatus, device and storage medium for a live-broadcast room
CN112019921A (zh) * 2020-09-01 2020-12-01 北京德火科技有限责任公司 Body motion data processing method applied to a virtual studio
CN112019922A (zh) * 2020-09-01 2020-12-01 北京德火科技有限责任公司 Facial expression data processing method applied to a virtual studio
US11551366B2 (en) * 2021-03-05 2023-01-10 Intermap Technologies, Inc. System and methods for correcting terrain elevations under forest canopy
CN113507627B (zh) * 2021-07-08 2022-03-25 北京的卢深视科技有限公司 Video generation method and apparatus, electronic device and storage medium
CN113989928B (zh) * 2021-10-27 2023-09-05 南京硅基智能科技有限公司 Motion capture and retargeting method
CN113965773A (zh) * 2021-11-03 2022-01-21 广州繁星互娱信息科技有限公司 Live-broadcast display method and apparatus, storage medium and electronic device
CN114554267B (zh) * 2022-02-22 2024-04-02 上海艾融软件股份有限公司 Audio and video synchronization method and apparatus based on digital twin technology

Family Cites Families (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6363380B1 (en) * 1998-01-13 2002-03-26 U.S. Philips Corporation Multimedia computer system with story segmentation capability and operating program therefor including finite automation video parser
CN101047434B (zh) * 2007-04-10 2010-09-29 华为技术有限公司 Time label synchronization method, system and apparatus
CN101271520A (zh) * 2008-04-01 2008-09-24 北京中星微电子有限公司 Method and apparatus for determining feature point positions in an image
CN101763636B (zh) * 2009-09-23 2012-07-04 中国科学院自动化研究所 Method for tracking three-dimensional face position and pose in video sequences
US9747495B2 (en) * 2012-03-06 2017-08-29 Adobe Systems Incorporated Systems and methods for creating and distributing modifiable animated video messages
US9600742B2 (en) * 2015-05-05 2017-03-21 Lucasfilm Entertainment Company Ltd. Determining control values of an animation model using performance capture
CN105518714A (zh) * 2015-06-30 2016-04-20 北京旷视科技有限公司 Liveness detection method and device, and computer program product
US9865072B2 (en) * 2015-07-23 2018-01-09 Disney Enterprises, Inc. Real-time high-quality facial performance capture
CN105069830A (zh) * 2015-08-14 2015-11-18 广州市百果园网络科技有限公司 Expression animation generation method and apparatus

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101086681A * 2006-06-09 2007-12-12 中国科学院自动化研究所 Game control system and method based on stereo vision
CN101452582A * 2008-12-18 2009-06-10 北京中星微电子有限公司 Method and apparatus for realizing three-dimensional video special effects
CN105338369A * 2015-10-28 2016-02-17 北京七维视觉科技有限公司 Method and apparatus for synthesizing animation into video in real time
CN105528805A * 2015-12-25 2016-04-27 苏州丽多数字科技有限公司 Virtual face animation synthesis method
CN106251396A * 2016-07-29 2016-12-21 迈吉客科技(北京)有限公司 Real-time control method and system for three-dimensional models

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111405361A * 2020-03-27 2020-07-10 咪咕文化科技有限公司 Video acquisition method, electronic device and computer-readable storage medium
CN111405361B (zh) * 2020-03-27 2022-06-14 咪咕文化科技有限公司 Video acquisition method, electronic device and computer-readable storage medium
CN113427486A * 2021-06-18 2021-09-24 上海非夕机器人科技有限公司 Robot arm control method and apparatus, computer device, storage medium and robot arm
CN113427486B (zh) * 2021-06-18 2022-10-28 上海非夕机器人科技有限公司 Robot arm control method and apparatus, computer device, storage medium and robot arm
CN115442519A * 2022-08-08 2022-12-06 珠海普罗米修斯视觉技术有限公司 Video processing method and apparatus, and computer-readable storage medium
CN115442519B (zh) * 2022-08-08 2023-12-15 珠海普罗米修斯视觉技术有限公司 Video processing method and apparatus, and computer-readable storage medium

Also Published As

Publication number Publication date
CN106251396B (zh) 2021-08-13
US10930074B2 (en) 2021-02-23
US20190156574A1 (en) 2019-05-23
CN106251396A (zh) 2016-12-21

Similar Documents

Publication Publication Date Title
WO2018018957A1 (zh) Real-time control method and system for three-dimensional models
CN111738220B (zh) Three-dimensional human pose estimation method, apparatus, device and medium
KR20180121494A (ko) Method and system for real-time 3D capture and live feedback with monocular cameras
US11386633B2 (en) Image augmentation for analytics
CN106710003B Three-dimensional photographing method and system based on OpenGL ES
WO2019100932A1 Motion control method and device therefor, storage medium and terminal
WO2010038693A1 (ja) Information processing device, information processing method, program, and information storage medium
JP7483301B2 (ja) Image processing and image synthesis method, apparatus and computer program
TWI752419B Image processing method and apparatus, image device and storage medium
WO2019200719A1 Three-dimensional face model generation method and apparatus, and electronic device
CN103999455B Collaborative cross-platform video capture
WO2019019927A1 Video processing method, network device and storage medium
JP2023514289A (ja) Three-dimensional face model construction method and apparatus, computer device, and computer program
CN107707899B Multi-view image processing method and apparatus involving moving objects, and electronic device
CN112348937A Face image processing method and electronic device
WO2023066120A1 Image processing method and apparatus, electronic device and storage medium
KR20150068895A (ko) Apparatus and method for generating three-dimensional output data
CN110152293A Positioning method and apparatus for manipulated objects, and positioning method and apparatus for game objects
CN111064981B Video streaming system and method
KR20150025462A (ko) Method and apparatus for modeling interacting characters
JP5066047B2 (ja) Information processing device, information processing method, program, and information storage medium
WO2024131204A1 Virtual scene device interaction method and related products
US11675195B2 (en) Alignment of 3D representations for hologram/avatar control
Smolska et al. Reconstruction of the Face Shape using the Motion Capture System in the Blender Environment.
KR20240048207A (ko) Method and apparatus for video streaming of an extended reality device based on prediction of the user's context information

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 17833263

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 17833263

Country of ref document: EP

Kind code of ref document: A1

32PN Ep: public notification in the ep bulletin as address of the addressee cannot be established

Free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 112(1) EPC (EPO FORM 1205A DATED 02.07.2019)

122 Ep: pct application non-entry in european phase

Ref document number: 17833263

Country of ref document: EP

Kind code of ref document: A1