WO2020253716A1 - Image generation method and device - Google Patents

Image generation method and device

Info

Publication number
WO2020253716A1
Authority
WO
WIPO (PCT)
Prior art keywords
image
limb
sample
training
limb part
Prior art date
Application number
PCT/CN2020/096547
Other languages
English (en)
French (fr)
Inventor
陈奇
Original Assignee
北京字节跳动网络技术有限公司 (Beijing ByteDance Network Technology Co., Ltd.)
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 北京字节跳动网络技术有限公司 (Beijing ByteDance Network Technology Co., Ltd.)
Priority to US 17/620,452 (published as US20220358662A1)
Publication of WO2020253716A1


Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 11/00 2D [Two Dimensional] image generation
    • G06T 7/00 Image analysis
    • G06T 7/20 Analysis of motion
    • G06T 7/215 Motion-based segmentation
    • G06T 19/00 Manipulating 3D models or images for computer graphics
    • G06T 19/006 Mixed reality
    • G06T 7/0002 Inspection of images, e.g. flaw detection
    • G06T 7/0012 Biomedical image inspection
    • G06T 2207/00 Indexing scheme for image analysis or image enhancement
    • G06T 2207/10 Image acquisition modality
    • G06T 2207/10024 Color image
    • G06T 2207/20 Special algorithmic details
    • G06T 2207/20081 Training; Learning
    • G06T 2207/20084 Artificial neural networks [ANN]
    • G06T 2207/30 Subject of image; Context of image processing
    • G06T 2207/30196 Human being; Person
    • G06T 2207/30221 Sports video; Sports image

Definitions

  • The embodiments of the present disclosure relate to the field of computer technology, and in particular to an image generation method and device.
  • In the related art, the captured user image is added to a virtual scene, so that the user is placed in a virtual world to achieve a virtual reality effect.
  • When virtual reality technology is applied to motion applications, limb movements such as an arm being waved are often too fast for the camera to capture clear limb parts, which makes it difficult to accurately guide the user through various actions.
  • The embodiments of the present disclosure propose an image generation method and apparatus.
  • In a first aspect, the embodiments of the present disclosure provide an image generation method.
  • The method includes: acquiring an image of a target moving object, the image presenting limb parts of the target moving object; inputting the image into a pre-trained detection model to obtain an output result indicating the position distribution, in the image, of each limb part in a preset limb part set; generating, based on the output result, a heat map corresponding to each limb part; and superimposing the generated heat maps onto the image at the region positions corresponding to the limb parts, to generate an image with the superimposed heat maps.
  • The output result includes a preset number of score matrices. Each score matrix corresponds to the image and includes scores indicating the pixel distribution of a limb part presented in the image; the score matrices correspond one-to-one to the limb parts in the limb part set.
  • Generating a heat map corresponding to each limb part includes: for each of the preset number of score matrices, determining the region in the image of the pixels corresponding to scores in the score matrix that are greater than a preset threshold; and generating the heat map corresponding to each limb part based on the determined region, the limb part corresponding to the score matrix, and preset heat-map color values corresponding to the limb parts.
  • The detection model is obtained by training through the following steps: obtaining a training sample set, which includes sample images presenting limb parts and indication information indicating the position distribution, in each sample image, of the limb parts presented in it; and, based on the training sample set, training the detection model using machine learning with the sample images as input and the indication information corresponding to the sample images as the desired output.
  • The indication information includes a score matrix corresponding to the sample image and indicating the pixel distribution of the limb parts presented in the sample image;
  • training the detection model using machine learning, with the sample image as input and the indication information corresponding to the sample image as the desired output, includes performing the following training steps: for a sample image in the training sample set, input the sample image into a convolutional neural network to obtain sample score matrices indicating the pixel distribution of each limb part in the sample image; determine whether the difference between the obtained sample score matrix corresponding to each sample image and the score matrix in the indication information is less than a preset threshold; in response to determining that the difference is less than the preset threshold, determine that training of the convolutional neural network is complete and use the trained convolutional neural network as the detection model; in response to determining that the difference is greater than or equal to the preset threshold, adjust the parameters of the convolutional neural network to be trained and re-execute the training steps.
  • In a second aspect, an embodiment of the present disclosure provides an image generation device. The device includes: an acquisition unit configured to acquire an image of a target moving object, the image presenting limb parts of the target moving object; an input unit configured to input the image into a pre-trained detection model to obtain an output result indicating the position distribution, in the image, of each limb part in a preset limb part set; a first generating unit configured to generate, based on the output result, a heat map corresponding to each limb part; and a second generating unit configured to superimpose the generated heat maps onto the image at the region positions corresponding to the limb parts, to generate an image with the superimposed heat maps.
  • The output result includes a preset number of score matrices. Each score matrix corresponds to the image and includes scores indicating the pixel distribution of a limb part presented in the image; the score matrices correspond one-to-one to the limb parts in the limb part set.
  • The first generating unit is further configured to: for each of the preset number of score matrices, determine the region in the image of the pixels corresponding to scores in the score matrix that are greater than a preset threshold; and generate the heat map corresponding to each limb part based on the determined region, the limb part corresponding to the score matrix, and the preset heat-map color values corresponding to the limb parts.
  • The detection model is obtained by training through the following steps: obtaining a training sample set, which includes sample images presenting limb parts and indication information indicating the position distribution, in each sample image, of the limb parts presented in it; and, based on the training sample set, training the detection model using machine learning with the sample images as input and the indication information corresponding to the sample images as the desired output.
  • The indication information includes a score matrix corresponding to the sample image and indicating the pixel distribution of the limb parts presented in the sample image; and the detection model is further obtained by performing the following training steps: for a sample image in the training sample set, input the sample image into a convolutional neural network to obtain sample score matrices indicating the pixel distribution of each limb part in the sample image; determine whether the difference between the obtained sample score matrix corresponding to each sample image and the score matrix in the indication information is less than a preset threshold; in response to determining that the difference is less than the preset threshold, determine that training of the convolutional neural network is complete and use the trained convolutional neural network as the detection model; in response to determining that the difference is greater than or equal to the preset threshold, adjust the parameters of the convolutional neural network to be trained and re-execute the training steps.
  • In a third aspect, the embodiments of the present disclosure provide a terminal device, including: one or more processors; and a storage device for storing one or more programs which, when executed by the one or more processors, cause the one or more processors to implement the method described in any implementation of the first aspect.
  • In a fourth aspect, an embodiment of the present disclosure provides a computer-readable medium on which a computer program is stored; when the computer program is executed by a processor, the method described in any implementation of the first aspect is implemented.
  • The image generation method and device detect the acquired image of the target moving object to determine the positions of the limb parts of the target moving object in the image, and generate heat maps of the limb positions to be superimposed at the corresponding positions of the limb parts in the image. Because the limb parts of the moving object do not need to be detected through key-point detection, which can introduce positioning deviation, the accuracy of limb positioning is improved, which helps accurately guide the user through subsequent limb actions.
  • FIG. 1 is an exemplary system architecture diagram to which an embodiment of the present disclosure can be applied;
  • FIG. 2 is a flowchart of an embodiment of the image generation method according to the present disclosure;
  • FIG. 3 is a schematic diagram of an application scenario of the image generation method according to an embodiment of the present disclosure;
  • FIG. 4 is a flowchart of another embodiment of the image generation method according to the present disclosure;
  • FIG. 5 is a schematic diagram of an application scenario of the score matrix according to an embodiment of the present disclosure;
  • FIG. 6 is a schematic structural diagram of an embodiment of the image generation device according to the present disclosure;
  • FIG. 7 is a schematic structural diagram of an electronic device suitable for implementing the embodiments of the present disclosure.
  • FIG. 1 shows an exemplary architecture 100 to which an embodiment of the image generation method or image generation apparatus of the present disclosure can be applied.
  • As shown in FIG. 1, the system architecture 100 may include terminal devices 101, 102, 103, a network 104, and a server 105.
  • The network 104 serves as a medium providing communication links between the terminal devices 101, 102, 103 and the server 105.
  • The network 104 may include various connection types, such as wired or wireless communication links, or fiber-optic cables.
  • Various client applications may be installed on the terminal devices 101, 102, 103, such as image processing applications, augmented reality applications, virtual reality applications, action guidance applications, and sports and fitness applications. Cameras and camera applications may also be installed.
  • the terminal devices 101, 102, 103 can interact with the server 105 through the network 104 to receive or send messages and so on.
  • the terminal devices 101, 102, and 103 may be hardware or software.
  • the terminal devices 101, 102, 103 can be various electronic devices that can receive user operations, including but not limited to smart phones, tablet computers, e-book readers, laptop computers, desktop computers, and so on.
  • When the terminal devices 101, 102, and 103 are software, they can be installed in the electronic devices listed above, and may be implemented as multiple pieces of software or software modules (for example, multiple pieces of software or software modules used to provide distributed services) or as a single piece of software or software module. No specific limitation is made here.
  • the server 105 may be a background server that supports client applications installed on the terminal devices 101, 102, 103.
  • the server 105 may perform detection processing on the received image of the target moving object acquired by the terminal, and generate an image including the heat map of each limb part for presentation on the terminal.
  • the server 105 may be hardware or software.
  • When the server is hardware, it can be implemented as a distributed server cluster composed of multiple servers, or as a single server.
  • When the server is software, it can be implemented as multiple pieces of software or software modules (for example, multiple pieces of software or software modules for providing distributed services), or as a single piece of software or software module. No specific limitation is made here.
  • the image generation method provided by the embodiments of the present disclosure may be executed by the server 105, or may be executed by the terminal devices 101, 102, 103.
  • the image generating device can be set in the server 105 or in the terminal devices 101, 102, 103.
  • The numbers of terminal devices, networks, and servers in FIG. 1 are merely illustrative; there may be any number of terminal devices, networks, and servers according to implementation needs.
  • Where the data used in generating the images does not need to be obtained remotely, the above system architecture may not include a network and may include only a terminal device or a server.
  • FIG. 2 shows a process 200 of an embodiment of the image generation method according to the present disclosure.
  • the image generation method includes the following steps:
  • Step 201: Acquire an image of the target moving object.
  • In this embodiment, the execution subject of the image generation method may have a photographing device installed, or may be connected to one.
  • The image of the target moving object may be captured by the photographing device and then sent to the execution subject.
  • The image of the target moving object may be acquired by the photographing device in real time, or at preset time intervals.
  • Here, the target moving object may be a human body.
  • The acquired image presents the limb parts of the target moving object.
  • The limb parts may include, but are not limited to, the hands, upper arms, forearms, thighs, lower legs, neck, back, waist, feet, knees, and shoulders.
  • Step 202: Input the acquired image into a pre-trained detection model, and obtain an output result indicating the position distribution, in the acquired image, of each limb part in a preset limb part set.
  • The preset limb part set may include one limb part or multiple limb parts, set according to the needs of the scene.
  • As an example, when the limb parts in the preset limb part set are the left arm and the left leg,
  • the output result indicates the position distribution of the left arm and the left leg in the acquired image.
  • The output result may include a feature map.
  • The feature map is the same size as the acquired image, and its content includes the contours of the limb parts.
  • The coordinate positions, in the feature map, of the limb contours presented by the feature map can be determined, and those coordinate positions then serve as the image coordinate information of the limb parts in the acquired image.
  • To make the detection result more accurate and better distinguish the individual limb parts, multiple feature maps may be output, each corresponding to one limb part.
  • For example, when the position distribution of the right arm and the right leg needs to be determined, the output result may include two feature maps, one containing the contour of the right arm and the other containing the contour of the right leg.
  • The detection model may be obtained by training on training samples using an existing network structure.
  • The network structure may include, for example, a generative adversarial network, a convolutional neural network, and the like.
  • The training samples include sample moving images and the desired feature maps corresponding to the sample moving images.
  • As an example, the detection model may be obtained by training a generative adversarial network.
  • A generative adversarial network includes a generative network and a discriminative network.
  • The generative network is used to extract the feature map of a sample image;
  • the discriminative network is used to determine the error between the obtained feature map and the desired feature map.
  • The generative network may be a convolutional neural network for image processing (for example, a convolutional neural network of various structures including convolutional layers, pooling layers, unpooling layers, and deconvolution layers).
  • The discriminative network may also be a convolutional neural network (for example, a convolutional neural network of various structures including a fully connected layer, where the fully connected layer can implement a classification function).
  • Based on the error output by the discriminative network, the generative network can be adjusted through repeated iterations until the error is less than a preset value; the trained generative network is then used as the detection model.
  • Step 203: Generate a heat map corresponding to each limb part based on the output result.
  • Based on the output result of step 202, the execution subject may generate a heat map corresponding to each limb part.
  • A heat map is an image that presents the contours of the limb parts in a specially highlighted form.
  • Specifically, the execution subject may determine, from the feature map obtained in step 202, the coordinate positions of the limb contours presented by the feature map, and display the regions corresponding to those coordinate positions in a specially highlighted form, thereby obtaining the heat map.
  • Step 204: Superimpose the generated heat maps onto the acquired image at the region positions corresponding to the limb parts, generating an image with the superimposed heat maps.
  • The execution subject may superimpose the heat maps generated in step 203 onto the image acquired in step 201 at the region positions corresponding to the limb parts.
  • As an example, when the limb parts are the left arm and the left leg, the generated heat maps of the left arm and the left leg can be superimposed onto the image acquired in step 201 to obtain an image with the superimposed heat maps, which is then presented on the terminal.
  • The image generation method shown in this application detects the image of the target moving object to determine the region where each limb part appears in the image.
  • In some application scenarios, such as sports and fitness applications, when a user wants the application to check and correct their movements, the electronic device running the application typically captures images of the user and compares the limb movements presented in the captured images with the movements in a preset movement library.
  • When the user moves too fast, the movements of the individual limbs usually cannot be captured accurately.
  • By superimposing the heat maps corresponding to the limb parts at the positions where those parts appear in the acquired image, the displayed heat maps can be compared with the movements in the movement library, which helps guide the user through completing subsequent movements and improves the user experience.
  • FIG. 3 shows a diagram of an application scenario of the image generation method of the present disclosure.
  • After the photographing device 301 captures an image of user A, the image is sent to the electronic device 302, which may be a terminal such as a mobile phone, or a server.
  • The electronic device 302 inputs the acquired image into the detection model 303 to obtain the position distribution of the arm in the image.
  • A heat map of the arm is then generated and superimposed at the position of the arm in the acquired image, yielding the image with the superimposed heat map.
  • FIG. 4 shows a process 400 of another embodiment of the image generation method according to the present disclosure.
  • the image generation method includes the following steps:
  • Step 401: Acquire an image of a target moving object.
  • In this embodiment, the execution subject of the image generation method may have a photographing device installed, or may be connected to one.
  • The image of the target moving object may be captured by the photographing device and then sent to the execution subject.
  • The image of the target moving object may be acquired by the photographing device in real time, or at preset time intervals.
  • Here, the target moving object may be a human body.
  • The acquired image presents the limb parts of the target moving object.
  • Step 402: Input the acquired image into a pre-trained detection model, and obtain an output result indicating the position distribution, in the acquired image, of each limb part in a preset limb part set.
  • The output result may be a score matrix whose entries correspond one-to-one to the pixels of the acquired image.
  • An image is composed of pixels, and each pixel is identified by its coordinate position in the image.
  • For example, an image with a resolution of 1024*540 consists of 1024 pixels in the horizontal direction and 540 pixels in the vertical direction.
  • Each pixel has an RGB color value.
  • The first pixel of the first row, for instance, is identified by its coordinate position in the image.
  • That is, each pixel in the image has a pixel value and a coordinate position in the image.
  • Each score in the score matrix indicates the probability that the corresponding pixel in the acquired image presents a limb part.
  • FIG. 5 schematically shows an application scenario of the score matrix provided by the present disclosure.
  • The output result shown in FIG. 5 is a 15*15 score matrix; that is, the acquired image is 15*15 pixels.
  • The coordinate position of each score in the score matrix corresponds to the coordinate position of a pixel in the image.
  • The scores in the score matrix range from 0 to 9. Suppose the score matrix indicates the distribution of the left arm in the acquired image.
  • With a preset score threshold of 8, the coordinate positions of the pixels corresponding to scores greater than or equal to 8 present the position distribution of the left arm in the acquired image.
  • The output result may include a preset number of score matrices, each corresponding to the acquired image and including scores indicating the pixel distribution of a limb part presented in the acquired image;
  • the score matrices correspond one-to-one to the limb parts in the limb part set.
  • That is, one score matrix indicates the position distribution, in the acquired image, of one limb part in the limb part set. Each score matrix therefore yields the scores presenting the pixel distribution of a single limb part, which makes the determined limb parts more accurate.
  • The detection model may be obtained by training on training samples.
  • Specifically, a training sample set is obtained.
  • The training sample set includes sample images presenting limb parts and indication information indicating the position distribution, in each sample image, of the limb parts presented in it; based on the training sample set, the detection model is trained using machine learning with the sample images as input and the indication information corresponding to the sample images as the desired output.
  • Here, the indication information includes a score matrix corresponding to the sample image and indicating the pixel distribution of the limb parts presented in the sample image.
  • Training the detection model using machine learning based on the training sample set, with the sample images as input and the corresponding indication information as the desired output, may specifically include the following training steps:
  • for a sample image in the training sample set, input the sample image into a convolutional neural network to obtain sample score matrices indicating the pixel distribution of each limb part in the sample image; determine whether the difference between the obtained sample score matrix corresponding to each sample image and the score matrix in the indication information is less than a preset threshold; in response to determining that the difference is less than the preset threshold, determine that training of the convolutional neural network is complete and use the trained convolutional neural network as the detection model; in response to determining that the difference is greater than or equal to the preset threshold, adjust the parameters of the convolutional neural network to be trained and re-execute the training steps.
  • Specifically, the detection model may be obtained by training a convolutional neural network.
  • The indication information is a score matrix corresponding to the pixels of the sample image, in which the scores of the pixels presenting limb parts are set to 10 and the remaining scores are set to 0.
  • For example, when the position distribution of the left arm and the left leg in the sample image needs to be detected, the scores at the coordinate positions, in the score matrix, corresponding to the pixels presenting the left arm and the left leg can be set to 10, and the remaining scores set to 0.
  • For each sample in the training sample set, the sample can be input into the convolutional neural network to be trained, obtaining a sample score matrix indicating the pixels presenting the limb parts.
  • The obtained sample score matrix is compared with the score matrix in the preset indication information to determine the difference between the two.
  • The difference includes the differences between the scores at each coordinate position of the matrices.
  • The difference being less than the preset threshold may specifically mean that, in the obtained sample score matrix, the number of scores whose difference from the score at the corresponding coordinate position in the indication information's score matrix is less than the preset threshold is greater than a preset number.
  • In response to determining that the difference is greater than or equal to the preset threshold, the parameters of the convolutional neural network can be adjusted, for example the number of convolutional layers in the network or the size of the convolution kernels. The training samples are then used to continue training the adjusted convolutional neural network until the error is less than the preset threshold.
  • Step 403: For each of the preset number of score matrices, determine the region in the image of the pixels corresponding to scores in the score matrix that are greater than the preset threshold.
  • The execution subject determines, in each score matrix, the scores greater than the preset threshold. Since pixel positions in the image correspond to the coordinate positions of the scores in the score matrix, the distribution region of the pixels corresponding to the above-threshold scores can be determined from the coordinate positions of those scores in the score matrix.
  • Step 404: Generate the heat map corresponding to each limb part based on the determined region, the limb part corresponding to the score matrix, and the preset heat-map color values corresponding to the limb parts.
  • Heat-map color values corresponding to the limb parts may be preset in the execution subject.
  • For example, the heat-map color value indicating the arm may be yellow,
  • that indicating the leg may be blue,
  • and that indicating the shoulder may be red.
  • From the limb part corresponding to each score matrix, the determined region, and the preset heat-map color values, a heat map corresponding to each limb part can be generated.
  • A heat map is an image that presents the contours of the limb parts in a specially highlighted form.
  • When there is a single score matrix, the pixel region corresponding to its above-threshold scores can be set to a highlighted color.
  • When there are multiple score matrices, since each score matrix indicates the position distribution of one limb part,
  • the pixel region corresponding to the above-threshold scores of each score matrix is set to a highlighted color.
  • The specific color is determined by the heat-map color value preset for each limb part.
  • Step 405: Superimpose the generated heat maps onto the acquired image at the region positions corresponding to the limb parts, generating the image with the superimposed heat maps.
  • The execution subject may superimpose the heat maps generated in step 404 onto the image acquired in step 401 at the region positions corresponding to the limb parts.
  • As an example, when the limb parts are the left arm and the left leg, the generated heat maps of the left arm and the left leg can be superimposed onto the image acquired in step 401 to obtain the image with the superimposed heat maps, which is then presented on the terminal.
  • As can be seen from FIG. 4, unlike the embodiment shown in FIG. 2, this embodiment highlights the steps in which the output result of the detection model is a set of score matrices, each indicating the position distribution of one limb part. By determining the position distribution of the limb parts from per-pixel values, the detected position distribution of the limb parts is more accurate.
  • With further reference to FIG. 6, as an implementation of the methods shown in the above figures, the present disclosure provides an embodiment of an image generation device.
  • This device embodiment corresponds to the method embodiment shown in FIG. 2, and the device can be applied to various electronic devices.
  • The image generation device 600 includes an acquisition unit 601, an input unit 602, a first generating unit 603, and a second generating unit 604.
  • The acquisition unit 601 is configured to acquire an image of the target moving object, the image presenting limb parts of the target moving object;
  • the input unit 602 is configured to input the image into a pre-trained detection model to obtain an output result indicating the position distribution, in the image, of each limb part in a preset limb part set;
  • the first generating unit 603 is configured to generate, based on the output result, a heat map corresponding to each limb part;
  • and the second generating unit 604 is configured to superimpose the generated heat maps onto the image at the region positions corresponding to the limb parts, generating the image with the superimposed heat maps.
  • For the specific processing of the acquisition unit 601, the input unit 602, the first generating unit 603, and the second generating unit 604, and the technical effects they bring, reference may be made to the descriptions of step 201, step 202, step 203, and step 204 in the embodiment corresponding to FIG. 2, which are not repeated here.
  • The output result includes a preset number of score matrices. Each score matrix corresponds to the image and includes scores indicating the pixel distribution of a limb part presented in the image;
  • the score matrices correspond one-to-one to the limb parts in the limb part set.
  • The first generating unit 603 is further configured to: for each of the preset number of score matrices, determine the region in the image of the pixels corresponding to scores in the score matrix that are greater than a preset threshold; and generate the heat map corresponding to each limb part based on the determined region, the limb part corresponding to the score matrix, and the preset heat-map color values corresponding to the limb parts.
  • The detection model is obtained by training through the following steps: obtaining a training sample set, which includes sample images presenting limb parts and indication information indicating the position distribution, in each sample image, of the limb parts presented in it; and, based on the training sample set, training the detection model using machine learning with the sample images as input and the corresponding indication information as the desired output.
  • The indication information includes a score matrix corresponding to the sample image and indicating the pixel distribution of the limb parts presented in the sample image; and the detection model is further obtained by performing the following training steps: for a sample image in the training sample set, input the sample image into a convolutional neural network to obtain sample score matrices indicating the pixel distribution of each limb part in the sample image; determine whether the difference between the obtained sample score matrix corresponding to each sample image and the score matrix in the indication information is less than a preset threshold; in response to determining that the difference is less than the preset threshold, determine that training of the convolutional neural network is complete and use the trained convolutional neural network as the detection model; in response to determining that the difference is greater than or equal to the preset threshold, adjust the parameters of the convolutional neural network to be trained and re-execute the training steps.
  • The image generation device detects the acquired image of the target moving object, determines the positions of the limb parts of the target moving object in the image, and generates heat maps of the limb positions to be superimposed at the corresponding positions of the limb parts in the image. Because the limb parts of the moving object do not need to be detected through key-point detection, which can introduce positioning deviation, the accuracy of limb positioning is improved, which helps accurately guide the user through subsequent limb actions.
  • FIG. 7 shows a schematic structural diagram of an electronic device 700 (for example, the terminal device in FIG. 1) suitable for implementing the embodiments of the present disclosure.
  • the terminal devices in the embodiments of the present disclosure may include, but are not limited to, mobile phones, notebook computers, digital broadcast receivers, PDAs (personal digital assistants), PADs (tablets), PMPs (portable multimedia players), vehicle-mounted terminals ( For example, mobile terminals such as car navigation terminals) and fixed terminals such as digital TVs and desktop computers.
  • the terminal device shown in FIG. 7 is only an example, and should not bring any limitation to the function and scope of use of the embodiments of the present disclosure.
  • The electronic device 700 may include a processing device (such as a central processing unit or a graphics processor) 701, which can perform various appropriate actions and processes according to a program stored in read-only memory (ROM) 702 or a program loaded from the storage device 708 into random access memory (RAM) 703.
  • the RAM 703 also stores various programs and data required for the operation of the electronic device 700.
  • the processing device 701, the ROM 702, and the RAM 703 are connected to each other through a bus 704.
  • An input/output (I/O) interface 705 is also connected to the bus 704.
  • The following devices can be connected to the I/O interface 705: input devices 706 including, for example, a touch screen, touch pad, keyboard, mouse, camera, microphone, accelerometer, and gyroscope; output devices 707 including, for example, a liquid crystal display (LCD), speakers, and vibrators; storage devices 708 including, for example, magnetic tape and hard disks; and a communication device 709.
  • The communication device 709 may allow the electronic device 700 to communicate wirelessly or by wire with other devices to exchange data.
  • Although FIG. 7 shows an electronic device 700 with various devices, it should be understood that not all of the illustrated devices are required to be implemented or provided; more or fewer devices may alternatively be implemented or provided. Each block shown in FIG. 7 may represent one device, or multiple devices as needed.
  • the process described above with reference to the flowchart can be implemented as a computer software program.
  • the embodiments of the present disclosure include a computer program product, which includes a computer program carried on a computer-readable medium, and the computer program contains program code for executing the method shown in the flowchart.
  • the computer program may be downloaded and installed from the network through the communication device 709, or installed from the storage device 708, or installed from the ROM 702.
  • When the computer program is executed by the processing device 701, the above-mentioned functions defined in the methods of the embodiments of the present disclosure are performed.
  • The computer-readable medium described in the embodiments of the present disclosure may be a computer-readable signal medium or a computer-readable storage medium, or any combination of the two.
  • A computer-readable storage medium may be, for example, but not limited to, an electric, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the above. More specific examples of computer-readable storage media may include, but are not limited to: an electrical connection with one or more wires, a portable computer disk, a hard disk, random access memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM or flash memory), optical fiber, portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the above.
  • A computer-readable storage medium may be any tangible medium that contains or stores a program that can be used by, or in combination with, an instruction execution system, apparatus, or device.
  • A computer-readable signal medium may include a data signal propagated in baseband or as part of a carrier wave, carrying computer-readable program code. Such a propagated data signal may take many forms, including but not limited to an electromagnetic signal, an optical signal, or any suitable combination of the foregoing.
  • A computer-readable signal medium may also be any computer-readable medium other than a computer-readable storage medium;
  • it may send, propagate, or transmit a program for use by, or in combination with, an instruction execution system, apparatus, or device.
  • The program code contained on the computer-readable medium may be transmitted by any suitable medium, including but not limited to electric wires, optical cables, RF (radio frequency), and the like, or any suitable combination of the above.
  • The computer-readable medium may be included in the terminal device, or it may exist separately without being assembled into the terminal device.
  • The computer-readable medium carries one or more programs which, when executed by the electronic device, cause the electronic device to:
  • acquire an image of the target moving object, the image presenting limb parts of the target moving object;
  • input the image into a pre-trained detection model to obtain an output result indicating the position distribution, in the image, of each limb part in a preset limb part set; generate, based on the output result, a heat map corresponding to each limb part;
  • and superimpose the generated heat maps onto the image at the region positions corresponding to the limb parts, to generate an image with the superimposed heat maps.
  • The computer program code for performing the operations of the embodiments of the present disclosure may be written in one or more programming languages or a combination thereof.
  • The programming languages include object-oriented programming languages such as Java, Smalltalk, and C++, as well as conventional procedural programming languages such as the "C" language or similar programming languages.
  • The program code may execute entirely on the user's computer, partly on the user's computer, as a standalone software package, partly on the user's computer and partly on a remote computer, or entirely on a remote computer or server.
  • Where a remote computer is involved, the remote computer may be connected to the user's computer through any kind of network, including a local area network (LAN) or a wide area network (WAN), or may be connected to an external computer (for example, through the Internet using an Internet service provider).
  • Each block in the flowcharts or block diagrams may represent a module, program segment, or part of code that contains one or more executable instructions for realizing the specified logical functions.
  • In some alternative implementations, the functions marked in the blocks may occur in a different order from that marked in the drawings. For example, two blocks shown in succession may in fact be executed substantially in parallel, or sometimes in the reverse order, depending on the functions involved.
  • Each block in the block diagrams and/or flowcharts, and combinations of blocks in the block diagrams and/or flowcharts, may be implemented by a dedicated hardware-based system that performs the specified functions or operations, or by a combination of dedicated hardware and computer instructions.
  • The described units may also be provided in a processor; for example, a processor may be described as including an acquisition unit, an input unit, a first generation unit, and a second generation unit.
  • The names of these units do not in some cases limit the units themselves.
  • For example, the acquisition unit may also be described as "a unit for acquiring an image of a target moving object".

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Multimedia (AREA)
  • Medical Informatics (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Nuclear Medicine, Radiotherapy & Molecular Imaging (AREA)
  • Radiology & Medical Imaging (AREA)
  • Quality & Reliability (AREA)
  • Computer Graphics (AREA)
  • Computer Hardware Design (AREA)
  • General Engineering & Computer Science (AREA)
  • Software Systems (AREA)
  • Image Analysis (AREA)

Abstract

An image generation method and device. The method includes: acquiring an image of a target moving object (201), the image presenting limb parts of the target moving object; inputting the acquired image into a pre-trained detection model to obtain an output result indicating the position distribution, in the image, of each limb part in a preset limb part set (202); generating, based on the output result, a heat map corresponding to each limb part (203); and superimposing the generated heat maps onto the image at the region positions corresponding to the limb parts, generating an image with the superimposed heat maps (204). The method does not need to detect the limb parts of the moving object through key-point detection, which can introduce positioning deviation, so it improves the accuracy of limb positioning and helps accurately guide the user through subsequent limb actions.

Description

Image generation method and device
CROSS-REFERENCE TO RELATED APPLICATION
This application is based on, and claims priority to, Chinese patent application No. 201910528723.1, filed on June 18, 2019 and entitled "Image generation method and device", the entire contents of which are incorporated herein by reference.
TECHNICAL FIELD
The embodiments of the present disclosure relate to the field of computer technology, and in particular to an image generation method and device.
BACKGROUND
With the development of science and technology and the spread of artificial intelligence, virtual reality and augmented reality technologies have developed rapidly. Existing virtual reality, augmented reality, and similar technologies are usually combined with machine learning, image processing, and other technologies to develop various terminal applications.
In the related art, a captured user image is added to a virtual scene so that the user is placed in a virtual world, achieving a virtual reality effect. When virtual reality technology is applied to motion applications, limb movements such as an arm being waved are often too fast for the camera to capture clear limb parts, making it difficult to accurately guide the user through various actions.
SUMMARY
The embodiments of the present disclosure propose an image generation method and apparatus.
In a first aspect, the embodiments of the present disclosure provide an image generation method, including: acquiring an image of a target moving object, the image presenting limb parts of the target moving object; inputting the image into a pre-trained detection model to obtain an output result indicating the position distribution, in the image, of each limb part in a preset limb part set; generating, based on the output result, a heat map corresponding to each limb part; and superimposing the generated heat maps onto the image at the region positions corresponding to the limb parts, to generate an image with the superimposed heat maps.
In some embodiments, the output result includes a preset number of score matrices, each score matrix including scores corresponding to the image and indicating the pixel distribution of a limb part presented in the image; the score matrices correspond one-to-one to the limb parts in the limb part set.
In some embodiments, generating, based on the output result, the heat map corresponding to each limb part includes: for each of the preset number of score matrices, determining the region in the image of the pixels corresponding to scores in the score matrix that are greater than a preset threshold; and generating the heat map corresponding to each limb part based on the determined region, the limb part corresponding to the score matrix, and preset heat-map color values corresponding to the limb parts.
In some embodiments, the detection model is obtained by training through the following steps: obtaining a training sample set including sample images presenting limb parts and indication information indicating the position distribution, in each sample image, of the limb parts presented in that sample image; and, based on the training sample set, training the detection model using machine learning with the sample images as input and the indication information corresponding to the sample images as the desired output.
In some embodiments, the indication information includes a score matrix corresponding to the sample image and indicating the pixel distribution of the limb parts presented in the sample image; and
training the detection model using machine learning based on the training sample set, with the sample images as input and the corresponding indication information as the desired output, includes performing the following training steps: for a sample image in the training sample set, inputting the sample image into a convolutional neural network to obtain sample score matrices indicating the pixel distribution of each limb part in the sample image; determining whether the difference between the obtained sample score matrix corresponding to each sample image and the score matrix in the indication information is less than a preset threshold; in response to determining that the difference is less than the preset threshold, determining that training of the convolutional neural network is complete and using the trained convolutional neural network as the detection model; and in response to determining that the difference is greater than or equal to the preset threshold, adjusting the parameters of the convolutional neural network to be trained and re-executing the training steps.
In a second aspect, the embodiments of the present disclosure provide an image generation device, including: an acquisition unit configured to acquire an image of a target moving object, the image presenting limb parts of the target moving object; an input unit configured to input the image into a pre-trained detection model to obtain an output result indicating the position distribution, in the image, of each limb part in a preset limb part set; a first generating unit configured to generate, based on the output result, a heat map corresponding to each limb part; and a second generating unit configured to superimpose the generated heat maps onto the image at the region positions corresponding to the limb parts, to generate an image with the superimposed heat maps.
In some embodiments, the output result includes a preset number of score matrices, each score matrix including scores corresponding to the image and indicating the pixel distribution of a limb part presented in the image; the score matrices correspond one-to-one to the limb parts in the limb part set.
In some embodiments, the first generating unit is further configured to: for each of the preset number of score matrices, determine the region in the image of the pixels corresponding to scores in the score matrix that are greater than a preset threshold; and generate the heat map corresponding to each limb part based on the determined region, the limb part corresponding to the score matrix, and preset heat-map color values corresponding to the limb parts.
In some embodiments, the detection model is obtained by training through the following steps: obtaining a training sample set including sample images presenting limb parts and indication information indicating the position distribution, in each sample image, of the limb parts presented in it; and, based on the training sample set, training the detection model using machine learning with the sample images as input and the indication information corresponding to the sample images as the desired output.
In some embodiments, the indication information includes a score matrix corresponding to the sample image and indicating the pixel distribution of the limb parts presented in the sample image; and the detection model is further obtained by performing the following training steps: for a sample image in the training sample set, inputting the sample image into a convolutional neural network to obtain sample score matrices indicating the pixel distribution of each limb part in the sample image; determining whether the difference between the obtained sample score matrix corresponding to each sample image and the score matrix in the indication information is less than a preset threshold; in response to determining that the difference is less than the preset threshold, determining that training of the convolutional neural network is complete and using the trained convolutional neural network as the detection model; and in response to determining that the difference is greater than or equal to the preset threshold, adjusting the parameters of the convolutional neural network to be trained and re-executing the training steps.
In a third aspect, the embodiments of the present disclosure provide a terminal device, including: one or more processors; and a storage device for storing one or more programs which, when executed by the one or more processors, cause the one or more processors to implement the method described in any implementation of the first aspect.
In a fourth aspect, the embodiments of the present disclosure provide a computer-readable medium on which a computer program is stored; when the computer program is executed by a processor, the method described in any implementation of the first aspect is implemented.
The image generation method and device provided by the embodiments of the present disclosure detect an acquired image of a target moving object to determine the positions of the limb parts of the target moving object in the image, and generate heat maps of the limb positions to be superimposed at the corresponding positions of the limb parts in the image. Because the limb parts of the moving object do not need to be detected through key-point detection, which can introduce positioning deviation, the accuracy of limb positioning is improved, which helps accurately guide the user through subsequent limb actions.
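Before turning to the drawings, a compact sketch may help fix ideas. The following Python fragment strings the four claimed steps together under stated assumptions: `model` stands in for the pre-trained detection model (returning one score matrix per limb part) and `part_colors` for the preset heat-map color values; neither name comes from the disclosure, and the blending weight is arbitrary.

```python
# Minimal sketch of the disclosed pipeline, assuming a trained `model`
# that maps an H×W image to one H×W score matrix per limb part.
import numpy as np

def generate_overlaid_image(image, model, part_colors, threshold=8):
    """image: H×W×3 uint8 array; part_colors: {part_name: (B, G, R)}."""
    score_matrices = model(image)          # {part_name: H×W score matrix}
    result = image.copy()
    for part, scores in score_matrices.items():
        mask = scores > threshold          # pixels likely presenting this part
        # Highlight the part's region with its preset heat-map color value.
        result[mask] = 0.5 * result[mask] + 0.5 * np.array(part_colors[part])
    return result.astype(np.uint8)
```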
BRIEF DESCRIPTION OF THE DRAWINGS
Other features, objects, and advantages of the present disclosure will become more apparent upon reading the following detailed description of non-limiting embodiments made with reference to the accompanying drawings:
FIG. 1 is an exemplary system architecture diagram to which an embodiment of the present disclosure can be applied;
FIG. 2 is a flowchart of an embodiment of the image generation method according to the present disclosure;
FIG. 3 is a schematic diagram of an application scenario of the image generation method according to an embodiment of the present disclosure;
FIG. 4 is a flowchart of another embodiment of the image generation method according to the present disclosure;
FIG. 5 is a schematic diagram of an application scenario of the score matrix according to an embodiment of the present disclosure;
FIG. 6 is a schematic structural diagram of an embodiment of the image generation device according to the present disclosure;
FIG. 7 is a schematic structural diagram of an electronic device suitable for implementing the embodiments of the present disclosure.
DETAILED DESCRIPTION
The present disclosure is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described here are only used to explain the relevant invention and do not limit it. It should also be noted that, for ease of description, only the parts related to the invention are shown in the drawings.
It should be noted that, where there is no conflict, the embodiments of the present disclosure and the features in the embodiments may be combined with each other. The present disclosure is described in detail below with reference to the drawings and in combination with the embodiments.
FIG. 1 shows an exemplary architecture 100 to which embodiments of the image generation method or image generation apparatus of the present disclosure can be applied.
As shown in FIG. 1, the system architecture 100 may include terminal devices 101, 102, 103, a network 104, and a server 105. The network 104 serves as a medium providing communication links between the terminal devices 101, 102, 103 and the server 105, and may include various connection types, such as wired or wireless communication links, or fiber-optic cables.
Various client applications may be installed on the terminal devices 101, 102, 103, for example image processing applications, augmented reality applications, virtual reality applications, action guidance applications, and sports and fitness applications. Cameras and camera applications may also be installed. The terminal devices 101, 102, 103 can interact with the server 105 through the network 104 to receive or send messages and the like.
The terminal devices 101, 102, 103 may be hardware or software. When they are hardware, they may be various electronic devices capable of receiving user operations, including but not limited to smartphones, tablet computers, e-book readers, laptop computers, desktop computers, and the like. When they are software, they can be installed in the electronic devices listed above and may be implemented as multiple pieces of software or software modules (for example, multiple pieces of software or software modules used to provide distributed services) or as a single piece of software or software module. No specific limitation is made here.
The server 105 may be a background server supporting the client applications installed on the terminal devices 101, 102, 103. The server 105 may perform detection processing on a received image of a target moving object acquired by a terminal, and generate an image including the heat maps of the limb parts for presentation on the terminal.
It should be noted that the server 105 may be hardware or software. When the server is hardware, it can be implemented as a distributed server cluster composed of multiple servers, or as a single server. When the server is software, it can be implemented as multiple pieces of software or software modules (for example, multiple pieces of software or software modules for providing distributed services), or as a single piece of software or software module. No specific limitation is made here.
It should be noted that the image generation method provided by the embodiments of the present disclosure may be executed by the server 105 or by the terminal devices 101, 102, 103. Correspondingly, the image generation apparatus may be provided in the server 105 or in the terminal devices 101, 102, 103.
It should be understood that the numbers of terminal devices, networks, and servers in FIG. 1 are merely illustrative; there may be any number of terminal devices, networks, and servers according to implementation needs. Where the data used in generating the images does not need to be obtained remotely, the above system architecture may not include a network and may include only a terminal device or a server.
With continued reference to FIG. 2, a flow 200 of an embodiment of the image generation method according to the present disclosure is shown. The image generation method includes the following steps:
Step 201: acquire an image of a target moving object.
In this embodiment, the execution subject of the image generation method (for example, the terminal devices 101, 102, 103 or the server 105 shown in FIG. 1) may have a photographing device installed, or may be connected to one. The image of the target moving object may be captured by the photographing device and then sent to the execution subject. Here, the image of the target moving object may be acquired by the photographing device in real time, or at preset time intervals.
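The disclosure leaves the capture mechanism open, so the following OpenCV sketch of the two acquisition modes is only an illustration under stated assumptions: the camera index 0, the interval value, and the frame count are all arbitrary choices.

```python
import time
import cv2  # OpenCV; camera index and interval below are illustrative

cap = cv2.VideoCapture(0)           # open a locally attached camera

def grab_frame():
    """Acquire one image of the target moving object, or None on failure."""
    ok, frame = cap.read()
    return frame if ok else None

# Interval-based acquisition: one frame every INTERVAL_S seconds.
# (Real-time acquisition is the same loop without the sleep.)
INTERVAL_S = 0.5
for _ in range(10):                 # acquire a handful of frames
    frame = grab_frame()
    if frame is None:
        break
    # ... hand `frame` to the detection model (step 202) ...
    time.sleep(INTERVAL_S)
cap.release()
```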
Here, the target moving object may be a human body, and the acquired image presents the limb parts of the target moving object. The limb parts may include, but are not limited to, the hands, upper arms, forearms, thighs, lower legs, neck, back, waist, feet, knees, and shoulders.
Step 202: input the acquired image into a pre-trained detection model to obtain an output result indicating the position distribution, in the acquired image, of each limb part in a preset limb part set.
In this embodiment, the preset limb part set may include one limb part or multiple limb parts, set according to the needs of the scene. As an example, when the limb parts in the preset limb part set are the left arm and the left leg, the output result indicates the position distribution of the left arm and the left leg in the acquired image.
In this embodiment, the output result may include a feature map. Here, the feature map is the same size as the acquired image, and its content includes the contours of the limb parts. Based on the feature map, the coordinate positions, in the feature map, of the limb contours it presents can be determined, and those coordinate positions then serve as the image coordinate information of the limb parts in the acquired image. In this embodiment, to make the detection result more accurate and better distinguish the individual limb parts, multiple feature maps may be output, each corresponding to one limb part. For example, when the position distribution of the right arm and the right leg in the image needs to be determined, the output result may include two feature maps, one containing the contour of the right arm and the other containing the contour of the right leg.
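One way to read off limb contour coordinates from such a feature map is sketched below, assuming the feature map is a float array the same size as the input image that can be thresholded into a binary mask; cv2.findContours is one concrete choice, not one mandated by the disclosure.

```python
import numpy as np
import cv2

def limb_contour_coordinates(feature_map, thresh=0.5):
    """feature_map: H×W float array, same size as the input image.
    Returns lists of (x, y) contour points in image coordinates."""
    mask = (feature_map > thresh).astype(np.uint8)
    contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL,
                                   cv2.CHAIN_APPROX_SIMPLE)
    # Because the feature map and the image have the same size, these
    # coordinates can be used directly as image coordinate information
    # of the limb part in the acquired image.
    return [c.reshape(-1, 2) for c in contours]
```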
In this embodiment, the detection model may be obtained by training on training samples using an existing network structure, for example a generative adversarial network or a convolutional neural network. The training samples include sample moving images and the desired feature maps corresponding to the sample moving images.
As an example, the detection model may be obtained by training a generative adversarial network. Specifically, a generative adversarial network includes a generative network and a discriminative network, where the generative network extracts features from a sample image to obtain a feature map, and the discriminative network determines the error between the obtained feature map and the desired feature map.
The generative network may be a convolutional neural network for image processing (for example, a convolutional neural network of various structures including convolutional layers, pooling layers, unpooling layers, and deconvolution layers). The discriminative network may also be a convolutional neural network (for example, a convolutional neural network of various structures including a fully connected layer, where the fully connected layer can implement a classification function).
Here, based on the error output by the discriminative network, the generative network can be adjusted through repeated iterations until the error output by the discriminative network is less than a preset value. At that point, the trained generative network is used as the detection model.
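A compact PyTorch-style sketch of this adversarial training is given below. It is only one common wiring of the idea, not the disclosure's prescribed formulation: `make_generator`, `make_discriminator`, and `loader` are hypothetical placeholders, and the binary cross-entropy losses and learning rates are conventional defaults.

```python
import torch
import torch.nn as nn

# Placeholder networks: `gen` maps an image to per-part feature maps,
# `disc` scores (image, feature-map) pairs as real (1) or generated (0).
gen, disc = make_generator(), make_discriminator()  # hypothetical builders
bce = nn.BCEWithLogitsLoss()
opt_g = torch.optim.Adam(gen.parameters(), lr=2e-4)
opt_d = torch.optim.Adam(disc.parameters(), lr=2e-4)

for image, expected_map in loader:   # (sample image, desired feature map)
    n = image.size(0)
    # Discriminator step: desired maps -> 1, generated maps -> 0.
    fake_map = gen(image)
    d_loss = bce(disc(image, expected_map), torch.ones(n, 1)) \
           + bce(disc(image, fake_map.detach()), torch.zeros(n, 1))
    opt_d.zero_grad(); d_loss.backward(); opt_d.step()

    # Generator step: the discriminator's error drives the adjustment.
    g_loss = bce(disc(image, gen(image)), torch.ones(n, 1))
    opt_g.zero_grad(); g_loss.backward(); opt_g.step()
```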
Step 203: generate, based on the output result, a heat map corresponding to each limb part.
In this embodiment, based on the output result obtained in step 202, the execution subject may generate a heat map corresponding to each limb part. Here, a heat map is an image that presents the contours of the limb parts in a specially highlighted form. Specifically, the execution subject may determine, from the feature map obtained in step 202, the coordinate positions of the limb contours presented in the feature map, and display the regions corresponding to those coordinate positions in a specially highlighted form, thereby obtaining the heat map.
Step 204: superimpose the generated heat maps onto the acquired image at the region positions corresponding to the limb parts, generating an image with the superimposed heat maps.
In this embodiment, the execution subject may superimpose the heat maps generated in step 203 onto the image acquired in step 201 at the region positions corresponding to the limb parts. As an example, when the limb parts are the left arm and the left leg, the generated heat maps of the left arm and the left leg may be superimposed onto the image acquired in step 201 to obtain the image with the superimposed heat maps, which is then presented on the terminal.
As can be seen from the embodiment shown in FIG. 2, the image generation method of this application detects the image of the target moving object and can determine the region where each limb part appears in the image. In some application scenarios, such as sports and fitness applications, when a user wants the application to check and correct their movements, the electronic device running the application typically captures images of the user and compares the limb movements presented in the captured images with movements in a preset movement library. When the user moves too fast, the movements of the individual limbs usually cannot be captured accurately. By superimposing the heat maps corresponding to the limb parts at the positions in the acquired image where those parts appear, the presented heat maps of the limb parts can be compared with the movements in the movement library, which helps guide the user through completing subsequent movements and improves the user experience.
With continued reference to FIG. 3, a diagram of an application scenario of the image generation method of the present disclosure is shown.
In the application scenario shown in FIG. 3, the photographing device 301 captures an image of user A and sends the image to the electronic device 302. The electronic device 302 may be a terminal such as a mobile phone, or a server. The electronic device 302 then inputs the acquired image into the detection model 303 to obtain the position distribution of the arm in the image. Next, a heat map of the arm is generated. Finally, the generated heat map is superimposed at the position of the arm in the acquired image, yielding the image with the superimposed heat map.
With further reference to FIG. 4, a flow 400 of another embodiment of the image generation method according to the present disclosure is shown. The image generation method includes the following steps:
Step 401: acquire an image of a target moving object.
In this embodiment, the execution subject of the image generation method (for example, the terminal devices 101, 102, 103 or the server 105 shown in FIG. 1) may have a photographing device installed, or may be connected to one. The image of the target moving object may be captured by the photographing device and then sent to the execution subject. Here, the image may be acquired by the photographing device in real time, or at preset time intervals.
Here, the target moving object may be a human body, and the acquired image presents the limb parts of the target moving object.
Step 402: input the acquired image into a pre-trained detection model to obtain an output result indicating the position distribution, in the acquired image, of each limb part in a preset limb part set.
In this embodiment, the output result may be a score matrix whose entries correspond one-to-one to the pixels of the acquired image. An image is composed of pixels, and each pixel is identified by its coordinate position in the image. For example, an image with a resolution of 1024*540 consists of 1024 pixels in the horizontal direction and 540 pixels in the vertical direction. Each pixel has an RGB color value, and the first pixel of the first row, for instance, is identified by its coordinate position in the image. That is, each pixel in the image has a pixel value and a coordinate position in the image. Each score in the score matrix indicates the probability that the corresponding pixel in the acquired image presents a limb part. The position distribution is thus which pixels of the acquired image present the limb parts in the preset limb part set. FIG. 5 schematically shows an application scenario of the score matrix provided by the present disclosure. Here, the output result shown in FIG. 5 is a 15*15 score matrix; that is, the acquired image is 15*15 pixels. The coordinate position of each score in the score matrix corresponds one-to-one to the coordinate position of a pixel in the image, and the scores range from 0 to 9. Suppose the score matrix indicates the distribution of the left arm in the acquired image. With a preset score threshold of 8, the coordinate positions of the pixels corresponding to scores greater than or equal to 8 present the position distribution of the left arm in the acquired image.
In this embodiment, the output result may include a preset number of score matrices, each corresponding to the acquired image and including scores indicating the pixel distribution of a limb part presented in the acquired image; the score matrices correspond one-to-one to the limb parts in the limb part set. In other words, one score matrix indicates the position distribution, in the acquired image, of one limb part in the limb part set, so each score matrix yields the scores presenting the pixel distribution of a single limb part, making the determined limb parts more accurate.
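Following the 15*15 example, a short numpy sketch of recovering a part's position distribution from its score matrix; the random matrix stands in for the model output, and the threshold of 8 mirrors the example above.

```python
import numpy as np

rng = np.random.default_rng(0)
scores = rng.integers(0, 10, size=(15, 15))  # stand-in: scores range 0..9

THRESHOLD = 8
mask = scores >= THRESHOLD          # pixels presenting the left arm
coords = np.argwhere(mask)          # (row, col) coordinate positions
print(coords[:5])                   # a few positions of the left arm's pixels
```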
In this embodiment, the detection model may be obtained by training on training samples. Specifically, a training sample set is obtained, which includes sample images presenting limb parts and indication information indicating the position distribution, in each sample image, of the limb parts presented in it; based on the training sample set, the detection model is trained using machine learning with the sample images as input and the indication information corresponding to the sample images as the desired output.
Here, the indication information includes a score matrix corresponding to the sample image and indicating the pixel distribution of the limb parts presented in the sample image. Training the detection model based on the training sample set, with the sample images as input and the corresponding indication information as the desired output, may specifically include:
for a sample image in the training sample set, inputting the sample image into a convolutional neural network to obtain sample score matrices indicating the pixel distribution of each limb part in the sample image; determining whether the difference between the obtained sample score matrix corresponding to each sample image and the score matrix in the indication information is less than a preset threshold; in response to determining that the difference is less than the preset threshold, determining that training of the convolutional neural network is complete and using the trained convolutional neural network as the detection model; and in response to determining that the difference is greater than or equal to the preset threshold, adjusting the parameters of the convolutional neural network to be trained and re-executing the training steps.
Specifically, the detection model may be obtained by training a convolutional neural network. The indication information is a score matrix corresponding to the pixels of the sample image, in which the scores of the pixels presenting limb parts are set to 10 and the remaining scores are set to 0. For example, when the position distribution of the left arm and the left leg presented in the sample image needs to be detected, the scores at the coordinate positions, in the score matrix, corresponding to the pixels presenting the left arm and the left leg may be set to 10, and the remaining scores set to 0. Then, for each sample in the training sample set, the sample may be input into the convolutional neural network to be trained to obtain a sample score matrix indicating the pixels presenting the limb parts. The obtained sample score matrix is compared with the score matrix in the preset indication information to determine the difference between the two; this difference includes the differences between the scores at each coordinate position of the matrices. In response to the difference being less than the preset threshold, training of the convolutional neural network can be determined to be complete, and the trained network is then used as the detection model. Here, the difference being less than the preset threshold may specifically mean that, in the obtained sample score matrix, the number of scores whose difference from the score at the corresponding coordinate position in the indication information's score matrix is less than the preset threshold is greater than a preset number.
In response to determining that the difference is greater than or equal to the preset threshold, the parameters of the convolutional neural network may be adjusted, for example the number of convolutional layers in the network or the size of the convolution kernels. The training samples are then used to continue training the adjusted convolutional neural network until the error is less than the preset threshold.
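A sketch of this training procedure in PyTorch follows. The count-based stopping rule implements the "number of close scores exceeds a preset number" criterion described above; the network body, the `loader` (yielding (B, 3, H, W) images with (B, 1, H, W) score-matrix targets), and the two threshold constants are placeholder assumptions.

```python
import torch
import torch.nn as nn

net = nn.Sequential(                    # placeholder CNN body
    nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(),
    nn.Conv2d(16, 1, 3, padding=1),     # one score matrix per limb part
)
opt = torch.optim.Adam(net.parameters(), lr=1e-3)
SCORE_DIFF_THRESH, MIN_CLOSE_COUNT = 1.0, 200  # illustrative values

done = False
while not done:
    for image, target_scores in loader:        # `loader` is assumed to exist
        pred = net(image)
        loss = nn.functional.mse_loss(pred, target_scores)
        opt.zero_grad(); loss.backward(); opt.step()

        # Stopping rule: enough per-position score differences are small.
        close = (pred - target_scores).abs() < SCORE_DIFF_THRESH
        if close.sum().item() > MIN_CLOSE_COUNT:
            done = True
            break
# `net` is now used as the detection model.
```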
Step 403: for each of the preset number of score matrices, determine the region in the image of the pixels corresponding to scores in the score matrix that are greater than the preset threshold.
In this embodiment, based on the score matrices determined in step 402, the execution subject determines, in each score matrix, the scores greater than the preset threshold. Since pixel positions in the image correspond one-to-one to the coordinate positions of the scores in the score matrix, the distribution region of the pixels corresponding to the above-threshold scores can be determined from the coordinate positions of those scores in the score matrix.
Step 404: generate the heat map corresponding to each limb part based on the determined region, the limb part corresponding to the score matrix, and the preset heat-map color values corresponding to the limb parts.
In this embodiment, heat-map color values corresponding to the limb parts may be preset in the execution subject. For example, the heat-map color value indicating the arm may be yellow, that indicating the leg may be blue, and that indicating the shoulder may be red.
From the limb part corresponding to each score matrix, the region determined in step 403, and the preset heat-map color values corresponding to the limb parts, a heat map corresponding to each limb part can be generated. Here, a heat map is an image presenting the contours of the limb parts in a specially highlighted form.
Specifically, when there is a single score matrix, the pixel region corresponding to its above-threshold scores may be set to a highlighted color. When there are multiple score matrices, for example two, since each score matrix indicates the position distribution of one limb part, the pixel region corresponding to the above-threshold scores of each score matrix is set to a highlighted color, the specific color being determined by the preset heat-map color value for that limb part. Multiple heat maps may thus be formed, one per limb part; alternatively, a single heat map may be formed by displaying the heat maps of multiple limb parts in one image according to their colors and position distributions.
Step 405: superimpose the generated heat maps onto the acquired image at the region positions corresponding to the limb parts, generating the image with the superimposed heat maps.
In this embodiment, the execution subject may superimpose the heat maps generated in step 404 onto the image acquired in step 401 at the region positions corresponding to the limb parts. As an example, when the limb parts are the left arm and the left leg, the generated heat maps of the left arm and the left leg may be superimposed onto the image acquired in step 401 to obtain the image with the superimposed heat maps, which is then presented on the terminal.
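Step 405 can then be a simple alpha blend restricted to the highlighted regions. The sketch below, using cv2.addWeighted, is one common way to do it; the 0.5 weight is an arbitrary choice, and the image and heat map are assumed to be same-sized uint8 arrays.

```python
import cv2
import numpy as np

def overlay_heat_map(image, heat_map, alpha=0.5):
    """Superimpose `heat_map` onto `image` only where the heat map is lit."""
    blended = cv2.addWeighted(image, 1 - alpha, heat_map, alpha, 0)
    lit = heat_map.any(axis=2)        # region positions of the limb parts
    out = image.copy()
    out[lit] = blended[lit]           # pixels elsewhere stay untouched
    return out
```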
As can be seen from FIG. 4, unlike the embodiment shown in FIG. 2, this embodiment highlights the steps in which the output result of the detection model is a set of score matrices, each indicating the position distribution of one limb part. By determining the position distribution of the limb parts from per-pixel values in this way, the detected position distribution of the limb parts is more accurate.
With further reference to FIG. 6, as an implementation of the methods shown in the above figures, the present disclosure provides an embodiment of an image generation device. This device embodiment corresponds to the method embodiment shown in FIG. 2, and the device can be applied to various electronic devices.
As shown in FIG. 6, the image generation device 600 provided by this embodiment includes an acquisition unit 601, an input unit 602, a first generating unit 603, and a second generating unit 604. The acquisition unit 601 is configured to acquire an image of a target moving object, the image presenting limb parts of the target moving object; the input unit 602 is configured to input the image into a pre-trained detection model to obtain an output result indicating the position distribution, in the image, of each limb part in a preset limb part set; the first generating unit 603 is configured to generate, based on the output result, a heat map corresponding to each limb part; and the second generating unit 604 is configured to superimpose the generated heat maps onto the image at the region positions corresponding to the limb parts, generating the image with the superimposed heat maps.
In this embodiment, for the specific processing of the acquisition unit 601, the input unit 602, the first generating unit 603, and the second generating unit 604 of the image generation device 600, and the technical effects they bring, reference may be made to the descriptions of step 201, step 202, step 203, and step 204 in the embodiment corresponding to FIG. 2, which are not repeated here.
In some optional implementations of this embodiment, the output result includes a preset number of score matrices, each including scores corresponding to the image and indicating the pixel distribution of a limb part presented in the image; the score matrices correspond one-to-one to the limb parts in the limb part set.
In some optional implementations of this embodiment, the first generating unit 603 is further configured to: for each of the preset number of score matrices, determine the region in the image of the pixels corresponding to scores in the score matrix that are greater than a preset threshold; and generate the heat map corresponding to each limb part based on the determined region, the limb part corresponding to the score matrix, and the preset heat-map color values corresponding to the limb parts.
In some optional implementations of this embodiment, the detection model is obtained by training through the following steps: obtaining a training sample set including sample images presenting limb parts and indication information indicating the position distribution, in each sample image, of the limb parts presented in it; and, based on the training sample set, training the detection model using machine learning with the sample images as input and the corresponding indication information as the desired output.
In some optional implementations of this embodiment, the indication information includes a score matrix corresponding to the sample image and indicating the pixel distribution of the limb parts presented in the sample image; and the detection model is further obtained by performing the following training steps: for a sample image in the training sample set, inputting the sample image into a convolutional neural network to obtain sample score matrices indicating the pixel distribution of each limb part in the sample image; determining whether the difference between the obtained sample score matrix corresponding to each sample image and the score matrix in the indication information is less than a preset threshold; in response to determining that the difference is less than the preset threshold, determining that training of the convolutional neural network is complete and using the trained convolutional neural network as the detection model; and in response to determining that the difference is greater than or equal to the preset threshold, adjusting the parameters of the convolutional neural network to be trained and re-executing the training steps.
The image generation device provided by the embodiments of the present disclosure detects an acquired image of a target moving object, determines the positions of the limb parts of the target moving object in the image, and generates heat maps of the limb positions to be superimposed at the corresponding positions of the limb parts in the image. Since the limb parts of the moving object do not need to be detected through key-point detection, which can introduce positioning deviation, the accuracy of limb positioning is improved, which helps accurately guide the user through subsequent limb actions.
Referring now to FIG. 7, a schematic structural diagram of an electronic device 700 (for example, the terminal device in FIG. 1) suitable for implementing the embodiments of the present disclosure is shown. Terminal devices in the embodiments of the present disclosure may include, but are not limited to, mobile terminals such as mobile phones, notebook computers, digital broadcast receivers, PDAs (personal digital assistants), PADs (tablet computers), PMPs (portable multimedia players), and vehicle-mounted terminals (for example, car navigation terminals), as well as fixed terminals such as digital TVs and desktop computers. The terminal device shown in FIG. 7 is only an example and should not impose any limitation on the functions and scope of use of the embodiments of the present disclosure.
As shown in FIG. 7, the electronic device 700 may include a processing device (such as a central processing unit or a graphics processor) 701, which can perform various appropriate actions and processes according to a program stored in read-only memory (ROM) 702 or a program loaded from the storage device 708 into random access memory (RAM) 703. The RAM 703 also stores various programs and data required for the operation of the electronic device 700. The processing device 701, the ROM 702, and the RAM 703 are connected to each other through a bus 704. An input/output (I/O) interface 705 is also connected to the bus 704.
Generally, the following devices may be connected to the I/O interface 705: input devices 706 including, for example, a touch screen, touch pad, keyboard, mouse, camera, microphone, accelerometer, and gyroscope; output devices 707 including, for example, a liquid crystal display (LCD), speakers, and vibrators; storage devices 708 including, for example, magnetic tape and hard disks; and a communication device 709. The communication device 709 may allow the electronic device 700 to communicate wirelessly or by wire with other devices to exchange data. Although FIG. 7 shows an electronic device 700 with various devices, it should be understood that not all of the illustrated devices are required to be implemented or provided; more or fewer devices may alternatively be implemented or provided. Each block shown in FIG. 7 may represent one device, or multiple devices as needed.
In particular, according to the embodiments of the present disclosure, the process described above with reference to the flowchart may be implemented as a computer software program. For example, the embodiments of the present disclosure include a computer program product that includes a computer program carried on a computer-readable medium, the computer program containing program code for executing the method shown in the flowchart. In such an embodiment, the computer program may be downloaded and installed from a network through the communication device 709, installed from the storage device 708, or installed from the ROM 702. When the computer program is executed by the processing device 701, the above-mentioned functions defined in the methods of the embodiments of the present disclosure are performed.
It should be noted that the computer-readable medium described in the embodiments of the present disclosure may be a computer-readable signal medium or a computer-readable storage medium, or any combination of the two. A computer-readable storage medium may be, for example, but not limited to, an electric, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the above. More specific examples of computer-readable storage media may include, but are not limited to: an electrical connection with one or more wires, a portable computer disk, a hard disk, random access memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM or flash memory), optical fiber, portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the above. In the embodiments of the present disclosure, a computer-readable storage medium may be any tangible medium that contains or stores a program that can be used by, or in combination with, an instruction execution system, apparatus, or device. In the embodiments of the present disclosure, a computer-readable signal medium may include a data signal propagated in baseband or as part of a carrier wave, carrying computer-readable program code. Such a propagated data signal may take many forms, including but not limited to an electromagnetic signal, an optical signal, or any suitable combination of the above. A computer-readable signal medium may also be any computer-readable medium other than a computer-readable storage medium; it may send, propagate, or transmit a program for use by, or in combination with, an instruction execution system, apparatus, or device. The program code contained on the computer-readable medium may be transmitted by any suitable medium, including but not limited to electric wires, optical cables, RF (radio frequency), and the like, or any suitable combination of the above.
上述计算机可读介质可以是上述终端设备中所包含的;也可以是单独存在,而未装配入该终端设备中。上述计算机可读介质承载有一个或者多个程序,当上述一个或者多个程序被该电子设备执行时,使得该电子设备:获取目标运动对象的图像,图像呈现有目标运动对象的肢体部位;将图像输入至预先训练的检测模型,得到用于指示预设肢体部位集合中的各肢体部位呈现在图像中的位置分布的输出结果;基于输出结果,生成与各肢体部位对应的热力图;将所生成的热力图叠加至图像中、与各肢体部位对应的区域位置处,生成叠加热力图后的图像。
The computer program code for performing the operations of the embodiments of the present disclosure may be written in one or more programming languages or a combination thereof, including object-oriented programming languages such as Java, Smalltalk and C++, as well as conventional procedural programming languages such as the "C" language or similar programming languages. The program code may be executed entirely on the user's computer, partly on the user's computer, as a standalone software package, partly on the user's computer and partly on a remote computer, or entirely on a remote computer or server. In cases involving a remote computer, the remote computer may be connected to the user's computer through any kind of network, including a local area network (LAN) or a wide area network (WAN), or may be connected to an external computer (for example, through the Internet using an Internet service provider).
The flowcharts and block diagrams in the accompanying drawings illustrate the possible architectures, functions and operations of the systems, methods and computer program products according to various embodiments of the present disclosure. In this regard, each block in the flowcharts or block diagrams may represent a module, a program segment, or a portion of code, which contains one or more executable instructions for implementing the specified logical functions. It should also be noted that, in some alternative implementations, the functions noted in the blocks may occur in an order different from that noted in the drawings. For example, two blocks shown in succession may in fact be executed substantially in parallel, and they may sometimes be executed in the reverse order, depending on the functions involved. It should also be noted that each block in the block diagrams and/or flowcharts, and combinations of blocks in the block diagrams and/or flowcharts, may be implemented by a dedicated hardware-based system that performs the specified functions or operations, or by a combination of dedicated hardware and computer instructions.
The units described in the embodiments of the present disclosure may be implemented by software or by hardware. The described units may also be provided in a processor; for example, they may be described as: a processor comprising an acquisition unit, an input unit, a first generation unit and a second generation unit. The names of these units do not in some cases constitute a limitation on the units themselves; for example, the acquisition unit may also be described as "a unit that acquires an image of a target moving object".
The above description is merely a description of the preferred embodiments of the present disclosure and the technical principles applied. Those skilled in the art should understand that the scope of the invention involved in the embodiments of the present disclosure is not limited to technical solutions formed by the specific combination of the above technical features, and should also cover other technical solutions formed by any combination of the above technical features or their equivalent features without departing from the above inventive concept, for example, technical solutions formed by replacing the above features with technical features having similar functions disclosed in (but not limited to) the embodiments of the present disclosure.

Claims (12)

  1. An image generation method, comprising:
    acquiring an image of a target moving object, the image presenting limb parts of the target moving object;
    inputting the image into a pre-trained detection model to obtain an output result indicating a position distribution, in the image, of each limb part in a preset limb part set;
    generating, based on the output result, a heatmap corresponding to each limb part; and
    superimposing the generated heatmaps onto the image at region positions corresponding to the respective limb parts, to generate an image with the heatmaps superimposed.
  2. The method according to claim 1, wherein the output result comprises a preset number of score matrices, each score matrix comprising scores that correspond to the image and indicate a distribution of pixels presenting a limb part in the image; and the score matrices are in one-to-one correspondence with the limb parts in the limb part set.
  3. The method according to claim 2, wherein the generating, based on the output result, a heatmap corresponding to each limb part comprises:
    for each of the preset number of score matrices, determining a region in the image of pixels corresponding to scores in the score matrix that are greater than a preset threshold; and
    generating the heatmap corresponding to each limb part based on the determined region, the limb part corresponding to the score matrix, and a preset heatmap color value corresponding to each limb part.
  4. The method according to claim 1 or 2, wherein the detection model is trained through the following steps:
    acquiring a training sample set, the training sample set comprising sample images presenting limb parts and indication information indicating a position distribution, in the sample images, of each limb part presented in the sample images; and
    training, based on the training sample set and by a machine learning method, the detection model with the sample images as input and the indication information corresponding to the sample images as expected output.
  5. The method according to claim 4, wherein the indication information comprises score matrices that correspond to the sample images and indicate a distribution of pixels presenting limb parts in the sample images; and
    the training, based on the training sample set and by a machine learning method, the detection model with the sample images as input and the indication information corresponding to the sample images as expected output comprises:
    performing the following training step: for a sample image in the training sample set, inputting the sample image into a convolutional neural network to obtain a sample score matrix indicating a distribution of pixels of each limb part in the sample image; determining whether a difference between the obtained sample score matrix corresponding to each sample image and the score matrix in the indication information is less than a preset threshold; and in response to determining that the difference is less than the preset threshold, determining that training of the convolutional neural network is completed, and using the trained convolutional neural network as the detection model; and
    in response to determining that the difference is greater than or equal to the preset threshold, adjusting parameters of the convolutional neural network to be trained, and performing the training step again.
  6. An image generation device, comprising:
    an acquisition unit configured to acquire an image of a target moving object, the image presenting limb parts of the target moving object;
    an input unit configured to input the image into a pre-trained detection model to obtain an output result indicating a position distribution, in the image, of each limb part in a preset limb part set;
    a first generation unit configured to generate, based on the output result, a heatmap corresponding to each limb part; and
    a second generation unit configured to superimpose the generated heatmaps onto the image at region positions corresponding to the respective limb parts, to generate an image with the heatmaps superimposed.
  7. The device according to claim 6, wherein the output result comprises a preset number of score matrices, each score matrix comprising scores that correspond to the image and indicate a distribution of pixels presenting a limb part in the image; and the score matrices are in one-to-one correspondence with the limb parts in the limb part set.
  8. The device according to claim 7, wherein the first generation unit is further configured to:
    for each of the preset number of score matrices, determine a region in the image of pixels corresponding to scores in the score matrix that are greater than a preset threshold; and
    generate the heatmap corresponding to each limb part based on the determined region, the limb part corresponding to the score matrix, and a preset heatmap color value corresponding to each limb part.
  9. The device according to claim 6 or 7, wherein the detection model is trained through the following steps:
    acquiring a training sample set, the training sample set comprising sample images presenting limb parts and indication information indicating a position distribution, in the sample images, of each limb part presented in the sample images; and
    training, based on the training sample set and by a machine learning method, the detection model with the sample images as input and the indication information corresponding to the sample images as expected output.
  10. The device according to claim 9, wherein the indication information comprises score matrices that correspond to the sample images and indicate a distribution of pixels presenting limb parts in the sample images; and
    the detection model is further trained through the following steps:
    performing the following training step: for a sample image in the training sample set, inputting the sample image into a convolutional neural network to obtain a sample score matrix indicating a distribution of pixels of each limb part in the sample image; determining whether a difference between the obtained sample score matrix corresponding to each sample image and the score matrix in the indication information is less than a preset threshold; and in response to determining that the difference is less than the preset threshold, determining that training of the convolutional neural network is completed, and using the trained convolutional neural network as the detection model; and
    in response to determining that the difference is greater than or equal to the preset threshold, adjusting parameters of the convolutional neural network to be trained, and performing the training step again.
  11. An electronic device, comprising:
    one or more processors; and
    a storage device on which one or more programs are stored,
    wherein the one or more programs, when executed by the one or more processors, cause the one or more processors to implement the method according to any one of claims 1-5.
  12. A computer-readable medium on which a computer program is stored, wherein the program, when executed by a processor, implements the method according to any one of claims 1-5.
PCT/CN2020/096547 2019-06-18 2020-06-17 Image generation method and device WO2020253716A1 (zh)

Priority Applications (1)

Application Number Priority Date Filing Date Title
US17/620,452 US20220358662A1 (en) 2019-06-18 2020-06-17 Image generation method and device

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201910528723.1A 2019-06-18 Image generation method and device
CN201910528723.1 2019-06-18

Publications (1)

Publication Number Publication Date
WO2020253716A1 true WO2020253716A1 (zh) 2020-12-24

Family

ID=67919196

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2020/096547 WO2020253716A1 (zh) 2019-06-18 2020-06-17 图像生成方法和装置

Country Status (3)

Country Link
US (1) US20220358662A1 (zh)
CN (1) CN110264539A (zh)
WO (1) WO2020253716A1 (zh)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113553959A * 2021-07-27 2021-10-26 杭州逗酷软件科技有限公司 Action recognition method and apparatus, computer-readable medium, and electronic device

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110264539A (zh) 2019-06-18 2019-09-20 北京字节跳动网络技术有限公司 Image generation method and device
CN111325822B (zh) 2020-02-18 2022-09-06 腾讯科技(深圳)有限公司 Heat map display method, apparatus, device and readable storage medium
CN113762015A (zh) * 2021-01-05 2021-12-07 北京沃东天骏信息技术有限公司 Image processing method and apparatus

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20140307955A1 (en) * 2013-04-12 2014-10-16 Samsung Electronics Co., Ltd. Apparatus and method for detecting body parts from user image
CN108875482A * 2017-09-14 2018-11-23 北京旷视科技有限公司 Object detection method and apparatus, and neural network training method and apparatus
CN109117753A * 2018-07-24 2019-01-01 广州虎牙信息科技有限公司 Body part recognition method, apparatus, terminal and storage medium
CN109274883A * 2018-07-24 2019-01-25 广州虎牙信息科技有限公司 Posture correction method, apparatus, terminal and storage medium
CN109685013A * 2018-12-25 2019-04-26 上海智臻智能网络科技股份有限公司 Method and apparatus for detecting head key points in human posture recognition
CN110264539A * 2019-06-18 2019-09-20 北京字节跳动网络技术有限公司 Image generation method and device

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101599177B * 2009-07-01 2011-07-27 北京邮电大学 Video-based human limb movement tracking method
CN109658455B * 2017-10-11 2023-04-18 阿里巴巴集团控股有限公司 Image processing method and processing device

Also Published As

Publication number Publication date
US20220358662A1 (en) 2022-11-10
CN110264539A (zh) 2019-09-20

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 20825428

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 20825428

Country of ref document: EP

Kind code of ref document: A1

32PN Ep: public notification in the ep bulletin as address of the addressee cannot be established

Free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 112(1) EPC (EPO FORM 1205A DATED 28.03.2022)
