WO2020253716A1 - Image generation method and device - Google Patents
Image generation method and device
- Publication number: WO2020253716A1
- Application number: PCT/CN2020/096547 (CN2020096547W)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- image
- limb
- sample
- training
- limb part
- Prior art date: 2019-06-18
Classifications
- All classifications fall under G—PHYSICS; G06—COMPUTING, CALCULATING OR COUNTING; G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T11/00—2D [Two Dimensional] image generation
- G06T7/215—Motion-based segmentation (under G06T7/00—Image analysis; G06T7/20—Analysis of motion)
- G06T19/006—Mixed reality (under G06T19/00—Manipulating 3D models or images for computer graphics)
- G06T7/0012—Biomedical image inspection (under G06T7/0002—Inspection of images, e.g. flaw detection)
- G06T2207/10024—Color image (under G06T2207/00—Indexing scheme for image analysis or image enhancement; G06T2207/10—Image acquisition modality)
- G06T2207/20081—Training; Learning (under G06T2207/20—Special algorithmic details)
- G06T2207/20084—Artificial neural networks [ANN] (under G06T2207/20—Special algorithmic details)
- G06T2207/30196—Human being; Person (under G06T2207/30—Subject of image; Context of image processing)
- G06T2207/30221—Sports video; Sports image (under G06T2207/30—Subject of image; Context of image processing)
Definitions
- The embodiments of the present disclosure relate to the field of computer technology, and in particular to an image generation method and device.
- A captured user image is added to a virtual scene, so that the user is placed in the virtual world to achieve the effect of virtual reality.
- When virtual reality technology is applied to motion applications, body movements such as a human body waving its arms are often too fast for the camera to capture clear limb parts, and it is therefore difficult to accurately guide the user to complete various actions.
- the embodiments of the present disclosure propose an image generation method and apparatus.
- the embodiments of the present disclosure provide an image generation method.
- The method includes: acquiring an image of a target moving object, the image presenting limb parts of the target moving object; inputting the image to a pre-trained detection model to obtain an output result indicating the position distribution, in the image, of each limb part in a preset limb part set; generating, based on the output result, a heat map corresponding to each limb part; and superimposing the generated heat map on the image at the location of the region corresponding to each limb part, to generate an image with the superimposed heat map.
- The output result includes a preset number of score matrices, and each score matrix includes scores corresponding to the image and used to indicate the distribution of pixels presenting a limb part in the image; the score matrices correspond one-to-one to the limb parts in the limb part set.
- Generating a heat map corresponding to each limb part includes: for each of the preset number of score matrices, determining the region, in the image, of the pixels corresponding to the scores in the score matrix that are greater than a preset threshold; and generating the heat map corresponding to each limb part based on the determined region, the limb part corresponding to the score matrix, and the preset heat map color value corresponding to each limb part.
- The detection model is obtained by training through the following steps: obtaining a training sample set, the training sample set including sample images presenting limb parts and indication information indicating the position distribution, in the sample image, of each limb part presented in the sample image; and, based on the training sample set, taking the sample images as input and the indication information corresponding to the sample images as expected output, training the detection model using a machine learning method.
- The indication information includes a score matrix corresponding to the sample image and used to indicate the distribution of pixels presenting limb parts in the sample image;
- and training the detection model with the sample images as input and the corresponding indication information as expected output includes performing the following training steps: for a sample image in the training sample set, inputting the sample image to a convolutional neural network to obtain a sample score matrix indicating the pixel distribution of each limb part in the sample image; determining whether the difference between the obtained sample score matrix corresponding to each sample image and the score matrix in the indication information is less than a preset threshold; in response to determining that the difference is less than the preset threshold, determining that training of the convolutional neural network is completed, and using the trained convolutional neural network as the detection model; and in response to determining that the difference is greater than or equal to the preset threshold, adjusting the parameters of the convolutional neural network to be trained, and re-executing the training steps.
- An embodiment of the present disclosure provides an image generation device. The device includes: an acquisition unit configured to acquire an image of a target moving object, the image presenting limb parts of the target moving object; an input unit configured to input the image to a pre-trained detection model to obtain an output result indicating the position distribution, in the image, of each limb part in a preset limb part set; a first generating unit configured to generate, based on the output result, a heat map corresponding to each limb part; and a second generating unit configured to superimpose the generated heat map on the image at the location of the region corresponding to each limb part, to generate an image with the superimposed heat map.
- The output result includes a preset number of score matrices, and each score matrix includes scores corresponding to the image and used to indicate the distribution of pixels presenting a limb part in the image; the score matrices correspond one-to-one to the limb parts in the limb part set.
- The first generating unit is further configured to: for each of the preset number of score matrices, determine the region, in the image, of the pixels corresponding to the scores in the score matrix that are greater than a preset threshold; and generate the heat map corresponding to each limb part based on the determined region, the limb part corresponding to the score matrix, and the preset heat map color value corresponding to each limb part.
- The detection model is obtained by training through the following steps: obtaining a training sample set, the training sample set including sample images presenting limb parts and indication information indicating the position distribution, in the sample image, of each limb part presented in the sample image; and, based on the training sample set, taking the sample images as input and the indication information corresponding to the sample images as expected output, training the detection model using a machine learning method.
- The indication information includes a score matrix corresponding to the sample image and used to indicate the distribution of pixels presenting limb parts in the sample image; and the detection model is further obtained by performing the following training steps: for a sample image in the training sample set, input the sample image to a convolutional neural network to obtain a sample score matrix indicating the pixel distribution of each limb part in the sample image; determine whether the difference between the obtained sample score matrix corresponding to each sample image and the score matrix in the indication information is less than a preset threshold; in response to determining that the difference is less than the preset threshold, determine that training of the convolutional neural network is completed, and use the trained convolutional neural network as the detection model; in response to determining that the difference is greater than or equal to the preset threshold, adjust the parameters of the convolutional neural network to be trained, and re-execute the training steps.
- The embodiments of the present disclosure provide a terminal device. The terminal device includes: one or more processors; and a storage device for storing one or more programs; when the one or more programs are executed by the one or more processors, the one or more processors implement the method described in any implementation manner of the first aspect.
- an embodiment of the present disclosure provides a computer-readable medium on which a computer program is stored, and when the computer program is executed by a processor, the method as described in any implementation manner in the first aspect is implemented.
- The image generation method and device detect the acquired image of the target moving object to determine the position of the limb parts of the target moving object in the image, and generate a heat map of the limb positions that is superimposed at the corresponding positions of the limbs in the image.
- There is thus no need to detect the limbs of the moving object by means of key point detection, which can introduce positioning deviations; this improves the accuracy of limb positioning and helps accurately guide the user to complete subsequent limb actions.
- FIG. 1 is an exemplary system architecture diagram in which an embodiment of the present disclosure can be applied
- Fig. 2 is a flowchart of an embodiment of an image generation method according to the present disclosure
- Fig. 3 is a schematic diagram of an application scenario of the image generation method according to an embodiment of the present disclosure
- FIG. 4 is a flowchart of another embodiment of the image generation method according to the present disclosure.
- Fig. 5 is a schematic diagram of an application scenario of a score matrix according to an embodiment of the present disclosure
- Fig. 6 is a schematic structural diagram of an embodiment of an image generating device according to the present disclosure.
- Fig. 7 is a schematic structural diagram of an electronic device suitable for implementing the embodiments of the present disclosure.
- FIG. 1 shows an exemplary architecture 100 to which an embodiment of the image generation method or image generation apparatus of the present disclosure can be applied.
- the system architecture 100 may include terminal devices 101, 102, 103, a network 104, and a server 105.
- the network 104 is used to provide a medium for communication links between the terminal devices 101, 102, 103 and the server 105.
- the network 104 may include various connection types, such as wired, wireless communication links, or fiber optic cables.
- Various client applications may be installed on the terminal devices 101, 102, 103.
- For example, image processing applications, augmented reality applications, virtual reality applications, action guidance applications, and sports and fitness applications may be installed; cameras and camera applications may also be installed.
- the terminal devices 101, 102, 103 can interact with the server 105 through the network 104 to receive or send messages and so on.
- the terminal devices 101, 102, and 103 may be hardware or software.
- When the terminal devices 101, 102, 103 are hardware, they can be various electronic devices that can receive user operations, including but not limited to smart phones, tablet computers, e-book readers, laptop computers, desktop computers, and so on.
- When the terminal devices 101, 102, and 103 are software, they can be installed in the electronic devices listed above. They can be implemented as multiple software or software modules (for example, multiple software or software modules used to provide distributed services), or as a single software or software module. There is no specific limitation here.
- the server 105 may be a background server that supports client applications installed on the terminal devices 101, 102, 103.
- the server 105 may perform detection processing on the received image of the target moving object acquired by the terminal, and generate an image including the heat map of each limb part for presentation on the terminal.
- the server 105 may be hardware or software.
- When the server is hardware, it can be implemented as a distributed server cluster composed of multiple servers, or as a single server.
- When the server is software, it can be implemented as multiple software or software modules (for example, multiple software or software modules for providing distributed services), or as a single software or software module. There is no specific limitation here.
- the image generation method provided by the embodiments of the present disclosure may be executed by the server 105, or may be executed by the terminal devices 101, 102, 103.
- the image generating device can be set in the server 105 or in the terminal devices 101, 102, 103.
- terminal devices, networks, and servers in FIG. 1 are merely illustrative. According to implementation needs, there can be any number of terminal devices, networks and servers.
- the above system architecture may not include a network, but only include terminal devices or servers.
- FIG. 2 shows a process 200 of an embodiment of the image generation method according to the present disclosure.
- the image generation method includes the following steps:
- Step 201 Obtain an image of the target moving object.
- the execution subject of the above-mentioned image generation method may be installed with a photographing device or connected to the photographing device.
- the image of the target moving object may be sent to the above-mentioned execution subject after the shooting device is shot.
- the image of the target moving object may be acquired in real time by the shooting device, or may be acquired based on a preset time interval.
- the aforementioned target moving object may be a human body.
- the acquired image presents the body parts of the target moving object.
- the limb parts may include, but are not limited to, hands, upper arms, forearms, thighs, lower legs, neck, back, waist, feet, knees, and shoulders.
- Step 202 Input the acquired image into a pre-trained detection model, and obtain an output result indicating the position distribution of each limb part in the preset limb part set in the acquired image.
- the aforementioned preset set of limb parts may include one limb part or multiple limb parts, which are set according to the needs of the scene.
- For example, if the limb parts in the preset limb part set include the left arm and the left leg, the aforementioned output result is used to indicate the position distribution of the left arm and the left leg in the acquired image.
- the foregoing output result may include a feature map.
- the size of the feature map is the same as the size of the acquired image, and the displayed image content includes the contours of each limb part.
- the coordinate position of the limb contour presented by the feature map in the feature map can be determined, and then the coordinate position is used as the image coordinate information of the limb part in the obtained image.
- multiple feature maps may be output, and each feature map corresponds to a limb part.
- the above output result may include two feature maps, one of which includes the contour of the right arm and the other includes the contour of the left arm.
- the aforementioned detection model may be obtained by training based on training samples and using an existing network structure.
- The network structure may include, for example, a generative adversarial network, a convolutional neural network, and the like.
- the above-mentioned training sample includes a sample moving image, and a desired feature map corresponding to the sample moving image.
- The foregoing detection model may be obtained by training a generative adversarial network.
- The generative adversarial network includes a generative network and a discriminative network.
- The generative network is used to extract the feature map of the sample image;
- the discriminative network is used to determine the error between the obtained feature map and the expected feature map.
- The generative network may be a convolutional neural network for image processing (for example, a convolutional neural network with various structures including a convolution layer, a pooling layer, an unpooling layer, and a deconvolution layer).
- The above-mentioned discriminative network may also be a convolutional neural network (for example, a convolutional neural network of various structures including a fully connected layer, where the above-mentioned fully connected layer can implement a classification function).
- The trained generative network is used as the above-mentioned detection model.
- Step 203 Generate a heat map corresponding to each limb part based on the output result.
- the above-mentioned execution subject may generate a heat map corresponding to each limb part.
- the heat map is an image that presents the contours of the above-mentioned limbs in a special highlight form.
- The foregoing execution subject may determine the coordinate position of the limb contour presented by the feature map based on the feature map obtained in step 202, and display the area corresponding to the coordinate position in a specially highlighted form, thereby obtaining the heat map.
- Step 204: The generated heat map is superimposed on the obtained image at the location of the region corresponding to each limb part to generate an image with the superimposed heat map.
- The above-mentioned execution subject may superimpose the heat map generated in step 203 on the image obtained in step 201 at the location of the region corresponding to each limb part.
- For example, the generated heat maps of the left arm and left leg can be superimposed on the image obtained in step 201 to obtain an image with the superimposed heat map, and the image is then presented on the terminal.
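- As an illustrative sketch of steps 203 and 204 (not the patent's reference implementation), the highlighting and superimposition could be done as follows, assuming the detection output has already been converted into a boolean mask of the limb region with the same height and width as the acquired image; the highlight color and blending factor are arbitrary assumptions:

```python
import numpy as np

def overlay_heat_map(image: np.ndarray,
                     limb_mask: np.ndarray,
                     highlight_rgb=(255, 255, 0),
                     alpha: float = 0.6) -> np.ndarray:
    """Blend a highlight color into the image wherever the mask marks a limb.

    `image` is an H x W x 3 uint8 array; `limb_mask` is an H x W boolean array
    marking the limb region indicated by the detection model's output.
    """
    result = image.astype(np.float32)
    color = np.array(highlight_rgb, dtype=np.float32)
    # Only the pixels belonging to the limb region are recolored, so the
    # heat map is superimposed exactly at the area corresponding to the limb.
    result[limb_mask] = (1.0 - alpha) * result[limb_mask] + alpha * color
    return result.astype(np.uint8)
```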
- the image generation method shown in this application detects the image of the target moving object to determine the location area of each limb part in the image.
- The electronic device running the application can capture images of the user, and the body movements presented in the images are compared with the movements in a preset movement library.
- When the user's movement speed is too fast, it is usually impossible to accurately capture the movements of the body's limbs.
- The heat map of each limb part can be displayed and compared with the actions in the action library, which helps guide the user to complete subsequent actions and improves the user experience.
- FIG. 3 shows a schematic diagram of an application scenario of the image generation method of the present disclosure.
- The photographing device 301 captures an image of user A.
- The image is sent to the electronic device 302.
- the electronic device 302 may be a terminal such as a mobile phone or a server.
- The electronic device 302 inputs the acquired image to the detection model 303.
- The detection model 303 obtains the position distribution of the arm in the image.
- A heat map of the arm is generated and superimposed at the position of the arm in the acquired image to obtain an image with the superimposed heat map.
- FIG. 4 shows a process 400 of another embodiment of the image generation method according to the present disclosure.
- the image generation method includes the following steps:
- Step 401 Acquire an image of a target moving object.
- the execution subject of the above-mentioned image generation method may be installed with a photographing device or connected to the photographing device.
- the image of the target moving object may be sent to the above-mentioned execution subject after the shooting device is shot.
- the image of the target moving object may be acquired in real time by the shooting device, or may be acquired based on a preset time interval.
- the aforementioned target moving object may be a human body.
- the acquired image presents the body parts of the target moving object.
- Step 402 Input the acquired image into a pre-trained detection model, and obtain an output result indicating the position distribution of each limb part in the preset limb part set in the acquired image.
- the output result may be a score matrix, and the score matrix corresponds to each pixel in the acquired image one-to-one.
- An image is composed of pixels, and each pixel is determined by its coordinate position in the image.
- an image with a resolution of 1024*540 consists of 1024 pixels in the horizontal direction and 540 pixels in the vertical direction.
- Each pixel is composed of RGB color values.
- The position of each pixel, for example the first pixel in the first row, is given by its coordinate position in the image.
- each pixel in the image includes a pixel value and a coordinate position in the image.
- Each score in the aforementioned score matrix is used to indicate the probability value of each pixel in the acquired image showing a limb part.
- FIG. 5 schematically shows a schematic diagram of an application scenario of the score matrix provided by the present disclosure.
- The output result shown in FIG. 5 is a 15*15 score matrix, that is, the obtained image is 15*15 pixels.
- the coordinate position of each score in the score matrix corresponds to the coordinate position of the pixel in the image.
- The scores in the score matrix range from 0 to 9. Assume that the score matrix is used to indicate the distribution of the left arm in the acquired image.
- If the preset score threshold is 8, the coordinate positions of the pixels in the image corresponding to the scores greater than or equal to 8 represent the position distribution of the left arm in the acquired image.
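- The thresholding described above can be sketched as follows; this is only an illustration, with the 15*15 size and the threshold of 8 taken from the example, and the random scores standing in for a real model output:

```python
import numpy as np

def limb_region_from_scores(score_matrix: np.ndarray, threshold: float = 8.0) -> np.ndarray:
    """Return a boolean mask of the pixels whose score meets the threshold.

    Because the score matrix has the same shape as the image, the positions
    of the selected scores are directly the positions of the limb pixels.
    """
    return score_matrix >= threshold

# Toy example in the spirit of FIG. 5: a 15x15 score matrix for the left arm.
scores = np.random.randint(0, 10, size=(15, 15))
left_arm_mask = limb_region_from_scores(scores, threshold=8)
print(left_arm_mask.sum(), "pixels classified as left arm")
```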
- The above-mentioned output result may include a preset number of score matrices, and each score matrix includes scores corresponding to the acquired image and used to indicate the distribution of pixels presenting a limb part in the acquired image.
- The score matrices correspond one-to-one to the limb parts in the limb part set.
- That is, one score matrix is used to indicate the position distribution of one limb part of the limb part set in the obtained image. Therefore, from one score matrix, the scores presenting the pixel distribution of one limb part can be obtained, which makes the determined body parts more accurate.
- the aforementioned detection model may be obtained by training based on training samples.
- a training sample set is obtained.
- The training sample set includes sample images presenting limb parts and indication information indicating the position distribution, in the sample image, of each limb part presented in the sample image. Based on the training sample set, the sample images are taken as input and the indication information corresponding to the sample images as the expected output, and the detection model is trained using a machine learning method.
- the above-mentioned indication information includes a score matrix corresponding to the sample image and used to indicate the pixel distribution of the limb parts in the sample image.
- the sample image is used as the input
- the instruction information corresponding to the sample image is used as the desired output
- the detection model is obtained by training using the machine learning method, which may specifically include:
- Perform the following training steps: for a sample image in the training sample set, input the sample image to the convolutional neural network to obtain a sample score matrix indicating the pixel distribution of each limb part in the sample image; determine whether the difference between the obtained sample score matrix corresponding to each sample image and the score matrix in the indication information is less than the preset threshold; in response to determining that the difference is less than the preset threshold, determine that training of the convolutional neural network is completed, and use the trained convolutional neural network as the detection model; in response to determining that the difference is greater than or equal to the preset threshold, adjust the parameters of the convolutional neural network to be trained, and re-execute the training steps.
- the above detection model may be obtained by training a convolutional neural network.
- The above indication information is a score matrix corresponding to the pixels of the sample image, wherein the scores of the pixels presenting a limb part are set to 10, and the remaining scores are set to 0.
- For example, the scores at the coordinate positions in the score matrix corresponding to the pixels presenting the left arm and the left leg can be set to 10, and the remaining scores are set to 0.
- The sample image can be input into the convolutional neural network to be trained to obtain a sample score matrix indicating the pixels presenting the limb parts.
- the obtained sample score matrix is compared with the score matrix in the preset indication information, and the difference between the obtained sample score matrix and the score matrix in the indication information is determined.
- the difference includes the difference between the scores of each coordinate position in the score matrix.
- The above difference being less than the preset threshold may specifically mean that, in the obtained sample score matrix, the number of scores whose difference from the score at the corresponding coordinate position in the score matrix of the indication information is less than a preset value is greater than a preset number.
- The parameters of the convolutional neural network can be adjusted; for example, the number of convolutional layers in the convolutional neural network, the size of the convolution kernels, etc. can be adjusted. The above-mentioned training samples are then used to continue training the parameter-adjusted convolutional neural network until the above-mentioned difference is less than the preset threshold.
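- The training procedure described above might look roughly like the following PyTorch sketch; the network layout, loss function, optimizer, and stopping threshold are illustrative assumptions rather than the patent's actual configuration:

```python
import torch
import torch.nn as nn

# Hypothetical network: predicts one score map per limb part
# (num_parts channels), each with the same spatial size as the input image.
class LimbScoreNet(nn.Module):
    def __init__(self, num_parts: int):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(3, 32, kernel_size=3, padding=1), nn.ReLU(),
            nn.Conv2d(32, num_parts, kernel_size=3, padding=1),
        )

    def forward(self, x):
        return self.body(x)

def train_detection_model(model, loader, stop_threshold=0.05, max_epochs=50):
    """Adjust the parameters until the difference between the predicted and
    target score matrices falls below `stop_threshold` (a stand-in for the
    patent's preset threshold), then treat training as complete."""
    optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
    loss_fn = nn.MSELoss()
    for _ in range(max_epochs):
        for images, target_scores in loader:  # targets: 10 on limb pixels, 0 elsewhere
            predicted = model(images)
            difference = loss_fn(predicted, target_scores)
            if difference.item() < stop_threshold:
                return model          # training considered complete
            optimizer.zero_grad()
            difference.backward()
            optimizer.step()
    return model
```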
- Step 403 For each score matrix of the preset number of score matrices, determine the area in the image of the pixel corresponding to the score greater than the preset threshold in the score matrix.
- The above-mentioned execution subject determines the scores in each score matrix that are greater than the preset threshold. Since the position of a pixel in the image corresponds to the coordinate position of a score in the score matrix, the coordinate positions of the scores greater than the preset threshold in the score matrix can be used to determine the distribution area of the pixels corresponding to those scores.
- Step 404 Generate a heat map corresponding to each limb part based on the determined area, the limb part corresponding to the score matrix, and the preset heat map color value corresponding to each limb part.
- the color value of the heat map corresponding to each limb part may be preset in the above-mentioned execution body.
- The color value of the heat map used to indicate the arm may be yellow
- the color value of the heat map used to indicate the leg may be blue
- the color value of the heat map used to indicate the shoulder may be red.
- a heat map corresponding to each limb part can be generated.
- the heat map is an image that presents the contours of the above-mentioned limbs in a special highlight form.
- Specifically, the pixel area corresponding to the scores greater than the preset threshold in a score matrix can be set to a bright color.
- Since each score matrix indicates the position distribution of one limb part, the pixel area corresponding to its scores is set to a bright color, and the specific color is determined by the heat map color value set in advance for each limb part.
- Step 405: The generated heat map is superimposed on the obtained image at the location of the region corresponding to each limb part to generate an image with the superimposed heat map.
- The above-mentioned execution subject may superimpose the heat map generated in step 404 on the image obtained in step 401 at the location of the region corresponding to each limb part.
- For example, the generated heat maps of the left arm and left leg can be superimposed on the image obtained in step 401 to obtain an image with the superimposed heat map, and the image is then presented on the terminal.
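- Putting steps 403 to 405 together, a minimal sketch might look like the following; the limb-part names, the color values (yellow arm, blue leg, red shoulder, as in the examples above), the threshold, and the blending factor are assumptions for illustration:

```python
import numpy as np

# Preset heat-map colors per limb part, following the examples in the text.
PART_COLORS = {
    "arm": (255, 255, 0),
    "leg": (0, 0, 255),
    "shoulder": (255, 0, 0),
}

def overlay_limb_heat_maps(image, score_matrices, threshold=8.0, alpha=0.6):
    """Color the pixels of each limb part and blend them into the image.

    `score_matrices` maps a limb-part name to an H x W score matrix; pixels
    whose score exceeds the threshold are treated as belonging to that part.
    """
    result = image.astype(np.float32)
    for part, scores in score_matrices.items():
        mask = scores > threshold
        color = np.array(PART_COLORS[part], dtype=np.float32)
        result[mask] = (1.0 - alpha) * result[mask] + alpha * color
    return result.astype(np.uint8)
```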
- It can be seen from Fig. 4 that, unlike the embodiment shown in Fig. 2, this embodiment highlights the step in which the output result of the detection model is a set of score matrices, each of which is used to indicate the position distribution of one limb part. Therefore, by determining the position distribution of the limb parts based on pixel-level scores, the detected position distribution of the limb parts can be made more accurate.
- the present disclosure provides an embodiment of an image generation device.
- The device embodiment corresponds to the method embodiment shown in FIG. 2, and the device can be specifically applied to various electronic devices.
- the image generating device 600 includes an acquiring unit 601, an input unit 602, a first generating unit 603, and a second generating unit 604.
- The acquiring unit 601 is configured to acquire an image of the target moving object, the image presenting the limb parts of the target moving object.
- The input unit 602 is configured to input the image to a pre-trained detection model to obtain an output result indicating the position distribution, in the image, of each limb part in the preset limb part set.
- The first generating unit 603 is configured to generate, based on the output result, a heat map corresponding to each limb part.
- The second generating unit 604 is configured to superimpose the generated heat map on the image at the location of the region corresponding to each limb part, to generate an image with the superimposed heat map.
- For the specific processing of the acquiring unit 601, the input unit 602, the first generating unit 603, and the second generating unit 604, and the technical effects they bring, reference may be made to steps 201, 202, 203, and 204 in the embodiment corresponding to FIG. 2, respectively, which will not be repeated here.
- The output result includes a preset number of score matrices, and each score matrix includes scores corresponding to the image and used to indicate the distribution of pixels presenting a limb part in the image;
- the score matrices correspond one-to-one to the limb parts in the limb part set.
- The first generating unit 603 is further configured to: for each of the preset number of score matrices, determine the region, in the image, of the pixels corresponding to the scores in the score matrix that are greater than the preset threshold; and generate the heat map corresponding to each limb part based on the determined region, the limb part corresponding to the score matrix, and the preset heat map color value corresponding to each limb part.
- The detection model is obtained by training through the following steps: obtaining a training sample set, the training sample set including sample images presenting limb parts and indication information indicating the position distribution, in the sample image, of each limb part presented in the sample image; and, based on the training sample set, taking the sample images as input and the indication information corresponding to the sample images as expected output, training the detection model using a machine learning method.
- The indication information includes a score matrix corresponding to the sample image and used to indicate the distribution of pixels presenting limb parts in the sample image; and the detection model is further obtained by performing the following training steps: for a sample image in the training sample set, input the sample image to a convolutional neural network to obtain a sample score matrix indicating the pixel distribution of each limb part in the sample image; determine whether the difference between the obtained sample score matrix corresponding to each sample image and the score matrix in the indication information is less than the preset threshold; in response to determining that the difference is less than the preset threshold, determine that training of the convolutional neural network is completed, and use the trained convolutional neural network as the detection model; in response to determining that the difference is greater than or equal to the preset threshold, adjust the parameters of the convolutional neural network to be trained, and re-execute the training steps.
- The image generation device detects the acquired image of the target moving object, determines the position of the limb parts of the target moving object in the image, and generates a heat map of the limb positions that is superimposed at the corresponding positions of the limbs in the image. There is no need to detect the limbs of the moving object by means of key point detection, which can introduce positioning deviations; this improves the accuracy of limb positioning and helps accurately guide the user to complete subsequent limb actions.
- Fig. 7 shows a schematic structural diagram of an electronic device (for example, the terminal device in Fig. 1) 700 suitable for implementing the embodiments of the present disclosure.
- The terminal devices in the embodiments of the present disclosure may include, but are not limited to, mobile terminals such as mobile phones, notebook computers, digital broadcast receivers, PDAs (personal digital assistants), PADs (tablet computers), PMPs (portable multimedia players), and vehicle-mounted terminals (for example, car navigation terminals), as well as fixed terminals such as digital TVs and desktop computers.
- the terminal device shown in FIG. 7 is only an example, and should not bring any limitation to the function and scope of use of the embodiments of the present disclosure.
- The electronic device 700 may include a processing device (such as a central processing unit, a graphics processor, etc.) 701, which may execute various appropriate actions and processing according to a program stored in a read-only memory (ROM) 702 or a program loaded from a storage device 708 into a random access memory (RAM) 703.
- the RAM 703 also stores various programs and data required for the operation of the electronic device 700.
- the processing device 701, the ROM 702, and the RAM 703 are connected to each other through a bus 704.
- An input/output (I/O) interface 705 is also connected to the bus 704.
- The following devices can be connected to the I/O interface 705: input devices 706 including, for example, a touch screen, touch pad, keyboard, mouse, camera, microphone, accelerometer, gyroscope, etc.; output devices 707 including, for example, a liquid crystal display (LCD), speakers, vibrators, etc.; storage devices 708 including, for example, a magnetic tape, a hard disk, etc.; and a communication device 709.
- the communication device 709 may allow the electronic device 700 to perform wireless or wired communication with other devices to exchange data.
- Although FIG. 7 shows an electronic device 700 having various devices, it should be understood that it is not required to implement or have all the illustrated devices; more or fewer devices may alternatively be implemented or provided. Each block shown in FIG. 7 can represent one device, or can represent multiple devices as needed.
- the process described above with reference to the flowchart can be implemented as a computer software program.
- the embodiments of the present disclosure include a computer program product, which includes a computer program carried on a computer-readable medium, and the computer program contains program code for executing the method shown in the flowchart.
- the computer program may be downloaded and installed from the network through the communication device 709, or installed from the storage device 708, or installed from the ROM 702.
- When the computer program is executed by the processing device 701, the above-mentioned functions defined in the method of the embodiments of the present disclosure are executed.
- the computer-readable medium described in the embodiments of the present disclosure may be a computer-readable signal medium or a computer-readable storage medium, or any combination of the two.
- The computer-readable storage medium may be, for example, but not limited to, an electric, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the above. More specific examples of computer-readable storage media may include, but are not limited to: an electrical connection with one or more wires, a portable computer disk, a hard disk, a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disk read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the above.
- the computer-readable storage medium may be any tangible medium that contains or stores a program, and the program may be used by or in combination with an instruction execution system, apparatus, or device.
- the computer-readable signal medium may include a data signal propagated in a baseband or as a part of a carrier wave, and a computer-readable program code is carried therein. This propagated data signal can take many forms, including but not limited to electromagnetic signals, optical signals, or any suitable combination of the foregoing.
- the computer-readable signal medium may also be any computer-readable medium other than the computer-readable storage medium.
- The computer-readable signal medium may send, propagate, or transmit the program for use by or in combination with the instruction execution system, apparatus, or device.
- the program code contained on the computer-readable medium can be transmitted by any suitable medium, including but not limited to: wire, optical cable, RF (Radio Frequency), etc., or any suitable combination of the above.
- the above-mentioned computer-readable medium may be included in the above-mentioned terminal device; or it may exist alone without being assembled into the terminal device.
- The aforementioned computer-readable medium carries one or more programs. When the one or more programs are executed by the electronic device, the electronic device: acquires an image of the target moving object, the image presenting the limb parts of the target moving object; inputs the image to the pre-trained detection model to obtain an output result indicating the position distribution, in the image, of each limb part in the preset limb part set; generates, based on the output result, a heat map corresponding to each limb part; and superimposes the generated heat map on the image at the location of the region corresponding to each limb part to generate an image with the superimposed heat map.
- the computer program code used to perform the operations of the embodiments of the present disclosure can be written in one or more programming languages or a combination thereof.
- The programming languages include object-oriented programming languages such as Java, Smalltalk, and C++, as well as conventional procedural programming languages such as the "C" language or similar programming languages.
- the program code can be executed entirely on the user's computer, partly on the user's computer, executed as an independent software package, partly on the user's computer and partly executed on a remote computer, or entirely executed on the remote computer or server.
- The remote computer can be connected to the user's computer through any kind of network, including a local area network (LAN) or a wide area network (WAN), or it can be connected to an external computer (for example, through the Internet using an Internet service provider).
- Each block in the flowchart or block diagram can represent a module, program segment, or part of code, and the module, program segment, or part of code contains one or more executable instructions for realizing the specified logical function.
- the functions marked in the block may also occur in a different order from the order marked in the drawings. For example, two blocks shown in succession can actually be executed substantially in parallel, or they can sometimes be executed in the reverse order, depending on the functions involved.
- Each block in the block diagram and/or flowchart, and the combination of blocks in the block diagram and/or flowchart, can be implemented by a dedicated hardware-based system that performs the specified functions or operations, or by a combination of dedicated hardware and computer instructions.
- The described units may be provided in a processor; for example, a processor may be described as including an acquisition unit, an input unit, a first generation unit, and a second generation unit.
- the names of these units do not constitute a limitation on the unit itself under certain circumstances.
- the acquisition unit can also be described as a "unit for acquiring an image of a target moving object".
Claims (12)
- An image generation method, comprising: acquiring an image of a target moving object, the image presenting limb parts of the target moving object; inputting the image to a pre-trained detection model to obtain an output result indicating the position distribution, in the image, of each limb part in a preset limb part set; generating, based on the output result, a heat map corresponding to each limb part; and superimposing the generated heat map on the image at the location of the region corresponding to each limb part, to generate an image with the superimposed heat map.
- The method according to claim 1, wherein the output result comprises a preset number of score matrices, each score matrix comprising scores corresponding to the image and used to indicate the distribution of pixels presenting a limb part in the image; and the score matrices correspond one-to-one to the limb parts in the limb part set.
- The method according to claim 2, wherein the generating, based on the output result, a heat map corresponding to each limb part comprises: for each of the preset number of score matrices, determining the region, in the image, of the pixels corresponding to the scores in the score matrix that are greater than a preset threshold; and generating the heat map corresponding to each limb part based on the determined region, the limb part corresponding to the score matrix, and a preset heat map color value corresponding to each limb part.
- The method according to claim 1 or 2, wherein the detection model is obtained by training through the following steps: obtaining a training sample set, the training sample set comprising sample images presenting limb parts and indication information indicating the position distribution, in the sample image, of each limb part presented in the sample image; and based on the training sample set, taking the sample images as input and the indication information corresponding to the sample images as expected output, and training the detection model using a machine learning method.
- The method according to claim 4, wherein the indication information comprises a score matrix corresponding to the sample image and used to indicate the distribution of pixels presenting limb parts in the sample image; and the training the detection model, based on the training sample set, with the sample images as input and the indication information corresponding to the sample images as expected output, using a machine learning method comprises performing the following training steps: for a sample image in the training sample set, inputting the sample image to a convolutional neural network to obtain a sample score matrix indicating the pixel distribution of each limb part in the sample image; determining whether the difference between the obtained sample score matrix corresponding to each sample image and the score matrix in the indication information is less than a preset threshold; in response to determining that the difference is less than the preset threshold, determining that training of the convolutional neural network is completed, and using the trained convolutional neural network as the detection model; and in response to determining that the difference is greater than or equal to the preset threshold, adjusting parameters of the convolutional neural network to be trained, and re-executing the training steps.
- An image generation device, comprising: an acquisition unit configured to acquire an image of a target moving object, the image presenting limb parts of the target moving object; an input unit configured to input the image to a pre-trained detection model to obtain an output result indicating the position distribution, in the image, of each limb part in a preset limb part set; a first generating unit configured to generate, based on the output result, a heat map corresponding to each limb part; and a second generating unit configured to superimpose the generated heat map on the image at the location of the region corresponding to each limb part, to generate an image with the superimposed heat map.
- The device according to claim 6, wherein the output result comprises a preset number of score matrices, each score matrix comprising scores corresponding to the image and used to indicate the distribution of pixels presenting a limb part in the image; and the score matrices correspond one-to-one to the limb parts in the limb part set.
- The device according to claim 7, wherein the first generating unit is further configured to: for each of the preset number of score matrices, determine the region, in the image, of the pixels corresponding to the scores in the score matrix that are greater than a preset threshold; and generate the heat map corresponding to each limb part based on the determined region, the limb part corresponding to the score matrix, and a preset heat map color value corresponding to each limb part.
- The device according to claim 6 or 7, wherein the detection model is obtained by training through the following steps: obtaining a training sample set, the training sample set comprising sample images presenting limb parts and indication information indicating the position distribution, in the sample image, of each limb part presented in the sample image; and based on the training sample set, taking the sample images as input and the indication information corresponding to the sample images as expected output, and training the detection model using a machine learning method.
- The device according to claim 9, wherein the indication information comprises a score matrix corresponding to the sample image and used to indicate the distribution of pixels presenting limb parts in the sample image; and the detection model is further obtained by performing the following training steps: for a sample image in the training sample set, inputting the sample image to a convolutional neural network to obtain a sample score matrix indicating the pixel distribution of each limb part in the sample image; determining whether the difference between the obtained sample score matrix corresponding to each sample image and the score matrix in the indication information is less than a preset threshold; in response to determining that the difference is less than the preset threshold, determining that training of the convolutional neural network is completed, and using the trained convolutional neural network as the detection model; and in response to determining that the difference is greater than or equal to the preset threshold, adjusting parameters of the convolutional neural network to be trained, and re-executing the training steps.
- An electronic device, comprising: one or more processors; and a storage device on which one or more programs are stored; wherein, when the one or more programs are executed by the one or more processors, the one or more processors implement the method according to any one of claims 1-5.
- A computer-readable medium on which a computer program is stored, wherein the program, when executed by a processor, implements the method according to any one of claims 1-5.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US17/620,452 US20220358662A1 (en) | 2019-06-18 | 2020-06-17 | Image generation method and device |
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910528723.1A CN110264539A (zh) | 2019-06-18 | 2019-06-18 | Image generation method and device |
CN201910528723.1 | 2019-06-18 |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2020253716A1 (zh) | 2020-12-24 |
Family
ID=67919196
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/CN2020/096547 WO2020253716A1 (zh) | 2020-12-24 | Image generation method and device |
Country Status (3)
Country | Link |
---|---|
US (1) | US20220358662A1 (zh) |
CN (1) | CN110264539A (zh) |
WO (1) | WO2020253716A1 (zh) |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN113553959A (zh) * | 2021-07-27 | 2021-10-26 | 杭州逗酷软件科技有限公司 | Action recognition method and apparatus, computer-readable medium, and electronic device |
Families Citing this family (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110264539A (zh) * | 2019-06-18 | 2019-09-20 | 北京字节跳动网络技术有限公司 | Image generation method and device |
CN111325822B (zh) | 2020-02-18 | 2022-09-06 | 腾讯科技(深圳)有限公司 | Heat map display method, apparatus, device, and readable storage medium |
CN113762015A (zh) * | 2021-01-05 | 2021-12-07 | 北京沃东天骏信息技术有限公司 | Image processing method and apparatus |
Citations (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20140307955A1 (en) * | 2013-04-12 | 2014-10-16 | Samsung Electronics Co., Ltd. | Apparatus and method for detecting body parts from user image |
CN108875482A (zh) * | 2017-09-14 | 2018-11-23 | 北京旷视科技有限公司 | Object detection method and apparatus, and neural network training method and apparatus |
CN109117753A (zh) * | 2018-07-24 | 2019-01-01 | 广州虎牙信息科技有限公司 | Body part recognition method and apparatus, terminal, and storage medium |
CN109274883A (zh) * | 2018-07-24 | 2019-01-25 | 广州虎牙信息科技有限公司 | Posture correction method and apparatus, terminal, and storage medium |
CN109685013A (zh) * | 2018-12-25 | 2019-04-26 | 上海智臻智能网络科技股份有限公司 | Method and apparatus for detecting head key points in human body posture recognition |
CN110264539A (zh) * | 2019-06-18 | 2019-09-20 | 北京字节跳动网络技术有限公司 | Image generation method and device |
Family Cites Families (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN101599177B (zh) * | 2009-07-01 | 2011-07-27 | 北京邮电大学 | Video-based human limb motion tracking method |
CN109658455B (zh) * | 2017-10-11 | 2023-04-18 | 阿里巴巴集团控股有限公司 | Image processing method and processing device |
2019
- 2019-06-18 CN CN201910528723.1A patent/CN110264539A/zh active Pending
2020
- 2020-06-17 WO PCT/CN2020/096547 patent/WO2020253716A1/zh active Application Filing
- 2020-06-17 US US17/620,452 patent/US20220358662A1/en active Pending
Also Published As
Publication number | Publication date |
---|---|
US20220358662A1 (en) | 2022-11-10 |
CN110264539A (zh) | 2019-09-20 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
| 121 | Ep: the epo has been informed by wipo that ep was designated in this application | Ref document number: 20825428; Country of ref document: EP; Kind code of ref document: A1 |
| NENP | Non-entry into the national phase | Ref country code: DE |
| 122 | Ep: pct application non-entry in european phase | Ref document number: 20825428; Country of ref document: EP; Kind code of ref document: A1 |
| 32PN | Ep: public notification in the ep bulletin as address of the addressee cannot be established | Free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 112(1) EPC (EPO FORM 1205A DATED 28.03.2022) |