WO2019120032A1 - Model construction method, photographing method, apparatus, storage medium and terminal - Google Patents
Model construction method, photographing method, apparatus, storage medium and terminal
- Publication number
- WO2019120032A1 (PCT/CN2018/116800)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- picture
- model
- human body
- virtual object
- preview image
- Prior art date
Classifications
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N23/00—Cameras or camera modules comprising electronic image sensors; Control thereof
- H04N23/60—Control of cameras or camera modules
Definitions
- Embodiments of the present application relate to a photographing technique, for example, to a model construction method, a photographing method, an apparatus, a storage medium, and a terminal.
- Augmented Reality (AR) is a new technology that "seamlessly" integrates real-world information and virtual-world information: real environments and virtual objects are superimposed in the same picture or space in real time and are perceived by the human senses, producing a sensory experience that goes beyond reality.
- AR technology is used in many fields, such as medicine, culture, industry, entertainment, and tourism. For example, the display effect of a photo can be enriched by adding AR virtual objects to it.
- However, related image recognition technology has a defect: when a virtual object is added to a photo, the addition position may be inaccurate, which degrades the display effect of the photo.
- the embodiment of the present application provides a model construction method, a photographing method, a device, a storage medium, and a terminal, which can accurately recognize a human body posture.
- In a first aspect, the embodiment of the present application provides a model construction method, including: acquiring a set number of human body posture pictures and marking a target object in each human body posture picture to obtain picture samples, where the target object includes at least one of a head, limbs, and a torso; training a preset deep learning model with a set machine learning algorithm according to the picture samples to obtain a gesture recognition model; and sending the gesture recognition model to a mobile terminal.
- In a second aspect, the embodiment of the present application further provides a photographing method, including: acquiring a first preview image of an object to be photographed; recognizing the human body posture in the first preview image with a pre-configured gesture recognition model, where the gesture recognition model is a deep learning model trained according to a set number of picture samples and the picture samples are determined according to human body posture pictures containing a target object; acquiring a virtual object, where the virtual object is used to provide an augmented reality effect for the object to be photographed; determining the added position of the virtual object according to the human body posture and adding the virtual object at the added position to form a second preview image; and acquiring a shooting instruction and obtaining a captured picture corresponding to the second preview image in response to the shooting instruction.
- In a third aspect, the embodiment of the present application further provides a model construction apparatus, including: a sample determination module, configured to acquire a set number of human body posture pictures and mark a target object in each human body posture picture to obtain picture samples, where the target object includes at least one of a head, limbs, and a torso; a model training module, configured to train a preset deep learning model with a set machine learning algorithm according to the picture samples to obtain a gesture recognition model, where the picture samples are determined according to human body posture pictures containing the target object; and a model sending module, configured to send the gesture recognition model to a mobile terminal.
- The embodiment of the present application further provides a photographing apparatus, including: an image acquiring module, configured to acquire a first preview image of an object to be photographed; a gesture recognition module, configured to recognize the human body posture in the first preview image with a pre-configured gesture recognition model, where the gesture recognition model is a deep learning model trained according to a set number of picture samples; an object acquiring module, configured to acquire a virtual object, where the virtual object is used to provide an augmented reality effect for the object to be photographed; an object adding module, configured to determine the added position of the virtual object according to the human body posture and add the virtual object at the added position to form a second preview image; and a shooting module, configured to acquire a shooting instruction and obtain a captured picture corresponding to the second preview image in response to the shooting instruction.
- The embodiment of the present application further provides a computer-readable storage medium storing a computer program which, when executed by a processor, implements the model construction method according to the first aspect.
- The embodiment of the present application further provides a computer-readable storage medium storing a computer program which, when executed by a processor, implements the photographing method according to the second aspect.
- The embodiment of the present application further provides a terminal, including a memory, a processor, and a computer program stored in the memory and executable by the processor, where the processor, when executing the computer program, implements the model construction method according to the first aspect.
- The embodiment of the present application further provides another terminal, including a memory, a processor, and a computer program stored in the memory and executable by the processor, where the processor, when executing the computer program, implements the photographing method according to the second aspect.
- FIG. 1 is a flowchart of a method for constructing a model provided by an embodiment of the present application
- FIG. 2 is a flowchart of a photographing method provided by an embodiment of the present application.
- FIG. 3 is a schematic diagram of adding a virtual object according to a human body posture according to an embodiment of the present application.
- FIG. 4 is a flowchart of another photographing method provided by an embodiment of the present application.
- FIG. 5 is a structural block diagram of a model construction apparatus according to an embodiment of the present application.
- FIG. 6 is a structural block diagram of a server according to an embodiment of the present application.
- FIG. 7 is a structural block diagram of a camera device according to an embodiment of the present application.
- FIG. 8 is a structural block diagram of a mobile terminal according to an embodiment of the present application.
- Human body posture estimation for images or videos is a very challenging task.
- The related art provides methods for recognizing a human body that usually use information such as human body edges, silhouette contours, and optical flow.
- However, these features are sensitive to noise, partial occlusion, and viewing-angle changes; the recognition results are easily affected by these factors, so the detection accuracy is limited.
- The present application proposes a model construction scheme that can effectively mitigate the influence of occlusion and viewing angle on human body gesture recognition results and improve the accuracy of human body gesture recognition.
- FIG. 1 is a flowchart of a method for constructing a model according to an embodiment of the present disclosure.
- the method may be implemented by a model building device, where the device may be implemented by at least one of software and hardware, and may be integrated into a terminal.
- the terminal may be a server, such as a server for performing functions such as human gesture model creation, training, and optimization.
- the method includes steps 110 to 130.
- step 110 a set number of human body posture pictures are acquired, and the target object in the human body posture picture is marked to obtain a picture sample.
- the human body posture picture is a picture containing a character, and the person in the picture poses a certain posture through the head, the limbs or the torso.
- the target object includes at least one of a head, a limb, and a torso.
- The human body posture pictures are downloaded from network platforms through a web crawler and classified, for example, by source into sports, film and television, and emoticon-pack categories.
- the head pixel coordinates, the limb pixel coordinates, and the trunk pixel coordinates of the character in the network picture are marked to obtain a first sample picture.
- the marking may be performed by roughly determining the pixel coordinates of the head of the person and the pixel coordinates of the limbs by the skin color recognition algorithm, and further determining the trunk pixel coordinates according to the head pixel coordinates and the limb pixel coordinates.
- the head contour, the limb contour, and the trunk contour are highlighted based on the above coordinates, thereby realizing marking of the head pixel coordinates, the limb pixel coordinates, and the trunk pixel coordinates.
- the head pixel coordinates, the limb pixel coordinates, and the trunk pixel coordinates may be marked with a dotted frame.
- Alternatively, the head contour, the limb contour, and the trunk contour are sampled at intervals, the sampling points are used as marker points, and the marker points are connected in sequence to mark the head pixel coordinates, the limb pixel coordinates, and the trunk pixel coordinates.
- the sampling interval can be set according to actual needs.
- the marked network picture is recorded as the first sample picture, and the first sample picture is stored in the picture sample set.
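- For illustration only (this sketch is not part of the application), the marking step above might be prototyped in Python with OpenCV as follows; the YCrCb skin-color range, the noise threshold, and the function names are assumptions rather than values taken from the text.

```python
import cv2
import numpy as np

def mark_human_pose(picture_path, sample_set):
    """Roughly mark head/limb candidates via skin-color detection."""
    img = cv2.imread(picture_path)
    ycrcb = cv2.cvtColor(img, cv2.COLOR_BGR2YCrCb)
    # Commonly used Cr/Cb range for skin tones (an assumption).
    skin_mask = cv2.inRange(ycrcb, np.array((0, 133, 77), np.uint8),
                            np.array((255, 173, 127), np.uint8))
    contours, _ = cv2.findContours(skin_mask, cv2.RETR_EXTERNAL,
                                   cv2.CHAIN_APPROX_SIMPLE)
    marks = []
    for c in contours:
        if cv2.contourArea(c) > 500:          # skip small noise regions
            x, y, w, h = cv2.boundingRect(c)  # rough head/limb pixel coordinates
            marks.append((x, y, w, h))
            cv2.rectangle(img, (x, y), (x + w, y + h), (0, 255, 0), 2)
    sample_set.append({"image": img, "marks": marks})  # marked sample picture
```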
- In an embodiment, a user picture in the mobile terminal album is acquired; the head pixel coordinates, the limb pixel coordinates, and the trunk pixel coordinates in the user picture are marked to obtain a second sample picture; and the second sample picture is stored into the picture sample set. Because the server needs the user's permission before acquiring user pictures from the mobile terminal album, when the server acquires a user picture from the album for the first time, it displays inquiry information in the form of a dialog box asking whether to grant the server permission to access the album; the user's input instruction is obtained, and if the user inputs affirmative information, the server is granted permission to access the album.
- the server After obtaining the user picture in the mobile terminal album, the server determines the head pixel coordinates and the limb pixel coordinates of the character in the user picture by using the similar method described above, and further determines the trunk pixel coordinates according to the head pixel coordinates and the limb pixel coordinates.
- the head contour, the limb contour, and the trunk contour are highlighted based on the above coordinates, thereby realizing marking of the head pixel coordinates, the limb pixel coordinates, and the trunk pixel coordinates.
- step 120 according to the picture sample, the preset deep learning model is trained by using a set machine learning algorithm to obtain a gesture recognition model.
- The deep learning model may be a convolutional neural network model. The number of hidden layers and the numbers of nodes of the input layer, the hidden layers, and the output layer may be preset, and the first parameters of the convolutional neural network are initialized, where the first parameters include the offset values of the layers and the weights of the connections; this initially yields the framework of the neural network model.
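- As a hedged sketch of presetting such a network framework (the application names no framework; PyTorch, the layer sizes, and the class count below are assumptions), constructing the modules initializes the first parameters:

```python
import torch.nn as nn

class PoseNet(nn.Module):
    """Preset CNN framework; construction initialises the first parameters
    (layer offsets and connection weights)."""
    def __init__(self, num_postures=10):        # class count is illustrative
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
        )
        self.classifier = nn.Linear(32 * 56 * 56, num_postures)  # 224x224 input

    def forward(self, x):
        return self.classifier(self.features(x).flatten(1))
```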
- the set machine learning algorithms include forward propagation algorithms and backward propagation algorithms.
- The preset deep learning model is trained in the two stages of forward propagation and backward propagation using the picture sample set; when the error calculated in the backward-propagation training reaches the expected error value, the training ends and the gesture recognition model is obtained.
- The convolutional neural network model is trained using a forward propagation algorithm and a backward propagation algorithm, and the second parameters of the neural network model framework are learned, where the second parameters are the correction parameters calculated by the backward propagation algorithm according to the deviation between the actual output computed for a picture sample and the expected output; the first parameters are then updated with the second parameters.
- the model error is calculated, wherein the model error can be determined according to the deviation between the actual output of the picture sample and the expected output.
- When the model error reaches the expected error value, the training ends and the gesture recognition model is obtained.
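- A minimal sketch of this two-stage training loop, again assuming PyTorch; the learning rate, loss function, and expected error value are illustrative assumptions:

```python
import torch

def train(model, loader, expected_error=0.05, max_epochs=100):
    optimizer = torch.optim.SGD(model.parameters(), lr=1e-3)
    loss_fn = torch.nn.CrossEntropyLoss()
    for _ in range(max_epochs):
        total, batches = 0.0, 0
        for pictures, labels in loader:
            logits = model(pictures)           # forward propagation
            loss = loss_fn(logits, labels)     # deviation from expected output
            optimizer.zero_grad()
            loss.backward()                    # backward propagation
            optimizer.step()                   # correct/update the parameters
            total, batches = total + loss.item(), batches + 1
        if total / batches < expected_error:   # expected error value reached
            break
    return model                               # the gesture recognition model
```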
- step 130 the gesture recognition model is transmitted to the mobile terminal.
- The convolutional neural network model is optimized with a preset optimization strategy, where the optimization of the convolutional neural network model includes at least one of internal network structure optimization, convolutional-layer implementation optimization, and pooling-layer implementation optimization. For example, a residual block may be added to construct a residual neural network model, or the structure of the residual block may be adjusted.
- the optimization of the implementation of the convolutional layer can be to reduce the number of connections between the output channel and the input channel, that is, the output channel is no longer associated with all input channels, but only with adjacent input channels.
- A base layer is introduced into the implementation of the convolutional layer, and the convolution is split into two steps: first, each input channel is operated on separately, and under convolution kernels of the same size each channel yields an intermediate result; each channel of the intermediate result is called a base layer. Then the channels are combined to obtain the output of the convolutional layer.
- A matrix used for image compression in the pooling layer is designed according to the required image compression factor.
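- The convolutional-layer optimization described above corresponds to what is commonly called a depthwise-separable convolution; a sketch under that reading (PyTorch assumed):

```python
import torch.nn as nn

def optimized_conv(in_ch, out_ch, kernel=3):
    """Depthwise-separable replacement for a standard convolution."""
    return nn.Sequential(
        # groups=in_ch: each output channel connects to a single input
        # channel instead of all of them; the per-channel intermediate
        # results are the "base layers" described above.
        nn.Conv2d(in_ch, in_ch, kernel, padding=kernel // 2, groups=in_ch),
        nn.Conv2d(in_ch, out_ch, 1),  # combine channels into the layer output
    )
```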
- In the technical solution of this embodiment, a set number of human body posture pictures are acquired and the target objects in the human body posture pictures are marked to obtain picture samples; a preset deep learning model is trained with a set machine learning algorithm according to the picture samples to obtain a gesture recognition model; and the gesture recognition model is sent to the mobile terminal, so that the mobile terminal can recognize, through the gesture recognition model, the human body posture in pictures captured by its camera.
- Human body posture pictures with various postures are used as training samples, and the deep learning model is trained with a machine learning algorithm to obtain the gesture recognition model, which gives the gesture recognition model the ability to recognize human body postures.
- The gesture recognition model learns, from a large number of picture samples, features that are effective for recognizing postures under varying viewing angles, occlusion, and so on; these picture samples cover different shooting angles, different distances between the camera and the person, and various degrees of self-occlusion of the person. Therefore, using the gesture recognition model to recognize the human body posture in a picture is robust to occlusion and viewing-angle changes, avoids the inaccurate recognition or misrecognition of related image recognition technology, and improves the accuracy of posture recognition.
- FIG. 2 is a flowchart of a photographing method provided by an embodiment of the present application.
- the method can be performed by a camera device, wherein the device can be implemented by at least one of software and hardware, and can be integrated into a mobile terminal, such as a mobile terminal having a camera.
- The method includes steps 210 to 250.
- step 210 a first preview image of the object to be photographed is acquired.
- The first preview image may be the picture captured by the camera and displayed in the shooting interface of the mobile terminal before the user presses the shutter button.
- the first preview image in this embodiment may be a character class image.
- The operation of acquiring the first preview image may be performed by the system of the mobile terminal or by any application software with a shooting function in the mobile terminal, executed by the system or the application software according to the user's instruction.
- the user can directly open the camera function in the mobile terminal system to take pictures of the person, or use the camera option of the application software to take pictures of the person.
- The first preview image may be obtained as follows: light from the person being photographed passes through the lens of the camera, the optical image of the person is projected onto the photosensitive chip, and the photosensitive chip converts the optical image signal into an electrical signal, yielding the first preview image; the first preview image is sent through a dedicated interface, such as a Mobile Industry Processor Interface (MIPI), to the image signal processor (ISP) in the mobile terminal motherboard; the ISP processes it, converts it into a format that can be displayed on the screen of the mobile terminal, and displays it on the display screen of the mobile terminal.
- step 220 the human body pose in the first preview image is identified by a pre-configured gesture recognition model.
- the gesture recognition model is a deep learning model trained according to a set number of picture samples, and the picture samples are determined according to a human body posture picture including the target object.
- the gesture recognition model can be configured to quickly and accurately identify the human body gesture in the first preview image after inputting the first preview image.
- the gesture recognition model can be a convolutional neural network model.
- The network parameters, such as the number of layers of the neural network model, the number of neurons, the convolution kernels, and the weights, are not limited here.
- the gesture recognition model may be a convolutional neural network model obtained by training a preset deep learning model based on a set number of human body posture pictures in the embodiment of the present application.
- The gesture recognition model may be built, trained, and optimized in a server, and then sent by the server to the mobile terminal and configured there.
- model construction, training, and optimization processing may also be performed in the mobile terminal if the processing capabilities of the mobile terminal permit.
- The first preview image is input into the pre-configured gesture recognition model, and the posture of the person contained in the first preview image is recognized by the gesture recognition model. Because the gesture recognition model learns effective features from a large number of picture samples (covering different shooting angles, different distances between the camera and the person, and various degrees of self-occlusion of the person), using it to recognize the human body posture in an image is robust to occlusion and viewing-angle changes.
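- A sketch of this recognition step, assuming the PoseNet-style classifier sketched earlier and an illustrative 224x224 input size:

```python
import cv2
import torch

def recognize_posture(model, first_preview_bgr):
    img = cv2.resize(first_preview_bgr, (224, 224))      # assumed input size
    x = torch.from_numpy(img[:, :, ::-1].copy()).float() / 255.0  # BGR -> RGB
    x = x.permute(2, 0, 1).unsqueeze(0)                  # HWC -> NCHW batch
    with torch.no_grad():
        logits = model(x)
    return logits.argmax(dim=1).item()                   # recognised posture id
```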
- step 230 a virtual object is obtained.
- The virtual object is used to provide an augmented reality effect for the object to be photographed and includes physical-object images (such as images of a basketball, a soccer ball, the sun, or the moon), images of film and television characters (such as Kung Fu Panda, a Smurf, or Superman), or special effects (such as smoke effects, steam effects, and trajectory effects).
- A virtual object library is preset in the mobile terminal, and various virtual objects are stored in the virtual object library. It can be understood that the virtual objects may be obtained from a network platform or designed by the terminal manufacturer.
- various virtual objects in the virtual object library are displayed for the user to select the virtual object to be added to the first preview image, and the virtual objects at this time are displayed in the default order.
- a virtual object associated with the determined human gesture may also be displayed. For example, if the user's posture is detected as a running posture, the acquired virtual objects include a runway, a finish line, a ribbon, and the like. If the user's gesture is detected as playing basketball, the virtual objects acquired include a basketball court, a basketball, a ball basket, and the like.
- step 240 an added position of the virtual object is determined according to the human body posture, and the virtual object is added at the added position to form a second preview image.
- The added position of the virtual object may be predefined; that is, the correspondence between the added position of the virtual object and the human body posture is associated in a database. For example, for a shooting posture, it is prescribed that a basketball is added at the hand; for a kicking posture, a soccer ball is added at the kicking foot; for the posture of blowing out birthday candles, a birthday greeting is added above the birthday cake; and so on.
- the correspondence between the added position of the preset virtual object and the human body posture is stored in the preset database.
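- Such a posture-to-position correspondence might be prototyped as a plain mapping; all entries below are illustrative, not taken from the application:

```python
# Illustrative correspondence between (posture, virtual object) and the
# added position; none of these entries come from the application text.
ADDED_POSITION_DB = {
    ("shooting", "basketball"): "hand",
    ("kicking", "soccer_ball"): "kicking_foot",
    ("blowing_candles", "birthday_greeting"): "above_cake",
}

def lookup_added_position(posture, virtual_object):
    # Returns None when no association is stored for this pair.
    return ADDED_POSITION_DB.get((posture, virtual_object))
```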
- FIG. 3 shows a schematic diagram of adding a virtual object according to a human body posture.
- a first preview image 310 is acquired.
- a virtual object selection window is displayed.
- The virtual object selection window 320 includes virtual object pictures 321 and option boxes 322. If it is detected that the user selects the option box 322 corresponding to the runway, the pixels in the set region under the person's feet in the first preview image 310 are replaced with the pixels of the virtual object, thereby forming the second preview image 330.
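- A sketch of the pixel replacement that forms the second preview image (NumPy arrays assumed, bounds checking omitted for brevity):

```python
def add_virtual_object(first_preview, virtual_obj, top_left):
    """Replace the pixels of the set region with the virtual object's pixels."""
    second_preview = first_preview.copy()
    y, x = top_left                      # added position, e.g. under the feet
    h, w = virtual_obj.shape[:2]
    second_preview[y:y + h, x:x + w] = virtual_obj  # pixel-for-pixel replacement
    return second_preview
```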
- step 250 a shooting instruction is acquired, and a captured picture corresponding to the second preview image is obtained in response to the shooting instruction.
- The shooting instruction may be an operation instruction triggered by the user pressing the camera button, by a camera voice command input by the user, or by a camera gesture made by the user.
- the second preview image is stored in response to the shooting instruction, and a captured image corresponding to the second preview image is obtained.
- the second preview image is saved to obtain a photographed image, and the photographed image is stored in the album of the mobile terminal.
- the technical solution of the embodiment obtains a first preview image of the object to be photographed; identifies a human body posture in the first preview image through a pre-configured gesture recognition model; acquires a virtual object; and determines the virtual image according to the human body posture Adding a position of the object, and adding the virtual object to the added position to form a second preview image; acquiring a shooting instruction, and obtaining a captured picture corresponding to the second preview image in response to the shooting instruction.
- Recognizing the human body posture in the first preview image with the gesture recognition model avoids the influence of the shooting angle and occlusion on the accuracy of posture recognition and improves the accuracy of posture recognition; determining the added position of the virtual object according to the human body posture combines the augmented reality function with human body posture detection, so that virtual objects are added precisely at accurate positions and the display effect of the captured photo is enhanced.
- FIG. 4 is a flowchart of another photographing method provided by an embodiment of the present application. As shown in FIG. 4, the method includes steps 401 to 410.
- step 401 a first preview image of the object to be photographed is acquired.
- step 402 it is determined whether pixels corresponding to skin are included in the first preview image. If the first preview image includes pixels corresponding to skin, step 404 is performed; if the first preview image does not include pixels corresponding to skin, step 403 is performed.
- The first preview image may be subjected to image processing to obtain a histogram of the first preview image, and whether pixels corresponding to skin are included in the first preview image is determined according to the gray-value distribution of the pixel points in the histogram.
- Another example is to establish a region model with MATLAB and, using the value range of skin color in the color space, mark regions that satisfy certain conditions as skin-color regions.
- There are two main steps in using this model: one is to use statistical methods to determine the specific value range of skin color; the other is to use the model to determine whether a new pixel or region is skin color.
- A pixel or region whose values fall within the skin-color value range is determined to be a skin region, and a pixel or region whose values fall outside that range is determined to be a non-skin region.
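- A sketch of such a skin test in Python with OpenCV; the YCrCb value range and the minimum skin fraction are common choices assumed here, not values given in the text:

```python
import cv2
import numpy as np

def contains_skin(first_preview_bgr, min_fraction=0.01):
    """Step 402 sketch: does the preview contain skin-coloured pixels?"""
    ycrcb = cv2.cvtColor(first_preview_bgr, cv2.COLOR_BGR2YCrCb)
    mask = cv2.inRange(ycrcb, np.array((0, 133, 77), np.uint8),
                       np.array((255, 173, 127), np.uint8))
    return np.count_nonzero(mask) / mask.size >= min_fraction
```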
- step 403 prompt information indicating that the added position of the virtual object cannot be determined is output.
- The prompt information is displayed in the form of a dialog box to inform the user that the added position of the virtual object cannot be determined and to let the user select whether to still add the virtual object. If the user chooses to continue adding, the user is provided with a function for specifying the added position.
- A first layer may be drawn and set to display the virtual object to be added, and the area of the first layer other than the pixel coordinates corresponding to the virtual object is transparent.
- the first layer is displayed on the second layer, and the operation instruction of the user on the first layer is obtained. For example, the user can drag the first layer to the location to be added.
- The first layer and the second layer are composited to display the virtual object at the added position in the first preview image specified by the user.
- the size of the layer is not limited by the screen size of the mobile terminal, that is, the size of the layer can be greater than, equal to, or smaller than the screen size.
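- A sketch of compositing the draggable transparent first layer onto the second layer by alpha blending (an RGBA first layer is assumed; this is one possible reading, not the application's prescribed implementation):

```python
import numpy as np

def composite_layers(second_layer, first_layer_rgba, offset):
    """Alpha-blend the (draggable) first layer onto the second layer."""
    out = second_layer.copy()
    y, x = offset                          # position the user dragged the layer to
    h, w = first_layer_rgba.shape[:2]
    alpha = first_layer_rgba[:, :, 3:4] / 255.0   # 0 outside the virtual object
    region = out[y:y + h, x:x + w]
    out[y:y + h, x:x + w] = (alpha * first_layer_rgba[:, :, :3]
                             + (1 - alpha) * region).astype(np.uint8)
    return out
```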
- step 404 the first preview image is input to the gesture recognition model.
- the gesture recognition model may be a deep learning model trained according to the set number of picture samples described above.
- the first preview image is recognized by the gesture recognition model, and the human body posture included in the first preview image is determined.
- step 405 status information of the augmented reality function is acquired under the shooting scene.
- The shooting scene means that the current interface is a shooting interface, including but not limited to an interface for capturing a person's image through the camera.
- the augmented reality function is added to the camera application of the mobile terminal, that is, when the user shoots through the camera application, it is possible to select whether to enable the augmented reality function.
- An augmented reality function option is added to the shooting interface, and the augmented reality function is enabled by selecting the augmented reality function option. If the augmented reality function option is not selected, it is determined that the augmented reality function is not enabled.
- step 406 when the augmented reality function is enabled, the virtual objects in the virtual object library are sorted in descending order according to the frequency of use, and the sorting result is displayed.
- The user's historical behavior under the augmented reality function can be obtained and analyzed to determine the use frequency of each virtual object, and the virtual objects are arranged in descending order of use frequency; that is, the most frequently used virtual objects are ranked first.
- the display priority is configured for the virtual object according to the sorting result.
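- A sketch of the descending-order sort by use frequency; the history format is an assumption:

```python
from collections import Counter

def sort_by_use_frequency(virtual_object_library, usage_history):
    """Most frequently used virtual objects come first (descending order)."""
    freq = Counter(usage_history)          # e.g. ["runway", "basketball", ...]
    return sorted(virtual_object_library, key=lambda obj: freq[obj], reverse=True)
```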
- step 407 a virtual object is obtained.
- the manner in which the user selects the virtual object includes, but is not limited to, clicking an icon of the virtual object, selecting a virtual object by means of voice input, and selecting a virtual object by a gesture manner such as a preset number of shaking times.
- step 408 the added position of the virtual object is determined by querying a preset database according to the human body posture and the virtual object.
- The preset database stores the associations between human body postures and the added positions of virtual objects. For example, by querying the preset database, it can be determined that a soccer ball should be added at the foot for a kicking posture.
- step 409 the pixel value corresponding to the added position is replaced by the pixel value of the virtual object to form a second preview image.
- That is, the pixel values of the pixels at the added position are correspondingly replaced with the pixel values of the virtual object to form the second preview image.
- the second preview image is converted into a format that can be displayed on the screen of the mobile terminal, and displayed on the display screen of the mobile terminal.
- step 410 a shooting instruction is acquired, and a captured picture corresponding to the second preview image is obtained in response to the shooting instruction.
- In the technical solution of this embodiment, after the first preview image of the object to be photographed is obtained, it is determined whether the first preview image includes pixels corresponding to skin, and the first preview image is input into the gesture recognition model only when it does. This avoids inputting a first preview image containing no portrait into the gesture recognition model, thereby reducing the data processing load of the GPU.
- The virtual objects are displayed in descending order of use frequency, which realizes the function of recommending commonly used virtual objects to the user and optimizes the human-computer interaction.
- FIG. 5 is a structural block diagram of a model construction apparatus according to an embodiment of the present application.
- the apparatus may be implemented by at least one of software and hardware, generally integrated into a terminal, which may be a server configured to construct a gesture recognition model by performing a model building method.
- the apparatus includes a sample determination module 510, a model training module 520, and a model transmission module 530.
- The sample determination module 510 is configured to acquire a set number of human body posture pictures and mark the target object in each human body posture picture to obtain picture samples, where the target object includes at least one of a head, limbs, and a torso.
- the model training module 520 is configured to train the preset deep learning model according to the picture sample by using a set machine learning algorithm to obtain a gesture recognition model.
- the model sending module 530 is configured to send the gesture recognition model to the mobile terminal.
- The technical solution of this embodiment provides a model construction apparatus that uses human body posture pictures with various postures as training samples and trains the deep learning model with a machine learning algorithm to obtain a gesture recognition model, giving the gesture recognition model the ability to recognize human body postures. Because the gesture recognition model learns, from a large number of picture samples, features that are effective for recognizing postures (the samples cover different shooting angles, different distances between the camera and the person, various degrees of self-occlusion of the person, and so on), using the gesture recognition model to recognize the human body posture in a picture is robust to occlusion and viewing-angle changes, avoids the inaccurate recognition or misrecognition of related image recognition technology, and improves the accuracy of posture recognition.
- the sample determining module 510 is configured to: acquire a network image in the network platform image library; mark the head pixel coordinates, the limb pixel coordinates, and the trunk pixel coordinates of the character in the network image to obtain the first a sample picture; storing the first sample picture in a picture sample set.
- the sample determining module 510 is configured to: obtain a user picture in the mobile terminal album; mark the head pixel coordinates, the limb pixel coordinates, and the trunk pixel coordinates in the user picture to obtain a second sample picture. And storing the second sample picture into a picture sample set.
- The model training module 520 is configured to: train the preset deep learning model in the two stages of forward propagation and backward propagation using the picture sample set; and when the error calculated in the backward-propagation training reaches the expected error value, end the training to obtain the gesture recognition model.
- In an embodiment, the gesture recognition model is a convolutional neural network model, and the model construction apparatus further includes a model optimization module, configured to optimize the convolutional neural network model with a preset optimization strategy before the gesture recognition model is sent to the mobile terminal, where the optimization of the convolutional neural network model includes at least one of internal network structure optimization, convolutional-layer implementation optimization, and pooling-layer implementation optimization.
- The embodiment of the present application further provides a storage medium containing computer-executable instructions which, when executed by a computer processor, perform the model construction method provided by the embodiments of the present application, the method including: acquiring a set number of human body posture pictures and marking the target object in each human body posture picture to obtain picture samples, where the target object includes at least one of a head, limbs, and a torso; training a preset deep learning model with a set machine learning algorithm according to the picture samples to obtain a gesture recognition model; and sending the gesture recognition model to the mobile terminal.
- Storage media may be any of various types of memory devices or storage devices.
- the term "storage medium” is intended to include: a mounting medium such as a Compact Disc Read-Only Memory (CD-ROM), a floppy disk or a tape device; a computer system memory or a random access memory such as a dynamic random access memory; (Dynamic Random Access Memory, DRAM), Double Data Rate Random Access Memory (DDRRAM), Static Random Access Memory (SRAM), Extended Data Output Random Access Memory (Extended Data Output Random Access Memory (EDORAM), Rambus RAM, etc.; non-volatile memory such as flash memory, magnetic media (such as hard disk or optical storage); registers or other similar types of memory elements, and the like.
- a mounting medium such as a Compact Disc Read-Only Memory (CD-ROM), a floppy disk or a tape device
- a computer system memory or a random access memory such as a dynamic random access memory
- DRAM Double Data Rate Random Access Memory
- SRAM Static Random Access Memory
- the storage medium may also include other types of memory or a combination thereof. Additionally, the storage medium may be located in a first computer system in which the program is executed, or may be located in a different second computer system, the second computer system being coupled to the first computer system via a network, such as the Internet. The second computer system can provide program instructions to the first computer for execution.
- the term "storage medium" can include two or more storage media that can reside in different locations (eg, in different computer systems connected through a network).
- the storage medium may store program instructions (eg, embodied as a computer program) executable by the at least one processor.
- In the storage medium containing computer-executable instructions provided by the embodiments of the present application, the computer-executable instructions are not limited to the model construction operations described above and may also perform related operations in the model construction method provided by any embodiment of the present application.
- the embodiment of the present application further provides a terminal, which may be a server or other electronic device with strong computing capability.
- the server integrates a model construction device, and is configured to construct a gesture recognition model by executing a model construction method.
- FIG. 6 is a structural block diagram of a server according to an embodiment of the present application. As shown in FIG. 6, the terminal 600 includes a memory 610 and a processor 620. The memory 610 is configured to store executable program code and picture samples; the processor 620 reads the executable program code stored in the memory 610 to run a computer program corresponding to the executable program code, so as to: acquire a set number of human body posture pictures and mark the target object in each human body posture picture to obtain picture samples, where the target object includes at least one of a head, limbs, and a torso; train a preset deep learning model with a set machine learning algorithm according to the picture samples to obtain a gesture recognition model; and send the gesture recognition model to a mobile terminal.
- The structure of the foregoing terminal is only an example; the terminal may include, but is not limited to, the memory and the processor described in the above examples, and may further include a peripheral interface, a power management chip, an input/output (I/O) subsystem, other input/control devices, and external ports, which communicate via at least one communication bus or signal line.
- The model construction apparatus, the storage medium, and the server provided in the foregoing embodiments can perform the corresponding model construction method provided by any embodiment of the present application and have the corresponding functional modules and beneficial effects for performing the method.
- FIG. 7 is a structural block diagram of a photographing apparatus according to an embodiment of the present application. As shown in FIG. 7 , the apparatus includes an image obtaining module 710, a gesture recognition module 720, an object obtaining module 730, an object adding module 740, and a photographing module 750.
- the image obtaining module 710 is configured to acquire a first preview image of the object to be photographed.
- the gesture recognition module 720 is configured to identify a human body gesture in the first preview image by using a pre-configured gesture recognition model, wherein the gesture recognition model is a deep learning model trained according to a set number of image samples, the image The sample is determined according to a human body pose picture containing the target object.
- In an embodiment, the picture samples are human body posture pictures containing the target object, and the deep learning model is a convolutional neural network model.
- the object obtaining module 730 is configured to acquire a virtual object, wherein the virtual object is used to provide an augmented reality effect for the object to be photographed.
- the object adding module 740 is configured to determine an added position of the virtual object according to the human body posture, and add the virtual object at the added position to form a second preview image.
- the shooting module 750 is configured to acquire a shooting instruction, and obtain a captured picture corresponding to the second preview image in response to the shooting instruction.
- The technical solution of this embodiment provides a photographing apparatus that recognizes the human body posture in the first preview image with the gesture recognition model, which avoids the influence of the shooting angle and occlusion on the accuracy of posture recognition and improves the accuracy of posture recognition; it determines the added position of the virtual object according to the human body posture, combining the augmented reality function with human body posture detection so that virtual objects are added precisely at accurate positions, improving the display effect of the captured photo.
- The photographing apparatus further includes a determining module, configured to determine, after the image acquiring module acquires the first preview image of the object to be photographed, whether pixels corresponding to skin are included in the first preview image; if the first preview image includes pixels corresponding to skin, the first preview image is input into the gesture recognition model; if the first preview image does not include pixels corresponding to skin, prompt information indicating that the added position of the virtual object cannot be determined is output.
- In an embodiment, the apparatus further includes: a state information acquiring module, configured to acquire state information of the augmented reality function in the shooting scene before the object acquiring module acquires the virtual object; and a sorting module, configured to, when the augmented reality function is enabled, sort the virtual objects in the virtual object library in descending order of use frequency and display the sorting result.
- The object adding module 740 is configured to: determine the added position of the virtual object by querying a preset database according to the human body posture and the virtual object, where the preset database stores the associations between human body postures and the added positions of virtual objects; and replace the pixel values at the added position with the pixel values of the virtual object.
- The embodiment of the present application further provides a storage medium containing computer-executable instructions which, when executed by a computer processor, perform the photographing method provided by the embodiments of the present application, the method including: acquiring a first preview image of an object to be photographed; recognizing the human body posture in the first preview image with a pre-configured gesture recognition model, where the gesture recognition model is a deep learning model trained according to a set number of picture samples; acquiring a virtual object, where the virtual object is configured to provide an augmented reality effect for the object to be photographed; determining the added position of the virtual object according to the human body posture and adding the virtual object at the added position to form a second preview image; and acquiring a shooting instruction and obtaining a captured picture corresponding to the second preview image in response to the shooting instruction.
- Storage media may be any of various types of memory devices or storage devices.
- the term "storage medium” is intended to include: a mounting medium such as a CD-ROM, a floppy disk or a tape device; a computer system memory or a random access memory such as DRAM, DDRRAM, SRAM, EDORAM, Rambus RAM, etc.; Volatile memory, such as flash memory, magnetic media (such as hard disk or optical storage); registers or other similar types of memory elements, and the like.
- the storage medium may also include other types of memory or a combination thereof.
- the storage medium may be located in a first computer system in which the program is executed, or may be located in a different second computer system, the second computer system being coupled to the first computer system via a network, such as the Internet.
- the second computer system can provide program instructions to the first computer for execution.
- the term "storage medium" can include two or more storage media that can reside in different locations (eg, in different computer systems connected through a network).
- the storage medium may store program instructions (eg, embodied as a computer program) executable by the at least one processor.
- In the storage medium containing computer-executable instructions provided by the embodiments of the present application, the computer-executable instructions are not limited to the photographing operations described above and may also perform related operations in the photographing method combining human body posture and augmented reality provided by any embodiment of the present application.
- FIG. 8 is a structural block diagram of a mobile terminal according to an embodiment of the present disclosure.
- The mobile terminal may include a casing (not shown), a memory 801, a central processing unit (CPU) 802 (also referred to as a processor), a circuit board (not shown), and a power supply circuit (not shown).
- The circuit board is disposed inside the space enclosed by the casing; the CPU 802 and the memory 801 are disposed on the circuit board; and the power supply circuit is configured to supply power to each circuit or device of the terminal.
- The memory 801 is configured to store executable program code, a preset database of added positions of virtual objects, and the like; the CPU 802 reads the executable program code stored in the memory 801 to run a computer program corresponding to the executable program code, so as to: acquire a first preview image of an object to be photographed; recognize the human body posture in the first preview image with a pre-configured gesture recognition model, where the gesture recognition model is a deep learning model trained according to a set number of picture samples and the picture samples are determined according to human body posture pictures containing a target object; acquire a virtual object, where the virtual object is used to provide an augmented reality effect for the object to be photographed; determine the added position of the virtual object according to the human body posture and add the virtual object at the added position to form a second preview image; and acquire a shooting instruction and obtain a captured picture corresponding to the second preview image in response to the shooting instruction.
- The terminal further includes: a peripheral interface 803, a radio frequency (RF) circuit 805, an audio circuit 806, a speaker 811, a power management chip 808, an input/output (I/O) subsystem 809, other input/control devices 810, a touch screen 812, and an external port 804; these components communicate via at least one communication bus or signal line 807.
- The illustrated mobile terminal 800 is merely one example of a terminal; the mobile terminal 800 may have more or fewer components than shown in the figure, may combine two or more components, or may have a different configuration of components.
- The various components shown in the figure can be implemented in hardware, software, or a combination of hardware and software, including at least one of signal processing circuits and application-specific integrated circuits.
- The mobile terminal 800 provided in this embodiment is described in detail below, taking a mobile phone as an example.
- The memory 801 can be accessed by the CPU 802, the peripheral interface 803, and so on. The memory 801 may include high-speed random access memory and may also include non-volatile memory, such as at least one magnetic disk storage device, a flash memory device, or other non-volatile solid-state storage devices.
- The peripheral interface 803 can connect the input and output peripherals of the device to the CPU 802 and the memory 801.
- the I/O subsystem 809 can connect input and output peripherals on the device, such as touch screen 812 and other input/control devices 810, to peripheral interface 803.
- the I/O subsystem 809 can include a display controller 8081 and at least one input controller 8092 configured to control other input/control devices 810.
- The at least one input controller 8092 receives electrical signals from, or sends electrical signals to, the other input/control devices 810, which may include physical buttons (push buttons, rocker buttons, etc.), dials, slide switches, joysticks, and click wheels.
- the input controller 8092 can be connected to any of the following: a keyboard, an infrared port, a USB interface, and a pointing device such as a mouse.
- the touch screen 812 is an input interface and an output interface between the user terminal and the user, and displays the visual output to the user.
- the visual output may include graphics, text, icons, videos, and the like.
- The display controller 8081 in the I/O subsystem 809 receives electrical signals from the touch screen 812 or sends electrical signals to the touch screen 812.
- The touch screen 812 detects contact on the touch screen, and the display controller 8081 converts the detected contact into interaction with the user interface objects displayed on the touch screen 812, that is, realizes human-computer interaction; the user interface objects displayed on the touch screen 812 may include icons of running games, icons linking to corresponding networks, and the like.
- The device may also include a light mouse, which is a touch-sensitive surface that does not display visual output, or an extension of the touch-sensitive surface formed by the touch screen.
- the RF circuit 805 is mainly configured to establish communication between the mobile phone and the wireless network (i.e., the network side), and to receive and send data between the mobile phone and the wireless network, for example sending and receiving short messages, e-mails, and the like.
- the RF circuit 805 receives and transmits RF signals, also referred to as electromagnetic signals; the RF circuit 805 converts electrical signals into electromagnetic signals or electromagnetic signals into electrical signals, and communicates with communication networks and other devices through the electromagnetic signals.
- the RF circuit 805 can include known circuitry configured to perform these functions, including but not limited to an antenna system, an RF transceiver, at least one amplifier, a tuner, at least one oscillator, a digital signal processor, a COder-DECoder (CODEC) chipset, a Subscriber Identity Module (SIM), and so on.
- the audio circuit 806 is primarily configured to receive audio data from the peripheral interface 803, convert the audio data into an electrical signal, and transmit the electrical signal to the speaker 811.
- the speaker 811 is configured to restore the voice signal, received by the mobile phone from the wireless network through the RF circuit 805, to sound and play the sound to the user.
- the power management chip 808 is configured to provide power and power management for the hardware connected to the CPU 802, the I/O subsystem, and the peripheral interface.
- the photographing device, the storage medium and the terminal provided in the above embodiments can execute the corresponding photographing method provided by the embodiment of the present application, and have the corresponding functional modules and beneficial effects of performing the method.
- for technical details not exhaustively described in the above embodiments, refer to the photographing method provided by any embodiment of the present application.
Landscapes
- Engineering & Computer Science (AREA)
- Multimedia (AREA)
- Signal Processing (AREA)
- Image Analysis (AREA)
- User Interface Of Digital Computer (AREA)
Abstract
Embodiments of the present application disclose a model construction method, a photographing method, an apparatus, a storage medium and a terminal. The method includes: acquiring a set number of human body posture pictures and marking target objects in the human body posture pictures to obtain picture samples, where the target objects include at least one of a head, limbs and a torso; training a preset deep learning model with a set machine learning algorithm according to the picture samples to obtain a posture recognition model; and sending the posture recognition model to a mobile terminal.
Description
This application claims priority to Chinese patent application No. 201711392033.5, filed with the Chinese Patent Office on December 21, 2017, the entire contents of which are incorporated herein by reference.

Embodiments of the present application relate to photographing technologies, for example to a model construction method, a photographing method, an apparatus, a storage medium and a terminal.

Augmented Reality (AR) is a technology that "seamlessly" integrates real-world information with virtual-world information: a real environment and virtual objects are superimposed onto the same picture or space in real time so that both exist simultaneously and are perceived by the human senses, producing a sensory experience that goes beyond reality.

At present, AR technology is applied in many fields such as medicine, culture, industry, entertainment and tourism. For example, AR virtual objects can be added to photos to enrich their display effect. However, related image recognition technologies have shortcomings: when a virtual object is added to a photo, it may be placed at an inaccurate position, which degrades the display effect of the photo.
SUMMARY

The following is an overview of the subject matter described in detail herein. This overview is not intended to limit the scope of the claims.

Embodiments of the present application provide a model construction method, a photographing method, an apparatus, a storage medium and a terminal that can accurately recognize human body postures.
In a first aspect, an embodiment of the present application provides a model construction method, including: acquiring a set number of human body posture pictures and marking target objects in the human body posture pictures to obtain picture samples, where the target objects include at least one of a head, limbs and a torso; training a preset deep learning model with a set machine learning algorithm according to the picture samples to obtain a posture recognition model; and sending the posture recognition model to a mobile terminal.

In a second aspect, an embodiment of the present application further provides a photographing method, including: acquiring a first preview image of an object to be photographed; recognizing a human body posture in the first preview image by a pre-configured posture recognition model, where the posture recognition model is a deep learning model trained with a set number of picture samples, and the picture samples are determined according to human body posture pictures containing target objects; acquiring a virtual object, where the virtual object is used to provide an augmented reality effect for the object to be photographed; determining an adding position of the virtual object according to the human body posture and adding the virtual object at the adding position to form a second preview image; and acquiring a shooting instruction and obtaining, in response to the shooting instruction, a photograph corresponding to the second preview image.

In a third aspect, an embodiment of the present application further provides a model construction apparatus, including: a sample determination module configured to acquire a set number of human body posture pictures and mark target objects in the human body posture pictures to obtain picture samples, where the target objects include at least one of a head, limbs and a torso; a model training module configured to train a preset deep learning model with a set machine learning algorithm according to the picture samples to obtain a posture recognition model, the picture samples being determined according to human body posture pictures containing the target objects; and a model sending module configured to send the posture recognition model to a mobile terminal.

In a fourth aspect, an embodiment of the present application further provides a photographing apparatus, including: an image acquisition module configured to acquire a first preview image of an object to be photographed; a posture recognition module configured to recognize a human body posture in the first preview image by a pre-configured posture recognition model, where the posture recognition model is a deep learning model trained with a set number of picture samples; an object acquisition module configured to acquire a virtual object, where the virtual object is used to provide an augmented reality effect for the object to be photographed; an object adding module configured to determine an adding position of the virtual object according to the human body posture and add the virtual object at the adding position to form a second preview image; and a shooting module configured to acquire a shooting instruction and obtain, in response to the shooting instruction, a photograph corresponding to the second preview image.

In a fifth aspect, an embodiment of the present application further provides a computer-readable storage medium storing a computer program which, when executed by a processor, implements the model construction method described in the first aspect.

In a sixth aspect, an embodiment of the present application further provides a computer-readable storage medium storing a computer program which, when executed by a processor, implements the photographing method described in the second aspect.

In a seventh aspect, an embodiment of the present application further provides a terminal, including a memory, a processor, and a computer program stored on the memory and executable on the processor, where the processor, when executing the computer program, implements the model construction method described in the first aspect.

In an eighth aspect, an embodiment of the present application further provides another terminal, including a memory, a processor, and a computer program stored on the memory and executable on the processor, where the processor, when executing the computer program, implements the photographing method described in the second aspect.

Other aspects will become apparent upon reading and understanding the drawings and the detailed description.
FIG. 1 is a flowchart of a model construction method according to an embodiment of the present application;

FIG. 2 is a flowchart of a photographing method according to an embodiment of the present application;

FIG. 3 is a schematic diagram of adding a virtual object according to a human body posture according to an embodiment of the present application;

FIG. 4 is a flowchart of another photographing method according to an embodiment of the present application;

FIG. 5 is a structural block diagram of a model construction apparatus according to an embodiment of the present application;

FIG. 6 is a structural block diagram of a server according to an embodiment of the present application;

FIG. 7 is a structural block diagram of a photographing apparatus according to an embodiment of the present application;

FIG. 8 is a structural block diagram of a mobile terminal according to an embodiment of the present application.
The present application is further described in detail below with reference to the drawings and embodiments. It should be understood that the specific embodiments described here merely explain the present application and do not limit it. It should also be noted that, for ease of description, the drawings show only the parts related to the present application rather than the entire structure.

Before the exemplary embodiments are discussed in more detail, it should be mentioned that some of them are described as processes or methods depicted in flowcharts. Although a flowchart describes the steps as sequential processing, many of the steps may be performed in parallel, concurrently or simultaneously. In addition, the order of the steps may be rearranged. The processing may be terminated when its operations are completed, but it may also have additional steps not included in the drawings. The processing may correspond to a method, a function, a procedure, a subroutine, a subprogram, and the like.

In human body posture recognition, estimating a human body posture from an image or a video is very challenging, because recognition cannot be made robust to natural environmental factors such as color, illumination and occlusion, and because the human body has a large number of degrees of freedom and observation angles.

Related technologies usually perform human body posture recognition with information such as human body edges, silhouette contours and optical flow. However, these features are sensitive to noise, partial occlusion and viewing-angle changes; the recognition result is easily affected by these factors, and the detection accuracy is limited. To avoid these situations, the present application proposes a model construction scheme that effectively reduces the influence of occlusion and viewing angle on human body posture recognition results and improves the accuracy of human body posture recognition.
FIG. 1 is a flowchart of a model construction method according to an embodiment of the present application. The method may be executed by a model construction apparatus, which may be implemented by at least one of software and hardware and may generally be integrated in a terminal. The terminal may be a server, for example a server used to create, train and optimize a human body posture model. As shown in FIG. 1, the method includes steps 110 to 130.

In step 110, a set number of human body posture pictures are acquired, and target objects in the human body posture pictures are marked to obtain picture samples.

A human body posture picture is a picture containing a person, where the person strikes a certain pose with the head, limbs or torso. The target objects include at least one of a head, limbs and a torso.

Exemplarily, human body posture pictures are downloaded from network platforms by a web crawler and classified, for example into sports, film and television, emoticon and other categories according to their sources. After a network picture containing a person is obtained from a network platform picture library by the web crawler, the head pixel coordinates, limb pixel coordinates and torso pixel coordinates of the person in the network picture are marked to obtain a first sample picture. The marking may be performed by roughly determining the head pixel coordinates and limb pixel coordinates of the person with a skin color recognition algorithm, and then determining the torso pixel coordinates from the head and limb pixel coordinates. Based on these coordinates, the head contour, limb contours and torso contour are highlighted, thereby marking the head, limb and torso pixel coordinates. For example, the head, limb and torso pixel coordinates may each be marked with a dashed box.
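As an aid to the rough marking step just described, the following is a minimal sketch of how candidate head and limb regions could be located by skin-color thresholding before the contours are highlighted. The use of OpenCV, the YCrCb threshold values and the function name are illustrative assumptions, not part of the application.

```python
# A minimal sketch, assuming OpenCV and an empirical YCrCb skin range;
# it returns bounding boxes of skin-colored regions as rough candidates
# for head/limb marking, to be refined during labelling.
import cv2

def rough_body_part_boxes(image_bgr, min_area=400):
    ycrcb = cv2.cvtColor(image_bgr, cv2.COLOR_BGR2YCrCb)
    mask = cv2.inRange(ycrcb, (0, 133, 77), (255, 173, 127))  # skin range
    contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL,
                                   cv2.CHAIN_APPROX_SIMPLE)
    # Discard tiny noise regions and keep candidate boxes (x, y, w, h).
    return [cv2.boundingRect(c) for c in contours
            if cv2.contourArea(c) >= min_area]
```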
In an embodiment, the head, limb and torso contours are sampled at intervals, and the sampling points are taken as marker points that represent head, limb or torso pixels; connecting the marker points in order marks the head, limb and torso pixel coordinates. The sampling interval can be set as needed.
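A tiny sketch of this interval sampling follows; the (N, 1, 2) contour layout is OpenCV's convention and the step size is an arbitrary assumption.

```python
import numpy as np

def sample_marker_points(contour, step=10):
    # contour: (N, 1, 2) array of outline points; keep every step-th
    # (x, y) point as a marker point and close the loop so the markers
    # can be joined in order to trace the outline.
    pts = contour[::step, 0, :]
    return np.concatenate([pts, pts[:1]], axis=0)
```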
The marked network picture is recorded as the first sample picture, and the first sample picture is stored in a picture sample set.

In an embodiment, user pictures in the album of a mobile terminal are acquired; the head pixel coordinates, limb pixel coordinates and torso pixel coordinates of the persons in the user pictures are marked to obtain second sample pictures, and the second sample pictures are stored in the picture sample set. Since the server needs the user's permission before acquiring user pictures from the album of the mobile terminal, when the server acquires user pictures from the album for the first time, it displays an inquiry in the form of a dialog box, asking whether to grant the server permission to access the album. The user's input is obtained, and if the user confirms, the server is granted access to the album. After acquiring the user pictures from the album, the server determines the head and limb pixel coordinates of the persons in the pictures with the similar method described above, then determines the torso pixel coordinates from the head and limb pixel coordinates, and highlights the head, limb and torso contours based on these coordinates, thereby marking the head, limb and torso pixel coordinates.

The marked user picture is recorded as the second sample picture and stored in the picture sample set, so that the picture sample set storing the first sample pictures and the second sample pictures serves as the picture samples for training the deep learning model.
In step 120, a preset deep learning model is trained with a set machine learning algorithm according to the picture samples, to obtain a posture recognition model.

The deep learning model may be a convolutional neural network model. The number of hidden layers and the numbers of nodes in the input layer, the hidden layers and the output layer may be preset, and first parameters of the convolutional neural network may be initialized, where the first parameters include the bias values of each layer and the weights of the edges; this preliminarily yields the framework of the neural network model.

The set machine learning algorithm includes a forward propagation algorithm and a backward propagation algorithm. Exemplarily, the preset deep learning model is trained with the picture sample set in two stages, forward propagation and backward propagation; when the error computed in the backward propagation training reaches an expected error value, the training ends and the posture recognition model is obtained. Concretely, the convolutional neural network model may be trained on the picture samples with the forward propagation algorithm and the backward propagation algorithm to learn second parameters for the network framework, where the second parameters are correction parameters computed by the backward propagation algorithm from the deviation between the actual output and the expected output of the picture samples, and the first parameters are updated with the second parameters. A model error is then computed, which may be determined from the deviation between the actual output and the expected output of the picture samples; when the model error reaches the expected error value, the training ends and the posture recognition model is obtained.
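The two-stage training described above could look roughly as follows in PyTorch; the loss function, optimizer, learning rate and stopping threshold are illustrative assumptions rather than values prescribed by the application.

```python
# A minimal sketch, assuming a PyTorch model and a loader yielding
# (image batch, expected output) pairs, of forward/backward training
# that stops once the error reaches the expected error value.
import torch
import torch.nn as nn

def train_until_expected_error(model, loader, expected_error=1e-3,
                               max_epochs=100):
    criterion = nn.MSELoss()   # deviation of actual output from expected
    optimizer = torch.optim.SGD(model.parameters(), lr=0.01)
    for _ in range(max_epochs):
        total = 0.0
        for images, expected in loader:
            preds = model(images)             # forward propagation
            loss = criterion(preds, expected)
            optimizer.zero_grad()
            loss.backward()                   # backward propagation
            optimizer.step()                  # update the first parameters
            total += loss.item()
        if total / len(loader) <= expected_error:
            break                             # training ends
    return model
```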
In step 130, the posture recognition model is sent to the mobile terminal.

A model download request sent by the mobile terminal is obtained, and the posture recognition model is ported to the mobile terminal according to the download request. It should be noted that, because the computing power of a server differs greatly from that of a mobile terminal, the posture recognition model also needs to be optimized before being ported to the mobile terminal. Exemplarily, the convolutional neural network model is optimized with a preset optimization strategy, where the optimization of the convolutional neural network model includes at least one of internal network structure optimization, optimization of the implementation of the convolutional layers, and optimization of the implementation of the pooling layers. For example, residual blocks may be added to build a residual neural network model, or the structure of the residual blocks may be adjusted. As another example, the implementation of a convolutional layer may be optimized by reducing the number of connections between output channels and input channels, i.e., an output channel is no longer related to all input channels but only to adjacent input channels. As another example, base layers may be added in the implementation of a convolutional layer, splitting the convolution into two steps: first, each input channel is computed separately, and under convolution kernels of the same size each channel produces an intermediate result, each channel of which is called a base layer; then the channels are merged to obtain the output of the convolutional layer (a sketch of this two-step convolution follows). As yet another example, the matrix used for image compression in a pooling layer may be designed according to the required image compression coefficient.
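The "base layer" two-step convolution described above corresponds to what is commonly called a depthwise-separable convolution. A hedged PyTorch sketch, with arbitrary channel counts and kernel size:

```python
import torch.nn as nn

class TwoStepConv(nn.Module):
    """Step 1: convolve each input channel separately (the 'base layers');
    step 2: merge the per-channel results into the output channels."""
    def __init__(self, in_ch, out_ch, k=3):
        super().__init__()
        self.depthwise = nn.Conv2d(in_ch, in_ch, k, padding=k // 2,
                                   groups=in_ch)     # one filter per channel
        self.pointwise = nn.Conv2d(in_ch, out_ch, 1)  # channel merging

    def forward(self, x):
        return self.pointwise(self.depthwise(x))
```

This layout cuts the number of weights roughly from in_ch * out_ch * k * k to in_ch * k * k + in_ch * out_ch, which is why it suits porting to a mobile terminal.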
In the technical solution of this embodiment, a set number of human body posture pictures are acquired and the target objects in the pictures are marked to obtain picture samples; a preset deep learning model is trained with a set machine learning algorithm according to the picture samples to obtain a posture recognition model; and the posture recognition model is sent to a mobile terminal, so that the mobile terminal can recognize human body postures in photos taken by its camera. By taking human body posture pictures with various postures as training samples and training the deep learning model with a machine learning algorithm, the resulting posture recognition model is able to recognize human body postures. Because the posture recognition model learns effective posture features (covering viewing angle, occlusion, etc.) from a large number of picture samples containing different shooting angles, different camera-to-person distances and various degrees of occlusion of the persons themselves, recognizing human body postures in pictures with it is robust to occlusion and viewing-angle changes, avoids the inaccurate or erroneous recognition of related image recognition technologies, and improves the posture recognition accuracy.
FIG. 2 is a flowchart of a photographing method according to an embodiment of the present application. The method may be executed by a photographing apparatus, which may be implemented by at least one of software and hardware and may generally be integrated in a mobile terminal, such as a mobile terminal with a camera. As shown in FIG. 2, the method includes steps 210 to 250.

In step 210, a first preview image of an object to be photographed is acquired.

The first preview image may include the picture captured by the camera and displayed in the shooting interface of the mobile terminal before the user presses the shutter button. In this embodiment, the first preview image may be an image of a person.

In an embodiment, the acquisition of the first preview image may be performed by the system of the mobile terminal or by any application with a shooting function on the mobile terminal, under the user's operation instruction. For example, the user may directly open the camera function of the mobile terminal system to photograph a person, or may photograph a person with the photographing option of an application.

Exemplarily, the first preview image may be acquired as follows: the optical image of the person serving as the shooting target is projected through the lens of the camera onto a photosensitive chip, which converts the optical image signal into an electrical signal; after a series of set conversions and processing, the first preview image is obtained and sent through a dedicated interface, such as the Mobile Industry Processor Interface (MIPI), to the Image Signal Processor (ISP) on the mainboard of the mobile terminal for processing, and is finally converted into a format displayable on the screen of the mobile terminal and displayed on its display.
In step 220, a human body posture in the first preview image is recognized by a pre-configured posture recognition model.

The posture recognition model is a deep learning model trained with a set number of picture samples, and the picture samples are determined according to human body posture pictures containing target objects. The posture recognition model may be configured to recognize the human body posture in the first preview image quickly and accurately after the first preview image is input. The posture recognition model may be a convolutional neural network model; the embodiments of the present application do not limit network parameters such as the number of layers, the number of neurons, the convolution kernels or the weights of the neural network model. Exemplarily, the posture recognition model may be the convolutional neural network model obtained in the embodiments of the present application by training a preset deep learning model with a set machine learning algorithm based on a set number of human body posture pictures.

The posture recognition model may be constructed, trained and optimized on a server, then ported by the server to the mobile terminal and configured there. In an embodiment, if the processing capability of the mobile terminal allows, model construction, training and optimization may also be performed on the mobile terminal.

Exemplarily, the first preview image is input into the pre-configured posture recognition model, and the human body posture of the person contained in the first preview image is recognized by the posture recognition model. Because the posture recognition model learns effective features from a large number of picture samples (containing different shooting angles, different camera-to-person distances and various degrees of occlusion), recognizing human body postures in pictures with it is robust to occlusion and viewing-angle changes.
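For illustration, running the ported model on a preview frame might look like the sketch below; the input size, the label set and the assumption that the model is a classifier over whole-body postures are all hypothetical.

```python
import torch
import torch.nn.functional as F

POSTURE_LABELS = ["running", "shooting_basketball", "kicking", "other"]

def recognize_posture(model, frame):
    # frame: (3, H, W) float tensor already normalised to [0, 1]
    x = F.interpolate(frame.unsqueeze(0), size=(224, 224))
    with torch.no_grad():            # inference only on the terminal
        logits = model(x)
    return POSTURE_LABELS[int(logits.argmax(dim=1))]
```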
In step 230, a virtual object is acquired.

The virtual object is used to provide an augmented reality effect for the object to be photographed, and includes physical images (for example, images of real objects such as a basketball, a football, the sun, the moon or an actor), film and television character images (for example, images of characters such as Kung Fu Panda, the Smurfs or Superman), or special effects (for example, smoke effects, steam effects and motion trajectory effects). These virtual objects may be stored in a virtual object library.

A virtual object library storing various virtual objects is preset in the mobile terminal. It can be understood that the virtual objects may be obtained from network platforms or designed by the terminal manufacturer.

If it is detected that a human body posture has been recognized, the virtual objects in the virtual object library are displayed for the user to select the virtual object to be added to the first preview image; at this time, the virtual objects are displayed in a default order. In an embodiment, virtual objects associated with the determined human body posture may also be displayed. For example, if the user's posture is detected to be a running pose, the acquired virtual objects include a track, a finishing line, ribbons and the like; if the user's posture is detected to be playing basketball, the acquired virtual objects include a basketball court, a basketball, a hoop and the like.
In step 240, an adding position of the virtual object is determined according to the human body posture, and the virtual object is added at the adding position to form a second preview image.

The adding position of a virtual object may be specified in advance, i.e., correspondences between adding positions of virtual objects and human body postures are stored in association in a database. For example, for a basketball-shooting pose, a basketball is added at the hand; for a kicking pose, a football is added at the foot making the kicking motion; for a blowing-out-birthday-candles pose, a birthday greeting is added directly above the birthday cake; and so on. The preset correspondences between adding positions of virtual objects and human body postures are stored in a preset database.

After the human body posture is recognized, the preset database is queried according to the human body posture and the selected virtual object, and the positional relationship between the virtual object and the human body posture is determined from the query result. For example, if the human body posture is running and the virtual object is a track, the result of querying the preset database determines that a track is added under the feet of the person in the first preview image. FIG. 3 shows a schematic diagram of adding a virtual object according to a human body posture. As shown in FIG. 3, a first preview image 310 is acquired; when the human body posture in the first preview image 310 is recognized as running, a virtual object selection window 320 is displayed, which includes virtual object pictures 321 and option boxes 322. If it is detected that the user selects the option box 322 corresponding to the track, the pixels in a set area under the feet of the person in the first preview image 310 are replaced with the corresponding pixels of the virtual object, forming a second preview image 330.
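A minimal sketch of the preset-database query follows: a table keyed by (posture, virtual object) pairs that returns an anchor describing where to add the object. The table contents and anchor names are illustrative assumptions mirroring the examples above.

```python
# Hypothetical correspondence table between postures, virtual objects
# and adding positions, mirroring the examples in the description.
ADD_POSITION_DB = {
    ("shooting_basketball", "basketball"): "hand",
    ("kicking", "football"): "kicking_foot",
    ("running", "track"): "under_feet",
    ("blowing_candles", "greeting"): "above_cake",
}

def query_add_position(posture, virtual_object):
    # Returns None when no correspondence is stored, in which case the
    # user could be asked to place the object manually.
    return ADD_POSITION_DB.get((posture, virtual_object))
```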
In step 250, a shooting instruction is acquired, and a photograph corresponding to the second preview image is obtained in response to the shooting instruction.

The shooting instruction may be an operation instruction triggered by the user pressing the shutter button, an operation instruction triggered by a photographing voice command input by the user, an operation instruction triggered by a photographing gesture of the user, and so on.

When the shooting instruction is detected, the second preview image is stored in response to the shooting instruction, and the photograph corresponding to the second preview image is obtained. Exemplarily, when it is detected that the shutter button is pressed, the second preview image is saved to obtain the photograph, and the photograph is stored in the album of the mobile terminal.

In the technical solution of this embodiment, a first preview image of an object to be photographed is acquired; a human body posture in the first preview image is recognized by a pre-configured posture recognition model; a virtual object is acquired; an adding position of the virtual object is determined according to the human body posture, and the virtual object is added at the adding position to form a second preview image; a shooting instruction is acquired, and a photograph corresponding to the second preview image is obtained in response to the shooting instruction. Recognizing the human body posture in the first preview image with the posture recognition model avoids the influence of shooting angle, occlusion and the like on the posture recognition accuracy and improves it; determining the adding position of the virtual object according to the human body posture combines the augmented reality function with human body posture detection, so that virtual objects are added precisely at accurate positions and the display effect of the photographs is improved.
FIG. 4 is a flowchart of another photographing method according to an embodiment of the present application. As shown in FIG. 4, the method includes steps 401 to 410.

In step 401, a first preview image of an object to be photographed is acquired.

In step 402, it is determined whether the first preview image contains pixels corresponding to skin; if the first preview image contains pixels corresponding to skin, step 404 is executed; if the first preview image does not contain pixels corresponding to skin, step 403 is executed.

There are many methods of detecting skin pixels in the first preview image, and the embodiments of the present application do not specifically limit them. For example, the first preview image may be processed to obtain its histogram, and whether the first preview image contains pixels corresponding to skin is determined from the gray-value distribution of the pixels in the histogram. As another example, a region model may be built in MATLAB that uses the value range of skin color in the color space and marks regions satisfying certain conditions as skin regions. Using this model mainly involves two steps: first, the specific range of skin color is determined statistically; second, the model decides whether a new pixel or region is skin. Thus, in a picture, a pixel or region whose values fall within the value range of the skin color region is judged to be a skin region, and a pixel or region whose values do not is judged to be a non-skin region.
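A hedged sketch of this screening step: decide whether a preview frame contains any skin pixels before it is handed to the posture recognition model. The color space, the threshold values and the noise ratio are assumptions.

```python
import cv2

def contains_skin_pixels(image_bgr, min_ratio=0.005):
    ycrcb = cv2.cvtColor(image_bgr, cv2.COLOR_BGR2YCrCb)
    mask = cv2.inRange(ycrcb, (0, 133, 77), (255, 173, 127))
    # Treat a tiny fraction of matching pixels as noise, not skin.
    return (mask > 0).mean() >= min_ratio
```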
In step 403, prompt information indicating that the adding position of the virtual object cannot be determined is output.

When the first preview image contains no pixels corresponding to skin, prompt information is displayed in the form of a dialog box to inform the user that the adding position of the virtual object cannot be determined, and the user may choose whether to add a virtual object anyway. If the user chooses to continue adding, a function for specifying the adding position is provided. For example, a first layer may be drawn which is configured to display the virtual object to be added and which is transparent everywhere except at the pixel coordinates corresponding to the virtual object. When the user chooses to add the virtual object, the first layer is displayed on top of a second layer, and the user's operation on the first layer is obtained; for example, the user may drag the first layer to the desired position. After the user-specified adding position of the virtual object is detected, the first layer and the second layer are composited, so that the virtual object is displayed at the user-specified adding position in the first preview image. It can be understood that the size of a layer is not limited by the screen size of the mobile terminal, i.e., a layer may be larger than, equal to or smaller than the screen.

In step 404, the first preview image is input into the posture recognition model.

The posture recognition model may be the deep learning model described above, trained with a set number of picture samples.

When the first preview image contains pixels corresponding to skin, the first preview image is recognized by the posture recognition model, and the human body posture contained in the first preview image is determined.
In step 405, state information of the augmented reality function is acquired in the shooting scene.

The shooting scene means that the current interface is a shooting interface, including but not limited to an interface that captures images of persons through the camera.

An augmented reality function is added to the camera application of the mobile terminal, i.e., when the user shoots through the camera application, the user can choose whether to enable the augmented reality function. In an embodiment, an augmented reality option is added to the shooting interface, and selecting the option enables the augmented reality function. If it is not detected that the option is selected, it is determined that the augmented reality function is not enabled.

In step 406, when the augmented reality function is enabled, the virtual objects in the virtual object library are arranged in descending order of usage frequency, and the sorting result is displayed.

The user's historical behavior under the augmented reality function may be obtained and analyzed to determine the usage frequency of the virtual objects, and the virtual objects are arranged in descending order of usage frequency, i.e., the most frequently used virtual object comes first. Display priorities are configured for the virtual objects according to the sorting result; after the augmented reality function is enabled, if a human body posture recognition result is detected, the virtual object library is called, and the virtual objects in the library are displayed according to their display priorities for the user to choose from.
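Ordering the library by historical usage could be as simple as the sketch below; the history format is an assumption.

```python
from collections import Counter

def rank_virtual_objects(library, usage_history):
    # usage_history: e.g. ["track", "basketball", "track", ...]
    freq = Counter(usage_history)
    # Most frequently used objects first; unused objects keep count 0
    # and therefore fall to the end of the list.
    return sorted(library, key=lambda obj: freq[obj], reverse=True)
```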
In step 407, a virtual object is acquired.

The virtual object selected by the user is acquired. The ways the user selects a virtual object include, but are not limited to, tapping the icon of the virtual object, selecting the virtual object by voice input, or selecting the virtual object by a gesture such as shaking the terminal a preset number of times.

In step 408, a preset database is queried according to the human body posture and the virtual object, and the adding position of the virtual object is determined.

Human body postures and adding positions of virtual objects are stored in association in the preset database. For example, querying the preset database may determine that for a kicking posture a football is added at the foot.
In step 409, the pixel values at the adding position are replaced with the pixel values of the virtual object, forming a second preview image.

The outline of the virtual object is obtained; with this outline as the boundary, an area coinciding with the boundary is selected at the adding position of the virtual object determined from the human body posture, and the pixels in this area are replaced with the pixels of the virtual object, i.e., the pixel values of the pixels in the area are correspondingly replaced with the pixel values of the virtual object, forming the second preview image. The second preview image is then converted into a format displayable on the screen of the mobile terminal and displayed on its display.
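The pixel replacement itself might be implemented as a masked copy, as in the sketch below; the RGBA object format and the assumption that the object fits inside the frame are illustrative.

```python
import numpy as np

def replace_pixels(preview, obj_rgba, top_left):
    # preview: (H, W, 3) uint8 frame; obj_rgba: (h, w, 4) virtual object.
    # Assumes the object lies fully inside the frame at top_left.
    x, y = top_left
    h, w = obj_rgba.shape[:2]
    mask = obj_rgba[..., 3] > 0              # inside the object's outline
    region = preview[y:y + h, x:x + w]       # view into the frame
    region[mask] = obj_rgba[..., :3][mask]   # replace the pixel values
    return preview                           # the second preview image
```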
In step 410, a shooting instruction is acquired, and a photograph corresponding to the second preview image is obtained in response to the shooting instruction.

In the technical solution of this embodiment, after the first preview image of the object to be photographed is acquired, it is determined whether the first preview image contains pixels corresponding to skin; the first preview image is input into the posture recognition model only if it contains pixels corresponding to skin, which avoids inputting first preview images without persons into the posture recognition model and thus reduces the data processing load on the GPU. In addition, after a human body posture recognition result is detected, the virtual objects are displayed in descending order of usage frequency, recommending the user's commonly used virtual objects and optimizing the human-computer interaction.
FIG. 5 is a structural block diagram of a model construction apparatus according to an embodiment of the present application. The apparatus may be implemented by at least one of software and hardware and may generally be integrated in a terminal; the terminal may be a server configured to construct a posture recognition model by executing the model construction method. As shown in FIG. 5, the apparatus includes a sample determination module 510, a model training module 520 and a model sending module 530.

The sample determination module 510 is configured to acquire a set number of human body posture pictures and mark target objects in the human body posture pictures to obtain picture samples, where the target objects include at least one of a head, limbs and a torso.

The model training module 520 is configured to train a preset deep learning model with a set machine learning algorithm according to the picture samples, to obtain a posture recognition model.

The model sending module 530 is configured to send the posture recognition model to a mobile terminal.

The technical solution of this embodiment provides a model construction apparatus that takes human body posture pictures with various postures as training samples and trains a deep learning model with a machine learning algorithm to obtain a posture recognition model capable of recognizing human body postures. Because the posture recognition model learns effective posture features (covering viewing angle, occlusion, etc.) from a large number of picture samples containing different shooting angles, different camera-to-person distances and various degrees of occlusion of the persons themselves, recognizing human body postures in pictures with it is robust to occlusion and viewing-angle changes, avoids the inaccurate or erroneous recognition of related image recognition technologies, and improves the posture recognition accuracy.

In an embodiment, the sample determination module 510 is configured to: acquire network pictures from a network platform picture library; mark the head pixel coordinates, limb pixel coordinates and torso pixel coordinates of the persons in the network pictures to obtain first sample pictures; and store the first sample pictures in a picture sample set.

In an embodiment, the sample determination module 510 is configured to: acquire user pictures from the album of a mobile terminal; mark the head pixel coordinates, limb pixel coordinates and torso pixel coordinates of the persons in the user pictures to obtain second sample pictures; and store the second sample pictures in the picture sample set.

In an embodiment, the model training module 520 is configured to: train the preset deep learning model with the picture sample set in two stages, forward propagation and backward propagation; and when the error computed in the backward propagation training reaches an expected error value, end the training and obtain the posture recognition model.

In an embodiment, the posture recognition model is a convolutional neural network model, and the model construction apparatus further includes a model optimization module configured to optimize the convolutional neural network model with a preset optimization strategy before the posture recognition model is sent to the mobile terminal, where the optimization of the convolutional neural network model includes at least one of internal network structure optimization, optimization of the implementation of the convolutional layers, and optimization of the implementation of the pooling layers.
An embodiment of the present application further provides a storage medium containing computer-executable instructions which, when executed by a computer processor, are configured to perform the model construction method provided by the embodiments of the present application, the method including: acquiring a set number of human body posture pictures and marking target objects in the human body posture pictures to obtain picture samples, where the target objects include at least one of a head, limbs and a torso; training a preset deep learning model with a set machine learning algorithm according to the picture samples to obtain a posture recognition model; and sending the posture recognition model to a mobile terminal.

Storage medium: any of various types of memory devices or storage devices. The term "storage medium" is intended to include: mounting media, such as Compact Disc Read-Only Memory (CD-ROM), floppy disks or tape devices; computer system memory or random access memory, such as Dynamic Random Access Memory (DRAM), Double Data Rate Random Access Memory (DDR RAM), Static Random Access Memory (SRAM), Extended Data Output Random Access Memory (EDO RAM), Rambus RAM, etc.; non-volatile memory, such as flash memory and magnetic media (e.g., hard disks or optical storage); registers or other similar types of memory elements, etc. The storage medium may also include other types of memory or combinations thereof. In addition, the storage medium may be located in the first computer system in which the program is executed, or may be located in a different second computer system connected to the first computer system through a network such as the Internet; the second computer system may provide the program instructions to the first computer for execution. The term "storage medium" may include two or more storage media that may reside in different locations, for example in different computer systems connected through a network. The storage medium may store program instructions (for example, embodied as a computer program) executable by at least one processor.

Of course, in the storage medium containing computer-executable instructions provided by the embodiments of the present application, the computer-executable instructions are not limited to the model construction operations described above and may also perform related operations in the method for constructing a posture recognition model provided by any embodiment of the present application.
An embodiment of the present application further provides a terminal, which may be a server or another electronic device with strong computing power, in which a model construction apparatus is integrated and which is configured to construct a posture recognition model by executing the model construction method. FIG. 6 is a structural block diagram of a server according to an embodiment of the present application. As shown in FIG. 6, the terminal 600 includes a memory 610 and a processor 620, where the memory 610 is configured to store executable program code and picture samples, and the processor 620 runs a computer program corresponding to the executable program code by reading the executable program code stored in the memory 610, to implement the following steps: acquiring a set number of human body posture pictures and marking target objects in the human body posture pictures to obtain picture samples, where the target objects include at least one of a head, limbs and a torso; training a preset deep learning model with a set machine learning algorithm according to the picture samples to obtain a posture recognition model; and sending the posture recognition model to a mobile terminal.

It can be understood that the structure of the above terminal is merely an example. The terminal may include, but is not limited to, the memory and processor described in the above example, and may further include a peripheral interface, a power management chip, an input/output (I/O) subsystem, other input/control devices and an external port, these components communicating via at least one communication bus or signal line.

The model construction apparatus, storage medium and server provided in the above embodiments can execute the corresponding model construction method provided by the embodiments of the present application, and have the corresponding functional modules and beneficial effects for performing the method. For technical details not exhaustively described in the above embodiments, refer to the model construction method provided by any embodiment of the present application.
An embodiment of the present application further provides a photographing apparatus, which may be implemented by at least one of software and hardware and may generally be integrated in a mobile terminal, such as a mobile terminal with a camera. FIG. 7 is a structural block diagram of a photographing apparatus according to an embodiment of the present application. As shown in FIG. 7, the apparatus includes an image acquisition module 710, a posture recognition module 720, an object acquisition module 730, an object adding module 740 and a shooting module 750.

The image acquisition module 710 is configured to acquire a first preview image of an object to be photographed.

The posture recognition module 720 is configured to recognize a human body posture in the first preview image by a pre-configured posture recognition model, where the posture recognition model is a deep learning model trained with a set number of picture samples, and the picture samples are determined according to human body posture pictures containing target objects. In an embodiment, the picture samples contain human body postures, and the deep learning model is a convolutional neural network model.

The object acquisition module 730 is configured to acquire a virtual object, where the virtual object is used to provide an augmented reality effect for the object to be photographed.

The object adding module 740 is configured to determine an adding position of the virtual object according to the human body posture and add the virtual object at the adding position to form a second preview image.

The shooting module 750 is configured to acquire a shooting instruction and obtain, in response to the shooting instruction, a photograph corresponding to the second preview image.

The technical solution of this embodiment provides a photographing apparatus that recognizes the human body posture in the first preview image with a posture recognition model, avoiding the influence of shooting angle, occlusion and the like on the posture recognition accuracy and improving it; and that determines the adding position of the virtual object according to the human body posture, combining the augmented reality function with human body posture detection, so that virtual objects are added precisely at accurate positions and the display effect of the photographs is improved.
In an embodiment, the photographing apparatus further includes a judgment module configured to: after the object acquisition module acquires the first preview image of the object to be photographed, determine whether the first preview image contains pixels corresponding to skin; if the first preview image contains pixels corresponding to skin, input the first preview image into the posture recognition model; and if the first preview image does not contain pixels corresponding to skin, output prompt information indicating that the adding position of the virtual object cannot be determined.

In an embodiment, the apparatus further includes: a state information acquisition module configured to acquire, in the shooting scene, state information of the augmented reality function before the object acquisition module acquires the virtual object;

and a sorting module configured to, when the augmented reality function is enabled, arrange the virtual objects in the virtual object library in descending order of usage frequency and display the sorting result.

In an embodiment, the object adding module 740 is configured to: query a preset database according to the human body posture and the virtual object to determine the adding position of the virtual object, where human body postures and adding positions of virtual objects are stored in association in the preset database;

and replace the pixel values at the adding position with the pixel values of the virtual object.
An embodiment of the present application further provides a storage medium containing computer-executable instructions which, when executed by a computer processor, are configured to perform the photographing method provided by the embodiments of the present application, the method including: acquiring a first preview image of an object to be photographed; recognizing a human body posture in the first preview image by a pre-configured posture recognition model, where the posture recognition model is a deep learning model trained with a set number of picture samples; acquiring a virtual object, where the virtual object is used to provide an augmented reality effect for the object to be photographed; determining an adding position of the virtual object according to the human body posture and adding the virtual object at the adding position to form a second preview image; and acquiring a shooting instruction and obtaining, in response to the shooting instruction, a photograph corresponding to the second preview image.

Storage medium: any of various types of memory devices or storage devices. The term "storage medium" is intended to include: mounting media, such as CD-ROM, floppy disks or tape devices; computer system memory or random access memory, such as DRAM, DDR RAM, SRAM, EDO RAM, Rambus RAM, etc.; non-volatile memory, such as flash memory and magnetic media (e.g., hard disks or optical storage); registers or other similar types of memory elements, etc. The storage medium may also include other types of memory or combinations thereof. In addition, the storage medium may be located in the first computer system in which the program is executed, or may be located in a different second computer system connected to the first computer system through a network such as the Internet; the second computer system may provide the program instructions to the first computer for execution. The term "storage medium" may include two or more storage media that may reside in different locations, for example in different computer systems connected through a network. The storage medium may store program instructions (for example, embodied as a computer program) executable by at least one processor.

Of course, in the storage medium containing computer-executable instructions provided by the embodiments of the present application, the computer-executable instructions are not limited to the photographing operations described above and may also perform related operations in the photographing method combining human body postures with augmented reality technology provided by any embodiment of the present application.
An embodiment of the present application further provides another terminal in which the photographing apparatus described in the above embodiments is integrated and which can perform the operation of adding virtual objects based on human body postures. Exemplarily, the terminal may be a mobile terminal. FIG. 8 is a structural block diagram of a mobile terminal according to an embodiment of the present application. As shown in FIG. 8, the mobile terminal may include: a housing (not shown), a memory 801, a Central Processing Unit (CPU) 802 (also called a processor), a circuit board (not shown) and a power supply circuit (not shown). The circuit board is arranged inside the space enclosed by the housing; the CPU 802 and the memory 801 are arranged on the circuit board; the power supply circuit is configured to supply power to each circuit or device of the terminal; the memory 801 is configured to store executable program code, a preset database of adding positions of virtual objects, and the like; the CPU 802 runs a computer program corresponding to the executable program code by reading the executable program code stored in the memory 801, to implement the following steps: acquiring a first preview image of an object to be photographed; recognizing a human body posture in the first preview image by a pre-configured posture recognition model, where the posture recognition model is a deep learning model trained with a set number of picture samples and the picture samples are determined according to human body posture pictures containing target objects; acquiring a virtual object, where the virtual object is used to provide an augmented reality effect for the object to be photographed; determining an adding position of the virtual object according to the human body posture and adding the virtual object at the adding position to form a second preview image; and acquiring a shooting instruction and obtaining, in response to the shooting instruction, a photograph corresponding to the second preview image.

The terminal further includes: a peripheral interface 803, a Radio Frequency (RF) circuit 805, an audio circuit 806, a speaker 811, a power management chip 808, an input/output (I/O) subsystem 809, other input/control devices 810, a touch screen 812 and an external port 804; these components communicate via at least one communication bus or signal line 807.

It should be understood that the illustrated mobile terminal 800 is merely one example of a terminal; the mobile terminal 800 may have more or fewer components than shown in the figure, may combine two or more components, or may have a different component configuration. The various components shown in the figure may be implemented in hardware, software, or a combination of hardware and software, including at least one of signal processing and application-specific integrated circuits.

The mobile terminal 800 provided in this embodiment is described in detail below, taking a mobile phone as an example.

Memory 801: the memory 801 can be accessed by the CPU 802, the peripheral interface 803 and so on; the memory 801 may include high-speed random access memory and may also include non-volatile memory, such as at least one magnetic disk storage device, a flash memory device, or other non-volatile solid-state storage devices.

Peripheral interface 803: the peripheral interface 803 can connect the input and output peripherals of the device to the CPU 802 and the memory 801.

I/O subsystem 809: the I/O subsystem 809 can connect input and output peripherals on the device, such as the touch screen 812 and other input/control devices 810, to the peripheral interface 803. The I/O subsystem 809 may include a display controller 8081 and at least one input controller 8092 configured to control other input/control devices 810, where the at least one input controller 8092 receives electrical signals from, or sends electrical signals to, other input/control devices 810, and the other input/control devices 810 may include physical buttons (push buttons, rocker buttons, etc.), dial pads, slide switches, joysticks and click wheels. It is worth noting that the input controller 8092 may be connected to any of the following: a keyboard, an infrared port, a USB interface, or a pointing device such as a mouse.

Touch screen 812: the touch screen 812 is the input and output interface between the user terminal and the user, and displays visual output to the user; the visual output may include graphics, text, icons, video and the like.

The display controller 8081 in the I/O subsystem 809 receives electrical signals from the touch screen 812 or sends electrical signals to the touch screen 812. The touch screen 812 detects contact on the touch screen, and the display controller 8081 converts the detected contact into interaction with the user interface objects displayed on the touch screen 812, i.e., realizes human-computer interaction; the user interface objects displayed on the touch screen 812 may be icons for running games, icons for connecting to corresponding networks, and the like. It is worth noting that the device may also include a light mouse, which is a touch-sensitive surface that does not display visual output, or an extension of the touch-sensitive surface formed by the touch screen.

RF circuit 805: mainly configured to establish communication between the mobile phone and the wireless network (i.e., the network side), and to receive and send data between the mobile phone and the wireless network, for example sending and receiving short messages, e-mails and the like. In an embodiment, the RF circuit 805 receives and sends RF signals, also called electromagnetic signals; the RF circuit 805 converts electrical signals into electromagnetic signals or electromagnetic signals into electrical signals, and communicates with communication networks and other devices through the electromagnetic signals. The RF circuit 805 may include known circuits configured to perform these functions, including but not limited to an antenna system, an RF transceiver, at least one amplifier, a tuner, at least one oscillator, a digital signal processor, a COder-DECoder (CODEC) chipset, a Subscriber Identity Module (SIM) and so on.

Audio circuit 806: mainly configured to receive audio data from the peripheral interface 803, convert the audio data into an electrical signal, and send the electrical signal to the speaker 811.

Speaker 811: configured to restore the voice signal, received by the mobile phone from the wireless network through the RF circuit 805, to sound and play the sound to the user.

Power management chip 808: configured to supply power to, and manage the power of, the hardware connected to the CPU 802, the I/O subsystem and the peripheral interface.

The photographing apparatus, storage medium and terminal provided in the above embodiments can execute the corresponding photographing method provided by the embodiments of the present application, and have the corresponding functional modules and beneficial effects for performing the method. For technical details not exhaustively described in the above embodiments, refer to the photographing method provided by any embodiment of the present application.
Claims (20)
- A model construction method, comprising: acquiring a set number of human body posture pictures and marking target objects in the human body posture pictures to obtain picture samples, wherein the target objects comprise at least one of a head, limbs and a torso; training a preset deep learning model with a set machine learning algorithm according to the picture samples to obtain a posture recognition model; and sending the posture recognition model to a mobile terminal.
- The method according to claim 1, wherein acquiring a set number of human body posture pictures and marking target objects in the human body posture pictures comprises: acquiring network pictures from a network platform picture library; marking head pixel coordinates, limb pixel coordinates and torso pixel coordinates of persons in the network pictures to obtain first sample pictures; and storing the first sample pictures in a picture sample set.
- The method according to claim 1, wherein acquiring a set number of human body posture pictures and marking target objects in the human body posture pictures comprises: acquiring user pictures from an album of a mobile terminal; marking head pixel coordinates, limb pixel coordinates and torso pixel coordinates of persons in the user pictures to obtain second sample pictures; and storing the second sample pictures in a picture sample set.
- The method according to claim 2 or 3, wherein training a preset deep learning model with a set machine learning algorithm according to the picture samples to obtain a posture recognition model comprises: training the preset deep learning model with the picture sample set in two stages, forward propagation and backward propagation; and when the error computed in the backward propagation training reaches an expected error value, ending the training and obtaining the posture recognition model.
- The method according to claim 4, wherein the posture recognition model is a convolutional neural network model; and before sending the posture recognition model to the mobile terminal, the method further comprises: optimizing the convolutional neural network model with a preset optimization strategy, wherein the optimization of the convolutional neural network model comprises at least one of internal network structure optimization, optimization of the implementation of convolutional layers, and optimization of the implementation of pooling layers.
- A photographing method, comprising: acquiring a first preview image of an object to be photographed; recognizing a human body posture in the first preview image by a pre-configured posture recognition model, wherein the posture recognition model is a deep learning model trained with a set number of picture samples, and the picture samples are determined according to human body posture pictures containing target objects; acquiring a virtual object, wherein the virtual object is used to provide an augmented reality effect for the object to be photographed; determining an adding position of the virtual object according to the human body posture and adding the virtual object at the adding position to form a second preview image; and acquiring a shooting instruction and obtaining, in response to the shooting instruction, a photograph corresponding to the second preview image.
- The method according to claim 6, after acquiring the first preview image of the object to be photographed, further comprising: determining whether the first preview image contains pixels corresponding to skin; in response to determining that the first preview image contains pixels corresponding to skin, inputting the first preview image into the posture recognition model; and in response to determining that the first preview image does not contain pixels corresponding to skin, outputting prompt information indicating that the adding position of the virtual object cannot be determined.
- The method according to claim 6, before acquiring the virtual object, further comprising: acquiring, in a shooting scene, state information of an augmented reality function; and when the augmented reality function is enabled, arranging virtual objects in a virtual object library in descending order of usage frequency and displaying the sorting result.
- The method according to any one of claims 6 to 8, wherein determining an adding position of the virtual object according to the human body posture and adding the virtual object at the adding position comprises: querying a preset database according to the human body posture and the virtual object to determine the adding position of the virtual object, wherein human body postures and adding positions of virtual objects are stored in association in the preset database; and replacing pixel values at the adding position with pixel values of the virtual object.
- A model construction apparatus, comprising: a sample determination module configured to acquire a set number of human body posture pictures and mark target objects in the human body posture pictures to obtain picture samples, wherein the target objects comprise at least one of a head, limbs and a torso; a model training module configured to train a preset deep learning model with a set machine learning algorithm according to the picture samples to obtain a posture recognition model; and a model sending module configured to send the posture recognition model to a mobile terminal.
- The apparatus according to claim 10, wherein the sample determination module is configured to: acquire network pictures from a network platform picture library, mark head pixel coordinates, limb pixel coordinates and torso pixel coordinates of persons in the network pictures to obtain first sample pictures, and store the first sample pictures in a picture sample set; or acquire user pictures from an album of a mobile terminal, mark head pixel coordinates, limb pixel coordinates and torso pixel coordinates of persons in the user pictures to obtain second sample pictures, and store the second sample pictures in a picture sample set.
- The apparatus according to claim 11, wherein the model training module is configured to train the preset deep learning model with the picture sample set in two stages, forward propagation and backward propagation; and when the error computed in the backward propagation training reaches an expected error value, end the training and obtain the posture recognition model.
- The apparatus according to claim 12, further comprising: a model optimization module configured to optimize the convolutional neural network model with a preset optimization strategy before the posture recognition model is sent to the mobile terminal, wherein the optimization of the convolutional neural network model comprises at least one of internal network structure optimization, optimization of the implementation of convolutional layers, and optimization of the implementation of pooling layers.
- A photographing apparatus, comprising: an image acquisition module configured to acquire a first preview image of an object to be photographed; a posture recognition module configured to recognize a human body posture in the first preview image by a pre-configured posture recognition model, wherein the posture recognition model is a deep learning model trained with a set number of picture samples, and the picture samples are determined according to human body posture pictures containing target objects; an object acquisition module configured to acquire a virtual object, wherein the virtual object is used to provide an augmented reality effect for the object to be photographed; an object adding module configured to determine an adding position of the virtual object according to the human body posture and add the virtual object at the adding position to form a second preview image; and a shooting module configured to acquire a shooting instruction and obtain, in response to the shooting instruction, a photograph corresponding to the second preview image.
- The apparatus according to claim 14, further comprising: a judgment module configured to determine, after the object acquisition module acquires the first preview image of the object to be photographed, whether the first preview image contains pixels corresponding to skin; if the first preview image contains pixels corresponding to skin, input the first preview image into the posture recognition model; and if the first preview image does not contain pixels corresponding to skin, output prompt information indicating that the adding position of the virtual object cannot be determined.
- The apparatus according to claim 14, further comprising: a state information acquisition module configured to acquire, in a shooting scene, state information of an augmented reality function before the object acquisition module acquires the virtual object; and a sorting module configured to, when the augmented reality function is enabled, arrange virtual objects in a virtual object library in descending order of usage frequency and display the sorting result.
- A computer-readable storage medium storing a computer program which, when executed by a processor, implements the model construction method according to any one of claims 1 to 5.
- A computer-readable storage medium storing a computer program which, when executed by a processor, implements the photographing method according to any one of claims 6 to 9.
- A terminal, comprising a memory, a processor, and a computer program stored on the memory and executable on the processor, wherein the processor, when executing the computer program, implements the model construction method according to any one of claims 1 to 5.
- A terminal, comprising a memory, a processor, and a computer program stored on the memory and executable on the processor, wherein the processor, when executing the computer program, implements the photographing method according to any one of claims 6 to 9.
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201711392033.5A CN109951628A (zh) | 2017-12-21 | 2017-12-21 | 模型构建方法、拍照方法、装置、存储介质及终端 |
CN201711392033.5 | 2017-12-21 |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2019120032A1 true WO2019120032A1 (zh) | 2019-06-27 |
Family
ID=66992511
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/CN2018/116800 WO2019120032A1 (zh) | 2017-12-21 | 2018-11-21 | 模型构建方法、拍照方法、装置、存储介质及终端 |
Country Status (2)
Country | Link |
---|---|
CN (1) | CN109951628A (zh) |
WO (1) | WO2019120032A1 (zh) |
Cited By (23)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110543578A (zh) * | 2019-08-09 | 2019-12-06 | 华为技术有限公司 | 物体识别方法及装置 |
CN110866514A (zh) * | 2019-11-22 | 2020-03-06 | 安徽小眯当家信息技术有限公司 | 一种异常状态识别的方法、装置、客户端、服务器及计算机可读介质 |
CN111597923A (zh) * | 2020-04-28 | 2020-08-28 | 上海伟声德智能科技有限公司 | 一种对人员温度进行监测的方法、装置和电子设备 |
CN111638795A (zh) * | 2020-06-05 | 2020-09-08 | 上海商汤智能科技有限公司 | 一种控制虚拟对象展示状态的方法及装置 |
CN111640165A (zh) * | 2020-06-08 | 2020-09-08 | 上海商汤智能科技有限公司 | Ar合影图像的获取方法、装置、计算机设备及存储介质 |
CN111798407A (zh) * | 2020-05-15 | 2020-10-20 | 国网浙江省电力有限公司嘉兴供电公司 | 一种基于神经网络模型的带电设备故障诊断方法 |
CN112069931A (zh) * | 2020-08-20 | 2020-12-11 | 深圳数联天下智能科技有限公司 | 一种状态报告的生成方法及状态监控系统 |
CN112084965A (zh) * | 2020-09-11 | 2020-12-15 | 义乌市悦美科技有限公司 | 一种头皮头发检测装置及系统 |
CN112184722A (zh) * | 2020-09-15 | 2021-01-05 | 上海传英信息技术有限公司 | 图像处理方法、终端及计算机存储介质 |
CN112307799A (zh) * | 2019-07-24 | 2021-02-02 | 鲁班嫡系机器人(深圳)有限公司 | 姿态识别方法、装置、系统、存储介质及设备 |
CN112614568A (zh) * | 2020-12-28 | 2021-04-06 | 东软集团股份有限公司 | 检查图像的处理方法、装置、存储介质和电子设备 |
CN112633196A (zh) * | 2020-12-28 | 2021-04-09 | 浙江大华技术股份有限公司 | 人体姿态检测方法、装置和计算机设备 |
CN112836801A (zh) * | 2021-02-03 | 2021-05-25 | 上海商汤智能科技有限公司 | 深度学习网络确定方法、装置、电子设备及存储介质 |
CN112991494A (zh) * | 2021-01-28 | 2021-06-18 | 腾讯科技(深圳)有限公司 | 图像生成方法、装置、计算机设备及计算机可读存储介质 |
CN113325951A (zh) * | 2021-05-27 | 2021-08-31 | 百度在线网络技术(北京)有限公司 | 基于虚拟角色的操作控制方法、装置、设备以及存储介质 |
CN113596387A (zh) * | 2020-04-30 | 2021-11-02 | 中国石油天然气股份有限公司 | 监控系统 |
CN114077303A (zh) * | 2020-08-21 | 2022-02-22 | 广州视享科技有限公司 | Ar眼镜摄像头角度调整方法、装置、电子设备及介质 |
CN114745576A (zh) * | 2022-03-25 | 2022-07-12 | 上海合志信息技术有限公司 | 一种家庭健身互动方法、装置、电子设备以及存储介质 |
CN114822127A (zh) * | 2022-04-20 | 2022-07-29 | 深圳市铱硙医疗科技有限公司 | 一种基于虚拟现实设备的训练方法及训练装置 |
CN116503508A (zh) * | 2023-06-26 | 2023-07-28 | 南昌航空大学 | 一种个性化模型构建方法、系统、计算机及可读存储介质 |
CN116661608A (zh) * | 2023-07-26 | 2023-08-29 | 海马云(天津)信息技术有限公司 | 虚拟人动捕的模型切换方法和装置、电子设备及存储介质 |
CN116740768A (zh) * | 2023-08-11 | 2023-09-12 | 南京诺源医疗器械有限公司 | 基于鼻颅镜的导航可视化方法、系统、设备及存储介质 |
CN117456611A (zh) * | 2023-12-22 | 2024-01-26 | 拓世科技集团有限公司 | 一种基于人工智能的虚拟人物训练方法及系统 |
Families Citing this family (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN112306220A (zh) * | 2019-07-31 | 2021-02-02 | 北京字节跳动网络技术有限公司 | 基于肢体识别的控制方法、装置、电子设备及存储介质 |
CN110740264B (zh) * | 2019-10-31 | 2021-06-04 | 重庆工商职业学院 | 一种智能摄像数据快速采集系统及采集方法 |
CN112949355A (zh) * | 2019-12-10 | 2021-06-11 | Oppo广东移动通信有限公司 | 一种姿态迁移方法、装置及存储介质 |
CN111461014B (zh) * | 2020-04-01 | 2023-06-27 | 西安电子科技大学 | 基于深度学习的天线姿态参数检测方法、装置及存储介质 |
CN112200126A (zh) * | 2020-10-26 | 2021-01-08 | 上海盛奕数字科技有限公司 | 一种基于人工智能跑步肢体遮挡姿态识别方法 |
CN112396494B (zh) * | 2020-11-23 | 2024-06-21 | 北京百度网讯科技有限公司 | 商品引导方法、装置、设备及存储介质 |
CN113762286A (zh) * | 2021-09-16 | 2021-12-07 | 平安国际智慧城市科技股份有限公司 | 数据模型训练方法、装置、设备及介质 |
CN117115400A (zh) * | 2023-09-15 | 2023-11-24 | 深圳市红箭头科技有限公司 | 实时显示全身人体动作的方法、装置、计算机设备及存储介质 |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN101576953A (zh) * | 2009-06-10 | 2009-11-11 | 北京中星微电子有限公司 | 一种人体姿态的分类方法和装置 |
CN105760836A (zh) * | 2016-02-17 | 2016-07-13 | 厦门美图之家科技有限公司 | 基于深度学习的多角度人脸对齐方法、系统及拍摄终端 |
CN106127167A (zh) * | 2016-06-28 | 2016-11-16 | 广东欧珀移动通信有限公司 | 一种增强现实中目标对象的识别方法、装置及移动终端 |
CN106155315A (zh) * | 2016-06-28 | 2016-11-23 | 广东欧珀移动通信有限公司 | 一种拍摄中增强现实效果的添加方法、装置及移动终端 |
WO2017164478A1 (ko) * | 2016-03-25 | 2017-09-28 | 한국과학기술원 | 미세 얼굴 다이나믹의 딥 러닝 분석을 통한 미세 표정 인식 방법 및 장치 |
Family Cites Families (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN103337083B (zh) * | 2013-07-11 | 2016-03-09 | 南京大学 | 一种非侵入式大运动条件下人体测量方法 |
US9336440B2 (en) * | 2013-11-25 | 2016-05-10 | Qualcomm Incorporated | Power efficient use of a depth sensor on a mobile device |
CN105787439B (zh) * | 2016-02-04 | 2019-04-05 | 广州新节奏智能科技股份有限公司 | 一种基于卷积神经网络的深度图像人体关节定位方法 |
WO2017156243A1 (en) * | 2016-03-11 | 2017-09-14 | Siemens Aktiengesellschaft | Deep-learning based feature mining for 2.5d sensing image search |
CN106097435A (zh) * | 2016-06-07 | 2016-11-09 | 北京圣威特科技有限公司 | 一种增强现实拍摄系统及方法 |
- 2017-12-21: CN application CN201711392033.5A, published as CN109951628A (zh), status: active (Pending)
- 2018-11-21: WO application PCT/CN2018/116800, published as WO2019120032A1 (zh), status: active (Application Filing)
Patent Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN101576953A (zh) * | 2009-06-10 | 2009-11-11 | 北京中星微电子有限公司 | 一种人体姿态的分类方法和装置 |
CN105760836A (zh) * | 2016-02-17 | 2016-07-13 | 厦门美图之家科技有限公司 | 基于深度学习的多角度人脸对齐方法、系统及拍摄终端 |
WO2017164478A1 (ko) * | 2016-03-25 | 2017-09-28 | 한국과학기술원 | 미세 얼굴 다이나믹의 딥 러닝 분석을 통한 미세 표정 인식 방법 및 장치 |
CN106127167A (zh) * | 2016-06-28 | 2016-11-16 | 广东欧珀移动通信有限公司 | 一种增强现实中目标对象的识别方法、装置及移动终端 |
CN106155315A (zh) * | 2016-06-28 | 2016-11-23 | 广东欧珀移动通信有限公司 | 一种拍摄中增强现实效果的添加方法、装置及移动终端 |
Cited By (36)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN112307799A (zh) * | 2019-07-24 | 2021-02-02 | 鲁班嫡系机器人(深圳)有限公司 | 姿态识别方法、装置、系统、存储介质及设备 |
CN110543578B (zh) * | 2019-08-09 | 2024-05-14 | 华为技术有限公司 | 物体识别方法及装置 |
CN110543578A (zh) * | 2019-08-09 | 2019-12-06 | 华为技术有限公司 | 物体识别方法及装置 |
CN110866514A (zh) * | 2019-11-22 | 2020-03-06 | 安徽小眯当家信息技术有限公司 | 一种异常状态识别的方法、装置、客户端、服务器及计算机可读介质 |
CN111597923A (zh) * | 2020-04-28 | 2020-08-28 | 上海伟声德智能科技有限公司 | 一种对人员温度进行监测的方法、装置和电子设备 |
CN111597923B (zh) * | 2020-04-28 | 2023-05-12 | 上海伟声德智能科技有限公司 | 一种对人员温度进行监测的方法、装置和电子设备 |
CN113596387B (zh) * | 2020-04-30 | 2023-10-31 | 中国石油天然气股份有限公司 | 监控系统 |
CN113596387A (zh) * | 2020-04-30 | 2021-11-02 | 中国石油天然气股份有限公司 | 监控系统 |
CN111798407B (zh) * | 2020-05-15 | 2024-05-21 | 国网浙江省电力有限公司嘉兴供电公司 | 一种基于神经网络模型的带电设备故障诊断方法 |
CN111798407A (zh) * | 2020-05-15 | 2020-10-20 | 国网浙江省电力有限公司嘉兴供电公司 | 一种基于神经网络模型的带电设备故障诊断方法 |
CN111638795B (zh) * | 2020-06-05 | 2024-06-11 | 上海商汤智能科技有限公司 | 一种控制虚拟对象展示状态的方法及装置 |
CN111638795A (zh) * | 2020-06-05 | 2020-09-08 | 上海商汤智能科技有限公司 | 一种控制虚拟对象展示状态的方法及装置 |
CN111640165A (zh) * | 2020-06-08 | 2020-09-08 | 上海商汤智能科技有限公司 | Ar合影图像的获取方法、装置、计算机设备及存储介质 |
CN112069931A (zh) * | 2020-08-20 | 2020-12-11 | 深圳数联天下智能科技有限公司 | 一种状态报告的生成方法及状态监控系统 |
CN114077303A (zh) * | 2020-08-21 | 2022-02-22 | 广州视享科技有限公司 | Ar眼镜摄像头角度调整方法、装置、电子设备及介质 |
CN112084965A (zh) * | 2020-09-11 | 2020-12-15 | 义乌市悦美科技有限公司 | 一种头皮头发检测装置及系统 |
CN112184722B (zh) * | 2020-09-15 | 2024-05-03 | 上海传英信息技术有限公司 | 图像处理方法、终端及计算机存储介质 |
CN112184722A (zh) * | 2020-09-15 | 2021-01-05 | 上海传英信息技术有限公司 | 图像处理方法、终端及计算机存储介质 |
CN112614568B (zh) * | 2020-12-28 | 2024-05-28 | 东软集团股份有限公司 | 检查图像的处理方法、装置、存储介质和电子设备 |
CN112633196A (zh) * | 2020-12-28 | 2021-04-09 | 浙江大华技术股份有限公司 | 人体姿态检测方法、装置和计算机设备 |
CN112614568A (zh) * | 2020-12-28 | 2021-04-06 | 东软集团股份有限公司 | 检查图像的处理方法、装置、存储介质和电子设备 |
CN112991494B (zh) * | 2021-01-28 | 2023-09-15 | 腾讯科技(深圳)有限公司 | 图像生成方法、装置、计算机设备及计算机可读存储介质 |
CN112991494A (zh) * | 2021-01-28 | 2021-06-18 | 腾讯科技(深圳)有限公司 | 图像生成方法、装置、计算机设备及计算机可读存储介质 |
CN112836801A (zh) * | 2021-02-03 | 2021-05-25 | 上海商汤智能科技有限公司 | 深度学习网络确定方法、装置、电子设备及存储介质 |
CN113325951A (zh) * | 2021-05-27 | 2021-08-31 | 百度在线网络技术(北京)有限公司 | 基于虚拟角色的操作控制方法、装置、设备以及存储介质 |
CN113325951B (zh) * | 2021-05-27 | 2024-03-29 | 百度在线网络技术(北京)有限公司 | 基于虚拟角色的操作控制方法、装置、设备以及存储介质 |
CN114745576A (zh) * | 2022-03-25 | 2022-07-12 | 上海合志信息技术有限公司 | 一种家庭健身互动方法、装置、电子设备以及存储介质 |
CN114822127A (zh) * | 2022-04-20 | 2022-07-29 | 深圳市铱硙医疗科技有限公司 | 一种基于虚拟现实设备的训练方法及训练装置 |
CN114822127B (zh) * | 2022-04-20 | 2024-04-02 | 深圳市铱硙医疗科技有限公司 | 一种基于虚拟现实设备的训练方法及训练装置 |
CN116503508A (zh) * | 2023-06-26 | 2023-07-28 | 南昌航空大学 | 一种个性化模型构建方法、系统、计算机及可读存储介质 |
CN116661608B (zh) * | 2023-07-26 | 2023-10-03 | 海马云(天津)信息技术有限公司 | 虚拟人动捕的模型切换方法和装置、电子设备及存储介质 |
CN116661608A (zh) * | 2023-07-26 | 2023-08-29 | 海马云(天津)信息技术有限公司 | 虚拟人动捕的模型切换方法和装置、电子设备及存储介质 |
CN116740768B (zh) * | 2023-08-11 | 2023-10-20 | 南京诺源医疗器械有限公司 | 基于鼻颅镜的导航可视化方法、系统、设备及存储介质 |
CN116740768A (zh) * | 2023-08-11 | 2023-09-12 | 南京诺源医疗器械有限公司 | 基于鼻颅镜的导航可视化方法、系统、设备及存储介质 |
CN117456611A (zh) * | 2023-12-22 | 2024-01-26 | 拓世科技集团有限公司 | 一种基于人工智能的虚拟人物训练方法及系统 |
CN117456611B (zh) * | 2023-12-22 | 2024-03-29 | 拓世科技集团有限公司 | 一种基于人工智能的虚拟人物训练方法及系统 |
Also Published As
Publication number | Publication date |
---|---|
CN109951628A (zh) | 2019-06-28 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
WO2019120032A1 (zh) | 模型构建方法、拍照方法、装置、存储介质及终端 | |
WO2020216116A1 (zh) | 动作识别方法和装置、人机交互方法和装置 | |
US20200327695A1 (en) | Relocalization method and apparatus in camera pose tracking process, device, and storage medium | |
WO2019109801A1 (zh) | 拍摄参数的调整方法、装置、存储介质及移动终端 | |
US10963727B2 (en) | Method, device and storage medium for determining camera posture information | |
CN111726536B (zh) | 视频生成方法、装置、存储介质及计算机设备 | |
US11138434B2 (en) | Electronic device for providing shooting mode based on virtual character and operation method thereof | |
CN109194879B (zh) | 拍照方法、装置、存储介质及移动终端 | |
US11003253B2 (en) | Gesture control of gaming applications | |
US10043308B2 (en) | Image processing method and apparatus for three-dimensional reconstruction | |
US10318011B2 (en) | Gesture-controlled augmented reality experience using a mobile communications device | |
WO2019218880A1 (zh) | 识别交互方法、装置、存储介质及终端设备 | |
WO2019184889A1 (zh) | 增强现实模型的调整方法、装置、存储介质和电子设备 | |
WO2019205851A1 (zh) | 位姿确定方法、装置、智能设备及存储介质 | |
CN109348135A (zh) | 拍照方法、装置、存储介质及终端设备 | |
JP2021524957A (ja) | 画像処理方法およびその、装置、端末並びにコンピュータプログラム | |
CN109145809B (zh) | 一种记谱处理方法和装置以及计算机可读存储介质 | |
US10931880B2 (en) | Electronic device and method for providing information thereof | |
CN108646920A (zh) | 识别交互方法、装置、存储介质及终端设备 | |
WO2020110547A1 (ja) | 情報処理装置、情報処理方法およびプログラム | |
WO2020221121A1 (zh) | 视频查询方法、装置、设备及存储介质 | |
TWI653546B (zh) | 具有外置追蹤及內置追蹤之虛擬實境系統及其控制方法 | |
KR20160106653A (ko) | 조정된 스피치 및 제스처 입력 | |
CN108921815A (zh) | 拍照交互方法、装置、存储介质及终端设备 | |
CN109508713A (zh) | 图片获取方法、装置、终端和存储介质 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
| 121 | Ep: the epo has been informed by wipo that ep was designated in this application | Ref document number: 18892471; Country of ref document: EP; Kind code of ref document: A1 |
| NENP | Non-entry into the national phase | Ref country code: DE |
| 122 | Ep: pct application non-entry in european phase | Ref document number: 18892471; Country of ref document: EP; Kind code of ref document: A1 |