WO2020019873A1 - Image processing method, apparatus, terminal, and computer-readable storage medium - Google Patents
Image processing method, apparatus, terminal, and computer-readable storage medium
- Publication number
- WO2020019873A1 (PCT/CN2019/089825)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- pixel
- classification
- head
- target image
- target
- Prior art date
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V40/00—Recognition of biometric, human-related or animal-related patterns in image or video data
- G06V40/10—Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
- G06V40/16—Human faces, e.g. facial parts, sketches or expressions
- G06V40/174—Facial expression recognition
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/764—Arrangements for image or video recognition or understanding using pattern recognition or machine learning using classification, e.g. of video objects
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/21—Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
- G06F18/214—Generating training patterns; Bootstrap methods, e.g. bagging or boosting
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/21—Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
- G06F18/217—Validation; Performance evaluation; Active pattern learning techniques
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/24—Classification techniques
- G06F18/241—Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
- G06F18/2413—Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on distances to training or reference patterns
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N20/00—Machine learning
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T3/00—Geometric image transformations in the plane of the image
- G06T3/40—Scaling of whole images or parts thereof, e.g. expanding or contracting
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/77—Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
- G06V10/774—Generating sets of training patterns; Bootstrap methods, e.g. bagging or boosting
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/77—Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
- G06V10/776—Validation; Performance evaluation
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/82—Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V40/00—Recognition of biometric, human-related or animal-related patterns in image or video data
- G06V40/10—Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
- G06V40/16—Human faces, e.g. facial parts, sketches or expressions
- G06V40/161—Detection; Localisation; Normalisation
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V40/00—Recognition of biometric, human-related or animal-related patterns in image or video data
- G06V40/10—Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
- G06V40/16—Human faces, e.g. facial parts, sketches or expressions
- G06V40/172—Classification, e.g. identification
Definitions
- Embodiments of the present invention relate to the field of computer technology, and in particular, to an image processing method, device, terminal, and computer-readable storage medium.
- the user can determine a rectangular frame by sliding or dragging it over the position of the head so that the head lies inside the rectangular frame.
- the terminal then uses the rectangular frame determined by the user as the head region and edits that region.
- an image processing method, device, terminal, and computer-readable storage medium are provided.
- An image processing method executed by a terminal includes:
- a trained pixel classification model is obtained; the pixel classification model is used to determine a classification identifier for each pixel in any image, the classification identifiers include at least a head classification identifier, and the head classification identifier indicates that the corresponding pixel is located in the head region;
- each pixel in a target image is classified based on the pixel classification model to obtain a classification identifier for each pixel in the target image;
- the head region of the target image is determined according to the head classification identifier among the classification identifiers, and the head region is edited.
- An image processing device includes:
- a first acquisition module configured to acquire a trained pixel classification model, where the pixel classification model is used to determine a classification identifier for each pixel in any image, the classification identifiers include at least a head classification identifier, and the head classification identifier is used to indicate that the corresponding pixel is located in the head region;
- a classification module configured to classify each pixel in a target image based on the pixel classification model to obtain a classification identifier of each pixel in the target image
- a first processing module is configured to determine a head region of the target image according to a head classification identifier in the classification identifier, and perform editing processing on the head region.
- a terminal for image processing includes a processor and a memory; the memory stores at least one instruction, at least one program, a code set, or an instruction set, which is loaded by the processor to execute the steps of the image processing method.
- a computer-readable storage medium stores at least one instruction, at least one program, a code set, or an instruction set; the instruction, the program, the code set, or the instruction set is loaded by a processor to execute the steps of the image processing method.
- FIG. 1a is an application environment diagram of an image processing method according to an embodiment of the present invention.
- FIG. 1b is a flowchart of an image processing method according to an embodiment of the present invention.
- FIG. 3 is a flowchart of training a pixel classification model according to an embodiment of the present invention.
- FIG. 4 is a schematic structural diagram of a pixel classification model according to an embodiment of the present invention.
- FIG. 5 is a schematic diagram of a processing effect of a head region according to an embodiment of the present invention.
- FIG. 6 is a flowchart of an image processing method according to an embodiment of the present invention.
- FIG. 7 is a schematic structural diagram of an image processing apparatus according to an embodiment of the present invention.
- FIG. 8 is a schematic structural diagram of a terminal according to an embodiment of the present invention.
- when the head region in a target image is to be edited, the head region is usually first determined manually by the user.
- the head region determined this way includes not only the head but also the area surrounding it, so the result is inaccurate.
- An embodiment of the present invention provides an image processing method that classifies the pixels in a target image based on a pixel classification model to determine the head region in the target image, achieving pixel-level recognition that finely traces the head edge; editing the finely traced head region improves accuracy.
- the embodiment of the present invention can be applied to any scene where the head region of an image is edited.
- the method provided in the embodiment of the present invention may be used to edit the head area in the photo.
- the method provided by the embodiment of the present invention may be used to edit the head region of each frame of the video in the video.
- a third-party application for editing images is installed on the terminal.
- the third-party application can call the photos or videos in the gallery, use the method provided by the embodiment of the present invention to edit the head region in the photos or videos, and store the edited photos or videos in the gallery.
- the above-mentioned gallery can be a local gallery or a server-side gallery.
- FIG. 1a is an application environment diagram of an image processing method in an embodiment.
- the image processing method is applied to an image processing system.
- the image processing system includes a terminal 110 and a server 120.
- the terminal 110 and the server 120 are connected through a network.
- the terminal 110 collects a target image (or video) through a camera, or obtains a target image (or video) from a local gallery or from a gallery of the server 120; it then acquires a trained pixel classification model, which is used to determine the classification identifier of each pixel in any image.
- the classification identification includes at least a head classification identification.
- the head classification identification is used to indicate that the corresponding pixel is located in the head area.
- Each pixel in the target image is classified based on the pixel classification model to obtain the classification identifier of each pixel in the target image; the head region of the target image is determined according to the head classification identifier among the classification identifiers, and the head region is edited.
- the terminal 110 may be a desktop terminal or a mobile terminal, and the mobile terminal may be at least one of a mobile phone, a tablet computer, and a notebook computer.
- the server 120 may be implemented by an independent server or a server cluster composed of multiple servers.
- FIG. 1b is a flowchart of an image processing method according to an embodiment of the present invention.
- the execution subject of this embodiment of the present invention is a terminal. Referring to FIG. 1b, the method includes:
- the terminal determines a target image to be processed, performs face detection on the target image, and obtains a face region of the target image.
- the terminal obtains a trained expression recognition model, recognizes a face region based on the expression recognition model, and obtains an expression category of the face region.
- the embodiment of the present invention is applied to a scenario in which a head region of a target image is edited, and a terminal may determine a target image to be processed, identify the head region of the target image, and perform editing processing.
- the terminal may perform editing according to the expression of the face region in the target image. Since the target image determined by the terminal includes a face region and may also include non-face regions, the terminal performs face detection on the target image to obtain the face region, obtains an expression recognition model, inputs the face region into the expression recognition model, recognizes the face region, and obtains its expression category.
- a preset face detection algorithm may be adopted, or a face detection interface provided by a terminal may be called to perform face detection on a target image.
- the expression recognition model is used to divide the face area into at least two expression categories, such as surprised expressions, happy expressions, etc.
- the at least two expression categories can be determined when training the expression recognition model.
- the training device can obtain multiple sample face images and the expression category of each sample face image, and perform multiple rounds of iterative training based on them to obtain the expression recognition model, continuing until the recognition accuracy of the trained expression recognition model reaches a second preset threshold.
- the training device may construct an initial expression recognition model, obtain a training data set and a test data set, and each of the training data set and the test data set includes a plurality of sample face images and corresponding expression categories.
- the training device may use a crawler program to collect face images from the network, obtaining multiple sample face images, and label the expression category of each sample face image.
- in the training phase, the sample face images in the training data set are used as the input of the expression recognition model and the corresponding expression categories as its output.
- the expression recognition model is iteratively trained so that it learns the features of facial expressions in face images and gains the ability to recognize them. After that, each sample face image in the test data set is used as the input of the expression recognition model, the test expression category of each sample face image is obtained based on the model, and the test expression category is compared with the labeled actual expression category to determine the recognition accuracy of the expression recognition model. When the recognition accuracy of the expression recognition model is less than the second preset threshold, training continues on the training data set until the recognition accuracy of the trained model reaches the second preset threshold, at which point training is complete.
- the second preset threshold may be determined according to the accuracy requirement and calculation requirement of the expression recognition, and may be a value such as 95% or 99%.
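The train-until-threshold loop described above can be sketched as follows. This is an illustrative sketch only: `train_one_round` and `evaluate_accuracy` are hypothetical placeholders for the real training and test-set evaluation, not functions defined by the patent.

```python
# Sketch of the "train until the accuracy threshold is reached" procedure used
# for both the expression recognition model and the pixel classification model.
# `train_one_round` and `evaluate_accuracy` are hypothetical stand-ins.

def train_until_threshold(model, train_set, test_set,
                          train_one_round, evaluate_accuracy,
                          threshold=0.95, max_rounds=1000):
    """Keep training until test accuracy reaches the preset threshold."""
    accuracy = evaluate_accuracy(model, test_set)
    rounds = 0
    while accuracy < threshold and rounds < max_rounds:
        train_one_round(model, train_set)      # one more round of training
        accuracy = evaluate_accuracy(model, test_set)
        rounds += 1
    return model, accuracy
```

The threshold (e.g. 0.95 or 0.99) plays the role of the first or second preset threshold mentioned in the text.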
- the training device may be the terminal itself, or a device other than the terminal, such as a server; after the training device performs offline training, it sends the expression recognition model to the terminal for use by the terminal.
- the training device may use at least one of a linear classifier, a support vector machine, a deep neural network, and a decision tree to train the expression recognition model; the trained expression recognition model may accordingly be at least one of a linear classifier model, a support vector machine model, a deep neural network model, and a decision tree model.
- a training flowchart of the expression recognition model may be as shown in FIG. 2.
- taking Mobilenet (a lightweight deep neural network model) as the expression recognition model as an example: this network model is fast to compute, small in size, and accurate in recognition, can quickly respond to a large number of user requests, and reduces the backend burden.
| Network layer / stride | Convolution kernels / channels | Feature map size |
| --- | --- | --- |
| Conv / s2 | 3×3 / 32 | 112×112×32 |
| DepthSepConv / s1 | 3×3 / 64 | 112×112×64 |
| DepthSepConv / s2 | 3×3 / 128 | 56×56×128 |
| DepthSepConv / s1 | 3×3 / 128 | 56×56×128 |
| DepthSepConv / s2 | 3×3 / 256 | 28×28×256 |
| DepthSepConv / s1 | 3×3 / 256 | 28×28×256 |
| DepthSepConv / s2 | 3×3 / 512 | 14×14×512 |
| DepthSepConv / s1 (×5) | 3×3 / 512 | 14×14×512 |
| DepthSepConv / s2 | 3×3 / 1024 | 7×7×1024 |
| DepthSepConv / s1 | 3×3 / 1024 | 7×7×1024 |
| Pooling | 7×7 | 1×1×1024 |
| Conv / s1 | 3×3×N | 1×1×N |
- Conv denotes a convolution layer, and DepthSepConv denotes a depthwise separable convolution layer, in which a 3×3 depthwise convolution is performed first, followed by a 1×1 pointwise convolution.
- Pooling denotes a pooling layer.
- the stride parameter of the convolution operation in each network layer is s1 or s2, where the value of s1 is 1, and the value of s2 is 2.
- the size of the feature map of each network layer is the size of the data output by the network layer, and the size of the feature map output by the last layer is 1 * 1 * N, where N is the number of expression categories.
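A DepthSepConv layer of the kind listed in the table can be illustrated with a minimal numpy sketch (stride 1, "same" padding, shapes matching the first 112×112 rows). This is an illustrative implementation, not code from the patent:

```python
import numpy as np

# A depthwise separable convolution: a 3x3 depthwise convolution applied
# per input channel, followed by a 1x1 pointwise convolution mixing channels.

def depthwise_separable_conv(x, depthwise_k, pointwise_k):
    """x: (H, W, C_in); depthwise_k: (3, 3, C_in); pointwise_k: (C_in, C_out)."""
    h, w, c_in = x.shape
    padded = np.pad(x, ((1, 1), (1, 1), (0, 0)))   # "same" padding
    depthwise = np.zeros_like(x)
    for i in range(3):                              # slide the 3x3 window
        for j in range(3):
            depthwise += padded[i:i + h, j:j + w, :] * depthwise_k[i, j, :]
    return depthwise @ pointwise_k                  # 1x1 pointwise convolution

x = np.random.rand(112, 112, 32)
out = depthwise_separable_conv(x, np.random.rand(3, 3, 32), np.random.rand(32, 64))
print(out.shape)  # (112, 112, 64), matching the 112*112*64 row of the table
```

Factoring the convolution this way is what makes Mobilenet light: the expensive full 3×3×C_in×C_out convolution is replaced by a cheap per-channel 3×3 pass plus a 1×1 channel mix.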
- N-dimensional data is finally output.
- the N-dimensional data can be passed through softmax (the normalized exponential function) to obtain a probability for each entry, from which the entry with the highest probability is taken.
- the N-dimensional data can represent the probability that the facial expressions in the target image belong to N expression categories, and the data with the highest probability is the expression category to which the facial expressions in the target image most likely belong.
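The final softmax-and-pick step can be sketched as follows; the category names are illustrative placeholders, not the categories of the patent's trained model:

```python
import numpy as np

# Softmax over the N-dimensional output gives a probability per expression
# category; the most probable category is the predicted expression.

def predict_expression(logits, categories):
    exp = np.exp(logits - np.max(logits))   # subtract max for numerical stability
    probs = exp / exp.sum()
    return categories[int(np.argmax(probs))], probs

categories = ["surprised", "happy", "fear", "like"]            # N = 4
label, probs = predict_expression(np.array([0.2, 2.5, 0.1, 0.4]), categories)
print(label)  # happy (the highest-probability category)
```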
- the terminal obtains a trained pixel classification model, classifies each pixel in the target image based on the pixel classification model, and obtains a classification identifier of each pixel in the target image.
- the terminal may set one or more target expression categories, and only when the target image has a face area that matches the target expression category, the head region of the target image is edited. Therefore, when the terminal determines the expression category of the face region in the target image based on the expression recognition model, it is determined whether the expression category is the target expression category. When the expression category is not the target expression category, no editing processing is performed.
- in order to identify the head region in the target image, the terminal first obtains a pixel classification model, inputs the target image into the pixel classification model, and classifies each pixel in the target image to obtain the classification identifier of each pixel.
- the pixel classification model is used to determine the classification identifier of a pixel in any image.
- the classification identifiers include a head classification identifier and a non-head classification identifier: the head classification identifier indicates that the corresponding pixel is located in the head region, and the non-head classification identifier indicates that the corresponding pixel is located in a non-head region, so that each pixel can be assigned to either the head region or a non-head region.
- the head classification identifier and the non-head classification identifier are different classification identifiers determined when the pixel classification model is trained. For example, the head classification identifier is 1 and the non-head classification identifier is 0.
- the training device can obtain multiple sample images and the classification identifier of each pixel in each sample image, and perform multiple rounds of iterative training based on them to obtain the pixel classification model, continuing until the classification accuracy of the trained pixel classification model reaches a first preset threshold.
- the training device may construct an initial pixel classification model, obtain a training data set and a test data set, and both the training data set and the test data set include multiple sample images and each pixel in each sample image Classification ID.
- the training device may use a crawler program to collect sample images from the network, obtaining multiple sample images, and label the classification identifier of each pixel in each sample image according to the head region in that image.
- in the training phase, the sample images in the training data set are used as the input of the pixel classification model and the classification identifier of each pixel in the sample image as its output.
- the pixel classification model is iteratively trained so that it learns the features of head regions in the sample images and gains the ability to separate out head-region pixels.
- each sample image in the test data set is used as the input of the pixel classification model, and the test classification identifier of each pixel in each sample image is obtained based on the pixel classification model, and the test classification identifier is compared with the labeled actual classification identifier.
- when the classification accuracy of the pixel classification model is less than the first preset threshold, training continues on the training data set until the classification accuracy of the trained pixel classification model reaches the first preset threshold, at which point training is complete.
- the first preset threshold may be determined according to the accuracy requirement and the calculation requirement of the pixel classification in the sample image, and may be a value such as 95% or 99%.
- the training device may be the terminal itself, or a device other than the terminal, such as a server; after the training device performs offline training, it sends the pixel classification model to the terminal for use by the terminal.
- the training device may use at least one training algorithm among a linear classifier, a support vector machine, a deep neural network, and a decision tree to train the pixel classification model; the trained pixel classification model may accordingly be at least one of a linear classifier model, a support vector machine model, a deep neural network model, and a decision tree model.
- a training flowchart of a pixel classification model may be shown in FIG. 3.
- taking as an example a pixel classification model that combines a semantic segmentation network with a Mobilenet base network (see FIG. 4): the target image is input into the network model, a coarse prediction is made by the semantic segmentation network, and multi-resolution convolution and deconvolution operations then restore the prediction to the size of the target image, after which each pixel of the target image is classified. If the classification identifier obtained is 1, the pixel is considered a head-region pixel; otherwise it is considered a non-head-region pixel.
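A toy numpy sketch of this decode step, with nearest-neighbour repetition standing in for the deconvolution that restores the target image size (an illustrative simplification, not the patent's network):

```python
import numpy as np

# A coarse per-pixel score map is upsampled back to the target image size,
# then thresholded into 0/1 classification identifiers per pixel.

def ids_from_coarse_scores(coarse, factor):
    """coarse: (h, w) scores in [0, 1]; returns (h*factor, w*factor) 0/1 map."""
    full = coarse.repeat(factor, axis=0).repeat(factor, axis=1)
    return (full >= 0.5).astype(np.uint8)   # 1 = head pixel, 0 = non-head

coarse = np.array([[0.9, 0.1],
                   [0.8, 0.2]])
ids = ids_from_coarse_scores(coarse, factor=2)
print(ids)  # left half all 1s (head), right half all 0s (non-head)
```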
- the terminal determines the head region of the target image according to the pixels whose classification identifier is the head classification identifier.
- an area composed of multiple pixels whose classification identifier is the head classification identifier may be determined as the head region of the target image.
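Deriving the head region from the identifier map can be sketched as follows. The patent treats the labelled pixels themselves as the region; the bounding box computed here is just one convenient illustrative summary of that pixel set:

```python
import numpy as np

# Collect the pixels whose classification identifier equals the head
# identifier (1) and report their bounding box.

def head_region(id_map):
    ys, xs = np.nonzero(id_map == 1)
    if ys.size == 0:
        return None                        # no head pixels found
    # (top, left, bottom, right) of the head-pixel area
    return int(ys.min()), int(xs.min()), int(ys.max()), int(xs.max())

id_map = np.zeros((6, 6), dtype=np.uint8)
id_map[1:4, 2:5] = 1                       # head pixels
print(head_region(id_map))  # (1, 2, 3, 4)
```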
- the terminal determines the target processing mode corresponding to the target expression category according to a preset correspondence between expression categories and processing modes, and uses the target processing mode to edit the head region in the target image.
- the terminal may preset a correspondence relationship between an expression category and a processing mode, which indicates that a corresponding processing mode may be used for editing processing on a head region belonging to a specific expression category. Therefore, the terminal determines the target processing mode corresponding to the target expression category, and uses the target processing mode to edit the head region in the target image.
- the processing modes set in the correspondence may include at least one of the following: enlarging or reducing the head region, adding material to the head region, displaying a dynamic shaking effect on the head region, or other processing modes.
- the materials that can be added can include lighting effects, stickers, pendants, and so on.
| Expression category | Processing mode |
| --- | --- |
| Surprised | Magnify the head region |
| Happy | Add a glow effect to the head region |
| Fear | Shake the head region |
| Like | Add a sticker inside the head region |
| ... | ... |
- a text sticker “happy with full face” and a smiley sticker are added to the left of the head region in the target image to match the happy expression.
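The correspondence between expression categories and processing modes amounts to a simple lookup; the category names and mode strings below are illustrative placeholders:

```python
# Preset correspondence: expression category -> processing mode.
PROCESSING_MODES = {
    "surprised": "magnify head region",
    "happy": "add glow effect to head region",
    "fear": "shake head region",
    "like": "add sticker inside head region",
}

def target_processing_mode(expression_category):
    # Only target expression categories present in the mapping get edited;
    # other categories return None and no editing is performed.
    return PROCESSING_MODES.get(expression_category)

print(target_processing_mode("happy"))    # add glow effect to head region
print(target_processing_mode("neutral"))  # None
```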
- alternatively, the terminal may not set the correspondence, and instead edit the head region using a preset processing method.
- the preset processing method may be set by the terminal by default, or preset by a user, or may be determined according to a user's editing operation in the target image.
- the terminal displays the option to add a sticker and the option to add a glow effect, and when a user's selection operation for the option to add a glow effect is detected, a glow effect is added to the head area.
- the terminal may not perform facial expression recognition on the face area in the target image, and when the target image is obtained, directly perform steps 103-105 to edit the head area.
- the target image may be a single image or an image in a video.
- the single image or the video may be captured by the terminal, or may be sent to the terminal by other devices.
- the terminal obtains a target video that includes multiple images arranged in sequence; taking each of the multiple images as the target image, it classifies each pixel of the images in the video to obtain classification identifiers, and then edits the head region in each image of the video using the method provided by the embodiment of the present invention.
- FIG. 6 is a flowchart of an image processing method according to an embodiment of the present invention.
- after the terminal captures a video, face detection is performed first, and the detected face region is recognized based on the expression recognition model.
- when the recognized expression category is the target expression category, the target image is classified at the pixel level based on the pixel classification model, the head region in the target image is determined, and the head region is edited.
- each pixel in the target image is classified based on the pixel classification model to obtain the classification identifier of each pixel in the target image; the head region of the target image is then determined from the pixels whose classification identifier is the head classification identifier.
- classifying the pixels in the target image based on the pixel classification model in this way achieves pixel-level head recognition that finely traces the head edge, which improves the accuracy of the head region and thus the effect of the editing processing.
- before head recognition is performed, expression recognition is performed on the face region in the target image; only when the expression category of the face region is the target expression category is the head region edited, which improves the pertinence of the processing.
- the target processing method corresponding to the target expression category is used to edit the head region to ensure that the processing method matches the expression in the head region, which further improves the processing effect.
- although the steps in the flowcharts of FIGS. 1b-3 and 6 are displayed in sequence as indicated by the arrows, these steps are not necessarily performed in the order indicated. Unless explicitly stated herein, the execution order of these steps is not strictly limited, and they may be performed in other orders. Moreover, at least some of the steps in FIGS. 1b-3 and 6 may include multiple sub-steps or stages, which are not necessarily performed at the same time and may be performed at different times; the execution order of these sub-steps or stages is not necessarily sequential, and they may be performed in turn or alternately with at least part of the other steps, or of the sub-steps or stages of other steps.
- FIG. 7 is a schematic structural diagram of an image processing apparatus according to an embodiment of the present invention.
- the apparatus includes:
- a classification module 702 configured to perform the steps of classifying each pixel in a target image based on a pixel classification model in the foregoing embodiment
- the first processing module 703 is configured to perform the steps of determining the head region of the target image and performing editing processing on the head region in the foregoing embodiment.
- the device further includes:
- a second acquisition module configured to perform the steps of acquiring multiple sample images and the classification identifier of each pixel in the multiple sample images in the foregoing embodiment
- the first training module is configured to perform the training in the foregoing embodiment according to the multiple sample images and the classification identifier of each pixel in the multiple sample images.
- the device further includes:
- a detection module configured to perform the steps of performing a face detection on a target image to obtain a face region of the target image in the foregoing embodiment
- a third acquisition module configured to perform the steps of acquiring a trained expression recognition model in the foregoing embodiment
- An expression recognition module configured to perform the steps of recognizing a face region based on an expression recognition model in the above embodiment to obtain an expression category of the face region;
- the classification module 702 is further configured to perform the step of classifying each pixel in the target image based on the pixel classification model in the foregoing embodiment when the expression category of the face region is the target expression category.
- the first processing module 703 includes:
- a target processing unit configured to execute the steps of determining a target processing manner corresponding to a target expression category in the foregoing embodiment
- the editing processing unit is configured to execute the steps of performing the editing processing on the head area by using the target processing mode in the foregoing embodiment.
- the device further includes:
- a fourth obtaining module configured to perform the steps of obtaining multiple sample face images and expression categories of each sample face image in the foregoing embodiment
- the second training module is configured to perform the training in the foregoing embodiment according to a plurality of sample face images and expression categories of each sample face image.
- the device further includes:
- the video processing module is configured to perform the steps of obtaining a target video in the foregoing embodiment, and using each of a plurality of images as a target image, respectively.
- the first processing module 703 includes:
- a zoom processing unit configured to execute the steps of performing enlargement processing or reduction processing on the head area in the foregoing embodiment
- a material adding unit configured to perform the step of adding material in the head area in the foregoing embodiment
- a dynamic processing unit configured to perform the step of displaying a dynamic shaking effect in the head region in the foregoing embodiment.
- the above image processing apparatus may be implemented in the form of a computer program, and the computer program may be run on a terminal.
- the computer-readable storage medium on the terminal may store the program modules constituting the image processing apparatus, such as the first acquisition module 701, the classification module 702, and the first processing module 703 shown in FIG. 7.
- the computer program constituted by these program modules causes the processor to execute the steps in the image processing method of each embodiment of the present application.
- FIG. 8 is a schematic structural diagram of a terminal 800 according to an exemplary embodiment of the present invention.
- the terminal 800 may be a portable mobile terminal, such as a smartphone, a tablet computer, an MP3 (Moving Picture Experts Group Audio Layer III) player, an MP4 (Moving Picture Experts Group Audio Layer IV) player, a laptop, a desktop computer, a head-mounted device, or any other smart terminal.
- the terminal 800 may also be called other names such as user equipment, portable terminal, laptop terminal, desktop terminal, and the like.
- the terminal 800 includes a processor 801 and a memory 802.
- the processor 801 may include one or more processing cores, such as a 4-core processor, an 8-core processor, and the like.
- the processor 801 may be implemented in at least one hardware form of DSP (Digital Signal Processing), FPGA (Field-Programmable Gate Array), and PLA (Programmable Logic Array).
- the processor 801 may also include a main processor and a co-processor.
- the main processor is a processor for processing data in the awake state, also called a CPU (Central Processing Unit).
- the co-processor is a low-power processor for processing data in the standby state.
- the processor 801 may be integrated with a GPU (Graphics Processing Unit), and the GPU is responsible for rendering and drawing content required to be displayed on the display screen.
- the processor 801 may further include an AI (Artificial Intelligence) processor, and the AI processor is configured to process computing operations related to machine learning.
- the memory 802 may include one or more computer-readable storage media, which may be non-volatile and / or volatile memory.
- Non-volatile memory may include read-only memory (ROM), programmable ROM (PROM), electrically programmable ROM (EPROM), electrically erasable programmable ROM (EEPROM), or flash memory.
- Volatile memory can include random access memory (RAM) or external cache memory.
- RAM is available in various forms, such as static RAM (SRAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), double data rate SDRAM (DDR SDRAM), enhanced SDRAM (ESDRAM), synchlink DRAM (SLDRAM), Rambus direct RAM (RDRAM), direct Rambus dynamic RAM (DRDRAM), and Rambus dynamic RAM (RDRAM).
- the terminal 800 may optionally include a peripheral device interface 803 and at least one peripheral device.
- the processor 801, the memory 802, and the peripheral device interface 803 may be connected through a bus or a signal line.
- Each peripheral device can be connected to the peripheral device interface 803 through a bus, a signal line, or a circuit board.
- the peripheral device includes at least one of a radio frequency circuit 804, a touch display screen 805, a camera 806, an audio circuit 807, a positioning component 808, and a power supply 809.
- the peripheral device interface 803 may be used to connect at least one peripheral device related to I / O (Input / Output) to the processor 801 and the memory 802.
- in some embodiments, the processor 801, the memory 802, and the peripheral device interface 803 are integrated on the same chip or circuit board; in some other embodiments, any one or two of the processor 801, the memory 802, and the peripheral device interface 803 may be implemented on a separate chip or circuit board, which is not limited in this embodiment.
- the radio frequency circuit 804 is used for receiving and transmitting an RF (Radio Frequency) signal, also called an electromagnetic signal.
- the radio frequency circuit 804 communicates with a communication network and other communication devices through electromagnetic signals.
- the radio frequency circuit 804 converts electrical signals into electromagnetic signals for transmission, or converts received electromagnetic signals into electrical signals.
- the radio frequency circuit 804 includes an antenna system, an RF transceiver, one or more amplifiers, a tuner, an oscillator, a digital signal processor, a codec chipset, a subscriber identity module card, and the like.
- the radio frequency circuit 804 can communicate with other terminals through at least one wireless communication protocol.
- the wireless communication protocol includes, but is not limited to: a metropolitan area network, various generations of mobile communication networks (2G, 3G, 4G, and 5G), a wireless local area network, and / or a WiFi (Wireless Fidelity) network.
- the radio frequency circuit 804 may further include circuits related to Near Field Communication (NFC), which is not limited in this application.
- the display screen 805 is used to display a UI (User Interface).
- the UI can include graphics, text, icons, videos, and any combination thereof.
- the display screen 805 also has the ability to collect touch signals on or above the surface of the display screen 805.
- the touch signal can be input to the processor 801 as a control signal for processing.
- the display screen 805 may also be used to provide a virtual button and / or a virtual keyboard, which is also called a soft button and / or a soft keyboard.
- in some embodiments, there may be one display screen 805, which is disposed on the front panel of the terminal 800.
- in other embodiments, there may be at least two display screens 805, which are respectively disposed on different surfaces of the terminal 800 or adopt a folded design. In still other embodiments, the display screen 805 may be a flexible display screen disposed on a curved or folded surface of the terminal 800. Furthermore, the display screen 805 may even be set in a non-rectangular irregular shape, that is, a special-shaped screen.
- the display screen 805 can be made of materials such as LCD (Liquid Crystal Display) and OLED (Organic Light-Emitting Diode).
- the camera component 806 is used for capturing images or videos.
- the camera component 806 includes a front camera and a rear camera.
- the front camera is disposed on the front panel of the terminal, and the rear camera is disposed on the back of the terminal.
- the camera assembly 806 may further include a flash.
- the flash can be a single color temperature flash or a dual color temperature flash.
- a dual color temperature flash is a combination of a warm light flash and a cold light flash, which can be used for light compensation at different color temperatures.
- the audio circuit 807 may include a microphone and a speaker.
- the microphone is used to collect sound waves of the user and the environment, and convert the sound waves into electrical signals and input them to the processor 801 for processing, or input them to the radio frequency circuit 804 to implement voice communication.
- the microphone can also be an array microphone or an omnidirectional acquisition microphone.
- the speaker is used to convert electrical signals from the processor 801 or the radio frequency circuit 804 into sound waves.
- the speaker can be a traditional film speaker or a piezoelectric ceramic speaker.
- when the speaker is a piezoelectric ceramic speaker, it can not only convert electrical signals into sound waves audible to humans, but also convert electrical signals into sound waves inaudible to humans for purposes such as ranging.
- the audio circuit 807 may further include a headphone jack.
- the positioning component 808 is used to locate the current geographic position of the terminal 800 to implement navigation or LBS (Location Based Service).
- the positioning component 808 may be a positioning component based on the GPS (Global Positioning System) of the United States, the BeiDou system of China, the GLONASS system of Russia, or the Galileo system of the European Union.
- the power supply 809 is used to power various components in the terminal 800.
- the power source 809 may be an alternating current, a direct current, a disposable battery, or a rechargeable battery.
- the rechargeable battery may support wired charging or wireless charging.
- the rechargeable battery can also be used to support fast charging technology.
- the terminal 800 further includes one or more sensors 810.
- the one or more sensors 810 include, but are not limited to, an acceleration sensor 811, a gyroscope sensor 812, a pressure sensor 813, a fingerprint sensor 814, an optical sensor 815, and a proximity sensor 816.
- the acceleration sensor 811 can detect the magnitude of acceleration on three coordinate axes of the coordinate system established by the terminal 800.
- the acceleration sensor 811 may be used to detect components of the acceleration of gravity on three coordinate axes.
- the processor 801 may control the touch display screen 805 to display the user interface in a horizontal view or a vertical view according to the gravity acceleration signal collected by the acceleration sensor 811.
- the acceleration sensor 811 may also be used for collecting motion data of a game or a user.
- the gyro sensor 812 can detect the body direction and rotation angle of the terminal 800, and the gyro sensor 812 can cooperate with the acceleration sensor 811 to collect a 3D motion of the user on the terminal 800. Based on the data collected by the gyro sensor 812, the processor 801 can implement the following functions: motion sensing (such as changing the UI according to the user's tilt operation), image stabilization during shooting, game control, and inertial navigation.
- the pressure sensor 813 may be disposed on a side frame of the terminal 800 and / or a lower layer of the touch display screen 805.
- when the pressure sensor 813 is disposed on the side frame of the terminal 800, a user's holding signal to the terminal 800 can be detected, and the processor 801 performs left-right hand recognition or a quick operation according to the holding signal collected by the pressure sensor 813.
- when the pressure sensor 813 is disposed on the lower layer of the touch display screen 805, the processor 801 controls the operable controls on the UI according to the user's pressure operation on the touch display screen 805.
- the operability controls include at least one of a button control, a scroll bar control, an icon control, and a menu control.
- the fingerprint sensor 814 is used to collect a user's fingerprint, and the processor 801 identifies the user based on the fingerprint collected by the fingerprint sensor 814, or the fingerprint sensor 814 itself identifies the user based on the collected fingerprint. When the user's identity is identified as a trusted identity, the processor 801 authorizes the user to perform related sensitive operations, including unlocking the screen, viewing encrypted information, downloading software, making payments, and changing settings.
- the fingerprint sensor 814 may be provided on the front, back, or side of the terminal 800. When a physical button or a manufacturer's logo is set on the terminal 800, the fingerprint sensor 814 may be integrated with the physical button or the manufacturer's logo.
- the optical sensor 815 is used to collect the ambient light intensity.
- the processor 801 may control the display brightness of the touch display screen 805 according to the ambient light intensity collected by the optical sensor 815. Specifically, when the ambient light intensity is high, the display brightness of the touch display screen 805 is increased; when the ambient light intensity is low, the display brightness of the touch display screen 805 is decreased.
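The threshold behavior described here can be sketched as follows. The lux thresholds, step size, and 0-100 brightness scale are illustrative assumptions, not values from the patent.

```python
def adjust_brightness(level, ambient_lux, low=50.0, high=500.0):
    """Raise the display brightness level (0-100) when ambient light is strong,
    lower it when ambient light is weak; clamp the result to the valid range."""
    if ambient_lux >= high:
        level += 10
    elif ambient_lux <= low:
        level -= 10
    return max(0, min(100, level))


print(adjust_brightness(50, 800))  # 60: bright surroundings raise the level
```

A real implementation would read the optical sensor periodically and apply hysteresis so the brightness does not oscillate near a threshold.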
- the processor 801 may also dynamically adjust the shooting parameters of the camera component 806 according to the ambient light intensity collected by the optical sensor 815.
- the proximity sensor 816, also called a distance sensor, is usually disposed on the front panel of the terminal 800.
- the proximity sensor 816 is used to collect the distance between the user and the front of the terminal 800.
- when the proximity sensor 816 detects that the distance between the user and the front of the terminal 800 gradually decreases, the processor 801 controls the touch display screen 805 to switch from the bright-screen state to the off-screen state; when the proximity sensor 816 detects that the distance between the user and the front of the terminal 800 gradually increases, the processor 801 controls the touch display screen 805 to switch from the off-screen state to the bright-screen state.
- the structure shown in FIG. 8 does not constitute a limitation on the terminal 800; the terminal may include more or fewer components than shown, combine certain components, or adopt a different component arrangement.
- An embodiment of the present invention further provides a terminal for image processing.
- the terminal includes a processor and a memory.
- the memory stores at least one instruction, at least one program, a code set, or an instruction set, which is loaded and executed by the processor to implement the operations in the image processing method of the above embodiment.
- An embodiment of the present invention also provides a computer-readable storage medium.
- the computer-readable storage medium stores at least one instruction, at least one program, a code set, or an instruction set, which is loaded and executed by a processor to implement the operations in the image processing method of the above embodiment.
- the program may be stored in a computer-readable storage medium.
- the storage medium mentioned may be a read-only memory, a magnetic disk or an optical disk.
Abstract
Description
Layer/stride | Kernel size/channels | Feature map size |
Conv/s2 | 3*3/32 | 112*112*32 |
DepthSepConv/s1 | 3*3/64 | 112*112*64 |
DepthSepConv/s2 | 3*3/128 | 56*56*128 |
DepthSepConv/s1 | 3*3/128 | 56*56*128 |
DepthSepConv/s2 | 3*3/256 | 28*28*256 |
DepthSepConv/s1 | 3*3/256 | 28*28*256 |
DepthSepConv/s2 | 3*3/512 | 14*14*512 |
DepthSepConv/s1*5 | 3*3/512 | 14*14*512 |
DepthSepConv/s2 | 3*3/1024 | 7*7*1024 |
DepthSepConv/s1 | 3*3/1024 | 7*7*1024 |
pooling | 7*7 | 1*1*1024 |
Conv/s1 | 3*3*N | 1*1*N |
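The feature map sizes in the architecture table above follow from simple stride arithmetic. The sketch below recomputes the spatial sizes under two assumptions not stated in the table itself: a 224*224 input image and 'same' padding (both standard for a MobileNet-style stack of this shape). The layer list mirrors the table rows before the pooling layer.

```python
import math


def conv_out(size, stride):
    """Spatial size after a 3x3 convolution with 'same' padding and the given stride."""
    return math.ceil(size / stride)


# (stride, out_channels) per layer, following the table rows; input assumed 224x224x3.
layers = [(2, 32), (1, 64), (2, 128), (1, 128), (2, 256), (1, 256),
          (2, 512)] + [(1, 512)] * 5 + [(2, 1024), (1, 1024)]

size = 224
for stride, channels in layers:
    size = conv_out(size, stride)
print(size)  # 7: spatial size entering the 7*7 global pooling layer
```

Each stride-2 layer halves the spatial size (224 to 112, 56, 28, 14, 7), which is why the 7*7 pooling at the end reduces the map to 1*1*1024.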
Expression category | Processing method |
Surprise | Enlarge the head region |
Happy | Add a glow effect to the head region |
Fear | Apply a shaking effect to the head region |
Like | Add a sticker in the head region |
…… | …… |
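The correspondence between expression category and processing method shown in the table above amounts to a lookup table dispatching to an editing operation. The sketch below is illustrative only: the editing callbacks are hypothetical stand-ins that return strings rather than performing real image operations.

```python
# Hypothetical editing callbacks; in the method they would be image operations.
def enlarge(region):
    return f"enlarged {region}"

def add_glow(region):
    return f"glow on {region}"

def shake(region):
    return f"shaking {region}"

def add_sticker(region):
    return f"sticker in {region}"

# Preset correspondence between expression category and processing method.
PROCESSING = {
    "surprise": enlarge,
    "happy": add_glow,
    "fear": shake,
    "like": add_sticker,
}

def edit_head(expression, region):
    """Apply the processing method matching the expression category,
    leaving the region unedited for unmatched categories."""
    handler = PROCESSING.get(expression)
    return handler(region) if handler else region

print(edit_head("surprise", "head"))  # enlarged head
```

Keeping the mapping in a dictionary makes it easy to extend with further categories, which the ellipsis row in the table suggests.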
Claims (20)
- An image processing method, performed by a terminal, the method comprising: obtaining a trained pixel classification model, the pixel classification model being used to determine a classification identifier of each pixel in any image, the classification identifier at least including a head classification identifier, the head classification identifier being used to indicate that the corresponding pixel is located in a head region; classifying each pixel in a target image based on the pixel classification model to obtain the classification identifier of each pixel in the target image; and determining the head region of the target image according to the head classification identifiers among the classification identifiers, and performing editing processing on the head region.
- The method according to claim 1, wherein before the obtaining a trained pixel classification model, the method further comprises: obtaining multiple sample images and the classification identifier of each pixel in the multiple sample images; and performing training according to the multiple sample images and the classification identifier of each pixel in the multiple sample images until the classification accuracy of the trained pixel classification model reaches a first preset threshold.
- The method according to claim 1, wherein before the classifying each pixel in a target image based on the pixel classification model to obtain the classification identifier of each pixel in the target image, the method further comprises: performing face detection on the target image to obtain a face region of the target image; obtaining a trained expression recognition model; recognizing the face region based on the expression recognition model to obtain an expression category of the face region; and when the expression category of the face region is a target expression category, performing the step of classifying each pixel in the target image based on the pixel classification model.
- The method according to claim 3, wherein the performing editing processing on the head region comprises: determining, according to a preset correspondence between expression categories and processing methods, a target processing method corresponding to the target expression category; and performing the editing processing on the head region using the target processing method.
- The method according to claim 3, wherein before the obtaining a trained expression recognition model, the method further comprises: obtaining multiple sample face images and an expression category of each sample face image; and performing training according to the multiple sample face images and the expression category of each sample face image until the recognition accuracy of the trained expression recognition model reaches a second preset threshold.
- The method according to any one of claims 1 to 5, wherein before the classifying each pixel in a target image based on the pixel classification model to obtain the classification identifier of each pixel in the target image, the method further comprises: obtaining a target video, the target video including multiple images arranged in sequence; and using each of the multiple images as the target image, performing the step of classifying each pixel in the target image based on the pixel classification model.
- The method according to any one of claims 1 to 5, wherein the performing editing processing on the head region comprises: performing enlargement processing on the head region; or performing reduction processing on the head region; or adding material in the head region; or displaying a dynamic effect of the head region shaking.
- An image processing apparatus, comprising: a first acquisition module configured to obtain a trained pixel classification model, the pixel classification model being used to determine a classification identifier of each pixel in any image, the classification identifier at least including a head classification identifier, the head classification identifier being used to indicate that the corresponding pixel is located in a head region; a classification module configured to classify each pixel in a target image based on the pixel classification model to obtain the classification identifier of each pixel in the target image; and a first processing module configured to determine the head region of the target image according to the head classification identifiers among the classification identifiers, and perform editing processing on the head region.
- The apparatus according to claim 8, further comprising: a second acquisition module configured to obtain multiple sample images and the classification identifier of each pixel in the multiple sample images; and a first training module configured to perform training according to the multiple sample images and the classification identifier of each pixel in the multiple sample images until the classification accuracy of the trained pixel classification model reaches a first preset threshold.
- The apparatus according to claim 8, further comprising: a detection module configured to perform face detection on the target image to obtain a face region of the target image; a third acquisition module configured to obtain a trained expression recognition model; and an expression recognition module configured to recognize the face region based on the expression recognition model to obtain an expression category of the face region; wherein the classification module is further configured to perform the step of classifying each pixel in the target image based on the pixel classification model when the expression category of the face region is a target expression category.
- The apparatus according to claim 10, wherein the first processing module comprises: a target processing unit configured to determine, according to a preset correspondence between expression categories and processing methods, a target processing method corresponding to the target expression category; and an editing processing unit configured to perform the editing processing on the head region using the target processing method.
- The apparatus according to claim 8, further comprising: a fourth acquisition module configured to obtain multiple sample face images and an expression category of each sample face image; and a second training module configured to perform training according to the multiple sample face images and the expression category of each sample face image until the recognition accuracy of the trained expression recognition model reaches a second preset threshold.
- The apparatus according to any one of claims 8 to 12, further comprising: a video processing module configured to obtain a target video, the target video including multiple images arranged in sequence, and to use each of the multiple images as the target image.
- A terminal for image processing, the terminal comprising a processor and a memory, the memory storing at least one instruction, at least one program, a code set, or an instruction set, the instruction, the program, the code set, or the instruction set being loaded by the processor to perform the following steps: obtaining a trained pixel classification model, the pixel classification model being used to determine a classification identifier of each pixel in any image, the classification identifier at least including a head classification identifier, the head classification identifier being used to indicate that the corresponding pixel is located in a head region; classifying each pixel in a target image based on the pixel classification model to obtain the classification identifier of each pixel in the target image; and determining the head region of the target image according to the head classification identifiers among the classification identifiers, and performing editing processing on the head region.
- The terminal according to claim 14, wherein the instruction, the program, the code set, or the instruction set is loaded by the processor to perform the following steps: obtaining multiple sample images and the classification identifier of each pixel in the multiple sample images; and performing training according to the multiple sample images and the classification identifier of each pixel in the multiple sample images until the classification accuracy of the trained pixel classification model reaches a first preset threshold.
- The terminal according to claim 14, wherein the instruction, the program, the code set, or the instruction set is loaded by the processor to perform the following steps: performing face detection on the target image to obtain a face region of the target image; obtaining a trained expression recognition model; recognizing the face region based on the expression recognition model to obtain an expression category of the face region; and when the expression category of the face region is a target expression category, performing the step of classifying each pixel in the target image based on the pixel classification model.
- The terminal according to claim 16, wherein when the instruction, the program, the code set, or the instruction set is loaded by the processor to perform the step of performing editing processing on the head region, the processor specifically performs the following steps: determining, according to a preset correspondence between expression categories and processing methods, a target processing method corresponding to the target expression category; and performing the editing processing on the head region using the target processing method.
- A computer-readable storage medium, wherein the computer-readable storage medium stores at least one instruction, at least one program, a code set, or an instruction set, and when the instruction, the program, the code set, or the instruction set is loaded and executed by a processor, the processor is caused to perform the following steps: obtaining a trained pixel classification model, the pixel classification model being used to determine a classification identifier of each pixel in any image, the classification identifier at least including a head classification identifier, the head classification identifier being used to indicate that the corresponding pixel is located in a head region; classifying each pixel in a target image based on the pixel classification model to obtain the classification identifier of each pixel in the target image; and determining the head region of the target image according to the head classification identifiers among the classification identifiers, and performing editing processing on the head region.
- The computer-readable storage medium according to claim 18, wherein the instruction, the program, the code set, or the instruction set is loaded by the processor to perform the following steps: obtaining multiple sample images and the classification identifier of each pixel in the multiple sample images; and performing training according to the multiple sample images and the classification identifier of each pixel in the multiple sample images until the classification accuracy of the trained pixel classification model reaches a first preset threshold.
- The computer-readable storage medium according to claim 18, wherein the instruction, the program, the code set, or the instruction set is loaded by the processor to perform the following steps: performing face detection on the target image to obtain a face region of the target image; obtaining a trained expression recognition model; recognizing the face region based on the expression recognition model to obtain an expression category of the face region; and when the expression category of the face region is a target expression category, performing the step of classifying each pixel in the target image based on the pixel classification model.
Priority Applications (5)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
EP19839848.9A EP3828769B1 (en) | 2018-07-23 | 2019-06-03 | Image processing method and apparatus, terminal and computer-readable storage medium |
JP2020561766A JP7058760B2 (ja) | 2018-07-23 | 2019-06-03 | 画像処理方法およびその、装置、端末並びにコンピュータプログラム |
KR1020207028918A KR102635373B1 (ko) | 2018-07-23 | 2019-06-03 | 이미지 처리 방법 및 장치, 단말 및 컴퓨터 판독 가능 저장 매체 |
US17/006,071 US11631275B2 (en) | 2018-07-23 | 2020-08-28 | Image processing method and apparatus, terminal, and computer-readable storage medium |
US18/114,062 US20230222770A1 (en) | 2018-07-23 | 2023-02-24 | Head image editing based on face expression classification |
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201810812675.4 | 2018-07-23 | ||
CN201810812675.4A CN110147805B (zh) | 2018-07-23 | 2018-07-23 | 图像处理方法、装置、终端及存储介质 |
Related Child Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
US17/006,071 Continuation US11631275B2 (en) | 2018-07-23 | 2020-08-28 | Image processing method and apparatus, terminal, and computer-readable storage medium |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2020019873A1 true WO2020019873A1 (zh) | 2020-01-30 |
Family
ID=67589260
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/CN2019/089825 WO2020019873A1 (zh) | 2018-07-23 | 2019-06-03 | 图像处理方法、装置、终端及计算机可读存储介质 |
Country Status (6)
Country | Link |
---|---|
US (2) | US11631275B2 (zh) |
EP (1) | EP3828769B1 (zh) |
JP (1) | JP7058760B2 (zh) |
KR (1) | KR102635373B1 (zh) |
CN (1) | CN110147805B (zh) |
WO (1) | WO2020019873A1 (zh) |
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111353470A (zh) * | 2020-03-13 | 2020-06-30 | 北京字节跳动网络技术有限公司 | 图像的处理方法、装置、可读介质和电子设备 |
CN117115895A (zh) * | 2023-10-25 | 2023-11-24 | 成都大学 | 一种课堂微表情识别方法、系统、设备及介质 |
Families Citing this family (13)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110807361B (zh) * | 2019-09-19 | 2023-08-08 | 腾讯科技(深圳)有限公司 | 人体识别方法、装置、计算机设备及存储介质 |
CN110850996A (zh) * | 2019-09-29 | 2020-02-28 | 上海萌家网络科技有限公司 | 一种应用于输入法的图片/视频的处理方法和装置 |
KR20210062477A (ko) | 2019-11-21 | 2021-05-31 | 삼성전자주식회사 | 전자 장치 및 그 제어 방법 |
CN110991298B (zh) * | 2019-11-26 | 2023-07-14 | 腾讯科技(深圳)有限公司 | 图像的处理方法和装置、存储介质及电子装置 |
CN112514364A (zh) * | 2019-11-29 | 2021-03-16 | 深圳市大疆创新科技有限公司 | 图像信号处理装置、方法、相机以及可移动平台 |
CN111435437A (zh) * | 2019-12-26 | 2020-07-21 | 珠海大横琴科技发展有限公司 | 一种pcb行人重识别模型训练方法及pcb行人重识别方法 |
CN113315924A (zh) * | 2020-02-27 | 2021-08-27 | 北京字节跳动网络技术有限公司 | 图像特效处理方法及装置 |
CN111402271A (zh) * | 2020-03-18 | 2020-07-10 | 维沃移动通信有限公司 | 一种图像处理方法及电子设备 |
CN111598133B (zh) * | 2020-04-22 | 2022-10-14 | 腾讯医疗健康(深圳)有限公司 | 基于人工智能的图像显示方法、装置、系统、设备及介质 |
CN113763228B (zh) * | 2020-06-01 | 2024-03-19 | 北京达佳互联信息技术有限公司 | 图像处理方法、装置、电子设备及存储介质 |
CN111652878B (zh) * | 2020-06-16 | 2022-09-23 | 腾讯科技(深圳)有限公司 | 图像检测方法、装置、计算机设备及存储介质 |
CN113569894B (zh) * | 2021-02-09 | 2023-11-21 | 腾讯科技(深圳)有限公司 | 图像分类模型的训练方法、图像分类方法、装置及设备 |
CN116386106B (zh) * | 2023-03-16 | 2024-08-20 | 宁波星巡智能科技有限公司 | 伴睡婴幼儿时婴幼儿头部智能识别方法、装置及设备 |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN102436636A (zh) * | 2010-09-29 | 2012-05-02 | 中国科学院计算技术研究所 | 自动分割头发的方法及其系统 |
CN105404845A (zh) * | 2014-09-15 | 2016-03-16 | 腾讯科技(深圳)有限公司 | 图片处理方法及装置 |
CN106096551A (zh) * | 2016-06-14 | 2016-11-09 | 湖南拓视觉信息技术有限公司 | 人脸部位识别的方法和装置 |
CN107909065A (zh) * | 2017-12-29 | 2018-04-13 | 百度在线网络技术(北京)有限公司 | 用于检测人脸遮挡的方法及装置 |
CN108280388A (zh) * | 2017-01-06 | 2018-07-13 | 富士通株式会社 | 训练面部检测模型的方法和装置以及面部检测方法和装置 |
Family Cites Families (15)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JP2009199417A (ja) | 2008-02-22 | 2009-09-03 | Denso Corp | 顔追跡装置及び顔追跡方法 |
JP2010086178A (ja) * | 2008-09-30 | 2010-04-15 | Fujifilm Corp | 画像合成装置およびその制御方法 |
US8224042B2 (en) | 2009-03-12 | 2012-07-17 | Seiko Epson Corporation | Automatic face recognition |
WO2012002048A1 (ja) * | 2010-06-30 | 2012-01-05 | Necソフト株式会社 | 頭部検出方法、頭部検出装置、属性判定方法、属性判定装置、プログラム、記録媒体および属性判定システム |
US8648959B2 (en) * | 2010-11-11 | 2014-02-11 | DigitalOptics Corporation Europe Limited | Rapid auto-focus using classifier chains, MEMS and/or multiple object focusing |
CN104063683B (zh) * | 2014-06-06 | 2017-05-17 | 北京搜狗科技发展有限公司 | 一种基于人脸识别的表情输入方法和装置 |
CN104063865B (zh) * | 2014-06-27 | 2017-08-01 | 小米科技有限责任公司 | 分类模型创建方法、图像分割方法及相关装置 |
CN106709404B (zh) * | 2015-11-16 | 2022-01-04 | 佳能株式会社 | 图像处理装置及图像处理方法 |
CN106295566B (zh) * | 2016-08-10 | 2019-07-09 | 北京小米移动软件有限公司 | 人脸表情识别方法及装置 |
CN107341434A (zh) * | 2016-08-19 | 2017-11-10 | 北京市商汤科技开发有限公司 | 视频图像的处理方法、装置和终端设备 |
JP6788264B2 (ja) * | 2016-09-29 | 2020-11-25 | 国立大学法人神戸大学 | 表情認識方法、表情認識装置、コンピュータプログラム及び広告管理システム |
KR101835531B1 (ko) * | 2016-12-23 | 2018-03-08 | 주식회사 심보다 | 얼굴 인식 기반의 증강현실 영상을 제공하는 디스플레이 장치 및 이의 제어 방법 |
US10922566B2 (en) * | 2017-05-09 | 2021-02-16 | Affectiva, Inc. | Cognitive state evaluation for vehicle navigation |
CN107680069B (zh) * | 2017-08-30 | 2020-09-11 | 歌尔股份有限公司 | 一种图像处理方法、装置和终端设备 |
CN107844781A (zh) * | 2017-11-28 | 2018-03-27 | 腾讯科技(深圳)有限公司 | 人脸属性识别方法及装置、电子设备及存储介质 |
- 2018-07-23 CN CN201810812675.4A patent/CN110147805B/zh active Active
- 2019-06-03 KR KR1020207028918A patent/KR102635373B1/ko active IP Right Grant
- 2019-06-03 JP JP2020561766A patent/JP7058760B2/ja active Active
- 2019-06-03 WO PCT/CN2019/089825 patent/WO2020019873A1/zh unknown
- 2019-06-03 EP EP19839848.9A patent/EP3828769B1/en active Active
- 2020-08-28 US US17/006,071 patent/US11631275B2/en active Active
- 2023-02-24 US US18/114,062 patent/US20230222770A1/en active Pending
Also Published As
Publication number | Publication date |
---|---|
US11631275B2 (en) | 2023-04-18 |
US20230222770A1 (en) | 2023-07-13 |
CN110147805B (zh) | 2023-04-07 |
KR102635373B1 (ko) | 2024-02-07 |
EP3828769A1 (en) | 2021-06-02 |
JP7058760B2 (ja) | 2022-04-22 |
US20200394388A1 (en) | 2020-12-17 |
KR20200128565A (ko) | 2020-11-13 |
EP3828769B1 (en) | 2023-08-16 |
JP2021524957A (ja) | 2021-09-16 |
CN110147805A (zh) | 2019-08-20 |
EP3828769A4 (en) | 2021-08-25 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
| 121 | Ep: the epo has been informed by wipo that ep was designated in this application | Ref document number: 19839848; Country of ref document: EP; Kind code of ref document: A1 |
| ENP | Entry into the national phase | Ref document number: 20207028918; Country of ref document: KR; Kind code of ref document: A |
| ENP | Entry into the national phase | Ref document number: 2020561766; Country of ref document: JP; Kind code of ref document: A |
| NENP | Non-entry into the national phase | Ref country code: DE |
| ENP | Entry into the national phase | Ref document number: 2019839848; Country of ref document: EP; Effective date: 20210223 |