CN109241835A - Image processing method and device, electronic equipment and storage medium - Google Patents
Info
- Publication number
- Publication number: CN109241835A (application CN201810842970.4A)
- Authority
- CN
- China
- Prior art keywords
- target object
- feature
- motor unit
- image
- characteristic information
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/25—Fusion techniques
- G06F18/253—Fusion techniques of extracted features
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/24—Classification techniques
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/20—Image preprocessing
- G06V10/25—Determination of region of interest [ROI] or a volume of interest [VOI]
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/40—Extraction of image or video features
- G06V10/44—Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; Connectivity analysis, e.g. of connected components
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/40—Extraction of image or video features
- G06V10/46—Descriptors for shape, contour or point-related descriptors, e.g. scale invariant feature transform [SIFT] or bags of words [BoW]; Salient regional features
- G06V10/462—Salient features, e.g. scale invariant feature transforms [SIFT]
- G06V10/464—Salient features, e.g. scale invariant feature transforms [SIFT] using a plurality of salient features, e.g. bag-of-words [BoW] representations
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V40/00—Recognition of biometric, human-related or animal-related patterns in image or video data
- G06V40/10—Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
- G06V40/16—Human faces, e.g. facial parts, sketches or expressions
- G06V40/161—Detection; Localisation; Normalisation
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V40/00—Recognition of biometric, human-related or animal-related patterns in image or video data
- G06V40/10—Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
- G06V40/16—Human faces, e.g. facial parts, sketches or expressions
- G06V40/168—Feature extraction; Face representation
- G06V40/171—Local features and components; Facial parts ; Occluding parts, e.g. glasses; Geometrical relationships
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V40/00—Recognition of biometric, human-related or animal-related patterns in image or video data
- G06V40/10—Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
- G06V40/16—Human faces, e.g. facial parts, sketches or expressions
- G06V40/172—Classification, e.g. identification
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V40/00—Recognition of biometric, human-related or animal-related patterns in image or video data
- G06V40/10—Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
- G06V40/16—Human faces, e.g. facial parts, sketches or expressions
- G06V40/174—Facial expression recognition
- G06V40/176—Dynamic expression
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Physics & Mathematics (AREA)
- Multimedia (AREA)
- Oral & Maxillofacial Surgery (AREA)
- Health & Medical Sciences (AREA)
- General Health & Medical Sciences (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Data Mining & Analysis (AREA)
- Human Computer Interaction (AREA)
- Bioinformatics & Cheminformatics (AREA)
- General Engineering & Computer Science (AREA)
- Evolutionary Computation (AREA)
- Evolutionary Biology (AREA)
- Bioinformatics & Computational Biology (AREA)
- Artificial Intelligence (AREA)
- Life Sciences & Earth Sciences (AREA)
- Image Analysis (AREA)
Abstract
The present disclosure relates to an image processing method and apparatus, an electronic device, and a storage medium. The method comprises: obtaining feature information of a target object in an image to be processed and key point features of the target object; determining, according to the key point features, a positioning result of the key points of the target object in the image to be processed; and determining, according to the feature information of the target object, the positioning result, and the key point features, a target detection result of the action units of the target object. According to embodiments of the present disclosure, the accuracy of the positioning result and of the target detection result can be improved, thereby improving the accuracy of object analysis performed on the target object in the image to be processed.
Description
Technical field
The present disclosure relates to the field of computer technology, and in particular to an image processing method and apparatus, an electronic device, and a storage medium.
Background technique
With the rapid development of Internet technology, computer vision techniques have been applied in many fields. For example, they can be used in analysis tasks for various types of objects (e.g., object analysis such as face registration and facial action unit detection). However, in the related art, the accuracy of the results of such object analysis tasks still needs to be improved.
Summary of the invention
In view of this, the present disclosure proposes an image processing technical solution.
According to one aspect of the present disclosure, an image processing method is provided, the method comprising:
obtaining feature information of a target object in an image to be processed and key point features of the target object;
determining, according to the key point features, a positioning result of the key points of the target object in the image to be processed;
determining, according to the feature information of the target object, the positioning result, and the key point features, a target detection result of the action units of the target object.
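As a rough illustration of these three steps, a minimal sketch is given below. It is NumPy-only; the shapes, function names, and the tiny random "networks" are hypothetical placeholders, not the patented implementation:

```python
import numpy as np

rng = np.random.default_rng(0)

def extract_features(image):
    """Stand-in for the feature extraction step: image (H, W) -> feature info (C, H', W')."""
    return rng.standard_normal((8, 16, 16))

def extract_keypoint_features(feature_info):
    """Stand-in for key point feature extraction: collapse channels to an (H', W') map."""
    return feature_info.mean(axis=0)

def locate_keypoints(kp_features, num_kp=5):
    """Stand-in for key point positioning: take the num_kp strongest responses as (x, y)."""
    flat = np.argsort(kp_features.ravel())[-num_kp:]
    ys, xs = np.unravel_index(flat, kp_features.shape)
    return np.stack([xs, ys], axis=1).astype(float)

def detect_action_units(feature_info, positions, kp_features, num_au=3):
    """Stand-in for action unit detection: one presence probability per action unit."""
    pooled = feature_info.mean(axis=(1, 2))
    logits = pooled[:num_au] + 0.01 * kp_features.mean()
    return 1.0 / (1.0 + np.exp(-logits))

image = rng.standard_normal((64, 64))
feats = extract_features(image)                 # feature information
kp_feats = extract_keypoint_features(feats)     # key point features
positions = locate_keypoints(kp_feats)          # positioning result
au_probs = detect_action_units(feats, positions, kp_feats)  # target detection result
```

The point of the sketch is only the data flow: the positioning result and the feature information both feed the final action unit detection step.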
In one possible implementation, determining the target detection result of the action units of the target object according to the feature information of the target object, the positioning result, and the key point features comprises:
determining local features of the action units of the target object according to the feature information of the target object and the positioning result;
determining the target detection result of the action units according to the local features and the key point features.
In this way, the local features of the action units, obtained from the feature information of the target object and the positioning result, have high accuracy, so that the target detection result of the action units determined from the local features and the key point features is also highly accurate.
In one possible implementation, obtaining the feature information of the target object in the image to be processed and the key point features of the target object comprises:
performing feature extraction on the image to be processed to obtain the feature information of the target object in the image to be processed;
performing key point feature extraction on the feature information of the target object to obtain the key point features of the target object.
In this way, the feature information of the target object and the key point features of the target object can be obtained accurately.
In one possible implementation, determining the local features of the action units of the target object according to the feature information of the target object and the positioning result comprises:
determining an initial attention feature map of the action units of the target object according to the positioning result and the positional relationship between the center point of each action unit and the key points of the target object;
performing convolution processing on the initial attention feature map to obtain a processed attention feature map;
determining the local features of the action units according to the processed attention feature map and the feature information of the target object.
In this way, the attention distribution of each action unit is learned in an adaptive attention-learning manner, and combining this attention distribution with the feature information yields more accurate local features for the action units.
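One plausible way to build the initial attention feature map described above is to place a 2-D weight bump at each action unit's center, the center itself being derived from the positioned key points. The Gaussian form, the center rule, and all parameters below are illustrative assumptions; the disclosure only states that the map follows from the positional relationship between the action unit center and the key points:

```python
import numpy as np

def initial_attention_map(keypoints, au_keypoint_idx, size=16, sigma=3.0):
    """Initial attention feature map for one action unit.

    keypoints: (N, 2) positioned key points (x, y) on a size x size grid.
    au_keypoint_idx: indices of the key points whose mean defines the AU center
    (an assumed center rule, for illustration only).
    """
    cx, cy = keypoints[au_keypoint_idx].mean(axis=0)
    ys, xs = np.mgrid[0:size, 0:size]
    att = np.exp(-((xs - cx) ** 2 + (ys - cy) ** 2) / (2.0 * sigma ** 2))
    return att / att.max()  # peak attention weight 1 at the AU center

kps = np.array([[4.0, 4.0], [12.0, 4.0], [8.0, 10.0]])
att = initial_attention_map(kps, [0, 1])  # AU centered between the first two key points
```

In the patented scheme this initial map would then pass through convolution layers that refine it into the processed attention feature map.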
In one possible implementation, determining the local features of the action units according to the processed attention feature map and the feature information of the target object comprises:
obtaining attention content according to the processed attention feature map and the feature information of the target object;
performing feature extraction processing on the attention content to obtain the local features.
In this way, less spatial information is lost and the model parameters remain simple, the extracted local features of the action units are more accurate, and the detection accuracy of the target detection result of the action units is improved.
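A minimal sketch of how such "attention content" could be formed and then reduced to a local feature, assuming the feature information is a (C, H, W) tensor and the processed attention map an (H, W) weight grid (both shapes, and the pooling used as the "feature extraction processing", are assumptions):

```python
import numpy as np

def local_feature(att_map, feature_info):
    """Weight the feature information by the attention map, then extract a compact local feature.

    att_map: (H, W) processed attention feature map.
    feature_info: (C, H, W) feature information of the target object.
    """
    attention_content = feature_info * att_map[None, :, :]  # element-wise attention weighting
    # Global average pooling stands in for the feature extraction processing.
    return attention_content.mean(axis=(1, 2))              # (C,) local feature

feats = np.ones((8, 4, 4))
att = np.zeros((4, 4))
att[1, 1] = 1.0                 # attend to a single spatial location
lf = local_feature(att, feats)
```

Because the attention weighting is applied directly on the spatial feature map, spatial information is only discarded at the final pooling step, which matches the "less spatial loss, simple parameters" motivation above.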
In one possible implementation, the target object includes multiple action units,
wherein determining the target detection result of the action units according to the local features and the key point features comprises:
performing fusion processing on the local features of the multiple action units to obtain fused local features;
determining the target detection result of the multiple action units according to the fused local features and the key point features of the target object.
In this way, the extracted spatial features are better preserved, improving the accuracy of the target detection result.
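Concatenation followed by a linear scoring head is one simple, assumed realization of this fusion step; the disclosure does not fix the fusion operation:

```python
import numpy as np

rng = np.random.default_rng(0)

def detect_aus(local_feats, kp_feats, weights, bias):
    """Fuse per-AU local features with the key point features, then score each AU.

    local_feats: (num_au, C) one local feature per action unit.
    kp_feats: (K,) key point features.
    weights, bias: parameters of a stand-in linear detection head.
    """
    fused = local_feats.reshape(-1)          # fusion by concatenation (assumed)
    x = np.concatenate([fused, kp_feats])    # combine with key point features
    logits = weights @ x + bias
    return 1.0 / (1.0 + np.exp(-logits))     # per-AU presence probability

num_au, C, K = 3, 8, 10
local_feats = rng.standard_normal((num_au, C))
kp_feats = rng.standard_normal(K)
W = 0.1 * rng.standard_normal((num_au, num_au * C + K))
probs = detect_aus(local_feats, kp_feats, W, np.zeros(num_au))
```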
In one possible implementation, the method is implemented by a neural network,
wherein performing feature extraction on the image to be processed to obtain the feature information of the target object in the image to be processed comprises:
inputting the image to be processed into a feature extraction network of the neural network for feature extraction, to obtain the feature information of the target object in the image to be processed.
In one possible implementation, the method is implemented by a neural network, and the neural network is obtained by training on images to be processed.
By training on images to be processed, an end-to-end neural network that synchronously predicts the positioning result and the target detection result is obtained, improving the accuracy and intelligence of the image processing.
In one possible implementation, the step of training the neural network on the images to be processed comprises:
inputting the image to be processed into the feature extraction network and the key point feature extraction network of the neural network respectively for processing, to obtain the feature information of the target object in the image to be processed and the key point features of the target object;
inputting the key point features into a first detection network of the neural network for processing, to determine the positioning result of the key points of the target object;
inputting the feature information of the target object, the positioning result, and the key point features into a second detection network of the neural network for processing, to determine the target detection result of the action units of the target object;
determining a model loss of the neural network according to the positioning result, annotation information of the positioning result, the target detection result, and annotation information of the target detection result;
adjusting network parameter values of the neural network according to the model loss.
In this way, a neural network can be trained that accurately produces both the positioning result of the key points and the target detection result of the action units.
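The joint model loss in the training step above could, for instance, combine a key point regression term with an action unit classification term. The MSE + binary cross-entropy combination and the equal weighting below are illustrative assumptions (the disclosure later also folds attention weights into the loss):

```python
import numpy as np

def model_loss(pred_kp, gt_kp, pred_au, gt_au, lam=1.0):
    """Joint loss: key point positioning loss + action unit detection loss."""
    # Positioning result vs its annotation information: mean squared error.
    loc_loss = np.mean((pred_kp - gt_kp) ** 2)
    # Target detection result vs its annotation information: binary cross-entropy.
    eps = 1e-7
    p = np.clip(pred_au, eps, 1 - eps)
    det_loss = -np.mean(gt_au * np.log(p) + (1 - gt_au) * np.log(1 - p))
    return loc_loss + lam * det_loss

pred_kp = np.array([[1.0, 2.0], [3.0, 4.0]])
gt_kp = np.array([[1.0, 2.0], [3.0, 4.0]])   # perfect positioning -> zero location loss
pred_au = np.array([0.9, 0.1])
gt_au = np.array([1.0, 0.0])
loss = model_loss(pred_kp, gt_kp, pred_au, gt_au)
```

Because both terms share the same backbone in the end-to-end network, minimizing this joint loss is what couples the two tasks together.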
In one possible implementation, inputting the feature information of the target object, the positioning result, and the key point features into the second detection network of the neural network for processing, to determine the target detection result of the action units of the target object, comprises:
inputting the feature information of the target object and the positioning result into a local feature extraction network of the neural network for processing, to determine the local features of the action units of the target object;
inputting the local features and the key point features into the second detection network of the neural network for processing, to determine the target detection result of the action units.
In this way, the local features of the action units can be obtained accurately, improving the detection accuracy of the target detection result of the action units.
In one possible implementation, inputting the feature information of the target object and the positioning result into the local feature extraction network of the neural network for processing, to determine the local features of the action units of the target object, comprises:
inputting the positioning result into an initial attention generation network of the neural network for processing, and determining the initial attention feature map of the action units of the target object according to the positioning result and the positional relationship between the center point of each action unit and the key points of the target object;
inputting the initial attention feature map into a reference feature extraction network of the neural network for convolution processing, to obtain the processed attention feature map;
inputting the processed attention feature map and the feature information of the target object into the local feature extraction network of the neural network for processing, to determine the local features of the action units of the target object,
wherein determining the model loss of the neural network according to the positioning result, the annotation information of the positioning result, the target detection result, and the annotation information of the target detection result comprises:
determining the model loss of the neural network according to the positioning result, the annotation information of the positioning result, the target detection result, the annotation information of the target detection result, the initial attention weight of each element in the initial attention feature map, and the attention weight of each element in the processed attention feature map.
Through the adaptive attention learning of the neural network, the local features obtained for each action unit are more accurate, which improves the robustness of the image processing method and the accuracy of the target detection result of the action units.
In one possible implementation, the feature extraction network includes at least one convolution group, each convolution group including at least one convolutional layer and at least one convolution subgroup, the convolution subgroup including multiple convolution sublayers, each convolution sublayer being divided into a different number of subregions, with the convolution kernel parameters of the different subregions of each convolution sublayer being different.
In this way, dividing a convolution sublayer into multiple subregions allows local features to be extracted more effectively. Because each convolution sublayer is divided into a different number of subregions, local features of different sizes can be extracted, adapting to action units of different sizes, so that finer and more complete features are obtained and the accuracy of the subsequent target detection result of the action units is improved. At the same time, this residual structure reduces the probability of the vanishing-gradient problem during training, improving the stability and accuracy of the network.
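A sketch of one such partitioned convolution sublayer with a residual connection is given below. The partition count, the 3x3 kernel size, and the height-wise split are assumptions; the disclosure does not fix these details, and a real implementation would use a deep-learning framework rather than explicit loops:

```python
import numpy as np

def partitioned_conv_sublayer(x, kernels):
    """Apply a separate 3x3 kernel to each horizontal subregion of x, plus a residual add.

    x: (H, W) feature map; kernels: list of (3, 3) arrays, one per subregion.
    Different sublayers would use different numbers of subregions (len(kernels)).
    """
    H, W = x.shape
    n = len(kernels)
    bounds = np.linspace(0, H, n + 1).astype(int)  # row boundaries of the subregions
    padded = np.pad(x, 1)                          # zero padding keeps the output H x W
    out = np.zeros_like(x)
    for r, k in enumerate(kernels):
        for i in range(bounds[r], bounds[r + 1]):
            for j in range(W):
                out[i, j] = np.sum(padded[i:i + 3, j:j + 3] * k)
    return x + out                                 # residual structure

x = np.ones((8, 8))
identity = np.zeros((3, 3))
identity[1, 1] = 1.0                               # identity kernel for a simple check
y = partitioned_conv_sublayer(x, [identity, identity])  # two subregions, distinct params allowed
```

With identity kernels the residual add simply doubles the input, which makes the residual path easy to verify; in practice each subregion would learn its own kernel parameters.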
According to another aspect of the present disclosure, an image processing apparatus is provided, the apparatus comprising:
an obtaining module, configured to obtain feature information of a target object in an image to be processed and key point features of the target object;
a positioning result determining module, configured to determine, according to the key point features, a positioning result of the key points of the target object in the image to be processed;
a target detection result determining module, configured to determine, according to the feature information of the target object, the positioning result, and the key point features, a target detection result of the action units of the target object.
In one possible implementation, the target detection result determining module includes:
a first determining submodule, configured to determine the local features of the action units of the target object according to the feature information of the target object and the positioning result;
a second determining submodule, configured to determine the target detection result of the action units according to the local features and the key point features.
In one possible implementation, the obtaining module includes:
a first obtaining submodule, configured to perform feature extraction on the image to be processed to obtain the feature information of the target object in the image to be processed;
a second obtaining submodule, configured to perform key point feature extraction on the feature information of the target object to obtain the key point features of the target object.
In one possible implementation, the first determining submodule includes:
a third determining submodule, configured to determine the initial attention feature map of the action units of the target object according to the positioning result and the positional relationship between the center point of each action unit and the key points of the target object;
a third obtaining submodule, configured to perform convolution processing on the initial attention feature map to obtain the processed attention feature map;
a fourth determining submodule, configured to determine the local features of the action units according to the processed attention feature map and the feature information of the target object.
In one possible implementation, the fourth determining submodule includes:
a fourth obtaining submodule, configured to obtain attention content according to the processed attention feature map and the feature information of the target object;
a fifth obtaining submodule, configured to perform feature extraction processing on the attention content to obtain the local features.
In one possible implementation, the target object includes multiple action units,
wherein the second determining submodule includes:
a sixth obtaining submodule, configured to perform fusion processing on the local features of the multiple action units to obtain fused local features;
a fifth determining submodule, configured to determine the target detection result of the multiple action units according to the fused local features and the key point features of the target object.
In one possible implementation, the apparatus is implemented by a neural network,
wherein the first obtaining submodule includes:
a seventh obtaining submodule, configured to input the image to be processed into the feature extraction network of the neural network for feature extraction, to obtain the feature information of the target object in the image to be processed.
In one possible implementation, the apparatus is implemented by a neural network, and the neural network is obtained by training on images to be processed.
In one possible implementation, the apparatus includes:
a feature obtaining module, configured to input the image to be processed into the feature extraction network and the key point feature extraction network of the neural network respectively for processing, to obtain the feature information of the target object in the image to be processed and the key point features of the target object;
a first determining module, configured to input the key point features into the first detection network of the neural network for processing, to determine the positioning result of the key points of the target object;
a second determining module, configured to input the feature information of the target object, the positioning result, and the key point features into the second detection network of the neural network for processing, to determine the target detection result of the action units of the target object;
a third determining module, configured to determine the model loss of the neural network according to the positioning result, the annotation information of the positioning result, the target detection result, and the annotation information of the target detection result;
a parameter adjustment module, configured to adjust the network parameter values of the neural network according to the model loss.
In one possible implementation, the second determining module includes:
a sixth determining submodule, configured to input the feature information of the target object and the positioning result into the local feature extraction network of the neural network for processing, to determine the local features of the action units of the target object;
a seventh determining submodule, configured to input the local features and the key point features into the second detection network of the neural network for processing, to determine the target detection result of the action units.
In one possible implementation, the sixth determining submodule includes:
an eighth determining submodule, configured to input the positioning result into the initial attention generation network of the neural network for processing, and determine the initial attention feature map of the action units of the target object according to the positioning result and the positional relationship between the center point of each action unit and the key points of the target object;
an eighth obtaining submodule, configured to input the initial attention feature map into the reference feature extraction network of the neural network for convolution processing, to obtain the processed attention feature map;
a ninth determining submodule, configured to input the processed attention feature map and the feature information of the target object into the local feature extraction network of the neural network for processing, to determine the local features of the action units of the target object,
wherein the third determining module includes:
a tenth determining submodule, configured to determine the model loss of the neural network according to the positioning result, the annotation information of the positioning result, the target detection result, the annotation information of the target detection result, the initial attention weight of each element in the initial attention feature map, and the attention weight of each element in the processed attention feature map.
In one possible implementation, the feature extraction network includes at least one convolution group, each convolution group including at least one convolutional layer and at least one convolution subgroup, the convolution subgroup including multiple convolution sublayers, each convolution sublayer being divided into a different number of subregions, with the convolution kernel parameters of the different subregions of each convolution sublayer being different.
According to another aspect of the present disclosure, an electronic device is provided, comprising:
a processor;
a memory for storing processor-executable instructions;
wherein the processor is configured to execute the above image processing method.
According to another aspect of the present disclosure, a computer-readable storage medium is provided, on which computer program instructions are stored, the computer program instructions, when executed by a processor, implementing the above image processing method.
According to embodiments of the present disclosure, the target detection result of the action units of the target object is determined in combination with the positioning result of the key points of the target object. By exploiting the correlation between the positioning result and the target detection result, the accuracy of both can be improved, thereby improving the accuracy of object analysis performed on the target object in the image to be processed.
It should be understood that the above general description and the following detailed description are merely exemplary and explanatory, and do not limit the present disclosure.
Other features and aspects of the present disclosure will become clear from the following detailed description of exemplary embodiments with reference to the accompanying drawings.
Detailed description of the invention
The accompanying drawings, which are incorporated into and form part of this specification, illustrate embodiments consistent with the present disclosure and, together with the specification, serve to explain the technical solutions of the disclosure.
Fig. 1 shows the flow chart of the image processing method according to the embodiment of the present disclosure.
Fig. 2 shows a schematic diagram of a neural network of the image processing method according to an embodiment of the present disclosure.
Fig. 3 shows the schematic diagram of the convolution group according to the image processing method of the embodiment of the present disclosure.
Fig. 4 shows the schematic diagram of the convolution group according to the image processing method of the embodiment of the present disclosure.
Fig. 5 shows the schematic diagram of the application scenarios of the image processing method according to the embodiment of the present disclosure.
Fig. 6 shows a schematic diagram of the initial attention feature map and the processed attention feature map of an action unit in the image processing method according to an embodiment of the present disclosure.
Fig. 7 shows the schematic diagram of the application scenarios of the image processing method according to the embodiment of the present disclosure.
Fig. 8 shows the flow chart of training neural network in the image processing method according to the embodiment of the present disclosure.
Fig. 9 shows the block diagram of the image processing apparatus according to the embodiment of the present disclosure.
Figure 10 shows the block diagram of the image processing apparatus according to the embodiment of the present disclosure.
Figure 11 shows the block diagram of the electronic equipment according to the embodiment of the present disclosure.
Figure 12 shows the block diagram of the electronic equipment according to the embodiment of the present disclosure.
Specific embodiment
Various exemplary embodiments, features, and aspects of the present disclosure are described in detail below with reference to the accompanying drawings. The same reference numerals in the drawings indicate elements with the same or similar functions. Although various aspects of the embodiments are shown in the drawings, the drawings are not necessarily drawn to scale unless otherwise specified.
The word "exemplary" is used herein to mean "serving as an example, embodiment, or illustration". Any embodiment described herein as "exemplary" is not necessarily to be construed as preferred or advantageous over other embodiments.
The term "and/or" herein merely describes an association relationship between associated objects, indicating that three relationships may exist; for example, A and/or B may indicate: A exists alone, both A and B exist, or B exists alone. In addition, the term "at least one" herein indicates any one of multiple items, or any combination of at least two of multiple items; for example, including at least one of A, B, and C may indicate including any one or more elements selected from the set consisting of A, B, and C.
In addition, numerous specific details are given in the following detailed description to better illustrate the present disclosure. Those skilled in the art will appreciate that the present disclosure can equally be practiced without certain specific details. In some instances, methods, means, elements, and circuits well known to those skilled in the art are not described in detail, in order to highlight the gist of the present disclosure.
Fig. 1 shows a flowchart of the image processing method according to an embodiment of the present disclosure. The method can be applied to an electronic device, which may be provided as a terminal, a server, or a device in another form. The terminal may be user equipment (User Equipment, UE), a mobile device, a user terminal, a cellular phone, a cordless phone, a personal digital assistant (Personal Digital Assistant, PDA), a handheld device, a computing device, a vehicle-mounted device, a wearable device, etc.; the present disclosure does not limit this. In some possible implementations, the image processing method can be implemented by a processor calling computer-readable instructions stored in a memory. As shown in Fig. 1, the image processing method according to an embodiment of the present disclosure includes:
In step S101, obtaining feature information of a target object in an image to be processed and key point features of the target object;
In step S102, determining, according to the key point features, a positioning result of the key points of the target object in the image to be processed;
In step S103, determining, according to the feature information of the target object, the positioning result, and the key point features, a target detection result of the action units of the target object.
According to embodiments of the present disclosure, the target detection result of the action units of the target object is determined in combination with the positioning result of the key points of the target object. By exploiting the correlation between the positioning result and the target detection result, the accuracy of both can be improved, thereby improving the accuracy of object analysis performed on the target object in the image to be processed.
The image to be processed may be a real image, for example, an original image or a processed image. The target object may be an object in a certain region of the image to be processed. For example, the image to be processed may be an image obtained by cropping an original image, e.g., an image of the target object cropped out according to a certain rule. The image to be processed may also be an original image that includes the target object; the present disclosure does not limit this. A key point may be a point of special significance on the target object, which can be used to define the shape, appearance, etc. of the target object. For example, the key points of a face may be points at certain specific positions on the face (for example, the eyes, the eyebrows, etc.), which can be used to define the face shape, the expression appearance, etc. An action unit (Action Unit, AU) can be used to represent muscle movement on the target object. For example, a facial action unit may represent the muscle movement at a certain position of the face and can be used to describe a facial expression accurately and objectively. The present disclosure does not limit the number and definition rules of the key points of the target object, or the number and definition rules of the action units.
In one possible implementation, the target object in the image to be processed may be an object of any category, for example, a face, an animal face, an article, etc.; the disclosure places no limitation on this. For ease of understanding, the following description takes a face as the target object and a facial image as the image to be processed.
For example, the original image may be cropped according to a certain rule to obtain the image to be processed. For example, the face location in the original image may be detected in various ways, such as by face detection code or face registration code, to determine multiple feature points (key points) of the face. It should be understood that the key points of the face may be located in many ways, for example, by manual annotation, face detection code, face registration code, etc.; the disclosure places no limitation on this.
In one possible implementation, according to multiple key points of the face (for example, the left pupil, the right pupil, the nose tip, the left mouth corner, and the right mouth corner), the original image may be cropped by a similarity transformation (for example, rotation, translation, uniform scaling, etc.) without changing the face shape or expression, to obtain the image to be processed, which may be a facial image. For example, the image may be rotated so that the two pupils are kept horizontal, and the bounding rectangle of these five key points may be enlarged and cropped to obtain a facial image (the image to be processed) of a target size (for example, L × L).
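A minimal NumPy sketch of this alignment step, under stated assumptions: the function name is illustrative, the five key points are given as (x, y) pixel coordinates with the two pupils in rows 0 and 1, and only the transform of the key points is shown (applying the same similarity transform to the image pixels is omitted).

```python
import numpy as np

def align_keypoints(pts, L):
    """Similarity transform (rotation + uniform scale + translation) that levels
    the pupils and maps the bounding square of the key points into L x L.
    pts: (5, 2) array of (x, y); rows 0 and 1 are the left and right pupil."""
    dx, dy = pts[1] - pts[0]
    angle = np.arctan2(dy, dx)                 # tilt of the pupil line
    c, s = np.cos(-angle), np.sin(-angle)      # rotate by -angle to level pupils
    R = np.array([[c, -s], [s, c]])
    rotated = pts @ R.T
    lo, hi = rotated.min(axis=0), rotated.max(axis=0)
    scale = L / max(hi - lo)                   # uniform scaling, shape preserved
    return (rotated - lo) * scale
```

Because the transform is a similarity (rotation, translation, uniform scaling only), relative face shape and expression are unchanged, as the paragraph above requires.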
As shown in Figure 1, in step S101, the characteristic information of the target object in the image to be processed and the key point features of the target object are obtained.
For example, feature extraction may be performed on the image to be processed (the facial image) to obtain shared features. The shared features may serve as the characteristic information of the target object. The shared features may also be used to determine the key point features of the target object; for example, key point feature extraction may be performed on the shared features to obtain the key point features.
In some alternative embodiments, feature extraction may also be performed on the image to be processed separately to obtain the characteristic information of the target object in the image to be processed and the key point features of the target object. For example, feature extraction may be performed on the image to be processed separately according to the categories of features desired. For example, feature extraction may be performed on the image to be processed to obtain the characteristic information of the target object, and key point feature extraction may be performed on the image to be processed to obtain the key point features. The disclosure places no limitation on the manner of obtaining the characteristic information of the target object and the key point features of the target object in the image to be processed.
In one possible implementation, step S101 may include:
performing feature extraction on the image to be processed to obtain the characteristic information of the target object in the image to be processed;
performing key point feature extraction on the characteristic information of the target object to obtain the key point features of the target object.
For example, this may be based on a deep learning algorithm; for example, a neural network may be trained on images to be processed. Feature extraction may be performed on the image to be processed using the feature extraction network of the trained neural network to obtain the characteristic information (the shared features) of the target object in the image to be processed. Key point feature extraction is then performed on the characteristic information (the shared features) of the target object to obtain the key point features of the target object. In this way, the characteristic information of the target object and the key point features of the target object can be obtained more quickly and accurately.
Fig. 2 shows a schematic diagram of the neural network of the image processing method according to an embodiment of the present disclosure. In one possible implementation, as shown in Fig. 2, the neural network includes a feature extraction network, which may perform feature extraction on the image to be processed to obtain the characteristic information (the shared features) of the target object in the image to be processed. The neural network may also include a key point feature extraction network; the shared features are input into the key point feature extraction network for feature extraction, and the key point features of the target object can be obtained.
In one possible implementation, the step of performing feature extraction on the image to be processed to obtain the characteristic information of the target object in the image to be processed may include:
inputting the image to be processed into the feature extraction network of the neural network for feature extraction to obtain the characteristic information of the target object in the image to be processed.
The feature extraction network may be any network structure capable of performing feature extraction on the image to be processed. For example, the feature extraction network may include convolutional layers, and the characteristic information is obtained by performing convolution processing on the image to be processed. The convolutional layers included in the feature extraction network may have any form and structure. For example, the feature extraction network may include at least one convolution group, and each convolution group may include at least one convolutional layer; each level may include one or more convolutional layers, etc.; the disclosure places no limitation on this.
In one possible implementation, the feature extraction network includes at least one convolution group, each convolution group includes at least one convolutional layer and at least one convolution subgroup, and each convolution subgroup includes multiple convolution sublayers. Each convolution sublayer is divided into a different number of subregions, and the convolution kernel parameters of different subregions of each convolution sublayer are different.
For example, the feature extraction network may include two convolution groups, for example, MR1 (L, L, c) and MR2 (L/2, L/2, 2c) respectively, the two convolution groups being connected in series for performing feature extraction on the image to be processed. Here, c represents the number of filters of a convolutional layer, that is, the number of channels of the generated feature map. MR1 may be used to extract relatively local features of the image to be processed (for example, corners, edges, etc.), and MR2 may be used to extract higher-level features of the image to be processed (for example, small patches, etc.); with more filters, MR2 can effectively improve the feature extraction effect. A max pooling layer is connected after each convolution group; the max pooling layer may be used to downsample the extracted features and reduce the feature dimensions. Batch normalization and a rectified linear unit may be applied to each convolutional layer, and the feature extraction network outputs the characteristic information of the target object in the image to be processed.
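As a consistency check on the sizes above, the following sketch (plain Python, names illustrative) traces the feature map shape through the two serial convolution groups, each followed by 2 × 2 max pooling; the final spatial size L/4 × L/4 matches the size later given for the attention feature maps.

```python
def feature_extractor_shapes(L, c):
    """Trace (height, width, channels) through MR1 -> pool -> MR2 -> pool."""
    shapes = [("MR1", (L, L, c))]                    # first group keeps spatial size
    shapes.append(("pool1", (L // 2, L // 2, c)))    # 2x2 max pooling halves H and W
    shapes.append(("MR2", (L // 2, L // 2, 2 * c)))  # second group doubles the filters
    shapes.append(("pool2", (L // 4, L // 4, 2 * c)))
    return shapes
```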
Fig. 3 shows a schematic diagram of a convolution group of the image processing method according to an embodiment of the present disclosure. In one possible implementation, as shown in Fig. 3, a convolution group includes one convolutional layer and one convolution subgroup, and the convolution subgroup includes four convolution sublayers. The four convolution sublayers are evenly divided into different numbers of subregions, for example, evenly divided into 1 × 1, 2 × 2, 4 × 4, and 8 × 8 subregions respectively. Each subregion may share one convolution kernel, and the convolution kernel parameters of different subregions of each convolution sublayer are different (for example, the values of some weights of the convolution kernels differ).
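A toy NumPy sketch of one such convolution sublayer on a single-channel map, under the assumption that "each subregion shares one convolution kernel" means each of the g × g evenly divided blocks of the input is filtered with its own 3 × 3 kernel (the disclosure fixes neither the kernel size nor these helper names; the sliding window here is correlation, as convolution is commonly implemented in deep learning frameworks).

```python
import numpy as np

def conv2d_same(x, k):
    """Single-channel 'same' convolution with zero padding; k has odd size."""
    kh, kw = k.shape
    ph, pw = kh // 2, kw // 2
    xp = np.pad(x, ((ph, ph), (pw, pw)))
    out = np.zeros_like(x, dtype=float)
    for i in range(x.shape[0]):
        for j in range(x.shape[1]):
            out[i, j] = np.sum(xp[i:i + kh, j:j + kw] * k)
    return out

def partitioned_conv(x, kernels, g):
    """Convolution sublayer evenly divided into g x g subregions: subregion
    (r, c) uses kernels[r * g + c], its own convolution kernel parameters."""
    h, w = x.shape
    sh, sw = h // g, w // g
    out = np.zeros_like(x, dtype=float)
    for r in range(g):
        for c in range(g):
            full = conv2d_same(x, kernels[r * g + c])
            out[r * sh:(r + 1) * sh, c * sw:(c + 1) * sw] = \
                full[r * sh:(r + 1) * sh, c * sw:(c + 1) * sw]
    return out
```

With g = 1 this reduces to an ordinary shared-kernel convolution; larger g gives each local block its own parameters, matching the 1 × 1 through 8 × 8 sublayers described above.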
For example, the image to be processed is input into the feature extraction network, and the convolutional layer of the first convolution group performs convolution processing on the image to be processed to obtain intermediate features. The intermediate features are input into the convolution subgroup, where the four convolution sublayers each perform convolution processing on the intermediate features, and the output results of the four convolution sublayers are superposed to obtain first superposed features (for example, C shown in Fig. 3 indicates the superposition yielding the first superposed features). The first superposed features and the intermediate features are added element-wise (that is, corresponding elements are added) to obtain the characteristic information produced by the convolution processing of the first convolution group. It should be understood that the characteristic information produced by the first convolution group is then processed by the second convolution group (whose processing may be, for example, the same as that of the first convolution group), and the output is the characteristic information of the target object.
In this way, dividing a convolution sublayer into multiple subregions allows local features to be better extracted. Since each convolution sublayer is divided into a different number of subregions, local features of different sizes can be extracted, adapting to action units of different sizes, so that finer and more comprehensive features can be extracted, thereby improving the accuracy of the subsequent detection results of the action units. Meanwhile, it should be understood that the element-wise addition of the first superposed features and the intermediate features may constitute a residual structure, which can reduce the probability of gradient vanishing problems during training and improve the stability and accuracy of the network.
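The superposition-plus-residual step above can be sketched as follows, under the assumption that "superposition" denotes element-wise summation of the sublayer outputs (the figure's "C" could also denote channel concatenation followed by a projection; this sketch takes the simpler reading, and the names are illustrative).

```python
import numpy as np

def conv_group_combine(intermediate, sublayer_outputs):
    """Superpose the convolution sublayer outputs and add the result
    element-wise to the intermediate features (a residual connection).
    intermediate: (H, W); sublayer_outputs: (n_sublayers, H, W)."""
    superposed = np.sum(sublayer_outputs, axis=0)   # first superposed features
    return superposed + intermediate                # element-wise addition
```

When all sublayer outputs are zero the group reduces to the identity on the intermediate features, which is exactly the property that makes residual structures easy to train.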
It should be noted that a convolution group may take various forms. As noted above, the multiple convolution sublayers may be convolution sublayers of multiple sizes (divided into different numbers of subregions) at the same level (in the same convolution subgroup). The multiple convolution sublayers may also be convolution sublayers of multiple sizes at different levels.
Fig. 4 shows a schematic diagram of a convolution group of the image processing method according to an embodiment of the present disclosure. In one possible implementation, as shown in Fig. 4, the convolution subgroup includes 3 convolution sublayers, for example, convolution sublayer 1, convolution sublayer 2, and convolution sublayer 3 respectively. The 3 convolution sublayers are evenly divided into 2 × 2, 4 × 4, and 8 × 8 subregions respectively (the 3 convolution sublayers have multiple sizes). As shown in Fig. 4, these 3 convolution sublayers are at different levels. For example, the intermediate features output by the first convolutional layer are input into convolution sublayer 1 for processing, the features output by convolution sublayer 1 are input into convolution sublayer 2 for processing, and the features output by convolution sublayer 2 are input into convolution sublayer 3 for processing. The outputs of convolution sublayer 1, convolution sublayer 2, and convolution sublayer 3 are superposed, and the superposed result is added element-wise to the output results of the other convolutional layers of the convolution group.
In this way, the hierarchical structure of the convolution subgroup can effectively enlarge the receptive field of the convolution kernels, which is conducive to extracting more comprehensive characteristic information. As long as feature extraction can be performed on the image to be processed, the disclosure places no limitation on the structure of the feature extraction network, the number of convolution groups it includes, the structure of each convolution group, the level and structure of each convolution subgroup, the number of convolution sublayers, the number of subregions each convolution sublayer includes and the manner of division, the convolution kernel parameters of the different subregions of each convolution sublayer, etc.
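The receptive-field enlargement from stacking sublayers in series can be quantified with a standard formula (not stated in the disclosure): for stride-1 layers, each k × k layer adds k − 1 to the effective receptive field.

```python
def stacked_receptive_field(kernel_sizes):
    """Effective receptive field of stride-1 convolutional layers in series."""
    rf = 1
    for k in kernel_sizes:
        rf += k - 1   # each k x k layer widens the field by k - 1 pixels
    return rf
```

For example, three serial 3 × 3 sublayers see a 7 × 7 input window, versus 3 × 3 for a single layer, which is why the hierarchical arrangement of Fig. 4 captures more context.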
In one possible implementation, key point feature extraction is performed on the characteristic information of the target object to obtain the key point features of the target object.
For example, as shown in Fig. 2, the neural network may also include a key point feature extraction network, where the key point feature extraction network, like the aforementioned feature extraction network, may be any network structure that includes convolutional layers; the disclosure places no limitation on this.
In some alternative embodiments, the key point feature extraction network may include 5 convolutional layers, for example, connected in series, with a global average pooling layer connected after the 5th convolutional layer; the global average pooling layer may perform average pooling over the entire spatial domain of the output of the 5th convolutional layer.
For example, the characteristic information of the target object is input into the key point feature extraction network for feature extraction, undergoing convolution processing through the 5 convolutional layers in turn; the output of the 5th convolutional layer is globally average pooled to obtain the key point features of the target object.
In this way, the structure of the key point feature extraction network is simple, and global average pooling can better preserve the spatial features of the image to be processed. The positioning result of the key points of the target object and the detection results of the action units of the target object are closely related to the spatial features of the target object, so the key point features extracted in this way can improve the prediction accuracy of the key point positioning result and of the detection results of the action units.
It should be understood that the key point feature extraction network may also take other network structures; for example, it may include 3 convolution groups connected in series, each convolution group including 2 convolutional layers with a max pooling layer connected after each convolution group, etc. As long as the key point features of the target object can be extracted, the disclosure places no limitation on the structure of the key point feature extraction network, the convolution groups it includes, the number of convolutional layers, the structure of each convolution group, the structure of the convolutional layers, the category of the pooling layers, etc.
As shown in Figure 1, in step S102, the positioning result of the key points of the target object in the image to be processed is determined according to the key point features.
For example, the obtained key point features of the target object may be input into a fully connected layer (for example, a fully connected layer whose dimension is 2 × the number of key points) for processing, to obtain the positioning result of the key points of the target object. For example, the positioning result may be the horizontal and vertical coordinates of the key points, etc.; the disclosure places no limitation on this. The positioning result is obtained, for example, as shown in Fig. 2.
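A sketch of this positioning head, assuming the key point features arrive as a flat vector and the layer weights are given (names and shapes are illustrative): the 2 × number-of-key-points output is reshaped into one (x, y) pair per key point.

```python
import numpy as np

def locate_keypoints(kp_feat, W, b, n_points):
    """Fully connected layer of dimension 2 * n_points, decoded to coordinates.
    kp_feat: (D,); W: (2 * n_points, D); b: (2 * n_points,)."""
    out = W @ kp_feat + b
    return out.reshape(n_points, 2)   # positioning result: one (x, y) per key point
```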
As shown in Figure 1, in step S103, the detection results of the action units of the target object are determined according to the characteristic information of the target object, the positioning result, and the key point features.
For example, the previously determined positioning result of the key points may be applied to the characteristic information of the target object; for example, the regions of interest (Region of Interest, ROI) of the action units of the target object may be cropped out according to the positioning result of the key points. Then, based on the ROIs of the action units and the key point features, the detection results of the action units of the target object are determined, etc. The disclosure places no limitation on the manner of determining the detection results of the action units of the target object according to the characteristic information of the target object, the positioning result, and the key point features.
In one possible implementation, step S103 may include:
determining local features of the action units of the target object according to the characteristic information of the target object and the positioning result;
determining the detection results of the action units according to the local features and the key point features.
For example, as noted above, the ROIs of the action units may be determined according to the characteristic information of the target object and the positioning result. The local features of the action units may be determined based on the ROIs of the action units; for example, feature extraction is performed on the cropped ROI of each action unit of the target object to obtain the local features of the action units. The detection results of the action units may then be determined according to the local features and the key point features.
In this way, the local features of the action units determined through the characteristic information of the target object and the positioning result have high accuracy, so that the detection results of the action units determined according to the local features and the key point features also have high accuracy. The disclosure places no limitation on the manner of determining the local features of the action units of the target object according to the characteristic information of the target object and the positioning result, or on the manner of determining the detection results of the action units according to the local features and the key point features.
The target object may include one or more key points, and likewise one or more action units. The ROI of an action unit may be a region with identical attention distribution (for example, every pixel has the same attention weight), or a region with differing attention distribution (for example, the attention weights of the pixels are not all the same). When the target object has multiple action units, the ROIs of the action units may be of a fixed size, or of different sizes (for example, the ROI of each action unit may have an irregular shape); the disclosure places no limitation on this.
In one possible implementation, the step of determining the local features of the action units of the target object according to the characteristic information of the target object and the positioning result may include:
determining initial attention feature maps of the action units of the target object according to the positioning result and the positional relationship between the center point of each action unit and the key points of the target object;
performing convolution processing on the initial attention feature maps to obtain processed attention feature maps;
determining the local features of the action units according to the processed attention feature maps and the characteristic information of the target object.
For example, as shown in Fig. 2, the neural network further includes an initial attention generation network. The initial attention generation network may determine the initial attention feature maps of the action units of the target object according to the positioning result and the positional relationship between the center point of each action unit and the key points of the target object.
An illustrative positional relationship between the center points of action units and the key points of the target object according to an embodiment of the present disclosure is given below, as shown in Table 1:
Table 1
| Action unit number | Action unit name | Action unit center |
| --- | --- | --- |
| 7 | Lid tightener | Eye center |
| 10 | Upper lip raiser | Upper lip center |
| 12 | Lip corner puller | Mouth corner |
| 14 | Dimpler | Mouth corner |
| 15 | Lip corner depressor | Mouth corner |
Now taking action unit 12 as an example: as shown in Table 1, the center of action unit 12 is a mouth corner. The key point positioning result includes, for example, the positioning result (for example, the coordinates) of the mouth corner key point. The initial attention feature map of action unit 12 can then be defined according to the determined coordinates of the mouth corner and the positional relationship between the mouth corner and action unit 12.
In some alternative embodiments, every element of the attention feature map of an action unit may be initialized to 0, and the center of the action unit is determined according to the positioning result of the corresponding key point. The ROI of the action unit can be defined according to its center. The ROI of the action unit may include two symmetrical subregions; the initial attention weight of each element in the two subregions of each action unit can be determined, the elements whose attention weights need updating are obtained, and the initial attention feature map of each action unit is generated. The size of the initial attention feature map of each action unit may be L/4 × L/4 × 1.
The disclosure places no limitation on the specific manner of determining the initial attention feature maps of the action units of the target object according to the positioning result and the positional relationship between the center point of each action unit and the key points of the target object, on the size of the initial attention feature maps, etc.
An illustrative formula (1) according to an embodiment of the present disclosure for determining the attention weight of the k-th element in a subregion of the i-th action unit is given below:

v_ik = 1 - ξ·d_ik / (ζ·(L/4))    (1)

In formula (1), v_ik indicates the attention weight of the k-th element in the subregion of the i-th action unit, and d_ik indicates the Manhattan distance from the k-th element to the center of the subregion of the action unit. ζ indicates the ratio of the width of the subregion to the width of the attention feature map, and ξ is a coefficient, ξ ≥ 0. n_au indicates the number of action units, and i is a variable taking values between 1 and n_au. ζ and ξ are preset hyperparameters.
ζ may be used to determine the size of the ROI of each action unit; for example, the size of the ROI of each action unit can be determined according to ζ, the center of the subregion of the action unit, and the width of the attention feature map.
It should be noted that if an element belongs to the overlapping part of the two subregions of an ROI, two attention weights may be obtained for it according to formula (1) respectively, and the larger of the two attention weights is taken as the attention weight of the element. The attention weights of elements outside the ROI of the action unit may be defined as 0.
In this way, initial attention feature maps with differing attention distributions can be obtained. The disclosure places no limitation on the manner of determining the initial attention feature maps of the action units of the target object according to the positioning result and the positional relationship between the center point of each action unit and the key points of the target object, on the size of the initial attention feature maps, the manner of determining the attention weights, the ratio of the subregion width to the attention feature map width, the value of the coefficient ξ, etc.
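A NumPy sketch of predefining one initial attention feature map, under stated assumptions: the linear fall-off with Manhattan distance follows the description of formula (1) (whose exact form is a reconstruction), each subregion is taken as a square of width ζ × map width around its center, weights outside the ROI are 0, and overlapping subregions keep the larger weight. Names are illustrative.

```python
import numpy as np

def initial_attention_map(centers, size, zeta, xi):
    """Predefined attention map (size x size) for one action unit whose ROI is
    formed by subregions centered at `centers` (e.g. two symmetrical points)."""
    att = np.zeros((size, size))
    half = zeta * size / 2.0                        # half-width of a subregion
    ys, xs = np.mgrid[0:size, 0:size]
    for cy, cx in centers:
        d = np.abs(ys - cy) + np.abs(xs - cx)       # Manhattan distance d_ik
        w = np.clip(1.0 - xi * d / (zeta * size), 0.0, 1.0)
        inside = (np.abs(ys - cy) <= half) & (np.abs(xs - cx) <= half)
        att = np.maximum(att, np.where(inside, w, 0.0))  # overlap keeps the larger
    return att
```

For a 176 × 176 face crop the map would be 44 × 44 (L/4 × L/4), with weight 1 at each subregion center decaying toward the ROI boundary.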
Fig. 5 shows a schematic diagram of an application scenario of the image processing method according to an embodiment of the present disclosure. In one possible implementation, as shown in Fig. 5, the target object (a face) includes multiple action units. According to the determined positioning result (for example, the coordinates of multiple key points of the face) and the positional relationship between the center point of each action unit and the key points of the target object, the initial attention feature maps of the action units of the face can be determined; for example, as shown in Fig. 5, the initial attention feature maps 51 of multiple action units are determined.
In one possible implementation, the step of determining the local features of the action units of the target object according to the characteristic information of the target object and the positioning result may include:
performing convolution processing on the initial attention feature maps to obtain the processed attention feature maps.
For example, as shown in Fig. 2, the neural network further includes a reference feature extraction network, which may perform attention optimization processing on the initial attention feature maps. For example, the reference feature extraction network, like the aforementioned feature extraction network, may be any network structure that includes convolutional layers. For example, it may include one convolution group as described above (for example, the convolution group shown in Fig. 3), which may perform convolution processing on the initial attention feature map and output the processed attention feature map through a convolutional layer with one filter (one channel).
In this way, since the multiple convolution sublayers of the convolution subgroup include different numbers of subregions, different attention optimization transformations can be applied in different local regions, suiting action units of different sizes and improving the optimization effect of the processed attention feature maps. The reference feature extraction network may also take other forms; for example, it may include multiple convolutional layers (for example, 3 convolutional layers in series), etc. The disclosure places no limitation on the manner of performing convolution processing on the initial attention feature maps, the structure and form of the convolutional layers, etc. For example, as shown in Fig. 5, convolution processing is performed on the initial attention feature maps 51 of multiple action units to obtain multiple processed attention feature maps 52.
Fig. 6 shows a schematic diagram of the initial attention feature maps and the processed attention feature maps of action units according to the image processing method of an embodiment of the present disclosure. In one possible implementation, as shown in Fig. 6, the first row shows the initial attention feature maps of 6 action units respectively, and the second row shows the processed attention feature maps obtained from these 6 initial attention feature maps by convolution processing respectively. As shown in Fig. 6, each processed attention feature map adaptively adjusts the size and attention weights of the ROI according to the position of the corresponding action unit; the shape of the ROI of each action unit is irregular, and its edges transition smoothly into the surrounding region.
In one possible implementation, the step of determining the local features of the action units of the target object according to the characteristic information of the target object and the positioning result may include:
determining the local features of the action units according to the processed attention feature maps and the characteristic information of the target object.
For example, as shown in Fig. 2, the neural network may include a local feature extraction network. The processed attention feature maps and the characteristic information of the target object may be input into the local feature extraction network to determine the local features of the action units.
As noted above, there may be multiple action units, and the local features of each action unit may be determined respectively according to the processed attention feature map of that action unit and the characteristic information of the target object. In this way, the attention distribution of each action unit is learned in an adaptive attention learning manner and combined with the characteristic information, so the determined local features of the action units have higher accuracy. The disclosure places no limitation on the manner of determining the local features of the action units according to the processed attention feature maps and the characteristic information of the target object.
In one possible implementation, the step of determining the local features of the action units according to the processed attention feature maps and the characteristic information of the target object may include:
obtaining attention content according to the processed attention feature maps and the characteristic information of the target object;
performing feature extraction processing on the attention content to obtain the local features.
For example, a processed attention feature map and the characteristic information of the target object may be multiplied element-wise (that is, corresponding elements are multiplied) to obtain the attention content. Feature extraction processing may then be performed on the attention content to obtain the local features; for example, the attention content may be passed through a feature extraction structure including 5 convolutional layers in series to obtain the local features. When there are multiple action units, the local features of each action unit are obtained respectively. For example, as shown in Fig. 5, the local features 53 of multiple action units are determined respectively according to each processed attention feature map and the characteristic information of the target object (for example, the characteristic information extracted by the feature extraction network).
In this way, the local features of the action units can be extracted with less loss of spatial information and simple model parameters, and more accurately, thereby improving the detection accuracy of the detection results of the action units. The disclosure places no limitation on the manner of performing feature extraction processing on the attention content to obtain the local features.
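The element-wise combination above reduces to a broadcasted multiplication; a minimal NumPy sketch (names illustrative, and the subsequent 5-layer feature extraction abbreviated to a comment):

```python
import numpy as np

def attention_content(att_map, feat):
    """att_map: (H, W) processed attention feature map; feat: (H, W, C)
    characteristic information of the target object."""
    # Corresponding elements are multiplied, broadcast across channels; the
    # result would then pass through 5 serial convolutional layers to give
    # the local features of the action unit.
    return att_map[:, :, None] * feat
```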
In one possible implementation, the detection results of the action units are determined according to the local features and the key point features.
For example, as shown in Fig. 2, the detection results of the action units may be determined by combining the local features of the action units with the key point features of the target object. The disclosure places no limitation on the manner of determining the detection results of the action units according to the local features and the key point features.
In one possible implementation, the target object includes multiple motor units, and the step of determining the object detection results of the motor units according to the local feature and the key point feature may include:
performing fusion processing on the local features of the multiple motor units to obtain a fused local feature; and
determining the object detection results of the multiple motor units according to the fused local feature and the key point feature of the target object.
For example, the obtained local features of the multiple motor units may be subjected to fusion processing, for example, element-wise addition (corresponding pixels added together), to obtain the fused local feature. For example, as shown in Fig. 5, the local features of the multiple motor units are added element-wise to obtain a fused local feature 54, which can be used to determine the final object detection results.
For example, the object detection results of the multiple motor units may be determined according to the fused local feature and the key point feature of the target object. For example, the key point feature of the target object and the fused local feature are added element-wise, global average pooling is performed through a pooling layer, and the result output by the pooling layer is input into a fully connected layer of dimension n_au (the number of motor units) for processing, so as to obtain the object detection results of the multiple motor units, for example, a set of multi-label binary classification results.
In this way, the extracted spatial features can be better preserved, thereby improving the accuracy of the object detection results. The present disclosure does not limit the manner of fusion processing, the form of the local feature, the form of the object detection results, or the manner of determining the object detection results of the multiple motor units according to the fused local feature and the key point feature of the target object.
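The fusion and classification head described above (element-wise addition, global average pooling, a fully connected layer of dimension n_au, per-unit binary outputs) can be sketched as follows. A single random linear layer stands in for the trained network, and the sigmoid multi-label output and all names are illustrative assumptions.

```python
import numpy as np

def fuse_local_features(local_feats):
    # Fusion processing: element-wise (per-position) addition over the
    # n_au local features, giving one fused local feature map.
    return np.sum(local_feats, axis=0)

def au_head(fused, keypoint_feat, W, b):
    x = fused + keypoint_feat             # element-wise addition with the key point feature
    pooled = x.mean(axis=(1, 2))          # global average pooling -> (C,)
    logits = W @ pooled + b               # fully connected layer with n_au outputs
    return 1.0 / (1.0 + np.exp(-logits))  # per-unit sigmoid: multi-label binary results

rng = np.random.default_rng(1)
n_au, C, H, Wd = 12, 8, 4, 4
local_feats = rng.standard_normal((n_au, C, H, Wd))
keypoint_feat = rng.standard_normal((C, H, Wd))
W = rng.standard_normal((n_au, C))
b = np.zeros(n_au)
probs = au_head(fuse_local_features(local_feats), keypoint_feat, W, b)
print(probs.shape)  # (12,)
```

Each of the 12 outputs is an independent occurrence probability for one motor unit, matching the multi-label binary classification described above.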
It should be understood that the above method is applicable to scenarios in which object detection results are determined using a trained neural network, and is also applicable to the process of training the neural network; the embodiments of the present disclosure do not limit this. In one possible implementation, before the object detection results are determined using the trained neural network, a step of training the neural network according to the image to be processed may be included.
Fig. 7 shows a schematic diagram of an application scenario of the image processing method according to an embodiment of the present disclosure. In one possible implementation, as shown in Fig. 7, data preprocessing may be performed both before training the neural network according to the image to be processed and before determining object detection results through the trained neural network. The following description takes the data preprocessing performed before training the neural network according to the image to be processed as an example.
For example, a facial action unit database may be established. For example, 20 male and 20 female subjects are recruited, and each subject is induced to spontaneously produce different expressions in 8 different tasks; 2D videos are captured by a camera, and 500 frames are obtained from each video by screening, giving a total of 40 × 8 × 500 = 160000 pictures. Each face picture is annotated with multiple motor units (for example, 12). For example, if a motor unit appears in a face picture, the annotation information of that motor unit is 1; if the motor unit does not appear, its annotation information is 0.
In addition, face detection and key point localization are performed on each face picture, and multiple facial key points (for example, 49) are annotated. For example, the horizontal and vertical coordinates of the multiple facial key points are annotated. The 160000 pictures can then be processed: for example, as described above, the face location is first detected, and 5 facial key points are located (the left pupil, the right pupil, the nose tip, the left mouth corner, and the right mouth corner); a face similarity transformation involving rotation, translation, and uniform scaling is performed to normalize the face without changing the face shape or expression; finally, the face image is randomly flipped horizontally and cropped to L × L to obtain the image to be processed, which is not described in detail here.
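The normalization step above (rotation, translation, and uniform scaling estimated from the 5 located key points) can be illustrated with a least-squares similarity transform. The Umeyama-style estimator below is one standard way to realize such a transform and is an assumption, since the embodiment does not specify the estimation method; the 5 reference positions are invented for the example.

```python
import numpy as np

def similarity_transform(src, dst):
    # Least-squares similarity transform (rotation + uniform scale +
    # translation, no shear) mapping src landmarks onto dst.
    src_mean, dst_mean = src.mean(0), dst.mean(0)
    src_c, dst_c = src - src_mean, dst - dst_mean
    cov = dst_c.T @ src_c / len(src)
    U, S, Vt = np.linalg.svd(cov)
    d = np.sign(np.linalg.det(U @ Vt))
    D = np.diag([1.0, d])                      # guard against reflections
    R = U @ D @ Vt
    scale = np.trace(np.diag(S) @ D) / src_c.var(axis=0).sum()
    t = dst_mean - scale * R @ src_mean
    return scale, R, t

# Hypothetical reference positions for the 5 key points (left pupil,
# right pupil, nose tip, left mouth corner, right mouth corner) in an
# L x L crop; the actual template is not given by the embodiment.
L = 176
template = np.array([[60, 70], [116, 70], [88, 100], [66, 130], [110, 130]], float)
detected = template * 0.5 + np.array([40.0, 25.0])   # a smaller, shifted face
s, R, t = similarity_transform(detected, template)
aligned = (s * (R @ detected.T)).T + t
print(np.abs(aligned - template).max() < 1e-6)  # True
```

Because the transform is restricted to rotation, uniform scaling, and translation, the face shape and expression are preserved, as the embodiment requires.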
Fig. 8 shows a flowchart of training the neural network in the image processing method according to an embodiment of the present disclosure. In one possible implementation, as shown in Fig. 8, the step of training the neural network according to the image to be processed includes:
In step S104, the image to be processed is separately input into a feature extraction network and a key point feature extraction network in the neural network for processing, to obtain the characteristic information of the target object in the image to be processed and the key point feature of the target object;
In step S105, the key point feature is input into a first detection network in the neural network for processing, to determine the positioning result of the key points of the target object;
In step S106, the characteristic information of the target object, the positioning result, and the key point feature are input into a second detection network in the neural network for processing, to determine the object detection results of the motor units of the target object;
In step S107, a model loss of the neural network is determined according to the positioning result, the annotation information of the positioning result, the object detection results, and the annotation information of the object detection results;
In step S108, network parameter values of the neural network are adjusted according to the model loss.
For example, the image to be processed may be separately input into the feature extraction network and the key point feature extraction network in the neural network for processing, to obtain the characteristic information of the target object in the image to be processed and the key point feature of the target object. For example, as shown in Fig. 7, the image to be processed is input into the feature extraction network to learn multi-scale shared features, and the multi-scale shared features are input into the key point feature extraction network to learn the key point feature of the face.
The key point feature may be input into the first detection network in the neural network for processing, to determine the positioning result of the key points of the target object. For example, as shown in Fig. 7, the positioning results of multiple key points of the face are determined.
The characteristic information of the target object, the positioning result, and the key point feature may be input into the second detection network in the neural network for processing, to determine the object detection results of the motor units of the target object.
In one possible implementation, step S106 may include:
inputting the characteristic information of the target object and the positioning result into a local feature extraction network in the neural network for processing, to determine the local feature of the motor unit of the target object; and
inputting the local feature and the key point feature into the second detection network in the neural network for processing, to determine the object detection results of the motor unit.
For example, the characteristic information of the target object and the positioning result may be input into the local feature extraction network in the neural network for processing to determine the local feature of the motor unit of the target object, and the local feature and the key point feature may then be input into the second detection network in the neural network for processing to determine the object detection results of the motor unit. For example, as shown in Fig. 7, the object detection results are determined according to the local features of the motor units and the key point feature. In this way, the local features of the motor units can be obtained accurately, improving the detection accuracy of the object detection results of the motor units.
In one possible implementation, the model loss of the neural network is determined according to the positioning result, the annotation information of the positioning result, the object detection results, and the annotation information of the object detection results, and the network parameter values of the neural network are adjusted according to the model loss.
For example, the model loss of the neural network may be determined according to the positioning result, the annotation information of the positioning result, the object detection results, the annotation information of the object detection results, and a loss function. The present disclosure does not limit the form of the loss function. The network parameter values of the neural network may be adjusted according to the model loss, for example, using back propagation combined with a gradient descent algorithm. It should be understood that any suitable manner of adjusting the network parameter values of the neural network may be used; the present disclosure does not limit this.
After multiple rounds of adjustment, if a preset training condition is satisfied, for example, the number of adjustments reaches a preset training count threshold, or the model loss becomes less than or equal to a preset loss threshold, the current neural network can be taken as the final neural network, thereby completing the training process of the neural network. It should be understood that those skilled in the art can set the training condition and the loss threshold according to actual conditions; the present disclosure does not limit this.
In this way, a neural network can be trained that accurately obtains both the positioning results of the key points and the object detection results of the motor units.
In one possible implementation, the step of inputting the characteristic information of the target object and the positioning result into the local feature extraction network in the neural network for processing to determine the local feature of the motor unit of the target object may include:
inputting the positioning result into an initial attention generation network in the neural network for processing, and determining the initial attention feature map of the motor unit of the target object according to the positioning result and the positional relationship between the center point of the motor unit and the key points of the target object;
inputting the initial attention feature map into a reference feature extraction network in the neural network for convolution processing, to obtain a processed attention feature map; and
inputting the processed attention feature map and the characteristic information of the target object into the local feature extraction network in the neural network for processing, to determine the local feature of the motor unit of the target object.
In one possible implementation, step S107 may include:
determining the model loss of the neural network according to the positioning result, the annotation information of the positioning result, the object detection results, the annotation information of the object detection results, the initial attention weight of each element in the initial attention feature map, and the attention weight of each element in the processed attention feature map.
For example, during training of the neural network, the initial attention feature map of each motor unit may be determined, and optimization processing may be performed on the initial attention feature map of each motor unit to assist in determining the local feature of each motor unit. For example, as shown in Fig. 7, the positioning results of the facial key points are input into the initial attention generation network in the neural network for processing to determine the initial attention feature maps of the motor units of the face. Optimization processing (for example, convolution processing) is performed on the initial attention feature maps to obtain processed attention feature maps. The local features of the facial motor units are learned according to the processed attention feature maps and the characteristic information of the target object. Through the adaptive attention learning of the neural network, the local feature obtained for each motor unit is more accurate, thereby improving the robustness of the image processing method and the accuracy of the object detection results of the motor units.
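A possible sketch of generating an initial attention feature map from the positioning result: the motor unit's centre is derived from key point positions, and the attention weight decays with distance from that centre. The linear decay law and the radius parameter are assumptions; the embodiment only states that the map follows the positional relationship between the motor unit's centre point and the key points.

```python
import numpy as np

def initial_attention_map(center, size, radius):
    # Weight 1 at the motor unit's centre, decaying linearly with
    # Euclidean distance and clipped to [0, 1]; the decay law is an
    # illustrative assumption.
    ys, xs = np.mgrid[0:size, 0:size]
    d = np.hypot(ys - center[1], xs - center[0])
    return np.clip(1.0 - d / radius, 0.0, 1.0)

# e.g. a motor unit centred midway between two (hypothetical) key points
kp_a, kp_b = np.array([10.0, 8.0]), np.array([14.0, 8.0])
center = (kp_a + kp_b) / 2.0
att = initial_attention_map(center, size=24, radius=6.0)
print(att.max(), att[int(center[1]), int(center[0])])  # 1.0 1.0
```

Such a map would then be refined by the reference feature extraction network (convolution processing) before weighting the shared features.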
In one possible implementation, the model loss of the neural network may be determined according to the positioning result, the annotation information of the positioning result, the object detection results, the annotation information of the object detection results, the initial attention weight of each element in the initial attention feature map, and the attention weight of each element in the processed attention feature map.
It should be understood that the model loss of the neural network can be determined from the annotation information and the detection results. In determining the model loss of the neural network, the loss function may take various forms; the present disclosure does not limit this.
The embodiments of the present disclosure illustratively give the following formula (2) for determining the loss according to the positioning result and the annotation information of the positioning result:

E_align = (1 / (2 d_o)) Σ_{j=1}^{n_align} [(y_{2j-1} − ŷ_{2j-1})² + (y_{2j} − ŷ_{2j})²]   (2)

In formula (2), E_align denotes the loss determined according to the key point positioning result and the annotation information of the positioning result. y_{2j-1} and y_{2j} respectively denote the annotation information of the j-th point (the ground-truth x and y coordinates), where j ranges from 1 to n_align and n_align denotes the number of key points. d_o is the ground-truth interocular (pupil-to-pupil) distance. ŷ_{2j-1} and ŷ_{2j} respectively denote the positioning result of the j-th point (the predicted x and y coordinates of the point).
The embodiments of the present disclosure illustratively give the following formula (3) for determining the loss according to the object detection results and the annotation information of the object detection results:

E_au = −(1 / n_au) Σ_{i=1}^{n_au} w_i [p_i log p̂_i + (1 − p_i) log(1 − p̂_i)]   (3)

In formula (3), E_au denotes the loss determined according to the object detection results and the annotation information of the object detection results. n_au denotes the dimension of the fully connected layer, i.e. the number of motor units. p_i denotes the ground-truth probability that the i-th motor unit occurs (the annotation information of the object detection results), being 1 if it occurs and 0 if it is absent. p̂_i denotes the predicted probability (the predicted object detection result of the motor unit). The weight w_i is used to overcome the problem of data imbalance. In most motor unit detection databases, different motor units occur with unbalanced frequencies, and the motor units are not mutually independent. By weighting, the contribution of each motor unit to the loss can be made roughly the same, reducing the data imbalance problem and ensuring prediction accuracy. Statistical processing may be performed on the training set, and the weight may be determined, for example, by an inverse-frequency formula such as w_i = (1 / r_i) / Σ_{j=1}^{n_au} (1 / r_j), where r_i denotes the frequency with which the i-th motor unit occurs in the training set.
It should be understood that formula (3) is a weighted multi-label sigmoid cross-entropy loss function; in this way the dimension of the fully connected layer can be kept small, simplifying the structure while ensuring the prediction effect. In addition, the loss may also be determined according to the object detection results and the annotation information of the object detection results through other types of loss functions, for example, a weighted multi-label softmax loss; the present disclosure does not limit this.
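Formula (3) and the inverse-frequency weighting can be sketched numerically as follows. The exact normalization of w_i is an assumption: here the weights are normalized to sum to 1, which absorbs the 1/n_au averaging of formula (3).

```python
import numpy as np

def au_weights(freqs):
    # Inverse-frequency weights normalized to sum to 1: rare motor
    # units receive larger weights, counteracting data imbalance.
    inv = 1.0 / np.asarray(freqs, float)
    return inv / inv.sum()

def weighted_multilabel_sigmoid_ce(p_true, p_pred, w, eps=1e-12):
    # Weighted sum over motor units of the binary cross entropy between
    # ground-truth occurrence p_i and predicted probability p̂_i.
    p_pred = np.clip(p_pred, eps, 1 - eps)
    per_au = -(p_true * np.log(p_pred) + (1 - p_true) * np.log(1 - p_pred))
    return float(np.sum(w * per_au))

freqs = [0.4, 0.1, 0.5]              # r_i: occurrence frequency of each unit
w = au_weights(freqs)                # the rarest unit gets the largest weight
loss = weighted_multilabel_sigmoid_ce(np.array([1, 0, 1]),
                                      np.array([0.9, 0.2, 0.8]), w)
print(np.round(w, 3), round(loss, 4))
```

With confident, mostly correct predictions the loss is small; the weight vector shows the second (rarest) unit dominating the loss.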
The embodiments of the present disclosure illustratively give the following formula (4) for determining the loss according to the initial attention weight of each element in the initial attention feature map and the attention weight of each element in the processed attention feature map:

E_r = −(1 / n_element) Σ_{i=1}^{n_au} Σ_{k=1}^{n_element} [v_ik log v̂_ik + (1 − v_ik) log(1 − v̂_ik)]   (4)

In formula (4), E_r denotes the loss determined according to the initial attention weight of each element in the initial attention feature map and the attention weight of each element in the processed attention feature map, and measures the sigmoid cross entropy between the initial values of the attention feature map and the values after optimization. v_ik denotes the initial attention weight of the k-th element of the i-th attention feature map, v̂_ik denotes the processed attention weight of the k-th element of the i-th attention feature map, and n_element is the number of elements in each attention feature map.
In this way, the probability that the processed attention feature map differs greatly from the initial attention feature map can be reduced.
The embodiments of the present disclosure illustratively give the following formula (5) for determining the overall model loss:
E = E_au + λ1 E_align + λ2 E_r   (5)
where E denotes the overall model loss, and λ1 and λ2 are coefficients balancing the importance of the terms, set in advance as hyperparameters.
In this way, during training of the neural network, the overall model loss can be calculated according to formulas (2), (3), (4), and (5), and the network parameter values of the neural network can be adjusted according to the model loss, as described above, which is not repeated here.
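The component losses and their combination in formula (5) can be sketched as follows. E_align and E_r are implemented directly from their definitions above, E_au is passed in as a precomputed value, and all numbers are illustrative.

```python
import numpy as np

def e_align(y_true, y_pred, d_o):
    # Formula (2): squared landmark error normalized by twice the
    # interocular distance d_o; the arrays hold (x, y) per key point.
    return float(np.sum((y_true - y_pred) ** 2) / (2.0 * d_o))

def e_r(v_init, v_proc, eps=1e-12):
    # Formula (4): sigmoid cross entropy between initial and processed
    # attention weights, averaged over elements.
    v_proc = np.clip(v_proc, eps, 1 - eps)
    ce = -(v_init * np.log(v_proc) + (1 - v_init) * np.log(1 - v_proc))
    return float(ce.mean())

def total_loss(e_au_val, e_align_val, e_r_val, lam1, lam2):
    # Formula (5): E = E_au + λ1·E_align + λ2·E_r.
    return e_au_val + lam1 * e_align_val + lam2 * e_r_val

y_t = np.array([[10.0, 20.0], [30.0, 22.0]])   # two ground-truth key points
y_p = np.array([[11.0, 20.0], [30.0, 24.0]])   # their predicted positions
E_align = e_align(y_t, y_p, d_o=40.0)
E_r = e_r(np.array([1.0, 0.0]), np.array([0.9, 0.1]))
E = total_loss(0.2, E_align, E_r, lam1=0.5, lam2=0.1)
print(round(E_align, 4), round(E, 4))  # 0.0625 0.2418
```

The hyperparameters lam1 and lam2 trade off the alignment and attention terms against the motor unit detection loss, exactly as λ1 and λ2 do in formula (5).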
In this way, a neural network can be trained that accurately obtains both the positioning results of the key points and the object detection results of the motor units. By jointly training key point localization and motor unit detection, the correlation between the two tasks can be exploited to improve the accuracy of the detection results of both tasks. Through adaptive attention learning, the method can adapt to the detection of diverse, non-rigid motor units of various classes, improving the accuracy of the object detection results of the motor units.
It should be understood that the positioning results of the key points of the target object and the object detection results of the motor units of the target object obtained according to the embodiments of the present disclosure can be applied in various object analysis tasks. For example, the determined positioning results of the facial key points and the object detection results of the facial action units can be used for face analysis of a person, and can be applied in fields such as facial expression recognition, face verification, and security. The present disclosure does not limit the applicable scenarios of the object detection results and the positioning results.
Those skilled in the art can understand that, in the above methods of the specific embodiments, the order in which the steps are written does not imply a strict execution order or impose any limitation on the implementation process; the specific execution order of the steps should be determined by their functions and possible internal logic.
Fig. 9 shows a block diagram of an image processing apparatus according to an embodiment of the present disclosure. As shown in Fig. 9, the apparatus includes:
an obtaining module 201, configured to obtain the characteristic information of the target object in the image to be processed and the key point feature of the target object;
a positioning result determining module 202, configured to determine the positioning result of the key points of the target object in the image to be processed according to the key point feature; and
an object detection result determining module 203, configured to determine the object detection results of the motor units of the target object according to the characteristic information of the target object, the positioning result, and the key point feature.
In one possible implementation, the object detection result determining module 203 includes:
a first determining submodule, configured to determine the local feature of the motor unit of the target object according to the characteristic information of the target object and the positioning result; and
a second determining submodule, configured to determine the object detection results of the motor unit according to the local feature and the key point feature.
In one possible implementation, the obtaining module 201 includes:
a first obtaining submodule, configured to perform feature extraction on the image to be processed to obtain the characteristic information of the target object in the image to be processed; and
a second obtaining submodule, configured to perform key point feature extraction on the characteristic information of the target object to obtain the key point feature of the target object.
In one possible implementation, the first determining submodule includes:
a third determining submodule, configured to determine the initial attention feature map of the motor unit of the target object according to the positioning result and the positional relationship between the center point of the motor unit and the key points of the target object;
a third obtaining submodule, configured to perform convolution processing on the initial attention feature map to obtain the processed attention feature map; and
a fourth determining submodule, configured to determine the local feature of the motor unit according to the processed attention feature map and the characteristic information of the target object.
In one possible implementation, the fourth determining submodule includes:
a fourth obtaining submodule, configured to obtain the attention content according to the processed attention feature map and the characteristic information of the target object; and
a fifth obtaining submodule, configured to perform feature extraction processing on the attention content to obtain the local feature.
In one possible implementation, the target object includes multiple motor units, and the second determining submodule includes:
a sixth obtaining submodule, configured to perform fusion processing on the local features of the multiple motor units to obtain the fused local feature; and
a fifth determining submodule, configured to determine the object detection results of the multiple motor units according to the fused local feature and the key point feature of the target object.
In one possible implementation, the apparatus is implemented using a neural network, and the first obtaining submodule includes:
a seventh obtaining submodule, configured to input the image to be processed into the feature extraction network of the neural network for feature extraction to obtain the characteristic information of the target object in the image to be processed.
In one possible implementation, the apparatus is implemented using a neural network, and the neural network is obtained by training according to the image to be processed.
Fig. 10 shows a block diagram of an image processing apparatus according to an embodiment of the present disclosure. As shown in Fig. 10, in one possible implementation, the apparatus includes:
a feature obtaining module 204, configured to separately input the image to be processed into the feature extraction network and the key point feature extraction network in the neural network for processing, to obtain the characteristic information of the target object in the image to be processed and the key point feature of the target object;
a first determining module 205, configured to input the key point feature into the first detection network in the neural network for processing, to determine the positioning result of the key points of the target object;
a second determining module 206, configured to input the characteristic information of the target object, the positioning result, and the key point feature into the second detection network in the neural network for processing, to determine the object detection results of the motor units of the target object;
a third determining module 207, configured to determine the model loss of the neural network according to the positioning result, the annotation information of the positioning result, the object detection results, and the annotation information of the object detection results; and
a parameter adjusting module 208, configured to adjust the network parameter values of the neural network according to the model loss.
In one possible implementation, the second determining module 206 includes:
a sixth determining submodule, configured to input the characteristic information of the target object and the positioning result into the local feature extraction network in the neural network for processing, to determine the local feature of the motor unit of the target object; and
a seventh determining submodule, configured to input the local feature and the key point feature into the second detection network in the neural network for processing, to determine the object detection results of the motor unit.
In one possible implementation, the sixth determining submodule includes:
an eighth determining submodule, configured to input the positioning result into the initial attention generation network in the neural network for processing, and determine the initial attention feature map of the motor unit of the target object according to the positioning result and the positional relationship between the center point of the motor unit and the key points of the target object;
an eighth obtaining submodule, configured to input the initial attention feature map into the reference feature extraction network in the neural network for convolution processing, to obtain the processed attention feature map; and
a ninth determining submodule, configured to input the processed attention feature map and the characteristic information of the target object into the local feature extraction network in the neural network for processing, to determine the local feature of the motor unit of the target object.
In one possible implementation, the third determining module 207 includes:
a tenth determining submodule, configured to determine the model loss of the neural network according to the positioning result, the annotation information of the positioning result, the object detection results, the annotation information of the object detection results, the initial attention weight of each element in the initial attention feature map, and the attention weight of each element in the processed attention feature map.
In one possible implementation, the feature extraction network includes at least one convolution group; each convolution group includes at least one convolutional layer and at least one convolution subgroup; each convolution subgroup includes multiple convolution sublayers; each convolution sublayer includes a different number of subregions, and the convolution kernel parameters of the different subregions of each convolution sublayer are different.
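A toy illustration of a convolution sublayer whose subregions have different kernel parameters: the input map is split into subregions and each is convolved with its own kernel. The 2×2 partition, single channel, and scaled-identity kernels are assumptions chosen so the effect is easy to verify; the embodiment's actual partitioning and kernel shapes are not specified here.

```python
import numpy as np

def conv2d_same(x, k):
    # Naive zero-padded "same" 2-D convolution, single channel, 3x3 kernel.
    H, W = x.shape
    pad = np.pad(x, 1)
    out = np.empty_like(x)
    for i in range(H):
        for j in range(W):
            out[i, j] = np.sum(pad[i:i + 3, j:j + 3] * k)
    return out

def partitioned_conv(x, kernels):
    # Split the map into 2x2 subregions, convolve each with its own
    # kernel parameters, and stitch the results back together.
    H, W = x.shape
    h, w = H // 2, W // 2
    out = np.empty_like(x)
    for bi in range(2):
        for bj in range(2):
            sub = x[bi * h:(bi + 1) * h, bj * w:(bj + 1) * w]
            out[bi * h:(bi + 1) * h, bj * w:(bj + 1) * w] = \
                conv2d_same(sub, kernels[bi][bj])
    return out

rng = np.random.default_rng(2)
x = rng.standard_normal((8, 8))
identity = np.zeros((3, 3))
identity[1, 1] = 1.0
kernels = [[identity, 2 * identity], [3 * identity, 4 * identity]]
y = partitioned_conv(x, kernels)
print(np.allclose(y[:4, :4], x[:4, :4]))  # True: top-left uses the identity kernel
```

Because each subregion has its own kernel parameters, the layer can specialize to different facial regions, which is the motivation for region-partitioned convolution.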
In some embodiments, the functions of, or the modules included in, the apparatus provided by the embodiments of the present disclosure can be used to execute the methods described in the method embodiments above; for their specific implementation, reference may be made to the description of the method embodiments above, which, for brevity, is not repeated here.
The embodiments of the present disclosure also propose a computer-readable storage medium on which computer program instructions are stored, where the computer program instructions, when executed by a processor, implement the above method. The computer-readable storage medium may be a non-volatile computer-readable storage medium.
The embodiments of the present disclosure also propose an electronic device, including: a processor; and a memory for storing processor-executable instructions; where the processor is configured to perform the above method.
Fig. 11 shows a block diagram of an electronic device according to an embodiment of the present disclosure. For example, the electronic device 800 may be a terminal such as a mobile phone, a computer, a digital broadcast terminal, a messaging device, a game console, a tablet device, a medical device, fitness equipment, or a personal digital assistant.
Referring to Fig. 11, the electronic device 800 may include one or more of the following components: a processing component 802, a memory 804, a power component 806, a multimedia component 808, an audio component 810, an input/output (I/O) interface 812, a sensor component 814, and a communication component 816.
The processing component 802 typically controls the overall operation of the electronic device 800, such as operations associated with display, telephone calls, data communications, camera operations, and recording operations. The processing component 802 may include one or more processors 820 to execute instructions to perform all or part of the steps of the methods described above. In addition, the processing component 802 may include one or more modules to facilitate interaction between the processing component 802 and other components. For example, the processing component 802 may include a multimedia module to facilitate interaction between the multimedia component 808 and the processing component 802.
The memory 804 is configured to store various types of data to support operation on the electronic device 800. Examples of such data include instructions for any application or method operated on the electronic device 800, contact data, phone book data, messages, pictures, videos, and the like. The memory 804 may be implemented by any type of volatile or non-volatile storage device, or a combination thereof, such as static random access memory (SRAM), electrically erasable programmable read-only memory (EEPROM), erasable programmable read-only memory (EPROM), programmable read-only memory (PROM), read-only memory (ROM), magnetic memory, flash memory, magnetic disk, or optical disk.
The power component 806 provides power to the various components of the electronic device 800. The power component 806 may include a power management system, one or more power supplies, and other components associated with generating, managing, and distributing power for the electronic device 800.
The multimedia component 808 includes a screen providing an output interface between the electronic device 800 and the user. In some embodiments, the screen may include a liquid crystal display (LCD) and a touch panel (TP). If the screen includes a touch panel, the screen may be implemented as a touch screen to receive input signals from the user. The touch panel includes one or more touch sensors to sense touches, swipes, and gestures on the touch panel. The touch sensors may not only sense the boundary of a touch or swipe action, but also detect the duration and pressure associated with the touch or swipe operation. In some embodiments, the multimedia component 808 includes a front camera and/or a rear camera. When the electronic device 800 is in an operation mode, such as a shooting mode or a video mode, the front camera and/or the rear camera can receive external multimedia data. Each of the front camera and the rear camera may be a fixed optical lens system or have focus and optical zoom capability.
The audio component 810 is configured to output and/or input audio signals. For example, the audio component 810 includes a microphone (MIC) configured to receive external audio signals when the electronic device 800 is in an operation mode, such as a call mode, a recording mode, or a voice recognition mode. The received audio signals may be further stored in the memory 804 or sent via the communication component 816. In some embodiments, the audio component 810 further includes a speaker for outputting audio signals.
The I/O interface 812 provides an interface between the processing component 802 and peripheral interface modules, which may be a keyboard, a click wheel, buttons, and the like. These buttons may include, but are not limited to, a home button, a volume button, a start button, and a lock button.
The sensor component 814 includes one or more sensors for providing status assessments of various aspects of the electronic device 800. For example, the sensor component 814 may detect the open/closed state of the electronic device 800 and the relative positioning of components, such as the display and keypad of the electronic device 800; the sensor component 814 may also detect a change in position of the electronic device 800 or of a component of the electronic device 800, the presence or absence of user contact with the electronic device 800, the orientation or acceleration/deceleration of the electronic device 800, and a change in temperature of the electronic device 800. The sensor component 814 may include a proximity sensor configured to detect the presence of nearby objects without any physical contact. The sensor component 814 may also include a light sensor, such as a CMOS or CCD image sensor, for use in imaging applications. In some embodiments, the sensor component 814 may also include an acceleration sensor, a gyroscope sensor, a magnetic sensor, a pressure sensor, or a temperature sensor.
The communication component 816 is configured to facilitate wired or wireless communication between the electronic device 800 and other devices. The electronic device 800 may access a wireless network based on a communication standard, such as WiFi, 2G, or 3G, or a combination thereof. In one exemplary embodiment, the communication component 816 receives a broadcast signal or broadcast-related information from an external broadcast management system via a broadcast channel. In one exemplary embodiment, the communication component 816 further includes a near field communication (NFC) module to facilitate short-range communication. For example, the NFC module may be implemented based on radio frequency identification (RFID) technology, Infrared Data Association (IrDA) technology, ultra-wideband (UWB) technology, Bluetooth (BT) technology, and other technologies.
In an exemplary embodiment, the electronic device 800 may be implemented by one or more application-specific integrated circuits (ASICs), digital signal processors (DSPs), digital signal processing devices (DSPDs), programmable logic devices (PLDs), field-programmable gate arrays (FPGAs), controllers, microcontrollers, microprocessors, or other electronic components, for performing the above method.
In an exemplary embodiment, there is also provided a non-volatile computer-readable storage medium, such as the memory 804 including computer program instructions, which can be executed by the processor 820 of the electronic device 800 to perform the above method.
Figure 12 shows a block diagram of an electronic device according to an embodiment of the present disclosure. For example, the electronic device 1900 may be provided as a server. Referring to Figure 12, the electronic device 1900 includes a processing component 1922, which further includes one or more processors, and memory resources represented by a memory 1932 for storing instructions executable by the processing component 1922, such as application programs. The application programs stored in the memory 1932 may include one or more modules, each corresponding to a set of instructions. Furthermore, the processing component 1922 is configured to execute the instructions to perform the above method.
The electronic device 1900 may also include a power supply component 1926 configured to perform power management of the electronic device 1900, a wired or wireless network interface 1950 configured to connect the electronic device 1900 to a network, and an input/output (I/O) interface 1958. The electronic device 1900 may operate based on an operating system stored in the memory 1932, such as Windows Server™, Mac OS X™, Unix™, Linux™, FreeBSD™, or the like.
In an exemplary embodiment, there is also provided a non-volatile computer-readable storage medium, such as the memory 1932 including computer program instructions, which can be executed by the processing component 1922 of the electronic device 1900 to perform the above method.
The present disclosure may be a system, a method, and/or a computer program product. The computer program product may include a computer-readable storage medium carrying computer-readable program instructions for causing a processor to implement aspects of the present disclosure.
The computer-readable storage medium may be a tangible device that can retain and store instructions for use by an instruction execution device. The computer-readable storage medium may be, for example, but is not limited to, an electrical storage device, a magnetic storage device, an optical storage device, an electromagnetic storage device, a semiconductor storage device, or any suitable combination of the foregoing. A non-exhaustive list of more specific examples of the computer-readable storage medium includes: a portable computer diskette, a hard disk, a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), a static random access memory (SRAM), a portable compact disc read-only memory (CD-ROM), a digital versatile disc (DVD), a memory stick, a floppy disk, a mechanically encoded device such as a punch card or raised structures in a groove having instructions recorded thereon, and any suitable combination of the foregoing. A computer-readable storage medium, as used herein, is not to be construed as being transitory signals per se, such as radio waves or other freely propagating electromagnetic waves, electromagnetic waves propagating through a waveguide or other transmission media (for example, light pulses passing through a fiber-optic cable), or electrical signals transmitted through a wire.
Computer-readable program instructions described herein can be downloaded to respective computing/processing devices from a computer-readable storage medium, or to an external computer or external storage device via a network, for example, the Internet, a local area network, a wide area network, and/or a wireless network. The network may comprise copper transmission cables, optical transmission fibers, wireless transmission, routers, firewalls, switches, gateway computers, and/or edge servers. A network adapter or network interface in each computing/processing device receives the computer-readable program instructions from the network and forwards the computer-readable program instructions for storage in a computer-readable storage medium within the respective computing/processing device.
Computer program instructions for carrying out operations of the present disclosure may be assembler instructions, instruction-set-architecture (ISA) instructions, machine instructions, machine-dependent instructions, microcode, firmware instructions, state-setting data, or source code or object code written in any combination of one or more programming languages, including object-oriented programming languages such as Smalltalk and C++, and conventional procedural programming languages such as the "C" language or similar programming languages. The computer-readable program instructions may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on a remote computer or server. In scenarios involving a remote computer, the remote computer may be connected to the user's computer through any type of network, including a local area network (LAN) or a wide area network (WAN), or the connection may be made to an external computer (for example, through the Internet using an Internet Service Provider). In some embodiments, an electronic circuit, such as a programmable logic circuit, a field-programmable gate array (FPGA), or a programmable logic array (PLA), may be personalized by utilizing state information of the computer-readable program instructions; the electronic circuit may execute the computer-readable program instructions in order to implement aspects of the present disclosure.
Aspects of the present disclosure are described herein with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of the disclosure. It should be understood that each block of the flowcharts and/or block diagrams, and combinations of blocks in the flowcharts and/or block diagrams, can be implemented by computer-readable program instructions.
These computer-readable program instructions may be provided to a processor of a general-purpose computer, a special-purpose computer, or another programmable data processing apparatus to produce a machine, such that the instructions, when executed via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions/acts specified in one or more blocks of the flowcharts and/or block diagrams. These computer-readable program instructions may also be stored in a computer-readable storage medium that can direct a computer, a programmable data processing apparatus, and/or other devices to function in a particular manner, such that the computer-readable medium having the instructions stored therein comprises an article of manufacture including instructions which implement aspects of the functions/acts specified in one or more blocks of the flowcharts and/or block diagrams.
The computer-readable program instructions may also be loaded onto a computer, another programmable data processing apparatus, or another device to cause a series of operational steps to be performed on the computer, other programmable apparatus, or other device, so as to produce a computer-implemented process, such that the instructions which execute on the computer, other programmable apparatus, or other device implement the functions/acts specified in one or more blocks of the flowcharts and/or block diagrams.
The flowcharts and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods, and computer program products according to various embodiments of the present disclosure. In this regard, each block in the flowcharts or block diagrams may represent a module, a program segment, or a portion of instructions, which comprises one or more executable instructions for implementing the specified logical functions. In some alternative implementations, the functions noted in the blocks may occur out of the order noted in the figures. For example, two consecutive blocks may, in fact, be executed substantially concurrently, or they may sometimes be executed in the reverse order, depending upon the functionality involved. It should also be noted that each block of the block diagrams and/or flowcharts, and combinations of blocks in the block diagrams and/or flowcharts, can be implemented by special-purpose hardware-based systems that perform the specified functions or acts, or by combinations of special-purpose hardware and computer instructions.
The embodiments of the present disclosure have been described above. The foregoing description is exemplary rather than exhaustive, and is not limited to the disclosed embodiments. Many modifications and variations will be apparent to those of ordinary skill in the art without departing from the scope and spirit of the illustrated embodiments. The terminology used herein was chosen to best explain the principles of the embodiments, their practical application, or their technological improvement over technologies found in the marketplace, or to enable others of ordinary skill in the art to understand the embodiments disclosed herein.
Claims (10)
1. An image processing method, characterized in that the method comprises:
obtaining characteristic information of a target object in an image to be processed and a key point feature of the target object;
determining, according to the key point feature, a localization result of key points of the target object in the image to be processed; and
determining, according to the characteristic information of the target object, the localization result, and the key point feature, a target detection result of an action unit of the target object.
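Claim 1 describes a three-step flow: extract features from the image, locate key points from those features, then detect action units from the features together with the localization result. The sketch below is a minimal, hypothetical NumPy rendering of that flow; every name and every stand-in operation (average pooling as "feature extraction", strongest-response selection as "key point localization", a thresholded score as "action unit detection") is an illustrative assumption, not the patent's actual model.

```python
import numpy as np

def extract_features(image):
    """Stand-in feature extractor: 2x2 average pooling of a grayscale image."""
    h, w = image.shape
    return image[:h - h % 2, :w - w % 2].reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))

def locate_keypoints(features, num_keypoints=2):
    """Stand-in key-point locator: take the num_keypoints strongest responses."""
    flat = np.argsort(features.ravel())[::-1][:num_keypoints]
    return np.stack(np.unravel_index(flat, features.shape), axis=1)  # rows of (y, x)

def detect_action_units(features, keypoints, threshold=0.5):
    """Stand-in detector: an action unit counts as 'active' when the mean
    feature value at its key points exceeds a threshold."""
    scores = features[keypoints[:, 0], keypoints[:, 1]]
    return scores.mean() > threshold

image = np.zeros((8, 8))
image[2, 2] = image[5, 6] = 1.0              # two bright "key point" regions
feats = extract_features(image)              # step 1: characteristic information
kps = locate_keypoints(feats)                # step 2: key point localization
print(detect_action_units(feats, kps, threshold=0.2))  # step 3: prints True
```

In a real system, each stand-in would be a learned network stage, but the data flow between the three steps is the structure the claim recites.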
2. The method according to claim 1, characterized in that determining the target detection result of the action unit of the target object according to the characteristic information of the target object, the localization result, and the key point feature comprises:
determining a local feature of the action unit of the target object according to the characteristic information of the target object and the localization result; and
determining the target detection result of the action unit according to the local feature and the key point feature.
3. The method according to claim 1 or 2, characterized in that obtaining the characteristic information of the target object in the image to be processed and the key point feature of the target object comprises:
performing feature extraction on the image to be processed to obtain the characteristic information of the target object in the image to be processed; and
performing key point feature extraction on the characteristic information of the target object to obtain the key point feature of the target object.
4. The method according to claim 2, characterized in that determining the local feature of the action unit of the target object according to the characteristic information of the target object and the localization result comprises:
determining an initial attention feature map of the action unit of the target object according to the localization result and a positional relationship between a center point of the action unit and the key points of the target object;
performing convolution processing on the initial attention feature map to obtain a processed attention feature map; and
determining the local feature of the action unit according to the processed attention feature map and the characteristic information of the target object.
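As one purely illustrative reading of the steps in claim 4, the sketch below builds an initial attention map as a Gaussian centered on an action-unit center derived from the key-point positions, applies a small box-filter convolution as the "convolution processing", and weights the characteristic information with the processed map to obtain a local feature. The Gaussian and box-filter choices, and all names, are assumptions made for demonstration, not the claimed implementation.

```python
import numpy as np

def initial_attention_map(shape, keypoints, sigma=1.5):
    """Gaussian attention centered on the midpoint of the key points
    (a simple stand-in for the claimed center/key-point positional relation)."""
    cy, cx = np.mean(keypoints, axis=0)
    ys, xs = np.mgrid[:shape[0], :shape[1]]
    return np.exp(-((ys - cy) ** 2 + (xs - cx) ** 2) / (2 * sigma ** 2))

def box_convolve(att, k=3):
    """k x k box-filter 'convolution processing' with zero padding."""
    p = k // 2
    padded = np.pad(att, p)
    out = np.zeros_like(att)
    for dy in range(k):
        for dx in range(k):
            out += padded[dy:dy + att.shape[0], dx:dx + att.shape[1]]
    return out / (k * k)

features = np.ones((8, 8))                                   # characteristic information
att = initial_attention_map(features.shape, np.array([[2, 2], [4, 6]]))
local_feature = features * box_convolve(att)                 # attention-weighted local feature
peak = np.unravel_index(local_feature.argmax(), local_feature.shape)
print(tuple(int(v) for v in peak))                           # prints (3, 4), the key-point midpoint
```

Weighting the full feature map by the smoothed attention map keeps responses near the action-unit center and suppresses the rest, which is the intuitive role of the "local feature" in the claim.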
5. An image processing apparatus, characterized in that the apparatus comprises:
an obtaining module, configured to obtain characteristic information of a target object in an image to be processed and a key point feature of the target object;
a localization result determining module, configured to determine, according to the key point feature, a localization result of key points of the target object in the image to be processed; and
a target detection result determining module, configured to determine, according to the characteristic information of the target object, the localization result, and the key point feature, a target detection result of an action unit of the target object.
6. The apparatus according to claim 5, characterized in that the target detection result determining module comprises:
a first determining submodule, configured to determine a local feature of the action unit of the target object according to the characteristic information of the target object and the localization result; and
a second determining submodule, configured to determine the target detection result of the action unit according to the local feature and the key point feature.
7. The apparatus according to claim 5 or 6, characterized in that the obtaining module comprises:
a first obtaining submodule, configured to perform feature extraction on the image to be processed to obtain the characteristic information of the target object in the image to be processed; and
a second obtaining submodule, configured to perform key point feature extraction on the characteristic information of the target object to obtain the key point feature of the target object.
8. The apparatus according to claim 6, characterized in that the first determining submodule comprises:
a third determining submodule, configured to determine an initial attention feature map of the action unit of the target object according to the localization result and a positional relationship between a center point of the action unit and the key points of the target object;
a third obtaining submodule, configured to perform convolution processing on the initial attention feature map to obtain a processed attention feature map; and
a fourth determining submodule, configured to determine the local feature of the action unit according to the processed attention feature map and the characteristic information of the target object.
9. An electronic device, characterized by comprising:
a processor; and
a memory for storing processor-executable instructions;
wherein the processor is configured to perform the method according to any one of claims 1 to 4.
10. A computer-readable storage medium having computer program instructions stored thereon, characterized in that the computer program instructions, when executed by a processor, implement the method according to any one of claims 1 to 4.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201810842970.4A CN109241835A (en) | 2018-07-27 | 2018-07-27 | Image processing method and device, electronic equipment and storage medium |
Publications (1)
Publication Number | Publication Date |
---|---|
CN109241835A true CN109241835A (en) | 2019-01-18 |
Family
ID=65073111
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201810842970.4A Pending CN109241835A (en) | 2018-07-27 | 2018-07-27 | Image processing method and device, electronic equipment and storage medium |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN109241835A (en) |
Cited By (22)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109685041A (en) * | 2019-01-23 | 2019-04-26 | 北京市商汤科技开发有限公司 | Image analysis method and device, electronic equipment and storage medium |
CN109815924A (en) * | 2019-01-29 | 2019-05-28 | 成都旷视金智科技有限公司 | Expression recognition method, apparatus and system |
CN109886335A (en) * | 2019-02-21 | 2019-06-14 | 厦门美图之家科技有限公司 | Disaggregated model training method and device |
CN109902631A (en) * | 2019-03-01 | 2019-06-18 | 北京视甄智能科技有限公司 | A kind of fast face detecting method based on image pyramid |
CN110147717A (en) * | 2019-04-03 | 2019-08-20 | 平安科技(深圳)有限公司 | A kind of recognition methods and equipment of human action |
CN110321849A (en) * | 2019-07-05 | 2019-10-11 | 腾讯科技(深圳)有限公司 | Image processing method, device and computer readable storage medium |
CN110530372A (en) * | 2019-09-26 | 2019-12-03 | 上海商汤智能科技有限公司 | Localization method, determining method of path, device, robot and storage medium |
CN110992406A (en) * | 2019-12-10 | 2020-04-10 | 张家港赛提菲克医疗器械有限公司 | Radiotherapy patient positioning rigid body registration algorithm based on region of interest |
CN111104925A (en) * | 2019-12-30 | 2020-05-05 | 上海商汤临港智能科技有限公司 | Image processing method, image processing apparatus, storage medium, and electronic device |
CN111144313A (en) * | 2019-12-27 | 2020-05-12 | 创新奇智(青岛)科技有限公司 | Face detection method and system based on multi-receptive-field dynamic combination |
CN111291804A (en) * | 2020-01-22 | 2020-06-16 | 杭州电子科技大学 | Multi-sensor time series analysis model based on attention mechanism |
CN111680646A (en) * | 2020-06-11 | 2020-09-18 | 北京市商汤科技开发有限公司 | Motion detection method and device, electronic device and storage medium |
CN111739097A (en) * | 2020-06-30 | 2020-10-02 | 上海商汤智能科技有限公司 | Distance measuring method and device, electronic equipment and storage medium |
CN111783724A (en) * | 2020-07-14 | 2020-10-16 | 上海依图网络科技有限公司 | Target object identification method and device |
CN111832338A (en) * | 2019-04-16 | 2020-10-27 | 北京市商汤科技开发有限公司 | Object detection method and device, electronic equipment and storage medium |
CN112036487A (en) * | 2020-08-31 | 2020-12-04 | 北京市商汤科技开发有限公司 | Image processing method and device, electronic equipment and storage medium |
CN112364773A (en) * | 2020-11-12 | 2021-02-12 | 西安电子科技大学 | Hyperspectral target detection method based on L1 regular constraint depth multi-instance learning |
CN113261011A (en) * | 2019-12-30 | 2021-08-13 | 商汤国际私人有限公司 | Image processing method and device, electronic equipment and storage medium |
CN113692563A (en) * | 2019-06-27 | 2021-11-23 | 苹果公司 | Modifying existing content based on target audience |
WO2022179412A1 (en) * | 2021-02-26 | 2022-09-01 | 华为技术有限公司 | Recognition method and electronic device |
WO2022213761A1 (en) * | 2021-04-08 | 2022-10-13 | 腾讯科技(深圳)有限公司 | Image processing method and apparatus, electronic device, and storage medium |
CN115311542A (en) * | 2022-08-25 | 2022-11-08 | 杭州恒胜电子科技有限公司 | Target detection method, device, equipment and medium |
Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN106295566A (en) * | 2016-08-10 | 2017-01-04 | 北京小米移动软件有限公司 | Facial expression recognizing method and device |
CN107729835A (en) * | 2017-10-10 | 2018-02-23 | 浙江大学 | A kind of expression recognition method based on face key point region traditional characteristic and face global depth Fusion Features |
CN108268885A (en) * | 2017-01-03 | 2018-07-10 | 京东方科技集团股份有限公司 | Feature point detecting method, equipment and computer readable storage medium |
Non-Patent Citations (1)
Title |
---|
ZHIWEN SHAO et al.: "Deep Adaptive Attention for Joint Facial Action Unit Detection and Face Alignment", arXiv *
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN109241835A (en) | Image processing method and device, electronic equipment and storage medium | |
CN109829501A (en) | Image processing method and device, electronic equipment and storage medium | |
CN106339680B (en) | Face key independent positioning method and device | |
CN110084775A (en) | Image processing method and device, electronic equipment and storage medium | |
CN105631403B (en) | Face identification method and device | |
CN110210535A (en) | Neural network training method and device and image processing method and device | |
CN109871883A (en) | Neural network training method and device, electronic equipment and storage medium | |
CN109522910A (en) | Critical point detection method and device, electronic equipment and storage medium | |
CN110348537A (en) | Image processing method and device, electronic equipment and storage medium | |
CN106548468B (en) | The method of discrimination and device of image definition | |
CN105608425B (en) | The method and device of classification storage is carried out to photo | |
CN109166107A (en) | A kind of medical image cutting method and device, electronic equipment and storage medium | |
CN106469302A (en) | A kind of face skin quality detection method based on artificial neural network | |
CN109784255A (en) | Neural network training method and device and recognition methods and device | |
CN109614613A (en) | The descriptive statement localization method and device of image, electronic equipment and storage medium | |
CN106295515B (en) | Determine the method and device of the human face region in image | |
CN109816764A (en) | Image generating method and device, electronic equipment and storage medium | |
CN110503023A (en) | Biopsy method and device, electronic equipment and storage medium | |
CN110443280A (en) | Training method, device and the storage medium of image detection model | |
CN109544560A (en) | Image processing method and device, electronic equipment and storage medium | |
CN109919300A (en) | Neural network training method and device and image processing method and device | |
CN108921117A (en) | Image processing method and device, electronic equipment and storage medium | |
CN110458218A (en) | Image classification method and device, sorter network training method and device | |
CN108010060A (en) | Object detection method and device | |
CN110532956A (en) | Image processing method and device, electronic equipment and storage medium |
Legal Events
Date | Code | Title | Description
---|---|---|---
| PB01 | Publication | |
| SE01 | Entry into force of request for substantive examination | |
| RJ01 | Rejection of invention patent application after publication | Application publication date: 20190118 |