CN109858333A

CN109858333A - Image processing method, device, electronic equipment and computer-readable medium

Info

Publication number: CN109858333A
Application number: CN201811564701.2A
Authority: CN
Inventors: 邰颖; 曹玮剑; 葛彦昊; 汪铖杰; 李季檩; 黄飞跃
Original assignee: Tencent Technology Shenzhen Co Ltd
Current assignee: Tencent Technology Shenzhen Co Ltd
Priority date: 2018-12-20
Filing date: 2018-12-20
Publication date: 2019-06-07
Anticipated expiration: 2038-12-20
Also published as: CN109858333B

Abstract

This disclosure relates to an image processing method, apparatus, electronic device, and computer-readable medium. The method comprises: detecting a target object in an image and extracting a target area; inputting the target area into a feature point extraction model to generate a heat map of the target object, the heat map indicating the probability that each pixel in the target area is a feature point of the target object; determining multiple valid feature points in the heat map; and determining the target feature points of the target object in the image according to the multiple valid feature points. The image processing method, apparatus, electronic device, and computer-readable medium of this disclosure can locate facial key points in an image quickly and accurately.

Description

Image processing method, device, electronic equipment and computer-readable medium

Technical field

This disclosure relates to the field of computer information processing, and in particular to an image processing method, apparatus, electronic device, and computer-readable medium.

Background

Facial key point detection, also called facial key point localization or face alignment, refers to locating the key regions of a face in a given face image, including the eyebrows, eyes, nose, mouth, and facial contour. Facial key point detection is currently a focus of academic research. The technique builds on accurate detection of the face box: after the face box is detected, the eyes, nose, mouth, chin, and so on are precisely located. Facial key point detection is widely applied in fields such as face recognition, face tracking, face animation, and 3D face modeling.

In recent years, with the explosive growth of personal and online video images, a fully automatic, efficient, and robust facial key point localization method has become highly desirable. The facial key point detection methods in the prior art cannot meet the needs of current users in terms of timeliness and throughput.

Therefore, a new image processing method, apparatus, electronic device, and computer-readable medium are needed.

The above information disclosed in this Background section is only intended to enhance understanding of the background of the disclosure, and therefore may include information that does not constitute prior art known to a person of ordinary skill in the art.

Summary of the invention

In view of this, the disclosure provides an image processing method, apparatus, electronic device, and computer-readable medium that can locate facial key points in an image quickly and accurately.

Other features and advantages of the disclosure will become apparent from the following detailed description, or may be learned in part through practice of the disclosure.

According to one aspect of the disclosure, an image processing method is proposed. The method comprises: detecting a target object in an image and extracting a target area; inputting the target area into a feature point extraction model to generate a heat map of the target object, the heat map indicating the probability that each pixel in the target area is a feature point of the target object; determining multiple valid feature points in the heat map; and determining the target feature points of the target object in the image according to the multiple valid feature points.

In one embodiment, the method further comprises: training a stacked hourglass network model with training images and the accurate heat maps corresponding to the training images to generate the feature point extraction model.

According to another aspect of the disclosure, an image processing apparatus is proposed. The apparatus comprises: an extraction module for detecting a target object in an image and extracting a target area; a model computation module for inputting the target area into a feature point extraction model to generate a heat map of the target object, the heat map indicating the probability that each pixel in the target area is a feature point of the target object; a valid feature point module for determining multiple valid feature points in the heat map; and a feature point module for determining the target feature points of the target object in the image according to the multiple valid feature points.

In one embodiment, the apparatus further comprises: a model training module for training a stacked hourglass network model with training images and the accurate heat maps corresponding to the training images to generate the feature point extraction model.

According to another aspect of the disclosure, an electronic device is proposed. The electronic device comprises: one or more processors; and a storage device for storing one or more programs which, when executed by the one or more processors, cause the one or more processors to implement the method described above.

According to another aspect of the disclosure, a computer-readable medium is proposed, on which a computer program is stored. When executed by a processor, the program implements the method described above.

The image processing method, apparatus, electronic device, and computer-readable medium of the disclosure can locate facial key points in an image quickly and accurately.

It should be understood that the above general description and the following detailed description are merely exemplary and do not limit the disclosure.

Brief description of the drawings

The above and other objects, features, and advantages of the disclosure will become more apparent from the detailed description of example embodiments with reference to the accompanying drawings. The drawings described below are only some embodiments of the disclosure; a person of ordinary skill in the art can obtain other drawings from them without creative effort.

Fig. 1 is a system block diagram of an image processing method according to an exemplary embodiment.

Fig. 2 is a schematic diagram of an application scenario of an image processing method according to an exemplary embodiment.

Fig. 3 is a schematic diagram of an application scenario of an image processing method according to an exemplary embodiment.

Fig. 4 is a flowchart of an image processing method according to an exemplary embodiment.

Fig. 5 is a schematic diagram of an image processing method according to an exemplary embodiment.

Fig. 6 is a schematic diagram of an image processing method according to an exemplary embodiment.

Fig. 7 is a flowchart of an image processing method according to another exemplary embodiment.

Fig. 8 is a schematic diagram of an image processing method according to an exemplary embodiment.

Fig. 9 is a block diagram of an image processing apparatus according to an exemplary embodiment.

Fig. 10 is a block diagram of an image processing apparatus according to another exemplary embodiment.

Fig. 11 is a block diagram of an electronic device according to an exemplary embodiment.

Fig. 12 is a schematic diagram of a computer-readable storage medium according to an exemplary embodiment.

Detailed description

Example embodiments will now be described more fully with reference to the accompanying drawings. However, example embodiments can be implemented in many forms and should not be construed as limited to the embodiments set forth herein; rather, these embodiments are provided so that the disclosure will be thorough and complete, and will fully convey the concept of the example embodiments to those skilled in the art. The same reference numerals denote the same or similar parts in the drawings, so their repeated description is omitted.

In addition, the described features, structures, or characteristics may be combined in any suitable manner in one or more embodiments. In the following description, many specific details are provided to give a full understanding of the embodiments of the disclosure. However, those skilled in the art will appreciate that the technical solutions of the disclosure may be practiced without one or more of the specific details, or with other methods, components, devices, steps, and so on. In other cases, well-known methods, devices, implementations, or operations are not shown or described in detail to avoid obscuring aspects of the disclosure.

The block diagrams shown in the drawings are merely functional entities and do not necessarily correspond to physically separate entities. That is, these functional entities may be implemented in software, in one or more hardware modules or integrated circuits, or in different networks and/or processor devices and/or microcontroller devices.

The flowcharts shown in the drawings are merely illustrative. They need not include all contents and operations/steps, nor must they be executed in the described order. For example, some operations/steps may be decomposed, and some may be merged or partially merged, so the actual order of execution may change according to the actual situation.

Those skilled in the art will understand that the drawings are schematic diagrams of example embodiments, and that the modules or flows in the drawings are not necessarily required to implement the disclosure and therefore cannot be used to limit its scope of protection.

As shown in Fig. 1, the system architecture 100 may include terminal devices 101, 102, 103, a network 104, and a server 105. The network 104 is the medium that provides communication links between the terminal devices 101, 102, 103 and the server 105. The network 104 may include various connection types, such as wired or wireless communication links or fiber optic cables.

A user may use the terminal devices 101, 102, 103 to interact with the server 105 through the network 104 to receive or send messages and the like. Various communication client applications may be installed on the terminal devices 101, 102, 103, such as shopping applications, web browser applications, search applications, instant messaging tools, email clients, and social platform software.

The terminal devices 101, 102, 103 may be various electronic devices that have a display screen and support web browsing, including but not limited to smartphones, tablet computers, laptop computers, and desktop computers.

The server 105 may be a server that provides various services, for example a background server that supports a website handling pictures or videos captured by users with the terminal devices 101, 102, 103. The server 105 may analyze and otherwise process received data such as pictures or video images, and feed the processing result (for example, the facial feature points in a picture, or the facial feature points in each frame of a video image) back to the terminal device.

A user may take pictures or videos with the terminal devices 101, 102, 103. The terminal devices 101, 102, 103 may, for example, detect a target object in an image and extract a target area; input the target area into the feature point extraction model to generate the heat map of the target object, the heat map indicating the probability that each pixel in the target area is a feature point of the target object; determine multiple valid feature points in the heat map; and determine the target feature points of the target object in the image according to the multiple valid feature points. The terminal devices 101, 102, 103 may also, for example, use the feature points to further process the pictures or videos taken by the user, such as performing face recognition or real-time makeup processing.

The terminal devices 101, 102, 103 may also, for example, train a stacked hourglass network model with training images and the accurate heat maps corresponding to the training images to generate the feature point extraction model.

Alternatively, a user may take pictures or videos with the terminal devices 101, 102, 103, and the terminal devices 101, 102, 103 may, for example, forward the pictures or videos to the server 105. The server 105 may, for example, detect a target object in an image and extract a target area; input the target area into the feature point extraction model to generate the heat map of the target object, the heat map indicating the probability that each pixel in the target area is a feature point of the target object; determine multiple valid feature points in the heat map; and determine the target feature points of the target object in the image according to the multiple valid feature points. The server 105 may also, for example, use the feature points to further process the pictures or videos taken by the user, such as performing face recognition or real-time makeup processing. The server 105 may also, for example, return the feature points to the terminal devices 101, 102, 103, which may in turn use the feature points to further process the pictures or videos taken by the user.

The server 105 may also, for example, train a stacked hourglass network model with training images and the accurate heat maps corresponding to the training images to generate the feature point extraction model.

The server 105 may be a single physical server or may, for example, consist of multiple servers. It should be noted that the image processing method provided by the embodiments of the disclosure may be executed by the server 105 and/or the terminal devices 101, 102, 103; correspondingly, the image processing apparatus may be provided in the server 105 and/or the terminal devices 101, 102, 103. The requesting end through which the user inputs pictures or videos is generally located in the terminal devices 101, 102, 103.

The image processing method of the disclosure can be applied in various scenarios. The target area in an image is input into a feature point extraction model with a stacked hourglass network structure (Stacked Hourglass Network) to generate a heat map, and the feature points in the image are then determined from multiple valid feature points in the output heat map. In this way, the facial key points in the image can be located quickly and accurately.

According to the image processing method of the disclosure, the feature point extraction model with a stacked hourglass network structure can capture effective information of the image at different scales. In addition, by stacking multiple hourglass structures, the model estimates the heat map of the feature points in a face or human body image in a coarse-to-fine manner, and the coordinates of the feature points are then estimated from the feature map. As a result, the image processing method of the disclosure can accurately estimate the feature points in a face image.

Fig. 2 is a schematic diagram of an application scenario of an image processing method according to an exemplary embodiment. As shown in Fig. 2, the user takes a personal picture with the camera of the user terminal, and the terminal device determines the feature points in the image (for example, key points of the facial contour, eyes, nose, mouth, and other parts). The terminal device may, for example, display these feature points on the user's terminal device for the user to click, and then process the user's facial image according to the user's selection and the corresponding feature points. For example, according to the user's selection of lip feature points, a makeup effect may be applied to the lip region in the user's image.

The terminal device may also, for example, automatically process the user's image according to parameters preset by the user after obtaining the feature points in the picture, and then display the overall makeup effect on the terminal device.

Fig. 3 is a schematic diagram of an application scenario of an image processing method according to an exemplary embodiment. As shown in Fig. 3, the user captures a personal video image with the camera of the user terminal, and the terminal device transmits the video image to a remote server. The remote server determines the feature points in the video image (for example, key points of the facial contour, eyes, nose, mouth, and other parts). The remote server may then return these feature points to the user's terminal device so that the user can perform further operations.

When the terminal device transmits the video image to the remote server, it may also upload preset parameters along with it. The remote server then further processes the user's video image according to the preset parameters in combination with the feature points, for example by adding special effects to the video image, and returns the processed video to the terminal device so that the processed video image is displayed to the user.

It is worth noting that although the disclosure uses extracting feature points in a user's facial image to process that image as an illustrative example, the image processing method of the disclosure is also applicable to other image processing scenarios, and the disclosure is not limited in this respect.

The image processing method of the disclosure is described in detail below through specific embodiments:

Fig. 4 is a flowchart of an image processing method according to an exemplary embodiment. The image processing method 40 includes at least steps S402 to S408.

As shown in Fig. 4, in S402, a target object in an image is detected and a target area is extracted. The image may be, for example, a face image containing a face, and the target area may be, for example, a rectangular area.

In one embodiment, a rectangular box area image is extracted from the image by a face detection algorithm; the rectangular box area image is scaled to a predetermined size; and the pixels in the scaled rectangular box area are normalized to generate the target area.

When the rectangular box area image is extracted from the image by a face detection algorithm, the face detection algorithm may be applied to an image of arbitrary size to detect the face position and output a face box (rectangular box). Specifically, the position of the face may be detected by robust real-time face detection (Robust real-time face detection), and the rectangular facial area extracted. It is worth noting that when the face part is cropped according to the face box, the target area may be determined by keeping the center of the face box unchanged and enlarging its side length by a factor of 1.3. The factor 1.3 is an empirical value that yields a good face rectangle; different parameters may be set to obtain the rectangular box area under different usage scenarios and requirements.

When the rectangular box area image is scaled to the predetermined size, the cropped rectangular box area may be scaled to 256*256. The size 256*256 is likewise an empirical value that yields a good face rectangle; different sizes may be set under different usage scenarios and requirements.

When the pixels in the scaled rectangular box area are normalized to generate the target area, the pixel values of the image in the rectangular box area are normalized to the interval [0,1].
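The preprocessing in S402 can be illustrated with a minimal sketch. It assumes the face box is already available as `(x, y, w, h)` from some face detector (the detector itself is not shown), treats the enlarged box as a square whose side is 1.3 times the longer edge, replicates border pixels where the enlarged box leaves the image, and resizes by nearest-neighbor sampling. The function name `extract_target_area` and these choices are illustrative assumptions, not details taken from the disclosure.

```python
import numpy as np

def extract_target_area(image, box, scale=1.3, size=256):
    """Crop an enlarged square around a face box, resize it, and normalize it.

    image: H x W x C uint8 array; box: (x, y, w, h) from any face detector.
    """
    x, y, w, h = box
    cx, cy = x + w / 2.0, y + h / 2.0      # the box center stays unchanged
    side = max(w, h) * scale               # side length enlarged by 1.3 (assumed square)
    # Sample a size x size grid over the enlarged square (nearest neighbor).
    offsets = (np.arange(size) + 0.5) * side / size
    xs = cx - side / 2.0 + offsets
    ys = cy - side / 2.0 + offsets
    # Clip so that samples outside the image reuse the nearest border pixel.
    xi = np.clip(np.floor(xs).astype(int), 0, image.shape[1] - 1)
    yi = np.clip(np.floor(ys).astype(int), 0, image.shape[0] - 1)
    patch = image[yi[:, None], xi[None, :]]
    return patch.astype(np.float32) / 255.0  # pixel values in [0, 1]
```

In practice an interpolating resize from an image library would usually replace the nearest-neighbor sampling; it is used here only to keep the sketch dependency-free.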

In S404, the target area is input into the feature point extraction model to generate the heat map of the target object, the heat map indicating the probability that each pixel in the target area is a feature point of the target object. The feature point extraction model may be a stacked hourglass network model. The stacked hourglass network model is a network model used for human pose prediction, and its network structure can capture and integrate information at all scales of the image. The structure is generally called an hourglass based on a visualization of the pooling and subsequent upsampling steps used to obtain the final output of the network.

The hourglass structure in the feature point extraction model of the disclosure is a simple, minimal design that can capture all of these features and combine them to output pixel-wise predictions. In the disclosure, a single hourglass is extended by placing multiple hourglass modules together consecutively, end to end. Repeating bottom-up, top-down processing across scales in this way achieves repeated bidirectional inference, which greatly improves the model's performance in image processing.

Fig. 5 is a schematic diagram of an image processing method according to an exemplary embodiment. As shown in Fig. 5, the target area may be input into the first hourglass structure of the stacked hourglass network model, the output data of the first hourglass structure is input into the second hourglass structure, and its output is then input into the third hourglass structure, which outputs the heat map (Heatmap) with feature point labels. In the embodiment of Fig. 5, a stacked hourglass network model containing three hourglass structures is used only as an example to describe the computation of the network model; in practical application scenarios, a stacked hourglass network model containing multiple hourglass structures may be selected according to the actual situation, and the disclosure is not limited in this respect.

Fig. 6 is a schematic diagram of an image processing method according to an exemplary embodiment. Fig. 6 shows a detailed schematic diagram of an hourglass structure in the stacked hourglass network model. As shown in Fig. 6, the hourglass is designed as follows: convolutional and max pooling layers process the features down to a very low resolution. At each max pooling step, the network branches off and applies more convolutions at the original pre-pooled resolution. After reaching the lowest resolution, the network begins top-down upsampling and combining features across scales. To bring together information across two adjacent resolutions, the lower resolution is upsampled by nearest neighbor, and the two sets of features are then added element-wise.
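The data flow of one hourglass (pool, branch, recurse, upsample, add) and of three stacked hourglasses can be sketched as follows. This is only a structural illustration on a single-channel feature map: the learned convolution and residual blocks are replaced by a hypothetical identity placeholder `conv_placeholder`, and the depth and stack count are assumed values.

```python
import numpy as np

def conv_placeholder(x):
    # Stand-in for the learned convolution / residual blocks (hypothetical).
    return x

def max_pool2(x):
    # 2x2 max pooling with stride 2 on an (H, W) feature map.
    h, w = x.shape
    return x.reshape(h // 2, 2, w // 2, 2).max(axis=(1, 3))

def upsample2(x):
    # Nearest-neighbor upsampling by a factor of 2.
    return x.repeat(2, axis=0).repeat(2, axis=1)

def hourglass(x, depth):
    # H and W must be divisible by 2 ** depth.
    skip = conv_placeholder(x)             # branch kept at the pre-pooled resolution
    low = conv_placeholder(max_pool2(x))   # bottom-up path to a lower resolution
    if depth > 1:
        low = hourglass(low, depth - 1)    # recurse toward the lowest resolution
    low = conv_placeholder(low)
    return skip + upsample2(low)           # top-down path: element-wise addition

def stacked_hourglass(x, num_stacks=3, depth=4):
    # Feed the output of each hourglass into the next one, end to end.
    for _ in range(num_stacks):
        x = hourglass(x, depth)
    return x
```

Because the output of each hourglass has the same resolution as its input, any number of hourglasses can be chained in this way.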

In conjunction with Fig. 5 and Fig. 6, this is further described as follows: the target area is input into the first hourglass structure of the stacked hourglass network model. For computational purposes, the first hourglass structure compresses the target area, thereby processing the features of the target area image down to a lower resolution. The output data of the first hourglass structure is then input into the second hourglass structure, which compresses the target area again for computational purposes. The topology of the hourglass is symmetric, so for every layer on the way down there is a corresponding layer on the way up. After several rounds of compression yield the features of the target area, the consecutive hourglass structures expand the compressed target area over the same number of rounds as the preceding compression to obtain an image of the predetermined resolution, which is output. The output of the stacked hourglass network model is multiple heat maps; for each given heat map, the stacked hourglass network model predicts the probability that a feature point exists at each pixel. Each hourglass structure contains multiple residual blocks (residual block) for carrying out the specific network computation.

In one embodiment, the target area is input into the stacked hourglass network model, and forward computation is performed based on mini-batch gradient descent (mini-batch) to generate the heat map. The idea of mini-batch gradient descent is to use a subset of the samples for each parameter update, which yields a compromise between computation time and noise. Of course, other gradient descent methods such as batch gradient descent (Batch Gradient Descent) or stochastic gradient descent (Stochastic Gradient Descent) may also be selected for the forward computation of this step.

In S406, multiple valid feature points in the heat map are determined. A valid feature point is a point whose pixel value is not 0. Specifically, for example: the extreme point in the heat map is determined; and at least four valid feature points are determined at random within a preset range of the extreme point.
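A minimal sketch of S406 follows. It assumes the preset range is a square window of half-width `radius` pixels around the extreme point and that the heat map is a 2-D array; the function name, the window shape, and the default values are illustrative assumptions.

```python
import numpy as np

def sample_valid_points(heatmap, radius=2, count=4, rng=None):
    """Return `count` random (x, y, value) points near the heat map maximum."""
    rng = np.random.default_rng() if rng is None else rng
    h, w = heatmap.shape
    # Extreme point: the pixel with the maximum response.
    py, px = np.unravel_index(np.argmax(heatmap), heatmap.shape)
    # Preset range: a square window around the extreme point (assumed shape).
    y0, y1 = max(py - radius, 0), min(py + radius + 1, h)
    x0, x1 = max(px - radius, 0), min(px + radius + 1, w)
    ys, xs = np.mgrid[y0:y1, x0:x1]
    ys, xs = ys.ravel(), xs.ravel()
    valid = heatmap[ys, xs] > 0            # valid points have a non-zero pixel value
    ys, xs = ys[valid], xs[valid]
    # Raises ValueError if fewer than `count` valid points are available.
    pick = rng.choice(len(xs), size=count, replace=False)
    return [(int(xs[i]), int(ys[i]), float(heatmap[ys[i], xs[i]])) for i in pick]
```

Sampling without replacement keeps the points distinct, which matters when their positions and values are later used to solve for the Gaussian center.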

In one embodiment, the extreme point is the point of maximum response in the heat map. The heat map is generated according to a two-dimensional Gaussian function.

The center of this Gaussian function is the labeled face key point coordinate.

It is worth noting that the face key point coordinate is not rounded to an integer; the fractional part is retained to avoid loss of precision. The feature point coordinates can then be estimated accurately from the recovered heat map by solving the heat map's quadratic equation in two variables.

Based on the feature point coordinates, several valid feature points are determined at random in the obtained heat map for subsequent computation.

In S408, the target feature point of the target object in the image is determined according to the multiple valid feature points. For example: the positions and pixel values of the multiple valid feature points are substituted into the heat map equation, and the heat map equation is solved to determine the target feature point of the target object in the image, where the heat map equation is a two-dimensional Gaussian equation.

In one embodiment, the two-dimensional Gaussian function that generates the heat map is a quadratic equation in two variables with respect to the key point. In theory, the equation can be solved once the positions and pixel values of 4 valid feature points are found, which recovers the accurate key point coordinate. Therefore, for each heat map obtained from the stacked hourglass network model, the point of maximum Gaussian response is determined in the heat map, 4 points are randomly sampled near that point, the positions and pixel values of these 4 points are substituted into the two-dimensional Gaussian function, and the equation is solved to obtain the accurate point coordinate. Finally, the point coordinate is scaled back to the original image size.
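One way to see why 4 points suffice is sketched below. It assumes an isotropic Gaussian of the form v = A * exp(-((x - x0)^2 + (y - y0)^2) / (2 * sigma^2)); this form is an assumption, since the exact formula is not reproduced in this text. Taking the logarithm gives ln v = a*(x^2 + y^2) + b*x + c*y + d, which is linear in the four unknowns (a, b, c, d), and the center follows as x0 = -b / (2a), y0 = -c / (2a). The function names and the mapping back to the original image are illustrative.

```python
import numpy as np

def solve_gaussian_center(points):
    """points: iterable of (x, y, value) with value > 0, at least 4 entries."""
    pts = np.asarray(points, dtype=np.float64)
    x, y, v = pts[:, 0], pts[:, 1], pts[:, 2]
    mx, my = x.mean(), y.mean()
    x, y = x - mx, y - my                  # shift coordinates for better conditioning
    # ln v = a*(x^2 + y^2) + b*x + c*y + d is linear in (a, b, c, d).
    A = np.stack([x ** 2 + y ** 2, x, y, np.ones_like(x)], axis=1)
    coef, _, rank, _ = np.linalg.lstsq(A, np.log(v), rcond=None)
    if rank < 4:
        # Collinear points, or points on a common circle, do not determine the center.
        raise ValueError("degenerate sample: choose different points")
    a, b, c, _ = coef
    return mx - b / (2 * a), my - c / (2 * a)

def to_original(pt, box_origin, box_side, size=256):
    # Map a coordinate in the size x size target area back to the source image.
    return (box_origin[0] + pt[0] * box_side / size,
            box_origin[1] + pt[1] * box_side / size)
```

Note that four points lying on one circle (for example, the corners of a square) or on one line make the system singular, so a practical sampler would redraw in that case; with more than 4 points the same call gives a least-squares estimate.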

According to the image processing method of the disclosure, the feature point extraction model generated from the stacked hourglass network in the field of deep learning can capture effective information of the image at different scales. In addition, by stacking multiple hourglass structures, the model estimates the heat map of the feature points in a face or human body image in a coarse-to-fine manner, and the coordinates of the feature points are then estimated from the feature map. The image processing method of the disclosure can therefore accurately estimate the feature points in a face image.

According to the image processing method of the disclosure, generating the heat map from the labeled feature point coordinates retains the fractional part of the coordinates, so no precision is lost. Estimating the feature point coordinates from the recovered heat map has a closed-form solution, so the exact position of a point can be estimated in a single step. This improves precision while introducing almost no additional computational complexity, and thus greatly improves the efficiency of the final facial feature point recognition.

Video data annotation requires a large amount of manual effort. The image processing method of the disclosure can provide high-precision, semi-automatic generation of video data annotations, effectively reducing manual effort. In practical scenarios, the method can be applied to corresponding functions in short video products, such as makeup and dynamic effects.

It should be clearly understood that the disclosure describes how to make and use particular examples, but the principles of the disclosure are not limited to any details of these examples. Rather, based on the teachings of the disclosure, these principles can be applied to many other embodiments.

Fig. 7 is a flowchart of an image processing method according to another exemplary embodiment. The image processing method 70 shown in Fig. 7 describes in detail how the "feature point extraction model" is built, as mentioned in S404 of the flow shown in Fig. 4, "inputting the target area into the feature point extraction model to generate the heat map of the target object". Specifically, for example: a stacked hourglass network model is trained with training images and the accurate heat maps corresponding to the training images to generate the feature point extraction model.

As shown in Fig. 7, in S702, a target area containing the target object is extracted from the training image.

In one embodiment, rectangular box area images are extracted from multiple images respectively by a face detection algorithm; the multiple rectangular box area images are scaled to a predetermined size; and the pixels in the multiple scaled rectangular box areas are normalized to generate multiple target areas.

When the rectangular box area image is extracted from an image by a face detection algorithm, the position of the face may specifically be detected by robust real-time face detection (Robust real-time face detection), and the rectangular facial area extracted.

In S704, the feature points in the target area are determined. For example, multiple accurate coordinates corresponding to multiple feature points in the target area are determined, and multiple heat maps are generated from the multiple accurate coordinates.

In one embodiment, the multiple feature points in the multiple target areas may, for example, be determined by manual annotation. In another embodiment, the multiple feature points in the multiple target areas may also, for example, be determined by other algorithms for obtaining feature points.

As described above, the heat map is computed with a two-dimensional Gaussian function whose center is the labeled face key point coordinate.
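A commonly used form of such a two-dimensional Gaussian heat map is sketched below. It is an assumption for illustration only: the symbols $(x_0, y_0)$ for the labeled key point coordinate and $\sigma$ for the standard deviation, as well as the unit peak amplitude, are chosen here and are not taken from the disclosure.

```latex
H(x, y) = \exp\!\left(-\frac{(x - x_0)^2 + (y - y_0)^2}{2\sigma^2}\right)
```

Under this form, $H$ equals 1 only at a pixel that coincides exactly with $(x_0, y_0)$ and decays with the distance from that point.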

It is noted that in the embodiments of the present disclosure, the face key point coordinate of acquisitionNo It will do it to round up and become integer operation, fractional part can be retained, avoid loss of significance.It is generated by multiple accurate coordinates multiple Temperature figure.

In S706, accurate temperature figure is generated by the characteristic point.It is raw respectively by the mark to each characteristic point At corresponding multiple accurate temperature figures.

Fig. 8 is a schematic diagram of an image processing method according to an exemplary embodiment. As shown in Fig. 8, the existing stacked hourglass network suffers from loss of precision when processing high-definition face images. Specifically, in the original stacked hourglass network, the conversion from labeled training feature point coordinates to an accurate heat map rounds each coordinate and keeps only the integer part. For example, the point (29.55, 77.38) in Fig. 8 is rounded to (30, 77) when the heat map label is generated; the existing stacked hourglass network then takes (30, 77) as the center and generates the heat map from the two-dimensional Gaussian function.

Fig. 8 also shows an example image of a heat map output by the stacked hourglass network model. In the stacked hourglass network model, the color intensity of each pixel on the heat map indicates the probability that the pixel is a feature point of the target object; the stronger the color, the greater the probability that the pixel is a feature point.

This way of generating the heat map discards the fractional part and therefore loses precision. In particular, the output of the original stacked hourglass network is generally smaller than the original input image (for example, a 64*64 output for a 256*256 input), and scaling the output back up to the input size further amplifies the error.
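The amplification can be made concrete with a short calculation. The sketch below is illustrative only: it treats the example point (29.55, 77.38) as a coordinate at heat map resolution, which is an assumption, and the helper name `upscaled_rounding_error` is hypothetical.

```python
def upscaled_rounding_error(coord, scale):
    """Per-axis error, in input-image pixels, caused by rounding a heat map
    coordinate to an integer before scaling it back up by `scale`."""
    return tuple(abs(round(c) - c) * scale for c in coord)

# A 64*64 output scaled back to a 256*256 input gives a scale factor of 4.
scale = 256 // 64
error = upscaled_rounding_error((29.55, 77.38), scale)
```

A rounding error of less than half a pixel at heat map resolution therefore becomes an error of up to two pixels in the input image.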

To solve this problem of the existing method, the embodiments of the present disclosure retain the fractional part when converting the labeled feature point coordinates to the heat map, and generate the heat map directly centered on the point (29.55, 77.38).
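The difference between the two labeling schemes can be sketched as follows. This is a minimal illustration, assuming a unit-amplitude Gaussian, a standard deviation of 1, and a 128*128 grid; none of these values comes from the disclosure, and `gaussian_heatmap` is a hypothetical helper.

```python
import numpy as np

def gaussian_heatmap(size, center, sigma=1.0):
    """Build a size x size heat map with a Gaussian centered on center=(x, y).

    The center may be fractional; it is never rounded inside this function.
    """
    ys, xs = np.mgrid[0:size, 0:size]
    cx, cy = center
    return np.exp(-((xs - cx) ** 2 + (ys - cy) ** 2) / (2.0 * sigma ** 2))

label = (29.55, 77.38)                                   # example point from Fig. 8
rounded = gaussian_heatmap(128, (round(label[0]), round(label[1])))
exact = gaussian_heatmap(128, label)
```

With the rounded center, the pixel (30, 77) receives the full peak value and the neighboring values are symmetric around it, so the fractional offset is no longer recoverable. With the fractional center, the values around (30, 77) are asymmetric and still encode the position (29.55, 77.38).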

In S708, the stacked hourglass network model is trained with the target regions and the accurate heat maps to generate the feature point extraction model. For example, multiple target regions may be input into the stacked hourglass network structure to generate multiple initial heat maps; the initial heat maps are compared with the accurate heat maps to determine multiple error parameters; and the parameters of the stacked hourglass network structure are updated based on the error parameters and an optimization algorithm to obtain the feature point extraction model.

The hourglass structure includes convolution layers and max pooling layers, which process the features down to a very low resolution. At each max pooling step, the network branches and applies more convolutions at the original, pre-pooling resolution. After reaching the lowest resolution, the network begins top-down upsampling and combines features across scales. To bring together the information of two adjacent resolutions, the lower resolution is upsampled by nearest-neighbor interpolation, and the two sets of features are then added element-wise.
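The pooling, branching, upsampling, and element-wise addition described above can be sketched recursively. This is a structural sketch only: the learned convolutions are replaced by a placeholder `conv` callable (identity by default), the input is a single-channel 2D array whose sides are divisible by 2 to the power of `depth`, and all function names are hypothetical.

```python
import numpy as np

def max_pool_2x2(x):
    """2x2 max pooling with stride 2 on a 2D array with even side lengths."""
    h, w = x.shape
    return x.reshape(h // 2, 2, w // 2, 2).max(axis=(1, 3))

def upsample_nearest_2x(x):
    """Nearest-neighbor upsampling by a factor of 2 in both directions."""
    return x.repeat(2, axis=0).repeat(2, axis=1)

def hourglass(x, depth, conv=lambda t: t):
    """One hourglass: pool down `depth` times, then upsample back and add."""
    skip = conv(x)                        # branch kept at the pre-pooling resolution
    low = conv(max_pool_2x2(x))           # path that continues at half resolution
    if depth > 1:
        low = hourglass(low, depth - 1, conv)
    return skip + upsample_nearest_2x(conv(low))   # element-wise addition

x = np.arange(64, dtype=float).reshape(8, 8)
out = hourglass(x, depth=2)
```

Because every pooling step has a matching upsampling step, the output has the same resolution as the input, which is what allows several hourglasses to be stacked one after another.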

In one embodiment, multiple target regions may each be input into the first hourglass structure of the stacked hourglass network model. The first hourglass structure compresses the target region as needed for the computation, processing the features of the target region image to a lower resolution. The output of the first hourglass structure is then input into the second hourglass structure, which compresses the target region again as needed for the computation.

Because the topology of each hourglass in the stacked hourglass network architecture is symmetric, every layer on the way down has a corresponding layer on the way up. After several rounds of compression yield the feature points of the target region, the following hourglass structures expand the previously compressed target region over a number of rounds symmetric to the number of compression rounds, so as to obtain and output an image of the predetermined resolution. During training of the stacked hourglass network architecture, the output of the stacked hourglass network model is multiple initial heat maps; for each initial heat map, the stacked hourglass network model predicts the probability that a feature point is present at each pixel.

In one embodiment, comparing the multiple initial heat maps with the multiple accurate heat maps to determine multiple error parameters includes: comparing the initial heat maps with the accurate heat maps by a Euclidean loss function to determine the error parameters. For example, during training, forward computation may be performed on mini-batches to obtain the restored heat maps, and the Euclidean loss function (Euclidean Loss) is applied to the initial heat maps and the accurate heat maps to obtain the heat map error.

Once the initial heat maps have been compared with the accurate heat maps to determine the error parameters, the model parameters are updated by the Adam optimization algorithm based on the heat map error, and this is iterated until the model converges. During training, the entire stacked hourglass network model is trained end to end.
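The training loop described above (mini-batch forward computation, Euclidean loss, Adam update) can be sketched on a toy problem. This is not the stacked hourglass network: a single linear layer stands in for the model, random arrays stand in for target regions and accurate heat maps, and the hyperparameters are the commonly used Adam defaults apart from the learning rate. All names are hypothetical.

```python
import numpy as np

def euclidean_loss(pred, target):
    """Sum of squared differences, averaged over the batch and halved."""
    return np.sum((pred - target) ** 2) / (2.0 * len(pred))

def adam_step(param, grad, state, lr=0.01, beta1=0.9, beta2=0.999, eps=1e-8):
    """One Adam update; `state` holds the step count and both moment estimates."""
    state["t"] += 1
    state["m"] = beta1 * state["m"] + (1 - beta1) * grad
    state["v"] = beta2 * state["v"] + (1 - beta2) * grad ** 2
    m_hat = state["m"] / (1 - beta1 ** state["t"])   # bias-corrected first moment
    v_hat = state["v"] / (1 - beta2 ** state["t"])   # bias-corrected second moment
    return param - lr * m_hat / (np.sqrt(v_hat) + eps)

rng = np.random.default_rng(0)
inputs = rng.normal(size=(64, 8))             # toy stand-in for target regions
targets = inputs @ rng.normal(size=(8, 16))   # toy stand-in for accurate heat maps
weights = np.zeros((8, 16))                   # toy stand-in for model parameters
state = {"t": 0, "m": np.zeros_like(weights), "v": np.zeros_like(weights)}

loss_before = euclidean_loss(inputs @ weights, targets)
for step in range(200):
    batch = rng.choice(len(inputs), size=16, replace=False)   # draw a mini-batch
    pred = inputs[batch] @ weights                            # forward computation
    grad = inputs[batch].T @ (pred - targets[batch]) / len(batch)
    weights = adam_step(weights, grad, state)
loss_after = euclidean_loss(inputs @ weights, targets)
```

In a real setting the gradient would come from backpropagation through the whole network, which is what end-to-end training refers to.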

It is worth noting that the algorithms described above are merely illustrative. When the stacked hourglass network structure is actually trained, different optimization algorithms may be chosen as needed to assist training; the present disclosure is not limited in this respect.

A loss function is a function that maps an event (an element of a sample space) to a real number representing the economic cost or opportunity cost associated with that event. For example, when the initial heat maps are compared with the accurate heat maps, various types of loss function may be selected to obtain the error, such as a logarithmic loss function, a squared loss function, an exponential loss function, or other forms of loss function.

An optimization algorithm can speed up convergence during model training and thereby reach a better, smaller loss function value. For example, the model parameters may be updated based on the heat map error by a first-order optimization algorithm, a second-order optimization algorithm, or gradient descent.

According to the image processing method of the present disclosure, improving the precision of the conversion between feature point coordinates and heat maps improves the registration performance of the network on high-definition face images. Compared with the original stacked hourglass model, better face registration performance is obtained, so the method is better suited to subsequent application scenarios such as beauty makeup and face slimming.

Those skilled in the art will appreciate that all or part of the steps of the above embodiments may be implemented as a computer program executed by a CPU. When the computer program is executed by the CPU, the functions defined by the above method provided by the present disclosure are performed. The program may be stored in a computer-readable storage medium, which may be a read-only memory, a magnetic disk, an optical disc, or the like.

In addition, it should be noted that the above drawings are only schematic illustrations of the processing included in the method according to exemplary embodiments of the present disclosure and are not intended to be limiting. It is easy to understand that the processing shown in the drawings does not indicate or limit the chronological order of these processes. It is also easy to understand that these processes may be executed, for example, synchronously or asynchronously in multiple modules.

The following are apparatus embodiments of the present disclosure, which can be used to carry out the method embodiments of the present disclosure. For details not disclosed in the apparatus embodiments, refer to the method embodiments of the present disclosure.

Fig. 9 is a block diagram of an image processing apparatus according to an exemplary embodiment. The image processing apparatus 90 includes an extraction module 902, a model computation module 904, a valid feature point module 906, and a feature point module 908.

The extraction module 902 is configured to extract a target region from an image. Specifically, for example, a rectangular region image is extracted from the image by a face detection algorithm; the rectangular region image is scaled to a predetermined size; and the pixels in the scaled rectangular region are normalized to generate the target region.

The model computation module 904 is configured to input the target region into the feature point extraction model to generate the heat map of the target object; each pixel on the heat map indicates by its color intensity the probability that it is a feature point of the target object. The feature point extraction model may have a stacked hourglass network architecture. The stacked hourglass network model is a network model applied to human pose prediction, and its network structure can capture and integrate information at all scales of the image. The structure is called an hourglass because of the visual appearance of the pooling and subsequent upsampling steps used to obtain the final output of the network. The target region may be input into the first hourglass structure of the stacked hourglass network model, the output of the first hourglass structure is then input into the second hourglass structure, and so on, until the last hourglass structure outputs the heat map with feature point labels.

The valid feature point module 906 is configured to determine multiple valid feature points in the heat map, where a valid feature point is a point whose pixel value is not 0. Specifically, for example, the extreme point in the heat map is determined, and multiple valid feature points are randomly determined within a preset range of the extreme point.

The feature point module 908 is configured to determine the target feature point of the target object in the image according to the multiple valid feature points. For example, the feature point of the image is determined from the positions and pixel values of the multiple valid feature points based on a two-dimensional Gaussian function.
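One way to recover a sub-pixel feature point from the valid feature points is sketched below. The solving strategy is an assumption: taking the logarithm of an isotropic two-dimensional Gaussian gives an expression that is linear in four unknowns, so a least-squares fit over a handful of sampled points yields the center. The disclosure does not specify the solver, and the function names, the sampling radius, and the number of sampled points are all hypothetical.

```python
import numpy as np

def pick_valid_points(heatmap, radius, count, rng):
    """Randomly pick nonzero pixels within `radius` of the heat map maximum."""
    py, px = np.unravel_index(np.argmax(heatmap), heatmap.shape)
    h, w = heatmap.shape
    ys, xs = np.mgrid[max(py - radius, 0):min(py + radius + 1, h),
                      max(px - radius, 0):min(px + radius + 1, w)]
    ys, xs = ys.ravel(), xs.ravel()
    keep = heatmap[ys, xs] > 0                       # valid points are nonzero
    ys, xs = ys[keep], xs[keep]
    idx = rng.choice(len(xs), size=min(count, len(xs)), replace=False)
    return xs[idx], ys[idx], heatmap[ys[idx], xs[idx]]

def solve_gaussian_center(xs, ys, values):
    """Fit log(value) = a + b*u + c*v + d*(u^2 + v^2) and return the center.

    u and v are coordinates relative to the mean of the sampled points, which
    keeps the least-squares problem well conditioned. At least four points
    that do not share a common line or circle are needed.
    """
    ox, oy = xs.mean(), ys.mean()
    u, v = xs - ox, ys - oy
    design = np.column_stack([np.ones_like(u), u, v, u ** 2 + v ** 2])
    a, b, c, d = np.linalg.lstsq(design, np.log(values), rcond=None)[0]
    return ox - b / (2.0 * d), oy - c / (2.0 * d)
```

Applied to a noise-free Gaussian heat map, the fit recovers the fractional center rather than the integer position of the maximum. On a real network output the values are noisy, so sampling more points generally gives a more stable estimate.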

According to the image processing apparatus of the present disclosure, the stacked hourglass network from the field of deep learning can capture effective information of the image at different scales. In addition, by stacking multiple hourglass structures, the network estimates, in a coarse-to-fine manner, the heat map (Heatmap) of the feature points in a face or human body image, and then estimates the coordinates of the feature points from the feature map. Based on the image processing method of the present disclosure, the feature points in a face image can be estimated accurately.

Fig. 10 is a block diagram of an image processing apparatus according to another exemplary embodiment. As shown in Fig. 10, the image processing apparatus 1000 includes, in addition to the components of the image processing apparatus 90, a model training module 1002.

The model training module 1002 is configured to train a stacked hourglass network model with training images and the accurate heat maps corresponding to the training images to generate the feature point extraction model. Specifically, for example, multiple target regions are extracted from multiple images; multiple feature points in the target regions are determined; multiple accurate heat maps are generated from the feature points; and the stacked hourglass network model is trained with the target regions and the accurate heat maps to generate the feature point extraction model.

According to the image processing apparatus of the present disclosure, improving the precision of the conversion between feature point coordinates and heat maps improves the registration performance of the network on high-definition face images. Compared with the original stacked hourglass model, better face registration performance is obtained, so the apparatus is better suited to subsequent application scenarios such as beauty makeup and face slimming.

An electronic device 1100 according to this embodiment of the present disclosure is described below with reference to Fig. 11. The electronic device 1100 shown in Fig. 11 is only an example and should not impose any limitation on the functions or scope of use of the embodiments of the present disclosure.

As shown in Fig. 11, the electronic device 1100 takes the form of a general-purpose computing device. The components of the electronic device 1100 may include, but are not limited to: at least one processing unit 1110, at least one storage unit 1120, a bus 1130 connecting different system components (including the storage unit 1120 and the processing unit 1110), a display unit 1140, and the like.

The storage unit stores program code that can be executed by the processing unit 1110, so that the processing unit 1110 performs the steps according to various exemplary embodiments of the present disclosure described in the method sections of this specification. For example, the processing unit 1110 may perform the steps shown in Fig. 4 and Fig. 7.

The storage unit 1120 may include readable media in the form of volatile storage units, such as a random access memory (RAM) unit 11201 and/or a cache memory unit 11202, and may further include a read-only memory (ROM) unit 11203.

The storage unit 1120 may also include a program/utility 11204 having a set of (at least one) program modules 11205. Such program modules 11205 include, but are not limited to: an operating system, one or more application programs, other program modules, and program data. Each of these examples, or some combination of them, may include an implementation of a network environment.

The bus 1130 may represent one or more of several types of bus structures, including a storage unit bus or storage unit controller, a peripheral bus, an accelerated graphics port, a processing unit, or a local bus using any of a variety of bus structures.

The electronic device 1100 may also communicate with one or more external devices 1100' (such as a keyboard, a pointing device, or a Bluetooth device), with one or more devices that enable a user to interact with the electronic device 1100, and/or with any device (such as a router or a modem) that enables the electronic device 1100 to communicate with one or more other computing devices. Such communication may take place through an input/output (I/O) interface 1150. The electronic device 1100 may also communicate with one or more networks (such as a local area network (LAN), a wide area network (WAN), and/or a public network such as the Internet) through a network adapter 1160. The network adapter 1160 may communicate with the other modules of the electronic device 1100 through the bus 1130. It should be understood that, although not shown in the drawings, other hardware and/or software modules may be used in conjunction with the electronic device 1100, including but not limited to: microcode, device drivers, redundant processing units, external disk drive arrays, RAID systems, tape drives, and data backup storage systems.

From the above description of the embodiments, those skilled in the art will readily understand that the example embodiments described herein may be implemented in software, or in software combined with the necessary hardware. Therefore, the technical solution according to the embodiments of the present disclosure may be embodied in the form of a software product, which may be stored on a non-volatile storage medium (such as a CD-ROM, a USB flash drive, or a removable hard disk) or on a network, and which includes instructions that cause a computing device (such as a personal computer, a server, or a network device) to perform the above method according to the embodiments of the present disclosure.

Fig. 12 schematically shows a computer-readable storage medium in an exemplary embodiment of the present disclosure.

Referring to Fig. 12, a program product 1200 for implementing the above method according to an embodiment of the present disclosure is described. It may take the form of a portable compact disc read-only memory (CD-ROM) that includes program code, and may be run on a terminal device such as a personal computer. However, the program product of the present disclosure is not limited thereto. In this document, a readable storage medium may be any tangible medium that contains or stores a program, where the program can be used by or in connection with an instruction execution system, apparatus, or device.

The program product may use any combination of one or more readable media. A readable medium may be a readable signal medium or a readable storage medium. A readable storage medium may be, for example but not limited to, an electrical, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the above. More specific examples (a non-exhaustive list) of readable storage media include: an electrical connection with one or more wires, a portable disk, a hard disk, a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the above.

A computer-readable signal medium may include a data signal propagated in baseband or as part of a carrier wave, in which readable program code is carried. Such a propagated data signal may take many forms, including but not limited to an electromagnetic signal, an optical signal, or any suitable combination of the above. A readable signal medium may also be any readable medium other than a readable storage medium, and such a readable medium can send, propagate, or transmit a program for use by or in connection with an instruction execution system, apparatus, or device. The program code contained on a readable medium may be transmitted over any suitable medium, including but not limited to wireless, wired, optical cable, RF, or any suitable combination of the above.

Program code for carrying out the operations of the present disclosure may be written in any combination of one or more programming languages, including object-oriented programming languages such as Java and C++, as well as conventional procedural programming languages such as the "C" language or similar programming languages. The program code may execute entirely on the user computing device, partly on the user device, as a stand-alone software package, partly on the user computing device and partly on a remote computing device, or entirely on a remote computing device or server. Where a remote computing device is involved, the remote computing device may be connected to the user computing device through any kind of network, including a local area network (LAN) or a wide area network (WAN), or may be connected to an external computing device (for example, through the Internet using an Internet service provider).

The above computer-readable medium carries one or more programs which, when executed by a device, cause the device to implement the following functions: detecting a target object in an image and extracting a target region; inputting the target region into a feature point extraction model to generate a heat map of the target object, the heat map being used to show the probability that a pixel in the target region is a feature point of the target object; determining multiple valid feature points in the heat map; and determining a target feature point of the target object in the image according to the multiple valid feature points.

Those skilled in the art will appreciate that the above modules may be distributed in the apparatus as described in the embodiments, or may be changed accordingly and located in one or more apparatuses different from that of the present embodiment. The modules of the above embodiments may be merged into one module or further split into multiple sub-modules.

From the above description of the embodiments, those skilled in the art will readily understand that the example embodiments described herein may be implemented in software, or in software combined with the necessary hardware. Therefore, the technical solution according to the embodiments of the present disclosure may be embodied in the form of a software product, which may be stored on a non-volatile storage medium (such as a CD-ROM, a USB flash drive, or a removable hard disk) or on a network, and which includes instructions that cause a computing device (such as a personal computer, a server, a mobile terminal, or a network device) to perform the method according to the embodiments of the present disclosure.

Exemplary embodiments of the present disclosure have been specifically shown and described above. It should be understood that the present disclosure is not limited to the detailed structures, arrangements, or implementation methods described herein; rather, the present disclosure is intended to cover various modifications and equivalent arrangements included within the spirit and scope of the appended claims.

Claims

1. An image processing method, characterized by comprising:

detecting a target object in an image and extracting a target region;

inputting the target region into a feature point extraction model to generate a heat map of the target object, the heat map being used to show the probability that a pixel in the target region is a feature point of the target object;

determining multiple valid feature points in the heat map; and

determining a target feature point of the target object in the image according to the multiple valid feature points.

2. The method of claim 1, characterized in that determining multiple valid feature points in the heat map comprises:

determining an extreme point in the heat map; and

randomly determining the multiple valid feature points within a preset range of the extreme point.

3. The method of claim 1, characterized in that determining the target feature point of the target object in the image according to the multiple valid feature points comprises:

substituting the positions and pixel values of the multiple valid feature points into a heat map equation, and solving the heat map equation to determine the target feature point of the target object in the image;

wherein the heat map equation is a two-dimensional Gaussian equation.

4. The method of claim 1, characterized in that detecting the target object in the image and extracting the target region comprises:

detecting the target object in the image by a face detection algorithm;

extracting a detection region image according to the detection result;

scaling the detection region image to a predetermined size; and

normalizing the pixels in the scaled detection region image to generate the target region.

5. The method of claim 1, characterized in that inputting the target region into the feature point extraction model to generate the heat map of the target object comprises:

inputting the target region into the feature point extraction model and performing forward computation based on mini-batch gradient descent to generate the heat map of the target object.

6. The method of claim 1, characterized by further comprising:

training a stacked hourglass network model with a training image and an accurate heat map corresponding to the training image to generate the feature point extraction model.

7. The method of claim 6, characterized in that training the stacked hourglass network model with the training image and the accurate heat map corresponding to the training image to generate the feature point extraction model comprises:

extracting, from the training image, a target region containing a target object;

determining feature points in the target region;

generating the accurate heat map from the feature points; and

training the stacked hourglass network model with the target region and the accurate heat map to generate the feature point extraction model.

8. The method of claim 7, characterized in that generating the accurate heat map from the feature points comprises:

determining the accurate coordinates of the feature points in the target region; and

generating the accurate heat map from the accurate coordinates.

9. The method of claim 7, characterized in that training the stacked hourglass network model with the target region and the accurate heat map to generate the feature point extraction model comprises:

inputting the target region into the stacked hourglass network model to generate an initial heat map;

comparing the initial heat map with the accurate heat map to determine an error parameter; and

updating the parameters of the stacked hourglass network model based on the error parameter and an optimization algorithm to obtain the feature point extraction model.

10. The method of claim 9, characterized in that inputting the target region into the stacked hourglass network model to generate the initial heat map comprises:

inputting the target region into the stacked hourglass network model and performing forward computation based on mini-batch gradient descent to generate the initial heat map.

11. The method of claim 9, characterized in that comparing the initial heat map with the accurate heat map to determine the error parameter comprises:

comparing the initial heat map with the accurate heat map by a Euclidean loss function to determine the error parameter.

12. An image processing apparatus, characterized by comprising:

an extraction module configured to detect a target object in an image and extract a target region;

a model computation module configured to input the target region into a feature point extraction model to generate a heat map of the target object, the heat map being used to show the probability that a pixel in the target region is a feature point of the target object;

a valid feature point module configured to determine multiple valid feature points in the heat map; and

a feature point module configured to determine a target feature point of the target object in the image according to the multiple valid feature points.

13. The apparatus of claim 12, characterized by further comprising:

a model training module configured to train a stacked hourglass network model with a training image and an accurate heat map corresponding to the training image to generate the feature point extraction model.

14. An electronic device, characterized by comprising:

one or more processors; and

a storage device configured to store one or more programs;

wherein the one or more programs, when executed by the one or more processors, cause the one or more processors to implement the method of any one of claims 1-11.

15. A computer-readable medium on which a computer program is stored, characterized in that the program, when executed by a processor, implements the method of any one of claims 1-11.