CN110414428A - Method for generating a face attribute information recognition model - Google Patents
Method for generating a face attribute information recognition model
- Publication number
- CN110414428A CN110414428A CN201910686308.9A CN201910686308A CN110414428A CN 110414428 A CN110414428 A CN 110414428A CN 201910686308 A CN201910686308 A CN 201910686308A CN 110414428 A CN110414428 A CN 110414428A
- Authority
- CN
- China
- Prior art keywords
- segmentation
- network
- training
- mask images
- attribute information
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/24—Classification techniques
- G06F18/241—Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
- G06N3/084—Backpropagation, e.g. using gradient descent
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/20—Image preprocessing
- G06V10/26—Segmentation of patterns in the image field; Cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; Detection of occlusion
- G06V10/267—Segmentation of patterns in the image field; Cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; Detection of occlusion by performing operations on regions, e.g. growing, shrinking or watersheds
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V40/00—Recognition of biometric, human-related or animal-related patterns in image or video data
- G06V40/10—Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
- G06V40/16—Human faces, e.g. facial parts, sketches or expressions
- G06V40/161—Detection; Localisation; Normalisation
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V40/00—Recognition of biometric, human-related or animal-related patterns in image or video data
- G06V40/10—Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
- G06V40/16—Human faces, e.g. facial parts, sketches or expressions
- G06V40/168—Feature extraction; Face representation
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V40/00—Recognition of biometric, human-related or animal-related patterns in image or video data
- G06V40/10—Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
- G06V40/16—Human faces, e.g. facial parts, sketches or expressions
- G06V40/172—Classification, e.g. identification
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V40/00—Recognition of biometric, human-related or animal-related patterns in image or video data
- G06V40/10—Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
- G06V40/16—Human faces, e.g. facial parts, sketches or expressions
- G06V40/179—Human faces, e.g. facial parts, sketches or expressions metadata assisted face recognition
Abstract
The invention discloses a method for generating a face attribute information recognition model, comprising: generating a mask image, corresponding to an original image, that is annotated with multiple items of face attribute information, together with multiple attribute labels characterizing the face attribute information; inputting the original image into a pre-trained segmentation network for processing to obtain a predicted mask image, and training the segmentation network based on the segmentation loss value between the generated mask image and the predicted mask image to obtain a first segmentation network; generating feature images that each identify at least one attribute feature; inputting each feature image into a corresponding pre-trained classification network for processing to obtain predicted attribute labels, and training the classification networks based on the classification loss value between the annotated attribute labels and the predicted attribute labels to obtain multiple first classification networks; and, based on the segmentation loss value and the classification loss values, jointly training the first segmentation network and the multiple first classification networks, and coupling them to generate the face attribute information recognition model.
Description
Technical field
The present invention relates to the field of deep learning technology, and more particularly to a method for generating a face attribute information recognition model, a face attribute information recognition method, a mobile terminal, and a storage medium.
Background art
Face attribute information recognition detects a face image and obtains information about multiple attributes in that image, such as age, gender, and expression. Face attribute recognition plays an important role in customer analysis and virtual avatars. Recognizing face attributes such as eyelids, glasses, and beards, however, is problematic. For example, eyelids are classified into single eyelids, inner double eyelids, double eyelids, and so on; describing glasses requires considering whether lenses are present, along with lens size and color, and whether a frame is present, along with its shape and thickness; describing a beard requires considering not only its position but also its length, density, and so on. The key difficulty in such recognition tasks is how to distinguish, given a label, the different classes under that same label. Moreover, multiple labels for the same attribute frequently lead to imbalanced data. The position of an attribute region within the image is also uncertain and is easily affected by the face and by other attributes: eyelid attribute recognition, for example, is interfered with by false eyelashes, eye shadow, false double eyelids, and other background regions. For eyelid attribute recognition the eye region must inevitably be cropped out, and for beard attribute recognition the mouth region must inevitably be cropped out, so the recognition accuracy of an individual attribute is easily affected by these factors.
Existing face attribute recognition methods generally perform classification using traditional feature extraction combined with algorithms such as support vector machines; such methods often suffer from low classification accuracy and a heavy computational load.
Therefore, a face attribute recognition method is needed that can predict multiple face attributes with a single neural network and improve the accuracy and efficiency of face attribute recognition.
Summary of the invention
To this end, the present invention provides a method for generating a face attribute information recognition model and a face attribute information recognition method, in an effort to solve, or at least alleviate, at least one of the problems above.
According to one aspect of the invention, a method for generating a face attribute information recognition model is provided, the method being suitable for execution in a mobile terminal. First, a mask image corresponding to an original image and annotated with multiple items of face attribute information is generated, together with multiple attribute labels characterizing the face attribute information. The original image is then input into a pre-trained segmentation network for processing to obtain a predicted mask image, and the segmentation network is trained based on the segmentation loss value between the generated mask image and the predicted mask image to obtain a first segmentation network. Feature images, each identifying at least one attribute feature, are generated. Then each feature image is input into a corresponding pre-trained classification network for processing to obtain predicted attribute labels, and the classification networks are trained based on the classification loss value between the annotated attribute labels and the predicted attribute labels to obtain multiple first classification networks. Finally, based on the segmentation loss value and the classification loss values, the first segmentation network and the multiple first classification networks are trained jointly, and the trained first segmentation network and multiple first classification networks are coupled to generate the face attribute information recognition model.
Optionally, in the above method, the mask image is generated based on semantic segmentation and is displayed with different colors to characterize different face attributes.
Optionally, in the above method, the segmentation network comprises multiple lightweight convolution modules, and the lightweight convolution modules use depthwise separable convolutions.
Optionally, in the above method, the segmentation loss value is calculated based on the generated mask image and the predicted mask image; based on the segmentation loss value, the parameters of the segmentation network are updated by back-propagation, and training ends when a first predetermined condition is met, yielding the first segmentation network.
Optionally, in the above method, the first predetermined condition is that the segmentation loss value no longer decreases.
Optionally, in the above method, the segmentation loss value is calculated based on the following formula:

H(p, q) = -∑_{x=1}^{n} p(x) log q(x)

where H(p, q) is the segmentation loss value, p is the generated mask image, q is the predicted mask image, p(x) is the pixel-value distribution of the generated mask image, q(x) is the pixel-value distribution of the predicted mask image, x denotes a pixel in the image, and n is the number of pixels.
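The cross-entropy above can be sketched in a few lines (a minimal illustration; the flattened pixel lists `p` and `q` and their values are invented for the example, and a small epsilon guards the logarithm):

```python
import math

def segmentation_loss(p, q, eps=1e-12):
    """Pixel-wise cross-entropy H(p, q) between the generated (ground-truth)
    mask distribution p and the predicted mask distribution q."""
    return -sum(px * math.log(qx + eps) for px, qx in zip(p, q))

# Toy example: 4 pixels, one-hot ground truth vs. a soft prediction.
p = [1.0, 0.0, 1.0, 0.0]
q = [0.9, 0.1, 0.8, 0.2]
loss = segmentation_loss(p, q)
```

Only the pixels where the ground-truth mask is non-zero contribute, so the loss here reduces to -(log 0.9 + log 0.8) ≈ 0.329.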
Optionally, in the above method, the original image is input into the pre-trained segmentation network for processing to obtain at least one feature channel map; the mask image predicted by the first segmentation network and the at least one feature channel map are combined by element-wise (dot) multiplication to obtain at least one feature image.
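The dot-multiplication step can be sketched as follows (a minimal illustration with an invented 2x2 mask and feature channel; real maps would be at the network's feature resolution):

```python
def mask_feature(mask, channel):
    """Element-wise product of a predicted mask and one feature channel map,
    keeping only the responses inside the attribute region."""
    return [[m * c for m, c in zip(mrow, crow)]
            for mrow, crow in zip(mask, channel)]

# Hypothetical 2x2 example: the mask zeroes out pixels outside the attribute.
mask = [[1.0, 0.0],
        [0.0, 1.0]]
channel = [[0.5, 0.7],
           [0.2, 0.9]]
feature_image = mask_feature(mask, channel)
```

The effect is that a classification network downstream sees only the responses inside the attribute region, which is how the segmentation output suppresses background interference.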
Optionally, in the above method, the classification loss value is calculated based on the annotated attribute labels and the predicted attribute labels; based on the classification loss value, the parameters of the classification networks are updated by back-propagation, and training ends when a second predetermined condition is met, yielding the multiple first classification networks.
Optionally, in the above method, the second predetermined condition is that the classification loss value no longer decreases, or the difference between the loss values of the classification function computed in two successive iterations is less than a first predetermined threshold, or the number of iterations reaches a first predetermined count.
Optionally, in the above method, the classification loss value is calculated by the following formula:

l(y, z) = -log( e^{z_y} / ∑_{j=1}^{m} e^{z_j} )

where l(y, z) is the classification loss value, y is the annotated attribute label, z is the predicted attribute label vector, z_j denotes the score of the j-th predicted attribute label, z_y denotes the score of the annotated attribute label, and m denotes the number of attribute labels.
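The softmax cross-entropy above can be sketched as follows (the score values and the label index are invented for the example):

```python
import math

def classification_loss(z, y):
    """Softmax cross-entropy: negative log-probability that the
    predicted scores z assign to the annotated label index y."""
    denom = sum(math.exp(zj) for zj in z)
    return -math.log(math.exp(z[y]) / denom)

# Toy scores for m = 3 labels; the annotated label is index 0.
z = [2.0, 1.0, 0.0]
loss = classification_loss(z, y=0)
```

Equivalently, l(y, z) = -z_y + log ∑_j e^{z_j}, which is the numerically preferable form in a real implementation.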
Optionally, in the above method, based on a weighted sum of the segmentation loss value and the classification loss values, the parameters of the first segmentation network and the multiple first classification networks are updated by back-propagation, and training ends when a third predetermined condition is met, where the third predetermined condition is that the weighted sum no longer decreases, or the difference between the weighted sums computed in two successive iterations is less than a second predetermined threshold, or the number of iterations reaches a second predetermined count.
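The weighted sum used in the joint training stage might look like the following sketch (the weight values are invented; the patent does not specify them):

```python
def joint_loss(seg_loss, cls_losses, seg_weight=1.0, cls_weights=None):
    """Weighted sum of the segmentation loss and the per-attribute
    classification losses used for the joint training stage."""
    if cls_weights is None:
        cls_weights = [1.0] * len(cls_losses)
    return seg_weight * seg_loss + sum(w * l for w, l in zip(cls_weights, cls_losses))

# One segmentation loss plus two per-attribute classification losses.
total = joint_loss(0.3, [0.5, 0.2], seg_weight=2.0, cls_weights=[1.0, 0.5])
```

Back-propagating through this single scalar updates the segmentation network and all classification networks in one pass, which is what couples them into one model.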
Optionally, in the above method, the face attributes include eyelids, glasses, beard, hat, and hair. The attribute labels include one or more of the following: eyelid labels, glasses labels, beard labels, hat labels, and hair labels.
Optionally, in the above method, the eyelid labels include single eyelid, inner double eyelid, and double eyelid; the glasses labels include the presence, size, and color of the lenses, and the presence, shape, size, and thickness of the frame; the beard labels include the length, type, and density of the beard; the hat labels include the style and color of the hat; and the hair labels include the length, color, and style of the hair.
Optionally, before the step of generating the mask image, corresponding to the original image, that is annotated with multiple items of face attribute information, a face-region crop is performed on the original image.
Optionally, in the above method, the position of the face and the feature-point coordinates are obtained based on a feature-point detection model; the face is straightened based on the face position and the feature-point coordinates and then cropped to obtain face images of a consistent size.
According to another aspect of the invention, a face attribute information recognition method is provided, suitable for execution in a mobile terminal. First, a face image to be recognized is acquired. Then the face image is input into the face attribute information recognition model for processing, where the face attribute information recognition model comprises a segmentation network and classification networks coupled to each other: the face image is input into the segmentation network for processing to obtain the mask image of multiple face attributes and the feature channel maps; the mask image and the feature channel maps are combined by element-wise multiplication to obtain per-attribute region maps. Finally, each attribute region map is input into the corresponding classification network for processing to obtain the information of each face attribute in the face image.
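The inference pipeline described above can be sketched with stand-in networks (all names, the stub segmentation function, and the threshold rule are invented for illustration; trained segmentation and classification networks would take their place):

```python
def recognize_attributes(face_image, segment, classifiers):
    """Sketch of the inference pipeline: segment the face, mask out each
    attribute region, then classify each region.  `segment` returns a
    (mask, channel) pair per attribute; both callables are stand-ins."""
    results = {}
    for name, clf in classifiers.items():
        mask, channel = segment(face_image, name)
        region = [m * c for m, c in zip(mask, channel)]  # element-wise product
        results[name] = clf(region)
    return results

# Stub networks standing in for the trained segmentation/classification nets.
def fake_segment(img, name):
    return [1, 0, 1], img  # mask selecting pixels 0 and 2

classifiers = {
    "eyelid": lambda region: "double eyelid" if sum(region) > 1 else "single eyelid"
}
out = recognize_attributes([0.9, 0.4, 0.8], fake_segment, classifiers)
```

The point of the structure is that every classification network receives only its own masked region, so one forward pass through the shared segmentation network serves all attributes.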
Optionally, in the above method, the face attributes include eyelids, glasses, beard, hat, and hair; the face attribute information includes one or more of the following: single eyelid, inner double eyelid, or double eyelid; and/or the size and color of the lenses and the presence, shape, size, and thickness of the frame; and/or the length, type, and density of the beard; and/or the style and color of the hat; and/or the length, color, and style of the hair.
Optionally, in the above method, a face-region crop is performed on the acquired face image to obtain face images of a consistent size.
According to yet another aspect of the invention, a mobile terminal is provided, comprising one or more processors, a memory, and one or more programs, where the one or more programs are stored in the memory and configured to be executed by the one or more processors, and the one or more programs include instructions for performing the above methods.
According to yet another aspect of the invention, a computer-readable storage medium storing one or more programs is provided. The one or more programs include instructions that, when executed by a mobile terminal, cause the mobile terminal to perform the above methods.
According to the solution of the present invention, performing semantic segmentation on the face image effectively obtains the semantic information of each face attribute and improves the accuracy of face attribute information recognition; performing image segmentation with lightweight convolutions meets a mobile terminal's requirements on speed and computational load and improves the efficiency of face attribute information recognition.
Brief description of the drawings
To accomplish the foregoing and related ends, certain illustrative aspects are described herein in conjunction with the following description and the drawings. These aspects indicate the various ways in which the principles disclosed herein may be practiced, and all such aspects and their equivalents are intended to fall within the scope of the claimed subject matter. The above and other objects, features, and advantages of the present disclosure will become more apparent from the following detailed description read in conjunction with the drawings. Throughout the disclosure, the same reference numerals generally denote the same components or elements.
Fig. 1 shows a structural diagram of a mobile terminal 100 according to an embodiment of the invention;
Fig. 2 shows a schematic flowchart of a method 200 for generating a face attribute information recognition model according to an embodiment of the invention;
Fig. 3 shows a schematic diagram of a mask image according to an embodiment of the invention;
Fig. 4 shows a schematic diagram of the training flow that generates the face attribute information recognition model according to an embodiment of the invention;
Fig. 5 shows a schematic flowchart of a face attribute information recognition method 500 according to an embodiment of the invention.
Detailed description
Exemplary embodiments of the present disclosure are described in more detail below with reference to the drawings. Although the drawings show exemplary embodiments of the disclosure, it should be understood that the disclosure may be implemented in various forms and should not be limited by the embodiments set forth here. Rather, these embodiments are provided so that the present invention will be more thoroughly understood and the scope of the disclosure can be fully conveyed to those skilled in the art.
Face attribute recognition identifies information such as the local features of a face (eyelids, hair, beard, etc.), gender, age, expression, and accessories. To improve the accuracy and efficiency of face attribute recognition, the present solution provides a face attribute information recognition method based on semantic segmentation: semantic segmentation effectively obtains the semantic information in the face image, reduces the impact of overfitting and data imbalance on recognition accuracy, and improves the accuracy of face attribute information recognition.
Fig. 1 shows a structural diagram of a mobile terminal 100 according to an embodiment of the invention. The mobile terminal 100 may include a memory interface 102, one or more data processors, image processors and/or central processing units 104, a display screen (not shown in Fig. 1), and a peripheral interface 106.
The memory interface 102, the one or more processors 104, and/or the peripheral interface 106 may be discrete components or may be integrated in one or more integrated circuits. In the mobile terminal 100, the various elements may be coupled through one or more communication buses or signal lines. Sensors, devices, and subsystems may be coupled to the peripheral interface 106 to help realize a variety of functions.
For example, a motion sensor 110, a light sensor 112, and a range sensor 114 may be coupled to the peripheral interface 106 to facilitate functions such as orientation, illumination, and ranging. Other sensors 116, such as a positioning system (e.g., a GPS receiver), a temperature sensor, a biometric sensor, or other sensing devices, may likewise be connected to the peripheral interface 106 to help implement the related functions.
A camera subsystem 120 and an optical sensor 122 may be used to facilitate camera functions such as recording photographs and video clips, where the optical sensor may be, for example, a charge-coupled device (CCD) or complementary metal-oxide-semiconductor (CMOS) optical sensor. Communication functions may be facilitated by one or more wireless communication subsystems 124, where a wireless communication subsystem may include a radio-frequency receiver and transmitter and/or an optical (e.g., infrared) receiver and transmitter. The particular design and implementation of the wireless communication subsystem 124 may depend on the one or more communication networks supported by the mobile terminal 100. For example, the mobile terminal 100 may include a communication subsystem 124 designed to support LTE, 3G, GSM, GPRS, EDGE, Wi-Fi or WiMax, and Bluetooth™ networks.
An audio subsystem 126 may be coupled with a speaker 128 and a microphone 130 to help implement voice-enabled functions such as speech recognition, speech reproduction, digital recording, and telephony. An I/O subsystem 140 may include a touch-screen controller 142 and/or one or more other input controllers 144. The touch-screen controller 142 may be coupled to a touch screen 146. For example, the touch screen 146 and the touch-screen controller 142 may use any of a variety of touch-sensing technologies to detect contact and movement or pauses made with them, where the sensing technologies include, but are not limited to, capacitive, resistive, infrared, and surface acoustic wave technologies. The one or more other input controllers 144 may be coupled to other input/control devices 148, such as one or more buttons, rocker switches, thumbwheels, infrared ports, USB ports, and/or pointing devices such as a stylus. The one or more buttons (not shown) may include up/down buttons for controlling the volume of the speaker 128 and/or the microphone 130.
The memory interface 102 may be coupled with a memory 150. The memory 150 may include high-speed random access memory and/or non-volatile memory, such as one or more magnetic disk storage devices, one or more optical storage devices, and/or flash memory (e.g., NAND, NOR). The memory 150 may store an operating system 172, for example Android, iOS, or Windows Phone. The operating system 172 may include instructions for handling basic system services and performing hardware-dependent tasks. The memory 150 may also store one or more programs 174. When the mobile device runs, the operating system 172 is loaded from the memory 150 and executed by the processor 104. The programs 174, when run, are likewise loaded from the memory 150 and executed by the processor 104. The programs 174 run on top of the operating system and use the interfaces provided by the operating system and the underlying hardware to realize the various functions desired by the user, such as instant messaging, web browsing, and picture management. A program 174 may be provided independently of the operating system or may be bundled with it. Furthermore, when a program 174 is installed in the mobile terminal 100, a driver module may also be added to the operating system. A program 174 may be arranged to have its relevant instructions executed on the operating system by the one or more processors 104. In some embodiments, the mobile terminal 100 is configured to perform the method 200 for generating a face attribute information recognition model and the face attribute information recognition method 500 according to the present invention, and the one or more programs 174 of the mobile terminal 100 include instructions for performing the method 200 and the face attribute information recognition method 500 according to the present invention.
Image semantic segmentation automatically partitions subject regions out of an image and labels each pixel of the image with the class it belongs to; different colors can represent different classes, dividing the image into semantically meaningful parts. Since a face has multiple attributes, such as glasses, eyelids, and beard, and each attribute has multiple labels under different classification schemes (for example, a beard can be given multi-label classifications according to length, straight or curly, and shape), face attribute recognition is a multi-task learning process.
Fig. 2 shows a schematic flowchart of a method 200 for generating a face attribute information recognition model according to an embodiment of the invention. The method may be executed in the mobile terminal 100. The face attribute information recognition model may include a segmentation network and multiple parallel classification networks. Each classification network corresponds to one face attribute and is a multi-label classification network with multiple outputs, each output corresponding to one attribute label.
As shown in Fig. 2, the method 200 starts at step S210: generating a mask image, corresponding to the original image, that is annotated with multiple items of face attribute information, together with multiple attribute labels characterizing the face attribute information.
Here, generating a mask image means occluding all or part of the image to be processed with a selected image so as to control the region or process of image processing; the specific image or object used for covering is called a mask. The mask image can be generated based on semantic segmentation and displayed with different colors to characterize different face attributes.
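The color coding could be sketched as follows (the specific attribute-to-color mapping is invented; the patent only states that different colors characterize different attributes):

```python
# Hypothetical color coding for the multi-attribute mask (values invented).
ATTRIBUTE_COLORS = {
    "background": (0, 0, 0),
    "eyelid": (255, 0, 0),
    "glasses": (0, 255, 0),
    "beard": (0, 0, 255),
}

def colorize(label_map):
    """Turn a per-pixel attribute label map into an RGB mask image."""
    return [[ATTRIBUTE_COLORS[lbl] for lbl in row] for row in label_map]

mask = colorize([["background", "eyelid"],
                 ["beard", "glasses"]])
```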
The attribute labels are two-level labels: the first level is the different dimensions (such as size and shape), and the second level is the classification label within each dimension. For example, the labels of glasses are: {type: with glasses, size: medium, shape: rectangular}; the labels of the other attributes are similar. According to one embodiment of the invention, the face attributes include eyelids, glasses, beard, hat, hair, and so on. Table 1 shows attribute labels according to an embodiment of the invention. As shown in Table 1, glasses can first be classified by presence; when glasses are present, they can be annotated by the presence, size, and color of the lenses, and the presence, shape, thickness, and material of the frame, etc. For example, frames can be divided into framed and frameless; framed glasses can be further subdivided into thick-framed and thin-framed, and further classified by color: thin black frame, thick red frame, and so on. The hair attribute can be classified by hair length, color, hairstyle, etc.; the hat attribute by the color and style of the hat; the beard attribute by the type, position, length, and density of the beard; and the eyelid attribute, by eyelid type, into single eyelid, double eyelid, and inner double eyelid. The attribute labels in Table 1 are merely exemplary and can be annotated according to actual needs. It should be noted that the attribute labels must not overlap: for example, when classifying the beard attribute, stubble classified by length and a mustache classified by shape cannot be placed at the same classification level. Values within the same label can be distinguished numerically; for example, when classifying the eyelid attribute, double eyelid and inner double eyelid are distinguished by the distance between the eyelids. The same label should cover all cases as completely as possible, and lumping all remaining minor classes into a single catch-all class should be avoided as far as possible. For cases where a label class does not apply, such as frameless glasses for which the frame thickness need not be judged, a separate class can be added in another column rather than leaving the label empty.
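The two-level label structure from the glasses example can be sketched as a nested mapping (the flattening helper is an invented illustration, not part of the patent):

```python
# Hypothetical two-level attribute labels: first level is the dimension,
# second level the class within that dimension (values from the example).
glasses_labels = {"type": "with glasses", "size": "medium", "shape": "rectangular"}

def flatten_labels(attr, labels):
    """Flatten the two-level labels into 'attribute/dimension=class' strings,
    e.g. for feeding a multi-label classification head per dimension."""
    return [f"{attr}/{dim}={cls}" for dim, cls in labels.items()]

flat = flatten_labels("glasses", glasses_labels)
```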
Table 1: Attribute labels according to an embodiment of the present invention
When building the training dataset, some classes may have many pictures while others have very few, causing the training dataset to be imbalanced. To address the data imbalance problem, sampling or weighting methods can be used; more samples can also be generated from the existing samples by data synthesis/augmentation methods, for example by sampling a generative adversarial network to generate a batch of new data, while ensuring that the proportion of generated data is not too large, so that the network does not learn features that do not actually exist. Approximately equal weighted loss values across classes can also be achieved by setting reasonable weights. The above methods of resolving dataset imbalance are merely exemplary; different methods can be used according to the actual situation to solve the training-data imbalance problem.
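The weighting approach can be sketched with inverse-frequency class weights (a common choice; the label counts are invented and the patent does not prescribe this particular formula):

```python
def class_weights(counts):
    """Inverse-frequency weights that make the expected weighted loss
    contribution of each class approximately equal."""
    total = sum(counts)
    return [total / (len(counts) * c) for c in counts]

# Hypothetical label counts for a 3-class attribute (values invented).
weights = class_weights([600, 300, 100])
```

With these weights, each class contributes count x weight = total / num_classes to the expected loss, so the rare class no longer gets drowned out.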
Fig. 3 shows a schematic diagram of a mask image according to an embodiment of the invention. As shown in Fig. 3, for convenience of annotation, all attributes are annotated in the same mask image, and in the generated mask image different colors indicate different face attributes.
According to one embodiment of the invention, a face-region crop can be performed on the original image before generating the mask image. For example, first, the position of the face and the feature-point coordinates are obtained based on a feature-point detection model. Then the face is straightened based on the face position and the feature-point coordinates and cropped to obtain face images of a consistent size. For example, the coordinates of the two eyes can be used as the standard for straightening, and the face region can be cropped from the original image after straightening: 1. obtain the angle of the line formed by the two eyes and rotate the image in the opposite direction of that angle; a key step after rotation is to obtain the rotated feature-point coordinates according to the rotation formula; 2. after straightening, estimate the height of the face to be cropped from the vertical distances between the eyes, nose, and mouth; 3. after the height is determined, determine the width of the face from a set aspect ratio, etc.; 4. calculate the specific coordinates of the left and right sides from the ratio of the horizontal distances between the nose and the two eyes and the rotation angle to a frontal face; finally, scale the face region to 128x128. The above face-region cropping method is merely exemplary; the cropping can also be done with image-processing software, and the present solution is not limited in this respect.
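Step 1 and its rotation formula can be sketched as follows (the eye coordinates are invented; for simplicity the rotation is about the origin, whereas a real implementation would rotate about the image center):

```python
import math

def eye_alignment_angle(left_eye, right_eye):
    """Angle (degrees) of the line through the two eye centers; rotating the
    image by the opposite angle straightens the face."""
    dx = right_eye[0] - left_eye[0]
    dy = right_eye[1] - left_eye[1]
    return math.degrees(math.atan2(dy, dx))

def rotate_point(pt, angle_deg, center=(0.0, 0.0)):
    """Rotation formula used to update a feature point after the image is rotated."""
    a = math.radians(angle_deg)
    x, y = pt[0] - center[0], pt[1] - center[1]
    return (center[0] + x * math.cos(a) - y * math.sin(a),
            center[1] + x * math.sin(a) + y * math.cos(a))

# Hypothetical eye coordinates: right eye 10 px right of and 10 px below the left.
angle = eye_alignment_angle((0, 0), (10, 10))
nose = rotate_point((10, 10), -angle)  # rotate a feature point back by -angle
```

After rotating by the opposite angle, the two eyes lie on a horizontal line, and every annotated feature point is mapped through the same rotation so the subsequent height/width estimation uses consistent coordinates.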
Next, in step S220, the original image is input into the pre-trained segmentation network for processing to obtain a predicted mask image, and the segmentation network is trained based on the segmentation loss value between the generated mask image and the predicted mask image to obtain the first segmentation network.
Before the method 200 is executed, the network structure of the segmentation network according to an embodiment of the invention can be constructed in advance. Table 2 shows part of the network parameters of the pre-trained segmentation network according to an embodiment of the invention. As shown in Table 2, kh and kw denote the height and width of the convolution kernel, padding is the padding value, stride is the stride, expand denotes the number of expanded channels, and n denotes the number of repeated structures. In the embodiment of the invention the input image is a 3-channel 128x128 image, and the input images are color face images of a consistent size. To run on a mobile device, the chosen network structure must have a small computational load, so this solution uses lightweight convolution modules and is further reduced on that basis. The segmentation network comprises multiple lightweight convolution modules (MobileNetV2 blocks); a convolution processing layer comprises a convolution layer, a batch-normalization layer, and an activation layer. As shown in Table 2, conv denotes a convolution layer, BN a batch-normalization layer, and PReLU an activation function with parameters, which may also be of any kind such as ReLU, tanh, sigmoid, or LeakyReLU, without limitation here. The lightweight MobileNetV2 block uses depthwise separable convolutions: in a depthwise separable convolution structure each channel corresponds to its own filter, rather than all channels sharing the same filter, which separates channels and regions, reduces the parameter computation, and yields features of better quality. Taking Block1 as an example, it is formed of four depthwise separable convolution structures; Block2 and Block3 are similar.
Table 2: Partial network parameters of the pre-trained segmentation network
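The parameter saving that motivates the depthwise separable convolution can be checked with simple arithmetic. The channel and kernel sizes below are illustrative only (the patent's Table 2 values are not reproduced here):

```python
def conv_params(c_in, c_out, kh, kw):
    """Weight count of a standard convolution (bias omitted)."""
    return c_in * c_out * kh * kw

def depthwise_separable_params(c_in, c_out, kh, kw):
    """Depthwise separable convolution: one kh x kw filter per input
    channel (channel separation), followed by a 1x1 pointwise
    convolution that mixes channels."""
    depthwise = c_in * kh * kw        # one filter per channel
    pointwise = c_in * c_out          # 1x1 channel mixing
    return depthwise + pointwise

# Illustrative sizes: 3x3 kernel, 64 -> 128 channels.
standard = conv_params(64, 128, 3, 3)                   # 73728 weights
separable = depthwise_separable_params(64, 128, 3, 3)   # 8768 weights
```

For these sizes the separable form needs roughly 8x fewer weights, which is why such blocks suit mobile deployment.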
According to one embodiment of the present invention, the segmentation loss value can be calculated based on the generated mask images and the predicted mask images. Then, based on the segmentation loss value, the parameters of the segmentation network are updated by backward iteration; when the first predetermined condition is met, training ends and the first segmentation network is obtained. The first predetermined condition may be that the segmentation loss value no longer decreases. During training, the generated mask images need to be decomposed into k masks, each mask representing the response of one attribute. If some attribute is absent, the corresponding mask is all 0.
The segmentation loss may use pixel-level cross-entropy; the segmentation loss value can be calculated based on the following formula:

H(p, q) = -(1/n) Σ_x p(x) log q(x)

where H(p, q) is the segmentation loss value, p is the generated mask image, q is the predicted mask image, p(x) is the pixel value distribution of the generated mask image, q(x) is the pixel value distribution of the predicted mask image, x denotes a pixel in the image, and n is the number of pixels.
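The pixel-level cross-entropy above can be sketched directly in numpy. This is a minimal illustration under the reading that the loss is averaged over the n pixels; the function name is our own:

```python
import numpy as np

def pixel_cross_entropy(p, q, eps=1e-12):
    """Mean pixel-wise cross-entropy H(p, q) = -(1/n) * sum_x p(x) * log q(x).

    p: generated (ground-truth) mask, values in {0, 1}
    q: predicted mask, values in (0, 1)
    eps clips q away from 0 so the logarithm stays finite.
    """
    q = np.clip(q, eps, 1.0)
    return float(-np.mean(p * np.log(q)))
```

The loss shrinks as the predicted mask q approaches the generated mask p, which is what drives the backward-iteration update of the segmentation network.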
Then, in step S230, characteristic images for identifying at least one attribute feature are generated. According to one embodiment of the present invention, the original image can be input into the pre-trained segmentation network for processing to obtain at least one feature channel map. Then, the mask images predicted by the first segmentation network are dot-multiplied with the at least one feature channel map to obtain at least one characteristic image.
In some embodiments, a feature map output layer can be added after the mask output layer of the segmentation network to generate the characteristic images. As shown in Table 2, the segmentation network includes a segmentation mask output layer Mask and a feature map output layer featuremap. In the segmentation network, conv_4 outputs feature channel maps of size 128x16x16, and conv_5 outputs k predicted mask maps, each of which represents the semantic segmentation map of one attribute. The k mask maps are respectively dot-multiplied onto the feature channel maps, so the feature channel maps become k characteristic images of size 128x16x16; each 128x16x16 feature map represents one attribute feature. Because the input size is 128x128 and the output is 16*16, there is 8x downsampling between pixels; therefore, when more than 20 pixels in an 8x8 cell of the label have values, the label value of this cell is set to 1, otherwise it is set to 0.
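The 8x label downsampling rule and the mask/feature dot product can be sketched as follows. The function names and the greater-than-20 threshold interpretation follow our reading of the text; only the shapes (128x128 input, 16x16 output, 128 channels, k masks) come from the description:

```python
import numpy as np

def downsample_label(mask, cell=8, threshold=20):
    """Map a 128x128 binary label mask to a 16x16 grid: a cell becomes 1
    when more than `threshold` of its cell x cell pixels are set."""
    h, w = mask.shape
    grid = mask.reshape(h // cell, cell, w // cell, cell).sum(axis=(1, 3))
    return (grid > threshold).astype(np.uint8)

def apply_masks(features, masks):
    """Dot-multiply every predicted mask with the shared feature channels.
    features: (C, H, W); masks: (K, H, W) binary -> (K, C, H, W), one
    characteristic image per attribute."""
    return masks[:, None, :, :] * features[None, :, :, :]
```

Pixels inside an attribute's region keep their feature values and all other pixels become 0, so each of the k outputs isolates one attribute.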
Then, in step S240, each characteristic image is respectively input into each pre-trained classification network for processing to obtain predicted attribute labels, and the classification networks are trained based on the classification loss values between the annotated attribute labels and the predicted attribute labels, obtaining multiple first classification networks.

According to one embodiment of the present invention, the classification loss value can be calculated based on the annotated attribute labels and the predicted attribute labels. Then, based on the classification loss value, the parameters of the classification network are updated by backward iteration; when the second predetermined condition is met, training ends and multiple first classification networks are obtained. The second predetermined condition may include: the classification loss value no longer decreases, or the difference between the loss values of the classification function calculated in two successive iterations is less than a first predetermined threshold, or the number of iterations reaches a first predetermined number.
The classification loss may use the SoftMaxWithLoss loss; the classification loss value can be calculated by the following formula:

l(y, z) = -log( e^{z_y} / Σ_{j=1}^{m} e^{z_j} )

where l(y, z) is the classification loss value, y is the annotated attribute label, z is the predicted attribute label, z_j denotes the label value of the j-th predicted attribute label, z_y denotes the label value of the annotated attribute label, and m denotes the number of attribute labels.
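The SoftMaxWithLoss formula above can be sketched in a few lines of numpy; the max-shift is a standard numerical-stability trick and does not change the value:

```python
import numpy as np

def softmax_loss(z, y):
    """SoftMax cross-entropy: l(y, z) = -log(exp(z_y) / sum_j exp(z_j)).

    z: length-m array of predicted label scores
    y: index of the annotated attribute label
    """
    z = z - z.max()  # shift for numerical stability; loss is unchanged
    return float(np.log(np.exp(z).sum()) - z[y])
```

Raising the score of the annotated label relative to the others lowers the loss, which is the gradient signal used in the backward iteration of step S240.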
Finally, in step S250, the first segmentation network and the multiple first classification networks are trained based on the segmentation loss value and the classification loss values, so that the trained first segmentation network and the multiple first classification networks are coupled to generate the face attribute information recognition model.

During this training process, all networks are opened; the weights of the segmentation loss and the classification loss can be tuned to 1:10, and the networks are trained jointly. Fig. 4 shows a schematic flow of training and generating the face attribute information recognition model according to an embodiment of the present invention. As shown in Fig. 4, the first segmentation network outputs the mask map of each attribute and the feature channel maps. The mask maps and the feature channel maps are then dot-multiplied to obtain the characteristic images. The essence of the dot product is that the mask map is a binary image in which the pixels of the region of interest have value 1; when the feature channel map is multiplied with the corresponding pixels of the mask map, the pixels of the region of interest remain unchanged and the remaining pixels become 0, thereby revealing the region of interest. Finally, each obtained characteristic image is respectively input into the corresponding first classification network to obtain predicted attribute labels; for example, the attribute labels obtained for hair may be [black, long, curly]. Based on the weighted sum of the segmentation loss value and the classification loss values, the parameters of the first segmentation network and the multiple first classification networks are updated by backward iteration; when the third predetermined condition is met, training ends, and the trained first segmentation network and the multiple first classification networks are coupled to generate the face attribute information recognition model. The third predetermined condition may include: the weighted sum no longer decreases, or the difference between the weighted sums calculated in two successive iterations is less than a second predetermined threshold, or the number of iterations reaches a second predetermined number.
According to the method 200 of the present invention, the classification networks are first fixed and the segmentation network is trained; then the segmentation network is fixed and the multiple classification networks are trained; finally, joint training is performed on the trained segmentation network and the trained classification networks. The segmentation network and the multiple classification networks finally obtained by training are coupled with each other to generate the face attribute information recognition model.
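The three training stages and the 1:10 loss weighting described above can be sketched as follows. The stage names and the exact form of the weighted sum are our reading of the text, not verbatim from the patent:

```python
def total_loss(seg_loss, cls_losses, w_seg=1.0, w_cls=10.0):
    """Weighted sum used in the joint stage: segmentation loss weighted 1,
    classification losses weighted 10 (the 1:10 ratio from the text)."""
    return w_seg * seg_loss + w_cls * sum(cls_losses)

# Three-stage schedule: which parts of the model are trainable per stage.
STAGES = [
    ("train_segmentation", {"segmentation": True,  "classifiers": False}),
    ("train_classifiers",  {"segmentation": False, "classifiers": True}),
    ("joint_finetune",     {"segmentation": True,  "classifiers": True}),
]
```

Freezing one part while training the other lets each sub-network converge before the joint fine-tuning couples them into the final recognition model.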
Fig. 5 shows a schematic flow of a face attribute information recognition method 500 according to an embodiment of the present invention. As shown in Fig. 5, first, in step S510, a face image to be recognized is acquired; face region cropping may also be performed on the acquired face image. Then, in step S520, the face image is input into the face attribute information recognition model for processing. The face attribute information recognition model includes the mutually coupled segmentation network and classification networks: the face image is input into the segmentation network for processing to obtain the mask images and feature channel maps of multiple face attributes, and the mask images are dot-multiplied with the feature channel maps to obtain the region map of each attribute. According to one embodiment of the present invention, the face attribute information recognition model can be generated by the above method 200, which is not repeated here. Finally, in step S530, each attribute region map is respectively input into the corresponding classification network for processing to obtain the information of each face attribute in the face image.
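The inference pipeline of method 500 can be sketched as a short composition of the two coupled parts. Here `segment` and each entry of `classifiers` are stand-ins for the trained networks (our own names; the patent does not specify an API):

```python
import numpy as np

def recognize_attributes(image, segment, classifiers):
    """Sketch of steps S520-S530: segment, mask the features, classify.

    segment(image) -> (features of shape (C, H, W), masks of shape (K, H, W))
    classifiers: list of K callables, one per attribute, each mapping a
    (C, H, W) attribute region map to that attribute's information.
    """
    features, masks = segment(image)
    results = []
    for mask, classify in zip(masks, classifiers):
        # Dot product: keep features inside the attribute region, zero elsewhere.
        region = mask[None, :, :] * features
        results.append(classify(region))
    return results
```

Each attribute's region map goes to its own classification network, so the K attribute results are obtained in one pass over the shared segmentation features.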
According to one embodiment of the present invention, the face attributes may include eyelids, glasses, beard, hat and hair. The information of a face attribute may include one or more of the following: single-fold, inner double-fold and double-fold eyelids; and/or the size, color and thickness of the lenses and the presence, shape and size of the frame; and/or the length, type and density of the beard; and/or the style and color of the hat; and/or the length, color and hairstyle of the hair.

According to the scheme of the present invention, by performing semantic segmentation on the face image, the semantic information of each attribute can be obtained effectively, which improves the accuracy of face attribute information recognition; and by using lightweight convolution for image segmentation, the demand of mobile terminals for speed and computation amount can be met, greatly improving the efficiency of face attribute information recognition.
A8. The method according to A6, wherein the classification loss value is calculated by the following formula:

l(y, z) = -log( e^{z_y} / Σ_{j=1}^{m} e^{z_j} )

where l(y, z) is the classification loss value, y is the annotated attribute label, z is the predicted attribute label, z_j denotes the label value of the j-th predicted attribute label, z_y denotes the label value of the annotated attribute label, and m denotes the number of attribute labels.
A9. The method according to A1, wherein the face attributes include eyelids, glasses, beard, hat and hair, and the attribute labels include one or more of the following labels: eyelid labels, glasses labels, beard labels, hat labels and hair labels.

A10. The method according to A9, wherein:

the eyelid labels include single-fold, inner double-fold and double-fold eyelids;

the glasses labels include the presence, size and color of the lenses and the presence, shape, size and thickness of the frame;

the beard labels include the length, type and density of the beard;

the hat labels include the style and color of the hat;

the hair labels include the length, color and hairstyle of the hair.
A11. The method according to A1, wherein the segmentation network includes multiple lightweight convolution processing modules, and the lightweight convolution processing modules use depthwise separable convolution.

A12. The method according to A1, wherein the mask images are generated based on semantic segmentation and are displayed in different colors to characterize different face attributes.

A13. The method according to A1, wherein the step of training the first segmentation network and the multiple first classification networks based on the segmentation loss value and the classification loss values includes: based on the weighted sum of the segmentation loss value and the classification loss values, updating the parameters of the first segmentation network and the multiple first classification networks by backward iteration, and ending training when the third predetermined condition is met, wherein the third predetermined condition is that the weighted sum no longer decreases, or the difference between the weighted sums calculated in two successive iterations is less than a second predetermined threshold, or the number of iterations reaches a second predetermined number.
A14. The method according to A1, wherein before the step of generating the mask images annotated with multiple face attribute information corresponding to the original image, the method further includes: performing face region cropping on the original image.

A15. The method according to A14, wherein the step of performing face region cropping on the original image includes: obtaining the position of the face and the feature point coordinates based on a feature point detection model; and performing face alignment based on the face position and the feature point coordinates, cropping to obtain size-consistent face images.

B17. The method according to B16, wherein the face attributes include eyelids, glasses, beard, hat and hair, and the information of a face attribute includes one or more of the following: single-fold, inner double-fold and double-fold eyelids; and/or the size, color and thickness of the lenses and the presence, shape and size of the frame; and/or the length, type and density of the beard; and/or the style and color of the hat; and/or the length, color and hairstyle of the hair.

B18. The method according to B16, wherein the method further includes: performing face region cropping on the acquired face image to obtain size-consistent face images.
It should be appreciated that, in order to streamline the disclosure and aid the understanding of one or more of the various inventive aspects, in the above description of exemplary embodiments of the invention, various features of the invention are sometimes grouped together into a single embodiment, figure, or description thereof. However, the disclosed method is not to be interpreted as reflecting an intention that the claimed invention requires more features than are expressly recited in each claim. Rather, as the following claims reflect, inventive aspects lie in less than all features of a single foregoing disclosed embodiment. Thus, the claims following the detailed description are hereby expressly incorporated into that detailed description, with each claim standing on its own as a separate embodiment of this invention.
Those skilled in the art should understand that the modules, units or components of the devices in the examples disclosed herein may be arranged in a device as described in the embodiments, or alternatively may be located in one or more devices different from the devices in the examples. The modules in the foregoing examples may be combined into one module or may furthermore be divided into multiple sub-modules.
Those skilled in the art will understand that the modules in the devices in an embodiment may be adaptively changed and arranged in one or more devices different from that embodiment. The modules, units or components in an embodiment may be combined into one module, unit or component, and furthermore they may be divided into multiple sub-modules, sub-units or sub-components. Except where at least some of such features and/or processes or units are mutually exclusive, all features disclosed in this specification (including the accompanying claims, abstract and drawings) and all processes or units of any method or device so disclosed may be combined in any combination. Unless expressly stated otherwise, each feature disclosed in this specification (including the accompanying claims, abstract and drawings) may be replaced by an alternative feature serving the same, an equivalent or a similar purpose.
In addition, those skilled in the art will appreciate that, although some embodiments described herein include certain features included in other embodiments but not other features, combinations of features of different embodiments are meant to be within the scope of the invention and to form different embodiments. For example, in the following claims, any of the claimed embodiments may be used in any combination.
The various techniques described herein may be implemented in connection with hardware or software, or a combination thereof. Thus, the methods and devices of the present invention, or certain aspects or portions thereof, may take the form of program code (i.e., instructions) embodied in tangible media, such as floppy disks, CD-ROMs, hard disk drives or any other machine-readable storage medium, wherein, when the program is loaded into and executed by a machine such as a computer, the machine becomes a device for practicing the invention.
In the case of program code execution on programmable computers, the mobile terminal generally includes a processor, a processor-readable storage medium (including volatile and non-volatile memory and/or storage elements), at least one input device, and at least one output device. The memory is configured to store the program code; the processor is configured to execute the method of the present invention according to the instructions in the program code stored in the memory.
By way of example, and not limitation, computer-readable media comprise computer storage media and communication media. Computer storage media store information such as computer-readable instructions, data structures, program modules or other data. Communication media generally embody computer-readable instructions, data structures, program modules or other data in a modulated data signal such as a carrier wave or other transport mechanism, and include any information delivery media. Combinations of any of the above are also included within the scope of computer-readable media.
In addition, some of the embodiments are described herein as methods, or combinations of method elements, that can be implemented by a processor of a computer system or by other devices performing the described functions. Thus, a processor having the necessary instructions for implementing such a method or method element forms a device for implementing the method or method element. Furthermore, an element described herein of a device embodiment is an example of a device for carrying out the function performed by the element for the purpose of carrying out the invention.
As used herein, unless otherwise specified, the use of the ordinals "first", "second", "third", etc., to describe a common object merely indicates that different instances of like objects are being referred to, and is not intended to imply that the objects so described must be in a given sequence, whether temporally, spatially, in ranking, or in any other manner.
Although the invention has been described with respect to a limited number of embodiments, those skilled in the art, having the benefit of the above description, will appreciate that other embodiments can be devised within the scope of the invention thus described. Additionally, it should be noted that the language used in this specification has been principally selected for readability and instructional purposes, rather than to delineate or circumscribe the inventive subject matter. Accordingly, many modifications and variations will be apparent to those of ordinary skill in the art without departing from the scope and spirit of the appended claims. With respect to the scope of the invention, the disclosure made is illustrative and not restrictive, and the scope of the invention is defined by the appended claims.
Claims (10)
1. A method of generating a face attribute information recognition model, the method being suitable for execution in a terminal, comprising:

generating mask images annotated with multiple face attribute information corresponding to an original image, and multiple attribute labels characterizing the face attribute information;

inputting the original image into a pre-trained segmentation network for processing to obtain predicted mask images, and training the segmentation network based on the segmentation loss value between the generated mask images and the predicted mask images to obtain a first segmentation network;

generating characteristic images for identifying at least one attribute feature;

respectively inputting each characteristic image into each pre-trained classification network for processing to obtain predicted attribute labels, and training the classification networks based on the classification loss values between the annotated attribute labels and the predicted attribute labels to obtain multiple first classification networks; and

training the first segmentation network and the multiple first classification networks based on the segmentation loss value and the classification loss values, so that the trained first segmentation network and the multiple first classification networks are coupled to generate the face attribute information recognition model.
2. The method of claim 1, wherein the step of training the segmentation network based on the segmentation loss value between the generated mask images and the predicted mask images comprises:

calculating the segmentation loss value based on the generated mask images and the predicted mask images;

based on the segmentation loss value, updating the parameters of the segmentation network by backward iteration, ending training when a first predetermined condition is met, and obtaining the first segmentation network.

3. The method of claim 2, wherein the first predetermined condition is that the segmentation loss value no longer decreases.

4. The method of claim 2, wherein the segmentation loss value is calculated based on the following formula:

H(p, q) = -(1/n) Σ_x p(x) log q(x)

wherein H(p, q) is the segmentation loss value, p is the generated mask image, q is the predicted mask image, p(x) is the pixel value distribution of the generated mask image, q(x) is the pixel value distribution of the predicted mask image, x denotes a pixel in the image, and n is the number of pixels.
5. The method of claim 1, wherein the step of generating characteristic images for identifying at least one attribute feature comprises:

inputting the original image into the pre-trained segmentation network for processing to obtain at least one feature channel map;

dot-multiplying the mask images predicted by the first segmentation network with the at least one feature channel map to obtain at least one characteristic image.

6. The method of claim 1, wherein the step of training the classification networks based on the classification loss values between the annotated attribute labels and the predicted attribute labels comprises:

calculating the classification loss value based on the annotated attribute labels and the predicted attribute labels;

based on the classification loss value, updating the parameters of the classification network by backward iteration, ending training when a second predetermined condition is met, and obtaining multiple first classification networks.

7. The method of claim 6, wherein the second predetermined condition is that the classification loss value no longer decreases, or that the difference between the loss values of the classification function calculated in two successive iterations is less than a first predetermined threshold, or that the number of iterations reaches a first predetermined number.
8. A face attribute information recognition method, suitable for execution in a terminal, the method comprising:

acquiring a face image to be recognized;

inputting the face image into a face attribute information recognition model for processing, the face attribute information recognition model comprising a mutually coupled segmentation network and classification networks, wherein the face image is input into the segmentation network for processing to obtain the mask images and feature channel maps of multiple face attributes, and the mask images are dot-multiplied with the feature channel maps to obtain each attribute region map;

respectively inputting each attribute region map into the corresponding classification network for processing to obtain the information of each face attribute in the face image.
9. A mobile terminal, comprising:

a memory;

one or more processors; and

one or more programs, wherein the one or more programs are stored in the memory and configured to be executed by the one or more processors, the one or more programs comprising instructions for performing any of the methods according to claims 1-8.

10. A computer-readable storage medium storing one or more programs, the one or more programs comprising instructions which, when executed by a mobile terminal, cause the mobile terminal to perform any of the methods according to claims 1-8.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910686308.9A CN110414428A (en) | 2019-07-26 | 2019-07-26 | A method of generating face character information identification model |
Publications (1)
Publication Number | Publication Date |
---|---|
CN110414428A true CN110414428A (en) | 2019-11-05 |
Family
ID=68363580
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201910686308.9A Pending CN110414428A (en) | 2019-07-26 | 2019-07-26 | A method of generating face character information identification model |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN110414428A (en) |
Cited By (12)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110991299A (en) * | 2019-11-27 | 2020-04-10 | 中新国际联合研究院 | Confrontation sample generation method aiming at face recognition system in physical domain |
CN111027474A (en) * | 2019-12-09 | 2020-04-17 | Oppo广东移动通信有限公司 | Face area acquisition method and device, terminal equipment and storage medium |
CN111079659A (en) * | 2019-12-19 | 2020-04-28 | 武汉水象电子科技有限公司 | Face feature point positioning method |
CN111291700A (en) * | 2020-02-20 | 2020-06-16 | 苏州科达科技股份有限公司 | Face attribute identification method, device and equipment and readable storage medium |
CN112163545A (en) * | 2020-10-12 | 2021-01-01 | 北京易华录信息技术股份有限公司 | Head feature extraction method and device, electronic equipment and storage medium |
CN112598007A (en) * | 2021-03-04 | 2021-04-02 | 浙江所托瑞安科技集团有限公司 | Method, device and equipment for screening picture training set and readable storage medium |
CN112785595A (en) * | 2019-11-07 | 2021-05-11 | 北京市商汤科技开发有限公司 | Target attribute detection, neural network training and intelligent driving method and device |
CN112825117A (en) * | 2019-11-20 | 2021-05-21 | 北京眼神智能科技有限公司 | Behavior attribute judgment method, behavior attribute judgment device, behavior attribute judgment medium and behavior attribute judgment equipment based on head features |
CN113076850A (en) * | 2021-03-29 | 2021-07-06 | Oppo广东移动通信有限公司 | Multitask prediction method, multitask prediction device and electronic equipment |
CN113269781A (en) * | 2021-04-21 | 2021-08-17 | 青岛小鸟看看科技有限公司 | Data generation method and device and electronic equipment |
CN114093011A (en) * | 2022-01-12 | 2022-02-25 | 北京新氧科技有限公司 | Hair classification method, device, equipment and storage medium |
WO2023009058A1 (en) * | 2021-07-30 | 2023-02-02 | 脸萌有限公司 | Image attribute classification method and apparatus, electronic device, medium, and program product |
Citations (13)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN102324036A (en) * | 2011-09-02 | 2012-01-18 | 北京新媒传信科技有限公司 | Obtain the method and apparatus of face complexion area in the image |
CN103914683A (en) * | 2013-12-31 | 2014-07-09 | 闻泰通讯股份有限公司 | Gender identification method and system based on face image |
CN107169455A (en) * | 2017-05-16 | 2017-09-15 | 中山大学 | Face character recognition methods based on depth local feature |
US20170301121A1 (en) * | 2013-05-02 | 2017-10-19 | Emotient, Inc. | Anonymization of facial images |
CN108596839A (en) * | 2018-03-22 | 2018-09-28 | 中山大学 | A kind of human-face cartoon generation method and its device based on deep learning |
CN108596248A (en) * | 2018-04-23 | 2018-09-28 | 上海海洋大学 | A kind of classification of remote-sensing images model based on improvement depth convolutional neural networks |
CN109117760A (en) * | 2018-07-27 | 2019-01-01 | 北京旷视科技有限公司 | Image processing method, device, electronic equipment and computer-readable medium |
CN109145983A (en) * | 2018-08-21 | 2019-01-04 | 电子科技大学 | A kind of real-time scene image, semantic dividing method based on lightweight network |
CN109344693A (en) * | 2018-08-13 | 2019-02-15 | 华南理工大学 | A kind of face multizone fusion expression recognition method based on deep learning |
CN109685740A (en) * | 2018-12-25 | 2019-04-26 | 努比亚技术有限公司 | Method and device, mobile terminal and the computer readable storage medium of face normalization |
CN109829356A (en) * | 2018-12-05 | 2019-05-31 | 科大讯飞股份有限公司 | The training method of neural network and pedestrian's attribute recognition approach neural network based |
CN109858439A (en) * | 2019-01-30 | 2019-06-07 | 北京华捷艾米科技有限公司 | A kind of biopsy method and device based on face |
CN109886273A (en) * | 2019-02-26 | 2019-06-14 | 四川大学华西医院 | A kind of CMR classification of image segmentation system |
Non-Patent Citations (2)
Title |
---|
MAHDI M. KALAYEH ET AL.: "Improving Facial Attribute Prediction Using Semantic Segmentation", Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) * |
Eugene Charniak: "Statistical Language Learning" (Chinese edition), 31 August 2016 * |
Cited By (17)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN112785595B (en) * | 2019-11-07 | 2023-02-28 | Beijing SenseTime Technology Development Co., Ltd. | Target attribute detection, neural network training, and intelligent driving methods and devices |
CN112785595A (en) * | 2019-11-07 | 2021-05-11 | Beijing SenseTime Technology Development Co., Ltd. | Target attribute detection, neural network training, and intelligent driving methods and devices |
CN112825117A (en) * | 2019-11-20 | 2021-05-21 | Beijing Eyecool Intelligent Technology Co., Ltd. | Behavior attribute judgment method, device, medium, and equipment based on head features |
CN110991299A (en) * | 2019-11-27 | 2020-04-10 | Sino-Singapore International Joint Research Institute | Adversarial example generation method for face recognition systems in the physical domain |
CN110991299B (en) * | 2019-11-27 | 2023-03-14 | Sino-Singapore International Joint Research Institute | Adversarial example generation method for face recognition systems in the physical domain |
CN111027474A (en) * | 2019-12-09 | 2020-04-17 | Guangdong OPPO Mobile Telecommunications Corp., Ltd. | Face region acquisition method and device, terminal equipment, and storage medium |
CN111027474B (en) * | 2019-12-09 | 2024-03-15 | Guangdong OPPO Mobile Telecommunications Corp., Ltd. | Face region acquisition method and device, terminal equipment, and storage medium |
CN111079659A (en) * | 2019-12-19 | 2020-04-28 | Wuhan Shuixiang Electronic Technology Co., Ltd. | Face feature point localization method |
CN111291700A (en) * | 2020-02-20 | 2020-06-16 | Suzhou Keda Technology Co., Ltd. | Face attribute recognition method, device, equipment, and readable storage medium |
CN112163545A (en) * | 2020-10-12 | 2021-01-01 | Beijing E-Hualu Information Technology Co., Ltd. | Head feature extraction method and device, electronic equipment, and storage medium |
CN112598007B (en) * | 2021-03-04 | 2021-05-18 | Zhejiang Suoto Ruian Technology Group Co., Ltd. | Method, device, and equipment for screening a picture training set, and readable storage medium |
CN112598007A (en) * | 2021-03-04 | 2021-04-02 | Zhejiang Suoto Ruian Technology Group Co., Ltd. | Method, device, and equipment for screening a picture training set, and readable storage medium |
CN113076850A (en) * | 2021-03-29 | 2021-07-06 | Guangdong OPPO Mobile Telecommunications Corp., Ltd. | Multi-task prediction method, device, and electronic equipment |
CN113269781A (en) * | 2021-04-21 | 2021-08-17 | Qingdao Pico Technology Co., Ltd. | Data generation method, device, and electronic equipment |
WO2023009058A1 (en) * | 2021-07-30 | 2023-02-02 | Lemon Inc. | Image attribute classification method and apparatus, electronic device, medium, and program product |
CN114093011A (en) * | 2022-01-12 | 2022-02-25 | Beijing SoYoung Technology Co., Ltd. | Hair classification method, device, equipment, and storage medium |
CN114093011B (en) * | 2022-01-12 | 2022-05-06 | Beijing SoYoung Technology Co., Ltd. | Hair classification method, device, equipment, and storage medium |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN110414428A (en) | A method of generating face character information identification model | |
JP6997309B2 (en) | Image processing system and processing method using deep neural network | |
JP7413400B2 (en) | Skin quality measurement method, skin quality classification method, skin quality measurement device, electronic equipment and storage medium | |
US9864901B2 (en) | Feature detection and masking in images based on color distributions | |
CN108062526A (en) | Human pose estimation method and mobile terminal | |
US9111375B2 (en) | Evaluation of three-dimensional scenes using two-dimensional representations | |
CN108961279A (en) | Image processing method, device and mobile terminal | |
CN106295533B (en) | Selfie image optimization method, device, and camera terminal | |
US10769718B1 (en) | Method, medium, and system for live preview via machine learning models | |
CN110110118A (en) | Dressing recommended method, device, storage medium and mobile terminal | |
CN105405157B (en) | Portrait generation device and portrait generation method | |
CN108875540A (en) | Image processing method, device and system and storage medium | |
CN106897659A (en) | Blink motion recognition method and device | |
KR20160142742A (en) | Device and method for providing makeup mirror | |
CN111008935B (en) | Face image enhancement method, device, system and storage medium | |
CN109117760A (en) | Image processing method, device, electronic equipment and computer-readable medium | |
CN109271930A (en) | Micro-expression recognition method, device, and storage medium | |
CN109741338A (en) | Face segmentation method, device, and equipment | |
JP2019016114A (en) | Image processing device, learning device, focus controlling device, exposure controlling device, image processing method, learning method and program | |
CN113661520A (en) | Modifying the appearance of hair | |
KR20230025906A (en) | Systems and methods for improved facial attribute classification and its use | |
CN110351597A (en) | Video clipping method, apparatus, and electronic equipment | |
WO2019142127A1 (en) | Method and system of creating multiple expression emoticons | |
Lienhard et al. | How to predict the global instantaneous feeling induced by a facial picture? | |
CN117152566A (en) | Classification model training method, model, classification method and product |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
| PB01 | Publication | |
| SE01 | Entry into force of request for substantive examination | |
| RJ01 | Rejection of invention patent application after publication | Application publication date: 20191105 |