CN110033424A - Image processing method, apparatus, electronic device, and computer-readable storage medium - Google Patents
Image processing method, apparatus, electronic device, and computer-readable storage medium
- Publication number
- CN110033424A (application number CN201910313968.2A)
- Authority
- CN
- China
- Prior art keywords
- target object
- candidate frame
- candidate
- preset threshold
- frame
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T5/00—Image enhancement or restoration
- G06T5/50—Image enhancement or restoration using two or more images, e.g. averaging or subtraction
Landscapes
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Image Analysis (AREA)
Abstract
An embodiment of this application provides an image processing method, apparatus, electronic device, and computer-readable storage medium, relating to the field of computer technology. The method comprises: determining the location information corresponding to each candidate frame in an image to be processed, where each candidate frame is a candidate frame for a target object; then, based on the image to be processed and the location information corresponding to each candidate frame, determining the attribute information of the target object corresponding to each candidate frame respectively through a preset model; and then merging the candidate frames through a non-maximum suppression algorithm, based on both the location information corresponding to each candidate frame and the attribute information of the target object corresponding to each candidate frame. The embodiment of this application improves the accuracy of merging the candidate frames and enhances the user experience.
Description
Technical field
This application relates to the field of computer technology, and specifically to an image processing method, apparatus, electronic device, and computer-readable storage medium.
Background technique
With the development of information technology, image processing technology has advanced as well. One algorithm in this field is non-maximum suppression (NMS). The essence of NMS is to search for local maxima and suppress non-maximum elements. In deep-learning-based object detection, NMS is usually applied to the output results, because a single target object may have several very similar candidate frames corresponding to it, so that similar candidate frames are merged.
In the prior art, non-maximum suppression merges similar candidate frames as follows: the candidate frames in an image are merged based on the location information corresponding to each candidate frame, where the location information of any candidate frame includes the center position of the candidate frame, the position of its upper-left corner, and the length and/or width of the candidate frame; for example, candidate frames whose center positions lie within a preset threshold distance of each other are merged in turn.
However, because the prior-art non-maximum suppression method merges candidate frames based only on the location information of each candidate frame, candidate frames that are close together in the image but belong to different target objects may be merged into one candidate frame, and multiple candidate frames that belong to the same target object may be merged into more than one frame. This lowers the accuracy of merging the candidate frames in the image and degrades the user experience.
Summary of the invention
This application provides an image processing method, apparatus, electronic device, and computer-readable storage medium, to solve the technical problem that the merging of the candidate frames in an image is insufficiently accurate and the user experience is poor.
In a first aspect, an image processing method is provided, the method comprising:
determining the location information corresponding to each candidate frame in an image to be processed, each candidate frame being a candidate frame for a target object;
determining, based on the image to be processed and the location information corresponding to each candidate frame, the attribute information of the target object corresponding to each candidate frame respectively through a preset model; and
merging the candidate frames through a non-maximum suppression algorithm, based on the location information corresponding to each candidate frame and the attribute information of the target object corresponding to each candidate frame.
In one possible implementation, for any two candidate frames, merging the candidate frames through the non-maximum suppression algorithm based on the location information corresponding to each candidate frame and the attribute information of the corresponding target objects comprises:
determining the intersection-over-union (IoU) of the two candidate frames based on the location information corresponding to each candidate frame;
determining, through the non-maximum suppression algorithm and based on the IoU of the two candidate frames and the attribute information of the target object corresponding to each of the two candidate frames, whether the two candidate frames are candidate frames to be merged; and
if they are candidate frames to be merged, merging the two candidate frames.
In one possible implementation, before determining the location information corresponding to each candidate frame in the image to be processed, the method further comprises:
preprocessing multimedia information to obtain the image to be processed.
Determining the location information corresponding to each candidate frame in the image to be processed comprises:
passing the image to be processed through a detection network, which outputs the location information corresponding to each candidate frame in the image to be processed.
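As a hedged illustration of such preprocessing (the patent does not specify the operations involved), the sketch below resizes a decoded frame to a fixed input size and scales pixel values to [0, 1] before a detection network would consume it. The 416×416 input size and nearest-neighbour resizing are assumed values, and the detection network itself is outside the sketch.

```python
import numpy as np

def preprocess(frame: np.ndarray, size=(416, 416)) -> np.ndarray:
    """Illustrative preprocessing for a detection network.

    frame: an H x W x C uint8 image decoded from multimedia information.
    Returns a size[0] x size[1] x C float32 array scaled to [0, 1],
    using nearest-neighbour index mapping for the resize.
    """
    h, w = frame.shape[:2]
    ys = np.arange(size[0]) * h // size[0]   # row indices to sample
    xs = np.arange(size[1]) * w // size[1]   # column indices to sample
    resized = frame[ys][:, xs]
    return resized.astype(np.float32) / 255.0
```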
In one possible implementation, the attribute information of the target object corresponding to any candidate frame includes at least one of the following:
the color information of the target object; the density information of the target object; the shape information of the target object; the information on the class to which the target object belongs.
In another possible implementation, if the candidate frames are candidate frames for face information, the attribute information of the target object corresponding to any candidate frame includes at least one of the following:
the age information of the target object; the gender information of the target object; the skin color information of the target object; the facing-direction radian information of the target object.
In one possible implementation, if the candidate frames are candidate frames for face information, then determining, through the non-maximum suppression algorithm and based on the IoU of the two candidate frames and the attribute information of the target object corresponding to each of the two candidate frames, whether the two candidate frames are candidate frames to be merged comprises:
determining, through the non-maximum suppression algorithm, whether the IoU of the two candidate frames and the attribute information of the face information corresponding to each of the two candidate frames satisfy preset rules, so as to determine whether the two candidate frames are candidate frames to be merged.
The preset rules include at least one of the following:
the IoU is greater than a first preset threshold, and the first attribute information of the target objects corresponding to the two candidate frames is identical;
the IoU is greater than a second preset threshold, and the first attribute information of the target objects corresponding to the two candidate frames is different, the first preset threshold being less than the second preset threshold;
the IoU is greater than the first preset threshold, and the gap between the probabilities that the first attribute information of the two target objects is a preset attribute is not greater than a third preset threshold;
the IoU is greater than the second preset threshold, and the gap between the probabilities that the first attribute information of the two target objects is a preset attribute is not less than the third preset threshold;
the IoU is greater than the first preset threshold, and the difference between the second attribute information of the two target objects is not greater than a fourth preset threshold;
the IoU is greater than the second preset threshold, and the difference between the second attribute information of the two target objects is greater than the fourth preset threshold;
the IoU is greater than a fifth preset threshold, the fifth preset threshold being a weighted value of the first preset threshold and the difference between the first attribute information of the two target objects;
the IoU is greater than a sixth preset threshold, the sixth preset threshold being a weighted value of the first preset threshold, the difference between the first attribute information of the two target objects, and the gap between the probabilities that the second attribute information of the two target objects is a preset attribute;
the IoU is greater than a seventh preset threshold, the seventh preset threshold being a weighted value of the first preset threshold and the difference between the third attribute information of the two target objects.
In another possible implementation, the preset rules include at least one of the following:
the IoU is greater than the first preset threshold, and the gender information of the target objects corresponding to the two candidate frames is identical;
the IoU is greater than the second preset threshold, and the gender information of the target objects corresponding to the two candidate frames is different, the first preset threshold being less than the second preset threshold;
the IoU is greater than the first preset threshold, and the gap between the probabilities that the two target objects are of a given gender is not greater than the third preset threshold;
the IoU is greater than the second preset threshold, and the gap between the probabilities that the two target objects are of a given gender is greater than the third preset threshold;
the IoU is greater than the first preset threshold, and the age gap between the two target objects is not greater than the fourth preset threshold;
the IoU is greater than the second preset threshold, and the age gap between the two target objects is greater than the fourth preset threshold;
the IoU is greater than the fifth preset threshold, where the fifth preset threshold = the first preset threshold + a first parameter value × the age gap between the two target objects;
the IoU is greater than the sixth preset threshold, where the sixth preset threshold = the first preset threshold + the first parameter value × the age gap between the two target objects + a second parameter value × the gap between the probabilities that the two target objects are of a given gender;
the IoU is greater than the seventh preset threshold, where the seventh preset threshold = the first preset threshold + the second parameter value × the difference between the facing-direction radian values of the two target objects.
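A minimal sketch of a merge decision built from a subset of these rules is given below. The concrete thresholds, the dict-based attribute encoding, and the choice of which rules to include are all illustrative assumptions, not values taken from the patent.

```python
def should_merge(iou, attrs_a, attrs_b,
                 t1=0.4, t2=0.7, t4=10.0, alpha=0.01):
    """Decide whether two face candidate frames should be merged.

    Implements an illustrative subset of the preset rules:
    attrs_* are dicts with 'gender' and 'age'; t1 < t2, so a much
    larger overlap is demanded when the attributes disagree, and
    t1 + alpha * age_gap plays the role of the adaptive fifth
    threshold. All numeric values are assumed, not from the patent.
    """
    same_gender = attrs_a["gender"] == attrs_b["gender"]
    age_gap = abs(attrs_a["age"] - attrs_b["age"])

    # Rule: moderate overlap suffices when gender matches and ages are close.
    if iou > t1 and same_gender and age_gap <= t4:
        return True
    # Rule: disagreeing attributes require a much larger overlap.
    if iou > t2:
        return True
    # Rule: adaptive threshold that grows with the age gap.
    if same_gender and iou > t1 + alpha * age_gap:
        return True
    return False
```

The key design point illustrated here is that attribute disagreement raises the effective IoU bar, so nearby faces of different people survive as separate frames.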
In a second aspect, an image processing apparatus is provided, the apparatus comprising:
a first determining module, configured to determine the location information corresponding to each candidate frame in an image to be processed, each candidate frame being a candidate frame for a target object;
a second determining module, configured to determine, based on the image to be processed and the location information corresponding to each candidate frame determined by the first determining module, the attribute information of the target object corresponding to each candidate frame respectively through a preset model; and
a merging module, configured to merge the candidate frames through a non-maximum suppression algorithm, based on the location information corresponding to each candidate frame and the attribute information of the target object corresponding to each candidate frame.
In one possible implementation, for any two candidate frames, the merging module comprises a first determining unit, a second determining unit, and a merging unit, wherein:
the first determining unit is configured to determine the intersection-over-union IoU of the two candidate frames based on the location information corresponding to each candidate frame;
the second determining unit is configured to determine, through the non-maximum suppression algorithm and based on the IoU of the two candidate frames determined by the first determining unit and the attribute information of the target object corresponding to each of the two candidate frames, whether the two candidate frames are candidate frames to be merged; and
the merging unit is configured to merge the two candidate frames when the second determining unit determines that they are candidate frames to be merged.
In one possible implementation, the apparatus further comprises a preprocessing module, wherein:
the preprocessing module is configured to preprocess multimedia information to obtain the image to be processed; and
the first determining module is specifically configured to pass the image to be processed through a detection network to determine the location information corresponding to each candidate frame in the image to be processed.
In one possible implementation, the attribute information of the target object corresponding to any candidate frame includes at least one of the following:
the color information of the target object; the density information of the target object; the shape information of the target object; the information on the class to which the target object belongs.
In one possible implementation, if the candidate frames are candidate frames for face information, the attribute information of the target object corresponding to any candidate frame includes at least one of the following:
the age information of the target object; the gender information of the target object; the skin color information of the target object; the facing-direction radian information of the target object.
In another possible implementation, when the candidate frames are candidate frames for face information, the second determining unit is specifically configured to determine, through the non-maximum suppression algorithm, whether the IoU of the two candidate frames and the attribute information of the face information corresponding to each of the two candidate frames satisfy preset rules, so as to determine whether the two candidate frames are candidate frames to be merged.
The preset rules include at least one of the following:
the IoU is greater than the first preset threshold, and the first attribute information of the target objects corresponding to the two candidate frames is identical;
the IoU is greater than the second preset threshold, and the first attribute information of the target objects corresponding to the two candidate frames is different, the first preset threshold being less than the second preset threshold;
the IoU is greater than the first preset threshold, and the gap between the probabilities that the first attribute information of the two target objects is a preset attribute is not greater than the third preset threshold;
the IoU is greater than the second preset threshold, and the gap between the probabilities that the first attribute information of the two target objects is a preset attribute is not less than the third preset threshold;
the IoU is greater than the first preset threshold, and the difference between the second attribute information of the two target objects is not greater than the fourth preset threshold;
the IoU is greater than the second preset threshold, and the difference between the second attribute information of the two target objects is greater than the fourth preset threshold;
the IoU is greater than the fifth preset threshold, the fifth preset threshold being a weighted value of the first preset threshold and the difference between the first attribute information of the two target objects;
the IoU is greater than the sixth preset threshold, the sixth preset threshold being a weighted value of the first preset threshold, the difference between the first attribute information of the two target objects, and the gap between the probabilities that the second attribute information of the two target objects is a preset attribute;
the IoU is greater than the seventh preset threshold, the seventh preset threshold being a weighted value of the first preset threshold and the difference between the third attribute information of the two target objects.
In another possible implementation, the preset rules include at least one of the following:
the IoU is greater than the first preset threshold, and the gender information of the target objects corresponding to the two candidate frames is identical;
the IoU is greater than the second preset threshold, and the gender information of the target objects corresponding to the two candidate frames is different, the first preset threshold being less than the second preset threshold;
the IoU is greater than the first preset threshold, and the gap between the probabilities that the two target objects are of a given gender is not greater than the third preset threshold;
the IoU is greater than the second preset threshold, and the gap between the probabilities that the two target objects are of a given gender is greater than the third preset threshold;
the IoU is greater than the first preset threshold, and the age gap between the two target objects is not greater than the fourth preset threshold;
the IoU is greater than the second preset threshold, and the age gap between the two target objects is greater than the fourth preset threshold;
the IoU is greater than the fifth preset threshold, where the fifth preset threshold = the first preset threshold + the first parameter value × the age gap between the two target objects;
the IoU is greater than the sixth preset threshold, where the sixth preset threshold = the first preset threshold + the first parameter value × the age gap between the two target objects + the second parameter value × the gap between the probabilities that the two target objects are of a given gender;
the IoU is greater than the seventh preset threshold, where the seventh preset threshold = the first preset threshold + the second parameter value × the difference between the facing-direction radian values of the two target objects.
In a third aspect, an electronic device is provided, the electronic device comprising:
one or more processors;
a memory; and
one or more application programs, wherein the one or more application programs are stored in the memory and configured to be executed by the one or more processors, the one or more programs being configured to perform the operations corresponding to the image processing method shown in the first aspect or in any possible implementation of the first aspect.
In a fourth aspect, a computer-readable storage medium is provided. The storage medium stores at least one instruction, at least one program, a code set, or an instruction set, which is loaded and executed by a processor to implement the image processing method shown in the first aspect or in any possible implementation of the first aspect.
The technical solution provided by this application has the following beneficial effects. Compared with the prior-art non-maximum suppression method, which merges candidate frames based only on the location information of each candidate frame, this application determines the location information corresponding to each candidate frame in the image to be processed, each candidate frame being a candidate frame for a target object; then determines, based on the image to be processed and the location information corresponding to each candidate frame, the attribute information of the target object corresponding to each candidate frame respectively through a preset model; and then merges the candidate frames through a non-maximum suppression algorithm based on the location information corresponding to each candidate frame and the attribute information of the corresponding target objects. In other words, when merging the candidate frames in the image to be processed, this application relies on both the location information of each candidate frame and the attribute information of the corresponding target objects. This reduces the probability that candidate frames which are close together in the image but belong to different target objects are merged into one candidate frame, and also reduces the cases in which multiple candidate frames belonging to the same target object are merged into more than one frame. The accuracy of merging the candidate frames is thereby improved, and the user experience is enhanced.
Detailed description of the invention
In order to explain the technical solutions in the embodiments of this application more clearly, the drawings required in the description of the embodiments are briefly introduced below.
Fig. 1 is a flow diagram of an image processing method provided by an embodiment of this application;
Fig. 2 is a schematic structural diagram of an image processing apparatus provided by an embodiment of this application;
Fig. 3 is a schematic structural diagram of an electronic device for image processing provided by an embodiment of this application;
Fig. 4 is an overall schematic diagram of an image processing method provided by an embodiment of this application;
Fig. 5 is an example of the representation of a candidate frame in an embodiment of this application.
Specific embodiment
The embodiments of this application are described in detail below, and examples of the embodiments are shown in the accompanying drawings, in which the same or similar reference numbers throughout denote the same or similar elements, or elements with the same or similar functions. The embodiments described below with reference to the drawings are exemplary; they are only used to explain this application and are not to be construed as limiting the claims.
Those skilled in the art will appreciate that, unless expressly stated otherwise, the singular forms "a", "an", "the", and "said" used herein may also include the plural forms. It should be further understood that the word "comprising" used in the description of this application means that the stated features, integers, steps, operations, elements, and/or components are present, but does not exclude the presence or addition of one or more other features, integers, steps, operations, elements, components, and/or groups thereof. It should be understood that when an element is said to be "connected" or "coupled" to another element, it may be directly connected or coupled to the other element, or intermediate elements may also be present. In addition, "connection" or "coupling" as used herein may include a wireless connection or wireless coupling. The term "and/or" as used herein includes all or any units and all combinations of one or more of the associated listed items.
To make the purposes, technical solutions, and advantages of this application clearer, the embodiments of this application are described in further detail below in conjunction with the drawings.
The technical solution of this application, and how it solves the above technical problems, is described in detail below with specific embodiments. The specific embodiments below may be combined with one another, and the same or similar concepts or processes may not be repeated in some embodiments. The embodiments of this application are described below in conjunction with the drawings.
An embodiment of this application provides an image processing method. As shown in Fig. 1, the method comprises:
Step S101: determining the location information corresponding to each candidate frame in the image to be processed.
Here, each candidate frame is a candidate frame for a target object.
For this embodiment, before step S101 the method may further comprise: preprocessing multimedia information to obtain the image to be processed.
For this embodiment, the candidate region is the region occupied by the identified target object.
For this embodiment, a candidate frame may be identification information used to indicate the region of the target object in the image, marked by an arbitrary shape, for example a rectangle or a square. For example, as shown in Fig. 5, the square frame in the image is a candidate frame used to indicate the face region of a gorilla in that image.
Step S102: determining, based on the image to be processed and the location information corresponding to each candidate frame, the attribute information of the target object corresponding to each candidate frame respectively through a preset model.
For this embodiment, after the location information corresponding to each candidate frame is obtained, the location information and the image to be processed are input into the network model, which outputs the attribute information of the target object corresponding to each candidate frame.
For this embodiment, before the attribute information of the target object corresponding to each candidate frame is determined through the preset model, the method may further comprise training the preset model. Specifically, training samples are obtained and the preset model is trained with them; the training samples include a large number of images containing target objects, the location information of the candidate frames for the target objects, and the attribute information of the target objects.
Step S103: merging the candidate frames through a non-maximum suppression algorithm, based on the location information corresponding to each candidate frame and the attribute information of the target object corresponding to each candidate frame.
For this embodiment, the candidate frames that can be merged are determined through the non-maximum suppression algorithm based on the location information corresponding to each candidate frame and the attribute information of the corresponding target objects, and the merging is carried out to obtain the location information of the merged candidate frames.
For this embodiment, the location information corresponding to any candidate frame may include at least one of the following: the center position information of the candidate frame; the location information corresponding to each vertex; and the intersection-over-union between the candidate frame and the current candidate frame. Intersection-over-union (IoU) is a common concept in object detection: the overlap ratio between the generated candidate frame and the original marked frame, i.e. the ratio of their intersection to their union. In the most ideal case they overlap completely, i.e. the ratio is 1.
For example, the IoU between any candidate frame A and the current candidate frame B is IoU = area(A ∩ B) / area(A ∪ B), where area(A) is the region corresponding to candidate frame A and area(B) is the region corresponding to the current candidate frame B.
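The IoU formula above can be computed directly from corner coordinates. The (x1, y1, x2, y2) box representation used below is an assumption of this sketch, not a representation prescribed by the patent.

```python
def iou(box_a, box_b):
    """Intersection-over-union of two boxes given as (x1, y1, x2, y2).

    Returns area(A ∩ B) / area(A ∪ B); 1.0 for identical boxes,
    0.0 for disjoint ones.
    """
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0
```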
For this embodiment, the attribute information of the target object corresponding to any candidate frame may be represented in the form of a matrix, a vector, or discrete data; the embodiment of this application places no limitation on this.
For this embodiment, through the above technical solution, the multiple candidate frames for a target object are merged, and tracking of the target image and the like can then be carried out based on the merged candidate frame. While improving the accuracy of merging the candidate frames, this also improves the accuracy with which the target object is tracked or otherwise processed based on the candidate frames, thereby enhancing the user experience.
An embodiment of the application provides a method of image processing. Compared with the prior-art non-maximum suppression method, which merges candidate frames based only on the location information of each candidate frame, this embodiment determines the location information corresponding to each candidate frame in the image to be processed, each candidate frame being a candidate frame for a target object; then determines, based on the image to be processed and the location information corresponding to each candidate frame, the attribute information of the target object corresponding to each candidate frame through a preset model; and then performs merging processing on the candidate frames through a non-maximum suppression algorithm based on the location information corresponding to each candidate frame and the attribute information of the target object corresponding to each candidate frame. That is, when merging the candidate frames in the image to be processed, this embodiment relies on both the location information of each candidate frame and the attribute information of the target object corresponding to each candidate frame. This reduces the probability that candidate frames that are close together in the image but belong to different target objects are merged into one candidate frame, and also reduces the cases in which multiple candidate frames belonging to the same target object are merged into more than one frame, thereby improving the accuracy of merging the candidate frames in the image and in turn improving the user experience.
In one possible implementation of this embodiment of the application, for any two candidate frames, step S103 may specifically include step S1031 (not shown), step S1032 (not shown) and step S1033 (not shown), where:
Step S1031: determine the Intersection-over-Union (IoU) of the two candidate frames based on the location information corresponding to each candidate frame.
Step S1032: determine, through a non-maximum suppression algorithm and based on the IoU of the two candidate frames and the attribute information of the target object corresponding to each of the two candidate frames, whether the two candidate frames are candidate frames to be merged.
Step S1033: if they are candidate frames to be merged, perform merging processing on the two candidate frames.
In this embodiment of the application, performing merging processing on any two candidate frames through a non-maximum suppression algorithm based on their IoU and the attribute information of the target object corresponding to each of the two candidate frames reduces the probability that candidate frames that are close together in the image but belong to different target objects are merged into one candidate frame, and also reduces the cases in which multiple candidate frames belonging to the same target object are merged into more than one frame, thereby improving the accuracy of merging the candidate frames in the image and in turn improving the user experience.
In this embodiment of the application, when merging multiple candidate frames, a reference candidate frame may be chosen first. The IoU between any other candidate frame and the reference candidate frame is determined, together with the attribute information of the target objects corresponding to the other candidate frame and the reference candidate frame; if it is determined that they can be merged, they are merged, and the merged candidate frame is used as the new reference candidate frame. The steps of determining the IoU between the reference candidate frame and any other candidate frame, determining the attribute information of the target objects corresponding to the other candidate frame and the reference candidate frame, and merging if a merge is possible are then executed in a loop until a preset condition is met.
For example, suppose there are 4 candidate frames for a target object: candidate frame 1, candidate frame 2, candidate frame 3 and candidate frame 4. Candidate frame 1 is first chosen as the reference candidate frame, the IoU between candidate frame 1 and candidate frame 2 is determined along with the attribute information of the target objects corresponding to candidate frame 1 and candidate frame 2, and if they can be merged, candidate frame 1 and candidate frame 2 are merged. The IoU between the merged candidate frame and candidate frame 3 is then determined along with the attribute information of the target objects corresponding to the merged candidate frame and candidate frame 3, and they are merged if possible. The IoU between the resulting merged candidate frame and candidate frame 4 is then determined along with the attribute information of the target objects corresponding to the merged candidate frame and candidate frame 4, and if they can be merged, the merged candidate frame and candidate frame 4 are merged, finally yielding the location information of the merged candidate frame.
In this embodiment of the application, if candidate frames that cannot be merged remain at the end, a target candidate frame is determined from among the candidate frames that cannot be merged. In this embodiment, the location information of the target candidate frame is used to characterize the final location of the target object in the image.
For example, if in the above example the merged candidate frame (obtained by merging candidate frames 1, 2 and 3) and candidate frame 4 cannot be merged, a target candidate frame (which may be either the merged candidate frame or candidate frame 4) is determined from the merged candidate frame and candidate frame 4.
In this embodiment of the application, the IoU corresponding to any two candidate frames and the attribute information of the target object corresponding to each of the two candidate frames may also be determined separately to identify the candidate frames to be merged, and merging processing is performed to obtain the processed candidate frames (the merged candidate frames and/or the candidate frames that were not merged). The IoU corresponding to any two processed candidate frames and the attribute information of the target object corresponding to each processed candidate frame are then determined, the candidate frames to be merged are identified based on these, and merging processing is performed. The step of determining, based on the IoU corresponding to any two processed candidate frames and the attribute information of the target object corresponding to each processed candidate frame, the candidate frames to be merged and performing merging processing is executed in a loop until a preset condition is met.
For example, suppose there are 4 candidate frames for a target object: candidate frame 1, candidate frame 2, candidate frame 3 and candidate frame 4. First, the IoU between candidate frame 1 and candidate frame 2 and the attribute information of the target objects corresponding to candidate frames 1 and 2 are determined, along with the IoU between candidate frame 3 and candidate frame 4 and the attribute information of the target objects corresponding to candidate frames 3 and 4. Then, based on the IoU between candidate frames 1 and 2 and the attribute information of their corresponding target objects, it is determined whether candidate frames 1 and 2 can be merged, and merging processing is performed; likewise, based on the IoU between candidate frames 3 and 4 and the attribute information of their corresponding target objects, it is determined whether candidate frames 3 and 4 can be merged, and merging processing is performed. If candidate frames 1 and 2 are merged and candidate frames 3 and 4 are merged, the IoU between the two merged candidate frames and the attribute information of the target object corresponding to each of the two merged candidate frames are determined, it is determined whether the two merged candidate frames should be merged, and merging processing is performed. If, instead, candidate frames 1 and 2 can be merged but candidate frames 3 and 4 cannot, the merged candidate frame is merged in turn with candidate frame 3 and candidate frame 4, until a preset condition is met.
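The round-based variant described above can be sketched as a loop that merges mergeable pairs among the current candidate frames in each pass and repeats on the results until no pair can be merged (one of the preset conditions); `can_merge` and `merge` are hypothetical stand-ins for the IoU/attribute check and the merge rule:

```python
def merge_rounds(frames, can_merge, merge):
    changed = True
    while changed:                        # repeat until a pass merges nothing
        changed, out = False, []
        for frame in frames:
            for i, kept in enumerate(out):
                if can_merge(kept, frame):
                    out[i] = merge(kept, frame)  # fold into an existing frame
                    changed = True
                    break
            else:
                out.append(frame)         # no merge partner this pass
        frames = out
    return frames
```

This is a greedy interpretation of the loop; the embodiment only requires that merging be repeated on the processed candidate frames until the preset condition is met.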
In this embodiment of the application, the above preset condition may include at least one of the following:
no candidate frames that can be merged remain;
one candidate frame remains for the target object;
one candidate frame remains for each target object;
a condition entered by the user.
In another possible implementation of this embodiment of the application, the attribute information of the target object corresponding to any candidate frame includes at least one of the following:
the color information of the target object; the density information of the target object; the shape information of the target object; the category information of the target object.
If the candidate frame is a candidate frame for face information, the attribute information of the target object corresponding to any candidate frame includes at least one of the following:
the age information of the target object; the gender information of the target object; the skin color information of the target object; the facing-direction radian information of the target object.
In another possible implementation of this embodiment of the application, if the candidate frames are candidate frames for face information, step S1032 may specifically include step S10321 (not shown), where:
Step S10321: determine, through a non-maximum suppression algorithm, whether the IoU of the two candidate frames and the attribute information of the face information corresponding to each of the two candidate frames meet preset rules, to determine whether the two candidate frames are candidate frames to be merged.
The preset rules include at least one of the following:
the IoU is greater than a first preset threshold and the first attribute information of the target objects corresponding to the two candidate frames is identical;
the IoU is greater than a second preset threshold and the first attribute information of the target objects corresponding to the two candidate frames is not identical, the first preset threshold being smaller than the second preset threshold;
the IoU is greater than the first preset threshold, and the difference between the probabilities that the first attribute information of the target objects corresponding to the two candidate frames is preset attribute information is not greater than a third preset threshold;
the IoU is greater than the second preset threshold, and the difference between the probabilities that the first attribute information of the target objects corresponding to the two candidate frames is preset attribute information is not less than the third preset threshold;
the IoU is greater than the first preset threshold, and the difference between the second attribute information of the target objects corresponding to the two candidate frames is not greater than a fourth preset threshold;
the IoU is greater than the second preset threshold, and the difference between the second attribute information of the target objects corresponding to the two candidate frames is greater than the fourth preset threshold;
the IoU is greater than a fifth preset threshold, the fifth preset threshold being a weighted combination of the first preset threshold and the difference between the first attribute information of the target objects corresponding to the two candidate frames;
the IoU is greater than a sixth preset threshold, the sixth preset threshold being a weighted combination of the first preset threshold, the difference between the first attribute information of the target objects corresponding to the two candidate frames, and the difference between the probabilities that the second attribute information of the target objects corresponding to the two candidate frames is preset attribute information;
the IoU is greater than a seventh preset threshold, the seventh preset threshold being a weighted combination of the first preset threshold and the difference between the third attribute information of the target objects corresponding to the two candidate frames.
Specifically, the preset rules include at least one of the following:
the IoU is greater than the first preset threshold and the gender information of the target objects corresponding to the two candidate frames is identical;
the IoU is greater than the second preset threshold and the gender information of the target objects corresponding to the two candidate frames is different, the first preset threshold being smaller than the second preset threshold;
the IoU is greater than the first preset threshold, and the difference between the probabilities that the target objects corresponding to the two candidate frames are of a given gender is not greater than the third preset threshold;
the IoU is greater than the second preset threshold, and the difference between the probabilities that the target objects corresponding to the two candidate frames are of a given gender is greater than the third preset threshold;
the IoU is greater than the first preset threshold, and the age difference between the target objects corresponding to the two candidate frames is not greater than the fourth preset threshold;
the IoU is greater than the second preset threshold, and the age difference between the target objects corresponding to the two candidate frames is greater than the fourth preset threshold;
the IoU is greater than the fifth preset threshold, where the fifth preset threshold = the first preset threshold + a first parameter value × the age difference between the target objects corresponding to the two candidate frames;
the IoU is greater than the sixth preset threshold, where the sixth preset threshold = the first preset threshold + the first parameter value × the age difference between the target objects corresponding to the two candidate frames + a second parameter value × the difference between the probabilities that the target objects corresponding to the two candidate frames are of a given gender;
the IoU is greater than the seventh preset threshold, where the seventh preset threshold = the first preset threshold + the second parameter value × the difference between the facing-direction radian values corresponding to the two candidate frames.
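As an illustration of the first two concrete rules above (same gender merges at the lower threshold, different gender only at a near-complete overlap), assuming the example thresholds 0.3 and 0.95 used later in this description:

```python
def faces_mergeable(iou, gender_a, gender_b, t1=0.3, t2=0.95):
    # Same gender: merge at the lower (first) threshold.
    if gender_a == gender_b:
        return iou > t1
    # Different gender: only merge when the frames overlap almost completely.
    return iou > t2
```

The function names and the string form of the gender attribute are illustrative; the embodiment only specifies the threshold comparisons.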
For example, if the candidate frames are candidate frames for face information, the condition for merging two candidate frames when the non-maximum suppression algorithm is performed may be that their IoU is greater than 0.3 and the gender attributes corresponding to the two candidate frames are identical; or that their IoU is greater than 0.3 and the gender attributes are identical, or their IoU is greater than 0.95 and the gender attributes are not identical; or that their IoU is greater than 0.3 and the difference between their probabilities of being female is not greater than 0.4; or that their IoU is greater than 0.3 and the difference between their probabilities of being female is not greater than 0.4, or their IoU is greater than 0.95 and the difference between their probabilities of being female is at least 0.4; or that their IoU is greater than 0.3 and their age (attribute) difference is not greater than 5 years; or that their IoU is greater than 0.3 and their age (attribute) difference is not greater than 5 years, or their IoU is greater than 0.95 and their age (attribute) difference is at least 5 years; or that their IoU is greater than (0.3 + 0.05 × their age difference); or that their IoU is greater than (0.3 + 0.05 × their age difference + 0.5 × the difference between their probabilities of being female); or, for two face frames, that their IoU is greater than (0.3 + 0.5 × the difference between their facing-direction radian values). This last rule differs from the preceding ones in that age and gender can be considered attributes of a person that do not change over a short time, whereas facing direction changes continuously; but because the change in facing direction is continuous, it can still assist the non-maximum suppression algorithm.
For example, the candidate frames for faces in an image to be processed include candidate frame 1 and candidate frame 2; the IoU between candidate frame 1 and candidate frame 2 is 0.5, the age (attribute) corresponding to candidate frame 1 is 28, and the age (attribute) corresponding to candidate frame 2 is 32; candidate frame 1 and candidate frame 2 can then be merged.
In another example the candidate frame for face includes candidate frame 3 and candidate frame 4, between candidate frame 3 and candidate frame 4
IoU is 0.5, and the 3 corresponding age of candidate frame (attribute) is 28, and the 4 corresponding age of candidate frame (attribute) is 32, and candidate frame 3 is corresponding
Target object be the probability of women be 0.8, the corresponding target object of candidate frame 4 is that the attribute of women is 0.6, the first default threshold
Value is 0.3, and the first parameter value is 0.05, and the second parameter value is 0.5, then first the+the first parameter value of preset threshold × any two times
Select the+the second parameter value of age gap × corresponding target object of any two candidate frames of the corresponding target object of frame
Gender is the probability difference of women away from=0.3+0.05 × 4+0.5 × 0.2=0.6, i.e. the 6th preset threshold is 0.6, and candidate frame 3
IoU between candidate frame 4 is 0.5, then candidate frame 3 and candidate frame 4 cannot merge.
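The arithmetic of this example can be checked directly; the names `t1`, `k1` and `k2` for the first preset threshold and the two parameter values are illustrative:

```python
t1, k1, k2 = 0.3, 0.05, 0.5   # first preset threshold, first and second parameter values
# Sixth preset threshold for candidate frames 3 and 4:
# ages 28 and 32, female probabilities 0.8 and 0.6.
t6 = t1 + k1 * abs(28 - 32) + k2 * abs(0.8 - 0.6)
print(round(t6, 2))   # 0.6
print(0.5 > t6)       # False: an IoU of 0.5 does not exceed 0.6, so no merge
```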
In this embodiment of the application, determining through the non-maximum suppression algorithm, based on the IoU of any two candidate frames, the attribute information of the face information corresponding to each of the two candidate frames and the preset rules, whether the two candidate frames are candidate frames to be merged improves the accuracy of merging candidate frames for faces, which in turn improves the accuracy of operations such as face tracking and thereby improves the user experience.
In this embodiment of the application, merging processing through the above preset rules is not limited to candidate frames for faces; other candidate frames may also be merged through the non-maximum suppression algorithm using corresponding rules.
For example, if the target object is vehicle information and the candidate frames are candidate frames for vehicle information, any two candidate frames whose IoU is greater than 0.3 and which belong to the same category are merged.
In another possible implementation of this embodiment of the application, step Sa (not shown) may be included before step S101, and step S101 may specifically include step S1011 (not shown), where:
Step Sa: preprocess multimedia information to obtain the image to be processed.
In this embodiment of the application, the preprocessing may include processing of the raw data such as image correction and denoising.
Step S1011: pass the image to be processed through a detection network to determine the location information corresponding to each candidate frame in the image to be processed.
In this embodiment of the application, the detection network may be a trained neural network.
The method of image processing is described in detail above; the method is now summarized through an application scenario, as shown in Fig. 4:
A frame image to be processed is obtained from a camera video stream after preprocessing; the frame image enters the detection network to obtain the location information of at least one face frame; the frame image and the location information of the at least one face frame are then passed through a face attribute network to obtain the attribute information of the target object corresponding to each face frame; and based on the attribute information obtained through the face attribute network and the location information of each face frame in the frame image, the face frames are merged to obtain the location information of the merged face frame.
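The application-scenario pipeline above can be sketched as follows; `detect_faces`, `face_attributes` and `merge_face_frames` are hypothetical stand-ins for the detection network, the face attribute network and the attribute-aware non-maximum suppression step:

```python
def process_frame(frame, detect_faces, face_attributes, merge_face_frames):
    boxes = detect_faces(frame)                         # detection network: face frame locations
    attrs = [face_attributes(frame, b) for b in boxes]  # face attribute network: per-frame attributes
    return merge_face_frames(boxes, attrs)              # attribute-aware NMS merging
```

The three callables are injected so that any detection network, attribute network and merge rule satisfying the description above can be plugged in.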
The method of image processing is described above from the perspective of method flow; the device of image processing is now introduced from the perspective of virtual modules with reference to the accompanying drawings, as follows:
An embodiment of the application provides a device of image processing. As shown in Fig. 2, the device 20 of image processing may include a first determining module 21, a second determining module 22 and a merging processing module 23, where:
the first determining module 21 is configured to determine the location information corresponding to each candidate frame in the image to be processed, each candidate frame being a candidate frame for a target object;
the second determining module 22 is configured to determine, based on the image to be processed and the location information corresponding to each candidate frame determined by the first determining module 21, the attribute information of the target object corresponding to each candidate frame through a preset model.
In this embodiment of the application, the first determining module 21 and the second determining module 22 may be the same determining module or different determining modules. This is not limited in this embodiment of the application.
The merging processing module 23 is configured to perform merging processing on the candidate frames through a non-maximum suppression algorithm based on the location information corresponding to each candidate frame and the attribute information of the target object corresponding to each candidate frame.
In one possible implementation of this embodiment of the application, for any two candidate frames, the merging processing module 23 includes a first determination unit, a second determination unit and a merge processing unit, where:
the first determination unit is configured to determine the Intersection-over-Union (IoU) of the two candidate frames based on the location information corresponding to each candidate frame;
the second determination unit is configured to determine, through a non-maximum suppression algorithm and based on the IoU of the two candidate frames determined by the first determination unit and the attribute information of the target object corresponding to each of the two candidate frames, whether the two candidate frames are candidate frames to be merged.
In this embodiment of the application, the first determination unit and the second determination unit may be the same determination unit or different determination units. This is not limited in this embodiment of the application.
The merge processing unit is configured to perform merging processing on the two candidate frames when the second determination unit determines that they are candidate frames to be merged.
In another possible implementation of this embodiment of the application, the device 20 further includes a preprocessing module, where the preprocessing module is configured to preprocess multimedia information to obtain the image to be processed.
Specifically, the first determining module 21 is configured to pass the image to be processed through a detection network to determine the location information corresponding to each candidate frame in the image to be processed.
In another possible implementation of this embodiment of the application, the attribute information of the target object corresponding to any candidate frame includes at least one of the following:
the color information of the target object; the density information of the target object; the shape information of the target object; the category information of the target object.
If the candidate frame is a candidate frame for face information, the attribute information of the target object corresponding to any candidate frame includes at least one of the following:
the age information of the target object; the gender information of the target object; the skin color information of the target object; the facing-direction radian information of the target object.
In another possible implementation of this embodiment of the application, when the candidate frames are candidate frames for face information, the second determination unit is specifically configured to determine, through a non-maximum suppression algorithm, whether the IoU of the two candidate frames and the attribute information of the face information corresponding to each of the two candidate frames meet preset rules, to determine whether the two candidate frames are candidate frames to be merged.
The preset rules include at least one of the following:
the IoU is greater than a first preset threshold and the first attribute information of the target objects corresponding to the two candidate frames is identical;
the IoU is greater than a second preset threshold and the first attribute information of the target objects corresponding to the two candidate frames is not identical, the first preset threshold being smaller than the second preset threshold;
the IoU is greater than the first preset threshold, and the difference between the probabilities that the first attribute information of the target objects corresponding to the two candidate frames is preset attribute information is not greater than a third preset threshold;
the IoU is greater than the second preset threshold, and the difference between the probabilities that the first attribute information of the target objects corresponding to the two candidate frames is preset attribute information is not less than the third preset threshold;
the IoU is greater than the first preset threshold, and the difference between the second attribute information of the target objects corresponding to the two candidate frames is not greater than a fourth preset threshold;
the IoU is greater than the second preset threshold, and the difference between the second attribute information of the target objects corresponding to the two candidate frames is greater than the fourth preset threshold;
the IoU is greater than a fifth preset threshold, the fifth preset threshold being a weighted combination of the first preset threshold and the difference between the first attribute information of the target objects corresponding to the two candidate frames;
the IoU is greater than a sixth preset threshold, the sixth preset threshold being a weighted combination of the first preset threshold, the difference between the first attribute information of the target objects corresponding to the two candidate frames, and the difference between the probabilities that the second attribute information of the target objects corresponding to the two candidate frames is preset attribute information;
the IoU is greater than a seventh preset threshold, the seventh preset threshold being a weighted combination of the first preset threshold and the difference between the third attribute information of the target objects corresponding to the two candidate frames.
Specifically, the preset rules include at least one of the following:
the IoU is greater than the first preset threshold and the gender information of the target objects corresponding to the two candidate frames is identical;
the IoU is greater than the second preset threshold and the gender information of the target objects corresponding to the two candidate frames is different, the first preset threshold being smaller than the second preset threshold;
the IoU is greater than the first preset threshold, and the difference between the probabilities that the target objects corresponding to the two candidate frames are of a given gender is not greater than the third preset threshold;
the IoU is greater than the second preset threshold, and the difference between the probabilities that the target objects corresponding to the two candidate frames are of a given gender is greater than the third preset threshold;
the IoU is greater than the first preset threshold, and the age difference between the target objects corresponding to the two candidate frames is not greater than the fourth preset threshold;
the IoU is greater than the second preset threshold, and the age difference between the target objects corresponding to the two candidate frames is greater than the fourth preset threshold;
the IoU is greater than the fifth preset threshold, where the fifth preset threshold = the first preset threshold + a first parameter value × the age difference between the target objects corresponding to the two candidate frames;
the IoU is greater than the sixth preset threshold, where the sixth preset threshold = the first preset threshold + the first parameter value × the age difference between the target objects corresponding to the two candidate frames + a second parameter value × the difference between the probabilities that the target objects corresponding to the two candidate frames are of a given gender;
the IoU is greater than the seventh preset threshold, where the seventh preset threshold = the first preset threshold + the second parameter value × the difference between the facing-direction radian values corresponding to the two candidate frames.
An embodiment of the present application provides an image processing device. Compared with the prior-art non-maximum suppression method, which merges candidate frames based only on the location information of each candidate frame, the embodiment of the present application determines the location information corresponding to each candidate frame in the image to be processed, where a candidate frame is a candidate frame for a target object; then, based on the image to be processed and the location information corresponding to each candidate frame, determines the attribute information of the target object corresponding to each candidate frame through a preset model; and then merges the candidate frames through a non-maximum suppression algorithm based on both the location information corresponding to each candidate frame and the attribute information of the target object corresponding to each candidate frame. That is, when merging the candidate frames in the image to be processed, the embodiment of the present application relies on both the location information of each candidate frame and the attribute information of its corresponding target object. This reduces the probability that candidate frames which are close together in the image but belong to different target objects are merged into a single candidate frame, and also reduces the cases in which multiple candidate frames belonging to the same target object are merged into more than one frame, thereby improving the accuracy with which the candidate frames in the image are processed and improving the user experience.
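The behaviour described above can be illustrated with a greedy non-maximum suppression loop extended by an attribute check, so that heavily overlapping frames survive when their attributes disagree. This is only a sketch under assumed names and a single IoU threshold, not the patented implementation:

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area = lambda r: (r[2] - r[0]) * (r[3] - r[1])
    union = area(a) + area(b) - inter
    return inter / union if union else 0.0

def attribute_nms(boxes, scores, attrs, iou_thresh=0.5):
    """Greedy NMS that suppresses an overlapping box only when its
    attribute information also matches the kept box, so nearby but
    different target objects keep separate candidate frames."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    while order:
        i = order.pop(0)
        keep.append(i)
        order = [j for j in order
                 if not (iou(boxes[i], boxes[j]) > iou_thresh
                         and attrs[i] == attrs[j])]
    return keep
```

With plain NMS, a lower-scoring frame that fully overlaps the kept frame would always be suppressed; with the attribute check, it survives when its attributes differ, which is exactly the case of two different target objects close together in the image.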
The image processing device of this embodiment can perform the image processing method provided by the foregoing method embodiments of the present application; the implementation principle is similar and is not described again here.
The above embodiments describe the image processing method from the angle of the method flow and the image processing device from the angle of virtual modules. In the following, with reference to the accompanying drawings, an electronic device that performs the image processing method is described from the angle of its physical structure, as follows:
An embodiment of the present application provides an electronic device. As shown in Fig. 3, the electronic device 3000 includes a processor 3001 and a memory 3003, where the processor 3001 is connected to the memory 3003, for example through a bus 3002. Optionally, the electronic device 3000 may further include a transceiver 3004. It should be noted that, in practical applications, the transceiver 3004 is not limited to one, and the structure of the electronic device 3000 does not constitute a limitation on the embodiments of the present application.
The processor 3001 may be a CPU, a general-purpose processor, a DSP, an ASIC, an FPGA or another programmable logic device, a transistor logic device, a hardware component, or any combination thereof. It may implement or execute the various illustrative logical blocks, modules, and circuits described in connection with the present disclosure. The processor 3001 may also be a combination that implements computing functions, for example a combination of one or more microprocessors, or a combination of a DSP and a microprocessor.
The bus 3002 may include a path that transfers information between the above components. The bus 3002 may be a PCI bus, an EISA bus, or the like, and may be divided into an address bus, a data bus, a control bus, and so on. For ease of representation, only one thick line is used in Fig. 3, but this does not mean that there is only one bus or only one type of bus.
The memory 3003 may be a ROM or another type of static storage device that can store static information and instructions, a RAM or another type of dynamic storage device that can store information and instructions, an EEPROM, a CD-ROM or other optical disc storage (including compact discs, laser discs, digital versatile discs, Blu-ray discs, etc.), magnetic disk storage media or other magnetic storage devices, or any other medium that can be used to carry or store desired program code in the form of instructions or data structures and that can be accessed by a computer, but is not limited thereto.
The memory 3003 is used to store application program code for executing the solution of the present application, and the execution is controlled by the processor 3001. The processor 3001 is configured to execute the application program code stored in the memory 3003 to implement the content shown in any of the foregoing method embodiments.
An embodiment of the present application provides an electronic device. The electronic device in the embodiment of the present application includes a memory and a processor, and at least one program is stored in the memory. When the program is executed by the processor, the following can be achieved compared with the prior art: the location information corresponding to each candidate frame in the image to be processed is determined, where a candidate frame is a candidate frame for a target object; based on the image to be processed and the location information corresponding to each candidate frame, the attribute information of the target object corresponding to each candidate frame is determined through a preset model; and the candidate frames are then merged through a non-maximum suppression algorithm based on both the location information corresponding to each candidate frame and the attribute information of the target object corresponding to each candidate frame. That is, when the present application merges the candidate frames in the image to be processed, it relies on both the location information of each candidate frame and the attribute information of its corresponding target object. This reduces the probability that candidate frames which are close together in the image but belong to different target objects are merged into a single candidate frame, reduces the cases in which multiple candidate frames belonging to the same target object are merged into more than one frame, and thereby improves the accuracy with which the candidate frames in the image are processed and improves the user experience.
The electronic device of this embodiment can perform the image processing method provided by the foregoing method embodiments of the present application; the implementation principle is similar and is not described again here.
An embodiment of the present application provides a computer-readable storage medium on which a computer program is stored; when run on a computer, the program enables the computer to execute the corresponding content in the foregoing method embodiments. Compared with the prior art, the embodiment of the present application determines the location information corresponding to each candidate frame in the image to be processed, where a candidate frame is a candidate frame for a target object; then, based on the image to be processed and the location information corresponding to each candidate frame, determines the attribute information of the target object corresponding to each candidate frame through a preset model; and then merges the candidate frames through a non-maximum suppression algorithm based on both the location information corresponding to each candidate frame and the attribute information of the target object corresponding to each candidate frame. That is, when the present application merges the candidate frames in the image to be processed, it relies on both the location information of each candidate frame and the attribute information of its corresponding target object. This reduces the probability that candidate frames which are close together in the image but belong to different target objects are merged into a single candidate frame, reduces the cases in which multiple candidate frames belonging to the same target object are merged into more than one frame, and thereby improves the accuracy with which the candidate frames in the image are processed and improves the user experience.
The computer-readable storage medium provided by the embodiments of the present application is applicable to the above method embodiments and is not described again here.
It should be understood that although the steps in the flowcharts of the accompanying drawings are shown in the order indicated by the arrows, these steps are not necessarily executed in that order. Unless explicitly stated herein, the execution of these steps is not strictly limited in sequence, and they may be executed in other orders. Moreover, at least some of the steps in the flowcharts may include multiple sub-steps or stages, which are not necessarily completed at the same moment but may be executed at different times; their execution order is not necessarily sequential, and they may be executed in turn or alternately with at least part of the sub-steps or stages of other steps.
The above are only some embodiments of the present invention. It should be noted that, for those of ordinary skill in the art, various improvements and modifications may be made without departing from the principle of the present invention, and these improvements and modifications should also be regarded as falling within the protection scope of the present invention.
Claims (10)
1. An image processing method, characterized by comprising:
determining the location information corresponding to each candidate frame in an image to be processed, wherein the candidate frame is a candidate frame for a target object;
based on the image to be processed and the location information corresponding to each candidate frame, determining, through a preset model, the attribute information of the target object corresponding to each candidate frame;
merging each candidate frame through a non-maximum suppression algorithm based on the location information corresponding to each candidate frame and the attribute information of the target object corresponding to each candidate frame.
2. The method according to claim 1, characterized in that, for any two candidate frames, merging each candidate frame through a non-maximum suppression algorithm based on the location information corresponding to each candidate frame and the attribute information of the target object corresponding to each candidate frame comprises:
determining the intersection-over-union (IoU) of the any two candidate frames based on the location information corresponding to each candidate frame;
determining, through the non-maximum suppression algorithm and based on the IoU of the any two candidate frames and the attribute information of the target object corresponding to each of the any two candidate frames, whether the any two candidate frames belong to candidate frames to be merged;
if they belong to candidate frames to be merged, merging the any two candidate frames.
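One possible reading of this claim as code: iterate over candidate pairs, apply the merge decision, and for each pair judged to belong together keep only one frame. The dictionary layout and the keep-the-higher-score policy below are assumptions for illustration; the claim itself does not fix the merge policy:

```python
from itertools import combinations

def pairwise_merge(candidates, belongs_together):
    """candidates: list of dicts with at least a 'score' key plus the
    location and attribute fields that belongs_together inspects.
    For each pair on which belongs_together returns True, the
    higher-scoring member is kept and the other is suppressed."""
    suppressed = set()
    for i, j in combinations(range(len(candidates)), 2):
        if i in suppressed or j in suppressed:
            continue
        if belongs_together(candidates[i], candidates[j]):
            loser = i if candidates[i]["score"] < candidates[j]["score"] else j
            suppressed.add(loser)
    return [c for k, c in enumerate(candidates) if k not in suppressed]
```

In practice `belongs_together` would combine the pair's IoU with the attribute rules of claims 6 and 7; here it is left as a parameter so the merge pass stays independent of any particular rule set.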
3. The method according to claim 1, characterized in that, before determining the location information corresponding to each candidate frame in the image to be processed, the method further comprises:
preprocessing multimedia information to obtain the image to be processed;
and determining the location information corresponding to each candidate frame in the image to be processed comprises:
passing the image to be processed through a detection network to determine the location information corresponding to each candidate frame in the image to be processed.
4. The method according to any one of claims 1 to 3, characterized in that the attribute information of the target object corresponding to any candidate frame includes at least one of the following:
color information of the target object;
density information of the target object;
shape information of the target object;
category information of the target object.
5. The method according to any one of claims 1 to 3, characterized in that, if the candidate frame is a candidate frame for face information, the attribute information of the target object corresponding to any candidate frame includes at least one of the following:
age information of the target object;
gender information of the target object;
skin color information of the target object;
facing-direction radian information of the target object.
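For face candidates, the four attributes listed above could be carried per candidate frame in a small record. A sketch with hypothetical field names, since the specification names the attributes but not a data layout:

```python
from dataclasses import dataclass

@dataclass
class FaceAttributes:
    """Attribute information of one face candidate frame."""
    age: float          # age information of the target object
    gender_prob: float  # probability of a given gender label
    skin_color: str     # skin colour information
    yaw: float          # facing direction, as a radian value
```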
6. The method according to claim 5, characterized in that, if the candidate frame is a candidate frame for face information, then
determining, through the non-maximum suppression algorithm and based on the IoU of the any two candidate frames and the attribute information of the target object corresponding to each of the any two candidate frames, whether the any two candidate frames belong to candidate frames to be merged comprises:
determining, through the non-maximum suppression algorithm, whether the IoU of the any two candidate frames and the attribute information of the face information corresponding to each of the any two candidate frames satisfy preset rules, so as to determine whether the any two candidate frames belong to candidate frames to be merged;
wherein the preset rules include at least one of the following:
the IoU is greater than a first preset threshold, and the first attribute information of the target objects corresponding to the any two candidate frames is the same;
the IoU is greater than a second preset threshold, and the first attribute information of the target objects corresponding to the any two candidate frames is different, wherein the first preset threshold is less than the second preset threshold;
the IoU is greater than the first preset threshold, and the difference between the probabilities that the first attribute information of the target objects corresponding to the any two candidate frames is preset attribute information is not greater than a third preset threshold;
the IoU is greater than the second preset threshold, and the difference between the probabilities that the first attribute information of the target objects corresponding to the any two candidate frames is preset attribute information is not less than the third preset threshold;
the IoU is greater than the first preset threshold, and the difference between the second attribute information of the target objects corresponding to the any two candidate frames is not greater than a fourth preset threshold;
the IoU is greater than the second preset threshold, and the difference between the second attribute information of the target objects corresponding to the any two candidate frames is greater than the fourth preset threshold;
the IoU is greater than a fifth preset threshold, wherein the fifth preset threshold is weighted information of the first preset threshold and the difference between the first attribute information of the target objects corresponding to the any two candidate frames;
the IoU is greater than a sixth preset threshold, wherein the sixth preset threshold is weighted information of the first preset threshold, the difference between the first attribute information of the target objects corresponding to the any two candidate frames, and the difference between the probabilities that the second attribute information of the target objects corresponding to the any two candidate frames is preset attribute information;
the IoU is greater than a seventh preset threshold, wherein the seventh preset threshold is weighted information of the first preset threshold and the difference between the third attribute information of the target objects corresponding to the any two candidate frames.
7. The method according to claim 6, characterized in that the preset rules include at least one of the following:
the IoU is greater than the first preset threshold, and the gender information of the target objects corresponding to the any two candidate frames is the same;
the IoU is greater than the second preset threshold, and the gender information of the target objects corresponding to the any two candidate frames is different, wherein the first preset threshold is less than the second preset threshold;
the IoU is greater than the first preset threshold, and the difference between the probabilities that the target objects corresponding to the any two candidate frames are of the same gender is not greater than the third preset threshold;
the IoU is greater than the second preset threshold, and the difference between the probabilities that the target objects corresponding to the any two candidate frames are of the same gender is greater than the third preset threshold;
the IoU is greater than the first preset threshold, and the age difference between the target objects corresponding to the any two candidate frames is not greater than the fourth preset threshold;
the IoU is greater than the second preset threshold, and the age difference between the target objects corresponding to the any two candidate frames is greater than the fourth preset threshold;
the IoU is greater than the fifth preset threshold, wherein the fifth preset threshold = the first preset threshold + a first parameter value × the age difference between the target objects corresponding to the any two candidate frames;
the IoU is greater than the sixth preset threshold, wherein the sixth preset threshold = the first preset threshold + the first parameter value × the age difference between the target objects corresponding to the any two candidate frames + a second parameter value × the difference between the probabilities that the target objects corresponding to the any two candidate frames are of the same gender;
the IoU is greater than the seventh preset threshold, wherein the seventh preset threshold = the first preset threshold + the second parameter value × the difference between the radian values of the directions the target objects corresponding to the any two candidate frames are facing.
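The fifth, sixth, and seventh thresholds in this claim are linear functions of the attribute gaps: the base threshold is raised as the two faces diverge in age, gender probability, or facing direction, so dissimilar faces need more overlap before they merge. A sketch, with parameter names chosen here for illustration:

```python
def adaptive_thresholds(t1, p1, p2, age_gap, gender_prob_gap, yaw_gap):
    """Compute the fifth, sixth and seventh preset thresholds from the
    first preset threshold t1 and the two parameter values p1, p2."""
    t5 = t1 + p1 * age_gap                         # age-adjusted
    t6 = t1 + p1 * age_gap + p2 * gender_prob_gap  # age- and gender-adjusted
    t7 = t1 + p2 * yaw_gap                         # orientation-adjusted
    return t5, t6, t7
```

Note that for a pair with identical attributes all three gaps are zero, every threshold collapses back to the first preset threshold, and the rule reduces to ordinary IoU-based suppression.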
8. An image processing device, characterized by comprising:
a first determining module, configured to determine the location information corresponding to each candidate frame in an image to be processed, wherein the candidate frame is a candidate frame for a target object;
a second determining module, configured to determine, through a preset model and based on the image to be processed and the location information corresponding to each candidate frame determined by the first determining module, the attribute information of the target object corresponding to each candidate frame;
a merging processing module, configured to merge each candidate frame through a non-maximum suppression algorithm based on the location information corresponding to each candidate frame and the attribute information of the target object corresponding to each candidate frame.
9. An electronic device, characterized by comprising:
one or more processors;
a memory; and
one or more application programs, wherein the one or more application programs are stored in the memory and configured to be executed by the one or more processors, the one or more programs being configured to execute the image processing method according to any one of claims 1 to 7.
10. A computer-readable storage medium, characterized in that the storage medium stores at least one instruction, at least one program segment, a code set, or an instruction set, which is loaded and executed by a processor to implement the image processing method according to any one of claims 1 to 7.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910313968.2A CN110033424A (en) | 2019-04-18 | 2019-04-18 | Method, apparatus, electronic equipment and the computer readable storage medium of image procossing |
Publications (1)
Publication Number | Publication Date |
---|---|
CN110033424A true CN110033424A (en) | 2019-07-19 |
Family
ID=67239141
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201910313968.2A Pending CN110033424A (en) | 2019-04-18 | 2019-04-18 | Method, apparatus, electronic equipment and the computer readable storage medium of image procossing |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN110033424A (en) |
Citations (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN106651955A (en) * | 2016-10-10 | 2017-05-10 | 北京小米移动软件有限公司 | Method and device for positioning object in picture |
CN107871102A (en) * | 2016-09-23 | 2018-04-03 | 北京眼神科技有限公司 | A kind of method for detecting human face and device |
CN108009544A (en) * | 2017-12-13 | 2018-05-08 | 北京小米移动软件有限公司 | Object detection method and device |
US20180307897A1 (en) * | 2016-05-28 | 2018-10-25 | Samsung Electronics Co., Ltd. | System and method for a unified architecture multi-task deep learning machine for object recognition |
CN108875537A (en) * | 2018-02-28 | 2018-11-23 | 北京旷视科技有限公司 | Method for checking object, device and system and storage medium |
CN109472264A (en) * | 2018-11-09 | 2019-03-15 | 北京字节跳动网络技术有限公司 | Method and apparatus for generating object detection model |
Cited By (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2021155661A1 (en) * | 2020-02-05 | 2021-08-12 | 华为技术有限公司 | Image processing method and related device |
CN111310824A (en) * | 2020-02-13 | 2020-06-19 | 中科智云科技有限公司 | Multi-angle dense target detection inhibition optimization method and equipment |
WO2022012158A1 (en) * | 2020-07-17 | 2022-01-20 | 华为技术有限公司 | Target determination method and target determination device |
CN111967595A (en) * | 2020-08-17 | 2020-11-20 | 成都数之联科技有限公司 | Candidate frame marking method and system, model training method and target detection method |
CN112348778A (en) * | 2020-10-21 | 2021-02-09 | 深圳市优必选科技股份有限公司 | Object identification method and device, terminal equipment and storage medium |
CN112348778B (en) * | 2020-10-21 | 2023-10-27 | 深圳市优必选科技股份有限公司 | Object identification method, device, terminal equipment and storage medium |
CN113761245A (en) * | 2021-05-11 | 2021-12-07 | 腾讯科技(深圳)有限公司 | Image recognition method and device, electronic equipment and computer readable storage medium |
CN113761245B (en) * | 2021-05-11 | 2023-10-13 | 腾讯科技(深圳)有限公司 | Image recognition method, device, electronic equipment and computer readable storage medium |
Legal Events

Date | Code | Title | Description
---|---|---|---
| PB01 | Publication | |
| SE01 | Entry into force of request for substantive examination | |
| RJ01 | Rejection of invention patent application after publication | Application publication date: 20190719 |