CN109657533A - Pedestrian re-identification method and related product - Google Patents

Pedestrian re-identification method and related product

Info

Publication number
CN109657533A
CN109657533A (application CN201811262562.8A)
Authority
CN
China
Prior art keywords
image
feature
obtains
training
input picture
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201811262562.8A
Other languages
Chinese (zh)
Other versions
CN109657533B (en)
Inventor
刘凯
张鹏
陈微
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
SHENZHEN HARZONE TECHNOLOGY Co Ltd
Original Assignee
SHENZHEN HARZONE TECHNOLOGY Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by SHENZHEN HARZONE TECHNOLOGY Co Ltd
Priority to CN201811262562.8A
Publication of CN109657533A
Application granted
Publication of CN109657533B
Active legal status
Anticipated expiration legal status

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING; COUNTING
    • G06K: RECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
    • G06K 9/00: Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
    • G06K 9/00362: Recognising human body or animal bodies, e.g. vehicle occupant, pedestrian; Recognising body parts, e.g. hand
    • G06K 9/00369: Recognition of whole body, e.g. static pedestrian or occupant recognition
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING; COUNTING
    • G06K: RECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
    • G06K 9/00: Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
    • G06K 9/62: Methods or arrangements for recognition using electronic means
    • G06K 9/6201: Matching; Proximity measures
    • G06K 9/6215: Proximity measures, i.e. similarity or distance measures
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING; COUNTING
    • G06N: COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00: Computer systems based on biological models
    • G06N 3/02: Computer systems based on biological models using neural network models
    • G06N 3/04: Architectures, e.g. interconnection topology
    • G06N 3/0454: Architectures, e.g. interconnection topology using a combination of multiple neural nets
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING; COUNTING
    • G06N: COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00: Computer systems based on biological models
    • G06N 3/02: Computer systems based on biological models using neural network models
    • G06N 3/08: Learning methods

Abstract

This application provides a pedestrian re-identification method and related product. The method includes: performing feature extraction on a target image through a preset convolutional neural network training model to obtain a first feature set, where the preset convolutional neural network training model is composed of a first training module and a second training module and fuses the features extracted by the first training module and the second training module into one feature set; determining the Hamming distance between the first feature set and each of multiple second feature sets to obtain multiple Hamming distance values; calculating, from the multiple Hamming distance values, the similarity probability between the input image and each image in an image library to obtain multiple similarity probability values; selecting, from the multiple similarity probability values, the similarity probability values greater than a preset threshold to obtain at least one target similarity probability value; and presenting the images in the image library corresponding to the at least one target similarity probability value to the user in descending order of similarity probability. The application can improve pedestrian re-identification accuracy.

Description

Pedestrian re-identification method and related product
Technical field
This application relates to the technical field of image processing, and in particular to a pedestrian re-identification method and related product.
Background art
Pedestrian re-identification (Person Re-identification, abbreviated ReID) is a computer vision technique for judging whether a specific pedestrian is present in an image or video sequence; that is, given a monitored pedestrian image, it retrieves images of that pedestrian captured by other devices. It is intended to make up for the visual limitations of fixed cameras and can be combined with pedestrian detection and pedestrian tracking techniques.
Pedestrian re-identification (ReID) is considered a sub-problem of image retrieval and can be widely applied to fields such as intelligent video surveillance and intelligent security. In surveillance video, owing to limited camera resolution, face images of very high quality are usually unavailable; when face recognition fails, ReID becomes a very important substitute technology. Moreover, actual video surveillance scenes are complex: the viewing angle, illumination and other environmental conditions differ between cameras, so the appearance of the same pedestrian can differ across cameras. Conversely, because of changes in camera angle and pedestrian posture, the appearances of different pedestrians may be more similar than those of the same person, which greatly hinders pedestrian re-identification.
Summary of the invention
The embodiments of the present application provide a pedestrian re-identification method and related product, which can improve pedestrian re-identification accuracy.
A first aspect of the embodiments of the present application provides a pedestrian re-identification method, comprising:
obtaining an input image;
preprocessing the input image to obtain a target image;
performing feature extraction on the target image through a preset convolutional neural network training model to obtain a first feature set, where the preset convolutional neural network training model is composed of a first training module and a second training module and fuses the features extracted by the first training module and the second training module into one feature set; the first training module is implemented based on a SphereLoss function and a SoftmaxLoss function and is used to train global features, and the second training module is implemented based on a SoftmaxLoss function and is used to train local features;
performing feature extraction on each image in a preset image library through the preset convolutional neural network training model to obtain multiple second feature sets;
determining the Hamming distance between the first feature set and each of the multiple second feature sets to obtain multiple Hamming distance values;
calculating, from the multiple Hamming distance values, the similarity probability between the input image and each image in the image library to obtain multiple similarity probability values;
selecting, from the multiple similarity probability values, the similarity probability values greater than a preset threshold to obtain at least one target similarity probability value;
presenting the images in the image library corresponding to the at least one target similarity probability value to the user in descending order of similarity probability.
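The claimed retrieval flow can be sketched as follows. This is a minimal illustration under the assumption that feature extraction has already produced binary codes; the mapping from Hamming distance to similarity probability is a placeholder, since the claims do not specify it, and all names are illustrative rather than from the patent.

```python
def hamming(a, b):
    # Hamming distance between two equal-length binary codes.
    return sum(x != y for x, y in zip(a, b))

def retrieve(query_code, library, threshold):
    """Rank library images whose similarity to the query exceeds threshold.

    library: dict mapping image id -> binary feature code.
    Similarity here is one minus the normalized Hamming distance
    (an assumed mapping: smaller distance -> higher similarity).
    """
    results = []
    for image_id, code in library.items():
        dist = hamming(query_code, code)
        similarity = 1.0 - dist / len(query_code)
        if similarity > threshold:
            results.append((image_id, similarity))
    # Present matches in descending order of similarity probability.
    results.sort(key=lambda item: -item[1])
    return results
```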
A second aspect of the embodiments of the present application provides a pedestrian re-identification device, comprising:
an acquisition unit, configured to obtain an input image;
a processing unit, configured to preprocess the input image to obtain a target image;
an extraction unit, configured to perform feature extraction on the target image through a preset convolutional neural network training model to obtain a first feature set, where the preset convolutional neural network training model is composed of a first training module and a second training module and fuses the features extracted by the first training module and the second training module into one feature set, the first training module being implemented based on a SphereLoss function and a SoftmaxLoss function and used to train global features, and the second training module being implemented based on a SoftmaxLoss function and used to train local features; and further configured to perform feature extraction on each image in a preset image library through the preset convolutional neural network training model to obtain multiple second feature sets;
a determination unit, configured to determine the Hamming distance between the first feature set and each of the multiple second feature sets to obtain multiple Hamming distance values;
a computing unit, configured to calculate, from the multiple Hamming distance values, the similarity probability between the input image and each image in the image library to obtain multiple similarity probability values;
a selection unit, configured to select, from the multiple similarity probability values, the similarity probability values greater than a preset threshold to obtain at least one target similarity probability value;
a display unit, configured to present the images in the image library corresponding to the at least one target similarity probability value to the user in descending order of similarity probability.
In a third aspect, an embodiment of the present application provides an electronic device, comprising a processor and a memory, and one or more programs, where the one or more programs are stored in the memory and configured to be executed by the processor, the programs including instructions for some or all of the steps described in the first aspect.
In a fourth aspect, an embodiment of the present application provides a computer-readable storage medium, where the computer-readable storage medium stores a computer program, and the computer program causes a computer to execute instructions for some or all of the steps described in the first aspect of the embodiments of the present application.
In a fifth aspect, an embodiment of the present application provides a computer program product, where the computer program product comprises a non-transitory computer-readable storage medium storing a computer program, and the computer program is operable to cause a computer to execute some or all of the steps described in the first aspect of the embodiments of the present application. The computer program product may be a software installation package.
Implementing the embodiments of the present application has the following beneficial effects:
As can be seen, with the pedestrian re-identification method and related product described in the embodiments of the present application, an input image is obtained and preprocessed to obtain a target image; feature extraction is performed on the target image through a preset convolutional neural network training model to obtain a first feature set, where the preset model is composed of a first training module and a second training module and fuses the features extracted by the two modules into one feature set, the first training module being implemented based on a SphereLoss function and a SoftmaxLoss function and used to train global features, and the second training module being implemented based on a SoftmaxLoss function and used to train local features; feature extraction is performed on each image in a preset image library through the preset model to obtain multiple second feature sets; the Hamming distance between the first feature set and each second feature set is determined to obtain multiple Hamming distance values; the similarity probability between the input image and each image in the image library is calculated from the multiple Hamming distance values to obtain multiple similarity probability values; the similarity probability values greater than a preset threshold are selected to obtain at least one target similarity probability value; and the images in the image library corresponding to the at least one target similarity probability value are presented to the user in descending order of similarity probability. In this way, both the global features and the local features of a pedestrian are extracted and can be used for identification, which improves pedestrian re-identification accuracy.
Description of drawings
To explain the technical solutions in the embodiments of the present application more clearly, the accompanying drawings needed for describing the embodiments are briefly introduced below. Obviously, the accompanying drawings described below show only some embodiments of the present application; those of ordinary skill in the art can obtain other drawings from these drawings without creative effort.
Figure 1A is a flow diagram of an embodiment of a pedestrian re-identification method provided by an embodiment of the present application;
Figure 1B is a schematic diagram illustrating feature extraction in a pedestrian re-identification method provided by an embodiment of the present application;
Figure 1C is a flow diagram of an embodiment of a pedestrian re-identification method provided by an embodiment of the present application;
Figure 2 is another schematic diagram of a pedestrian re-identification method provided by an embodiment of the present application;
Figure 3A is a structural schematic diagram of an embodiment of a pedestrian re-identification device provided by an embodiment of the present application;
Figure 3B is another structural schematic diagram of the pedestrian re-identification device of Figure 3A provided by an embodiment of the present application;
Figure 4 is a structural schematic diagram of an embodiment of an electronic device provided by an embodiment of the present application.
Specific embodiments
The technical solutions in the embodiments of the present application are described below clearly and completely with reference to the accompanying drawings. Obviously, the described embodiments are some, rather than all, of the embodiments of the present application. All other embodiments obtained by those of ordinary skill in the art based on the embodiments of the present application without creative effort shall fall within the protection scope of the present application.
The terms "first", "second", "third", "fourth" and so on in the description, claims and drawings of the present application are used to distinguish different objects and are not used to describe a particular order. In addition, the terms "include" and "have", and any variants thereof, are intended to cover non-exclusive inclusion. For example, a process, method, system, product or device comprising a series of steps or units is not limited to the listed steps or units, but optionally further comprises steps or units that are not listed, or optionally further comprises other steps or units inherent to the process, method, product or device.
Reference herein to an "embodiment" means that a particular feature, structure or characteristic described in connection with the embodiment can be included in at least one embodiment of the present application. The appearances of this phrase in various places in the description do not necessarily all refer to the same embodiment, nor to independent or alternative embodiments that are mutually exclusive with other embodiments. Those skilled in the art understand, explicitly and implicitly, that the embodiments described herein can be combined with other embodiments.
The electronic device described in the embodiments of the present application may include a smartphone (such as an Android phone, an iOS phone or a Windows Phone), a tablet computer, a video matrix, a monitoring platform, a vehicle-mounted device, a satellite, a palmtop computer, a notebook computer, a mobile internet device (MID, Mobile Internet Devices) or a wearable device. The above list is merely exemplary rather than exhaustive and includes but is not limited to the above devices; of course, the electronic device may also be a server.
Please refer to Figure 1A, which is a flow diagram of an embodiment of a pedestrian re-identification method provided by an embodiment of the present application. The pedestrian re-identification method described in this embodiment comprises the following steps:
101. Obtain an input image.
The electronic device can obtain an input image; the input image may be a whole-body photograph of a pedestrian taken by a camera, for example one captured by a camera in a public place.
102. Preprocess the input image to obtain a target image.
The electronic device can preprocess the input image, for example with operations such as size adjustment and background removal, to obtain the target image.
Optionally, in step 102 above, preprocessing the input image to obtain the target image may include the following steps:
21. Scale the input image so that the scaled input image has the same size as the images in the image library;
22. Perform image segmentation on the scaled input image to obtain the target image.
That is, the electronic device can scale the input image so that it matches the size of the images in the image library, and then perform image segmentation on the scaled input image to obtain the target image. In this way, invalid data can be reduced and subsequent identification accuracy improved.
Optionally, in step 102 above, preprocessing the input image to obtain the target image may include the following steps:
23. Perform matting on the input image to obtain a pedestrian-region image;
24. Scale the pedestrian-region image to obtain the target image, where the scaled target image has the same size as the images in the image library.
That is, matting may first be performed on the input image to obtain the pedestrian-region image. The specific matting algorithm may be an image segmentation algorithm, for example an image segmentation algorithm based on information entropy, an image segmentation algorithm based on GraphCuts, or an image segmentation algorithm based on the watershed algorithm, without limitation here. The pedestrian-region image is then scaled so that it matches the size of the images in the image library, yielding the target image.
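The preprocessing variants above (segment out the pedestrian, scale to the library size) can be sketched as follows. This is a minimal illustration assuming the segmentation mask has already been computed by one of the algorithms mentioned; nearest-neighbour scaling stands in for whatever resizing a real implementation uses.

```python
import numpy as np

def resize_nearest(img, out_h, out_w):
    # Nearest-neighbour resize of an (H, W, C) array via index selection.
    h, w = img.shape[:2]
    rows = np.arange(out_h) * h // out_h
    cols = np.arange(out_w) * w // out_w
    return img[rows][:, cols]

def preprocess(input_img, mask, lib_h, lib_w):
    # Keep only the pedestrian region, then scale to the library size.
    person = input_img * mask[..., None]  # zero out background pixels
    return resize_nearest(person, lib_h, lib_w)
```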
103. Perform feature extraction on the target image through a preset convolutional neural network training model to obtain a first feature set, where the preset convolutional neural network training model is composed of a first training module and a second training module and fuses the features extracted by the first training module and the second training module into one feature set; the first training module is implemented based on a SphereLoss function and a SoftmaxLoss function and is used to train global features, and the second training module is implemented based on a SoftmaxLoss function and is used to train local features.
The preset convolutional neural network training model can be trained in advance, before step 101. The preset model can be composed of the first training module and the second training module: the first training module is implemented based on a SphereLoss function and a SoftmaxLoss function and is mainly used to train global features, while the second training module is implemented based on a SoftmaxLoss function and is mainly used to train local features; the features extracted by the first training module and the second training module are then fused into one feature set.
Optionally, in step 103 above, performing feature extraction on the target image through the preset convolutional neural network training model to obtain the first feature set may include the following steps:
31. Run the DeeperCut algorithm on the target image to obtain 4 key points;
32. Divide the target image into three regions according to the 4 key points, the three regions being the head region, the upper-body region and the lower-body region;
33. Input the target image into the first training module for training to obtain one feature set;
34. Input the head region, the upper-body region and the lower-body region respectively into the second training module for training to obtain three feature sets;
35. Compose the one feature set and the three feature sets into the first feature set.
In a specific implementation, the electronic device can obtain 4 key points through the DeeperCut algorithm, and from these 4 key points obtain 3 components of the input image (the head, upper-body and lower-body regions). As shown in Figure 1B, the global image and the component images are fed into the preset convolutional neural network training model for training, which learns the global features and the local features respectively, yielding one feature set corresponding to the global features and three feature sets corresponding to the component images; these are finally fused into one feature, i.e. the first feature set. Suppose the coordinates of the 4 key points are (x1, y1), (x2, y2), (x3, y3) and (x4, y4).
Specifically, suppose the pedestrian image size is H × W, the head coordinate is (x1, y1), and the neck coordinate is (x2, y2). The head region Ph can then be obtained as:
Ph = [(xc - w/2, y1 - α), (xc + w/2, y2 + α)]
w = y2 - y1 + 2α
xc = (x1 + x2) / 2
By analogy with the head-region computation, suppose the left-shoulder coordinate is (x3, y3) and the lower-right coordinate is (x4, y4). The upper-body region Pu and lower-body region Pl are then:
Pu = [(0, y2 - 2α), (W - 1, yc + 2α)]
Pl = [(0, yc - 2α), (W - 1, H - 1)]
yc = (y3 + y4) / 2
where α can be understood as a parameter controlling the overlap between adjacent regions.
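The region computation can be transcribed directly from the formulas above, assuming keypoint coordinates as (x, y) pairs with the head at (x1, y1), the neck at (x2, y2), the left shoulder at (x3, y3) and the lower-right point at (x4, y4); each region is returned as [(left, top), (right, bottom)]. The lower-body top edge is taken to start 2α above the shoulder midline yc (an assumption, so that it overlaps the upper body by the same margin the other regions use), and the default α is illustrative.

```python
def body_regions(H, W, p1, p2, p3, p4, alpha=4):
    # p1..p4: head, neck, left shoulder, lower-right keypoints as (x, y).
    (x1, y1), (x2, y2) = p1, p2
    (x3, y3), (x4, y4) = p3, p4
    w = y2 - y1 + 2 * alpha              # head-box width, per the text
    xc = (x1 + x2) / 2                   # head centre x
    yc = (y3 + y4) / 2                   # shoulder/lower-right midline y
    head = [(xc - w / 2, y1 - alpha), (xc + w / 2, y2 + alpha)]
    upper = [(0, y2 - 2 * alpha), (W - 1, yc + 2 * alpha)]
    lower = [(0, yc - 2 * alpha), (W - 1, H - 1)]
    return head, upper, lower
```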
The SphereLoss above maps features onto a high-dimensional hypersphere. Under the normalization described below, the loss for a sample of class y can be written as:
L = -log( exp(s·cos θy) / Σj exp(s·cos θj) )
During training, the weight vectors and feature vectors are all normalized to eliminate the influence of their norms, and a temperature parameter s is introduced to control the temperature of the softmax (i.e. how sharply the curve fluctuates).
As shown in Figure 1C, the original softmax compares dot products: for a sample x, if |W1||x|cos θ1 > |W2||x|cos θ2, the sample is assigned to class 1; otherwise it is assigned to class 2, so the classification result depends not only on the angles but also on the vector norms. After normalization, only the two angles need to be compared: if cos θ1 > cos θ2, the sample is assigned to class 1, otherwise to class 2. The decision is determined by the angle alone, which is clear and simple, and after normalization all features are mapped onto a hypersphere.
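The normalization argument can be illustrated numerically: with raw dot products the class with the larger weight norm can win even at a worse angle, while after normalization only the angle matters, and the temperature s sharpens the resulting softmax. The vectors and s value below are arbitrary toy choices, not from the patent.

```python
import numpy as np

x = np.array([1.0, 0.2])
W1 = np.array([0.5, 0.0])   # well-aligned with x, but small norm
W2 = np.array([2.0, 2.0])   # worse angle, but large norm

def cos(a, b):
    # Cosine of the angle between two vectors.
    return a @ b / (np.linalg.norm(a) * np.linalg.norm(b))

# Raw dot products: the large-norm class wins despite the worse angle.
dot_winner = 1 if W1 @ x > W2 @ x else 2

# After normalization, only the angles are compared.
angle_winner = 1 if cos(W1, x) > cos(W2, x) else 2

# Temperature-scaled softmax over the cosine logits.
s = 30.0
cos_vals = np.array([cos(W1, x), cos(W2, x)])
probs = np.exp(s * cos_vals) / np.exp(s * cos_vals).sum()
```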
Optionally, pedestrian feature extraction is performed using the preset convolutional neural network training model: the global image and the three component images are each fed into the model, and the fused feature extracted by the model is used as the feature representation vector of the image. When the convolutional network model extracts pedestrian features, the global feature of the image and the three local features (head feature, upper-body feature and lower-body feature) are fused. Suppose the four feature vectors are fg, fh, fu and fl respectively. The fused feature can then be expressed as:
FP = {fg; fh; fu; fl}
Taking FP as the representative feature of the image, the similarity between images is calculated using the Euclidean distance and sorted by similarity, finally yielding the retrieved probability results.
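The fusion-and-ranking step can be sketched as follows: concatenate the global feature with the three part features into one representative vector FP, then rank library images by Euclidean distance (closer means more similar). The feature dimensions here are arbitrary placeholders.

```python
import numpy as np

def fuse(fg, fh, fu, fl):
    # FP = {fg; fh; fu; fl}: concatenate global + head + upper + lower.
    return np.concatenate([fg, fh, fu, fl])

def rank_by_euclidean(query_fp, library_fps):
    # library_fps: dict id -> fused vector; returns ids, nearest first.
    dists = {k: float(np.linalg.norm(query_fp - v))
             for k, v in library_fps.items()}
    return sorted(dists, key=dists.get)
```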
104. Perform feature extraction on each image in the preset image library through the preset convolutional neural network training model to obtain multiple second feature sets.
For the specific description of step 104, refer to steps 31-35 above; details are not repeated here. The preset image library can store multiple images, each of the same size. The electronic device may collect pictures of the same pedestrian in different states under different cameras, unify the picture dimensions, and apply enhancement processing to the image data by means such as random cropping, scaling and erasing, thereby constructing a pedestrian data set, i.e. the image library.
105. Determine the Hamming distance between the first feature set and each of the multiple second feature sets to obtain multiple Hamming distance values.
That is, the electronic device can calculate the Hamming distance between the first feature set and each of the multiple second feature sets, obtaining multiple Hamming distance values.
106. Calculate, from the multiple Hamming distance values, the similarity probability between the input image and each image in the image library to obtain multiple similarity probability values.
The Hamming distance reflects the similarity probability to some extent: for example, the smaller the Hamming distance, the larger the similarity probability value.
107. Select, from the multiple similarity probability values, the similarity probability values greater than a preset threshold to obtain at least one target similarity probability value.
The preset threshold can be set by the user or defaulted by the system.
108. Present the images in the image library corresponding to the at least one target similarity probability value to the user in descending order of similarity probability.
That is, the electronic device can present the images in the image library corresponding to the at least one target similarity probability value to the user in descending order of similarity probability. In this way, the fusion of global features and local features is extracted as the pedestrian's feature, and the feature is encoded to improve retrieval speed. The Hamming distance is used to calculate the similarity between images, and the similarities are sorted and output to obtain the final retrieval result.
Optionally, before step 101 above (obtaining the input image), the method may further include the following steps:
A1. Obtain multiple image sets, each image set comprising multiple images of the same pedestrian in different states under different cameras;
A2. Perform feature extraction on each image in the multiple image sets to obtain a global image set and a component image set;
A3. Input the global image set into the first training module for training;
A4. Input the component image set into the second training module for training;
A5. Compose the trained first training module and the trained second training module into the preset convolutional neural network training model.
For the specific description of A2, refer to steps 31-35 above; details are not repeated here. In the course of executing step A2, each image in A2 can also be divided into a head region, an upper-body region and a lower-body region. The electronic device can obtain multiple image sets, each comprising multiple images of the same pedestrian in different states under different cameras. For example, in some environment there are three cameras A, B and C, all of which can photograph pedestrian D, obtaining multiple images of D in different states; the above states can be understood as position, weather, time, lighting and so on, without limitation here. In a specific implementation, all the components in the multiple image sets are extracted; the global images are then fed into the first training module for training and the component images are fed into the second training module for training, where the global branch uses SphereLoss combined with SoftmaxLoss and the three components use a local SoftmaxLoss, yielding the final training model.
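The training objective just described (SphereLoss combined with SoftmaxLoss on the global branch, SoftmaxLoss on each part branch) can be sketched as follows, assuming precomputed features and logits. SphereLoss is taken here to mean softmax cross-entropy on s-scaled cosine logits, consistent with the normalization described earlier; the exact formulation in the patent may differ.

```python
import numpy as np

def softmax_xent(logits, label):
    # Numerically stable softmax cross-entropy for one sample.
    z = logits - logits.max()
    log_probs = z - np.log(np.exp(z).sum())
    return -log_probs[label]

def sphere_loss(feat, weights, label, s=30.0):
    # Normalize feature and class weights, then classify by cosine only.
    f = feat / np.linalg.norm(feat)
    w = weights / np.linalg.norm(weights, axis=1, keepdims=True)
    return softmax_xent(s * (w @ f), label)

def total_loss(global_feat, global_w, part_logits_list, label):
    # Global branch: SphereLoss + SoftmaxLoss; part branches: SoftmaxLoss.
    loss = sphere_loss(global_feat, global_w, label)
    loss += softmax_xent(global_w @ global_feat, label)
    loss += sum(softmax_xent(pl, label) for pl in part_logits_list)
    return loss
```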
Optionally, in a specific implementation, the recognition capability and generalization ability of the convolutional neural network (Convolutional Neural Network, CNN) model can be improved by enhancing the image data. The present application uses the following methods to perform data enhancement on the pedestrian images obtained in the preprocessing stage. First, the color data is enhanced, mainly including color saturation, brightness and contrast. Second, the input data of the CNN network is standardized: the mean and standard deviation of the RGB color channels of the pedestrian images are calculated, the covariance matrix is computed over the entire training set, eigendecomposition is performed to obtain the eigenvectors and eigenvalues, and PCA jittering is applied. Finally, in the training stage the pedestrian images undergo random cropping, scaling, erasing and the like; the generalization ability of the model can be improved through this data enhancement.
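Of the augmentations listed, random erasing is the simplest to sketch: blank out a random rectangle of the pedestrian image so the model cannot rely on any single body part. The area fractions below are illustrative defaults, not values from the patent.

```python
import random
import numpy as np

def random_erase(img, min_frac=0.1, max_frac=0.3, rng=random):
    # Erase a random rectangle of an (H, W, C) image with a constant value.
    h, w = img.shape[:2]
    eh = max(1, int(h * rng.uniform(min_frac, max_frac)))
    ew = max(1, int(w * rng.uniform(min_frac, max_frac)))
    top = rng.randrange(h - eh + 1)
    left = rng.randrange(w - ew + 1)
    out = img.copy()                       # leave the original untouched
    out[top:top + eh, left:left + ew] = 0  # erase with a constant
    return out
```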
As can be seen, with the pedestrian re-identification method described in this embodiment of the present application, both the global features and the local features of a pedestrian are extracted, fused and used for identification in the manner described in steps 101-108 above, which improves pedestrian re-identification accuracy.
Consistent with the above, referring to Fig. 2, which is a flow diagram of a second embodiment of a pedestrian re-identification method provided by the embodiments of the application. The pedestrian re-identification method described in this embodiment comprises the following steps:
201. Obtain multiple image sets, each image set comprising multiple images of the same pedestrian under different conditions of different cameras.
202. Perform feature extraction on each image in the multiple image sets to obtain a global image set and a part image set.
203. Input the global image set into the first training module for training.
204. Input the part image set into the second training module for training.
205. Form the preset convolutional-neural-network training model from the trained first training module and the trained second training module.
206. Obtain an input image.
207. Preprocess the input image to obtain a target image.
208. Perform feature extraction on the target image by the preset convolutional-neural-network training model to obtain a first feature set. The preset convolutional-neural-network training model consists of the first training module and the second training module and fuses the features extracted by the two modules into one feature set; the first training model is implemented based on the SphereLoss function and the SoftmaxLoss function and is used to train global features, and the second training model is implemented based on the SoftmaxLoss function and is used to train local features.
209. Perform feature extraction on each image in a preset image library by the preset convolutional-neural-network training model to obtain multiple second feature sets.
210. Determine the Hamming distance between the first feature set and each of the multiple second feature sets to obtain multiple Hamming distance values.
211. Calculate, from the multiple Hamming distance values, a similarity probability value between the input image and each image in the image library to obtain multiple similarity probability values.
212. Select the similarity probability values greater than a preset threshold from the multiple similarity probability values to obtain at least one target similarity probability value.
213. Show the user, in descending order of similarity probability value, the images in the image library corresponding to the at least one target similarity probability value.
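Steps 210-213 (Hamming distances, similarity values, thresholding, and descending sort) can be sketched as follows. The mapping from a Hamming distance to a similarity probability value is not specified in the text, so `1 - distance / n_bits` is used here as one simple monotone choice, and the threshold value is an assumed parameter.

```python
import numpy as np

def hamming_distance(a, b):
    """Number of differing bits between two binary feature vectors."""
    return int(np.count_nonzero(np.asarray(a) != np.asarray(b)))

def similarity_from_hamming(dist, n_bits):
    """Map a Hamming distance to a similarity value in [0, 1].
    The exact mapping is not given in the text; 1 - dist/n_bits is
    one simple monotone choice."""
    return 1.0 - dist / n_bits

def rank_gallery(query, gallery, threshold=0.5):
    """Steps 210-213: score every gallery feature set, keep those above
    `threshold`, and return their indices by descending similarity."""
    n_bits = len(query)
    scored = []
    for idx, feat in enumerate(gallery):
        sim = similarity_from_hamming(hamming_distance(query, feat), n_bits)
        if sim > threshold:
            scored.append((sim, idx))
    scored.sort(reverse=True)                  # descending similarity
    return [idx for _, idx in scored]
```

For example, a query `[1, 0, 1, 1]` against a gallery containing an exact match and a one-bit mismatch returns both indices, exact match first.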
For the detailed description of steps 201-213 above, reference may be made to the corresponding steps 101-108 of the pedestrian re-identification method described with respect to Fig. 1A, which is not repeated here.
It can be seen that, with the pedestrian re-identification method described in the embodiments of the application, multiple image sets are obtained, each comprising multiple images of the same pedestrian under different conditions of different cameras; feature extraction is performed on each image to obtain a global image set and a part image set; the global image set is input into the first training module for training and the part image set into the second training module for training; and the trained first training module and the trained second training module form the preset convolutional-neural-network training model. An input image is then obtained and preprocessed to obtain a target image; feature extraction is performed on the target image by the preset convolutional-neural-network training model to obtain a first feature set, where the preset model consists of the first training module and the second training module and fuses the features extracted by the two modules into one feature set, the first training model being implemented based on the SphereLoss function and the SoftmaxLoss function and used to train global features, and the second training model being implemented based on the SoftmaxLoss function and used to train local features; feature extraction is performed on each image in a preset image library by the preset model to obtain multiple second feature sets; the Hamming distance between the first feature set and each of the multiple second feature sets is determined to obtain multiple Hamming distance values; similarity probability values between the input image and each image in the image library are calculated from the multiple Hamming distance values to obtain multiple similarity probability values; the similarity probability values greater than a preset threshold are selected to obtain at least one target similarity probability value; and the images in the image library corresponding to the at least one target similarity probability value are shown to the user in descending order of similarity probability value. In this way, both the local and the global features of a pedestrian can be extracted and used for identification, improving the precision of pedestrian re-identification.
Consistent with the above, the following is a device for implementing the above pedestrian re-identification method, specifically as follows:
Referring to Fig. 3A, which is a schematic structural diagram of an embodiment of a pedestrian re-identification device provided by the embodiments of the application. The pedestrian re-identification device described in this embodiment comprises: an acquiring unit 301, a processing unit 302, an extraction unit 303, a determination unit 304, a computing unit 305, a selection unit 306 and a display unit 307, specifically as follows:
the acquiring unit 301 is configured to obtain an input image;
the processing unit 302 is configured to preprocess the input image to obtain a target image;
the extraction unit 303 is configured to perform feature extraction on the target image by a preset convolutional-neural-network training model to obtain a first feature set, the preset convolutional-neural-network training model consisting of a first training module and a second training module and fusing the features extracted by the two modules into one feature set, the first training model being implemented based on the SphereLoss function and the SoftmaxLoss function and used to train global features, and the second training model being implemented based on the SoftmaxLoss function and used to train local features; and is further configured to perform feature extraction on each image in a preset image library by the preset convolutional-neural-network training model to obtain multiple second feature sets;
the determination unit 304 is configured to determine the Hamming distance between the first feature set and each of the multiple second feature sets to obtain multiple Hamming distance values;
the computing unit 305 is configured to calculate, from the multiple Hamming distance values, a similarity probability value between the input image and each image in the image library to obtain multiple similarity probability values;
the selection unit 306 is configured to select the similarity probability values greater than a preset threshold from the multiple similarity probability values to obtain at least one target similarity probability value;
the display unit 307 is configured to show the user, in descending order of similarity probability value, the images in the image library corresponding to the at least one target similarity probability value.
Optionally, in the aspect of performing feature extraction on the target image by the preset convolutional-neural-network training model to obtain the first feature set, the extraction unit 303 is specifically configured to:
run the DeeperCut algorithm on the target image to obtain 4 key points;
divide the target image into three regions according to the 4 key points, the three regions being a head region, an upper-body region and a lower-body region;
input the target image into the first training model for training to obtain one feature set;
input the head region, the upper-body region and the lower-body region respectively into the second training model for training to obtain three feature sets;
and compose the one feature set and the three feature sets into the first feature set.
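As a rough sketch of the region split above, the following divides a pedestrian image into head, upper-body and lower-body strips from 4 vertical key-point coordinates. How the patent maps the 4 DeeperCut key points to the three regions is not detailed, so bounding each strip with consecutive sorted key points is an assumption.

```python
import numpy as np

def split_into_regions(image, keypoints_y):
    """Crop a pedestrian image into head / upper-body / lower-body strips.

    `keypoints_y` holds the 4 vertical key-point positions (e.g. from
    DeeperCut). How the 4 points bound the 3 regions is an assumption:
    consecutive sorted key points delimit each strip.
    """
    ys = sorted(int(round(y)) for y in keypoints_y)
    if len(ys) != 4:
        raise ValueError("expected 4 key points")
    head = image[ys[0]:ys[1]]    # head region
    upper = image[ys[1]:ys[2]]   # upper-body region
    lower = image[ys[2]:ys[3]]   # lower-body region
    return head, upper, lower
```

Each strip would then be fed to the second training model, while the whole image goes to the first.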
Optionally, in the aspect of preprocessing the input image to obtain the target image, the processing unit 302 is specifically configured to:
scale the input image so that the scaled input image has the same size as the images in the image library;
and perform image segmentation on the scaled input image to obtain the target image.
Optionally, in the aspect of preprocessing the input image to obtain the target image, the processing unit 302 is alternatively specifically configured to:
perform matting on the input image to obtain a pedestrian region image;
and scale the pedestrian region image to obtain the target image, the scaled target image having the same size as the images in the image library.
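The matting-then-scaling variant of preprocessing can be sketched as below. The bounding box is assumed to be supplied by a pedestrian detector, and the gallery size (256 x 128) and nearest-neighbour interpolation are illustrative choices, not taken from the text.

```python
import numpy as np

def resize_nearest(image, out_h, out_w):
    """Nearest-neighbour resize (illustrative; a real pipeline would
    typically use cv2/PIL interpolation)."""
    h, w = image.shape[:2]
    rows = np.arange(out_h) * h // out_h   # source row per output row
    cols = np.arange(out_w) * w // out_w   # source col per output col
    return image[rows][:, cols]

def preprocess(image, bbox, gallery_size=(256, 128)):
    """Matting-then-scaling preprocessing: crop the pedestrian region
    (`bbox` = top, bottom, left, right, assumed supplied by a detector)
    and scale it to the gallery image size."""
    top, bottom, left, right = bbox
    region = image[top:bottom, left:right]   # 'matting' stand-in: a crop
    return resize_nearest(region, *gallery_size)
```

The segmentation-after-scaling variant would simply apply the two steps in the opposite order.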
Optionally, as shown in Fig. 3B, which is a modified structure of the pedestrian re-identification device shown in Fig. 3A. Compared with Fig. 3A, the device may further comprise a training unit 308, specifically as follows:
the acquiring unit 301 is further specifically configured, before the input image is obtained, to obtain multiple image sets, each image set comprising multiple images of the same pedestrian under different conditions of different cameras;
the extraction unit 303 is further specifically configured to perform feature extraction on each image in the multiple image sets to obtain a global image set and a part image set;
the training unit 308 is configured to input the global image set into the first training module for training; input the part image set into the second training module for training; and form the preset convolutional-neural-network training model from the trained first training module and the trained second training module.
It can be seen that, with the pedestrian re-identification device described in the embodiments of the application, an input image is obtained and preprocessed to obtain a target image; feature extraction is performed on the target image by a preset convolutional-neural-network training model to obtain a first feature set, where the preset convolutional-neural-network training model consists of a first training module and a second training module and fuses the features extracted by the two modules into one feature set, the first training model being implemented based on the SphereLoss function and the SoftmaxLoss function and used to train global features, and the second training model being implemented based on the SoftmaxLoss function and used to train local features; feature extraction is performed on each image in a preset image library by the preset model to obtain multiple second feature sets; the Hamming distance between the first feature set and each of the multiple second feature sets is determined to obtain multiple Hamming distance values; similarity probability values between the input image and each image in the image library are calculated from the multiple Hamming distance values to obtain multiple similarity probability values; the similarity probability values greater than a preset threshold are selected to obtain at least one target similarity probability value; and the images in the image library corresponding to the at least one target similarity probability value are shown to the user in descending order of similarity probability value. In this way, both the local and the global features of a pedestrian can be extracted and used for identification, improving the precision of pedestrian re-identification.
Consistent with the above, referring to Fig. 4, which is a schematic structural diagram of an embodiment of an electronic device provided by the embodiments of the application. The electronic device described in this embodiment comprises: at least one input device 1000; at least one output device 2000; at least one processor 3000, such as a CPU; and a memory 4000. The input device 1000, output device 2000, processor 3000 and memory 4000 are connected by a bus 5000.
The input device 1000 may specifically be a touch panel, a physical button or a mouse.
The output device 2000 may specifically be a display screen.
The memory 4000 may be a high-speed RAM memory or a non-volatile memory, such as a magnetic disk memory. The memory 4000 is used to store a set of program code, and the input device 1000, output device 2000 and processor 3000 are used to call the program code stored in the memory 4000 to perform the following operations.
The processor 3000 is configured to:
obtain an input image;
preprocess the input image to obtain a target image;
perform feature extraction on the target image by a preset convolutional-neural-network training model to obtain a first feature set, the preset convolutional-neural-network training model consisting of a first training module and a second training module and fusing the features extracted by the two modules into one feature set, the first training model being implemented based on the SphereLoss function and the SoftmaxLoss function and used to train global features, and the second training model being implemented based on the SoftmaxLoss function and used to train local features;
perform feature extraction on each image in a preset image library by the preset convolutional-neural-network training model to obtain multiple second feature sets;
determine the Hamming distance between the first feature set and each of the multiple second feature sets to obtain multiple Hamming distance values;
calculate, from the multiple Hamming distance values, a similarity probability value between the input image and each image in the image library to obtain multiple similarity probability values;
select the similarity probability values greater than a preset threshold from the multiple similarity probability values to obtain at least one target similarity probability value;
show the user, in descending order of similarity probability value, the images in the image library corresponding to the at least one target similarity probability value.
Optionally, in the aspect of performing feature extraction on the target image by the preset convolutional-neural-network training model to obtain the first feature set, the processor 3000 is specifically configured to:
run the DeeperCut algorithm on the target image to obtain 4 key points;
divide the target image into three regions according to the 4 key points, the three regions being a head region, an upper-body region and a lower-body region;
input the target image into the first training model for training to obtain one feature set;
input the head region, the upper-body region and the lower-body region respectively into the second training model for training to obtain three feature sets;
and compose the one feature set and the three feature sets into the first feature set.
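The final composition of the one global feature set and the three part feature sets into the first feature set can be sketched as a simple concatenation. The text states only that the features are fused into one feature set; L2-normalised concatenation is an assumed fusion rule.

```python
import numpy as np

def fuse_features(global_feat, part_feats):
    """Compose one global feature set and three part feature sets into
    a single first feature set. The text says only that the features
    are fused; L2-normalised concatenation is an assumed rule."""
    feats = [np.asarray(global_feat, dtype=float)]
    feats += [np.asarray(f, dtype=float) for f in part_feats]
    fused = np.concatenate(feats)              # one long feature vector
    norm = np.linalg.norm(fused)
    return fused / norm if norm > 0 else fused
```

A 4-dimensional global feature fused with three 2-dimensional part features yields a unit-norm 10-dimensional first feature set.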
Optionally, in the aspect of preprocessing the input image to obtain the target image, the processor 3000 is specifically configured to:
scale the input image so that the scaled input image has the same size as the images in the image library;
and perform image segmentation on the scaled input image to obtain the target image.
Optionally, in the aspect of preprocessing the input image to obtain the target image, the processor 3000 is alternatively specifically configured to:
perform matting on the input image to obtain a pedestrian region image;
and scale the pedestrian region image to obtain the target image, the scaled target image having the same size as the images in the image library.
Optionally, before the input image is obtained, the processor 3000 is further specifically configured to:
obtain multiple image sets, each image set comprising multiple images of the same pedestrian under different conditions of different cameras;
perform feature extraction on each image in the multiple image sets to obtain a global image set and a part image set;
input the global image set into the first training module for training;
input the part image set into the second training module for training;
and form the preset convolutional-neural-network training model from the trained first training module and the trained second training module.
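The joint training objective of the two modules can be sketched as a weighted sum of per-branch classification losses. Plain softmax cross-entropy stands in for the SoftmaxLoss terms; the angular-margin SphereLoss term on the global branch is omitted for brevity, and the branch weights are assumptions, not from the text.

```python
import numpy as np

def softmax_cross_entropy(logits, label):
    """Plain SoftmaxLoss for one sample. The real global branch adds an
    angular-margin SphereLoss term, omitted here for brevity."""
    z = logits - logits.max()                     # numerical stability
    log_probs = z - np.log(np.exp(z).sum())
    return float(-log_probs[label])

def two_branch_loss(global_logits, part_logits_list, label,
                    w_global=1.0, w_part=1.0):
    """Joint objective for the first (global) and second (part) training
    modules; the branch weights are assumptions, not from the text."""
    loss = w_global * softmax_cross_entropy(global_logits, label)
    for part_logits in part_logits_list:          # one term per body part
        loss += w_part * softmax_cross_entropy(part_logits, label)
    return loss
```

During training, the global branch would receive whole-body images and each part branch a cropped region, all labelled with the same pedestrian identity.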
The embodiments of the application also provide a computer storage medium, wherein the computer storage medium may store a program which, when executed, performs some or all of the steps of any pedestrian re-identification method recorded in the above method embodiments.
The embodiments of the application also provide a computer program product comprising a non-transitory computer-readable storage medium storing a computer program, the computer program being operable to cause a computer to execute some or all of the steps of any pedestrian re-identification method recorded in the above method embodiments.
Although the application is described herein in conjunction with various embodiments, those skilled in the art can, in implementing the claimed application, understand and achieve other variations of the disclosed embodiments by studying the drawings, the disclosure and the appended claims. In the claims, the word "comprising" does not exclude other components or steps, and "a" or "an" does not exclude a plurality. A single processor or other unit may fulfill several functions recited in the claims. The mere fact that certain measures are recited in mutually different dependent claims does not indicate that a combination of these measures cannot be used to advantage.
Those skilled in the art will appreciate that the embodiments of the application may be provided as a method, a device (apparatus) or a computer program product. Therefore, the application may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware aspects. Moreover, the application may take the form of a computer program product implemented on one or more computer-usable storage media (including but not limited to disk memory, CD-ROM and optical memory) containing computer-usable program code. The computer program may be stored in or distributed on a suitable medium, supplied together with or as part of other hardware, and may also be distributed in other forms, such as via the Internet or other wired or wireless telecommunication systems.
The application is described with reference to flowcharts and/or block diagrams of the method, device (apparatus) and computer program product according to the embodiments of the application. It should be understood that each flow and/or block of the flowcharts and/or block diagrams, and combinations of flows and/or blocks in the flowcharts and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general-purpose computer, a special-purpose computer, an embedded processor or other programmable data-processing device to produce a machine, so that the instructions executed by the processor of the computer or other programmable data-processing device produce a device for realizing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
These computer program instructions may also be stored in a computer-readable memory capable of directing a computer or other programmable data-processing device to operate in a particular manner, so that the instructions stored in the computer-readable memory produce an article of manufacture including an instruction device which realizes the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
These computer program instructions may also be loaded onto a computer or other programmable data-processing device, so that a series of operation steps are executed on the computer or other programmable device to produce computer-implemented processing, whereby the instructions executed on the computer or other programmable device provide steps for realizing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
Although the application is described in conjunction with specific features and embodiments thereof, it is evident that various modifications and combinations can be made without departing from the spirit and scope of the application. Accordingly, the specification and drawings are merely exemplary illustrations of the application as defined by the appended claims, and are deemed to cover any and all modifications, variations, combinations or equivalents within the scope of the application. Obviously, those skilled in the art can make various modifications and variations to the application without departing from its spirit and scope. Provided these modifications and variations of the application fall within the scope of the claims of the application and their technical equivalents, the application is also intended to include them.

Claims (10)

1. A pedestrian re-identification method, characterized by comprising:
obtaining an input image;
preprocessing the input image to obtain a target image;
performing feature extraction on the target image by a preset convolutional-neural-network training model to obtain a first feature set, wherein the preset convolutional-neural-network training model consists of a first training module and a second training module and fuses the features extracted by the first training module and the second training module into one feature set, the first training model being implemented based on the SphereLoss function and the SoftmaxLoss function and used to train global features, and the second training model being implemented based on the SoftmaxLoss function and used to train local features;
performing feature extraction on each image in a preset image library by the preset convolutional-neural-network training model to obtain multiple second feature sets;
determining the Hamming distance between the first feature set and each of the multiple second feature sets to obtain multiple Hamming distance values;
calculating, from the multiple Hamming distance values, a similarity probability value between the input image and each image in the image library to obtain multiple similarity probability values;
selecting the similarity probability values greater than a preset threshold from the multiple similarity probability values to obtain at least one target similarity probability value;
and showing the user, in descending order of similarity probability value, the images in the image library corresponding to the at least one target similarity probability value.
2. The method according to claim 1, characterized in that performing feature extraction on the target image by the preset convolutional-neural-network training model to obtain the first feature set comprises:
running the DeeperCut algorithm on the target image to obtain 4 key points;
dividing the target image into three regions according to the 4 key points, the three regions being a head region, an upper-body region and a lower-body region;
inputting the target image into the first training model for training to obtain one feature set;
inputting the head region, the upper-body region and the lower-body region respectively into the second training model for training to obtain three feature sets;
and composing the one feature set and the three feature sets into the first feature set.
3. The method according to claim 1 or 2, characterized in that preprocessing the input image to obtain the target image comprises:
scaling the input image so that the scaled input image has the same size as the images in the image library;
and performing image segmentation on the scaled input image to obtain the target image.
4. The method according to claim 1 or 2, characterized in that preprocessing the input image to obtain the target image comprises:
performing matting on the input image to obtain a pedestrian region image;
and scaling the pedestrian region image to obtain the target image, the scaled target image having the same size as the images in the image library.
5. The method according to any one of claims 1 to 4, characterized in that, before the input image is obtained, the method further comprises:
obtaining multiple image sets, each image set comprising multiple images of the same pedestrian under different conditions of different cameras;
performing feature extraction on each image in the multiple image sets to obtain a global image set and a part image set;
inputting the global image set into the first training module for training;
inputting the part image set into the second training module for training;
and forming the preset convolutional-neural-network training model from the trained first training module and the trained second training module.
6. A pedestrian re-identification device, characterized by comprising:
an acquiring unit, configured to obtain an input image;
a processing unit, configured to preprocess the input image to obtain a target image;
an extraction unit, configured to perform feature extraction on the target image by a preset convolutional-neural-network training model to obtain a first feature set, the preset convolutional-neural-network training model consisting of a first training module and a second training module and fusing the features extracted by the two modules into one feature set, the first training model being implemented based on the SphereLoss function and the SoftmaxLoss function and used to train global features, and the second training model being implemented based on the SoftmaxLoss function and used to train local features; and further configured to perform feature extraction on each image in a preset image library by the preset convolutional-neural-network training model to obtain multiple second feature sets;
a determination unit, configured to determine the Hamming distance between the first feature set and each of the multiple second feature sets to obtain multiple Hamming distance values;
a computing unit, configured to calculate, from the multiple Hamming distance values, a similarity probability value between the input image and each image in the image library to obtain multiple similarity probability values;
a selection unit, configured to select the similarity probability values greater than a preset threshold from the multiple similarity probability values to obtain at least one target similarity probability value;
and a display unit, configured to show the user, in descending order of similarity probability value, the images in the image library corresponding to the at least one target similarity probability value.
7. The device according to claim 6, characterized in that, in the aspect of performing feature extraction on the target image by the preset convolutional-neural-network training model to obtain the first feature set, the extraction unit is specifically configured to:
run the DeeperCut algorithm on the target image to obtain 4 key points;
divide the target image into three regions according to the 4 key points, the three regions being a head region, an upper-body region and a lower-body region;
input the target image into the first training model for training to obtain one feature set;
input the head region, the upper-body region and the lower-body region respectively into the second training model for training to obtain three feature sets;
and compose the one feature set and the three feature sets into the first feature set.
8. The device according to claim 6 or 7, characterized in that, in the aspect of preprocessing the input image to obtain the target image, the processing unit is specifically configured to:
scale the input image so that the scaled input image has the same size as the images in the image library;
and perform image segmentation on the scaled input image to obtain the target image.
9. The device according to claim 6 or 7, characterized in that, in the aspect of preprocessing the input image to obtain the target image, the processing unit is specifically configured to:
perform matting on the input image to obtain a pedestrian region image;
and scale the pedestrian region image to obtain the target image.
10. A computer-readable storage medium, characterized in that it stores a computer program for electronic data interchange, wherein the computer program causes a computer to execute the method according to any one of claims 1 to 5.
CN201811262562.8A 2018-10-27 2018-10-27 Pedestrian re-identification method and related product Active CN109657533B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201811262562.8A CN109657533B (en) 2018-10-27 2018-10-27 Pedestrian re-identification method and related product

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201811262562.8A CN109657533B (en) 2018-10-27 2018-10-27 Pedestrian re-identification method and related product

Publications (2)

Publication Number Publication Date
CN109657533A true CN109657533A (en) 2019-04-19
CN109657533B CN109657533B (en) 2020-09-25

Family

ID=66110372

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201811262562.8A Active CN109657533B (en) 2018-10-27 2018-10-27 Pedestrian re-identification method and related product

Country Status (1)

Country Link
CN (1) CN109657533B (en)

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111444758A (en) * 2019-12-26 2020-07-24 珠海大横琴科技发展有限公司 Pedestrian re-identification method and device based on spatio-temporal information
CN111639513A (en) * 2019-12-10 2020-09-08 珠海大横琴科技发展有限公司 Ship shielding identification method and device and electronic equipment
CN111738362A (en) * 2020-08-03 2020-10-02 成都睿沿科技有限公司 Object recognition method and device, storage medium and electronic equipment
WO2020244071A1 (en) * 2019-06-06 2020-12-10 平安科技(深圳)有限公司 Neural network-based gesture recognition method and apparatus, storage medium, and device
CN112349096A (en) * 2020-10-28 2021-02-09 厦门博海中天信息科技有限公司 Method, system, medium and equipment for intelligently identifying pedestrians on road
WO2021042547A1 (en) * 2019-09-04 2021-03-11 平安科技(深圳)有限公司 Behavior identification method, device and computer-readable storage medium

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20130343642A1 (en) * 2012-06-21 2013-12-26 Siemens Corporation Machine-learnt person re-identification
CN107766791A (en) * 2017-09-06 2018-03-06 北京大学 A kind of pedestrian based on global characteristics and coarseness local feature recognition methods and device again
CN107844753A (en) * 2017-10-20 2018-03-27 珠海习悦信息技术有限公司 Pedestrian in video image recognition methods, device, storage medium and processor again
CN107862340A (en) * 2017-11-16 2018-03-30 深圳市华尊科技股份有限公司 A kind of model recognizing method and device

Also Published As

Publication number Publication date
CN109657533B (en) 2020-09-25

Similar Documents

Publication Publication Date Title
CN109657533A Pedestrian re-identification method and related product
US20180018503A1 (en) Method, terminal, and storage medium for tracking facial critical area
US8792722B2 (en) Hand gesture detection
US20120027263A1 (en) Hand gesture detection
CN109145766B (en) Model training method and device, recognition method, electronic device and storage medium
US10108270B2 (en) Real-time 3D gesture recognition and tracking system for mobile devices
CN108537136B (en) Pedestrian re-identification method based on attitude normalization image generation
CN109815843B (en) Image processing method and related product
Tian et al. Scene Text Detection in Video by Learning Locally and Globally.
CN108960114A (en) Human body recognition method and device, computer readable storage medium and electronic equipment
CN111310731A (en) Video recommendation method, device and equipment based on artificial intelligence and storage medium
WO2003056501A1 (en) Methods and apparatus for face recognition
CN111709409B (en) Face living body detection method, device, equipment and medium
Lee et al. 3-D Human Behavior Understanding Using Generalized TS-LSTM Networks
CN111914812A (en) Image processing model training method, device, equipment and storage medium
Ji et al. Visual-based view-invariant human motion analysis: A review
Jiang et al. Spatial and temporal pyramid-based real-time gesture recognition
CN109241829B (en) Behavior identification method and device based on space-time attention convolutional neural network
Zhu et al. Enhancing interior and exterior deep facial features for face detection in the wild
CN109784140A (en) Driver attributes' recognition methods and Related product
CN109670517A (en) Object detection method, device, electronic equipment and target detection model
Afrasiabi et al. Spatial-temporal dual-actor CNN for human interaction prediction in video
Kartbayev et al. Development of a computer system for identity authentication using artificial neural networks
Vaishali Real-Time Object Detection System using Caffe Model
Simkanič Matrix Descriptor of Changes (MDC): Activity Recognition Based on Skeleton

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant
PE01 Entry into force of the registration of the contract for pledge of patent right

Denomination of invention: Pedestrian re-identification method and related products

Effective date of registration: 20210810

Granted publication date: 20200925

Pledgee: Shenzhen hi tech investment small loan Co.,Ltd.

Pledgor: SHENZHEN HARZONE TECHNOLOGY Co.,Ltd.

Registration number: Y2021980007467