CN110298302A - Human body target detection method and related device - Google Patents
Human body target detection method and related device
- Publication number: CN110298302A
- Application number: CN201910566084.8A
- Authority
- CN
- China
- Prior art keywords
- sample image
- region
- human body
- human
- label
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/24—Classification techniques
- G06F18/241—Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V40/00—Recognition of biometric, human-related or animal-related patterns in image or video data
- G06V40/10—Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
- G06V40/103—Static body considered as a whole, e.g. static pedestrian or occupant recognition
Abstract
The embodiments of the invention disclose a human body target detection method and a related device. The method includes: first obtaining a sample image from a plurality of images, the sample image containing a human body region that comprises a human head region, a visible human body region and a whole-body human region; then determining model training information of the sample image according to the human body region; then inputting the sample image and the model training information into a model to be trained, to obtain a human body target detection model; and finally determining, according to the human body target detection model, the human head region, the visible human body region and the whole-body human region in an image to be detected. Using the embodiments of the invention, the detection of occluded human targets and the accuracy of whole-body human position detection can be improved.
Description
Technical field
The present invention relates to the technical field of image processing, and in particular to a human body target detection method and a related device.
Background art
At present, human body target detection is an important branch of the image and video processing field. Human body target detection technology is widely used in person-centric content understanding and analysis of images and videos in the cloud and on mobile terminals, enabling analysis of human behavior, motion analysis and recognition. Examples include real-time monitoring and abnormality alarms from cameras in certain scenes; person recognition, detection and localization for autonomous vehicles and robots; and motion analysis and sentiment analysis of massive person-centric pictures and videos in the cloud. Existing human body target detection technology usually uses the whole-body human region as the only supervision information to train a target detection network that predicts only the classification and position regression result of the whole body. Because the information dimension utilized is single, the detection effect of the prior art is unsatisfactory.
Summary of the invention
The present invention provides a human body target detection method and a related device, which can improve the detection of occluded human targets and the accuracy of whole-body human position detection.
In a first aspect, an embodiment of the invention provides a human body target detection method, including:
obtaining a sample image from a plurality of images, the sample image containing a human body region, where the human body region includes a human head region, a visible human body region and a whole-body human region;
determining, according to the human body region, model training information of the sample image, the model training information including at least one of a type label and a position regression loss value;
inputting the sample image and the model training information into a model to be trained, to obtain a human body target detection model; and
determining, according to the human body target detection model, the human body region in an image to be detected.
The human head region is the region where a human head is located in the sample image; the visible human body region is the human region that is not occluded in the sample image; and the whole-body human region is the region where the entire human body, including any occluded parts, is located in the sample image.
The model to be trained includes a first classifier, and the type label includes a positive sample label and a non-positive sample label.
Determining the model training information of the sample image according to the human body region includes:
determining a first overlap degree between the whole-body human region and the sample image; and
when the first overlap degree is greater than a first threshold, determining that the type label is the positive sample label; otherwise, determining that the type label is the non-positive sample label.
Inputting the sample image and the model training information into the model to be trained to obtain the human body target detection model includes:
inputting the sample image and the type label into the first classifier for training.
The model to be trained includes a first position regressor.
Inputting the sample image and the model training information into the model to be trained to obtain the human body target detection model further includes:
determining the sample image whose type label is the positive sample label as a first target sample image;
determining the position regression loss value of the first target sample image according to a first training loss function corresponding to the first position regressor; and
inputting the first target sample image and its position regression loss value into the first position regressor for training.
The model to be trained includes a second classifier, and the type label includes a positive sample label and a non-positive sample label.
Determining the model training information of the sample image according to the human body region includes:
determining the overlap degree between the visible human body region and the sample image; and
when the overlap degree is greater than a second threshold, determining that the type label is the positive sample label; otherwise, determining that the type label is the non-positive sample label.
Inputting the sample image and the model training information into the model to be trained to obtain the human body target detection model includes:
inputting the sample image and the type label into the second classifier for training.
The model to be trained includes a second position regressor.
Inputting the sample image and the model training information into the model to be trained to obtain the human body target detection model further includes:
determining the sample image whose type label is the positive sample label as a second target sample image;
determining the position regression loss value of the second target sample image according to a second training loss function corresponding to the second position regressor; and
inputting the second target sample image and its position regression loss value into the second position regressor for training.
The model to be trained includes a third position regressor, and the type label includes a positive sample label and a non-positive sample label.
Determining the model training information of the sample image according to the human body region includes:
determining the overlap degree between the human head region and the sample image; and
when the overlap degree is greater than a third threshold, determining that the type label is the positive sample label; otherwise, determining that the type label is the non-positive sample label.
Inputting the sample image and the model training information into the model to be trained to obtain the human body target detection model includes:
determining the sample image whose type label is the positive sample label as a third target sample image;
determining the position regression loss value of the third target sample image according to a third training loss function corresponding to the third position regressor; and
inputting the third target sample image and its position regression loss value into the third position regressor for training.
Determining the overlap degree between the visible human body region or the human head region and the sample image includes:
determining a first overlap area between the visible human body region and the sample image, and taking the quotient of the first overlap area and the area of the visible human body region as the overlap degree between the visible human body region and the sample image; and
determining a second overlap area between the human head region and the sample image, and taking the quotient of the second overlap area and the area of the human head region as the overlap degree between the human head region and the sample image.
In a second aspect, an embodiment of the invention provides a human body target detection device, including:
a sample acquisition module, configured to obtain a sample image from a plurality of images, the sample image containing a human body region, where the human body region includes a human head region, a visible human body region and a whole-body human region;
an information determination module, configured to determine model training information of the sample image according to the human body region, the model training information including at least one of a type label and a position regression loss value;
a model training module, configured to input the sample image and the model training information into a model to be trained, to obtain a human body target detection model; and
a target detection module, configured to determine, according to the human body target detection model, the human body region in an image to be detected.
The human head region is the region where a human head is located in the sample image; the visible human body region is the human region that is not occluded in the sample image; and the whole-body human region is the region where the entire human body, including any occluded parts, is located in the sample image.
The model to be trained includes a first classifier, and the type label includes a positive sample label and a non-positive sample label.
The information determination module is further configured to:
determine a first overlap degree between the whole-body human region and the sample image; and
when the first overlap degree is greater than a first threshold, determine that the type label is the positive sample label; otherwise, determine that the type label is the non-positive sample label.
The model training module is further configured to:
input the sample image and the type label into the first classifier for training.
The model to be trained includes a first position regressor.
The model training module is further configured to:
determine the sample image whose type label is the positive sample label as a first target sample image;
determine the position regression loss value of the first target sample image according to a first training loss function corresponding to the first position regressor; and
input the first target sample image and its position regression loss value into the first position regressor for training.
The model to be trained includes a second classifier, and the type label includes a positive sample label and a non-positive sample label.
The information determination module is further configured to:
determine the overlap degree between the visible human body region and the sample image; and
when the overlap degree is greater than a second threshold, determine that the type label is the positive sample label; otherwise, determine that the type label is the non-positive sample label.
The model training module is further configured to:
input the sample image and the type label into the second classifier for training.
The model to be trained includes a second position regressor.
The model training module is further configured to:
determine the sample image whose type label is the positive sample label as a second target sample image;
determine the position regression loss value of the second target sample image according to a second training loss function corresponding to the second position regressor; and
input the second target sample image and its position regression loss value into the second position regressor for training.
The model to be trained includes a third position regressor, and the type label includes a positive sample label and a non-positive sample label.
The information determination module is further configured to:
determine the overlap degree between the human head region and the sample image; and
when the overlap degree is greater than a third threshold, determine that the type label is the positive sample label; otherwise, determine that the type label is the non-positive sample label.
The model training module is further configured to:
determine the sample image whose type label is the positive sample label as a third target sample image;
determine the position regression loss value of the third target sample image according to a third training loss function corresponding to the third position regressor; and
input the third target sample image and its position regression loss value into the third position regressor for training.
In a third aspect, an embodiment of the invention provides a human body target detection device, including a processor, a memory and a communication bus, where the communication bus is configured to implement connection and communication between the processor and the memory, and the processor executes a program stored in the memory to implement the steps of the human body target detection method provided in the first aspect.
In a possible design, the device provided by the invention may include modules corresponding to the behaviors in the above method. The modules may be software and/or hardware.
Another aspect of the embodiments of the invention provides a computer-readable storage medium storing a plurality of instructions, the instructions being adapted to be loaded by a processor to execute the methods described in the above aspects.
Another aspect of the embodiments of the invention provides a computer program product containing instructions which, when run on a computer, cause the computer to execute the methods described in the above aspects.
By implementing the embodiments of the invention, a sample image is first obtained from a plurality of images, the sample image containing a human body region that includes a human head region, a visible human body region and a whole-body human region; model training information of the sample image is then determined according to the human body region; the sample image and the model training information are then input into a model to be trained, to obtain a human body target detection model; and finally, the human head region, the visible human body region and the whole-body human region in an image to be detected are determined according to the human body target detection model. This can improve the detection of occluded human targets and the accuracy of whole-body human position detection.
Brief description of the drawings
To illustrate the technical solutions in the embodiments of the invention or in the background art more clearly, the drawings required in the embodiments of the invention or in the background art are described below.
Fig. 1 is a kind of structural schematic diagram of image detecting system provided in an embodiment of the present invention;
Fig. 2 is a kind of classification schematic diagram of human body region provided in an embodiment of the present invention;
Fig. 3 is a kind of flow diagram of human body target detection method provided in an embodiment of the present invention;
Fig. 4 is a kind of method schematic diagram that sample image is obtained from image provided in an embodiment of the present invention;
Fig. 5 is a kind of schematic diagram for marking multiple whole body human regions provided in an embodiment of the present invention;
Fig. 6 is the flow diagram of another human body target detection method provided in an embodiment of the present invention;
Fig. 7 is a kind of schematic diagram of trident prediction network provided in an embodiment of the present invention;
Fig. 8 is a kind of structural schematic diagram of human body target detection device provided in an embodiment of the present invention;
Fig. 9 is a kind of structural schematic diagram of human body target detection device provided in an embodiment of the present invention.
Detailed description of embodiments
The technical solutions in the embodiments of the invention are described below clearly and completely with reference to the drawings in the embodiments. Obviously, the described embodiments are only some rather than all of the embodiments of the invention. All other embodiments obtained by those of ordinary skill in the art based on the embodiments of the invention without creative effort shall fall within the protection scope of the invention.
Referring to Fig. 1, Fig. 1 is a schematic structural diagram of an image detection system provided by an embodiment of the invention. As shown, the image detection system includes a network backbone, a candidate box generation network and a trident prediction network. The image detection system can be implemented on top of any classical target detection framework. Classical frameworks include the region-based convolutional neural network (Region with Convolutional Neural Networks, RCNN) series, such as Fast RCNN, Faster RCNN and RFCN, as well as single-stage detectors such as YOLO and SSD. The network backbone and the candidate box generation network are logic modules in the classical detection framework. Taking Faster RCNN as an example, the network backbone can be the feature extraction part of the Faster RCNN network, a residual neural network (Residual Neural Network, ResNet), and the candidate box generation network can be the branch network drawn from a certain layer of it, the region proposal network (Region Proposal Network, RPN). The network backbone is used to extract convolution features (feature maps) from the original image, and the candidate box generation network can obtain the approximate location of a target in the original image from the convolution features and mark that location with an annotation box (generally a rectangular box). The image segment inside the annotation box subsequently enters the trident prediction network to detect the human body region in the image segment, where the human body region includes a human head region, a visible human body region and a whole-body human region. As shown in Fig. 2, the human head region is the region where a human head is located in the image, the visible human body region is the human region that is not occluded in the image, and the whole-body human region is the region where the entire human body, including any occluded parts, is located in the image. Based on the above image detection system, embodiments of the invention provide the following human body target detection method corresponding to the above trident prediction network.
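As a rough illustration of the structure described above, the following NumPy sketch (hypothetical layer sizes, randomly initialised weights; not the patent's implementation) shows a three-branch head, in the spirit of the trident prediction network of Fig. 7, operating on a pooled per-candidate feature vector that the backbone and candidate box generation network are assumed to have produced already:

```python
import numpy as np

rng = np.random.default_rng(0)

def linear(dim_in, dim_out):
    # One fully connected layer; weights are random placeholders for the sketch.
    w = rng.standard_normal((dim_in, dim_out)) * 0.01
    b = np.zeros(dim_out)
    return lambda x: x @ w + b

FEAT = 256  # hypothetical size of the pooled per-candidate feature

# Three parallel branches on a shared feature: the whole-body and visible
# branches each carry a classifier and a position regressor, while the head
# branch carries only a regressor (see note 2) in the description below).
whole_body_cls = linear(FEAT, 2)   # positive / non-positive
whole_body_reg = linear(FEAT, 4)   # (tx, ty, tw, th)
visible_cls    = linear(FEAT, 2)
visible_reg    = linear(FEAT, 4)
head_reg       = linear(FEAT, 4)

def trident_head(feat):
    return {
        "whole_body": (whole_body_cls(feat), whole_body_reg(feat)),
        "visible":    (visible_cls(feat),    visible_reg(feat)),
        "head":       (None,                 head_reg(feat)),
    }

out = trident_head(rng.standard_normal(FEAT))
```

The per-branch output shapes (two classification logits, four box offsets) are assumptions chosen to match the three-dimension supervision described in this document.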
Referring to Fig. 3, Fig. 3 is a schematic flow diagram of a human body target detection method provided by an embodiment of the invention. The method includes, but is not limited to, the following steps.
S301: obtain a sample image from a plurality of images, the sample image containing a human body region.
In a specific implementation, the plurality of images refers to one or more images, where each image can be an image of a scene containing people captured by a camera, webcam or other photographic device. As shown in Fig. 4, candidate boxes can first be marked on an image by the network backbone and the candidate box generation network shown in Fig. 1, and the image segment inside a candidate box can then be cut out of the image as a sample image. In Fig. 4, besides the sample image marked in the figure, multiple other sample images can also be obtained from the same image; they are not enumerated here. For model training, the human body region in every sample image can be manually annotated with annotation boxes, where the human body region can be, but is not limited to, the human head region, the visible human body region and the whole-body human region shown in Fig. 2.
S302: determine the model training information of the sample image according to the human body region.
In a specific implementation, the model training information may include the type label of the sample image and the position regression loss value of the sample image. Sample images can be divided into positive sample images and non-positive sample images; correspondingly, the type label may include a positive sample label and a non-positive sample label. In order to separately train the classifier and position regressor corresponding to the visible human body region, the classifier and position regressor corresponding to the whole-body human region, and the position regressor corresponding to the human head region, the model training information of every sample image can be determined separately from three different dimensions: the human head region, the visible human body region and the whole-body human region. This specifically includes the following steps.
(1) The dimension of the whole-body human region.
On the one hand, the overlap degree (denoted IoU) between the whole-body human region and the sample image can be determined first. As shown in formula (1), the ratio of the intersection to the union of the areas of the whole-body human region and the sample image can be used as this overlap degree:
IoU = intersection(G, B) / union(G, B)   (1)
where B denotes the sample image, G denotes the image segment where the whole-body human region is located, intersection(G, B) denotes the overlap area of B and G, and union(G, B) denotes the area of the union of B and G. Then, when the IoU is greater than a first threshold (e.g. 0.5), the sample image is determined to be a positive sample image, so its type label is determined to be the positive sample label. When the IoU is not greater than the first threshold, the sample image is determined to be a non-positive sample image, so its type label is determined to be the non-positive sample label. The type label is only used to distinguish sample types and can be a number, a letter or a character string. In general, the training loss function of the whole-body human region classifier is as shown in formula (2), where p is the prediction confidence of the classifier and d is the true value of the sample image; the true value of a positive sample image is 1 and that of a non-positive sample image is 0, so the positive sample label can be directly set to 1 and the non-positive sample label to 0.
W = -[d·log(p) + (1 - d)·log(1 - p)]   (2)
In addition, the IoU between the sample image B and any other whole-body human region in the image where the sample image is located can also be calculated. If the IoU between the sample image and any other whole-body human region is greater than the first threshold, the type label of the sample image is determined to be the positive sample label; otherwise, it is determined to be the non-positive sample label.
For example, as shown in Fig. 5, the type label of the sample image can be determined either according to the IoU between the sample image and whole-body human region 1, or according to the IoU between the sample image and whole-body human region 2.
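The labeling rule above can be sketched as follows (boxes given as hypothetical (x1, y1, x2, y2) tuples; an illustration only, not the patent's implementation):

```python
def area(box):
    x1, y1, x2, y2 = box
    return max(0.0, x2 - x1) * max(0.0, y2 - y1)

def intersection(a, b):
    # Overlap area of two axis-aligned boxes (0 if they do not overlap).
    x1, y1 = max(a[0], b[0]), max(a[1], b[1])
    x2, y2 = min(a[2], b[2]), min(a[3], b[3])
    return area((x1, y1, x2, y2))

def iou(g, b):
    # Formula (1): intersection over union of whole-body region G and sample image B.
    inter = intersection(g, b)
    return inter / (area(g) + area(b) - inter)

def whole_body_label(sample_box, whole_body_boxes, first_threshold=0.5):
    # Positive sample label (1) if the sample overlaps ANY whole-body region
    # by more than the first threshold, otherwise non-positive (0).
    return int(any(iou(g, sample_box) > first_threshold for g in whole_body_boxes))
```

For instance, a sample box (0, 0, 10, 10) against a whole-body box (0, 0, 10, 12) gives IoU = 100/120 > 0.5 and is labeled positive.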
On the other hand, a positive sample image can be taken as a target sample image, and the position regression loss value of the target sample image can then be calculated according to the training loss function of the position regressor corresponding to the whole-body human region. The training loss function can take, but is not limited to, the form shown in formula (3):
L = Σ_{i ∈ {x, y, w, h}} ℓ(f_i − t_i)   (3)
where f = (f_x, f_y, f_w, f_h) denotes the offsets predicted by the regressor for the sample image B, ℓ is the per-coordinate regression distance (for example, the smooth L1 function), and the regression targets t are defined as
t_x = (G*_x − P_x) / P_w,  t_y = (G*_y − P_y) / P_h,
t_w = log(G*_w / P_w),  t_h = log(G*_h / P_h),
where log denotes the denary (base-10) logarithm. P_x and P_y respectively denote the horizontal and vertical coordinates of the center point of the sample image B, in a coordinate system whose origin can be a vertex of the sample image, and P_w and P_h respectively denote the width and height of the sample image B. The subscripts of G have the same meaning: G_x, G_y, G_w and G_h are respectively the center coordinates, width and height of the image segment G where the whole-body human region is located, and G*_x, G*_y, G*_w and G*_h are respectively the true center coordinates, width and height of the whole-body human region within the sample image B.
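Under the definitions above, the regression targets can be computed as follows (base-10 logarithm as stated in the text; boxes given as (cx, cy, w, h) tuples; an illustrative sketch only):

```python
import math

def regression_targets(p, g_star):
    # p: (cx, cy, w, h) of the sample image B;
    # g_star: (cx, cy, w, h) of the true whole-body region inside B.
    # Returns (t_x, t_y, t_w, t_h) per the definitions accompanying formula (3).
    px, py, pw, ph = p
    gx, gy, gw, gh = g_star
    return (
        (gx - px) / pw,
        (gy - py) / ph,
        math.log10(gw / pw),
        math.log10(gh / ph),
    )
```

A perfectly aligned region yields all-zero targets; the regressor is trained so that its predicted offsets f approach these targets.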
(2) The dimension of the visible human body region.
On the one hand, the overlap degree (denoted IoG_1) between the visible human body region and the sample image can be determined first. IoG_1 can be calculated with formula (9):
IoG_1 = intersection(R, B) / area(R)   (9)
where B denotes the sample image, R denotes the image segment where the visible human body region is located, intersection(R, B) denotes the overlap area of B and R, and area(R) denotes the area of R. Then, when IoG_1 is greater than a second threshold (e.g. 0.7), the sample image is determined to be a positive sample image, so its type label is determined to be the positive sample label. When IoG_1 is not greater than the second threshold, the sample image is determined to be a non-positive sample image, so its type label is determined to be the non-positive sample label. The type label is only used to distinguish sample types, and the training loss function of the visible human body region classifier is the same as the function shown in formula (2), so the positive sample label can be 1 and the non-positive sample label can be 0.
In addition, the IoG_1 between the sample image B and any other visible human body region in the image where the sample image is located can also be calculated. If the IoG_1 between the sample image and any other visible human body region is greater than the second threshold, the type label of the sample image is determined to be the positive sample label; otherwise, it is determined to be the non-positive sample label.
On the other hand, a positive sample image can be taken as a target sample image, and the position regression loss value of the target sample image can then be calculated according to the training loss function of the position regressor corresponding to the visible human body region. The training loss function can take, but is not limited to, the form shown in formula (3), i.e. the same as the training loss function of the position regressor of the whole-body human region; therefore, the method of calculating the position regression loss value of a sample image from the dimension of the visible human body region is the same as that from the dimension of the whole-body human region and is not repeated here.
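The overlap degree of formula (9) divides by the area of the annotated region itself rather than by the union, and can be sketched as follows (boxes as hypothetical (x1, y1, x2, y2) tuples):

```python
def _area(box):
    x1, y1, x2, y2 = box
    return max(0.0, x2 - x1) * max(0.0, y2 - y1)

def iog(region, sample):
    # Formulas (9)/(10): overlap area divided by the area of the annotated
    # region, so a sample box that fully covers the region scores 1.0 no
    # matter how much larger the sample box itself is.
    x1, y1 = max(region[0], sample[0]), max(region[1], sample[1])
    x2, y2 = min(region[2], sample[2]), min(region[3], sample[3])
    return _area((x1, y1, x2, y2)) / _area(region)
```

For example, a small visible region (2, 2, 4, 4) fully inside the sample box (0, 0, 10, 10) gives IoG = 1.0, whereas the IoU of the same pair would be far below 1; this is exactly why the positive-box rule for partially occluded people uses IoG instead of IoU.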
(3) dimension in human body head region.
Specifically, the degree of overlapping (being denoted as IoG_2) of whole body human region and sample image can be determined first, wherein can
To calculate IoG_2 according to the formula as shown in formula (10), wherein B indicates that sample image, T indicate the figure where viewing human region
Photo section, then intersection (T, B) indicates that the overlapping area of B and T, T indicate the area of T.Then, when IoG_2 is greater than the
Three threshold values (such as 0.7), are determined as positive example sample image for sample image, when IoG_2 is not more than third threshold value, by sample image
It is determined as non-positive example sample image.It then equally can be using positive example sample image as target sample image, and according to visual people
The corresponding position of body region returns the training loss function of device, and the position for calculating the target sample image returns penalty values.Wherein,
Training loss function can be, but not limited to as the form as shown in formula (3), the i.e. instruction with the position of whole body human region recurrence device
Practice loss function it is identical, therefore from the dimension in human body head region calculate sample image position return penalty values method with from
The method that the position that the dimension of whole body human region calculates the sample image returns penalty values is identical, and which is not described herein again.
In addition, the IoG_2 between the sample image B and any other human head region in the image where the sample image is located may also be calculated. If the IoG_2 between the sample image and any other human head region is greater than the third threshold, the type label of the sample image is determined as the positive sample label; otherwise, it is determined as the non-positive sample label.
It should be noted that: 1) The degree of overlap with the sample image is calculated differently for the whole-body human region than for the visible human region and the human head region. For the whole-body human region, the common overlap calculation (intersection over union) is used. For the visible human region, however, if a candidate region lies between the visible human region and the whole-body human region but sufficiently covers the major part of the visible human region, then how much of the occluded human region it also covers can be ignored, and it can still be regarded as a positive box for the visible human region. Therefore, compared with formula (1), the denominator of formula (9) is the area of the visible human region rather than the union of the areas of the visible human region and the sample image. The same reasoning applies to the human head region.
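The distinction drawn in note 1) between the two overlap measures can be sketched in code. This is a minimal illustration under assumed conventions: boxes are represented as (x_min, y_min, width, height) tuples, and the helper names are invented for illustration rather than taken from formulas (1) and (9) themselves.

```python
def area(box):
    # box = (x_min, y_min, width, height)
    return box[2] * box[3]

def intersection(a, b):
    # Overlap area of two axis-aligned boxes; zero if they do not intersect.
    ix = max(0.0, min(a[0] + a[2], b[0] + b[2]) - max(a[0], b[0]))
    iy = max(0.0, min(a[1] + a[3], b[1] + b[3]) - max(a[1], b[1]))
    return ix * iy

def iou(sample, region):
    # Intersection over union, as in formula (1) for the whole-body region.
    i = intersection(sample, region)
    return i / (area(sample) + area(region) - i)

def iog(sample, region):
    # Intersection over the area of the annotated (visible / head) region,
    # as in formulas (9)/(10): how much the sample box also covers of the
    # occluded part is deliberately ignored.
    return intersection(sample, region) / area(region)

# A tall candidate box that fully covers a small visible region:
visible = (2.0, 2.0, 2.0, 2.0)   # annotated visible human region
sample = (1.0, 1.0, 4.0, 8.0)    # candidate box extending over occluded area
print(iog(sample, visible))      # 1.0 -> positive box for the visible branch
print(iou(sample, visible))      # 0.125 -> would fail an IoU-based test
```

On this toy example the box covers the visible region entirely (IoG_1 = 1.0) yet has a low IoU, which is exactly why the visible and head branches use the IoG-style denominator.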
2) The classifiers corresponding to the whole-body human region and the visible human region are used to identify whether an image contains the corresponding human region, whereas it is certain that the image obtained through the backbone network and the candidate-box generation network contains a human head region, so there is no need to separately train a classifier for the human head region. Therefore, in the dimension of the human head region, the model training information of a sample image only includes the position regression loss value.
S303: The sample image and the model training information are input into the model to be trained for training, to obtain a human body target detection model.
In specific implementation, for each of the three dimensions, i.e., the whole-body human region, the visible human region, and the human head region, the sample image and the model training information of the sample image in the corresponding dimension may be input into the model to be trained, so as to update the model parameters of the model to be trained and thereby achieve the purpose of model training.
S304: According to the human body target detection model, the human body region in an image to be detected is determined.
In specific implementation, as shown in Fig. 1, after the image to be detected is successively processed by the backbone network and the candidate-box generation network, the processing result enters the human body target detection model, which can then detect whether the image to be detected contains a visible human region and a whole-body human region, and output the position information of the human head region, the visible human region, and the whole-body human region.
In the embodiments of the present invention, a sample image is first obtained from multiple images, the sample image containing human body regions including a human head region, a visible human region, and a whole-body human region; then the model training information of the sample image is determined according to the human body regions; then the sample image and the model training information are input into the model to be trained for training, to obtain a human body target detection model; finally, according to the human body target detection model, the human head region, the visible human region, and the whole-body human region in the image to be detected are determined. In this way, the auxiliary effect of the human head and the visible human region on the whole-body human body can be considered simultaneously during model training. Compared with a detector trained only with the classification information and position regression values of the whole-body human region as supervision information, the human body target detection model in the embodiments of the present invention can improve the accuracy of whole-body human position detection.
Referring to Fig. 6, Fig. 6 is a schematic flowchart of another human body target detection method provided by an embodiment of the present invention. The method includes but is not limited to the following steps:
S601: A sample image is obtained from multiple images, the sample image containing a human body region. This step is the same as S301 in the previous embodiment and is not repeated here.
S602: The model training information of the sample image is determined according to the human body region. This step is the same as S302 in the previous embodiment and is not repeated here.
S603: According to the sample image and the model training information, the whole-body human region classifier and position regressor, the visible human region classifier and position regressor, and the human head region position regressor are trained respectively, to obtain a human body target detection model.
In specific implementation, the model training steps include:
(1) Training of the whole-body human region classifier and position regressor. The positive sample images and non-positive sample images in the whole-body human region dimension are marked with their type labels and then input into the whole-body human region classifier, so as to update the model parameters of the classifier and thereby train the classifier. In addition, the positive sample images and the position regression loss values of these sample images are input into the whole-body human region position regressor, so as to update the model parameters of the position regressor and thereby train the position regressor.
(2) Training of the visible human region classifier and position regressor. The positive sample images and non-positive sample images in the visible human region dimension are marked with their type labels and then input into the visible human region classifier to train it. In addition, the positive sample images and the position regression loss values of these sample images are input into the visible human region position regressor to train it.
(3) Training of the human head region position regressor. The positive sample images in the human head region dimension and the position regression loss values of these sample images are input into the human head region position regressor to train it.
In conclusion whole body human region classifier and position can be returned device, viewing human region classifier and position
It puts back into and device and human body head regional location is returned to return three detection branches of the device as human body target detection model, so as to
Human body target detection model is visually referred to as trident prediction network (corresponding system shown in FIG. 1).
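The three-branch structure just summarized can be sketched as a toy forward pass. This is a hedged illustration only: the per-box feature size of 256, the random initialisation, and the sigmoid scores are assumptions, and the real trident prediction network sits on top of the backbone and candidate-box generation network of Fig. 1.

```python
import numpy as np

rng = np.random.default_rng(0)

def linear(in_dim, out_dim):
    # A randomly initialised fully connected layer: (weights, bias).
    return rng.standard_normal((in_dim, out_dim)) * 0.01, np.zeros(out_dim)

FEAT = 256  # assumed feature size produced per candidate box

# Branch 1: whole-body human region classifier + position regressor.
full_cls = linear(FEAT, 1)
full_reg = linear(FEAT, 4)
# Branch 2: visible human region classifier + position regressor.
vis_cls = linear(FEAT, 1)
vis_reg = linear(FEAT, 4)
# Branch 3: the head region has a position regressor only (see S603).
head_reg = linear(FEAT, 4)

def forward(feat):
    def apply(layer):
        w, b = layer
        return feat @ w + b
    sigmoid = lambda z: 1.0 / (1.0 + np.exp(-z))
    return {
        "full_score": sigmoid(apply(full_cls)),
        "full_box": apply(full_reg),   # (x1, y1, w1, h1)
        "vis_score": sigmoid(apply(vis_cls)),
        "vis_box": apply(vis_reg),     # (x2, y2, w2, h2)
        "head_box": apply(head_reg),   # (x3, y3, w3, h3)
    }

out = forward(rng.standard_normal((8, FEAT)))  # 8 candidate boxes
print({k: v.shape for k, v in out.items()})
```

Each candidate box yields two classification scores and three four-value boxes, matching the outputs described for S604 below.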
S604: According to the human body target detection model, the whole-body human region, the visible human region, and the human head region in the image to be detected are determined.
In specific implementation, as shown in Fig. 1, the image to be detected is first successively processed by the backbone network and the candidate-box generation network, and the result of this processing then enters the human body target detection model for further detection. As shown in Fig. 7, in a first aspect, the whole-body human region classifier in the human body target detection model determines whether the image to be detected contains a whole-body human region and outputs a classification value, where the classification value is 1 if such a region is contained and 0 otherwise. Further, if a whole-body human region is contained, the whole-body human region position regressor detects the specific position of the region and outputs the position regression value (x1, y1, w1, h1) of the region, where x1 and y1 respectively denote the horizontal and vertical coordinates of the center point of the region, and w1 and h1 denote the width and height of the rectangular region. In a second aspect, the visible human region classifier in the human body target detection model determines whether the image to be detected contains a visible human region and outputs a classification value, where the classification value is 1 if such a region is contained and 0 otherwise. Further, if a visible human region is contained, the visible human region position regressor detects the specific position of the region and outputs the position regression value (x2, y2, w2, h2) of the region. In a third aspect, the human head region position regressor in the human body target detection model determines the specific position of the human head region in the image to be detected and outputs the position regression value (x3, y3, w3, h3) of the region. The origin of the coordinate system in which the above coordinates are expressed may be a vertex of the image to be detected.
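As a small worked example of the output format just described, a centre-point regression value (x, y, w, h) can be converted to corner coordinates in the image's coordinate system. The conversion itself is an illustration; the patent only specifies the meaning of the four values.

```python
def center_box_to_corners(x, y, w, h):
    # (x, y) is the centre of the rectangular region, (w, h) its width and
    # height, in a coordinate system whose origin is a vertex of the image.
    return (x - w / 2.0, y - h / 2.0, x + w / 2.0, y + h / 2.0)

# A 40x80 whole-body box centred at (100, 60):
print(center_box_to_corners(100.0, 60.0, 40.0, 80.0))  # (80.0, 20.0, 120.0, 100.0)
```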
In the embodiments of the present invention, a sample image is first obtained from multiple images, the sample image containing human body regions including a human head region, a visible human region, and a whole-body human region; then the model training information of the sample image is determined according to the human body regions; then, according to the sample image and the model training information, the whole-body human region classifier and position regressor, the visible human region classifier and position regressor, and the human head region position regressor are trained respectively, to obtain a human body target detection model; finally, according to the human body target detection model, the whole-body human region, the visible human region, and the human head region in the image to be detected are determined. Since the position regression of the visible region can be predicted at the same time, the detection effect on occluded human targets is better; and since the position of the human head can be predicted, the relative positional relationship between the human head and the whole-body human body can also be fully utilized to predict the position of the whole-body human body more accurately.
The method of the embodiments of the present invention has been described above; the relevant devices of the embodiments of the present invention are provided below.

Referring to Fig. 8, Fig. 8 is a schematic structural diagram of a human body target detection apparatus provided by an embodiment of the present invention. The human body target detection apparatus may include:
a sample acquisition module 801, configured to obtain a sample image from multiple images, the sample image containing a human body region.
In specific implementation, the multiple images refer to one or more images, where each image may be an image of a scene containing a person, captured by a camera, a video camera, or another photographing device. For model training, the human body regions in each sample image may be manually labeled with annotation boxes, where the human body regions to be labeled may be, but are not limited to, the human head region, the visible human region, and the whole-body human region shown in Fig. 2.
an information determination module 802, configured to determine the model training information of the sample image according to the human body region.
In specific implementation, the model training information may include the type label of the sample image and the position regression loss value of the sample image. The sample images may be divided into positive sample images and non-positive sample images; correspondingly, the type labels may include a positive sample label and a non-positive sample label. In order to respectively train the classifier and position regressor corresponding to the visible human region, the classifier and position regressor corresponding to the whole-body human region, and the position regressor of the human head region, the model training information of each sample image may be determined separately from the three different dimensions of the human head region, the visible human region, and the whole-body human region. This specifically includes the following steps:
(1) The dimension of the whole-body human region.
On the one hand, the degree of overlap (denoted IoU) between the whole-body human region and the sample image may be determined first, where, as shown in formula (1), the ratio of the intersection of the whole-body human region and the sample image to their union may be taken as the degree of overlap between the whole-body human region and the sample image. Then, when the IoU is greater than a first threshold (for example, 0.5), the sample image is determined as a positive sample image, and its type label is thus determined as the positive sample label; when the IoU is not greater than the first threshold, the sample image is determined as a non-positive sample image, and its type label is thus determined as the non-positive sample label. The type label is only used to distinguish sample types and may be a number, a letter, or a character string. Generally, the training loss function of the whole-body human region classifier is as shown in formula (2), where p is the prediction confidence probability of the classifier and d is the true value of the sample image; since the true value of a positive sample image is 1 and that of a non-positive sample image is 0, the positive sample label may directly be determined as 1 and the non-positive sample label as 0.
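The thresholding and labelling rule above can be illustrated as follows. Note that this excerpt does not reproduce formula (2) itself, so the binary cross-entropy shown for the classifier loss is an assumed but common instantiation of a loss over the prediction confidence p and the true value d, not necessarily the patent's exact formula.

```python
import math

FIRST_THRESHOLD = 0.5  # example value given in the text

def type_label(iou_value, threshold=FIRST_THRESHOLD):
    # Positive sample label 1 when the IoU exceeds the threshold, else
    # the non-positive sample label 0.
    return 1 if iou_value > threshold else 0

def classifier_loss(p, d):
    # Binary cross-entropy between the classifier's prediction confidence
    # p and the sample's true value d (assumed form of formula (2)).
    return -(d * math.log(p) + (1 - d) * math.log(1 - p))

print(type_label(0.6))  # 1 -> positive sample label
print(type_label(0.4))  # 0 -> non-positive sample label
print(classifier_loss(0.9, 1))  # small loss for a confident correct prediction
```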
In addition, the IoU between the sample image B and any other whole-body human region in the image where the sample image is located may also be calculated. If the IoU between the sample image and any other whole-body human region is greater than the first threshold, the type label of the sample image is determined as the positive sample label; otherwise, it is determined as the non-positive sample label.
On the other hand, the positive sample image may then be taken as a target sample image, and the position regression loss value of the target sample image is calculated according to the training loss function of the position regressor corresponding to the whole-body human region. The training loss function may be, but is not limited to, the form shown in formula (3).
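Formula (3) is not reproduced in this excerpt; a common form for such a position-regression training loss is the smooth-L1 loss, sketched below as an assumed example rather than the patent's actual formula.

```python
def smooth_l1(pred, target):
    # A common position-regression loss: quadratic near zero, linear for
    # large errors, summed over the four box parameters (x, y, w, h).
    total = 0.0
    for p, t in zip(pred, target):
        d = abs(p - t)
        total += 0.5 * d * d if d < 1.0 else d - 0.5
    return total

# Loss between a predicted box (x, y, w, h) and its ground-truth box:
print(smooth_l1((10.0, 20.0, 5.0, 8.0), (10.5, 20.0, 7.0, 8.0)))  # 0.125 + 1.5 = 1.625
```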
(2) The dimension of the visible human region.
On the one hand, the degree of overlap (denoted IoG_1) between the visible human region and the sample image may be determined first, where IoG_1 may be calculated with the formula shown in formula (9), in which B denotes the sample image and R denotes the image patch where the visible human region is located, so that intersection(R, B) denotes the overlap area of B and R, and R in the denominator denotes the area of R. Then, when IoG_1 is greater than a second threshold (for example, 0.7), the sample image is determined as a positive sample image, and its type label is thus determined as the positive sample label; when IoG_1 is not greater than the second threshold, the sample image is determined as a non-positive sample image, and its type label is thus determined as the non-positive sample label. The type label is only used to distinguish sample types, and the training loss function of the visible human region classifier is the same as the function shown in formula (2); therefore, the positive sample label may be 1 and the non-positive sample label may be 0.
In addition, the IoG_1 between the sample image B and any other visible human region in the image where the sample image is located may also be calculated. If the IoG_1 between the sample image and any other visible human region is greater than the second threshold, the type label of the sample image is determined as the positive sample label; otherwise, it is determined as the non-positive sample label.
On the other hand, the positive sample image may then be taken as a target sample image, and the position regression loss value of the target sample image is calculated according to the training loss function of the position regressor corresponding to the visible human region. The training loss function may be, but is not limited to, the form shown in formula (3), i.e., it is the same as the training loss function of the position regressor for the whole-body human region. Therefore, the method of calculating the position regression loss value of a sample image in the dimension of the visible human region is the same as that in the dimension of the whole-body human region, and is not repeated here.
(3) The dimension of the human head region.
Specifically, the degree of overlap (denoted IoG_2) between the human head region and the sample image may be determined first, where IoG_2 may be calculated according to the formula shown in formula (10), in which B denotes the sample image and T denotes the image patch where the human head region is located, so that intersection(T, B) denotes the overlap area of B and T, and T in the denominator denotes the area of T. Then, when IoG_2 is greater than a third threshold (for example, 0.7), the sample image is determined as a positive sample image; when IoG_2 is not greater than the third threshold, the sample image is determined as a non-positive sample image. The positive sample image may likewise be taken as a target sample image, and the position regression loss value of the target sample image is calculated according to the training loss function of the position regressor corresponding to the human head region. The training loss function may be, but is not limited to, the form shown in formula (3), i.e., it is the same as the training loss function of the position regressor for the whole-body human region. Therefore, the method of calculating the position regression loss value of a sample image in the dimension of the human head region is the same as that in the dimension of the whole-body human region, and is not repeated here.
In addition, the IoG_2 between the sample image B and any other human head region in the image where the sample image is located may also be calculated. If the IoG_2 between the sample image and any other human head region is greater than the third threshold, the type label of the sample image is determined as the positive sample label; otherwise, it is determined as the non-positive sample label.
a model training module 803, configured to input the sample image and the model training information into the model to be trained for training, to obtain a human body target detection model.

In specific implementation, the model to be trained includes the whole-body human region classifier and position regressor, the visible human region classifier and position regressor, and the human head region position regressor, and the training steps of the model may include:
(1) Training of the whole-body human region classifier and position regressor. The positive sample images and non-positive sample images in the whole-body human region dimension are marked with their type labels and then input into the whole-body human region classifier, so as to update the model parameters of the classifier and thereby train the classifier. In addition, the positive sample images and the position regression loss values of these sample images are input into the whole-body human region position regressor, so as to update the model parameters of the position regressor and thereby train the position regressor.
(2) Training of the visible human region classifier and position regressor. The positive sample images and non-positive sample images in the visible human region dimension are marked with their type labels and then input into the visible human region classifier to train it. In addition, the positive sample images and the position regression loss values of these sample images are input into the visible human region position regressor to train it.
(3) Training of the human head region position regressor. The positive sample images in the human head region dimension and the position regression loss values of these sample images are input into the human head region position regressor to train it.
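The three training procedures above follow a common pattern: labelled samples are fed in and the branch's model parameters are updated. The toy below illustrates that parameter-update loop with a stand-alone logistic-regression classifier on synthetic features; it is a stand-in for one region classifier, not the patent's actual network or data.

```python
import numpy as np

rng = np.random.default_rng(1)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def train_classifier(feats, labels, steps=200, lr=0.1):
    # Each pass over the labelled samples updates the model parameters
    # (w, b) by gradient descent on the cross-entropy loss.
    w = np.zeros(feats.shape[1])
    b = 0.0
    for _ in range(steps):
        p = sigmoid(feats @ w + b)
        grad = p - labels                      # d(loss)/d(logit)
        w -= lr * feats.T @ grad / len(labels)
        b -= lr * grad.mean()
    return w, b

# Toy positive / non-positive samples in one region dimension:
feats = np.vstack([rng.normal(1.0, 0.3, (20, 4)),    # positive samples
                   rng.normal(-1.0, 0.3, (20, 4))])  # non-positive samples
labels = np.array([1.0] * 20 + [0.0] * 20)
w, b = train_classifier(feats, labels)
pred = sigmoid(feats @ w + b) > 0.5
print(float((pred == labels).mean()))  # high accuracy on this separable toy set
```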
a target detection module 804, configured to determine the human body region in the image to be detected according to the human body target detection model.
In specific implementation, as shown in Fig. 1, the image to be detected is first successively processed by the backbone network and the candidate-box generation network, and the result of this processing then enters the human body target detection model for further detection. As shown in Fig. 7, in a first aspect, the whole-body human region classifier in the human body target detection model determines whether the image to be detected contains a whole-body human region and outputs a classification value, where the classification value is 1 if such a region is contained and 0 otherwise. Further, if a whole-body human region is contained, the whole-body human region position regressor detects the specific position of the region and outputs the position regression value (x1, y1, w1, h1) of the region, where x1 and y1 respectively denote the horizontal and vertical coordinates of the center point of the region, and w1 and h1 denote the width and height of the rectangular region. In a second aspect, the visible human region classifier in the human body target detection model determines whether the image to be detected contains a visible human region and outputs a classification value, where the classification value is 1 if such a region is contained and 0 otherwise. Further, if a visible human region is contained, the visible human region position regressor detects the specific position of the region and outputs the position regression value (x2, y2, w2, h2) of the region. In a third aspect, the human head region position regressor in the human body target detection model determines the specific position of the human head region in the image to be detected and outputs the position regression value (x3, y3, w3, h3) of the region. The origin of the coordinate system in which the above coordinates are expressed may be a vertex of the image to be detected.
In the embodiments of the present invention, a sample image is first obtained from multiple images, the sample image containing human body regions including a human head region, a visible human region, and a whole-body human region; then the model training information of the sample image is determined according to the human body regions; then the sample image and the model training information are input into the model to be trained for training, to obtain a human body target detection model; finally, according to the human body target detection model, the whole-body human region, the visible human region, and the human head region in the image to be detected are determined. Since the position regression of the visible region can be predicted at the same time, the detection effect on occluded human targets is better; and since the position of the human head can be predicted, the relative positional relationship between the human head and the whole-body human body can also be fully utilized to predict the position of the whole-body human body more accurately.
Referring to Fig. 9, Fig. 9 is a schematic structural diagram of a human body target detection device provided by an embodiment of the present invention. As shown in the figure, the human body target detection device may include at least one processor 901, at least one communication interface 902, at least one memory 903, and at least one communication bus 904.
The processor 901 may be a central processing unit, a general-purpose processor, a digital signal processor, an application-specific integrated circuit, a field-programmable gate array or other programmable logic device, a transistor logic device, a hardware component, or any combination thereof, and may implement or execute the various illustrative logical blocks, modules, and circuits described in connection with the disclosure of the present invention. The processor may also be a combination implementing computing functions, for example a combination of one or more microprocessors, or a combination of a digital signal processor and a microprocessor. The communication bus 904 may be a Peripheral Component Interconnect (PCI) bus, an Extended Industry Standard Architecture (EISA) bus, or the like, and may be divided into an address bus, a data bus, a control bus, and so on. For ease of representation, only one thick line is drawn in Fig. 9, but this does not mean that there is only one bus or only one type of bus. The communication bus 904 is used to realize connection and communication between these components. The communication interface 902 of the device in this embodiment of the present invention is used to communicate signaling or data with other node devices. The memory 903 may include a volatile memory, for example a nonvolatile random access memory (Nonvolatile Random Access Memory, NVRAM), a phase-change random access memory (Phase Change RAM, PRAM), or a magnetoresistive random access memory (Magnetoresistive RAM, MRAM); it may also include a nonvolatile memory, for example at least one disk storage device, an electrically erasable programmable read-only memory (Electrically Erasable Programmable Read-Only Memory, EEPROM), a flash memory device such as a NOR flash memory or a NAND flash memory, or a semiconductor device such as a solid-state drive (Solid State Disk, SSD). The memory 903 may optionally also be at least one storage device located remotely from the aforementioned processor 901. A set of program code is stored in the memory 903, and the processor 901 executes the program in the memory 903 to perform the following operations:
obtaining a sample image from multiple images, the sample image containing a human body region, where the human body region includes a human head region, a visible human region, and a whole-body human region;

determining the model training information of the sample image according to the human body region, where the model training information includes at least one of a type label and a position regression loss value;

inputting the sample image and the model training information into a model to be trained for training, to obtain a human body target detection model; and

determining the human body region in an image to be detected according to the human body target detection model.
Optionally, the model to be trained includes a first classifier, and the type label includes a positive sample label and a non-positive sample label; the processor 901 is further configured to perform the following operation steps:

determining the degree of overlap between the whole-body human region and the sample image;

when the degree of overlap is greater than a first threshold, determining that the type label is the positive sample label; otherwise, determining that the type label is the non-positive sample label; and

inputting the sample image and the type label into the first classifier for training.
Optionally, the model to be trained includes a first position regressor; the processor 901 is further configured to perform the following operation steps:

determining the sample image whose type label is the positive sample label as a first target sample image;

determining the position regression loss value of the first target sample image according to a first training loss function corresponding to the first position regressor; and

inputting the first target sample image and the position regression loss value of the first target sample image into the first position regressor for training.
Optionally, the model to be trained includes a second classifier, and the type label includes a positive sample label and a non-positive sample label; the processor 901 is further configured to perform the following operation steps:

determining the degree of overlap between the visible human region and the sample image;

when the degree of overlap is greater than a second threshold, determining that the type label is the positive sample label; otherwise, determining that the type label is the non-positive sample label; and

inputting the sample image and the type label into the second classifier for training.
Optionally, the model to be trained includes a second position regressor; the processor 901 is further configured to perform the following operation steps:

determining the sample image whose type label is the positive sample label as a second target sample image;

determining the position regression loss value of the second target sample image according to a second training loss function corresponding to the second position regressor; and

inputting the second target sample image and the position regression loss value of the second target sample image into the second position regressor for training.
Optionally, the processor 901 is further configured to perform the following operation steps:

determining a first overlap area between the visible human region and the sample image; and

taking the quotient of the first overlap area and the area of the sample image as the degree of overlap between the visible human region and the sample image.
Optionally, the model to be trained includes a third position regressor, and the type label includes a positive sample label and a non-positive sample label; the processor 901 is further configured to perform the following operation steps:

determining the degree of overlap between the human head region and the sample image;

when the degree of overlap is greater than a third threshold, determining that the type label is the positive sample label; otherwise, determining that the type label is the non-positive sample label;

determining the sample image whose type label is the positive sample label as a third target sample image;

determining the position regression loss value of the third target sample image according to a third training loss function corresponding to the third position regressor; and

inputting the third target sample image and the position regression loss value of the third target sample image into the third position regressor for training.
Optionally, the processor 901 is further configured to perform the following operation steps:

determining a second overlap area between the human head region and the sample image; and

taking the quotient of the second overlap area and the area of the sample image as the degree of overlap between the human head region and the sample image.
Further, the processor may also cooperate with the memory and the communication interface to perform the operations of the human body target detection apparatus in the foregoing embodiments of the invention.
The above embodiments may be implemented wholly or partly by software, hardware, firmware, or any combination thereof. When implemented in software, they may be realized wholly or partly in the form of a computer program product. The computer program product includes one or more computer instructions. When the computer program instructions are loaded and executed on a computer, the processes or functions described in the embodiments of the present invention are wholly or partly generated. The computer may be a general-purpose computer, a special-purpose computer, a computer network, or another programmable apparatus. The computer instructions may be stored in a computer-readable storage medium, or transmitted from one computer-readable storage medium to another computer-readable storage medium; for example, the computer instructions may be transmitted from one website, computer, server, or data center to another website, computer, server, or data center in a wired manner (such as coaxial cable, optical fiber, or digital subscriber line (DSL)) or a wireless manner (such as infrared, radio, or microwave). The computer-readable storage medium may be any usable medium accessible by a computer, or a data storage device such as a server or data center integrating one or more usable media. The usable medium may be a magnetic medium (for example, a floppy disk, a hard disk, or a magnetic tape), an optical medium (for example, a DVD), or a semiconductor medium (for example, a solid-state drive (Solid State Disk, SSD)), and so on.
The specific embodiments described above further explain in detail the objectives, technical solutions, and beneficial effects of the present invention. Any modification, equivalent replacement, or improvement made within the spirit and principles of the present invention shall fall within the protection scope of the present invention.
Claims (10)
1. A human target detection method, characterized in that the method comprises:
obtaining a sample image from a plurality of images, the sample image containing a human body region, the human body region including a human head region, a visible human body region, and a whole-body human region;
determining model training information of the sample image according to the human body region, the model training information including at least one of a type label and a position regression loss value;
inputting the sample image and the model training information into a model to be trained for training, to obtain a human target detection model;
determining the human body region in an image to be detected according to the human target detection model.
2. The method according to claim 1, characterized in that the human head region is the region where the human head is located in the sample image, the visible human body region is the region where the unoccluded part of the human body is located in the sample image, and the whole-body human region is the region where the entire human body, including any occluded parts, is located in the sample image.
3. The method according to claim 1 or 2, characterized in that the model to be trained includes a first classifier, and the type label includes a positive sample label and a non-positive sample label;
the determining model training information of the sample image according to the human body region comprises:
determining a degree of overlap between the whole-body human region and the sample image;
when the degree of overlap is greater than a first threshold, determining that the type label is the positive sample label; otherwise, determining that the type label is the non-positive sample label;
the inputting the sample image and the model training information into the model to be trained for training, to obtain the human target detection model, comprises:
inputting the sample image and the type label into the first classifier for training.
4. The method according to claim 3, characterized in that the model to be trained includes a first position regressor;
the inputting the sample image and the model training information into the model to be trained for training, to obtain the human target detection model, further comprises:
determining a sample image whose type label is the positive sample label as a first target sample image;
determining a position regression loss value of the first target sample image according to a first training loss function corresponding to the first position regressor;
inputting the first target sample image and the position regression loss value of the first target sample image into the first position regressor for training.
5. The method according to claim 1, characterized in that the model to be trained includes a second classifier, and the type label includes a positive sample label and a non-positive sample label;
the determining model training information of the sample image according to the human body region comprises:
determining a degree of overlap between the visible human body region and the sample image;
when the degree of overlap is greater than a second threshold, determining that the type label is the positive sample label; otherwise, determining that the type label is the non-positive sample label;
the inputting the sample image and the model training information into the model to be trained for training, to obtain the human target detection model, comprises:
inputting the sample image and the type label into the second classifier for training.
6. The method according to claim 5, characterized in that the model to be trained includes a second position regressor;
the inputting the sample image and the model training information into the model to be trained for training, to obtain the human target detection model, further comprises:
determining a sample image whose type label is the positive sample label as a second target sample image;
determining a position regression loss value of the second target sample image according to a second training loss function corresponding to the second position regressor;
inputting the second target sample image and the position regression loss value of the second target sample image into the second position regressor for training.
7. The method according to claim 5 or 6, characterized in that the determining a degree of overlap between the visible human body region and the sample image comprises:
determining a first overlap area of the visible human body region and the sample image;
using the quotient of the first overlap area and the area of the sample image as the degree of overlap between the visible human body region and the sample image.
8. The method according to claim 1, characterized in that the model to be trained includes a third position regressor, and the type label includes a positive sample label and a non-positive sample label;
the determining model training information of the sample image according to the human body region comprises:
determining a degree of overlap between the human head region and the sample image;
when the degree of overlap is greater than a third threshold, determining that the type label is the positive sample label; otherwise, determining that the type label is the non-positive sample label;
the inputting the sample image and the model training information into the model to be trained for training, to obtain the human target detection model, comprises:
determining a sample image whose type label is the positive sample label as a third target sample image;
determining a position regression loss value of the third target sample image according to a third training loss function corresponding to the third position regressor;
inputting the third target sample image and the position regression loss value of the third target sample image into the third position regressor for training.
9. The method according to claim 8, characterized in that the determining a degree of overlap between the human head region and the sample image comprises:
determining a second overlap area of the human head region and the sample image;
using the quotient of the second overlap area and the area of the sample image as the degree of overlap between the human head region and the sample image.
10. A human target detection apparatus, characterized in that the apparatus comprises:
a sample acquisition module, configured to obtain a sample image from a plurality of images, the sample image containing a human body region, the human body region including a human head region, a visible human body region, and a whole-body human region;
an information determination module, configured to determine model training information of the sample image according to the human body region, the model training information including at least one of a type label and a position regression loss value;
a model training module, configured to input the sample image and the model training information into a model to be trained for training, to obtain a human target detection model;
a target detection module, configured to determine the human body region in an image to be detected according to the human target detection model.
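A minimal sketch, not part of the claims, of the label-assignment scheme of claims 3, 5, and 8: each of the three regions is compared against its own threshold, and a sample image becomes a positive sample for a branch only when that region's degree of overlap exceeds that branch's threshold. The threshold values below are placeholders; the claims only require three (possibly distinct) thresholds:

```python
# Placeholder threshold values -- the claims do not fix them.
FIRST_THRESHOLD = 0.5    # whole-body human region (claim 3)
SECOND_THRESHOLD = 0.5   # visible human body region (claim 5)
THIRD_THRESHOLD = 0.5    # human head region (claim 8)

def assign_type_labels(whole_body_overlap, visible_overlap, head_overlap):
    """Return a per-branch type label, 'positive' or 'non-positive',
    following the threshold comparisons of claims 3, 5, and 8."""
    def label(overlap, threshold):
        return "positive" if overlap > threshold else "non-positive"
    return {
        "first_classifier": label(whole_body_overlap, FIRST_THRESHOLD),
        "second_classifier": label(visible_overlap, SECOND_THRESHOLD),
        "third_regressor": label(head_overlap, THIRD_THRESHOLD),
    }

# A mostly visible person whose head is largely outside the sample image:
# positive for the whole-body and visible-body branches, non-positive for
# the head branch.
print(assign_type_labels(0.9, 0.6, 0.2))
```

Because each branch is labeled independently, the same sample image can train one branch as a positive sample and another as a non-positive sample.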
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910566084.8A CN110298302B (en) | 2019-06-25 | 2019-06-25 | Human body target detection method and related equipment |
Publications (2)
Publication Number | Publication Date |
---|---|
CN110298302A true CN110298302A (en) | 2019-10-01 |
CN110298302B CN110298302B (en) | 2023-09-08 |
Family
ID=68029012
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201910566084.8A Active CN110298302B (en) | 2019-06-25 | 2019-06-25 | Human body target detection method and related equipment |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN110298302B (en) |
Cited By (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111582177A (en) * | 2020-05-09 | 2020-08-25 | 北京爱笔科技有限公司 | Image detection method and related device |
CN113221658A (en) * | 2021-04-13 | 2021-08-06 | 卓尔智联(武汉)研究院有限公司 | Training method and device of image processing model, electronic equipment and storage medium |
CN113822302A (en) * | 2020-06-18 | 2021-12-21 | 北京金山数字娱乐科技有限公司 | Training method and device for target detection model |
Citations (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN106570453A (en) * | 2015-10-09 | 2017-04-19 | 北京市商汤科技开发有限公司 | Pedestrian detection method, device and system |
US20170213081A1 (en) * | 2015-11-19 | 2017-07-27 | Intelli-Vision | Methods and systems for automatically and accurately detecting human bodies in videos and/or images |
CN108256404A (en) * | 2016-12-29 | 2018-07-06 | 北京旷视科技有限公司 | Pedestrian detection method and device |
US20180314894A1 (en) * | 2017-04-28 | 2018-11-01 | Nokia Technologies Oy | Method, an apparatus and a computer program product for object detection |
CN108875504A (en) * | 2017-11-10 | 2018-11-23 | 北京旷视科技有限公司 | Image detecting method and image detection device neural network based |
CN109191409A (en) * | 2018-07-25 | 2019-01-11 | 北京市商汤科技开发有限公司 | Image procossing, network training method, device, electronic equipment and storage medium |
CN109635694A (en) * | 2018-12-03 | 2019-04-16 | 广东工业大学 | A kind of pedestrian detection method, device, equipment and computer readable storage medium |
CN109657545A (en) * | 2018-11-10 | 2019-04-19 | 天津大学 | A kind of pedestrian detection method based on multi-task learning |
Non-Patent Citations (10)
Title |
---|
CHUNLUAN ZHOU et al.: "Bi-box Regression for Pedestrian Detection and Occlusion Estimation", Computer Vision - ECCV 2018, 6 October 2018, pages 138-154, XP047488327, DOI: 10.1007/978-3-030-01246-5_9 *
SHUAI SHAO et al.: "CrowdHuman: A Benchmark for Detecting Human in a Crowd", arXiv, 30 April 2018, pages 1-9 *
SONGTAO LIU et al.: "Adaptive NMS: Refining Pedestrian Detection in a Crowd", arXiv, 7 April 2019, pages 1-10 *
YIHUI HE et al.: "Bounding Box Regression with Uncertainty for Accurate Object Detection", arXiv, 16 April 2019, pages 1-10 *
ZHOU Jianxin et al.: "A Multi-scale Detection Method for Dense Crowd Target Detection", Journal of System Simulation, vol. 28, no. 10, 31 October 2016, pages 2503-2509 *
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||