CN114821676A - Passenger flow human body detection method and device, storage medium and passenger flow statistical camera


Info

Publication number
CN114821676A
CN114821676A (application CN202210746999.9A)
Authority
CN
China
Prior art keywords
human body
target
body detection
detection result
adhesion
Prior art date
Legal status
Granted
Application number
CN202210746999.9A
Other languages
Chinese (zh)
Other versions
CN114821676B
Inventor
Xiao Bing (肖兵)
Current Assignee
Zhuhai Shixi Technology Co Ltd
Original Assignee
Zhuhai Shixi Technology Co Ltd
Priority date
Filing date
Publication date
Application filed by Zhuhai Shixi Technology Co Ltd filed Critical Zhuhai Shixi Technology Co Ltd
Priority to CN202211033934.6A (publication CN115131828A)
Priority to CN202211033323.1A (publication CN115131827A)
Priority to CN202210746999.9A (publication CN114821676B)
Publication of CN114821676A
Application granted
Publication of CN114821676B
Active legal status
Anticipated expiration

Classifications

    • G06V 40/10 - Recognition of human or animal bodies, e.g. vehicle occupants or pedestrians; body parts, e.g. hands
    • G06T 7/20 - Image analysis; analysis of motion
    • G06V 10/26 - Image preprocessing; segmentation of patterns in the image field; cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; detection of occlusion
    • G06V 10/762 - Recognition or understanding using pattern recognition or machine learning, e.g. clustering of similar faces in social networks
    • G06V 20/52 - Scene-specific elements; surveillance or monitoring of activities, e.g. for recognising suspicious objects
    • G06T 2207/10016 - Image acquisition modality: video; image sequence
    • G06T 2207/10028 - Image acquisition modality: range image; depth image; 3D point clouds
    • G06T 2207/30196 - Subject of image: human being; person


Abstract

The application discloses a passenger flow human body detection method and device, a storage medium, and a passenger flow statistics camera, for improving human body detection on depth images. The method comprises the following steps: obtaining a previously output human body detection result, wherein the human body detection result comprises a target cluster of a human body region and a bounding box of the human body region; performing data association between the target cluster and bounding box of the current frame image and the target cluster and bounding box of the previous frame image to obtain a data association result; determining, according to the data association result, the adhesion condition of individual targets in the human body detection result; performing, based on the adhesion condition, a corresponding adhesion splitting operation on the human body detection result; and outputting the human body detection result again based on the result of the adhesion splitting operation.

Description

Passenger flow human body detection method and device, storage medium and passenger flow statistical camera
Technical Field
The application relates to the field of image processing, and in particular to a passenger flow human body detection method and device, a storage medium, and a passenger flow statistical camera.
Background
In consumer electronics, security, transportation, and similar fields, passenger flow statistics are often needed to better assess crowd trends. Human body detection is a necessary and crucial step in passenger flow statistics, and its accuracy directly affects the final statistical accuracy.
In the prior art, human body detection based on RGB images is mature; the industry generally performs passenger flow statistics on RGB images using either HOG + SVM detection schemes or deep-learning-based detection schemes. There are also prior-art methods that acquire a depth image with a TOF or structured-light depth camera and then perform human body detection and passenger flow statistics on the depth image. However, these methods simply transplant RGB-based human body detection to depth images. Because depth images and RGB images are acquired on different principles, their imaging differs considerably, so the actual detection effect is poor and an ideal human body detection result is difficult to obtain from the depth image.
Disclosure of Invention
To solve the above technical problem, the application provides a passenger flow human body detection method and device, a storage medium, and a passenger flow statistics camera.
The application provides a passenger flow human body detection method in a first aspect, and the method comprises the following steps:
obtaining a human body detection result which is output in advance, wherein the human body detection result comprises a target cluster of a human body area and a bounding box of the human body area;
performing data association on the target cluster and the bounding box of the current frame image and the target cluster and the bounding box of the previous frame image to obtain a data association result;
determining, according to the data association result, the adhesion condition of the individual target in the human body detection result;
based on the adhesion condition, executing corresponding adhesion splitting operation on the human body detection result;
and outputting the human body detection result again based on the result of the adhesion splitting operation.
Optionally, the determining, according to the data association result, the adhesion condition of the individual target in the human body detection result includes:
and when a one-to-many case exists in the data association result, determining that the adhesion condition of the individual target in the human body detection result is dynamic adhesion.
Optionally, when there is a one-to-many condition in the data association result, before determining that the adhesion condition of the individual target in the human body detection result is dynamic adhesion, the method further includes:
verifying the data association result of the bounding box of the current frame image and the bounding box of the previous frame image;
if the bounding box of the current frame image and the bounding box of the previous frame image have a one-to-many condition, verifying the data association result of the target cluster of the current frame image and the target cluster of the previous frame image;
and if the target cluster of the current frame image and the target cluster of the previous frame image are in a one-to-many condition, determining that the adhesion condition of the individual targets in the human body detection result is dynamic adhesion.
Optionally, when it is determined that the adhesion condition of the individual target in the human body detection result is dynamic adhesion, the performing, based on the adhesion condition, a corresponding adhesion splitting operation on the human body detection result includes:
splitting all one-to-many associated items in the human body detection result, and executing the following adhesion splitting operation on any associated item:
determining a basic region in the association item and a region attribution of the basic region, wherein the basic region is an overlapping region of a target to be split and an associated target;
determining an undetermined area in the association item, wherein the undetermined area is an area except the basic area in the target to be split;
and determining the region attribution of the undetermined region, thereby obtaining two split sub-regions.
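For illustration only, the three splitting steps above can be sketched as follows for a two-way association item. The attribution rule for the pending region (nearest base-region centroid) is an assumption made for this sketch; the claim itself leaves the attribution method open.

```python
# Hedged sketch of the adhesion splitting steps above. The stuck target and
# the two associated previous-frame targets are sets of (y, x) pixels.
# Assumes both base regions (overlaps) are non-empty.

def split_stuck_target(stuck, prev_a, prev_b):
    base_a, base_b = stuck & prev_a, stuck & prev_b   # base regions (overlaps)
    pending = stuck - base_a - base_b                 # pending (undetermined) region

    def centroid(region):
        n = len(region)
        return (sum(y for y, _ in region) / n, sum(x for _, x in region) / n)

    ca, cb = centroid(base_a), centroid(base_b)       # fixed before assignment
    for y, x in pending:                              # attribute each pending pixel
        da = (y - ca[0]) ** 2 + (x - ca[1]) ** 2
        db = (y - cb[0]) ** 2 + (x - cb[1]) ** 2
        (base_a if da <= db else base_b).add((y, x))
    return base_a, base_b                             # the two split sub-regions
```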
Optionally, the data association between the target cluster and the bounding box of the current frame image and the target cluster and the bounding box of the previous frame image includes:
data association is performed using the following formula:

IoM(A, B) = S_o / min(S_A, S_B)

wherein S_o represents the overlap area of targets A and B in the human body detection result, and S_A and S_B respectively represent the areas of target A and target B.
The second aspect of the present application provides another passenger flow human body detection method, including:
obtaining a human body detection result which is output in advance, wherein the human body detection result comprises a target cluster of a human body area and a bounding box of the human body area;
judging whether the human body detection result of the current frame image meets the following conditions:
the ratio between the width and the height of the bounding box is within a preset ratio range;
and
The target in the bounding box conforms to the preset head-shoulder characteristics;
if the conditions are met, determining that the adhesion condition of the individual target in the human body detection result is static adhesion;
performing adhesion splitting operation corresponding to the static adhesion on the human body detection result;
and outputting the human body detection result again based on the result of the adhesion splitting operation.
Optionally, the determining whether the target in the bounding box satisfies a preset head-shoulder characteristic includes:
taking the upper area of the target as a target area, and calculating integral projection of the target area;
determining the peak and the trough of the integral projection curve;
if a target trough exists among the troughs, determining that the target in the bounding box meets the preset head-shoulder characteristics;
the target trough satisfies the following condition:
the left side and the right side of the target wave trough are both provided with a wave crest, and the horizontal distance and the vertical distance between the target wave trough and the wave crest are both in a preset distance range.
Optionally, if a target trough exists among the troughs, storing the position of the target trough as the splitting position of the adhesion splitting operation;
the executing of the adhesion splitting operation corresponding to the static adhesion on the human body detection result comprises:
and taking the vertical line at the position of the target trough as the dividing line, and splitting the stuck target accordingly.
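The head-shoulder check and trough-based split above can be sketched as follows. The peak and trough criteria and the distance windows are illustrative assumptions; in particular, the "vertical distance" is interpreted here as the difference in projection values, which the text does not spell out.

```python
import numpy as np

def find_target_trough(mask, max_dx=10, min_drop=2):
    """mask: binary 2-D array of the target's upper region.

    Returns the column index of a target trough, or None. max_dx bounds the
    horizontal trough-to-peak distance, min_drop the vertical (projection
    value) distance; both window values are assumptions.
    """
    proj = mask.sum(axis=0)                       # vertical integral projection
    peaks = [i for i in range(1, len(proj) - 1)
             if proj[i] >= proj[i - 1] and proj[i] >= proj[i + 1]]
    for i in range(1, len(proj) - 1):
        if not (proj[i] < proj[i - 1] and proj[i] < proj[i + 1]):
            continue                              # not a local trough
        left = [p for p in peaks if p < i]
        right = [p for p in peaks if p > i]
        if not left or not right:
            continue                              # needs a peak on each side
        pl, pr = left[-1], right[0]
        if (i - pl <= max_dx and pr - i <= max_dx and
                proj[pl] - proj[i] >= min_drop and proj[pr] - proj[i] >= min_drop):
            return i
    return None

def split_at_trough(mask, col):
    """Split the stuck target along the vertical line through the trough."""
    return mask[:, :col], mask[:, col:]
```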
The third aspect of the present application provides a passenger flow human body detection method, including:
obtaining a human body detection result which is output in advance, wherein the human body detection result comprises a target cluster of a human body area and a bounding box of the human body area;
aiming at any two targets detected by the human body detection result of the current frame image, judging whether the following conditions are met:
the two targets are adjacent in a vertical direction;
and
The depth values of the two targets conform to an "upper-near, lower-far" depth characteristic;
if the conditions are met, determining the two targets as tearing targets;
performing a merge operation on the two targets;
and outputting the human body detection result again based on the result of the merging operation.
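A minimal sketch of the merge operation above, assuming (as an illustration, not from the application text) that targets are represented as pixel-set clusters with (x1, y1, x2, y2) bounding boxes:

```python
# Merge two torn fragments of one human target: union the clusters and take
# the smallest box enclosing both bounding boxes.

def merge_torn_targets(cluster_a, box_a, cluster_b, box_b):
    merged_cluster = cluster_a | cluster_b
    merged_box = (min(box_a[0], box_b[0]), min(box_a[1], box_b[1]),
                  max(box_a[2], box_b[2]), max(box_a[3], box_b[3]))
    return merged_cluster, merged_box
```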
Optionally, the determining whether the two targets are adjacent in the vertical direction includes:
calculating limit values of an overlapping area of a first target and a second target of the two targets, wherein the limit values comprise limits in four directions of up, down, left and right, and are respectively an upper limit, a lower limit, a left limit and a right limit;
if the left limit is smaller than the right limit, respectively calculating a horizontal overlapping proportion and a vertical overlapping proportion;
and if the horizontal overlapping proportion and the vertical overlapping proportion are both larger than a preset threshold value, determining that the two targets are adjacent in the vertical direction.
Optionally, the determining whether the depth values of the two targets conform to the "upper-near, lower-far" depth characteristic includes:
determining the relative up-down positions of a first target and a second target in the two targets;
determining whether an upper target is closer to a camera than a lower target based on the depth values of the first target and the second target;
and if so, determining that the depth values of the two targets conform to the "upper-near, lower-far" depth characteristic.
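A minimal sketch of the check above, assuming image y coordinates grow downward, so the upper target is the one with the smaller centroid y:

```python
# "Upper-near, lower-far": with a forward-tilted overhead camera, the target
# higher in the frame should have the smaller (nearer) depth value.

def upper_near_lower_far(cy_a, depth_a, cy_b, depth_b):
    if cy_a < cy_b:                    # target A is the upper target
        return depth_a < depth_b
    return depth_b < depth_a
```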
A fourth aspect of the present application provides a passenger flow statistics camera, comprising a processor and a depth camera, wherein the processor, during operation, executes the method according to the first aspect or any option of the first aspect.
The present application provides in a fifth aspect a passenger flow human detection device, the device comprising:
a first acquisition unit, configured to acquire a previously output human body detection result, wherein the human body detection result comprises a target cluster of a human body region and a bounding box of the human body region;
the data association unit is used for performing data association on the target cluster and the bounding box of the current frame image and the target cluster and the bounding box of the previous frame image to obtain a data association result;
the adhesion determining unit is used for determining, according to the data association result, the adhesion condition of the individual target in the human body detection result;
the first adhesion splitting unit is used for executing corresponding adhesion splitting operation on the human body detection result based on the adhesion condition;
and the first re-output unit is used for re-outputting the human body detection result based on the result of the adhesion splitting operation.
A sixth aspect of the present application provides another passenger flow human body detection device, the device comprising:
the second acquisition unit is used for acquiring a human body detection result which is output in advance, wherein the human body detection result comprises a target cluster of a human body region and a bounding box of the human body region;
the adhesion judging unit is used for judging whether the human body detection result of the current frame image meets the following conditions:
the ratio between the width and the height of the bounding box is within a preset ratio range;
and
The target in the bounding box conforms to the preset head-shoulder characteristics;
if the conditions are met, determining that the adhesion condition of the individual target in the human body detection result is static adhesion;
the second adhesion splitting unit is used for executing adhesion splitting operation corresponding to the static adhesion on the human body detection result;
and the second re-output unit is used for re-outputting the human body detection result based on the result of the adhesion splitting operation.
A seventh aspect of the present application provides another passenger flow human body detection device, including:
a third obtaining unit, configured to obtain a human body detection result output in advance, where the human body detection result includes a target cluster of a human body region and a bounding box of the human body region;
and the tearing judgment unit is used for judging whether the following conditions are met or not according to any two targets detected by the human body detection result of the current frame image:
the two targets are adjacent in a vertical direction;
and
The depth values of the two targets conform to the "upper-near, lower-far" depth characteristic;
if the conditions are met, determining the two targets as tearing targets;
the tearing and merging unit is used for executing merging operation on the two targets;
and the third re-outputting unit is used for re-outputting the human body detection result based on the result of the merging operation.
An eighth aspect of the present application provides a passenger flow human body detection device, the device includes:
the device comprises a processor, a memory, an input and output unit and a bus;
the processor is connected with the memory, the input and output unit and the bus;
the memory stores a program, and the processor calls the program to perform the method according to the first aspect or any option of the first aspect.
A ninth aspect of the present application provides a computer-readable storage medium having a program stored thereon, wherein the program, when executed on a computer, performs the method according to the first aspect or any option of the first aspect.
According to the technical scheme, the method has the following advantages:
the method can correct a human body detection result output in advance, for example, corresponding adhesion splitting operation can be selected according to actual adhesion conditions during correction, high adaptability and reliability are achieved, the method is reliable in detection, and various complex scenes such as human body shape loss, target adhesion, tearing and the like in a depth image can be met; the human body detection method for the passenger flow camera is high in operation speed and low in calculation force requirement, can achieve real-time detection when running on a middle-low end embedded platform CPU, is convenient to deploy and wide in applicability, and has a large-scale popularization and application prospect.
Drawings
The drawings to which this application relates are described below:
FIG. 1 is a schematic flow chart of an embodiment of a method for handling dynamic adhesion according to the present application;
FIG. 2 is a schematic diagram of a target cluster and bounding box in the human body detection results of the present application;
FIG. 3 is another schematic diagram of a target cluster and bounding box in the human body detection results of the present application;
FIG. 4 is a flowchart illustrating an embodiment of step S103 in the present application;
fig. 5 is a flowchart illustrating an embodiment of step S104 in the present application;
FIG. 6 is a schematic diagram of one-to-many association in the present application;
FIG. 7 is a schematic diagram of a base area and pending area in the present application;
FIG. 8 is a diagram illustrating a split association item in the present application;
FIG. 9 is a schematic flow chart of an embodiment of a method for handling static adhesion according to the present application;
FIG. 10 is a flowchart illustrating an embodiment of step S202 in the present application;
FIG. 11 is a schematic illustration of the determination of head and shoulder characteristics of the present application;
FIG. 12 is a schematic illustration of segmentation of a target in the present application;
FIG. 13 is a schematic flow chart illustrating one embodiment of a method for handling vertical tearing according to the present application;
fig. 14 is a schematic flowchart of an embodiment of a method for obtaining a human body detection result in the present application;
FIG. 15 is a flowchart of an embodiment of step S403;
fig. 16 is a schematic diagram comparing the bounding box of the present application with a bounding box that uses the head in place of the body.
Detailed Description
Based on the above, the application provides a passenger flow human body detection method for improving the human body detection effect on depth images.
It should be noted that the passenger flow human body detection method provided by the present application may be applied to a passenger flow camera, a depth camera, or other terminals, and may also be applied to a server. The other terminals may be a smart phone, a computer, a tablet computer, a smart television, a smart watch, a portable computer terminal, a desktop computer, or another intelligent terminal with computing and data-analysis capabilities. For convenience of explanation, a terminal is taken as the execution subject in the following description.
The human body detection method provided by this embodiment is mainly applied to a passenger flow camera and realizes human body detection and subsequent passenger flow statistics based on depth images. Compared with RGB images, depth images have several advantages: depth imaging is not easily affected by illumination changes, and images can be acquired normally in dim light or even at night; depth images carry no color or texture information, so a person's appearance is not recorded, removing privacy concerns; and depth images contain distance information, which facilitates functions based on distance measurement. Therefore, more and more passenger flow cameras are adopting such depth cameras.
However, in practical applications, the passenger flow camera needs to cover a relatively large detection range and is therefore often fitted with a wide-field-angle depth lens. Although a large field angle gives a wider view, the distortion and tilt produced when a human body approaches the edge of the picture are also more severe. On the other hand, in depth images, holes or missing regions readily appear at low-reflectivity positions such as the head (hair), body edges, and legs; in severe cases the shape of the head or legs is missing entirely. That is, human bodies in depth images often lack robust features, which makes it difficult for conventional machine-learning or deep-learning schemes operating on single frames to obtain ideal detection results. This embodiment provides a human body detection method for a passenger flow camera that can improve the human body detection effect on depth images.
Referring to fig. 1, fig. 1 is a schematic flow chart of an embodiment of a passenger flow human body detection method provided by the present application, the method including:
s101, obtaining a human body detection result which is output in advance, wherein the human body detection result comprises a target cluster of a human body area and a bounding box of the human body area;
in this embodiment, a human body detection result obtained by performing preliminary detection by a certain method is first obtained, where the human body detection result includes a target cluster and a bounding box of a human body region, see fig. 2 and fig. 3, where the target cluster is a pixel set of all human body regions, and the bounding box is a rectangular bounding box of a region detected as a human body. The preliminary tests can be carried out by the following embodiments:
Preprocess the input depth image to obtain an image to be detected; determine the picture motion region in the image to be detected through background modeling; cluster the picture motion region to determine a set of human body regions; and calculate the human body detection result from the set of human body regions. The result of this preliminary detection may be inaccurate. For example, targets may be stuck together: in practice, two human body regions adhere to each other when two or more targets stand one behind the other. Here, adhesion refers to the situation where two or more targets are so close to each other that they merge, so that two or more targets are treated as one target in the calculated human body detection result. There may also be tearing within a target: tearing refers to the situation where, due to missing features, more than one cluster is obtained for the same human target in the foregoing steps, for example because the human body region in the depth image is not connected, or because the depth values of the human body region change greatly and discontinuously (including cases where a pedestrian raises a hand, wears a hat, opens an umbrella, and so on); that is, one target is treated as two targets. For a passenger flow camera, such false detections ultimately affect the passenger flow statistical accuracy, so the human body detection result obtained by preliminary detection needs to be corrected. In this embodiment, correction is performed for the target adhesion case.
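As a rough illustration of the preliminary detection just described, the following sketch performs background subtraction on a depth frame, clusters the foreground with 4-connected flood fill, and computes a bounding box per cluster. All names and thresholds are illustrative assumptions; the application does not specify this implementation.

```python
import numpy as np
from collections import deque

def detect_bodies(depth, background, tau=200):
    """Return (cluster_pixel_lists, bounding_boxes) for one depth frame.

    depth, background: 2-D arrays of raw depth values (units assumed, e.g. mm).
    tau: assumed minimum depth change for a pixel to count as foreground.
    """
    fg = np.abs(depth.astype(np.int32) - background.astype(np.int32)) > tau
    h, w = fg.shape
    seen = np.zeros_like(fg, dtype=bool)
    clusters, boxes = [], []
    for sy in range(h):
        for sx in range(w):
            if not fg[sy, sx] or seen[sy, sx]:
                continue
            # BFS flood fill over the 4-neighbourhood -> one target cluster
            pixels, q = [], deque([(sy, sx)])
            seen[sy, sx] = True
            while q:
                y, x = q.popleft()
                pixels.append((y, x))
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if 0 <= ny < h and 0 <= nx < w and fg[ny, nx] and not seen[ny, nx]:
                        seen[ny, nx] = True
                        q.append((ny, nx))
            ys = [p[0] for p in pixels]
            xs = [p[1] for p in pixels]
            clusters.append(pixels)
            boxes.append((min(xs), min(ys), max(xs), max(ys)))  # x1, y1, x2, y2
    return clusters, boxes
```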
S102, performing data association on the target cluster and the bounding box of the current frame image and the target cluster and the bounding box of the previous frame image to obtain a data association result;
in order to determine the adhesion condition of individual targets in the human body detection result, data association is performed between the human body detection result of the current frame image (including the target bounding box and target cluster) and that of the previous frame image. If no target adhesion occurs in the current scene, the data association results are all one-to-one; when targets begin to stick together, the association becomes one-to-many. Therefore, whether previously separate targets have become stuck can be judged by checking whether the data association is in a one-to-many state: if a one-to-many case exists, individual targets are stuck together.
S103, determining the adhesion condition of the individual target in the human body detection result according to the data correlation result;
as noted in step S102, if no target adhesion occurs in the current scene, the data association results should all be one-to-one; when targets begin to stick, a one-to-many association may appear. The adhesion condition of individual targets can therefore be determined from the data association result: for example, when a one-to-many case occurs, the adhesion condition of the individual targets in the human body detection result is determined to be dynamic adhesion, meaning the targets were initially separate and then stuck together.
Referring to fig. 4, in another embodiment, in order to further improve the determination accuracy of the adhesion condition, the method for determining that the human body detection result has dynamic adhesion may be:
and S1031, when a one-to-many case exists in the data association result, verifying the data association result between the bounding box of the current frame image and that of the previous frame image; if a one-to-many case exists between them, executing step S1032; otherwise executing step S1034 and determining that no adhesion exists.
S1032, verifying the data association result between the target cluster of the current frame image and that of the previous frame image; if a one-to-many case exists between them, executing step S1033; otherwise executing step S1034 and determining that no adhesion exists.
And S1033, determining that the adhesion condition of the individual target in the human body detection result is dynamic adhesion.
S1034, determining that no adhesion exists.
This embodiment provides a discrimination scheme for dynamic adhesion. The scheme adopts a two-step verification strategy, which can effectively improve discrimination accuracy.
In step S1031, data association is performed between the bounding boxes detected in the current frame image and those of the previous frame; if a one-to-many case exists, the next verification is performed; otherwise, it is directly judged that no dynamic adhesion exists.
In step S1032, for the targets associated one-to-many in step S1031, the target clusters are used in place of the bounding boxes for a secondary association, or confirmation. If the one-to-many case still exists, dynamic adhesion is determined to exist, and the data association result is output to the splitting stage; otherwise, it is judged that no dynamic adhesion exists.
Optionally, when performing data association, an IoM (Intersection over Minimum) matching algorithm is used in the data association process to perform association matching, for example:
For targets A and B, the IoM calculation formula is:

IoM(A, B) = S_overlap / min(S_A, S_B)

where S_overlap denotes the overlap area of targets A and B, and S_A and S_B denote the areas of targets A and B, respectively. Specifically, if data association is performed using bounding boxes, both the overlap area and the target areas are calculated from the bounding boxes; if data association is performed using target clusters, the corresponding overlap area and target areas are calculated from the target clusters.
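As an illustrative sketch of the IoM matching computation for the bounding-box case (the function and parameter names are illustrative, not part of the original disclosure):

```python
def iom_boxes(a, b):
    """Intersection-over-Minimum for two bounding boxes given as (x, y, w, h).

    IoM(A, B) = overlap_area / min(area_A, area_B); returns 0.0 when the
    boxes do not overlap at all.
    """
    ax, ay, aw, ah = a
    bx, by, bw, bh = b
    # Limits of the overlap rectangle.
    left, right = max(ax, bx), min(ax + aw, bx + bw)
    top, bottom = max(ay, by), min(ay + ah, by + bh)
    if left >= right or top >= bottom:
        return 0.0
    overlap = (right - left) * (bottom - top)
    return overlap / min(aw * ah, bw * bh)
```

Because the denominator is the smaller of the two areas, IoM stays high when a small box is contained in a large one, which suits the "one-to-many" association check better than plain IoU.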
S104, based on the adhesion condition, executing corresponding adhesion splitting operation on the human body detection result;
Based on the adhesion condition determined in step S103, a corresponding adhesion splitting operation is performed on the human body detection result. For example, when the adhesion condition is determined to be dynamic adhesion, the splitting operation corresponding to dynamic adhesion is performed; this operation splits the target to be split into a plurality of individual targets, so as to obtain an accurate human body detection result.
Referring to fig. 5, in an embodiment, the adhesion splitting operation may be:
s1041, splitting all the one-to-many associated items in the human body detection result, and executing the following adhesion splitting operation on any associated item:
s1042, determining a basic region in the association and the region attribution of the basic region, wherein the basic region is an overlapping region of the target to be split and the associated target;
s1043, determining an undetermined area in the association, wherein the undetermined area is an area except the basic area in the target to be split;
s1044, determining the region attribution of the undetermined region, thereby obtaining two split sub-regions.
Based on the data association result obtained in step S102, the "one-to-many" association items are split one by one. The above adhesion splitting operation is performed for each "one-to-many" association item, each of which includes the target to be split and the associated targets associated with it.
Referring to fig. 6, which shows a diagram of a "one-to-many" association item, the dotted lines represent the detection result of the previous frame image and the solid line represents the calculation result of the current frame image. Specifically, the two associated targets of the previous frame are A and B, and the adhesion target detected in the current frame is the target to be split. The target to be split is associated with these 2 targets, which belongs to the "one-to-many" case. The splitting process is to split the target to be split into two targets.
In step S1042, an overlap region between the target to be split and the associated target is calculated and defined as a basic region, and each basic region belongs to a corresponding associated target and is used as a basis for preliminary splitting.
Referring to FIG. 7, the overlapping area of the target to be split and the associated target A is the basic region A, and the overlapping area of the target to be split and the associated target B is the basic region B. The basic regions A and B are used as the basis for the preliminary splitting of the target to be split.
In step S1043, the remaining part of the target to be split excluding the basic regions is calculated, that is, the pending region. In FIG. 7, the remaining area other than the basic regions A and B is the pending region.
In step S1044, the attribution of the pending region may be determined by outward expansion or inward contraction. For example, the pending region is shrunk until no remaining region exists; the shrinking process determines which target each element/pixel of the pending region belongs to. Concretely, taking the boundary of the target to be split as the limit, region growing is performed on the basic regions, and the newly added area of each iteration is attributed to the corresponding basic region, until no remaining region exists.
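The region-growing attribution of the pending region can be sketched as follows (a minimal illustration on a label grid; the encoding of labels and the 4-neighbourhood choice are assumptions, not specified by the original):

```python
import numpy as np

def grow_split(labels):
    """Assign pending pixels to basic regions by iterative 4-neighbour growth.

    labels: 2-D int array where 0 = background, -1 = pending region, and
    positive integers mark the basic regions (A, B, ...).  Each iteration
    attributes the pending pixels bordering a basic region to that region,
    until no pending pixels remain (or none can be reached).
    """
    out = labels.copy()
    h, w = out.shape
    while np.any(out == -1):
        updates = []
        for y in range(h):
            for x in range(w):
                if out[y, x] != -1:
                    continue
                # Look for an already-labelled basic region in the 4-neighbourhood.
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if 0 <= ny < h and 0 <= nx < w and out[ny, nx] > 0:
                        updates.append((y, x, out[ny, nx]))
                        break
        if not updates:  # unreachable pending pixels: stop to avoid looping
            break
        for y, x, lab in updates:
            out[y, x] = lab
    return out
```

Collecting updates per iteration (instead of writing in place) makes each growth ring advance by exactly one pixel, matching the "newly added region of each iteration" wording above.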
Referring to fig. 8, the two sub-targets shown are the targets obtained by splitting the adhesion target.
And S105, outputting the human body detection result again based on the result of the adhesion splitting operation.
After the corresponding adhesion splitting operation is performed, for example after the target to be split in step S1044 is split into two sub-regions, the bounding boxes and target clusters are obtained again and their information is updated; the corrected human body detection result can then be obtained and output again.
This embodiment provides a passenger flow human body detection method that is reliable in detection and combines a correction mechanism to deal with human body shape loss and target adhesion in the depth image, thereby greatly improving the human body detection effect for depth images and ensuring the passenger flow statistics precision.
The human body detection method for the passenger flow camera is fast and has low computing power requirements: it can achieve real-time detection on a mid- to low-end embedded platform CPU, is convenient to deploy, has wide applicability, and has prospects for large-scale popularization and application.
The method provided by the above embodiment can deal well with dynamic adhesion, but is not applicable to static adhesion, i.e. the situation where the targets are adhered from the beginning. Therefore, the application also provides an adhesion splitting mechanism for static adhesion, which judges and processes the corresponding adhesion mainly from the human body features of a single frame. This is explained in detail below:
referring to fig. 9, the method provided by this embodiment includes:
s201, obtaining a human body detection result output in advance, wherein the human body detection result comprises a target cluster of a human body area and a bounding box of the human body area;
In this embodiment, a pre-output human body detection result is obtained first. The human body detection result is obtained by preliminary detection by some method and includes the target clusters and bounding boxes of the human body regions, where a target cluster is the pixel set of a human body region and a bounding box is the rectangular bounding box of a region detected as a human body. The preliminary detection can be carried out by the following embodiment:
An input depth image is preprocessed to obtain an image to be detected; a picture motion area in the image to be detected is determined through background modeling; the picture motion area is clustered to determine a human body region set; and the human body detection result is calculated according to the human body region set. The result obtained by this preliminary detection may be inaccurate. One situation is target adhesion: in practice, for example, when two or more targets stand one behind the other close together, their human body regions stick to each other; adhesion refers to the situation where two or more targets are so close that they stick together, so that they are regarded as one target in the calculated human body detection result. Another situation is target tearing: tearing refers to the situation where, due to missing features, more than one cluster belonging to the same human target is obtained in the foregoing steps, for example when the human body region in the depth image is not connected, or the depth value of the human body region changes greatly and discontinuously (including a pedestrian lifting a hand, wearing a hat, opening an umbrella, etc.), i.e. one target is regarded as two targets. For the passenger flow camera, such false detection ultimately affects the passenger flow statistics precision, so the human body detection result obtained by the preliminary detection needs to be corrected; in this embodiment, correction processing is performed for the target adhesion situation.
The method provided by the foregoing embodiment solves the problem of dynamic adhesion well; the method provided by this embodiment is mainly aimed at the static adhesion situation, namely, two targets that are adhered from the beginning.
S202, judging whether the human body detection result of the current frame image meets the following conditions:
Firstly, the human body detection result of the current frame image is confirmed so as to judge whether a static adhesion situation exists; the confirmation is to judge whether the human body detection result meets the following conditions:
the condition a, the ratio between the width and the height of the bounding box is in a preset ratio range;
the condition b is that the target in the bounding box meets the preset head-shoulder characteristics;
The ratio between the width and the height of the bounding box may be the width-to-height ratio or the height-to-width ratio, and it is constrained by a preset ratio range; for example, the boundary threshold R of the ratio range is usually set larger than the aspect ratio of a conventional single person.
Specifically, for any target, it is first determined whether it meets condition a, and if so, it is further determined whether it meets condition b.
Specifically, referring to fig. 10, the determination process of whether the condition a is satisfied may be calculated according to the pixel coordinates of the bounding box, and the determination process of whether the condition b is satisfied is as follows:
s2021, taking the upper area of the target as a target area, and calculating integral projection of the target area;
s2022, determining the peak and the trough of the integral projection curve;
s2023, if target wave troughs exist in the wave troughs, determining that the targets in the bounding box meet preset head-shoulder characteristics;
the target trough satisfies the following condition:
a wave crest is arranged on each of the left side and the right side of the target wave trough, and the horizontal distance and the vertical distance between the target wave trough and the wave crest are both within a preset distance range;
If the above conditions are met simultaneously, step S204 is executed and the adhesion condition of the individual target in the human body detection result is determined to be static adhesion; if no target trough exists, step S206 is executed and it is determined that no static adhesion exists.
Referring to fig. 11, the upper region of the target is taken as the target region and the integral projection is calculated for it; optionally, the integral projection result is smoothed to obtain an optimized integral projection curve. The peaks and troughs of the integral projection curve are then calculated. For any trough V, if there is a peak on each side of it (denoted P1 and P2, respectively), and the horizontal and vertical distances from V to P1 and P2 both conform to the set distance range, the target is considered to conform to the head-shoulder feature; the position i of the trough V is then saved as the splitting position and transmitted to the splitting stage. Specifically, the set range of the horizontal and vertical distances from V to P1 and P2 is:

|x_V - x_P| <= T_x, |y_V - y_P| <= T_y

where T_x and T_y are the preset horizontal and vertical direction spacing thresholds, respectively, whose preferred values are chosen for the image resolution in use (here 160x120).
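The head-shoulder check on the integral projection curve can be sketched as follows (a simplified reading of condition b; the peak/trough detection and the threshold names and default values are illustrative assumptions, not the patent's preferred values):

```python
def find_split_trough(projection, t_x=8, t_y=10):
    """Search an integral-projection curve for a peak-trough-peak pattern
    matching the side-by-side head-shoulder rule.

    projection: 1-D sequence of column sums of the target's upper region.
    t_x / t_y: horizontal / vertical spacing thresholds (illustrative).
    Returns the trough index i to use as the splitting position, or None.
    """
    n = len(projection)
    # Local maxima (peaks) and local minima (troughs) of the curve.
    peaks = [i for i in range(1, n - 1)
             if projection[i] >= projection[i - 1] and projection[i] > projection[i + 1]]
    troughs = [i for i in range(1, n - 1)
               if projection[i] <= projection[i - 1] and projection[i] < projection[i + 1]]
    for v in troughs:
        left = [p for p in peaks if p < v]
        right = [p for p in peaks if p > v]
        if not left or not right:
            continue  # needs one peak on each side ("head-shoulder-head")
        pl, pr = left[-1], right[0]  # nearest peak on each side
        if (v - pl <= t_x and pr - v <= t_x and
                projection[pl] - projection[v] <= t_y and
                projection[pr] - projection[v] <= t_y):
            return v
    return None
```

A smoothed projection (see the filtering note below fig. 12) makes the strict local-extremum tests above far more stable on real data.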
S203, determining that the monomer adhesion condition of the human body detection result is static adhesion;
and if the human body detection result simultaneously meets the condition a and the condition b, judging that the adhesion condition is static adhesion, and executing adhesion splitting operation corresponding to the static adhesion.
S204, performing adhesion splitting operation corresponding to the static adhesion on the human body detection result;
The corresponding adhesion target is split. For example, the position i of the trough determined in the above step is stored as the splitting position, and during splitting the vertical line at that position is used as the splitting line. Referring to fig. 12, the left side is a schematic diagram of splitting along the vertical line, and the right side is a schematic diagram of the two sub-regions obtained after splitting. Of course, in practice, the features of the target (such as edges) may be combined to perform the segmentation along a curve; the present invention is not limited thereto. The former can be taken as a preferred embodiment: although the split shape is relatively rough, the splitting efficiency is high, and for passenger flow statistics the main focus is to accurately distinguish the number of people, so the shape of the human body region does not need to be particularly accurate.
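Splitting the target cluster along the vertical line at the trough position i can be sketched as (mask representation is an assumption):

```python
import numpy as np

def split_at_column(mask, i):
    """Split a binary target mask into two sub-regions along the vertical
    line at column index i (the detected shoulder-trough position)."""
    left = mask.copy()
    right = mask.copy()
    left[:, i:] = 0   # keep pixels strictly left of the split line
    right[:, :i] = 0  # keep pixels from the split line rightwards
    return left, right
```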
And S205, outputting the human body detection result again based on the result of the adhesion splitting operation.
After the corresponding adhesion splitting operation is executed, the bounding boxes and target clusters are obtained again and their information is updated; the corrected human body detection result can then be obtained and output again.
The invention provides a method for judging static adhesion and splitting the adhesion, which judges and processes the corresponding adhesion mainly from the human body features of a single frame. The human body features here refer to the aspect ratio of the bounding box and the side-by-side "head-shoulder feature": when the aspect ratio of a certain target is too large (exceeding that of a conventional single person), adhesion may exist, which is then further confirmed by the side-by-side "head-shoulder feature". In this regard, the present invention provides an example of determining static adhesion based on the side-by-side "head-shoulder feature": the upper region of a target, which mainly covers the head and shoulder regions of a human body, is taken as the target region; the integral projection is calculated for the target region; and based on the integral projection result it is judged whether a "peak-trough-peak" pattern conforming to the "head-shoulder-head" rule exists. If so, static adhesion is judged to exist, and the trough corresponding to the shoulder is taken as the splitting position to divide the target into two parts, as shown in fig. 12.
Further, in order to obtain the peaks and troughs better, the integral projection curve may be smoothed, for example by median filtering and mean filtering, or by Kalman filtering; this is not specifically limited.
It should be noted that the present application provides judging and splitting methods for both dynamic adhesion and static adhesion; the two methods may each be used alone to correct the human body detection result, or they may be used together, for example by first performing dynamic adhesion judgment and splitting, and then performing static adhesion judgment and splitting on the human body detection result output after the dynamic adhesion splitting, thereby maximizing the accuracy of the human body detection result.
This embodiment provides judging and splitting methods handling the two adhesion situations of dynamic adhesion and static adhesion, which can basically cover most actual situations and can effectively improve the accuracy of the human body detection result. The method is suitable for human body detection on a passenger flow camera: it has low hardware computing power requirements, needs no hardware modules such as an NPU (neural processing unit) or GPU (graphics processing unit), can run directly on mid- and low-end CPUs, has a high computing speed, can achieve real-time detection when running on a mid- to low-end embedded platform CPU, is convenient to deploy, has low cost, and has prospects for large-scale popularization and application.
However, in some scenarios, for example when the human body region in the depth map is not connected, or the depth value of the human body region changes greatly and discontinuously (including a pedestrian lifting a hand, wearing a hat, opening an umbrella, etc.), more than one bounding box and target cluster may be obtained in the foregoing steps for the same person's body; for the passenger flow camera this means false detection, and the statistical accuracy is ultimately affected.
To this end, the invention attributes this problem to "tearing" (as opposed to "adhesion") and proposes a corresponding solution. In practical applications there are many possible forms of "tearing"; most often, the same human body is detected as two targets one above the other, i.e. tearing in the vertical direction, and a corresponding handling mechanism is given for it. The method embodiment is described in detail below.
Referring to fig. 13, the method provided by the present embodiment includes:
s301, obtaining a human body detection result which is output in advance, wherein the human body detection result comprises a target cluster of a human body area and a bounding box of the human body area;
In this embodiment, a pre-output human body detection result is obtained first. The human body detection result is obtained by preliminary detection by some method and includes the target clusters and bounding boxes of the human body regions, where a target cluster is the pixel set of a human body region and a bounding box is the rectangular bounding box of a region detected as a human body. The preliminary detection can be carried out by the following embodiment:
An input depth image is preprocessed to obtain an image to be detected; a picture motion area in the image to be detected is determined through background modeling; the picture motion area is clustered to determine a human body region set; and the human body detection result is calculated according to the human body region set. The result obtained by this preliminary detection may be inaccurate; one situation is target tearing. Tearing refers to the situation where, due to missing features, more than one cluster belonging to the same human target is obtained in the preceding steps, for example when the human body region in the depth image is not connected, or the depth value of part of the human body region changes greatly and discontinuously (including a pedestrian lifting a hand, wearing a hat, opening an umbrella, etc.), i.e. one target is regarded as two targets. For the passenger flow camera, such false detection ultimately affects the passenger flow statistics precision, so the human body detection result obtained by the preliminary detection needs to be corrected; in this embodiment, correction processing is performed for the target tearing situation.
It should be noted that the principle/basis of the method for judging tearing in the vertical direction is as follows:
In the present application, a tearing judgment principle is innovatively provided: for two targets adjacent in the vertical direction, if their depth values conform to the "upper-near, lower-far" feature, they are usually the same human body target and belong to the situation of tearing in the vertical direction. This is mainly because, in the human body detection scene of a passenger flow camera, the human body is mostly in a standing posture, and the distance from the top of the head down to the soles of the feet to the camera generally follows an increasing rule, i.e. "upper-near, lower-far"; this rule basically holds in scenes such as lifting a hand, wearing a hat, opening an umbrella, etc. As a counter-example of tearing in the vertical direction: when two persons stand one in front of the other, although the condition of positional proximity in the vertical direction is satisfied, the human body behind is farther from the camera than the human body in front, so the "upper-near, lower-far" condition is not satisfied and the case can be excluded.
S302, for any two targets detected in the human body detection result of the current frame image, judging whether the following conditions c and d are satisfied; if so, executing step S303, and if not, executing step S306 and determining that no vertical tearing exists:
condition c, the two targets are adjacent in the vertical direction;
the condition d is that the depth values of the two targets conform to the "upper-near, lower-far" depth feature;
based on the principle and basis set forth in step S301, in the present embodiment, the tearing is determined by using the condition c and the condition d as constraints, and if the detected object satisfies the condition c and the condition d at the same time, the object is considered to be torn;
specifically, the following provides an example of the discrimination:
for condition c: calculating limit values of an overlapping area of a first target and a second target of the two targets, wherein the limit values comprise limits in four directions of up, down, left and right, and are respectively an upper limit, a lower limit, a left limit and a right limit;
if the left limit is smaller than the right limit, respectively calculating a horizontal overlapping proportion and a vertical overlapping proportion;
and if the horizontal overlapping proportion and the vertical overlapping proportion are both larger than a preset threshold value, determining that the two targets are adjacent in the vertical direction.
The following examples are given:
For targets A(x1, y1, w1, h1) and B(x2, y2, w2, h2), the bounding box corresponding to each target is represented in the form (x, y, w, h), where x and y are the horizontal and vertical coordinates of the top-left vertex, and w and h are the width and height of the bounding box (rectangular box).
The left, right, upper and lower limits of the overlapping area are calculated:

left = max(x1, x2), right = min(x1 + w1, x2 + w2), top = max(y1, y2), bottom = min(y1 + h1, y2 + h2)

If left < right, the calculation and judgment continue (otherwise the procedure returns, judging the targets not to be vertically adjacent), and the horizontal overlap ratio r_h and the vertical adjacency ratio r_v are calculated:

r_h = (right - left) / min(w1, w2), r_v = (bottom - top) / min(h1, h2)

If r_h > T_rh and r_v > T_rv, the targets are judged to be adjacent in the vertical direction, where T_rh and T_rv are the preset horizontal overlap ratio threshold and vertical adjacency ratio threshold, respectively (note that bottom - top is negative when the boxes do not overlap vertically, so a slightly negative T_rv tolerates a small gap). If condition c is satisfied, condition d is further determined.
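Condition c can be sketched as follows (the vertical adjacency ratio formula and the threshold defaults are illustrative assumptions, not the patent's preferred values):

```python
def vertically_adjacent(a, b, t_rh=0.5, t_rv=-0.2):
    """Decide whether boxes A and B, each given as (x, y, w, h), are
    adjacent in the vertical direction.

    r_h is the horizontal overlap ratio; r_v the vertical adjacency ratio
    (negative values mean a vertical gap, so a slightly negative threshold
    tolerates a small gap between the torn parts).
    """
    x1, y1, w1, h1 = a
    x2, y2, w2, h2 = b
    left, right = max(x1, x2), min(x1 + w1, x2 + w2)
    top, bottom = max(y1, y2), min(y1 + h1, y2 + h2)
    if left >= right:  # no horizontal overlap at all: cannot be torn parts
        return False
    r_h = (right - left) / min(w1, w2)
    r_v = (bottom - top) / min(h1, h2)
    return r_h > t_rh and r_v > t_rv
```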
For condition d: determining the relative up-down positions of a first target and a second target in the two targets;
determining whether an upper target is closer to a camera than a lower target based on the depth values of the first target and the second target;
and if so, determining that the depth values of the two targets conform to the "upper-near, lower-far" depth feature.
Specifically, the upper-lower positional relationship of the targets (for example, A and B) is determined first; then, based on the depth values, it is judged whether the upper target is closer to the camera than the lower one; if so, "upper-near, lower-far" is judged to be satisfied.
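Condition d can be sketched as follows (using the mean depth of each target cluster as the comparison statistic is an assumption; the patent does not fix which statistic is compared):

```python
import numpy as np

def upper_is_nearer(depth, cluster_a, cluster_b):
    """Check the 'upper-near, lower-far' rule for two target clusters.

    depth: 2-D depth image (smaller value = nearer the camera).
    cluster_a / cluster_b: (rows, cols) index arrays of the two clusters;
    the cluster with the smaller mean row index is taken as the upper one.
    """
    mean_row_a = np.mean(cluster_a[0])
    mean_row_b = np.mean(cluster_b[0])
    depth_a = np.mean(depth[cluster_a])
    depth_b = np.mean(depth[cluster_b])
    if mean_row_a < mean_row_b:  # A is the upper target
        return depth_a < depth_b
    return depth_b < depth_a
```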
S303, determining the two targets as tearing targets;
further, if the targets simultaneously satisfy the above conditions, step S303 is executed to determine that the two targets are tearing targets and associate the tearing targets, and after all the targets are determined, all the association results are sent to a tearing and merging link.
S304, merging the two targets;
if tearing in the vertical direction exists, the related target clusters are combined, the two original target clusters are combined into one, and when a human body detection result is output, the target cluster is regarded as a single body.
S305 outputs the human body detection result again based on the result of the merge operation.
The merged target cluster is recalculated, which includes calculating a bounding box from the target cluster and further screening through human body feature constraints.
S306, determining that the vertical tearing does not exist.
The foregoing embodiments in the present application are all used for correcting a pre-output human body detection result, where the pre-output human body detection result is obtained by performing a preliminary detection through a certain method, and the pre-output human body detection result includes a target cluster and a bounding box of a human body region, where the target cluster is a pixel set of all human body regions, and the bounding box is a rectangular bounding box of a region detected as a human body.
In the following, a specific embodiment mode is provided for the step "obtaining a pre-output human body detection result, where the human body detection result includes a target cluster of a human body region and a bounding box of the human body region", and the following detailed description refers to fig. 14, where the embodiment includes:
s401, preprocessing an input depth image to obtain an image to be detected;
The human body detection method provided by this embodiment is mainly applied to a passenger flow camera and realizes human body detection and subsequent passenger flow statistics based on the depth image. Compared with RGB images, depth images have several advantages: the depth image is not easily affected by illumination changes, and images can be acquired normally in dim light and even at night; the depth image carries no color or texture information, so it cannot record people's appearance, removing privacy concerns; and the depth image contains distance information, facilitating functions based on distance judgment. Therefore, more and more passenger flow cameras are choosing such depth cameras.
However, in practical applications, the passenger flow camera needs to cover a relatively large detection range, so a depth lens with a large field angle is often mounted; although the field of view of a large field angle is wider, the distortion and inclination produced when a human body approaches the edge of the picture are also more serious. On the other hand, in the depth image, holes or missing parts easily occur at the human head, body edges, legs and other low-reflectivity positions, and in severe cases the shape of the head or legs is completely missing. That is, human bodies in the depth image often lack robust features, which makes it difficult for conventional machine learning or deep learning schemes based on single-frame images to obtain ideal detection results. This embodiment provides a human body detection method for a passenger flow camera that can improve the human body detection effect for depth images.
In the embodiment, the terminal first performs preprocessing on the input depth image, including but not limited to down-sampling and format conversion on the depth image, so as to reduce the amount of calculation and improve the detection speed.
The terminal carries out preprocessing on the input depth image, wherein the preprocessing specifically comprises down-sampling and/or format conversion, and the preprocessing also comprises threshold gating on the processed image and then obtaining the image to be detected.
Specifically, the down-sampling and/or format conversion includes: if the resolution of the depth image is large, the original depth image is downsampled (namely, reduced); if the bit depth of the depth image is larger than 8 bits, the depth image is converted into an 8-bit image, so as to reduce the amount of calculation and improve the detection speed. The order of downsampling and format conversion can be interchanged; preferably, the terminal first downsamples the original depth image and then converts the downsampled depth image into an 8-bit image.
Furthermore, the terminal also needs to perform depth gating on the processed depth image, that is, a depth range [ Imin, Imax ] is preset, and a pixel value that does not conform to the depth range is set to be 0. It should be noted that Imin and Imax may be specifically set according to application requirements and actual scenarios. Some dark part (short distance) noises and bright part (long distance) noises can be preliminarily screened out through depth gating, and the detection effect is favorably improved.
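The preprocessing chain (downsampling, 8-bit conversion, depth gating) can be sketched as follows (the depth range [i_min, i_max], the linear 8-bit mapping and the downsample factor are illustrative assumptions to be set per scene):

```python
import numpy as np

def preprocess_depth(depth, i_min=500, i_max=4000, downsample=2):
    """Downsample a raw depth image, convert it to 8-bit, and apply depth
    gating: pixels outside [i_min, i_max] (raw depth units) are set to 0.
    """
    small = depth[::downsample, ::downsample]  # simple decimation
    # Map the gated depth range linearly onto 0..255 (8-bit).
    img8 = np.clip((small.astype(np.float64) - i_min) /
                   (i_max - i_min) * 255.0, 0, 255).astype(np.uint8)
    img8[(small < i_min) | (small > i_max)] = 0  # depth gating
    return img8
```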
S402, determining a picture motion area in the image to be detected through background modeling;
The terminal determines the picture motion area in the image to be detected through background modeling. For a scene where the camera is fixed and the picture background changes slowly, background modeling can be used directly to detect moving targets, or used as a preprocessing step to narrow the search range and further reduce the amount of calculation. The main advantages of background modeling are its relatively small amount of computation and high speed. In the depth images acquired in passenger flow statistics scenes, the human body shape is often incomplete and unfixed: for example, the head or legs of the human body may be missing, and the image of a human body close to the picture edge acquired by a large-field-angle camera is severely distorted and inclined, so effective human body detection cannot be achieved by conventional single-frame detection means. In this embodiment, determining the picture motion area through background modeling makes full use of multi-frame information, distinguishes moving targets (pedestrians) from the picture background, and reduces the interference of the background on the detection result.
In some specific embodiments, background modeling may be implemented using the CodeBook algorithm or the LOBSTER algorithm.
S403, clustering the picture motion areas to determine a human body area set, and calculating a human body detection result according to the human body area set;
The terminal clusters the picture motion region obtained in step S402; specifically, it clusters the valid pixels in the region to obtain a cluster set, i.e., the human body region set of the present application. Specifically, the clustering process traverses the neighborhood pixels of any pixel A already in a cluster: for any valid neighborhood pixel Ni, if the absolute difference between the pixel values of Ni and A is smaller than a preset intra-cluster similarity threshold S, Ni is added to the cluster containing A; otherwise a new cluster is created, Ni is added to it, and clustering continues.
The human body region set obtained above usually already contains the main human body regions, but each target cluster cannot be directly regarded as one human body region, for the following reasons:
1) the obtained clusters may include noise regions, moving objects, falsely detected background, and other non-human regions;
2) when human bodies contact or occlude each other, the resulting cluster regions stick together instead of forming individual human body regions, which is referred to herein as "adhesion";
3) in some scenarios, for example when a human body region in the depth map is disconnected or its depth values change with large discontinuities, a single human target may yield more than one cluster, which is referred to herein as "tearing".
To address these problems, after the human body region set is obtained, the corresponding human body detection result is further calculated from it: the terminal traverses the cluster set and computes the bounding box of every cluster as its detection frame, thereby obtaining the human body detection result.
Specifically, the terminal traverses all pixels of a cluster to find the minimum and maximum x and y coordinates of the pixels in the cluster (xmin, ymin, xmax, ymax), which determine the corresponding bounding rectangle, i.e., the target detection frame. It should be noted that the human body detection result includes at least the human body region set and the human body bounding box set.
Referring to fig. 15, an embodiment of determining the human body region set is provided, the embodiment comprising:
S4031, marking pixels inside the picture motion area as valid pixels through an image mask, and marking pixels outside the picture motion area as invalid pixels;
After obtaining the picture motion area, the terminal creates an image mask with the same resolution as the image to be detected to mark whether each pixel is valid. Specifically, the pixels corresponding to the picture motion area obtained in step S402 are marked valid, and the pixels of the remaining area are marked invalid.
Furthermore, the terminal may mark the top, bottom, left and right boundary pixels of the image mask as invalid, which avoids a boundary check for every pixel during subsequent clustering and improves efficiency.
S4032, clustering the valid pixels according to the image mask and the image to be detected to obtain the human body region set;
According to the image mask and the image to be detected, the terminal clusters all the pixels marked valid in step S4031 to obtain a cluster set, i.e., the human body region set of the present application. Specifically, the clustering process traverses the neighborhood pixels of any pixel A already in a cluster: for any valid neighborhood pixel Ni, if the absolute difference between the pixel values of Ni and A is smaller than a preset intra-cluster similarity threshold S, Ni is added to the cluster containing A; otherwise a new cluster is created, Ni is added to it, and clustering continues.
Further, since clustering involves searching the image, in some specific embodiments either Depth-First Search (DFS) or Breadth-First Search (BFS) may be used. Breadth-first search is preferred: it avoids recursion and consumes little memory, thereby improving computation speed.
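The clustering step described above (grow a cluster from a seed pixel, admitting valid neighbours whose value differs from their neighbour in the cluster by less than the threshold S) can be sketched with the preferred breadth-first search; the 4-neighbourhood and threshold value are illustrative:

```python
from collections import deque

def cluster_valid_pixels(img, mask, sim_thresh=10):
    """BFS flood fill over pixels marked valid in mask.
    Returns a list of clusters, each a list of (y, x) pixels."""
    h, w = len(img), len(img[0])
    visited = [[False] * w for _ in range(h)]
    clusters = []
    for sy in range(h):
        for sx in range(w):
            if not mask[sy][sx] or visited[sy][sx]:
                continue
            # Start a new cluster and grow it with BFS
            # (no recursion, low memory, as preferred in the text).
            queue = deque([(sy, sx)])
            visited[sy][sx] = True
            cluster = []
            while queue:
                y, x = queue.popleft()
                cluster.append((y, x))
                for dy, dx in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                    ny, nx = y + dy, x + dx
                    if (0 <= ny < h and 0 <= nx < w and mask[ny][nx]
                            and not visited[ny][nx]
                            and abs(img[ny][nx] - img[y][x]) < sim_thresh):
                        visited[ny][nx] = True
                        queue.append((ny, nx))
            clusters.append(cluster)
    return clusters
```

Marking the mask's border pixels invalid beforehand, as the text suggests, would let the bounds check inside the loop be dropped.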
S4033, determining a bounding box of the human body region set;
The terminal traverses the human body region set (the cluster set) obtained in step S4032 and computes an axis-aligned (AABB) bounding box for each cluster as its detection frame. The AABB bounding box is found as follows: traverse all pixels of the target cluster, find the minimum and maximum x and y coordinates of the pixels in the cluster (xmin, ymin, xmax, ymax), and then determine the corresponding bounding rectangle, which is the target detection frame.
In some specific embodiments, while finding the coordinate extrema the terminal may also record the pixel position corresponding to the highest point of each cluster (the pixel at the extreme y value), thereby obtaining a vertex coordinate set for each cluster. The highest point can then be used as a head-top reference to generate a small head frame in place of the full-body bounding box, so that in multi-person, adhesive scenes dense and overlapping detection frames are not output, improving the display effect on the application side.
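A minimal sketch of the AABB computation together with the optional highest-point record, assuming image coordinates with y increasing downward so the "highest" point of a figure is the cluster's topmost pixel (the source defines it via a y extremum; the exact convention may differ):

```python
def cluster_bbox_and_top(cluster):
    """Given a cluster as a list of (y, x) pixels, return its AABB
    (xmin, ymin, xmax, ymax) and one pixel attaining the topmost row,
    usable as a head-top reference for a small head frame."""
    xs = [x for _, x in cluster]
    ys = [y for y, _ in cluster]
    xmin, xmax = min(xs), max(xs)
    ymin, ymax = min(ys), max(ys)
    # Highest point: first pixel found on the cluster's topmost row
    # (assumes y grows downward in image coordinates).
    top = next((y, x) for y, x in cluster if y == ymin)
    return (xmin, ymin, xmax, ymax), top
```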
Referring to fig. 16, the upper part of fig. 16 shows the display before the improvement, and the lower part shows the display after replacing body bounding boxes with head frames, which greatly improves the user experience.
S4034, screening the bounding boxes through preset constraints, and taking the screened bounding boxes as the human body detection result;
After calculating the bounding boxes of the human body region set, the terminal screens the obtained bounding boxes through preset constraints.
Specifically, the preset constraints include but are not limited to: a human body region area constraint, a bounding box aspect ratio constraint, a boundary limit constraint, and a height constraint, each of which is described below.
1) Human body region area constraint;
Specifically, a target area threshold range [Amin, Amax] is set, and targets whose area is not within the range are discarded; the target area is the number of pixels in the cluster. Further, the value of Amin should consider the minimum area of a single-person region, while the value of Amax should consider the maximum total area of multiple adhered regions: adhered multi-person regions are deliberately retained at this stage and corrected in the subsequent adhesion splitting step. Optionally, Amin may be set below the single-person minimum so as to retain torn fragments of a single human body region for correction in the subsequent tear-merging step.
2) Bounding box aspect ratio constraint;
Specifically, a target detection frame aspect ratio range [Rmin, Rmax] is set, and targets whose detection frame aspect ratio is not within the range are discarded. The aspect ratio is the ratio of the bounding box width to its height (or, equivalently, height to width). Further, the value of Rmin should consider the minimum aspect ratio of a single-person bounding box, while Rmax should consider the maximum aspect ratio of a multi-person box in an adhesion scene, for the same reason as [Amin, Amax] above.
3) Boundary limit constraint;
Specifically, upper, lower, left and right boundary lines are set according to application requirements and the characteristics of the actual scene, and targets whose center point lies beyond a boundary line are discarded. The center point is either the bounding box center or the centroid of the human body region, preferably the centroid. The aim is to ignore a human body while it lies at the image boundary with much of its shape missing.
4) Height constraint;
Specifically, a height threshold HT is set, and targets lower than the threshold are discarded. The aim is to screen out falsely detected low objects, such as a chair that the background modeling module treats as a motion area because it has been moved. Two specific height discrimination schemes are provided, the first being preferred:
Scheme one: estimate the actual height of the target from the camera's intrinsic and extrinsic parameters and the target's pixel coordinates in the depth image, then compare the estimate with the height threshold;
Scheme two: use height calibration. Either capture a depth map of the plane at the height threshold HT to obtain a reference depth map, or record the camera height Hc, mount the camera at height Hc - HT and capture a ground depth map as the reference. In actual use, the height relation between a target and the reference is determined by comparing the target's depth with the reference depth map, thereby judging whether the target is lower than HT.
It should be noted that throughout the screening process clusters and bounding boxes always correspond one-to-one: when a cluster is screened out, its bounding box is deleted synchronously, and vice versa.
In practical applications, step S4034 is likely to be performed multiple times, but the height constraint need not be executed each time; it is executed once for the overall detection process. Specifically: after each splitting or merging operation, the bounding boxes of all clusters are recomputed and screened by the human body region area constraint, the bounding box aspect ratio constraint and the boundary limit constraint, while height-constraint screening is placed after these steps as a single step executed only once.
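The screening of S4034 (minus the once-only height constraint) can be sketched as one pass that keeps clusters and boxes in lockstep, as the correspondence note above requires; clusters are lists of (y, x) pixels, boxes are (xmin, ymin, xmax, ymax), and all thresholds are illustrative stand-ins for [Amin, Amax], [Rmin, Rmax] and the boundary lines:

```python
def screen_detections(clusters, boxes,
                      area_range=(200, 20000),
                      ratio_range=(0.2, 3.0),
                      borders=(0, 0, 319, 239)):
    """Apply the area, aspect-ratio and boundary-limit constraints,
    dropping each cluster and its bounding box together."""
    left, top, right, bottom = borders
    kept_clusters, kept_boxes = [], []
    for cluster, (xmin, ymin, xmax, ymax) in zip(clusters, boxes):
        # 1) Region area constraint: area is the cluster's pixel count.
        area = len(cluster)
        if not area_range[0] <= area <= area_range[1]:
            continue
        # 2) Aspect ratio constraint on the bounding box (width / height).
        width, height = xmax - xmin + 1, ymax - ymin + 1
        if not ratio_range[0] <= width / height <= ratio_range[1]:
            continue
        # 3) Boundary limit: discard targets whose centroid crosses a border.
        cx = sum(x for _, x in cluster) / area
        cy = sum(y for y, _ in cluster) / area
        if not (left <= cx <= right and top <= cy <= bottom):
            continue
        # Cluster and box survive (or are dropped) together, preserving
        # their one-to-one correspondence.
        kept_clusters.append(cluster)
        kept_boxes.append((xmin, ymin, xmax, ymax))
    return kept_clusters, kept_boxes
```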
The application also provides a passenger flow statistics camera, the passenger flow statistics camera comprising a processor and a depth camera, the processor executing the above passenger flow human body detection method during operation.
The application also provides a passenger flow human body detection device, the device comprising, coupled to each other:
the first acquisition unit is used for acquiring a human body detection result output in advance, wherein the human body detection result comprises a target cluster of a human body region and a bounding box of the human body region;
the data association unit is used for performing data association between the target cluster and bounding box of the current frame image and the target cluster and bounding box of the previous frame image to obtain a data association result;
the adhesion determining unit is used for determining the adhesion condition of the individual target in the human body detection result according to the data association result;
the first adhesion splitting unit is used for executing corresponding adhesion splitting operation on the human body detection result based on the adhesion condition;
and the first re-output unit is used for re-outputting the human body detection result based on the result of the adhesion splitting operation.
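The dynamic-adhesion check in the device above relies on frame-to-frame data association. As a sketch, assuming boxes are (xmin, ymin, xmax, ymax) tuples and using intersection-over-minimum-area as the association score (the patent's actual equation appears only as an image, so this metric is an assumption), a current-frame target matched by several previous-frame targets is flagged as the one-to-many, dynamic-adhesion case:

```python
def box_area(b):
    """Area of an AABB (xmin, ymin, xmax, ymax), bounds inclusive."""
    return (b[2] - b[0] + 1) * (b[3] - b[1] + 1)

def box_overlap(a, b):
    """Overlap area of two AABBs; 0 when they do not intersect."""
    w = min(a[2], b[2]) - max(a[0], b[0]) + 1
    h = min(a[3], b[3]) - max(a[1], b[1]) + 1
    return max(w, 0) * max(h, 0)

def associate(prev_boxes, curr_boxes, thresh=0.3):
    """Link each current box to every previous box whose overlap ratio
    (overlap / smaller area -- an assumed metric) reaches thresh.
    Current boxes linked to more than one previous box are returned
    as dynamic-adhesion candidates."""
    links = {i: [] for i in range(len(curr_boxes))}
    for j, pb in enumerate(prev_boxes):
        for i, cb in enumerate(curr_boxes):
            smaller = min(box_area(pb), box_area(cb))
            if smaller and box_overlap(pb, cb) / smaller >= thresh:
                links[i].append(j)
    dynamic_adhesion = [i for i, js in links.items() if len(js) > 1]
    return links, dynamic_adhesion
```

The same association, applied to cluster pixel sets instead of boxes, would serve as the verification step the method describes.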
The application also provides another passenger flow human body detection device, the device comprising, coupled to each other:
the second acquisition unit is used for acquiring a human body detection result which is output in advance, wherein the human body detection result comprises a target cluster of a human body region and a bounding box of the human body region;
the adhesion judging unit is used for judging whether the human body detection result of the current frame image meets the following conditions:
the ratio between the width and the height of the bounding box is within a preset ratio range;
and is
The target in the bounding box conforms to the preset head-shoulder characteristics;
if the conditions are met, determining that the adhesion condition of the individual target in the human body detection result is static adhesion;
the second adhesion splitting unit is used for executing adhesion splitting operation corresponding to the static adhesion on the human body detection result;
and the second re-output unit is used for re-outputting the human body detection result based on the result of the adhesion splitting operation.
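The static-adhesion head-shoulder test described above can be sketched as a search for a qualifying trough in the vertical integral projection (column sums) of the target's upper region. The prominence threshold below is an illustrative stand-in for the preset horizontal and vertical distance ranges:

```python
def find_split_trough(column_proj, min_prominence=3):
    """Scan a column-sum profile of the target's upper region for a local
    minimum flanked by a sufficiently higher peak on each side; return its
    column index (usable as a vertical split line), or None."""
    n = len(column_proj)
    for i in range(1, n - 1):
        # Skip columns that are not a strict local minimum.
        if column_proj[i - 1] <= column_proj[i] or column_proj[i] >= column_proj[i + 1]:
            continue
        left_peak = max(column_proj[:i])
        right_peak = max(column_proj[i + 1:])
        # A peak must rise above the trough on both sides.
        if (left_peak - column_proj[i] >= min_prominence
                and right_peak - column_proj[i] >= min_prominence):
            return i
    return None
```

Two adhered heads produce two humps in the column profile with a valley between the shoulders; the returned index marks the dividing line used by the splitting unit.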
The application also provides another passenger flow human body detection device, the device comprising, coupled to each other:
the third acquisition unit is used for acquiring a human body detection result which is output in advance, wherein the human body detection result comprises a target cluster of a human body area and a bounding box of the human body area;
and the tearing judgment unit is used for judging, for any two targets detected in the human body detection result of the current frame image, whether the following conditions are met:
the two targets are adjacent in a vertical direction;
and is
The depth values of the two targets conform to the upper-near, lower-far depth feature;
if the conditions are met, determining the two targets as tearing targets;
the tearing and merging unit is used for executing merging operation on the two targets;
and the third re-outputting unit is used for re-outputting the human body detection result based on the result of the merging operation.
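The tearing test for two targets can be sketched as follows, assuming each target is summarized by its bounding box and a mean depth value; the overlap threshold and the 2-pixel adjacency gap are illustrative:

```python
def is_torn_pair(box_a, box_b, depth_a, depth_b, min_overlap=0.5):
    """Two boxes (xmin, ymin, xmax, ymax) are a tearing candidate when
    they are vertically adjacent with enough horizontal overlap and the
    upper target is closer to the camera than the lower one
    (the upper-near, lower-far depth feature)."""
    # Order the pair top-to-bottom (y grows downward in image coordinates).
    if box_a[1] < box_b[1]:
        upper, lower, d_upper, d_lower = box_a, box_b, depth_a, depth_b
    else:
        upper, lower, d_upper, d_lower = box_b, box_a, depth_b, depth_a
    # Horizontal overlap ratio relative to the narrower box.
    left = max(upper[0], lower[0])
    right = min(upper[2], lower[2])
    narrower = min(upper[2] - upper[0], lower[2] - lower[0]) + 1
    h_overlap = max(right - left + 1, 0) / narrower
    # Vertical adjacency: boxes touch or nearly touch in y.
    vertically_adjacent = lower[1] - upper[3] <= 2
    # Smaller depth means closer to the camera.
    return h_overlap >= min_overlap and vertically_adjacent and d_upper < d_lower
```

Fragments passing the test would then be handed to the merging unit to be rejoined into one target.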
The application also provides a passenger flow human body detection device, comprising, coupled to each other:
a processor, a memory, an input/output unit and a bus;
the processor is connected with the memory, the input/output unit and the bus;
the memory stores a program, and the processor calls the program to execute any one of the above passenger flow human body detection methods.
The present application also relates to a computer-readable storage medium having a program stored thereon which, when run on a computer, causes the computer to perform any of the methods described above. The above-described device embodiments are merely illustrative; the division into units is only a logical functional division, and other divisions are possible in actual implementation.

Claims (17)

1. A passenger flow human body detection method is characterized by comprising the following steps:
obtaining a human body detection result which is output in advance, wherein the human body detection result comprises a target cluster of a human body area and a bounding box of the human body area;
performing data association between the target cluster and the bounding box of the current frame image and the target cluster and the bounding box of the previous frame image to obtain a data association result;
determining the adhesion condition of the individual target in the human body detection result according to the data association result;
based on the adhesion condition, executing corresponding adhesion splitting operation on the human body detection result;
and outputting the human body detection result again based on the result of the adhesion splitting operation.
2. The passenger flow human body detection method according to claim 1, wherein the determining the adhesion condition of the individual target in the human body detection result according to the data association result comprises:
when a one-to-many case exists in the data association result, determining that the adhesion condition of the individual target in the human body detection result is dynamic adhesion.
3. The passenger flow human body detection method according to claim 2, wherein, when a one-to-many case exists in the data association result, before determining that the adhesion condition of the individual target in the human body detection result is dynamic adhesion, the method further comprises:
verifying the data association result of the bounding box of the current frame image and the bounding box of the previous frame image;
if a one-to-many case exists between the bounding box of the current frame image and the bounding box of the previous frame image, verifying the data association result of the target cluster of the current frame image and the target cluster of the previous frame image;
and if a one-to-many case exists between the target cluster of the current frame image and the target cluster of the previous frame image, determining that the adhesion condition of the individual target in the human body detection result is dynamic adhesion.
4. The passenger flow human body detection method according to claim 2, wherein, when the adhesion condition of the individual target in the human body detection result is determined to be dynamic adhesion, the performing the corresponding adhesion splitting operation on the human body detection result based on the adhesion condition comprises:
traversing all one-to-many association items in the human body detection result, and performing the following adhesion splitting operation on each association item:
determining a basic region in the association item and a region attribution of the basic region, wherein the basic region is an overlapping region of a target to be split and an associated target;
determining an undetermined area in the association item, wherein the undetermined area is an area except the basic area in the target to be split;
and determining the region attribution of the undetermined region, thereby obtaining two split sub-regions.
5. The passenger flow human body detection method according to claim 1, wherein the performing data association between the target cluster and the bounding box of the current frame image and the target cluster and the bounding box of the previous frame image comprises:
performing data association through an overlap-ratio equation (rendered only as an image in the source text) defined in terms of the overlap area of targets A and B in the human body detection result and the respective areas of the targets A and B.
6. A passenger flow human body detection method is characterized by comprising the following steps:
obtaining a human body detection result which is output in advance, wherein the human body detection result comprises a target cluster of a human body area and a bounding box of the human body area;
judging whether the human body detection result of the current frame image meets the following conditions:
the ratio between the width and the height of the bounding box is within a preset ratio range;
and is
The target in the bounding box conforms to the preset head-shoulder characteristics;
if the conditions are met, determining that the adhesion condition of the individual target in the human body detection result is static adhesion;
performing adhesion splitting operation corresponding to the static adhesion on the human body detection result;
and outputting the human body detection result again based on the result of the adhesion splitting operation.
7. The passenger flow human body detection method according to claim 6, wherein determining whether the target in the bounding box conforms to the preset head-shoulder characteristics comprises:
taking the upper area of the target as a target area, and calculating integral projection of the target area;
determining the peak and the trough of the integral projection curve;
if target wave troughs exist in the wave troughs, determining that the targets in the bounding box meet preset head-shoulder characteristics;
the target trough satisfies the following condition:
a wave crest exists on each of the left and right sides of the target wave trough, and the horizontal and vertical distances between the target wave trough and each wave crest are both within preset distance ranges.
8. The passenger flow human body detection method according to claim 7, wherein if a target wave trough exists among the wave troughs, the position of the target wave trough is stored as a splitting position for the adhesion splitting operation;
the performing the adhesion splitting operation corresponding to the static adhesion on the human body detection result comprises:
performing adhesion splitting on the target by taking the vertical line at the position of the target wave trough as a dividing line.
9. A passenger flow human body detection method is characterized by comprising the following steps:
obtaining a human body detection result which is output in advance, wherein the human body detection result comprises a target cluster of a human body area and a bounding box of the human body area;
for any two targets detected in the human body detection result of the current frame image, judging whether the following conditions are met:
the two targets are adjacent in a vertical direction;
and is
The depth values of the two targets conform to the upper-near, lower-far depth feature;
if the conditions are met, determining the two targets as tearing targets;
performing a merge operation on the two targets;
and outputting the human body detection result again based on the result of the merging operation.
10. The passenger flow human body detection method according to claim 9, wherein determining whether the two targets are adjacent in the vertical direction comprises:
calculating limit values of an overlapping area of a first target and a second target of the two targets, wherein the limit values comprise limits in four directions of up, down, left and right, namely an upper limit, a lower limit, a left limit and a right limit;
if the left limit is smaller than the right limit, respectively calculating a horizontal overlapping proportion and a vertical overlapping proportion;
and if the horizontal overlapping proportion and the vertical overlapping proportion are both larger than a preset threshold value, determining that the two targets are adjacent in the vertical direction.
11. The passenger flow human body detection method according to claim 9, wherein determining whether the depth values of the two targets conform to the upper-near, lower-far depth feature comprises:
determining the relative vertical positions of a first target and a second target of the two targets;
determining, based on the depth values of the first target and the second target, whether the upper target is closer to the camera than the lower target;
and if so, determining that the depth values of the two targets conform to the upper-near, lower-far depth feature.
12. A passenger flow statistics camera, characterized in that the passenger flow statistics camera comprises a processor and a depth camera, the processor in operation performing the passenger flow human detection method of any one of claims 1 to 11.
13. A passenger flow human body detection device, the device comprising:
the first acquisition unit is used for acquiring a human body detection result output in advance, wherein the human body detection result comprises a target cluster of a human body region and a bounding box of the human body region;
the data association unit is used for performing data association between the target cluster and bounding box of the current frame image and the target cluster and bounding box of the previous frame image to obtain a data association result;
the adhesion determining unit is used for determining the adhesion condition of the individual target in the human body detection result according to the data association result;
the first adhesion splitting unit is used for executing corresponding adhesion splitting operation on the human body detection result based on the adhesion condition;
and the first re-output unit is used for re-outputting the human body detection result based on the result of the adhesion splitting operation.
14. A passenger flow human body detection device, the device comprising:
the second acquisition unit is used for acquiring a human body detection result which is output in advance, wherein the human body detection result comprises a target cluster of a human body region and a bounding box of the human body region;
the adhesion judging unit is used for judging whether the human body detection result of the current frame image meets the following conditions:
the ratio between the width and the height of the bounding box is within a preset ratio range;
and is
The target in the bounding box conforms to the preset head-shoulder characteristics;
if the conditions are met, determining that the adhesion condition of the individual target in the human body detection result is static adhesion;
the second adhesion splitting unit is used for executing adhesion splitting operation corresponding to the static adhesion on the human body detection result;
and the second re-output unit is used for re-outputting the human body detection result based on the result of the adhesion splitting operation.
15. A passenger flow human body detection device, the device comprising:
a third obtaining unit, configured to obtain a human body detection result output in advance, where the human body detection result includes a target cluster of a human body region and a bounding box of the human body region;
and the tearing judgment unit is used for judging, for any two targets detected in the human body detection result of the current frame image, whether the following conditions are met:
the two targets are adjacent in a vertical direction;
and is
The depth values of the two targets conform to the upper-near, lower-far depth feature;
if the conditions are met, determining the two targets as tearing targets;
the tearing and merging unit is used for executing merging operation on the two targets;
and the third re-outputting unit is used for re-outputting the human body detection result based on the result of the merging operation.
16. A passenger flow human body detection device, the device comprising:
the device comprises a processor, a memory, an input and output unit and a bus;
the processor is connected with the memory, the input and output unit and the bus;
the memory holds a program that the processor calls to perform the method of any of claims 1 to 11.
17. A computer-readable storage medium having a program stored thereon, which when executed on a computer performs the method of any one of claims 1 to 11.
CN202210746999.9A 2022-06-29 2022-06-29 Passenger flow human body detection method and device, storage medium and passenger flow statistical camera Active CN114821676B (en)

Priority Applications (3)

Application Number Priority Date Filing Date Title
CN202211033934.6A CN115131828A (en) 2022-06-29 2022-06-29 Passenger flow human body detection method and device, storage medium and passenger flow statistical camera
CN202211033323.1A CN115131827A (en) 2022-06-29 2022-06-29 Passenger flow human body detection method and device, storage medium and passenger flow statistical camera
CN202210746999.9A CN114821676B (en) 2022-06-29 2022-06-29 Passenger flow human body detection method and device, storage medium and passenger flow statistical camera


Related Child Applications (2)

Application Number Title Priority Date Filing Date
CN202211033323.1A Division CN115131827A (en) 2022-06-29 2022-06-29 Passenger flow human body detection method and device, storage medium and passenger flow statistical camera
CN202211033934.6A Division CN115131828A (en) 2022-06-29 2022-06-29 Passenger flow human body detection method and device, storage medium and passenger flow statistical camera

Publications (2)

Publication Number Publication Date
CN114821676A true CN114821676A (en) 2022-07-29
CN114821676B CN114821676B (en) 2023-04-07

Family

ID=82522385

Family Applications (3)

Application Number Title Priority Date Filing Date
CN202211033323.1A Pending CN115131827A (en) 2022-06-29 2022-06-29 Passenger flow human body detection method and device, storage medium and passenger flow statistical camera
CN202211033934.6A Pending CN115131828A (en) 2022-06-29 2022-06-29 Passenger flow human body detection method and device, storage medium and passenger flow statistical camera
CN202210746999.9A Active CN114821676B (en) 2022-06-29 2022-06-29 Passenger flow human body detection method and device, storage medium and passenger flow statistical camera


Country Status (1)

Country Link
CN (3) CN115131827A (en)


Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110136174A (en) * 2019-05-22 2019-08-16 北京华捷艾米科技有限公司 A kind of target object tracking and device
CN112116556A (en) * 2020-08-12 2020-12-22 浙江大华技术股份有限公司 Passenger flow volume statistical method and device and computer equipment
CN113034544A (en) * 2021-03-19 2021-06-25 奥比中光科技集团股份有限公司 People flow analysis method and device based on depth camera
WO2021249351A1 (en) * 2020-06-10 2021-12-16 苏宁易购集团股份有限公司 Target detection method, apparatus and computer device based on rgbd image
CN113989276A (en) * 2021-12-23 2022-01-28 珠海视熙科技有限公司 Detection method and detection device based on depth image and camera equipment
CN114333056A (en) * 2021-12-29 2022-04-12 北京淳中科技股份有限公司 Gesture control method, system, equipment and storage medium


Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117649737A (en) * 2024-01-30 2024-03-05 云南电投绿能科技有限公司 Method, device, equipment and storage medium for monitoring equipment in park
CN117649737B (en) * 2024-01-30 2024-04-30 云南电投绿能科技有限公司 Method, device, equipment and storage medium for monitoring equipment in park

Also Published As

Publication number Publication date
CN115131827A (en) 2022-09-30
CN115131828A (en) 2022-09-30
CN114821676B (en) 2023-04-07

Similar Documents

Publication Publication Date Title
US8045759B2 (en) Object detection system and method
WO2021051604A1 (en) Method for identifying text region of osd, and device and storage medium
CN109726717B (en) Vehicle comprehensive information detection system
CN109086724A (en) A kind of accelerated human face detection method and storage medium
US8805059B2 (en) Method, system and computer program product for segmenting an image
CN107481267A (en) A kind of shooting projection interactive system and method based on binocular vision
CN105160649A (en) Multi-target tracking method and system based on kernel function unsupervised clustering
CN109685827B (en) Target detection and tracking method based on DSP
CN111160291A (en) Human eye detection method based on depth information and CNN
CN114821676B (en) Passenger flow human body detection method and device, storage medium and passenger flow statistical camera
CN106815587A (en) Image processing method and device
US8503723B2 (en) Histogram-based object tracking apparatus and method
US11587240B2 (en) Moving body detecting device, moving body detecting method, and moving body detecting program
CN109753945A (en) Target subject recognition methods, device, storage medium and electronic equipment
JP2011209896A (en) Obstacle detecting apparatus, obstacle detecting method, and obstacle detecting program
CN115147868B (en) Human body detection method of passenger flow camera, device and storage medium
KR101402089B1 (en) Apparatus and Method for Obstacle Detection
JP4412929B2 (en) Face detection device
JP5338762B2 (en) White balance coefficient calculation device and program
CN115273138B (en) Human body detection system and passenger flow camera
CN109961637A (en) Vehicle detection apparatus and system based on more subgraphs fusion and significance analysis
US20230177803A1 (en) Cloud observation system, cloud observation method, and computer-readable recording medium
CN114463800A (en) Multi-scale feature fusion face detection and segmentation method based on generalized intersection-parallel ratio
CN114821677B (en) Human body detection method and device, storage medium and passenger flow statistics camera
JP2004235711A (en) Target tracking system and method therefor

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant