CN111860261A - Passenger flow value statistical method, device, equipment and medium - Google Patents

Passenger flow value statistical method, device, equipment and medium

Info

Publication number
CN111860261A
CN111860261A (application CN202010664483.0A)
Authority
CN
China
Prior art keywords
head frame
target
identification information
person
determining
Prior art date
Legal status
Granted
Application number
CN202010664483.0A
Other languages
Chinese (zh)
Other versions
CN111860261B (en)
Inventor
王彦斌
朱宏吉
张彦刚
Current Assignee
Beijing Orion Star Technology Co Ltd
Original Assignee
Beijing Orion Star Technology Co Ltd
Priority date
Filing date
Publication date
Application filed by Beijing Orion Star Technology Co Ltd
Priority to CN202010664483.0A
Publication of CN111860261A
Application granted
Publication of CN111860261B
Legal status: Active (current)
Anticipated expiration

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06V: IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00: Scenes; Scene-specific elements
    • G06V20/50: Context or environment of the image
    • G06V20/52: Surveillance or monitoring of activities, e.g. for recognising suspicious objects
    • G06V20/53: Recognition of crowd images, e.g. recognition of crowd congestion

Abstract

The invention discloses a method, a device, equipment and a medium for counting passenger flow values, which are used for solving the low accuracy of existing passenger flow counting methods. During counting, for each head frame contained in the image to be recognized, the first similarity between that head frame and each verified head frame in the currently stored verification queue is compared with a preset first threshold. When every first similarity corresponding to the head frame is smaller than the first threshold, the head frame is not similar to any verified head frame; it is therefore determined to be a target head frame and the subsequent step of updating the currently counted passenger flow value is performed. Head frames that may belong to "dummies" in posters or billboards are thus kept out of the count, which improves the accuracy of the passenger flow statistics.

Description

Passenger flow value statistical method, device, equipment and medium
Technical Field
The invention relates to the technical field of image processing, and in particular to a passenger flow value statistical method, device, equipment and medium.
Background
Counting passenger flow and the distribution of people in open scenes such as shopping malls and scenic spots, and analyzing the attributes of the customers in those scenes, provides quantitative data for merchants and technical support for better business decisions. How to count passenger flow has therefore received growing attention in recent years.
In the prior art, the head frames contained in an image to be recognized are generally obtained through a conventional visual algorithm such as Local Binary Patterns (LBP). The similarity between each head frame and every tracked head frame corresponding to each identification information in the current tracking queue is then calculated, the target identification information of the head frame is determined from these similarities through the Hungarian algorithm, and the head frame is stored in the tracking queue under the target identification information. Finally, the counted passenger flow value is determined according to the number of distinct identification information in the updated tracking queue.
With this method, when posters or billboards containing people are present in the application scene, it cannot be determined whether a head frame in the collected image belongs to a "dummy" in a poster or billboard, so such "dummies" are counted into the passenger flow value and the accuracy of the counted value suffers.
Disclosure of Invention
The embodiment of the invention provides a method, a device, equipment and a medium for counting passenger flow values, which are used for solving the problem of low accuracy of the conventional passenger flow value counting method.
The embodiment of the invention provides a statistical method of a passenger flow value, which comprises the following steps:
acquiring a human head frame contained in an image to be identified;
for each head frame, if the first similarity between the head frame and each verified head frame in the current verification queue is smaller than a preset first threshold value, determining that the head frame is a target head frame;
for each target person head frame, determining target identification information of the target person head frame according to the second similarity between the target person head frame and each tracked person head frame in the current tracking queue; and if the target identification information meets the preset updating condition, updating the current statistical passenger flow value.
The embodiment of the invention provides a passenger flow value statistical device, which comprises:
the acquisition unit is used for acquiring a human head frame contained in the image to be identified;
the judging unit is used for determining that the human head frame is a target human head frame if the first similarity between the human head frame and each verified human head frame in the current verification queue is smaller than a preset first threshold value;
the processing unit is used for determining the target identification information of each target person head frame according to the second similarity between the target person head frame and each tracked person head frame in the current tracking queue; and if the target identification information meets the preset updating condition, updating the current statistical passenger flow value.
An embodiment of the present invention provides an electronic device, where the electronic device at least includes a processor and a memory, and the processor is configured to implement the steps of the method for counting a passenger flow value as described in any one of the above when executing a computer program stored in the memory.
An embodiment of the present invention provides a computer-readable storage medium, which stores a computer program, and when the computer program is executed by a processor, the computer program implements the steps of any of the above methods for counting a passenger flow value.
With the above scheme, during passenger flow counting, for each head frame contained in the image to be recognized, the first similarity between that head frame and each verified head frame in the currently stored verification queue is compared with a preset first threshold. When every first similarity corresponding to the head frame is smaller than the first threshold, the head frame is not similar to any verified head frame; it is therefore determined to be a target head frame and the subsequent step of updating the currently counted passenger flow value is performed. Head frames that may belong to "dummies" in posters or billboards are thus kept out of the count, which improves the accuracy of the passenger flow statistics.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings used in the description of the embodiments or of the prior art are briefly introduced below. It is apparent that the drawings in the following description illustrate only some embodiments of the present invention, and that those skilled in the art can obtain other drawings from them without creative effort.
Fig. 1 is a schematic diagram illustrating a statistical process of passenger flow values according to an embodiment of the present invention;
Fig. 2 is a flow chart illustrating the statistics of specific passenger flow values according to an embodiment of the present invention;
Fig. 3 is a schematic diagram illustrating a statistical process of a specific passenger flow value according to an embodiment of the present invention;
Fig. 4 is a schematic structural diagram of a device for counting passenger flow values according to an embodiment of the present invention;
Fig. 5 is a schematic structural diagram of an electronic device according to an embodiment of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention clearer, the present invention will be described in further detail with reference to the accompanying drawings, and it is apparent that the described embodiments are only a part of the embodiments of the present invention, not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
Example 1: Fig. 1 is a schematic diagram of a statistical process of a passenger flow value according to an embodiment of the present invention, where the process includes:
s101: and acquiring a human head frame contained in the image to be recognized.
The passenger flow value counting method provided by the embodiment of the invention is applied to an electronic device. The electronic device may be an intelligent device, such as an intelligent robot or a monitoring device, or may be a server.
After the image to be recognized is obtained, it is processed according to the passenger flow value counting method provided by the embodiment of the invention, so that the currently counted passenger flow value is updated.
The image to be recognized may be captured by the electronic device itself, or may be captured by another image acquisition device and transmitted to the electronic device, which is not limited here.
In a specific implementation, the head frames contained in the received image to be recognized may be obtained through a conventional visual algorithm such as LBP. However, because conventional visual algorithms cannot reliably extract head frames from images covering a large shooting range, when such images are expected, the head frames contained in the received image to be recognized can instead be obtained through a pre-trained head detector or human body detector.
Acquiring the head frames contained in the image to be recognized means locating each head region in the image: each head frame is determined by its position information in the image to be recognized, for example the coordinate values of the pixel at its upper-left corner and the coordinate values of the pixel at its lower-right corner.
It should be noted that the human head detector or the human body detector may be trained by a deep learning target recognition method, and a specific training process belongs to the prior art and is not described herein again.
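As a concrete illustration of S101, the following sketch represents each detected head frame by the corner coordinates described above. It is only a minimal sketch: the detector object and its detect() method are hypothetical placeholders for whatever pre-trained head detector or human body detector is used, and are not named in the patent.

```python
# Minimal sketch of S101. The `detector` and its detect() method are
# hypothetical placeholders; any pre-trained head or human body detector that
# returns bounding boxes could be substituted.
from dataclasses import dataclass
from typing import List

@dataclass
class HeadFrame:
    """A head frame located by the pixel coordinates of its upper-left and
    lower-right corners in the image to be recognized."""
    x1: int  # upper-left corner, x coordinate
    y1: int  # upper-left corner, y coordinate
    x2: int  # lower-right corner, x coordinate
    y2: int  # lower-right corner, y coordinate

def acquire_head_frames(image, detector) -> List[HeadFrame]:
    # The detector is assumed to return (x1, y1, x2, y2) tuples, one per
    # detected head region in the image to be recognized.
    return [HeadFrame(*box) for box in detector.detect(image)]
```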
S102: and for each head frame, if the first similarity between the head frame and each verified head frame in the current verification queue is smaller than a preset first threshold value, determining that the head frame is the target head frame.
A poster or billboard may appear in the image to be recognized, and the figure depicted in it, referred to as a "dummy", may be picked up by the detection step above, so that the head frame of a "dummy" is extracted and the subsequent passenger flow statistics are distorted. Therefore, in order to improve the accuracy of the counted passenger flow value, the embodiment of the invention provides in advance a verification queue in which verified head frames are stored. A verified head frame is a head frame of a figure displayed through a propagation medium in the surrounding environment, such as the head frame of a "dummy" in a poster or billboard, or the head frame of a "dummy" shown on a video playing device. The initial verification queue may be empty or may contain "dummy" head frames collected from the surrounding environment in advance, and during subsequent passenger flow counting the verified head frames in the queue are updated in real time. For each head frame contained in the acquired image to be recognized, the similarity between the head frame and each verified head frame in the current verification queue (for convenience of description, the first similarity) is determined, and whether the object corresponding to the head frame is a "dummy" is decided according to each first similarity and a preset threshold (for convenience of description, the first threshold).
The first threshold can be set differently for different scenes. If head frames that may be "dummies" are to be screened out of the image to be recognized as aggressively as possible, the first threshold can be set smaller; if head frames that are not "dummies" are to be protected from being rejected by mistake, the first threshold can be set larger. It can be set flexibly according to actual requirements and is not specifically limited here.
Specifically, for each head frame contained in the acquired image to be recognized, if every first similarity of the head frame is smaller than the preset first threshold, the object corresponding to the head frame is probably not a "dummy"; the head frame is then determined to be a target head frame and the subsequent processing for updating the currently counted passenger flow value is performed.
For example, suppose the first threshold is 85 and the three calculated first similarities are 24, 17 and 8. All three values are smaller than the first threshold, so the head frame is not very similar to any verified head frame; the corresponding object is considered not to be a "dummy", the head frame is determined to be a target head frame, and the subsequent processing of updating the currently counted passenger flow value is performed.
Determining the first similarity between a head frame and a verified head frame belongs to the prior art and may be done through a network model or an image similarity algorithm. It can be set flexibly according to actual requirements and is not specifically limited here.
In another possible embodiment, the method further comprises:
if the first similarity between the head frame and any verified head frame in the current verification queue is not smaller than the first threshold, not performing the processing of updating the currently counted passenger flow value.
In a specific implementation, the first similarity between each head frame and each verified head frame in the current verification queue is calculated. As soon as the first similarity between the head frame and any verified head frame is not smaller than the preset first threshold, the object corresponding to the head frame may be a "dummy"; counting it would affect the statistical accuracy, so the processing of updating the currently counted passenger flow value is not performed for that head frame.
When determining whether a head frame contained in the image to be recognized is a target head frame, the first similarities between the head frame and the verified head frames in the verification queue may be calculated sequentially in a preset order, such as the storage order of the verified head frames or their acquisition time, and the calculation stops as soon as a first similarity that is not smaller than the preset first threshold is found. Alternatively, the first similarity between the head frame and every verified head frame may be calculated first, and then whether any of the calculated first similarities is not smaller than the preset first threshold is checked. The first similarities may also be calculated in random order, again stopping as soon as one is found that is not smaller than the preset first threshold. The specific implementation can be set according to actual requirements and is not specifically limited here.
For example, suppose the first threshold is 85 and the calculated first similarities are 24, 31 and 90. The first similarity 90 is greater than the first threshold 85, so the object corresponding to the head frame may be a "dummy"; counting it would affect the statistical accuracy, so the processing of updating the currently counted passenger flow value is not performed based on this head frame.
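A minimal sketch of the S102 screening described above, including the early-exit strategy from the preceding paragraphs. The function names, the caller-supplied similarity function and the default threshold value (taken from the worked example) are illustrative assumptions rather than the patent's wording.

```python
# Sketch of the S102 check. The similarity function is supplied by the caller
# (Example 2 below derives it from hash values); 85 matches the worked example.
FIRST_THRESHOLD = 85

def is_target_head_frame(head_frame, verification_queue, similarity_fn,
                         first_threshold=FIRST_THRESHOLD) -> bool:
    """True only if every first similarity between the head frame and the
    verified head frames in the verification queue is below the threshold."""
    for verified in verification_queue:
        if similarity_fn(head_frame, verified) >= first_threshold:
            # One verified head frame is similar enough: the object may be a
            # "dummy", so stop computing further similarities (early exit).
            return False
    return True
```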
S103: for each target person head frame, determining target identification information of the target person head frame according to the second similarity between the target person head frame and each tracked person head frame in the current tracking queue; and if the target identification information meets the preset updating condition, updating the current statistical passenger flow value.
After each target head frame is obtained as in the above embodiments, for each target head frame, the similarity between the target head frame and each tracked head frame corresponding to each identification information in the current tracking queue (for convenience of description, the second similarity) is determined, and the target identification information of the target head frame is determined by processing these second similarities. A tracked head frame is a head frame of a pedestrian captured in the surrounding environment. The initial tracking queue may be empty, and during subsequent passenger flow counting the tracked head frames in the tracking queue are updated in real time according to each target head frame to which target identification information has been assigned.
The second similarity between the target head frame and each tracked head frame in the current tracking queue may be determined in the same way as the first similarity or in a different way. When a different way is used, the second similarity may be the region overlap ratio between the target head frame and each tracked head frame in the current tracking queue, that is, the overlap ratio between the region of any tracked head frame and the region of the target head frame in the image to be recognized; or it may be a spatial distance between the target head frame and each tracked head frame, such as the Euclidean distance or the Chebyshev distance. The way the second similarity is determined is not specifically limited here.
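The short sketch below computes the region overlap ratio mentioned above as intersection over union of the two boxes; this is only one of the measures the embodiment allows, and the function name is an illustrative assumption.

```python
# Possible second-similarity measure: region overlap ratio (IoU) between a
# target head frame and a tracked head frame, each given as (x1, y1, x2, y2).
def region_overlap_ratio(a, b) -> float:
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0
```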
In one possible implementation, when determining the target identification information of each target head frame, the second similarity between the target head frame and each tracked head frame in the current tracking queue is determined, and the identification information of the tracked head frames most similar to the target head frame is determined through the Hungarian algorithm and the second similarities. If the second similarity between any tracked head frame corresponding to that identification information and the target head frame is greater than a set threshold, that identification information is determined to be the identification information of the target head frame (for convenience of description, the target identification information); otherwise, new identification information is allocated to the target head frame and determined to be its target identification information.
The identification information of a head frame (including a verified head frame, a tracked head frame or a target head frame) uniquely identifies the identity of the object to which the head frame belongs. It may be a number, a letter, a special symbol, a character string or any other form, as long as it uniquely identifies the object to which the head frame belongs.
In practice, after the target identification information of a target head frame is determined, if it is identification information already corresponding to some tracked head frame in the tracking queue, the processing of updating the counted passenger flow value has already been performed for that identification information before the target head frame was obtained, and does not need to be performed again for this target head frame. Therefore, to count the passenger flow value more accurately, the embodiment of the invention configures an updating condition, which may be whether the target identification information differs from every identification information in the current tracking queue. After the target identification information of any target head frame is determined as in the above embodiment, whether it satisfies the preset updating condition, that is, whether it is the same as the identification information of any tracked head frame in the current tracking queue, decides whether the currently counted passenger flow value is updated.
Specifically, if the target identification information is the same as the identification information of any tracked head frame in the current tracking queue, the processing of updating the counted passenger flow value has already been performed for that identification information; the target identification information does not satisfy the preset updating condition and the currently counted passenger flow value is not updated according to it. If the target identification information differs from every identification information in the current tracking queue, the object corresponding to the head frame has not yet been counted into the passenger flow value; the target identification information satisfies the preset updating condition and the currently counted passenger flow value is updated.
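A compact sketch of this Example 1 updating condition. The data structures are assumptions made for illustration: the tracking queue is represented only by the set of identification information it already contains.

```python
# Sketch of the Example 1 updating condition: the passenger flow value grows
# only when the target identification information is new to the tracking queue.
def update_passenger_flow(target_id, tracking_queue_ids, current_count: int) -> int:
    """tracking_queue_ids: identification information already present in the
    tracking queue. Returns the (possibly unchanged) passenger flow value."""
    if target_id in tracking_queue_ids:
        # Already counted via a tracked head frame with this identification.
        return current_count
    return current_count + 1
```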
Example 2: to further improve the accuracy of the statistical passenger flow value, in the embodiment of the present invention, on the basis of the above embodiment, determining the first similarity between the human head box and each verified human head box in the current verification queue includes:
determining the hash value of the human head box according to the image information in the human head box;
and aiming at each verified person head frame, determining a first distance between the hash value of the person head frame and the hash value of the verified person head frame, and determining the first distance as the first similarity between the person head frame and the verified person head frame.
In practice, the content of an image can be represented by a hash value, which acts like a "fingerprint" character string of the image; the similarity between different images can then be determined from their hash values, with closer hash values indicating more similar images. Based on this, in the embodiment of the invention, the first similarity can be determined from the hash value of the image inside the head frame and the hash value corresponding to each verified head frame, so as to determine whether the head frame is likely to be the head frame of a "dummy".
In a specific implementation, after any head frame contained in the image to be recognized is acquired, the hash value of the image inside the head frame is determined from the image information inside the head frame. Then, for each verified head frame in the current verification queue, a first distance, for example a Hamming distance, between the hash value of the image inside the head frame and the hash value corresponding to the verified head frame is determined, and the first similarity between the head frame and the verified head frame is determined from the first distance.
When the first similarity between the human head frame and the verified human head frame is determined according to the first distance, the first distance may be directly used as the first similarity between the human head frame and the verified human head frame, or certain processing, such as weighting, function operation, quantization, and the like, may be performed on the first distance, and a result obtained after the processing is used as the first similarity between the human head frame and the verified human head frame. The method for determining the first similarity according to the first distance may be flexibly set according to actual requirements, and is not specifically limited herein.
The hash value corresponding to each verified head frame in the current verification queue may be determined when the verified head frame is obtained, or may be computed in real time from each verified head frame when the first similarity is determined. It can be set flexibly according to actual requirements and is not specifically limited here.
It should be noted that, the process of determining the hash value in the human head frame image and determining the first distance belongs to the prior art, and is not described herein again.
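As a hedged illustration of the hash-and-distance computation, the sketch below uses the third-party Pillow and imagehash packages; the patent names no library or hash algorithm, so average hashing is only an assumed choice, and a smaller Hamming distance means the two crops look more alike.

```python
# Illustrative hash-based first distance, assuming Pillow and imagehash.
from PIL import Image
import imagehash

def head_frame_hash(image: Image.Image, box) -> imagehash.ImageHash:
    """Hash of the image content inside a head frame given as (x1, y1, x2, y2)."""
    return imagehash.average_hash(image.crop(box))

def first_distance(hash_a: imagehash.ImageHash, hash_b: imagehash.ImageHash) -> int:
    # imagehash defines subtraction as the Hamming distance between two hashes.
    return hash_a - hash_b
```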
In one possible embodiment, in order to accurately determine the hash value of each person's head box, determining the hash value of the person's head box from the image information within the person's head box includes:
determining a target size according to the size of the human head frame and the set multiple;
in the image to be recognized, determining an image area which contains the human head frame and has the size of the target size; and
and determining the hash value of the image area according to the image information in the image area, and determining the hash value of the image area as the hash value of the human head box.
In order to fully use the background information around the head frame when determining its hash value, the region where the head frame is located may be enlarged by a set multiple, and the hash value of the head frame may then be calculated from the enlarged image region. Specifically, in the embodiment of the invention, for each head frame contained in the image to be recognized, the target size is determined from the size of the head frame and the set multiple; in the image to be recognized, an image region that contains the head frame and has the target size is determined, for example by expanding the border of the head frame outward; and the hash value of the head frame is determined from the image information in that image region. For example, if the size of the acquired head frame is 10*10, its position information in the image to be recognized is (10, 20), (20, 30), and the set multiple is 4, the target size is determined to be 20*20, and the position information of the image region that contains the head frame and has the size 20*20 is (5, 15), (25, 35).
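The sketch below follows the worked example above, interpreting the set multiple as an area multiple (4 times the area turns a 10*10 frame into a 20*20 region); this interpretation and the helper name are assumptions made for illustration.

```python
# Sketch of the target-size expansion: enlarge the head frame by the set
# multiple (applied to its area, per the worked example) and clamp to the image.
def expand_head_frame(box, multiple, image_width, image_height):
    """box = (x1, y1, x2, y2); returns the enlarged region of the target size."""
    x1, y1, x2, y2 = box
    w, h = x2 - x1, y2 - y1
    scale = multiple ** 0.5                      # area multiple -> side factor
    dx, dy = (scale - 1) * w / 2, (scale - 1) * h / 2
    return (max(0, x1 - dx), max(0, y1 - dy),
            min(image_width, x2 + dx), min(image_height, y2 + dy))
```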
When choosing the value of the set multiple, different values can be used for different scenes. If as much background information around the head frame as possible is to be taken into account, the multiple can be set larger; if the image region is to be prevented from containing the head frame of another object, the set multiple can be set smaller. Preferably, the set multiple may be 2.
Example 3: the embodiment provides another way for determining whether the target identification information meets the preset updating condition, which is specifically as follows:
acquiring a tracked person head frame corresponding to the target identification information in a current tracking queue;
for every two tracked head frames adjacent in acquisition time, determining the moving distance corresponding to the two tracked head frames according to their image information, and determining the sum of distances from the moving distances; and
and if the number of the tracked human head frames corresponding to the target identification information is greater than a set number threshold value and the sum of the distances is greater than a set distance threshold value, determining that the target identification information meets a preset updating condition.
In practice, the image to be recognized may be collected at the doorway of an application scene such as an office hall, a mall or a bus. Although such an image contains the head frame of each object entering the scene, it may also easily contain the head frame of an object merely passing by the doorway. For example, when counting the passenger flow entering a bus, the image to be recognized is generally collected at the boarding door of the bus, and it may contain not only the head frames of objects boarding the bus but also the head frames of objects passing the doorway. In addition, the application scene may contain a video playing device such as a screen-projection television or a projector; if the acquired image to be recognized contains a picture with people that is being played on the device, the counted passenger flow value may also be inaccurate.
Therefore, when determining whether the target identification information satisfies the preset updating condition, if the target identification information is newly allocated, the approach of the above embodiment would update the currently counted passenger flow value directly once no identical identification information exists in the current tracking queue.
However, to further improve the accuracy of the counted passenger flow value, when no identical identification information exists in the current tracking queue, the currently counted passenger flow value is not updated immediately. Instead, the target head frame and its target identification information are stored in the tracking queue, so that whether the target identification information satisfies the preset updating condition can be determined later based on the number of tracked head frames corresponding to the target identification information in the current tracking queue and the distance moved within the shooting range by the object to which those tracked head frames belong.
Generally, images to be recognized are collected at a preset time interval, or may be collected in real time, and an object entering the application scene takes a certain time to move from entering the shooting range to leaving it, so several images containing its head frame are collected. Moreover, the distance traveled by the object from entering to leaving the shooting range is generally greater than the shortest path through the shooting range, i.e. greater than a certain threshold. Therefore, in the embodiment of the invention, to further improve the accuracy of the counted passenger flow value, the number threshold may be set according to the preset time interval of image collection, the preset shooting range and the average speed of objects, and the distance threshold may be set according to the preset shooting range.
In a specific implementation process, if the number of the tracked human head frames corresponding to the target identification information in the current tracking queue is greater than a number threshold, and the distance traveled by the object corresponding to the tracked human head frame corresponding to the target identification information in the shooting range is greater than a distance threshold, it is determined that the target identification information meets a preset updating condition.
After the target identification information of the target head frame is acquired as in the above embodiment, the tracked head frames corresponding to the target identification information are obtained from the current tracking queue. For every two tracked head frames adjacent in acquisition time, the corresponding moving distance is determined from their image information, and the sum of distances is then determined from the moving distances. Finally, it is judged whether the number of tracked head frames corresponding to the target identification information is greater than the number threshold and whether the obtained sum of distances is greater than the distance threshold.
If the number of tracked head frames corresponding to the target identification information is greater than the number threshold and the obtained sum of distances is greater than the distance threshold, the object corresponding to the head frames of the target identification information is not a "dummy", and the target identification information is determined to satisfy the preset updating condition.
If the number of tracked head frames corresponding to the target identification information is not greater than the number threshold, or the obtained sum of distances is not greater than the distance threshold, the object corresponding to the head frames of the target identification information is likely a passer-by at the doorway of the application scene, or a "dummy" contained in a video played on a video playing device; the target identification information is determined not to satisfy the preset updating condition, and the processing of updating the currently counted passenger flow value is not performed.
The moving distance corresponding to two tracked head frames may be determined from the position information, in the image to be recognized, of pixel points at set positions of the two tracked head frames: for example, the distance between the coordinate values of the pixels at the upper-left corners of the two tracked head frames, the distance between the coordinate values of the pixels at their lower-right corners, or the distance between the coordinate values of the pixels at the midpoints of their diagonals.
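A sketch of this Example 3 updating condition, using the diagonal midpoint of each tracked head frame as the reference pixel (one of the options just listed); the thresholds are preset values and all names are illustrative assumptions.

```python
# Example 3 condition: enough tracked head frames AND enough total movement.
from math import hypot

def center(box):
    x1, y1, x2, y2 = box
    return ((x1 + x2) / 2, (y1 + y2) / 2)

def meets_update_condition(tracked_boxes, number_threshold, distance_threshold) -> bool:
    """tracked_boxes: head frames of one identification, ordered by acquisition time."""
    if len(tracked_boxes) <= number_threshold:
        return False
    total = 0.0
    for prev, curr in zip(tracked_boxes, tracked_boxes[1:]):
        (px, py), (cx, cy) = center(prev), center(curr)
        total += hypot(cx - px, cy - py)   # moving distance of adjacent frames
    return total > distance_threshold
```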
The number threshold is generally not greater than the quotient of the maximum shooting distance in the preset shooting range divided by the product of the time interval and the average speed. To improve the accuracy of the counted passenger flow value, the number threshold should not be too small; to avoid missing objects that walk quickly through the shooting range, it should not be too large. It can be set flexibly according to actual requirements and is not specifically limited here.
The distance threshold is generally not greater than the maximum shooting distance in the preset shooting range. To improve the accuracy of the counted passenger flow value, the distance threshold should not be too small; to avoid missing objects that walk quickly, it should not be too large. It can be set flexibly according to actual requirements and is not specifically limited here.
In the embodiment of the invention, if the number of tracked head frames corresponding to the target identification information in the current tracking queue is greater than the set number threshold and the obtained sum of distances is greater than the set distance threshold, the target identification information is determined to satisfy the preset updating condition. Passers-by at the doorway of the application scene, and "dummy" head frames contained in videos played on video playing devices, can thus be excluded from the acquired image to be recognized, further improving the accuracy of the counted passenger flow value.
Example 4: since the situation that the 'dummy' is stored in the tracking queue to affect the update of the current statistical passenger flow value may occur, in the actual application process, an object which is not the 'dummy' generally moves in the application scene, and each acquired image to be recognized is acquired according to a set time interval, so that the images of the object are included in the plurality of images to be recognized acquired in the process that the same object enters the application scene. In addition, since many pedestrian objects are generally included in the application scene, since each pedestrian object is moving, the background of each pedestrian object and the position thereof also generally change. For the 'dummy' in the poster and the billboard in the application scene, the background and the position of each 'dummy' are not changed generally because the 'dummy' is always motionless. Based on this, in order to further improve the accuracy of the statistical passenger flow value, it may be determined whether an object corresponding to each tracked human head frame corresponding to any one identification information in the current tracking queue is a "dummy" according to the similarity (for convenience of description, that is, the third similarity) between the areas corresponding to any two tracked human head frames of the identification information in the current tracking queue and a preset threshold (for convenience of description, that is, the second threshold), so as to update the tracked human head frames in the current tracking queue.
The third similarity between any two tracked head frames may be determined in a way similar to the determination of the first similarity between a head frame and a verified head frame described above: for example, the hash values corresponding to the two tracked head frames are determined from their image information, the distance between the two hash values is determined, and the third similarity between the two tracked head frames is determined from that distance.
Specifically, the tracked person head box in the current tracking queue can be updated in the following two ways:
in a first mode, in order to update the tracked person head frame in the current tracking queue in time, after the target identification information of any target person head frame is obtained, the tracked person head frame corresponding to the target identification information in the current tracking queue can be updated in real time. Specifically, after the target identification information of the target human head frame is acquired based on the above embodiment, the third similarity between any two tracking human head frames corresponding to the target identification information is acquired in the current tracking queue, and whether each third similarity is greater than the preset second threshold is determined.
If the third similarity between every pair of tracked head frames corresponding to the target identification information is greater than the preset second threshold, any two of these tracked head frames are extremely similar, that is, the image information such as the background in each tracked head frame of the target identification information has not changed. The object corresponding to those tracked head frames is then judged to possibly be a "dummy", the related information of each tracked head frame corresponding to the target identification information is deleted from the current tracking queue, and the processing of updating the currently counted passenger flow value is not performed.
The related information of each tracked head frame includes, but is not limited to, at least one of its position information, its hash value and its acquisition time.
For example, suppose the preset second threshold is 89 and three tracked head frames corresponding to the target identification information 2 are obtained from the tracking queue, with the third similarities of each pair being 92, 90 and 95. Every third similarity corresponding to identification information 2 is greater than the second threshold 89, which indicates that the image information such as the background in these tracked head frames has hardly changed; the corresponding object may therefore be a "dummy", the related information of each tracked head frame corresponding to identification information 2 is deleted from the current tracking queue, and the processing of updating the currently counted passenger flow value is not performed.
In the second mode, to reduce the resources consumed when updating the tracked head frames in the current tracking queue, an updating period is preset. The tracked head frames corresponding to each identification information in the current tracking queue are then updated once per updating period: for each identification information in the current tracking queue, the third similarity between any two of its tracked head frames is obtained, and whether every third similarity is greater than the preset second threshold determines whether the tracked head frames of that identification information are deleted.
It should be noted that the specific determination of whether to delete the tracked person head frame corresponding to the identification information is similar to the determination method in the first embodiment, and repeated parts are not described again.
To facilitate subsequently screening out the "dummy" head frames contained in the image to be recognized, the method further comprises: saving the related information of each tracked head frame corresponding to the target identification information into the current verification queue.
In the embodiment of the invention, after the object corresponding to the tracked head frames of the target identification information in the tracking queue is determined to be a "dummy" as in the above embodiment, the related information of each of those tracked head frames may be stored in the current verification queue. The "dummy" head frames contained in later images to be recognized can then be screened out against the updated verification queue, improving the accuracy of the counted passenger flow value.
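A sketch of the first-mode pruning described in this example, under the assumption that the tracking queue is a dictionary mapping identification information to its tracked head frames and that the verification queue is a list; the similarity function and all names are illustrative.

```python
# Example 4, first mode: if every third similarity between pairs of tracked
# head frames of one identification exceeds the second threshold, treat the
# object as a possible "dummy", drop it from the tracking queue and keep its
# related information in the verification queue for future screening.
from itertools import combinations

def prune_static_identities(tracking_queue, verification_queue,
                            third_similarity_fn, second_threshold) -> None:
    for ident in list(tracking_queue):
        frames = tracking_queue[ident]
        if len(frames) < 2:
            continue
        if all(third_similarity_fn(a, b) > second_threshold
               for a, b in combinations(frames, 2)):
            verification_queue.extend(frames)   # save related information
            del tracking_queue[ident]
```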
Example 5: the embodiment provides another method for determining target identification information of a target person head frame according to a second similarity between the target person head frame and each tracked person head frame in a current tracking queue, including:
respectively determining second similarity of the target person head frame and each tracked person head frame in the current tracking queue;
determining first identification information according to the Hungarian algorithm and each second similarity;
and if the second similarity between the tracked head frame corresponding to the first identification information in the current tracking queue and the target head frame is smaller than the set matching threshold, determining the target identification information of the target head frame according to the feature similarity between the target head frame and each tracked head frame in the current tracking queue and the target threshold corresponding to the target head frame.
Generally, the images to be recognized are collected at a preset time interval, such as 2 s or 3 s; to ensure that every object entering the shooting range is captured in real time, the preset time interval is not set large. In practice, however, the time the same object spends from entering to leaving the shooting range is generally longer than the preset time interval, so several collected images to be recognized may all contain the head frame of the same object. If the passenger flow value were counted directly from the number of target head frames obtained from the images, the same object would be counted several times and the accuracy of the counted value would be low.
To count the passenger flow value accurately, after each target head frame is acquired as in the above embodiment, the second similarity between the target head frame and each tracked head frame in the current tracking queue is determined. For each identification information in the current tracking queue, the maximum second similarity is determined from the second similarities between the target head frame and the tracked head frames corresponding to that identification information. The first identification information is then determined from the Hungarian algorithm and these maximum second similarities.
In a specific implementation, the input of the Hungarian algorithm needs to be a square matrix, so when the input matrix is determined, its dimension is determined by the maximum of the number of currently determined target head frames and the number of identification information stored in the current tracking queue (for convenience of description, the matrix dimension is N*N).
The N*N input matrix is determined from the maximum second similarity between each target head frame and the tracked head frames corresponding to each identification information in the current tracking queue. The element in the i-th row and j-th column of the N*N input matrix is the maximum of the second similarities between the i-th target head frame and the tracked head frames corresponding to the j-th identification information, where i = 1, 2, …, A, A being the number of target head frames; j = 1, 2, …, B, B being the number of identification information in the current tracking queue; and N is the maximum of A and B.
If the number of target head frames equals the number of identification information in the current tracking queue, N is that common number; an N*1 output vector is obtained from the Hungarian algorithm and the N*N input matrix, and the first identification information corresponding to each target head frame is determined from the output vector, i.e. each element of the output vector is the first identification information corresponding to one of the N target head frames.
In another possible implementation, when the number of obtained target head frames is not equal to the number of identification information in the current tracking queue, the missing rows or columns of the currently determined matrix may be filled in to generate an N*N input matrix, so that the first identification information can be determined conveniently. Specifically, a value not less than the maximum value in the currently determined matrix may be chosen at random to fill the missing rows or columns.
It is noted that the elements of each dimension in the appended row or column are generally the same, but it is not excluded that the elements of each dimension in the appended row or column may also be different values.
For example, suppose 2 target head frames are currently determined and 4 identification information are stored in the tracking queue, so N is 4. Only a 2*4 matrix can be determined from the maximum second similarities between each target head frame and the tracked head frames of the 4 identification information. To determine the first identification information conveniently, the missing rows of the 2*4 matrix are filled in: its maximum value is 0.8, a value not less than 0.8, such as 1, is chosen at random, and a 2*4 filling matrix composed of 1s is appended to the 2*4 matrix, generating the 4*4 input matrix.
For another example, suppose 4 target head frames are currently determined and 2 identification information are stored in the tracking queue, so N is 4. Only a 4*2 matrix can be determined from the maximum second similarities between each target head frame and the tracked head frames of the 2 identification information. To determine the first identification information conveniently, the missing columns of the 4*2 matrix are filled in: its maximum value is 0.95, a value not less than 0.95, such as 0.95, is chosen at random, and a 4*2 filling matrix composed of 0.95s is appended to the 4*2 matrix, generating the 4*4 input matrix.
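The sketch below carries out the matching step just described, padding the similarity matrix to N*N with a value not smaller than its maximum and treating the assignment as a maximization over similarities. scipy.optimize.linear_sum_assignment is an assumed solver choice; the patent only names the Hungarian algorithm, and the threshold comparison described next is applied to the result afterwards.

```python
# Hungarian matching over the padded N x N similarity matrix (a sketch).
import numpy as np
from scipy.optimize import linear_sum_assignment

def match_targets_to_identities(sim):
    """sim: A x B matrix whose (i, j) entry is the maximum second similarity
    between target head frame i and the tracked head frames of identification j.
    Returns, per target head frame, the matched column index, or None when it
    was matched to a padded column (i.e. no existing identification fits)."""
    a, b = sim.shape
    n = max(a, b)
    pad_value = sim.max() if sim.size else 0.0   # value not less than the max
    padded = np.full((n, n), float(pad_value))
    padded[:a, :b] = sim
    rows, cols = linear_sum_assignment(padded, maximize=True)
    first_ids = [None] * a
    for r, c in zip(rows, cols):
        if r < a:
            first_ids[r] = int(c) if c < b else None
    return first_ids
```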
To determine the target identification information of the target head frame accurately, after the first identification information is acquired as in the above embodiment, the tracked head frames corresponding to the first identification information are determined, the second similarities between the target head frame and those tracked head frames are obtained, and each second similarity is compared with the set matching threshold to determine the target identification information of the target head frame.
Specifically, if the second similarity between any tracked head frame corresponding to the first identification information in the current tracking queue and the target head frame is greater than the set matching threshold, the target head frame is sufficiently similar to the tracked head frames corresponding to the first identification information, and the target identification information of the target head frame is determined to be the first identification information.
The facial features contained in head frames of the same object obtained from different images to be recognized may be partially occluded, which lowers the second similarity between those head frames and may cause head frames of the same object to be mistakenly recognized as head frames of different objects. For example, in two consecutive images to be recognized that both contain a certain person A, the head frame of A obtained from one image is occluded by an object B while the head frame of A obtained from the other image is not occluded, so the second similarity between the two head frames of A determined from the two images is small. Therefore, in order to determine the target identification information of the target person head frame more accurately, in the embodiment of the present invention, when it is determined as described above that the second similarity between the tracked person head frame corresponding to the first identification information in the current tracking queue and the target person head frame is smaller than the set matching threshold, the feature vector of the target person head frame and the pre-stored feature vector corresponding to each tracked person head frame in the current tracking queue are obtained. For each tracked person head frame in the current tracking queue, the feature similarity between the target person head frame and that tracked person head frame is then determined according to the two feature vectors. Further analysis is performed based on the obtained feature similarities and the target threshold corresponding to the target person head frame, and the target identification information of the target person head frame is determined.
The feature vector of the target person head frame may be obtained through an image recognition model, or may be determined through other algorithms, which is not specifically limited herein. Preferably, in the embodiment of the present invention, the feature vector is obtained through an image recognition model.
In one possible implementation, if the feature vector of the head frame of the target person is obtained through an image recognition model, the image recognition model is trained in advance in order to accurately obtain the feature vector of the head frame of the target person. Specifically, the image recognition model is obtained as follows:
acquiring any sample head frame in a sample set and its corresponding sample identification information;
acquiring third identification information corresponding to the sample human head frame through a deep learning network model;
and training the deep learning network model according to the sample identification information and the third identification information to obtain an image recognition model.
The feature vector of the head frame of the target person can be obtained through a feature extraction layer in the image recognition model.
In order to accurately obtain the feature vector of the target person head frame, the deep learning network model can be trained according to each sample head frame in the pre-obtained sample set and the corresponding sample identification information. The sample identification information is used to identify the identity of the object corresponding to the sample head frame; it may be represented by numbers, letters, character strings and the like, or in other forms, as long as the sample identification information of head frames belonging to different objects is different.
In addition, in order to increase the diversity of the sample head frames, the sample head frames of the same sample identification information include sample head frames with different angles, such as a sample head frame including a front face, a sample head frame including a side face turned 45 degrees to the right, a sample head frame including a side face turned 45 degrees to the left, and the like.
The device for training the image recognition model may be the same as or different from the electronic device for performing statistics on the passenger flow value.
Through the deep learning network model, identification information (for convenience of explanation, recorded as third identification information) corresponding to the sample human head frame can be identified, and the deep learning network model is trained according to the third identification information and the sample identification information corresponding to the sample human head frame so as to adjust parameter values of each parameter of the deep learning network, and thus the image identification model is obtained.
For example, the sample identification information corresponding to the sample human head frame is a, the third identification information corresponding to the sample human head frame is identified as B through the deep learning network model, the third identification information B is inconsistent with the corresponding sample identification information a, and it is determined that the identification information of the sample human head frame is mistakenly identified by the deep learning network model.
The sample set contains a large number of sample head frames; the above operations are carried out for each sample head frame, and model training is completed when a preset convergence condition is met.
The preset convergence condition may be, for example, that the number of sample head frames in the sample set whose identification information is correctly identified is greater than a set number. The specific implementation can be flexibly set and is not particularly limited herein.
In a possible embodiment, when performing the training of the image recognition model, the sample frames of the sample set may be divided into training samples and test samples, the image recognition model is trained based on the training samples and then the reliability of the trained image recognition model is verified based on the test samples.
After the trained image recognition model is obtained, the feature vector of the head frame of the target person can be obtained through a feature extraction layer in the image recognition model subsequently.
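As a non-limiting sketch of this training procedure, the following Python code trains a classifier over the sample identification information and exposes a feature extraction layer whose output serves as the feature vector. The use of PyTorch, the toy convolutional backbone, and the data loader that yields (head-frame crop, sample ID) pairs are all assumptions of the sketch, not details of the embodiment.

import torch
import torch.nn as nn

class HeadRecognitionModel(nn.Module):
    def __init__(self, num_ids: int, feat_dim: int = 128):
        super().__init__()
        # Feature extraction layer(s): the output of self.features is the
        # feature vector later used for head-frame similarity.
        self.features = nn.Sequential(
            nn.Conv2d(3, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(64, feat_dim),
        )
        # Classification head over the sample identification information.
        self.classifier = nn.Linear(feat_dim, num_ids)

    def forward(self, x):
        feat = self.features(x)
        return self.classifier(feat), feat

def train(model, loader, epochs: int = 10, lr: float = 1e-3):
    criterion = nn.CrossEntropyLoss()
    optimizer = torch.optim.Adam(model.parameters(), lr=lr)
    for _ in range(epochs):                       # stand-in for the convergence condition
        for crops, sample_ids in loader:          # sample head-frame crops and their IDs
            logits, _ = model(crops)
            loss = criterion(logits, sample_ids)  # mismatch between predicted and sample ID drives the update
            optimizer.zero_grad()
            loss.backward()
            optimizer.step()

# At inference time only the feature extraction layer is used:
#   _, feature_vector = model(target_head_frame_crop)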
In order to further accurately determine the target identification information of the head frame of the target person, on the basis of the above embodiment, in an embodiment of the present invention, determining the target identification information of the head frame of the target person according to the feature similarity between the head frame of the target person and each head frame of the tracked persons in the current tracking queue and a target threshold corresponding to the head frame of the target person includes:
determining second identification information according to the Hungarian algorithm and each feature similarity;
if the feature similarity between any tracked person head frame corresponding to the second identification information in the current tracking queue and the target person head frame is larger than a target threshold corresponding to the target person head frame, determining the target identification information of the target person head frame as the second identification information; and if not, distributing new identification information for the head frame of the target person as the target identification information of the head frame of the target person.
After the feature similarity between the head box of the target person and each of the head boxes of the trackers in the current tracking queue is obtained based on the above embodiment, second identification information corresponding to the head box of the target person is determined according to each feature similarity and the Hungarian algorithm.
In the specific implementation process, the input of the Hungarian algorithm is necessarily a matrix with equal rows and columns, so the dimension of the input matrix required by the Hungarian algorithm is determined according to the maximum value of the number of the head boxes of the target person determined currently and the number of the head boxes of the tracked person stored in the current tracking queue.
The input matrix is determined based on the feature similarity between each target person head frame and each tracked person head frame in the current tracking queue. The element in the i-th row and j-th column of the N × N input matrix is the feature similarity between the i-th target person head frame and the j-th tracked person head frame, where i = 1, 2, …, A, with A being the number of target person head frames; j = 1, 2, …, C, with C being the number of tracked person head frames in the current tracking queue; and N is the maximum of A and C.
It should be noted that, through the hungarian algorithm and the similarity of each feature, the process of determining the second identification information is similar to the method of determining the first identification information, and repeated parts are not repeated.
In order to accurately determine the target identification information of the target person head frame, after the second identification information is acquired based on the above embodiment, the tracked person head frame corresponding to the second identification information is determined, the feature similarity between the target person head frame determined in the above embodiment and the tracked person head frame corresponding to the second identification information is acquired, and the feature similarity is compared with the target threshold corresponding to the target person head frame, so as to determine whether the second identification information is the target identification information of the target person head frame.
Specifically, if the feature similarity between any tracked person head frame corresponding to the second identification information in the current tracking queue and the target person head frame is greater than the target threshold corresponding to the target person head frame, the target person head frame is highly similar to the tracked person head frame corresponding to the second identification information, and the target identification information of the target person head frame is determined to be the second identification information. If none of the feature similarities between the tracked person head frames corresponding to the second identification information and the target person head frame is greater than the target threshold corresponding to the target person head frame, the target person head frame is not sufficiently similar to the tracked person head frame corresponding to the second identification information and is likely to be the head frame of an object that has just entered the application scene; new identification information is therefore allocated to the target person head frame, and the target identification information of the target person head frame is determined to be the new identification information.
The target threshold corresponding to the target person head frame may be a preset value, or may be determined according to the size of the target person head frame. The setting can be flexibly performed, and is not particularly limited herein.
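Purely as an illustration of the two-stage determination described above (first identification information from the second similarity, then second identification information from the feature similarity compared with the target threshold), the following sketch uses a greedy best-match in place of the Hungarian-algorithm assignment and cosine similarity as the feature similarity; in the specific flow below, the second similarity is an area-overlap proportion. These simplifications and all names are assumptions of the sketch.

import numpy as np

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12))

def assign_target_id(target_feature, tracks, match_threshold, target_threshold, next_id):
    # tracks: list of dicts with keys 'id', 'second_sim', 'feature' (assumed layout).
    if tracks:
        # Stage 1: first identification information from the second similarity.
        best = max(tracks, key=lambda t: t["second_sim"])
        if best["second_sim"] > match_threshold:
            return best["id"], next_id
        # Stage 2: second identification information from the feature similarity.
        best = max(tracks, key=lambda t: cosine_similarity(target_feature, t["feature"]))
        if cosine_similarity(target_feature, best["feature"]) > target_threshold:
            return best["id"], next_id
    # Otherwise the object has just entered the scene: allocate new identification information.
    return next_id, next_id + 1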
Example 6: the following describes a statistical method for a passenger flow value provided in an embodiment of the present invention by a specific embodiment, and fig. 2 is a schematic flow chart of statistics for a specific passenger flow value provided in an embodiment of the present invention, where the flow chart includes:
S201: And acquiring a human head frame contained in the image to be recognized.
Since a plurality of human head frames included in the image to be recognized may be acquired through the above steps, for convenience of description, the following steps are performed for any one of the acquired human head frames in the image to be recognized:
s202: and determining the Hash value of the human head frame according to the image information contained in the human head frame, judging whether the Hamming distances between the Hash value of the human head frame and the Hash value of each verified human head frame in the current verification queue are smaller than a preset first threshold value, if so, executing S204, and otherwise, executing S203.
S203: it is determined not to perform S205 to S213.
S204: and determining the human head frame as a target human head frame.
After each target person head frame included in the image to be recognized is determined based on the steps of S202 to S204 described above, for convenience of explanation, the following steps are performed for any one of the acquired target person head frames:
S205: and determining the area overlapping proportion of the target person head frame and the tracked person head frame of each piece of identification information in the current tracking queue.
S206: and determining first identification information according to the Hungarian algorithm and the overlapping proportion of each area, judging whether the area overlapping proportion of any tracker head box corresponding to the first identification information in the current tracking queue and the target person head box is larger than a set matching threshold value, if so, executing S207, otherwise, executing S208.
S207: the target identification information of the target person' S head box is determined as the first identification information, and then S212 is performed.
S208: and acquiring the feature similarity of the head frame of the target person and each tracked person according to the feature vector of the head frame of the target person acquired through the feature extraction layer in the image recognition model and the feature vector corresponding to each tracked person in the current tracking queue.
S209: and determining second identification information according to the Hungarian algorithm and each feature similarity, judging whether the feature similarity between any tracker head box corresponding to the second identification information in the current tracking queue and the target person head box is larger than a target threshold corresponding to the target person head box, if so, executing S210, and otherwise, executing S211.
S210: the target identification information of the target person' S head box is determined as the second identification information, and then S212 is performed.
S211: new identification information is allocated to the target person 'S head box and the target identification information of the target person' S head box is determined to be the new identification information, and then S212 is performed.
S212: and judging whether the third similarity of any two tracked human head frames corresponding to the target identification information is greater than a preset second threshold, if so, executing S213, and otherwise, executing S215.
S213: and storing the related information of each tracked person head box corresponding to the target identification information in a verification queue.
S214: and deleting the related information of each tracked person head box corresponding to the target identification information in the current tracking queue, and not executing S215-S219.
S215: and acquiring a tracked person head frame corresponding to the target identification information in the current tracking queue.
S216: and determining the corresponding moving distances of the two tracked human head frames according to the image information of every two tracked human head frames adjacent to the acquisition time, and determining the sum of the distances according to each moving distance.
S217: and judging whether the number of the tracked human head frames corresponding to the target identification information is greater than a set number threshold value and whether the sum of the acquired distances is greater than a set distance threshold value, if so, executing S218, otherwise, executing S219.
S218: and updating the current statistical passenger flow value.
S219: and saving the relevant information of the head box of the target person in a current tracking queue.
Example 7: in order to further improve the accuracy of the statistical passenger flow value, on the basis of the foregoing embodiment, in an embodiment of the present invention, determining the target threshold corresponding to the target person head box includes: and determining a target threshold corresponding to the size of the head frame of the target person according to the corresponding relation between the image size and the threshold.
In the practical application process, across multiple images to be recognized the facial features contained in head frames of the same person at different positions differ, and the sizes of head frames at different positions in the images to be recognized also differ. For example, when a person C is in the upper-left corner of one image to be recognized, the head frame of C is small and the facial features it contains are relatively blurred, whereas when C is at the center of another image to be recognized, the head frame of C is large and the facial features it contains are relatively clear. Consequently, when the feature similarity is calculated between the feature vector of a head frame with clear facial features and the feature vector of a head frame with blurred facial features of the same object, the feature similarity is low and generally cannot be greater than a fixed set threshold, so the same person is easily mistakenly recognized as two different people, which reduces the accuracy of the statistical passenger flow value.
Therefore, in order to further improve the accuracy of the statistical passenger flow value, in the embodiment of the present invention, a correspondence between the image size and the threshold is stored in advance, and when it is necessary to determine the target identification information of the target person head frame according to each feature similarity corresponding to the target person head frame and the target threshold corresponding to the target person head frame, the target threshold corresponding to the size of the target person head frame is determined according to the correspondence between the image size and the threshold stored in advance.
As a possible implementation, the correspondence of the image size to the threshold is determined according to the following:
acquiring a sample characteristic vector of each sample head frame in a sample set, wherein each sample head frame corresponds to sample identification information;
for each sample head frame corresponding to each piece of sample identification information, acquiring the sample feature similarity between the sample feature vector of the sample head frame and the sample feature vector of the target sample head frame corresponding to that sample identification information; and
and determining a threshold value corresponding to each image size according to the size of the head frame of each sample and the corresponding sample characteristic similarity.
In order to accurately determine the corresponding threshold according to target person head frames at different positions, in the embodiment of the invention, a sample set is collected in advance, and the sample set comprises each sample head frame and the sample identification information corresponding to each sample head frame. After the sample feature vector of each sample head frame is obtained through the image recognition model, the sample feature similarity between the sample feature vector of each sample head frame and the sample feature vector of the target sample head frame corresponding to the same sample identification information is obtained for the sample head frames corresponding to each piece of sample identification information in the sample set.
In an actual application scene, the same object usually approaches the image acquisition device gradually from a distance, so the head frame of the object in the earliest acquired image to be recognized containing that object is generally the smallest and contains the fewest features. In the images to be recognized containing the object that are acquired subsequently, the head frame becomes larger and the image features inside it become clearer as the object approaches. Accordingly, for the same object, the larger the image sizes of two head frames whose image features are clear, the larger the feature similarity determined according to the features of those two head frames; the smaller the image sizes, the smaller the determined feature similarity. The image size of a head frame and the feature similarity therefore have a certain linear relationship. In the practical application process, the sample head frame with the earliest acquisition time can generally be determined as the target sample head frame corresponding to the sample identification information. Other possible choices are not excluded, and the above embodiment does not limit the target sample head frame corresponding to the sample identification information.
After the sample feature similarity corresponding to each sample head frame is acquired as described above, the threshold corresponding to each image size is determined according to the size of each sample head frame and the corresponding sample feature similarity. Specifically, fitting is performed with the size of the sample head frame as the horizontal axis and the sample feature similarity as the vertical axis, and a fitted curve is determined, for example D = f(x), where x is the size of the sample head frame and D is the threshold. The correspondence between the image size and the threshold is then determined according to the fitted curve. The specific fitting process belongs to the prior art and is not described herein again.
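As an illustrative sketch only, the fitting of D = f(x) and the subsequent threshold lookup might be implemented as below; the polynomial form, the degree, and the example numbers are assumptions of the sketch.

import numpy as np

def fit_size_threshold_curve(sizes, similarities, degree: int = 2):
    # sizes: sample head-frame sizes (e.g. areas); similarities: the matching sample feature similarities.
    coeffs = np.polyfit(np.asarray(sizes, float), np.asarray(similarities, float), degree)
    return np.poly1d(coeffs)          # callable fitted curve D = f(x)

def target_threshold_for(head_frame_size: float, curve) -> float:
    # Dynamic target threshold for a target head frame of the given size.
    return float(curve(head_frame_size))

# Usage sketch with made-up numbers:
curve = fit_size_threshold_curve([400, 900, 1600, 2500], [0.55, 0.65, 0.75, 0.85])
print(target_threshold_for(1200.0, curve))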
Example 8: the following describes a statistical method for passenger flow values according to a specific embodiment, and fig. 3 is a schematic diagram of a statistical process for passenger flow values according to the specific embodiment of the present invention, where the process includes three parts, namely training of an image recognition model, determining a correspondence between an area of a human head box and a dynamic threshold, and counting passenger flow values, and each part is described in detail below:
a first part: and training an image recognition model.
S301: and training the deep learning network model to obtain an image recognition model.
The first electronic device acquires any sample head frame in a sample set and the corresponding sample identification information; acquires third identification information corresponding to the sample head frame through a deep learning network model; and trains the deep learning network model according to the sample identification information and the third identification information to obtain an image recognition model, so that the feature vector of the target person head frame can subsequently be obtained through a feature extraction layer in the image recognition model.
In the process of training the image recognition model, an offline mode is generally adopted, and the first electronic device trains the deep learning network model in advance according to a sample human head frame in a sample set and corresponding sample identification information to obtain the image recognition model.
It should be noted that the first electronic device and the second electronic device that performs subsequent passenger flow value statistics may be the same or different, and are not specifically limited herein.
A second part: and determining the corresponding relation between the area of the human head frame and the dynamic threshold value.
S302: and determining the corresponding relation between the image size and the threshold value.
The first electronic device respectively acquires, through the deep learning network model, a sample feature vector of each sample head frame in the sample set, where each sample head frame corresponds to sample identification information; for the sample head frames corresponding to each piece of sample identification information, acquires the sample feature similarity between the sample feature vector of the sample head frame and the sample feature vector of the target sample head frame corresponding to that sample identification information; and determines the threshold corresponding to each image size according to the size of each sample head frame and the corresponding sample feature similarity.
The part may also be completed in a second electronic device for subsequent statistics of the passenger flow value, and the execution subject of the part is not limited in this embodiment.
And a third part: counting passenger flow values, where the passenger flow value is counted by the second electronic device based on the image recognition model obtained through the training performed by the first electronic device, which specifically includes the following steps:
S303: And acquiring a human head frame contained in the image to be recognized.
Since a plurality of human head frames included in the image to be recognized may be acquired through the above steps, for convenience of description, the following steps are performed for any one of the acquired human head frames in the image to be recognized:
s304: and judging that the first similarity between the human head frame and each verified human head frame in the current verification queue is smaller than a preset first threshold, if so, executing S305, and otherwise, executing S306.
S305: and determining the human head frame as a target human head frame.
S306: it is determined not to perform S307 to S310.
After each target person head frame included in the image to be recognized is determined based on the steps of S304 to S306 described above, for convenience of explanation, the following steps are performed for any one of the acquired target person head frames:
s307: and determining the target identification information of the target person head frame according to the second similarity of the target person head frame and each tracked person head frame in the current tracking queue.
S308: and judging whether the target identification information meets preset updating conditions, if so, executing S309, otherwise, executing S310.
S309: and updating the current statistical passenger flow value.
S310: and saving the relevant information of the head box of the target person in a current tracking queue.
Example 9: fig. 4 is a schematic structural diagram of a passenger flow value statistics device according to an embodiment of the present invention, where the passenger flow value statistics device provided by the embodiment of the present invention includes:
an obtaining unit 41, configured to obtain a frame of a person's head included in an image to be recognized;
the judging unit 42 is configured to, for each head frame, determine that the head frame is a target head frame if the first similarity between the head frame and each verified head frame in the current verification queue is smaller than a preset first threshold;
a processing unit 43, configured to determine, for each target person head frame, target identification information of the target person head frame according to a second similarity between the target person head frame and each tracked person head frame in the current tracking queue; and if the target identification information meets the preset updating condition, updating the current statistical passenger flow value.
In a possible implementation manner, the determining unit 42 is specifically configured to:
determining the hash value of the human head box according to the image information in the human head box; and for each verified person head frame, determining a first distance between the hash value of the person head frame and the hash value of the verified person head frame, and determining the first distance as the first similarity between the person head frame and the verified person head frame.
In a possible implementation manner, the determining unit 42 is specifically configured to:
determining a target size according to the size of the human head frame and the set multiple; in the image to be recognized, determining an image area which contains the human head frame and has the size of the target size; and determining the hash value of the image area according to the image information contained in the image area, and determining the hash value of the image area as the hash value of the human head box.
In a possible implementation manner, the determining unit 42 is further configured to not perform the process of updating the current statistical passenger flow value if the first similarity between the human head box and any verified human head box in the current verification queue is not less than the first threshold.
In a possible implementation, the processing unit 43 is specifically configured to:
acquiring a tracked person head frame corresponding to the target identification information in a current tracking queue; determining the moving distance corresponding to every two tracked person head frames adjacent in acquisition time according to the image information of those two tracked person head frames, and determining the sum of the distances according to each moving distance; and if the number of the tracked person head frames corresponding to the target identification information is larger than a set number threshold and the sum of the distances is larger than a set distance threshold, determining that the target identification information meets a preset updating condition.
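A minimal sketch of this update condition, assuming head frames are axis-aligned boxes and the moving distance is measured between box centers (both assumptions of the sketch, not requirements of the embodiment):

import math

def center(box):
    # box = (x1, y1, x2, y2) of a tracked head frame (assumed layout).
    return ((box[0] + box[2]) / 2.0, (box[1] + box[3]) / 2.0)

def movement_sum(tracked_boxes):
    # Sum of distances between head frames adjacent in acquisition time.
    total = 0.0
    for prev, curr in zip(tracked_boxes, tracked_boxes[1:]):
        (x1, y1), (x2, y2) = center(prev), center(curr)
        total += math.hypot(x2 - x1, y2 - y1)
    return total

def satisfies_update_condition(tracked_boxes, number_threshold, distance_threshold):
    return (len(tracked_boxes) > number_threshold
            and movement_sum(tracked_boxes) > distance_threshold)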
In a possible implementation manner, the processing unit 43 is further configured to delete, in the current tracking queue, the relevant information of each tracked person head box corresponding to the target identification information, and not perform a process of updating the current statistical passenger flow value, if the third similarity of any two tracked person head boxes corresponding to the target identification information is greater than a preset second threshold.
In a possible implementation manner, the processing unit 43 is further configured to save, in the current verification queue, related information of each tracked person head box corresponding to the target identification information.
In a possible implementation, the processing unit 43 is specifically configured to:
respectively determining second similarity of the target person head frame and each tracked person head frame in the current tracking queue; determining first identification information according to the Hungarian algorithm and each second similarity; and if the second similarity of the tracked person head frame corresponding to the first identification information in the current tracking queue and the target person head frame is smaller than the set matching threshold, determining the target identification information of the target person head frame according to the feature similarity of the target person head frame and each tracked person head frame in the current tracking queue and the target threshold corresponding to the target person head frame.
In a possible implementation, the processing unit 43 is specifically configured to:
determining second identification information according to the Hungarian algorithm and the similarity of each feature; if the feature similarity between any tracked person head frame corresponding to the second identification information in the current tracking queue and the target person head frame is larger than a target threshold corresponding to the target person head frame, determining the target identification information of the target person head frame as the second identification information; and if not, distributing new identification information for the head frame of the target person as the target identification information of the head frame of the target person.
In a possible implementation, the processing unit 43 is specifically configured to: and determining a target threshold corresponding to the size of the head frame of the target person according to the corresponding relation between the image size and the threshold.
In a possible implementation manner, the obtaining unit 41 is further configured to obtain a sample feature vector of each sample head frame in the sample set, where each sample head frame corresponds to sample identification information;
the processing unit 43 is further configured to, for each sample head frame corresponding to each sample identification information, obtain a sample feature vector of the sample head frame and a sample feature similarity of a sample feature vector of a target sample head frame corresponding to the sample identification information; and determining a threshold value corresponding to each image size according to the size of the head frame of each sample and the corresponding sample characteristic similarity.
Example 10: fig. 5 is a schematic structural diagram of an electronic device according to an embodiment of the present invention, and on the basis of the foregoing embodiments, an embodiment of the present invention further provides an electronic device, as shown in fig. 5, including: a processor 51, a communication interface 52, a memory 53 and a communication bus 54, wherein the processor 51, the communication interface 52 and the memory 53 communicate with each other through the communication bus 54;
the memory 53 has stored therein a computer program which, when executed by the processor 51, causes the processor 51 to perform the steps of:
acquiring a human head frame contained in an image to be identified; for each head frame, if the first similarity between the head frame and each verified head frame in the current verification queue is smaller than a preset first threshold value, determining that the head frame is a target head frame; for each target person head frame, determining target identification information of the target person head frame according to the second similarity between the target person head frame and each tracked person head frame in the current tracking queue; and if the target identification information meets the preset updating condition, updating the current statistical passenger flow value.
Because the principle of the electronic device for solving the problem is similar to the statistical method of the passenger flow value, the implementation of the electronic device can refer to the implementation of the method, and repeated details are not repeated.
The communication bus mentioned in the electronic device may be a Peripheral Component Interconnect (PCI) bus, an Extended Industry Standard Architecture (EISA) bus, or the like. The communication bus may be divided into an address bus, a data bus, a control bus, etc. For ease of illustration, only one thick line is shown, but this does not mean that there is only one bus or one type of bus.
The communication interface 52 is used for communication between the above-described electronic apparatus and other apparatuses.
The Memory may include a Random Access Memory (RAM) or a Non-Volatile Memory (NVM), such as at least one disk Memory. Alternatively, the memory may be at least one memory device located remotely from the processor.
The processor may be a general-purpose processor, including a central processing unit, a Network Processor (NP), and the like; it may also be a Digital Signal Processor (DSP), an application specific integrated circuit, a field programmable gate array or other programmable logic device, discrete gate or transistor logic, discrete hardware components, or the like.
Example 11: on the basis of the foregoing embodiments, the present invention further provides a computer-readable storage medium, in which a computer program executable by a processor is stored, and when the program runs on the processor, the processor is caused to execute the following steps:
acquiring a human head frame contained in an image to be identified; for each head frame, if the first similarity between the head frame and each verified head frame in the current verification queue is smaller than a preset first threshold value, determining that the head frame is a target head frame; for each target person head frame, determining target identification information of the target person head frame according to the second similarity between the target person head frame and each tracked person head frame in the current tracking queue; and if the target identification information meets the preset updating condition, updating the current statistical passenger flow value.
Since the principle of solving the problem by the computer-readable storage medium is similar to that of the statistical method of the passenger flow value, the specific implementation can refer to the implementation of the above statistical method of the passenger flow value, and repeated details are not repeated.
As will be appreciated by one skilled in the art, embodiments of the present application may be provided as a method, system, or computer program product. Accordingly, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present application may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.
It will be apparent to those skilled in the art that various changes and modifications may be made in the present application without departing from the spirit and scope of the application. Thus, if such modifications and variations of the present application fall within the scope of the claims of the present application and their equivalents, the present application is intended to include such modifications and variations as well.

Claims (10)

1. A statistical method of passenger flow values, characterized in that the method comprises:
acquiring a human head frame contained in an image to be identified;
for each head frame, if the first similarity between the head frame and each verified head frame in the current verification queue is smaller than a preset first threshold value, determining that the head frame is a target head frame;
for each target person head frame, determining target identification information of the target person head frame according to the second similarity between the target person head frame and each tracked person head frame in the current tracking queue; and if the target identification information meets the preset updating condition, updating the current statistical passenger flow value.
2. The method of claim 1, wherein determining a first similarity of the person's head box to each verified person's head box in a current verification queue comprises:
determining the hash value of the human head box according to the image information in the human head box;
and for each verified person head frame, determining a first distance between the hash value of the person head frame and the hash value of the verified person head frame, and determining the first distance as the first similarity between the person head frame and the verified person head frame.
3. The method of claim 2, wherein determining the hash value of the person's head box from the image information within the person's head box comprises:
determining a target size according to the size of the human head frame and the set multiple;
in the image to be recognized, determining an image area which contains the human head frame and has the size of the target size; and
and determining the hash value of the image area according to the image information in the image area, and determining the hash value of the image area as the hash value of the human head box.
4. The method of claim 1, further comprising:
and if the first similarity between the people head frame and any verified people head frame in the current verification queue is not smaller than the first threshold, the processing of updating the current statistical passenger flow value is not executed.
5. The method of claim 1, wherein determining that the target identification information satisfies a preset update condition comprises:
acquiring a tracked person head frame corresponding to the target identification information in a current tracking queue;
determining the moving distance corresponding to every two tracked human head frames adjacent in acquisition time according to the image information of those two tracked human head frames, and determining the sum of the distances according to each moving distance; and
and if the number of the tracked human head frames corresponding to the target identification information is greater than a set number threshold value and the sum of the distances is greater than a set distance threshold value, determining that the target identification information meets a preset updating condition.
6. The method of claim 1, further comprising:
and if the third similarity of any two tracked person head frames corresponding to the target identification information is greater than a preset second threshold, deleting the relevant information of each tracked person head frame corresponding to the target identification information in the current tracking queue, and not executing the processing of updating the current statistical passenger flow value.
7. The method according to claim 1, wherein the determining the target identification information of the target person head frame according to the second similarity of the target person head frame and each tracked person head frame in the current tracking queue comprises:
respectively determining second similarity of the target person head frame and each tracked person head frame in the current tracking queue;
determining first identification information according to the Hungarian algorithm and each second similarity;
and if the second similarity of the tracked person head frame corresponding to the first identification information in the current tracking queue and the target person head frame is smaller than the set matching threshold, determining the target identification information of the target person head frame according to the feature similarity of the target person head frame and each tracked person head frame in the current tracking queue and the target threshold corresponding to the target person head frame.
8. A statistical apparatus of passenger flow values, characterized in that the apparatus comprises:
the acquisition unit is used for acquiring a human head frame contained in the image to be identified;
the judging unit is used for determining that the human head frame is a target human head frame if the first similarity between the human head frame and each verified human head frame in the current verification queue is smaller than a preset first threshold value;
the processing unit is used for determining target identification information of each target person head frame according to the second similarity between the target person head frame and each tracked person head frame in the current tracking queue; and if the target identification information meets the preset updating condition, updating the current statistical passenger flow value.
9. An electronic device, characterized in that the electronic device comprises at least a processor and a memory, the processor being adapted to carry out the steps of the statistical method of the passenger flow values according to any one of claims 1-7 when executing a computer program stored in the memory.
10. A computer-readable storage medium, characterized in that it stores a computer program which, when being executed by a processor, carries out the steps of the statistical method of the passenger flow values according to any one of claims 1 to 7.
CN202010664483.0A 2020-07-10 2020-07-10 Passenger flow value statistical method, device, equipment and medium Active CN111860261B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010664483.0A CN111860261B (en) 2020-07-10 2020-07-10 Passenger flow value statistical method, device, equipment and medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010664483.0A CN111860261B (en) 2020-07-10 2020-07-10 Passenger flow value statistical method, device, equipment and medium

Publications (2)

Publication Number Publication Date
CN111860261A true CN111860261A (en) 2020-10-30
CN111860261B CN111860261B (en) 2023-11-03

Family

ID=72984018

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010664483.0A Active CN111860261B (en) 2020-07-10 2020-07-10 Passenger flow value statistical method, device, equipment and medium

Country Status (1)

Country Link
CN (1) CN111860261B (en)

Patent Citations (13)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20090010490A1 (en) * 2007-07-03 2009-01-08 Shoppertrak Rct Corporation System and process for detecting, tracking and counting human objects of interest
CN101872414A (en) * 2010-02-10 2010-10-27 杭州海康威视软件有限公司 People flow rate statistical method and system capable of removing false targets
CN103577832A (en) * 2012-07-30 2014-02-12 华中科技大学 People flow statistical method based on spatio-temporal context
CN105512720A (en) * 2015-12-15 2016-04-20 广州通达汽车电气股份有限公司 Public transport vehicle passenger flow statistical method and system
CN105631418A (en) * 2015-12-24 2016-06-01 浙江宇视科技有限公司 People counting method and device
CN106548451A (en) * 2016-10-14 2017-03-29 青岛海信网络科技股份有限公司 A kind of car passenger flow crowding computational methods and device
WO2018121127A1 (en) * 2016-12-30 2018-07-05 苏州万店掌网络科技有限公司 System for collecting statistics on pedestrian traffic by means of tracking based on video analysis technique
CN108416250A (en) * 2017-02-10 2018-08-17 浙江宇视科技有限公司 Demographic method and device
CN108932464A (en) * 2017-06-09 2018-12-04 北京猎户星空科技有限公司 Passenger flow volume statistical method and device
CN110260810A (en) * 2018-03-12 2019-09-20 深圳鼎然信息科技有限公司 The vehicles, which multiply, carries demographic method, device, equipment and medium
CN108764017A (en) * 2018-04-03 2018-11-06 广州通达汽车电气股份有限公司 Public traffice passenger flow statistical method, apparatus and system
US20200193347A1 (en) * 2018-12-17 2020-06-18 Beijing Baidu Netcom Science Technology Co., Ltd. Vehicle passenger flow statistical method, apparatus, device, and storage medium
CN110263703A (en) * 2019-06-18 2019-09-20 腾讯科技(深圳)有限公司 Personnel's flow statistical method, device and computer equipment

Non-Patent Citations (4)

* Cited by examiner, † Cited by third party
Title
LIJUAN LIU et al.: "A novel passenger flow prediction model using deep learning methods", Transportation Research Part C, vol. 84, pages 74-91
ZHANG Yajun: "Research on people flow counting methods in complex scenes", China Master's Theses Full-text Database, Information Science and Technology, no. 2018, pages 138-3106
ZHU Pan: "Embedded bus passenger flow counting system based on machine learning", China Master's Theses Full-text Database, Information Science and Technology, no. 2019, pages 138-812
ZHAO Qian: "Research and implementation of bus passenger flow counting technology based on video analysis", China Master's Theses Full-text Database, Engineering Science and Technology II, no. 2017, pages 034-1153

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113869159A (en) * 2021-09-16 2021-12-31 泰州蝶金软件有限公司 Cloud server data management system

Also Published As

Publication number Publication date
CN111860261B (en) 2023-11-03

Similar Documents

Publication Publication Date Title
US11188783B2 (en) Reverse neural network for object re-identification
CN108875676B (en) Living body detection method, device and system
CN111539273B (en) Traffic video background modeling method and system
WO2019218824A1 (en) Method for acquiring motion track and device thereof, storage medium, and terminal
CN112669349B (en) Passenger flow statistics method, electronic equipment and storage medium
Pless et al. Evaluation of local models of dynamic backgrounds
CN110853033B (en) Video detection method and device based on inter-frame similarity
CN106971401B (en) Multi-target tracking device and method
CN113034541B (en) Target tracking method and device, computer equipment and storage medium
CN110472599B (en) Object quantity determination method and device, storage medium and electronic equipment
CN111160243A (en) Passenger flow volume statistical method and related product
CN112669344A (en) Method and device for positioning moving object, electronic equipment and storage medium
CN111062974A (en) Method and system for extracting foreground target by removing ghost
CN110287907B (en) Object detection method and device
AU2018379393A1 (en) Monitoring systems, and computer implemented methods for processing data in monitoring systems, programmed to enable identification and tracking of human targets in crowded environments
US20170053172A1 (en) Image processing apparatus, and image processing method
CN110516572B (en) Method for identifying sports event video clip, electronic equipment and storage medium
CN109934072B (en) Personnel counting method and device
CN114169425A (en) Training target tracking model and target tracking method and device
CN111860261B (en) Passenger flow value statistical method, device, equipment and medium
CN113052019A (en) Target tracking method and device, intelligent equipment and computer storage medium
CN111950507A (en) Data processing and model training method, device, equipment and medium
CN112001280A (en) Real-time online optimization face recognition system and method
CN111539390A (en) Small target image identification method, equipment and system based on Yolov3
US11620360B2 (en) Methods and systems for recognizing object using machine learning model

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant