CN111476183A

CN111476183A - Passenger flow information processing method and device

Info

Publication number: CN111476183A
Application number: CN202010283720.9A
Authority: CN
Inventors: 丁凯; 严石伟; 蒋楠
Original assignee: Tencent Technology Shenzhen Co Ltd
Current assignee: Tencent Technology Shenzhen Co Ltd
Priority date: 2020-04-13
Filing date: 2020-04-13
Publication date: 2020-07-31

Abstract

The invention provides a passenger flow information processing method, a passenger flow information processing device, electronic equipment and a computer-readable storage medium. The method comprises the following steps: recognizing a plurality of first human body tracks and a plurality of first face tracks from a first image set, establishing a first binding relationship between the first human body tracks and the first face tracks belonging to the same visitor, and recognizing a plurality of second human body tracks from a second image set; establishing a second binding relationship between each first face track and the identity of the corresponding visitor; establishing a third binding relationship between the first human body track and the second human body track belonging to the same visitor; and determining the walking track of the visitor in the first area based on the third binding relationship of the visitor, and determining the identity of the visitor based on the bound identity in the first binding relationship and the second binding relationship of the visitor. With the method and the device, the identity of each visitor appearing in the area and the walking track of that visitor can be accurately identified.

Description

Passenger flow information processing method and device

Technical Field

The present invention relates to the field of artificial intelligence, and in particular, to a method and an apparatus for processing passenger flow information, an electronic device, and a computer-readable storage medium.

Background

Artificial intelligence is a theory, method and technique and application system that uses a digital computer or a machine controlled by a digital computer to simulate, extend and expand human intelligence, perceive the environment, acquire knowledge and use the knowledge to obtain the best results. Artificial intelligence is now rapidly developing and widely used in various industries.

Taking the application of passenger flow analysis technology in the field of intelligent retail as an example, accurate passenger flow analysis can empower offline stores (for example, visitor reminders and products) and help create a new retail format.

However, the passenger flow analysis schemes provided by the related art usually take human head recognition as their core and offer only the single function of counting the person-times of passenger flow in an area, so they cannot accurately reflect the passenger flow situation in the area.

Disclosure of Invention

Embodiments of the present invention provide a method and an apparatus for processing passenger flow information, an electronic device, and a computer-readable storage medium, which can accurately identify the identity of each visitor appearing in an area and the walking track of the visitor.

The technical scheme of the embodiment of the invention is realized as follows:

the embodiment of the invention provides a passenger flow information processing method, which comprises the following steps:

acquiring a first image set captured simultaneously of human bodies and human faces in a first area, and acquiring a second image set captured of human bodies appearing in the first area;

recognizing a plurality of first human body tracks and a plurality of first face tracks from the first image set, establishing a first binding relationship between the first human body tracks and the first face tracks belonging to the same visitor, and recognizing a plurality of second human body tracks from the second image set;

establishing a second binding relationship between each first face track and the identity of the corresponding visitor;

matching the plurality of second human body tracks with the plurality of first human body tracks to establish a third binding relationship between the first human body tracks and the second human body tracks belonging to the same visitor;

determining a walking track of the visitor in the first area based on the third binding relationship of the visitor, and determining the identity corresponding to the visitor based on the bound identity in the first binding relationship and the second binding relationship of the visitor.

An embodiment of the present invention provides a passenger flow information processing apparatus, including:

a stream fetching module, configured to acquire a first image set captured simultaneously of human bodies and human faces in a first area, and to acquire a second image set captured of human bodies appearing in the first area;

the algorithm module is used for identifying a plurality of first human body tracks and a plurality of first face tracks from the first image set, establishing a first binding relationship between the first human body tracks and the first face tracks belonging to the same visitor, and identifying a plurality of second human body tracks from the second image set;

the face processing module is used for establishing a second binding relationship between each first face track and the identity of the corresponding visitor;

the human body processing module is used for matching the plurality of second human body tracks with the plurality of first human body tracks so as to establish a third binding relationship between the first human body tracks and the second human body tracks belonging to the same visitor;

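a data processing module, configured to determine the walking track of the visitor in the first area based on the third binding relationship of the visitor, and to determine the identity corresponding to the visitor based on the bound identity in the first binding relationship and the second binding relationship of the visitor.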

In the above scheme, the algorithm module is further configured to determine, in any one of the first images in the first image set, a distance between a head key point coordinate of a visitor captured based on a human body and a head key point coordinate of a visitor captured based on a human face; when the distance is smaller than a distance threshold value, determining that the visitor based on human body snapshot and the visitor based on human face snapshot are the same visitor, and establishing a first binding relationship between a first human body track and a first human face track of the same visitor.
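The head key point comparison described above can be sketched as follows. This is only a minimal illustration: the function name `bind_body_and_face`, the dictionary inputs, the pixel threshold, and the choice of pairing each body with its nearest unused face are all assumptions made for the example, not details given in the text.

```python
import math

def bind_body_and_face(body_heads, face_heads, distance_threshold):
    """Pair body and face detections in one image by head key point distance.

    body_heads / face_heads map a (hypothetical) track id to an (x, y) head
    key point. Returns {body_track_id: face_track_id}.
    """
    bindings = {}
    used_faces = set()
    for body_id, body_pt in body_heads.items():
        best_face, best_dist = None, distance_threshold
        for face_id, face_pt in face_heads.items():
            if face_id in used_faces:
                continue
            dist = math.dist(body_pt, face_pt)
            # Only distances strictly below the threshold count as a match.
            if dist < best_dist:
                best_face, best_dist = face_id, dist
        if best_face is not None:
            bindings[body_id] = best_face
            used_faces.add(best_face)
    return bindings

# Illustrative data: one body and one face share nearly the same head position.
frame_bodies = {"body-1": (100.0, 50.0), "body-2": (400.0, 60.0)}
frame_faces = {"face-a": (103.0, 54.0), "face-b": (900.0, 60.0)}
first_binding = bind_body_and_face(frame_bodies, frame_faces, distance_threshold=20.0)
```

Here only `body-1` and `face-a` fall within the threshold, so they are the only pair placed in the first binding relationship.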

In the foregoing solution, the face processing module is further configured to execute the following processing for each of the plurality of first face tracks: performing feature extraction on the face image of the visitor corresponding to the first face track to obtain the face feature of the visitor; matching the face feature of the visitor against a plurality of face features with bound identities in an identity library; when the maximum matching similarity is smaller than a similarity threshold, determining that the visitor is a new visitor, adding a corresponding identity for the new visitor, and establishing a second binding relationship between the newly added identity and the first face track; and when the maximum matching similarity is larger than the similarity threshold, determining that the visitor is a visitor who has appeared before, and establishing a second binding relationship between the identity bound to the face feature with the maximum similarity and the first face track.
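A minimal sketch of the identity matching step above is given below. It assumes cosine similarity as the similarity measure, a plain dictionary as the identity library, and a hypothetical naming scheme for new identities; the text does not specify any of these, nor how a similarity exactly equal to the threshold is handled (treated here as a new visitor).

```python
import math

def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def bind_identity(face_feature, identity_library, similarity_threshold):
    """Return (identity, is_new_visitor) for the feature of one face track.

    identity_library maps identity -> enrolled face feature and is updated
    in place when a new visitor is registered.
    """
    best_id, best_sim = None, -1.0
    for identity, enrolled in identity_library.items():
        sim = cosine_similarity(face_feature, enrolled)
        if sim > best_sim:
            best_id, best_sim = identity, sim
    if best_id is not None and best_sim > similarity_threshold:
        return best_id, False  # visitor has appeared before
    # Hypothetical id scheme for the newly added identity.
    new_id = f"visitor-{len(identity_library) + 1}"
    identity_library[new_id] = list(face_feature)
    return new_id, True

library = {"visitor-1": [1.0, 0.0]}
returning = bind_identity([0.9, 0.1], library, similarity_threshold=0.8)
newcomer = bind_identity([0.0, 1.0], library, similarity_threshold=0.8)
```

The first query is close to the enrolled feature and reuses `visitor-1`; the second is dissimilar, so a new identity is added to the library.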

In the foregoing solution, the human body processing module is further configured to execute the following processing for each of the plurality of first human body trajectories: comparing the time window of the first human body trajectory with the time window of each of the second human body trajectories to determine a plurality of second human body trajectories intersecting the time window of the first human body trajectory; and matching the average characteristics of the first human body track with the average characteristics of each intersected second human body track, determining the second human body track with the maximum matching similarity as the human body track belonging to the same visitor as the first human body track, and establishing a third binding relationship between the second human body track with the maximum similarity and the first human body track.
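The time-window filtering and best-match selection described above might look like the following sketch. The `BodyTrack` structure, its field names, and the use of cosine similarity are assumptions for illustration only.

```python
import math
from dataclasses import dataclass

@dataclass
class BodyTrack:
    track_id: str
    start: float          # first capture time of the track
    end: float            # last capture time of the track
    mean_feature: list    # average feature of the track

def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def windows_intersect(a, b):
    # Two closed time windows intersect when neither ends before the other starts.
    return a.start <= b.end and b.start <= a.end

def match_second_track(first_track, second_tracks):
    """Return the id of the time-overlapping second track most similar to first_track."""
    candidates = [t for t in second_tracks if windows_intersect(first_track, t)]
    if not candidates:
        return None
    best = max(
        candidates,
        key=lambda t: cosine_similarity(first_track.mean_feature, t.mean_feature),
    )
    return best.track_id

first = BodyTrack("gun-1", 0.0, 10.0, [1.0, 0.0])
seconds = [
    BodyTrack("dome-1", 5.0, 20.0, [0.8, 0.2]),
    BodyTrack("dome-2", 8.0, 30.0, [0.1, 0.9]),
    BodyTrack("dome-3", 50.0, 60.0, [1.0, 0.0]),  # identical feature, but no time overlap
]
third_binding_target = match_second_track(first, seconds)
```

Filtering by time window first keeps `dome-3` out of the comparison even though its feature is the closest, which is the point of the two-stage match.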

In the above scheme, the human body processing module is further configured to perform feature extraction on a plurality of human body images corresponding to the first human body trajectory to obtain a plurality of human body features; sequencing the human body characteristics of the plurality of human body images according to the time sequence of snapshot, sequentially determining the similarity between the first human body characteristic and the subsequent human body characteristic in the sequencing, and keeping the longest sequence of which the similarity is greater than a similarity threshold value; and determining the average value of a plurality of human body characteristics included in the longest sequence as the average characteristic of the first human body track.
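One possible reading of the refinement above is sketched below: features are sorted by capture time, each feature is compared with the one that follows it, and the longest contiguous run whose adjacent similarities all exceed the threshold is averaged. The text is not explicit about whether adjacent features or the first feature versus all later ones are compared, so this interpretation, the function name, and the cosine measure are assumptions.

```python
import math

def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def refined_mean_feature(timed_features, similarity_threshold):
    """timed_features: list of (capture_time, feature). Returns the mean of the longest stable run."""
    ordered = [f for _, f in sorted(timed_features, key=lambda item: item[0])]
    if not ordered:
        return []
    best_start, best_len = 0, 1
    run_start = 0
    for i in range(1, len(ordered)):
        # A similarity at or below the threshold breaks the current run.
        if cosine_similarity(ordered[i - 1], ordered[i]) <= similarity_threshold:
            run_start = i
        run_len = i - run_start + 1
        if run_len > best_len:
            best_start, best_len = run_start, run_len
    run = ordered[best_start:best_start + best_len]
    # Element-wise average over the retained run.
    return [sum(values) / len(run) for values in zip(*run)]

captures = [
    (3, [4.0, 0.0]),
    (1, [0.0, 1.0]),   # outlier, e.g. another person briefly mixed into the track
    (2, [2.0, 0.0]),
    (4, [6.0, 0.0]),
]
mean_feature = refined_mean_feature(captures, similarity_threshold=0.9)
```

In this example the earliest capture is dissimilar to the rest, so it is dropped and the average is taken over the remaining three features.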

In the above scheme, the data processing module is further configured to determine the first human body track and the second human body track in the third binding relationship of the visitor as the walking track of the visitor in the first area; and to query, based on the first human body track bound to the second human body track of the visitor in the third binding relationship, the first face track bound in the first binding relationship, and determine the identity corresponding to the visitor based on the identity bound to that first face track in the second binding relationship.
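The chain of lookups across the three binding relationships can be illustrated with plain dictionaries. The dictionary layout, key names, and function name below are hypothetical; the sketch only shows the order in which the bindings are followed.

```python
def resolve_visitor(second_body_track, third_binding, first_binding, second_binding):
    """Follow second body track -> first body track -> first face track -> identity."""
    first_body_track = third_binding.get(second_body_track)   # third binding relationship
    face_track = first_binding.get(first_body_track)          # first binding relationship
    identity = second_binding.get(face_track)                 # second binding relationship
    walking_track = [t for t in (first_body_track, second_body_track) if t is not None]
    return {"identity": identity, "walking_track": walking_track}

# Illustrative bindings for a single visitor.
third_binding = {"dome-body-7": "gun-body-3"}
first_binding = {"gun-body-3": "gun-face-3"}
second_binding = {"gun-face-3": "visitor-42"}

resolved = resolve_visitor("dome-body-7", third_binding, first_binding, second_binding)
unknown = resolve_visitor("dome-body-9", third_binding, first_binding, second_binding)
```

A second human body track with no third binding resolves to no identity, which is why the low-precision track alone cannot identify a visitor.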

In the above scheme, the stream fetching module is further configured to acquire a third image set captured of the human bodies appearing in each second area, where the first area is a public area in a target area to be analyzed and the second areas are a plurality of sub-areas divided from the target area; the algorithm module is further configured to identify a plurality of third human body trajectories from the third image set; the human body processing module is further configured to register the first human body trajectory and the second human body trajectory in the third binding relationship in a seed trajectory library; and the data processing module is further configured to retrieve the seed trajectory library based on the third human body trajectories to determine the identity of the visitors in each second area or to determine the passenger flow volume in each second area.

In the above scheme, the algorithm module is further configured to divide each day into a plurality of time intervals according to a set time granularity, and to identify the first image set acquired in any one of the time intervals to obtain a plurality of first face tracks and a plurality of first human body tracks in that time interval; and to determine the passenger flow volume of the first area in that time interval based on the plurality of first face tracks or the plurality of first human body tracks in that time interval, wherein the passenger flow volume comprises the person-times of passenger flow and the number of people in passenger flow; the data processing module is further configured to accumulate the passenger flow volumes of the first area in the plurality of time intervals to obtain a total passenger flow volume of the first area per day.
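The per-interval counting and daily accumulation above can be sketched as follows. The input format (one `(seconds_since_midnight, identity)` pair per recognized track), the one-hour default granularity, and the function names are assumptions. The text only says the interval volumes are accumulated; deduplicating the daily number of people with a set union, rather than adding interval counts, is a choice made for this sketch.

```python
from collections import defaultdict

def interval_index(timestamp_s, granularity_s):
    return int(timestamp_s // granularity_s)

def daily_passenger_flow(track_records, granularity_s=3600):
    """track_records: iterable of (seconds_since_midnight, identity), one per track."""
    person_times = defaultdict(int)
    identities = defaultdict(set)
    for timestamp_s, identity in track_records:
        idx = interval_index(timestamp_s, granularity_s)
        person_times[idx] += 1          # every track counts once as a person-time
        identities[idx].add(identity)   # identities are deduplicated per interval
    per_interval = {
        idx: {"person_times": person_times[idx], "people": len(identities[idx])}
        for idx in sorted(person_times)
    }
    daily_total = {
        "person_times": sum(person_times.values()),
        # Assumption: a visitor seen in several intervals is one person for the day.
        "people": len(set().union(*identities.values())),
    }
    return per_interval, daily_total

records = [
    (9 * 3600 + 10, "A"),
    (9 * 3600 + 500, "B"),
    (9 * 3600 + 900, "A"),
    (14 * 3600, "A"),
]
per_interval, daily_total = daily_passenger_flow(records)
```

Visitor `A` contributes three person-times across the day but is counted once in the daily number of people.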

In the above scheme, the face processing module is further configured to identify, based on the first image set, a behavior of a visitor entering and exiting the first area and an attribute of the visitor; when the visitor with the specific attribute is identified to enter the first area, an operator of the first area is informed, so that the operator sends a corresponding reminding message to a terminal associated with the visitor with the specific attribute.

An embodiment of the present invention provides an electronic device, including:

a memory for storing executable instructions;

and the processor is used for realizing the passenger flow information processing method provided by the embodiment of the invention when the executable instruction stored in the memory is executed.

The embodiment of the invention provides a computer-readable storage medium, which stores executable instructions and is used for causing a processor to execute the method for processing the passenger flow information.

The embodiment of the invention has the following beneficial effects:

First, based on face recognition technology, each first face track is bound to the identity of the corresponding visitor; then, based on human body recognition technology, the first human body track and the second human body track belonging to the same visitor are bound; finally, by combining face recognition technology and human body recognition technology, the first face track and the first human body track belonging to the same visitor are bound. Through these binding relationships, the identity of each visitor appearing in the area and the walking track of that visitor can be accurately identified.

Drawings

FIG. 1 is an alternative architecture diagram of a passenger flow application system provided by an embodiment of the present invention;

FIG. 2 is an alternative structural diagram of a server provided by an embodiment of the present invention;

FIG. 3 is a schematic flow chart of an alternative passenger flow information processing method provided by an embodiment of the present invention;

FIG. 4 is a schematic flow chart of another alternative passenger flow information processing method provided by an embodiment of the present invention;

FIG. 5 is a schematic diagram of store traffic and customer attributes provided by an embodiment of the present invention;

FIG. 6 is a schematic diagram of a customer's shopping track within a shopping mall provided by an embodiment of the present invention;

FIG. 7 is a system framework diagram for implementing a passenger flow processing method provided by an embodiment of the present invention;

FIG. 8 is a schematic structural diagram of a stream fetching module provided by an embodiment of the present invention;

FIG. 9 is a schematic structural diagram of an algorithm module provided by an embodiment of the present invention;

FIG. 10 is a schematic diagram of a field-type gun camera algorithm pipeline provided by an embodiment of the present invention;

FIG. 11 is a schematic structural diagram of a face processing module provided by an embodiment of the present invention;

FIG. 12 is a schematic structural diagram of a human body processing module provided by an embodiment of the present invention;

FIG. 13 is a schematic flow chart of refining a human body trajectory provided by an embodiment of the present invention;

FIG. 14 is a schematic flow chart of binding a human body trajectory obtained based on a field-type gun camera and a human body trajectory obtained based on a field-type dome camera, provided by an embodiment of the present invention;

FIG. 15 is a schematic flow chart of retrieving a seed trajectory library based on a store-entry trajectory, provided by an embodiment of the present invention;

FIG. 16 is a schematic structural diagram of a data processing module provided by an embodiment of the present invention.

Detailed Description

In order to make the objects, technical solutions, and advantages of the present invention clearer, the present invention is described in further detail below with reference to the accompanying drawings. The described embodiments should not be construed as limiting the present invention, and all other embodiments obtained by a person of ordinary skill in the art without creative effort fall within the protection scope of the present invention.

In the following description, reference is made to "some embodiments" which describe a subset of all possible embodiments, but it is understood that "some embodiments" may be the same subset or different subsets of all possible embodiments, and may be combined with each other without conflict.

In the description that follows, references to the terms "first", "second", and the like, are intended only to distinguish between similar objects and not to indicate a particular ordering for the objects, it being understood that "first", "second", and the like may be interchanged under certain circumstances or sequences of events to enable embodiments of the invention described herein to be practiced in other than the order illustrated or described herein.

Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this invention belongs. The terminology used herein is for the purpose of describing embodiments of the invention only and is not intended to be limiting of the invention.

Before further detailed description of the embodiments of the present invention, terms and expressions mentioned in the embodiments of the present invention are explained, and the terms and expressions mentioned in the embodiments of the present invention are applied to the following explanations.

1) 1:N face retrieval: finding, in a large-scale face database, the one or more faces with the highest similarity to the face to be retrieved; the retrieval performance is related to the scale N of the face database.

2) 1:N human body track retrieval: finding, in a large-scale human body track database, the one or more human body tracks with the highest similarity to the human body track to be retrieved.

3) Computer Vision (CV): the science of how to make machines "see". More specifically, it uses cameras and computers in place of human eyes to identify, track, and measure targets, and further processes the resulting images so that they are more suitable for observation by human eyes or detection by instruments.

4) Artificial Intelligence (AI): a new technical science that studies and develops theories, methods, techniques, and application systems for simulating, extending, and expanding human intelligence.

5) Pedestrian Re-identification (ReID): a technique that uses computer vision to determine whether a specific pedestrian is present in an image or video sequence. It is widely regarded as a sub-problem of image retrieval: given an image of a monitored pedestrian, images of that pedestrian are retrieved across devices. It compensates for the limited field of view of fixed cameras, can be combined with pedestrian detection and pedestrian tracking, and is widely applicable to intelligent video surveillance, intelligent security, and similar fields.

6) Gun camera: named for its shape and often simply called a gun. Its monitoring position is fixed and it can only face one monitoring position, so its monitoring direction is limited, but its snapshot quality is good; it is generally used for capturing human faces and human bodies.

7) Dome camera: named for its shape and often simply called a dome. Its monitoring range is much larger than that of a fixed gun camera, and it can generally rotate 360 degrees, so it can monitor a large area; however, its snapshot quality is poorer, and it is generally used for capturing human bodies.

8) Person-times of passenger flow: the basic unit is a captured human head, for example, the number of customer heads captured in a region of interest in a shopping mall.

9) Number of people in passenger flow: the person-times deduplicated by customer identity. The basic unit is a natural person, for example, the number of distinct customer identities captured in a region of interest in a shopping mall.

At present, artificial intelligence technology is widely applied in the field of intelligent retail, and a large part of it is devoted to promoting and deploying smart shopping mall scenarios, empowering offline stores and creating new retail. In the related art, head recognition is usually taken as the core, so only the passenger flow in key areas can be counted and only the single datum of person-times in those areas is obtained; neither the walking tracks of customers across the whole key area nor the identity information of each customer can be provided.

In view of the above problems, embodiments of the present invention provide an artificial intelligence based method, apparatus, electronic device and computer-readable storage medium, which can simultaneously determine the identities of visitors present in a first area (i.e., a public area of a target area to be analyzed) and the walking track of each visitor in the first area.

An exemplary application of the passenger flow information processing device provided in the embodiment of the present invention is described below, and the passenger flow information processing device provided in the embodiment of the present invention may be implemented as a server or a server cluster, or may be implemented in a manner that a user terminal and a server cooperate with each other.

It should be noted that the server may be an independent physical server, may also be a server cluster or a distributed system formed by a plurality of physical servers, and may also be a cloud server that provides basic cloud computing services such as a cloud service, a cloud database, cloud computing, a cloud function, cloud storage, a network service, cloud communication, a middleware service, a domain name service, a security service, and a big data and artificial intelligence platform, which is not limited herein.

Next, an exemplary application in which the passenger flow information processing apparatus is implemented as a server is described. Referring to FIG. 1, FIG. 1 is an alternative architecture diagram of a passenger flow application system 100 according to an embodiment of the present invention. The passenger flow application system 100 includes a server 200, a network 300, a terminal 400, and a database 500, which are described separately below.

The database 500 is used for storing a picture captured by the camera for the first area (i.e. a public area of the target area to be analyzed, for example, the first area may be a public area of a shopping mall, a public area of a department store, a public area of a community, etc.), or storing a video frame obtained by decoding a video captured by the camera for the first area.

The server 200 acquires a photo taken by the camera for the first area or a video frame obtained by decoding a video taken by the camera for the first area from the database 500, then determines the identity of each visitor present in the first area and the trajectory of each visitor in the first area based on the acquired photo or video frame (the determination process will be described in detail below), and then the server 200 transmits the determined identity of each visitor present in the first area and the trajectory of each visitor in the first area to the terminal 400 through the network 300.

The network 300 is used as a medium for communication between the server 200 and the terminal 400, and the network 300 may be a wide area network or a local area network, or a combination of both.

The terminal 400 runs a client 410, and the identity of each visitor appearing in the first area and the walking track of each visitor in the first area, which are issued by the server 200, are shown in a graphical interface of the client 410.

The passenger flow application system provided by the embodiment of the present invention can be widely applied to passenger flow analysis in various scenes. For example, in the field of intelligent retail, analyzing the identity of each customer appearing in the public area of a shopping mall and the walking track of each customer in the shopping mall can help the shopping mall adjust its shopping policy, perform shopping diversion, and establish a scientific rent strategy. In the field of intelligent security, analyzing the identity of each visitor appearing in the public area of a community and the walking track of each visitor in that public area can help the property management determine whether there are suspicious visitors in the community.

Referring to fig. 2, fig. 2 is a schematic structural diagram of a server 200 according to an embodiment of the present invention, where the server 200 shown in fig. 2 includes: at least one processor 210, memory 240, at least one network interface 220. The various components in server 200 are coupled together by a bus system 230. It is understood that the bus system 230 is used to enable connected communication between these components. The bus system 230 includes a power bus, a control bus, and a status signal bus in addition to a data bus. For clarity of illustration, however, the various buses are labeled as bus system 230 in fig. 2.

The Processor 210 may be an integrated circuit chip having Signal processing capabilities, such as a general purpose Processor, a Digital Signal Processor (DSP), or other programmable logic device, discrete gate or transistor logic device, discrete hardware components, or the like, wherein the general purpose Processor may be a microprocessor or any conventional Processor, or the like.

The memory 240 may be removable, non-removable, or a combination thereof. Exemplary hardware devices include solid state memory, hard disk drives, optical disk drives, and the like. Memory 240 optionally includes one or more storage devices physically located remote from processor 210.

The memory 240 includes either volatile memory or nonvolatile memory, and may include both volatile and nonvolatile memory. The nonvolatile memory may be a Read Only Memory (ROM), and the volatile memory may be a Random Access Memory (RAM). The memory 240 described in connection with embodiments of the present invention is intended to comprise any suitable type of memory.

In some embodiments, memory 240 is capable of storing data, examples of which include programs, modules, and data structures, or subsets or supersets thereof, to support various operations, as exemplified below.

An operating system 241, including system programs for handling various basic system services and performing hardware-related tasks, such as a framework layer, a core library layer, a driver layer, etc., for implementing various basic services and handling hardware-based tasks;

a network communication module 242, configured to reach other computing devices via one or more (wired or wireless) network interfaces 220, exemplary network interfaces 220 including Bluetooth, Wireless Fidelity (WiFi), Universal Serial Bus (USB), and the like.

In some embodiments, the passenger flow information processing apparatus provided by the embodiment of the present invention may be implemented in software. FIG. 2 shows the passenger flow information processing apparatus 243 stored in the memory 240, which may be software in the form of programs, plug-ins, and the like, and includes the following software modules: a stream fetching module 2431, an algorithm module 2432, a face processing module 2433, a human body processing module 2434, and a data processing module 2435. These modules are logical, and thus can be arbitrarily combined or further split depending on the functions implemented. The functions of the respective modules are explained below.

In other embodiments, the passenger flow information processing apparatus provided in the embodiments of the present invention may be implemented in hardware. For example, it may be a processor in the form of a hardware decoding processor that is programmed to execute the passenger flow information processing method provided in the embodiments of the present invention; for instance, the processor in the form of a hardware decoding processor may be one or more Application Specific Integrated Circuits (ASICs), DSPs, Programmable Logic Devices (PLDs), Complex Programmable Logic Devices (CPLDs), Field-Programmable Gate Arrays (FPGAs), or other electronic components.

The following describes a passenger flow information processing method provided by an embodiment of the present invention, with reference to an exemplary application of the passenger flow information processing apparatus provided by an embodiment of the present invention, when the passenger flow information processing apparatus is implemented as a server.

Referring to fig. 3, fig. 3 is an alternative flow chart of a passenger flow information processing method according to an embodiment of the present invention, which will be described with reference to the steps shown in fig. 3.

In step S101, the server acquires a first image set captured simultaneously of human bodies and human faces in the first area, and acquires a second image set captured of human bodies appearing in the first area.

Here, the first area refers to a public area of a target area to be analyzed. The first image set may be composed of photos captured by a high-precision camera (that is, a camera capable of clearly capturing a human body and a human face at the same time, such as a gun camera) of the human bodies and faces appearing in the first area, or of video frames obtained by decoding videos shot by the high-precision camera of the human bodies and faces appearing in the first area. The second image set may be composed of photos captured by a low-precision camera (that is, a camera that can capture only the human bodies appearing in a key area but has a large field of view, such as a dome camera, which can cover areas that the gun camera cannot) of the human bodies appearing in the first area, or of video frames obtained by decoding videos shot by the low-precision camera of the human bodies appearing in the first area.

For example, in an intelligent retail shopping mall scene, the first area may be a public area of the shopping mall, such as its entrances and exits. Gun cameras are usually deployed in these key areas to capture the human bodies and faces appearing there, and the server may pull real-time or offline video data shot by the gun cameras and decode it to obtain the first image set. In addition, dome cameras and gun cameras are generally deployed in pairs; that is, dome cameras are also deployed in these key areas of the shopping mall to capture the human bodies appearing there, and the server may pull real-time or offline video data shot by the dome cameras and decode it to obtain the second image set.

For example, in a smart community scene, the first area may be a public area of the community, such as an activity center, a basketball court, or a road in the community. Gun cameras and dome cameras are deployed in pairs in these areas. The human bodies and faces appearing in these areas are shot simultaneously by the gun cameras, and the shot video is decoded to obtain the first image set; the human bodies appearing in these areas are shot by the dome cameras, and the shot video is decoded to obtain the second image set.

According to the embodiment of the invention, a first image set acquired for both the human bodies and the human faces in the first area and a second image set acquired only for the human bodies in the first area are obtained, so that the identity of each visitor in the first area and the walking track of each visitor in the first area can be determined based on the first image set and the second image set.

In step S102, the server identifies a plurality of first human body trajectories and a plurality of first face trajectories from the first image set, establishes a first binding relationship between the first human body trajectories and the first face trajectories belonging to the same visitor, and identifies a plurality of second human body trajectories from the second image set.

Here, after the first image set is obtained, since any one of the first images in the first image set is an image including a clear face and a clear human body, the server may perform face detection and human body detection simultaneously for each first image in the first image set, and record a series of positions where the same face appears according to a time sequence to form a corresponding face trajectory, and record a series of positions where the same human body appears according to a time sequence to form a corresponding human body trajectory.

In an example, face detection is performed on video frames captured by a gun camera deployed in a shopping mall. For the series of video frames in which each detected face is located, the position of the camera that captured the corresponding video frame is taken as the position where the face appeared, and the positions where the same face appears are recorded in chronological order to form the corresponding face track; that is, each face track records, in chronological order, a series of positions where the same face appears.

In an example, human body detection is performed on video frames captured by a gun camera deployed in a shopping mall. For each video frame in which a detected human body is located, the position of the camera that captured the corresponding video frame is taken as the position where the human body appeared, and the positions where the same human body appears are recorded in chronological order to form the corresponding human body track; that is, each human body track records, in chronological order, a series of positions where the same human body appears.
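The track-forming step described above can be sketched as follows. This is a minimal illustration only: the tuple layout, the `build_tracks` name, and the assumption that an upstream detector or tracker has already assigned a stable `target_id` to each face or human body are hypothetical and are not specified by the embodiment.

```python
from collections import defaultdict


def build_tracks(detections):
    """Group detections into per-target tracks ordered by time.

    detections: iterable of (timestamp, target_id, camera_position) tuples.
    The tuple layout is an assumption made for this sketch.
    """
    tracks = defaultdict(list)
    # Sort by timestamp so every track lists its positions chronologically.
    for timestamp, target_id, position in sorted(detections, key=lambda d: d[0]):
        tracks[target_id].append((timestamp, position))
    return dict(tracks)


detections = [
    (3, "body-1", "gate-B"),
    (1, "body-1", "gate-A"),
    (2, "body-2", "gate-A"),
]
tracks = build_tracks(detections)
```

The same grouping applies to face detections and human body detections alike; only the detector that produces the input differs.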

It should be noted that the face detection algorithm and the human body detection algorithm may be the YOLO (You Only Look Once) algorithm, the SSD (Single Shot MultiBox Detector) algorithm, the DPM (Deformable Part Model) algorithm, and the like, which is not limited in the embodiment of the present invention.

In some embodiments, establishing the first binding relationship between the first human body trajectory and the first face trajectory belonging to the same visitor may be implemented by: determining the distance between the head key point coordinates of the visitor based on human body snapshot and the head key point coordinates of the visitor based on human face snapshot in any first image of the first image set; when the distance is smaller than the distance threshold value, determining that the visitor based on the human body snapshot and the visitor based on the human face snapshot are the same visitor, and establishing a first binding relationship between a first human body track and a first human face track of the same visitor.

In an example, for any video frame in a video captured by a gun camera, the Euclidean distance is calculated between the head key point coordinates of a customer obtained from a human body key point SDK and the head key point coordinates of a customer obtained from a face key point SDK. When the Euclidean distance is smaller than the distance threshold, the identified human body image and face image are determined to belong to the same customer; when the Euclidean distance is greater than the distance threshold, they are determined not to belong to the same customer. When the identified face image and human body image belong to the same customer, the face track and the human body track corresponding to the customer are bound. This binding is important for the subsequent determination of the customer's identity: because the face track is associated with the identity and attribute information of the customer, once the human body track and the face track of the customer in the mall are bound, the human body track data is also associated with the identity and attribute information of that customer.
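As a hedged sketch of the first binding relationship, the following compares head key points within a single frame. The dictionary inputs, the `bind_face_to_body` name, and the choice of the nearest face when several fall under the threshold are assumptions of this sketch; the embodiment only requires the distance to be smaller than the distance threshold.

```python
import math


def bind_face_to_body(body_heads, face_heads, distance_threshold):
    """Bind human body tracks to face tracks seen in the same frame.

    body_heads / face_heads: dict mapping a track id to the (x, y) head
    key point reported for that track in one video frame (assumed layout).
    Returns a dict mapping body track id -> face track id.
    """
    bindings = {}
    for body_id, body_point in body_heads.items():
        best_face, best_distance = None, distance_threshold
        for face_id, face_point in face_heads.items():
            distance = math.dist(body_point, face_point)  # Euclidean distance
            # Keep only faces strictly closer than the threshold, and among
            # those prefer the nearest one (an assumption of this sketch).
            if distance < best_distance:
                best_face, best_distance = face_id, distance
        if best_face is not None:
            bindings[body_id] = best_face
    return bindings


first_binding = bind_face_to_body(
    {"b1": (100, 50), "b2": (400, 60)},
    {"f1": (103, 54), "f2": (900, 900)},
    distance_threshold=20,
)
```

Here `b1` and `f1` are 5 pixels apart and are bound, while `b2` has no face within the threshold and stays unbound.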

In other embodiments, the server further identifies a plurality of second human body trajectories from the second image set, and a process of identifying the plurality of second human body trajectories from the second image set is similar to a process of identifying the plurality of first human body trajectories from the first image set, which is not described herein again in this embodiment of the present invention.

According to the embodiment of the invention, the first image set is subjected to face detection and human body detection at the same time to obtain a plurality of first human body tracks and a plurality of first human face tracks, and the first binding relationship between the first human body tracks and the first human face tracks belonging to the same visitor is established, so that the human body tracks of the visitors can be mapped to the corresponding face tracks, which is very important for the follow-up determination of the identity of the visitor.

In step S103, the server establishes a second binding relationship between each first face track and the identity of the corresponding visitor.

In some embodiments, step S103 shown in fig. 3 may be implemented by steps S201 to S205 shown in fig. 4, which will be described in conjunction with the steps shown in fig. 4.

In step S201, the server performs feature extraction on the face image of the visitor corresponding to the first face trajectory to obtain a face feature of the visitor.

Here, after obtaining the plurality of first face trajectories through step S102, the server performs feature extraction on the face image of the visitor corresponding to each of the first face trajectories to obtain the face features of the visitor. Feature extraction on the face image of the visitor refers to a process of processing the face image according to a certain algorithm, extracting the corresponding feature information, and converting the feature information into a feature matrix.

For example, geometric features of the face, such as the shapes of the eyes, nose, and mouth of the visitor and geometric relationships therebetween (e.g., distances therebetween), may be extracted and converted into corresponding feature matrices as the face features of the visitor.

In some embodiments, to further reduce the computational complexity, a face clustering microservice may first be invoked to cluster the obtained first face tracks. First face tracks that are clustered into the same cluster are in fact face tracks of the same visitor. For the face features corresponding to the face tracks in each cluster, the top N are taken by quality (i.e., the face features are sorted by quality score from high to low and the N face features with the highest quality scores are selected), and the average of these top-N face features is then calculated as the face feature of the visitor.
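The top-N averaging within one cluster can be illustrated as follows. The `(quality_score, feature_vector)` input layout and the function name are assumptions for this sketch; the clustering itself is presumed to have been done already.

```python
def visitor_face_feature(cluster_features, top_n):
    """Average the top-N face features of one cluster, ranked by quality.

    cluster_features: list of (quality_score, feature_vector) pairs that
    belong to the same cluster (assumed layout).
    """
    # Sort by quality score from high to low and keep the best N entries.
    ranked = sorted(cluster_features, key=lambda item: item[0], reverse=True)[:top_n]
    dimension = len(ranked[0][1])
    # Element-wise mean of the selected feature vectors.
    return [
        sum(feature[i] for _, feature in ranked) / len(ranked)
        for i in range(dimension)
    ]


feature = visitor_face_feature(
    [(0.9, [1.0, 0.0]), (0.2, [0.0, 0.0]), (0.8, [0.0, 1.0])],
    top_n=2,
)
```

With `top_n=2`, the low-quality feature scored 0.2 is discarded before averaging.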

In step S202, the server matches the face features of the visitor with a plurality of face features in the identity library, each of which is bound to an identity.

Here, after the face features of the visitor are obtained in step S201, the identity library is searched, based on the face features of the visitor, for corresponding face features. If corresponding face features exist, the visitor is a visitor who has appeared before, and the identity bound to the corresponding face features in the identity library may be used as the identity of the visitor; if not, the visitor is a new visitor, a corresponding identity is added for the new visitor, and the identity of the new visitor and the corresponding face features are stored in the identity library.

In step S203, the server determines whether the matched maximum similarity is greater than a similarity threshold, and executes step S204 when the matched maximum similarity is greater than the similarity threshold; when the maximum similarity of the matching is smaller than the similarity threshold, step S205 is performed.

In step S204, the server determines that the visitor is a visitor who has appeared before, and establishes a second binding relationship between the first face track and the identity bound to the face feature with the maximum similarity.

For example, the server searches the identity library based on the face features of the visitor a (i.e., the face features of the visitor are matched with all face features stored in the identity library in advance), and when the returned maximum search score is greater than the search score threshold value, which indicates that the visitor a has visited before, the identity corresponding to the maximum search score in the identity library is directly used for binding with the visitor a.

In step S205, the server determines that the visitor is a new visitor, adds a corresponding identity for the new visitor, and establishes a second binding relationship between the newly added identity and the first face track.

For example, the server searches an identity library based on the facial features of the visitor a, determines that the visitor a is a newly visited visitor when the returned maximum search score is less than a search score threshold, registers the identity for the visitor a, and stores the registered identity and the facial features of the visitor a in the identity library.
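Steps S202 to S205 can be sketched as a single lookup-or-register routine. Cosine similarity, the in-memory dictionary used as the identity library, and the `visitor-N` naming of new identities are assumptions made for illustration; the embodiment does not specify the similarity measure or the storage. The case where the maximum similarity exactly equals the threshold is not addressed by the text and is treated here as a new visitor.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors (0.0 for zero vectors)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def bind_identity(face_feature, identity_library, similarity_threshold):
    """Return (identity, is_new) for one first face track.

    identity_library: dict mapping identity -> stored face feature
    (a hypothetical stand-in for the identity library).
    """
    best_identity, best_similarity = None, -1.0
    for identity, stored_feature in identity_library.items():
        similarity = cosine_similarity(face_feature, stored_feature)
        if similarity > best_similarity:
            best_identity, best_similarity = identity, similarity
    if best_identity is not None and best_similarity > similarity_threshold:
        return best_identity, False  # visitor has appeared before (S204)
    # New visitor (S205): register an identity and store the face feature.
    new_identity = f"visitor-{len(identity_library) + 1}"
    identity_library[new_identity] = list(face_feature)
    return new_identity, True


library = {"visitor-1": [1.0, 0.0]}
returning = bind_identity([0.99, 0.1], library, similarity_threshold=0.8)
newcomer = bind_identity([0.0, 1.0], library, similarity_threshold=0.8)
```

The returned identity is what the second binding relationship associates with the first face track.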

According to the embodiment of the invention, each face track is bound with the identity of the corresponding visitor through face recognition technology, so that when the human body track of a visitor is mapped to the corresponding face track, the identity of the visitor can be obtained based on this binding relationship.

In step S104, the server matches the plurality of second human body trajectories with the plurality of first human body trajectories to establish a third binding relationship between the first human body trajectory and the second human body trajectory belonging to the same visitor.

In some embodiments, matching the plurality of second human body trajectories with the plurality of first human body trajectories to establish a third binding relationship between the first human body trajectory and the second human body trajectory belonging to the same visitor may be implemented by: performing the following for each of a plurality of first human body trajectories: comparing the time window of the first human body trajectory with the time window of each second human body trajectory to determine a plurality of second human body trajectories intersecting the time window of the first human body trajectory; and matching the average characteristics of the first human body track with the average characteristics of each crossed second human body track, determining the second human body track with the maximum matching similarity as the human body track belonging to the same visitor as the first human body track, and establishing a third binding relationship between the second human body track with the maximum similarity and the first human body track.

For example, in a smart retail shopping mall scenario, the server may match a human body track obtained by a field-type gun camera (i.e., a first human body track) with a human body track obtained by a field-type dome camera (i.e., a second human body track) to establish a third binding relationship between the first human body track and the second human body track belonging to the same customer. The specific process is as follows. For any field-type gun camera, the average feature and the corresponding time window of the human body track obtained by that gun camera are first determined; then all the corresponding field-type dome cameras are obtained according to the serial number of the gun camera, and the average features and the corresponding time windows of the human body tracks obtained by those dome cameras are determined. Next, the time window of the human body track obtained by the gun camera is compared in turn with the time windows of the human body tracks obtained by the dome cameras, to determine the dome-camera human body tracks whose time windows intersect that of the gun-camera human body track. Finally, the average feature of the gun-camera human body track is matched against the average feature of each intersecting dome-camera human body track; the dome-camera human body track with the maximum matching similarity is determined to belong to the same customer as the gun-camera human body track, and the two tracks are bound.
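A minimal sketch of this matching is given below, assuming each track is represented by a time window and an average feature vector. The dictionary layout, the cosine similarity measure, and the function names are illustrative assumptions rather than details given by the embodiment.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors (0.0 for zero vectors)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def windows_intersect(a, b):
    """True when the closed intervals (start, end) a and b overlap."""
    return a[0] <= b[1] and b[0] <= a[1]


def bind_body_tracks(first_tracks, second_tracks):
    """Map each first (gun camera) track id to its best second (dome camera) track id.

    Each track is a dict with "window": (start, end) and "feature": [...].
    """
    bindings = {}
    for first_id, first in first_tracks.items():
        # Only second tracks whose time window intersects are candidates.
        candidates = [
            (cosine_similarity(first["feature"], second["feature"]), second_id)
            for second_id, second in second_tracks.items()
            if windows_intersect(first["window"], second["window"])
        ]
        if candidates:
            bindings[first_id] = max(candidates)[1]  # highest similarity wins
    return bindings


third_binding = bind_body_tracks(
    {"g1": {"window": (0, 10), "feature": [1.0, 0.0]}},
    {
        "d1": {"window": (5, 15), "feature": [0.9, 0.1]},
        "d2": {"window": (8, 20), "feature": [0.0, 1.0]},
        "d3": {"window": (11, 30), "feature": [1.0, 0.0]},
    },
)
```

In this example `d3` has an identical feature but is excluded because its time window does not intersect that of `g1`, which shows how the time-window filter narrows the candidates before feature matching.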

In other embodiments, before matching the average features of the first human body trajectory with the average features of each of the intersecting second human body trajectories, the first human body trajectory may be further refined first, as follows: extracting the features of a plurality of human body images corresponding to the first human body track to obtain a plurality of corresponding human body features; sequencing the human body characteristics corresponding to the plurality of human body images according to the time sequence of snapshot, sequentially determining the similarity between the first human body characteristic and the subsequent human body characteristic in the sequencing, and keeping the longest sequence of which the similarity is greater than a similarity threshold; and determining the average value of a plurality of human body characteristics included in the longest sequence as the average characteristic corresponding to the first human body track.

For example, suppose that 10 human body features are obtained after feature extraction is performed on the human body images corresponding to a first human body track. The 10 human body features are sorted in chronological order and denoted T1, T2, …, T10, and the similarities between T1 and each of T2, T3, …, T10 are then calculated in turn. Suppose the similarities are 98%, 90%, …, 60% and 54%, with the similarities for T2 through T7 above the threshold and the similarity for T8 the first to fall below it. If the set similarity threshold is 70%, only T1 to T7 are retained, and the average of the features T1 to T7 is calculated as the average feature corresponding to the first human body track.
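The purification step can be sketched as follows. This sketch reads "the longest sequence" as the longest run starting at the first feature in which every later feature stays above the threshold when compared with the first one; that reading, the cosine similarity measure, and the function names are assumptions.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors (0.0 for zero vectors)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def purify_track(features, similarity_threshold):
    """Return (kept_features, average_feature) for one human body track.

    features: feature vectors already sorted by snapshot time.
    """
    kept = [features[0]]
    for feature in features[1:]:
        # Stop at the first feature that is no longer similar to the first one.
        if cosine_similarity(features[0], feature) <= similarity_threshold:
            break
        kept.append(feature)
    dimension = len(kept[0])
    average = [sum(f[i] for f in kept) / len(kept) for i in range(dimension)]
    return kept, average


kept, average = purify_track(
    [[1.0, 0.0], [1.0, 0.1], [0.0, 1.0], [1.0, 0.0]],
    similarity_threshold=0.7,
)
```

The fourth feature is dropped even though it resembles the first, because the sequence is cut at the first feature that falls below the threshold.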

According to the embodiment of the invention, the first human body track and the second human body track belonging to the same visitor are bound, so that the complete wandering track of the visitor in the first area can be obtained, and in addition, the human body tracks are purified, so that the same human body track is ensured to correspond to only one visitor, the algorithm effect is improved, and meanwhile, the complexity of subsequent calculation is reduced.

In step S105, the server determines a walking track of the visitor in the first area based on the third binding relationship of the visitor, and determines an identity corresponding to the visitor based on the bound identities in the first and second binding relationships of the visitor.

Here, after the third binding relationship between the first human body trajectory and the second human body trajectory belonging to the same visitor is obtained through step S104, the first human body trajectory and the second human body trajectory in the third binding relationship of the visitor may be determined as the walking trajectory of the visitor in the first area.

For example, in a smart retail shopping mall scenario, for a customer A, after the human body track of customer A obtained by a field-type gun camera is bound with the human body track of customer A obtained by a field-type dome camera, the two human body tracks can be determined as the walking track of customer A in the public area of the mall, so that the complete walking track of customer A in the public area of the mall can be obtained based on the binding relationship.

In other embodiments, the server may further obtain a third image set acquired for the human body appearing in each second region; the first region is a common region in a target region to be analyzed, the second region is a sub-region divided from the target region, and the number of the second regions is multiple. The server firstly identifies a plurality of third human body tracks from the third image set, then registers the first human body tracks and the second human body tracks in the third binding relationship into the seed track library, and then retrieves the seed track library based on the identified third human body tracks to determine the identity of the visitor in each second area and the wandering track of each visitor in the whole target area to be analyzed (including the first area and the plurality of second areas). That is to say, the passenger flow information processing method provided by the embodiment of the present invention can determine not only the walking track of the visitor in the first area, but also the walking tracks of the same visitor in a plurality of second areas, so as to obtain the walking track of the visitor in the whole target area to be analyzed.

For example, in a smart retail shopping mall scenario, the first area is the public area of the mall and the second areas are the individual shops in the mall. A dome camera is usually deployed at the door of each shop to capture the human bodies appearing in the shop, so the third image set may be an image set composed of video frames obtained by decoding the video captured by the dome camera deployed at the shop door; accordingly, the third human body tracks may be the shop-entering tracks of different customers obtained by the store-class dome cameras. Subsequently, the shop-entering track of each customer may be matched against the seed track library (in which the walking tracks of different customers in the public area of the mall are stored in advance) to determine the identities of the customers entering and leaving the shop, as well as the walking track of each customer throughout the mall (including the public area of the mall and the shops). Specifically, when the shop-entering track of customer A matches a human body track obtained by a field-type dome camera, that track can be mapped to the human body track obtained by the field-type gun camera according to the third binding relationship; the human body track obtained by the field-type gun camera is then mapped to the face track obtained by the field-type gun camera according to the first binding relationship; and finally the identity of customer A is obtained from the association between each face track and the identity of the corresponding customer in the second binding relationship. Meanwhile, the walking track of customer A throughout the mall is obtained by combining the walking track of customer A in each shop with the walking track of customer A in the public area of the mall.

In some embodiments, determining the identity corresponding to the visitor based on the bound identities in the first binding relationship and the second binding relationship of the visitor may be implemented as follows: based on the first human body track to which the second human body track of the visitor is bound in the third binding relationship, the first face track bound to that first human body track in the first binding relationship is queried, and the identity corresponding to the visitor is determined based on the identity bound to that first face track in the second binding relationship.

By way of example, when the identity of a visitor is determined from a human body track obtained by a field-type dome camera, that track may first be mapped to the human body track obtained by the field-type gun camera according to the binding relationship between the field-type dome camera and the field-type gun camera; the binding relationship between the face track and the human body track under the field-type gun camera is then queried to map the gun-camera human body track to the corresponding face track; and finally the identity of the visitor is obtained from the identity bound to that face track.
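The chain of lookups described here can be sketched with three dictionaries standing in for the three binding relationships. The dictionary representation, the key directions, and the `resolve_identity` name are assumptions made for illustration.

```python
def resolve_identity(second_body_track, third_binding, first_binding, second_binding):
    """Follow dome-camera body track -> gun-camera body track -> face track -> identity.

    third_binding:  second (dome) body track id -> first (gun) body track id
    first_binding:  first (gun) body track id   -> first face track id
    second_binding: first face track id         -> identity
    Returns None when any link in the chain is missing.
    """
    first_body_track = third_binding.get(second_body_track)
    if first_body_track is None:
        return None
    face_track = first_binding.get(first_body_track)
    if face_track is None:
        return None
    return second_binding.get(face_track)


identity = resolve_identity(
    "dome-7",
    third_binding={"dome-7": "gun-3"},
    first_binding={"gun-3": "face-3"},
    second_binding={"face-3": "visitor-42"},
)
```

Returning `None` for a broken chain is a design choice of this sketch; the embodiment does not state how an unresolvable track is handled.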

In other embodiments, the server may further divide each day into a plurality of time intervals according to a set time granularity, and perform recognition on the first image set acquired in any time interval to obtain a plurality of first face tracks and a plurality of first human body tracks in that time interval; determine the passenger flow volume of the first area in that time interval based on the first face tracks or the first human body tracks in that time interval, where the passenger flow volume includes the number of visits (person-times) and the number of visitors; and accumulate the passenger flow volume of the first area over the plurality of time intervals to obtain the total daily passenger flow volume of the first area.

According to the embodiment of the invention, the passenger flow volume of the first area in a short time and the total passenger flow volume of each day can be counted by dividing each day, and more refined passenger flow data can be provided for the operator of the first area.
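The interval-based counting can be sketched as follows. The record layout (track start time in seconds since midnight plus the bound identity), the function name, and the choice to deduplicate visitors across intervals when computing the daily visitor count are assumptions of this sketch; the embodiment states only that the per-interval volumes are accumulated.

```python
from collections import defaultdict


def passenger_flow(track_records, granularity_seconds):
    """Count visits and distinct visitors per time interval and per day.

    track_records: list of (start_seconds_since_midnight, identity) pairs,
    one per recognized track (assumed layout).
    Returns (per_interval, total_visits, total_visitors), where
    per_interval maps interval index -> (visits, distinct_visitors).
    """
    visits = defaultdict(int)
    visitors = defaultdict(set)
    for start, identity in track_records:
        interval = int(start // granularity_seconds)
        visits[interval] += 1              # every track counts as one visit
        visitors[interval].add(identity)   # identities are deduplicated per interval
    per_interval = {i: (visits[i], len(visitors[i])) for i in visits}
    total_visits = sum(visits.values())
    all_visitors = set()
    for identities in visitors.values():
        all_visitors |= identities
    return per_interval, total_visits, len(all_visitors)


per_interval, total_visits, total_visitors = passenger_flow(
    [(10, "a"), (20, "a"), (3700, "b"), (3800, "a")],
    granularity_seconds=3600,
)
```

With an hourly granularity, customer `a` contributes three visits across two intervals but is counted once in the daily visitor total.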

In other embodiments, the passenger flow information processing method provided in the embodiments of the present invention may further include a reminding service. For example, the server may identify a behavior of a visitor entering and exiting the first area, and attributes of the visitor based on the first set of images; when the visitor with the specific attribute is identified to enter the first area, the operator of the first area is informed, so that the operator sends a corresponding reminding message to the terminal associated with the visitor with the specific attribute.

For example, in a smart retail shopping mall scenario, when the server determines, based on a video captured by a field-type gun camera, that a VIP customer has entered the mall, a corresponding notification may be sent to the operator of the mall, so that the operator pushes the latest goods or advertisements to the mobile phone of the VIP customer.

The embodiment of the invention binds each first human face track with the identity of the corresponding visitor based on the human face recognition technology, binds the first human body track and the second human body track belonging to the same visitor based on the human body recognition technology, and binds the first human face track and the first human body track belonging to the same visitor by combining the human face recognition technology and the human body recognition technology, thereby determining the identity of each visitor appearing in a first area (namely a public area of a target area to be analyzed) and the wandering track of each visitor in the first area through the binding relationship.

The following continues with an exemplary structure in which the passenger flow information processing device 243 provided by the embodiment of the present invention is implemented as software modules. In some embodiments, as shown in fig. 2, the software modules of the passenger flow information processing device 243 stored in the memory 240 may include: a stream fetching module 2431, an algorithm module 2432, a face processing module 2433, a human body processing module 2434, and a data processing module 2435.

The stream fetching module 2431 is configured to obtain a first image set acquired by simultaneously capturing the human bodies and the human faces in a first region, and obtain a second image set acquired by capturing the human bodies appearing in the first region; the algorithm module 2432 is configured to identify a plurality of first human body trajectories and a plurality of first face trajectories from the first image set, establish a first binding relationship between the first human body trajectories and the first face trajectories of the same visitor, and identify a plurality of second human body trajectories from the second image set; the face processing module 2433 is configured to establish a second binding relationship between each first face track and an identity of the corresponding visitor; the human body processing module 2434 is configured to match the plurality of second human body trajectories with the plurality of first human body trajectories to establish a third binding relationship between the first human body trajectory and the second human body trajectory belonging to the same visitor; and the data processing module 2435 is configured to determine a walking track of the visitor in the first area based on the third binding relationship of the visitor, and determine an identity corresponding to the visitor based on the bound identity in the first binding relationship and the second binding relationship of the visitor.

In some embodiments, the algorithm module 2432 is further configured to determine, in any one of the first set of images, a distance between the head keypoint coordinates of the visitor based on the human snapshot and the head keypoint coordinates of the visitor based on the human face snapshot; when the distance is smaller than the distance threshold value, determining that the visitor based on the human body snapshot and the visitor based on the human face snapshot are the same visitor, and establishing a first binding relationship between a first human body track and a first human face track of the same visitor.

In some embodiments, the face processing module 2433 is further configured to perform the following for each of the plurality of first face trajectories: performing feature extraction on the face image of the visitor corresponding to the first face track to obtain the face features of the visitor; matching the face features of the visitor with a plurality of face features bound to identities in an identity library; when the maximum matching similarity is smaller than a similarity threshold, determining that the visitor is a new visitor, adding a corresponding identity for the new visitor, and establishing a second binding relationship between the newly added identity and the first face track; and when the maximum matching similarity is greater than the similarity threshold, determining that the visitor is a visitor who has appeared before, and establishing a second binding relationship between the first face track and the identity bound to the face features with the maximum similarity.

In some embodiments, the human processing module 2434 is further configured to perform the following for each of the plurality of first human trajectories: comparing the time window of the first human body trajectory with the time window of each second human body trajectory to determine a plurality of second human body trajectories intersecting the time window of the first human body trajectory; and matching the average characteristics of the first human body track with the average characteristics of each crossed second human body track, determining the second human body track with the maximum matching similarity as the human body track belonging to the same visitor as the first human body track, and establishing a third binding relationship between the second human body track with the maximum similarity and the first human body track.

In some embodiments, the human body processing module 2434 is further configured to perform feature extraction on a plurality of human body images corresponding to the first human body trajectory to obtain a plurality of human body features; sequencing the human body characteristics of the plurality of human body images according to the time sequence of snapshot, sequentially determining the similarity between the first human body characteristic and the subsequent human body characteristic in the sequencing, and keeping the longest sequence of which the similarity is greater than a similarity threshold; and determining the average value of a plurality of human body characteristics included in the longest sequence as the average characteristic of the first human body track.

In some embodiments, the data processing module 2435 is further configured to determine the first human body trajectory and the second human body trajectory in the third binding relationship of the visitor as the walking trajectory of the visitor in the first area; and is further configured to query, based on the first human body track to which the second human body track of the visitor is bound in the third binding relationship, the first face track bound to that first human body track in the first binding relationship, and to determine the identity corresponding to the visitor based on the identity bound to that first face track in the second binding relationship.

In some embodiments, the stream fetching module 2431 is further configured to obtain a third image set acquired for the human bodies appearing in each second region, wherein the first region is a common region in the target region to be analyzed, and the second regions are a plurality of sub-regions divided from the target region; the algorithm module 2432 is further configured to identify a plurality of third human body trajectories from the third image set; the human body processing module 2434 is further configured to register the first human body trajectory and the second human body trajectory in the third binding relationship in the seed trajectory library; and the data processing module 2435 is further configured to retrieve the seed trajectory library based on the third human body trajectory to determine the identity of the visitor in each second region or to determine the passenger flow volume of each second region.

In some embodiments, the algorithm module 2432 is further configured to divide each day into a plurality of time intervals according to a set time granularity, and identify the first image set acquired in any time interval to obtain a plurality of first face trajectories and a plurality of first human body trajectories in any time interval; determining the passenger flow volume of the first area in any time interval based on a plurality of first face tracks or a plurality of first human body tracks in any time interval, wherein the passenger flow volume comprises passenger flow times and passenger flow numbers; the data processing module 2435 is further configured to accumulate the passenger flow volume of the first area in a plurality of time intervals to obtain a total passenger flow volume of the first area per day.

In some embodiments, the face processing module 2433 is further configured to identify behavior of a visitor entering and exiting the first area and attributes of the visitor based on the first set of images; when the visitor with the specific attribute is identified to enter the first area, the operator of the first area is informed, so that the operator sends a corresponding reminding message to the terminal associated with the visitor with the specific attribute.

It should be noted that the description of the apparatus according to the embodiment of the present invention is similar to the description of the method embodiment and has similar beneficial effects, and is therefore not repeated here. Technical details not exhaustively described for the passenger flow information processing device provided by the embodiment of the invention can be understood from the description of any one of figs. 3-4.

The following describes a passenger flow information processing method provided by an embodiment of the invention by taking an intelligent retail field as an example.

Artificial intelligence technology is now widely applied in the field of intelligent retail, and a large part of it is applied to deploying intelligent retail scenarios in shopping malls and stores, empowering offline stores and creating new retail. The intelligent retail system for shopping malls provided by the related art mainly adopts human head-and-shoulder recognition technology. The scheme mainly comprises three modules: capturing the original passenger flow video data, algorithm SDK processing, and background calculation of passenger flow identities and person-times. Together they form a complete framework from multimedia data stream acquisition through CV algorithm processing to final background processing. However, that framework mainly relies on face recognition and human body detection and tracking to complete customer identity profiling and passenger flow statistics, and its target users are mainly stores. That is, the passenger flow system framework provided by the related art has the following problems:

1) Universality: the passenger flow system provided by the related art can only make use of a user's store-entry behavior; behaviors such as entering the mall, entering a floor, and entering or leaving an atrium are not effectively used. As a result, the scheme can only provide passenger flow identity and passenger flow volume at the store level, and the visiting track and behavior of the user throughout the whole mall are not fully mined. Such local data is subject to a degree of chance, which does not help with reasonable site selection in the mall, analysis of customer habits, or advertisement placement by stores, and cannot meet the requirements of a mall-level passenger flow system.

2) Functionality: the passenger flow system provided by the related art takes head recognition as its core and can only count the track information of a region of interest to obtain person-time data for that region. Its function is therefore limited: it cannot count the number of distinct persons in the region, and it cannot provide the visiting track of a customer across the whole shopping mall and each store.

In view of the above problems, an embodiment of the present invention provides a system framework for implementing a method for processing passenger flow information. The framework is specially designed for deploying a CV (computer vision) based smart retail scheme in shopping malls; it provides accurate passenger flow data for stores and malls, and supports cross-day customer archives and shopping track queries. The framework integrates a stream fetching module based on multimedia acquisition, an algorithm module based on CV technology, a face processing module based on face recognition, a human body processing module based on the human body recognition Reid technology, and a final data processing module. The framework can be widely deployed on current mainstream hardware platforms, suits the various requirements of current mainstream mall scenarios, and can provide more refined, digitized passenger flow information for malls and stores, so that they can make more reasonable operating decisions according to customers' shopping behaviors and changes in passenger flow, which helps to improve the revenue of the users of the system.

Compared with the related art, the system framework for realizing the passenger flow information processing method provided by the embodiment of the invention at least has the following advantages:

1. By combining face recognition technology with human body recognition technology, the system framework for realizing the passenger flow information processing method provided by the embodiment of the invention makes use of offline behaviors such as entering the mall and entering a store, provides regional passenger flow information, customer identity archives and strolling tracks for mall and store users, and uses this data to provide accurate digital information for the mall's business recruitment and operations and for precision marketing by stores.

2. The system framework for realizing the passenger flow information processing method provided by the embodiment of the invention can be deployed on current mainstream hardware platforms, such as SoC (System-on-a-Chip) equipment represented by NVIDIA Tegra and Intel Movidius, a PC (personal computer) with a GeForce GPU, and server equipment with a Tesla/Quadro GPU. Meanwhile, the embodiment of the invention can meet the various requirements of smart retail malls and stores regarding customer consumption habits, operating cost, and resource distribution and management; it is widely applicable, places no requirement on specific malls or stores, and the scheme is replicable.

3. The embodiment of the invention designs a technical scheme that combines face recognition and human body recognition: the Face ID provides the archive information of a customer's permanent identity, and Reid provides the customer's shopping track, so that an accurate full-range shopping route map and archive information are obtained for each customer.

4. The system has rich and customizable functions. Based on face recognition technology, the embodiment of the invention provides mall-level passenger flow information in terms of person-times and number of persons, and supports cross-day queries of customer identity archives; human body recognition technology provides passenger flow person-times and number of persons at the level of each store in the mall, as well as the shopping track of a customer throughout the mall. Moreover, the functions of the system framework are independent of one another and can be dynamically scaled up or down, so that functions can be selected according to user requirements. For example, if a user only cares about regional passenger flow information, the system functions can be customized transparently through configuration.

The system framework for realizing the passenger flow information processing method provided by the embodiment of the invention can be applied to various smart retail scenarios such as shopping malls, department stores and shopping centers. By establishing an accurate passenger flow analysis system, it provides malls and stores with rich information such as passenger flow person-times, number of persons, customer identities and shopping tracks.

For example, referring to fig. 5, fig. 5 is a schematic diagram of store passenger flow and customer attributes provided by an embodiment of the present invention, and as shown in fig. 5, a system framework implementing a passenger flow information processing method provided by an embodiment of the present invention can obtain rich information such as total passenger flow times (51) of an entire store, an entry rate (52) of the store, male and female proportions (53) of customers in the store, and age distribution (54) of the customers, so as to help the store adjust a business recruitment policy, and help the store complete accurate marketing and personalized recommendations.

For example, referring to fig. 6, fig. 6 is a schematic diagram of a shopping track of a customer in a shopping mall provided by an embodiment of the present invention, as shown in fig. 6, a Face ID (61) provided based on a Face recognition technology provides an identity archive and an attribute portrait such as gender, age, etc. for the customer, a shopping track (62) and a passenger flow behavior analysis (63) of the customer in the shopping mall are completed based on a human body recognition Reid technology, such as entering and exiting the shopping mall, staying time, etc., and then the Face attribute information is combined to help the shopping mall to know the shopping habits of the customer, so as to optimize the diversion efficiency; meanwhile, the system helps the shop to complete personalized recommendation and optimize commodity display strategies, and promotes shop resource management and marketing.

The following is a detailed description of a system framework for implementing the passenger flow information processing method according to the embodiment of the present invention.

Referring to fig. 7, fig. 7 is a schematic diagram of a system framework for implementing a passenger flow information processing method according to an embodiment of the present invention. As shown in fig. 7, the framework integrates a stream fetching module based on multimedia acquisition, an algorithm module based on CV computation, a face processing module based on face recognition technology, a human body processing module based on human body recognition technology, and a final data processing module. A real-time camera video stream (comprising the real-time video stream of the face camera and that of the human body camera) or an offline video stream passes through the stream fetching module to the algorithm module, where a face detection and tracking algorithm pipeline and a human body detection and tracking algorithm pipeline produce data such as face tracks, human body tracks and passenger flow behaviors (such as entering and leaving the mall, entering and leaving a store, and stay time). The obtained data is then distributed to the face processing module and the human body processing module. The face processing module applies basic CV capabilities such as feature extraction and retrieval to face images to complete Face ID based customer identities, attribute archives and behaviors (such as entering and leaving the mall). The human body processing module applies basic CV capabilities such as feature extraction and retrieval to human body images to complete the Reid-based calculation of customers entering and leaving the mall and of their shopping tracks within the mall. Finally, the face processing results and human body processing results obtained by the two modules are reported to the data processing module, which completes the calculation and display of the passenger flow volume, passenger flow behaviors, customer identities and customer shopping tracks for the whole shopping mall and all stores.

It should be noted that the modules included in the framework for implementing the passenger flow information processing method according to the embodiments of the present invention are relatively independent and decoupled from one another, can perform deep learning calculation by using the computing resources of a Central Processing Unit (CPU), a Graphics Processing Unit (GPU) and a Video Processing Unit (VPU) in a balanced manner, and support the import and export of intermediate result data.

Referring to fig. 8, fig. 8 is a schematic structural diagram of a stream fetching module according to an embodiment of the present invention. As shown in fig. 8, the stream fetching module mainly comprises a decoder and the processing of original image frames, where the video decoder is the core of the module. For the sake of universality, the passenger flow information processing system according to an embodiment of the present invention is compatible with current mainstream hardware platforms. For GPU decoding, the QSV (Quick Sync Video) hard decoder plug-in on Intel integrated GPUs and the CUDA hard decoder plug-in on the GeForce/Tesla/Quadro platforms can be supported, where GeForce/Tesla/Quadro are GPU hardware platforms of NVIDIA, and CUDA is the underlying library provided by NVIDIA for computation on its GeForce/Tesla/Quadro series products. For CPU soft decoding, the most common FFmpeg decoding can be supported, where FFmpeg is an open source framework for handling multimedia data. When the passenger flow analysis system runs, the corresponding decoder can be selected, initialized and started according to the specific hardware deployment platform and requirements, and the original image frames obtained by decoding can be passed to the algorithm module in various ways, such as network transmission or shared memory.
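The decoder selection at start-up could look roughly like the sketch below. The platform labels, the `Decoder` enum and `select_decoder` are hypothetical names used only for illustration; the sketch does not call any real decoding library.

```python
from enum import Enum


class Decoder(Enum):
    QSV = "qsv"            # hard decoding on an Intel integrated GPU
    CUDA = "cuda"          # hard decoding on NVIDIA GeForce/Tesla/Quadro
    FFMPEG_SOFT = "ffmpeg"  # CPU soft decoding


def select_decoder(platform, prefer_hardware=True):
    """Pick a decoder from an assumed platform label taken from configuration."""
    if prefer_hardware:
        if platform == "intel_igpu":
            return Decoder.QSV
        if platform in {"geforce", "tesla", "quadro"}:
            return Decoder.CUDA
    # Fall back to CPU soft decoding on unknown platforms or when requested.
    return Decoder.FFMPEG_SOFT
```

Falling back to soft decoding on an unrecognized platform keeps the stream fetching module usable everywhere, at the cost of higher CPU load.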

Referring to fig. 9, fig. 9 is a schematic structural diagram of an algorithm module provided in the embodiment of the present invention. As shown in fig. 9, the algorithm module comprises three parts at the logical level. The first part is an application layer visible to the user, which mainly comprises a configuration management class for reading configuration, a hardware management class for managing hardware resources, an algorithm management class for selecting, in a balanced manner, the algorithm pipeline for an original image frame, and a message communication class. The algorithm pipelines required for the different types of cameras are described below.

1) Algorithm pipeline for the field-type gun camera

The field-type gun camera is generally deployed in key areas of a shopping mall, such as mall entrances and atriums. Because the gun camera can capture clear human bodies and faces at the same time, the processing of the data it acquires can be divided into three parts: human body processing, face processing, and the binding of faces to human bodies.

Referring to fig. 10, fig. 10 is a schematic diagram of the algorithm pipeline for the field-type gun camera according to an embodiment of the present invention. As shown in fig. 10, human body processing mainly uses human body detection, human body tracking, human body optimization and human body key point detection to form human body tracks in key areas of the shopping mall. The specific process is as follows: human body recognition is performed on the video frames shot by the gun camera; for the series of video frames in which each recognized human body appears, the position of the camera that acquired the corresponding video frame is taken as the position where the human body appears, and these positions are recorded in chronological order to form a human body track (for example, identified by Reid and denoted trace_id_reid). The human body track thus records, in chronological order, the series of positions where the human body appears.

Face processing mainly uses face detection, face tracking, face registration and face quality scoring to form face tracks in key areas of the shopping mall. The specific process is as follows: face recognition is performed on the video frames shot by the gun camera; for the series of video frames in which each face appears, the position of the gun camera that acquired the corresponding video frame is taken as the position where the face appears, and these positions are recorded in chronological order to form a face track (for example, identified by Face ID and denoted trace_id_face). The face track thus records, in chronological order, the series of positions where the face appears.

After the human body snapshot and face snapshot algorithms have each finished processing an original image frame, a binding SDK is called to bind the face track corresponding to a face and the human body track corresponding to a human body in the same image frame. That is, the face-body binding SDK identifies the face snapshot and the human body snapshot that belong to the same customer in the same image frame under the gun camera, and binds the face track and the human body track belonging to that customer. The basic principle of binding is as follows: a human body key point SDK is used to obtain the head key point coordinates of the human body snapshot, a Euclidean distance is calculated from these head key point coordinates, and this distance is used to judge whether the face and the human body in the same frame belong to the same customer (when the Euclidean distance is smaller than a distance threshold, they are judged to be the same customer; when it is larger than the distance threshold, they are judged not to be the same customer). If they are the same customer, the human body track ID corresponding to the human body and the face track ID corresponding to the face are bound, so that the customer's track data contains both the face track and the human body track. This is particularly important for acquiring identity information and attribute information based on a store-entering track (namely, the human body track of the customer in the store): because the face track is associated with identity and attribute information, once the human body track and the face track of a customer in the mall are bound, the human body track data is in effect associated with the identity and attribute information of that customer. The formula for binding the face track and the human body track of the same customer in the mall is as follows:

trace_id_face = trace_id_reid    (1-1)

where trace_id_face is the face track of the customer in the mall, and trace_id_reid is the human body track of the customer in the store.
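A minimal sketch of the distance-based binding follows. It assumes that each face snapshot is reduced to a single point (for example the center of the face box) and that the Euclidean distance is taken between that point and the head key point of each human body snapshot in the same frame; the text only states that the distance is derived from the head key point coordinates, so the second point, the one-to-one constraint and the function name are assumptions.

```python
import math


def bind_faces_to_bodies(faces, bodies, distance_threshold):
    """Bind face tracks to body tracks within one image frame.

    faces:  {trace_id_face: (x, y)}  assumed face position in the frame
    bodies: {trace_id_reid: (x, y)}  head key point of the body snapshot
    Returns {trace_id_face: trace_id_reid} for pairs judged to be the same customer.
    """
    bindings = {}
    used_bodies = set()
    for face_id, face_xy in faces.items():
        best_body, best_distance = None, distance_threshold
        for body_id, head_xy in bodies.items():
            if body_id in used_bodies:
                continue  # assumed: each body binds to at most one face
            distance = math.dist(face_xy, head_xy)
            if distance < best_distance:
                best_body, best_distance = body_id, distance
        if best_body is not None:
            bindings[face_id] = best_body
            used_bodies.add(best_body)
    return bindings


faces = {"f1": (100, 50), "f2": (400, 60)}
bodies = {"b1": (102, 52), "b2": (800, 60)}
frame_bindings = bind_faces_to_bodies(faces, bodies, distance_threshold=20)
```

In this example only `f1` is bound, because no head key point lies within the threshold of `f2`.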

2) Algorithm pipeline for the field-type dome camera

The field-type dome camera is generally paired with the field-type gun camera and is also deployed in key areas of a shopping mall, such as mall entrances and atriums. Its algorithm processing flow likewise comprises human body detection, human body tracking and human body optimization, and it is mainly used to calculate human body tracks in the key areas of the mall and to determine entry into and exit from the mall, so that the data processing module can compile the mall passenger flow information (including person-times and number of persons). Because the field-type dome camera and the field-type gun camera are generally paired in the same area, the human bodies they capture are in theory the same, which lays the theoretical basis for the gun-dome matching logic of the subsequent human body processing module.

3) Algorithm pipeline for the store-type dome camera

The store-type dome camera is deployed inside a store and uses a human body SDK processing flow comprising human body detection, human body tracking and human body optimization to obtain the human body tracks inside the store and to determine in-store passenger flow behaviors (such as stay time in the store and entering or leaving the store), so that the data processing module can compile the in-store passenger flow information.

It should be noted that the above algorithm pipelines depend on a number of computation-layer SDKs that encapsulate CV computation. These SDKs can use various deep learning forward-inference acceleration libraries, and are therefore compatible with computing hardware of various architectures, such as Intel CPUs/GPUs, VPU compute cards, ARM CPUs and NVIDIA GPUs. The functions of the SDKs in these algorithm pipelines are independent of one another and can be dynamically scaled according to actual needs; for example, when only passenger flow data needs to be counted, only the detection SDK and the tracking SDK need to be initialized and executed. In addition, because hardware computing resources such as the CPU, GPU and VPU are used at the same time, the algorithm module can improve resource utilization through heterogeneous parallel computing: multiple threads process the video frames captured by the multiple camera channels, tasks with low computing pressure are assigned to the CPU, computationally complex tasks are dispatched to the GPU and VPU, and the multiple video channels are distributed evenly across the resource cards.
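The dispatch policy described above can be sketched as a planning function. It covers only the assignment decision (light tasks to the CPU, heavy tasks spread round-robin over accelerator cards per camera stream), not the multithreaded execution; the cost values, the `light_cost` cut-off and the device labels are assumptions for illustration.

```python
from itertools import cycle


def plan_dispatch(tasks, accelerators, light_cost=1.0):
    """tasks: list of (stream_id, task_name, cost); returns (stream_id, task_name, device)."""
    next_card = cycle(accelerators)
    card_for_stream = {}  # keep all heavy tasks of one stream on the same card
    plan = []
    for stream_id, task_name, cost in tasks:
        if cost <= light_cost or not accelerators:
            device = "cpu"
        else:
            if stream_id not in card_for_stream:
                card_for_stream[stream_id] = next(next_card)
            device = card_for_stream[stream_id]
        plan.append((stream_id, task_name, device))
    return plan


tasks = [
    ("cam1", "resize", 0.5),
    ("cam1", "detect", 5.0),
    ("cam2", "detect", 5.0),
    ("cam3", "detect", 5.0),
]
plan = plan_dispatch(tasks, accelerators=["gpu0", "vpu0"])
```

Pinning a stream to one card is a simplification; a real scheduler might also weigh current load on each card.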

Referring to fig. 11, fig. 11 is a schematic structural diagram of a face processing module according to an embodiment of the present invention. As shown in fig. 11, the face processing module mainly comprises various business services based on CV computation and micro-services that encapsulate each algorithm SDK. Among the business services, the passenger flow service, for example, calculates the short-term passenger flow volume (including person-times and number of persons) and the passenger flow behaviors of customers entering and leaving the key areas of the shopping mall, mainly by calling the feature extraction, clustering and retrieval micro-services.

For example, the face processing module retrieves face track data from the database. Since the passenger flow service is a real-time task, the retrieval is configured to run at short intervals, for example every 10 minutes. Feature extraction is then performed on the face images corresponding to the retrieved face tracks, and the retrieved face tracks are clustered, so that several face tracks fall under the same cluster; face tracks clustered into the same cluster are in fact face tracks of the same customer. Finally, the average of the face features under the same cluster is calculated as the feature of that cluster. After the face tracks have been clustered, database retrieval starts. Since passenger flow is a real-time task, the retrieval range is limited to the most recent time window, for example the database data of the last 15 minutes. It is then determined whether the returned retrieval score is smaller than a threshold: if it is smaller, the customer corresponding to the cluster is a new customer, that is, a new passenger flow count; if it is greater than the threshold, the customer corresponding to the cluster has already appeared in the shopping mall.
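The clustering step can be illustrated with a small sketch. The text does not name a clustering algorithm or a similarity measure, so the greedy threshold clustering and the cosine similarity below are stand-ins chosen for brevity; only the use of the average feature as the cluster feature comes from the description.

```python
import math


def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def mean_feature(features):
    return [sum(column) / len(features) for column in zip(*features)]


def cluster_tracks(track_features, sim_threshold):
    """track_features: {trace_id_face: feature}; returns a list of clusters."""
    clusters = []
    for trace_id, feature in track_features.items():
        best, best_sim = None, sim_threshold
        for cluster in clusters:
            similarity = cosine(feature, cluster["feature"])
            if similarity >= best_sim:
                best, best_sim = cluster, similarity
        if best is None:
            clusters.append({"tracks": [trace_id], "members": [feature], "feature": list(feature)})
        else:
            best["tracks"].append(trace_id)
            best["members"].append(feature)
            # The cluster feature is the average of its member features.
            best["feature"] = mean_feature(best["members"])
    return clusters


clusters = cluster_tracks({"t1": [1.0, 0.0], "t2": [0.9, 0.1], "t3": [0.0, 1.0]}, sim_threshold=0.9)
```

Each resulting cluster feature would then be searched against the recent time window to decide whether the cluster corresponds to a new customer.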

In some embodiments, the passenger flow volume in the key area of the shopping mall in a short time can be calculated by the human body processing module based on the human body track of the customer in the shopping mall.

The daily archiving service mainly obtains the identity archive of each customer of the current day by calling the feature extraction, clustering, retrieval and attribute micro-services, and persists the identity archive into a relational database management system (MySQL). The feature extraction, clustering and retrieval processes are similar to those of the passenger flow service and are not repeated here.

The daily archive merging service takes the customer identities obtained on the current day and, by calling the retrieval service, obtains permanent identities that are valid across days; these are merged into a permanent library and persisted. After face feature extraction, clustering, retrieval and merging are completed, each face track can be bound to the corresponding permanent identity, according to the following formula:

trace_id_face → merge_id → face_id    (1-2)

where trace_id_face is the face track of the customer in the mall, merge_id is the identity of the customer obtained after merging, and face_id is the permanent identity of the customer.

The reporting service reports the calculation results to the data processing module. The face processing module also supports configurable scaling of its functions. For example, a reminder service for new customers or VIP customers can be added, which calls micro-services such as feature extraction and retrieval to determine whether a customer entering or leaving the mall or a store is a new customer or a VIP customer, so that the operator of the mall or store is reminded in real time to recommend commodities and advertisements to that customer promptly.

The corresponding CV micro-services mainly comprise: a face feature extraction service, used to acquire the features corresponding to a face snapshot; a face attribute service, used to acquire attributes corresponding to a face snapshot, such as age, gender and whether glasses are worn; a face clustering service, which takes a plurality of face tracks as input and outputs the cluster ID corresponding to each face track, that is, it clusters the face tracks belonging to the same customer under the same cluster; and a face 1:N retrieval service, which searches a base library and determines from the retrieval result whether to update the base library, that is, when the returned retrieval score is smaller than the threshold, the customer is a new customer and the information corresponding to the customer is added to the base library. The base library searched by the passenger flow service is a temporary passenger flow library (that is, the library covering the most recent period of time), the base library searched by the archiving service is a daily library (that is, the library covering the current day), and the base library searched by the merging service is a permanent library (that is, the library covering a longer time horizon).
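The search-then-update behavior of the 1:N retrieval service can be sketched as follows. The class name `FaceLibrary`, the linear scan, the cosine score and the generated customer identifiers are all assumptions for illustration; a production service would more likely use an indexed vector search.

```python
import math


class FaceLibrary:
    """In-memory stand-in for a library searched by 1:N retrieval."""

    def __init__(self, score_threshold):
        self.score_threshold = score_threshold
        self.entries = {}  # customer_id -> feature
        self._next_index = 0

    @staticmethod
    def _score(a, b):
        dot = sum(x * y for x, y in zip(a, b))
        norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
        return dot / norm if norm else 0.0

    def search_or_register(self, feature):
        """Return (customer_id, is_new). Registers the feature when the best score is too low."""
        best_id, best_score = None, -1.0
        for customer_id, stored in self.entries.items():
            score = self._score(feature, stored)
            if score > best_score:
                best_id, best_score = customer_id, score
        if best_id is None or best_score < self.score_threshold:
            new_id = f"customer_{self._next_index}"
            self._next_index += 1
            self.entries[new_id] = list(feature)
            return new_id, True
        return best_id, False


library = FaceLibrary(score_threshold=0.8)
first = library.search_or_register([1.0, 0.0])
second = library.search_or_register([0.99, 0.05])
third = library.search_or_register([0.0, 1.0])
```

The same class could be instantiated three times with different retention windows to model the temporary, daily and permanent libraries.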

Meanwhile, considering that there may be many face processing tasks, a peak clipping and valley filling strategy can be introduced; that is, request peaks are smoothed by introducing caches such as Redis and Kafka, so as to ensure high availability of the system. Redis is an open source, log-type key-value database written in ANSI C that supports network access, can run in memory or with persistence, and provides application programming interfaces for multiple languages. Kafka is a high-throughput distributed publish-subscribe messaging system that can handle all the action stream data of consumers in a website.
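The idea of peak clipping and valley filling can be shown with an in-process buffer: bursts of requests are queued, and the processing side drains them at a bounded rate. This sketch uses a plain `collections.deque` as a stand-in and does not use Redis or Kafka; `RequestBuffer` and its methods are hypothetical names.

```python
from collections import deque


class RequestBuffer:
    """Buffers bursts so the consumer can process at a steady rate."""

    def __init__(self):
        self._pending = deque()

    def submit(self, request):
        # Producers enqueue immediately, even during a traffic peak.
        self._pending.append(request)

    def drain(self, max_batch):
        # The consumer takes at most max_batch requests per processing cycle.
        batch = []
        while self._pending and len(batch) < max_batch:
            batch.append(self._pending.popleft())
        return batch

    def __len__(self):
        return len(self._pending)


buffer = RequestBuffer()
for request_id in range(5):      # a burst of five requests
    buffer.submit(request_id)
cycles = [buffer.drain(max_batch=2) for _ in range(4)]
```

A durable queue adds what this sketch lacks: the backlog survives a restart of the processing service.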

Referring to fig. 12, fig. 12 is a schematic structural diagram of a human body processing module according to an embodiment of the present invention. As shown in fig. 12, the human body processing module mainly comprises various business services based on CV computation and micro-services that encapsulate each algorithm SDK. Among the business services, the distribution service, for example, mainly distributes the human body track data calculated by the algorithm module according to camera type: the data corresponding to the field-type cameras is distributed to the approach service, and the data corresponding to the store-type cameras is distributed to the store-entry service. These two services are described below.

1. Approach service

The approach service mainly processes the human body tracks captured by the field-type gun camera and by the field-type dome camera. The processing comprises: selecting the top-N snapshots of each human body track, feature extraction, purification, attribute extraction, gun-dome matching within the field-type area (that is, matching the human body track captured by the gun camera with the human body track captured by the dome camera), and persistence to MySQL. The human body track to be retrieved is then registered with the human body retrieval service as a seed track.

In some embodiments, in order to improve the algorithm effect, a purification scheme is further introduced in the embodiments of the present invention. As shown in fig. 13, the main process is to take the multiple human body snapshots corresponding to the same human body track, obtain the Reid human body features of these snapshots, calculate the similarities between the Reid features in chronological order, and use threshold filtering to ensure that the snapshots retained over the continuous duration of the track all belong to the same customer. On the one hand this improves the algorithm effect; on the other hand it reduces the number of snapshots per human body track, which reduces the complexity and resource consumption of subsequent calculation.
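One way to read the purification step is sketched below: the snapshots of a track are sorted by time, and a snapshot is kept only if its Reid feature is sufficiently similar to the last retained snapshot. The comparison against the last retained snapshot and the cosine similarity are assumptions; the text only states that similarities are computed in chronological order and filtered by a threshold.

```python
import math


def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def purify_snapshots(snapshots, sim_threshold):
    """snapshots: list of (timestamp, reid_feature) belonging to one body track."""
    ordered = sorted(snapshots, key=lambda snapshot: snapshot[0])
    kept = []
    for timestamp, feature in ordered:
        # The earliest snapshot anchors the track; later ones must resemble the last kept one.
        if not kept or cosine(kept[-1][1], feature) >= sim_threshold:
            kept.append((timestamp, feature))
    return kept


snapshots = [(2, [0.0, 1.0]), (1, [1.0, 0.0]), (3, [0.9, 0.1])]
kept = purify_snapshots(snapshots, sim_threshold=0.8)
```

In the example the snapshot at time 2 is dropped as an outlier, leaving two consistent snapshots for the track.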

In addition, the human body processing module receives both the human body tracks captured by the field-type gun camera and those captured by the field-type dome camera, and the human body track of the gun camera has already been bound to a face track in the algorithm module. Following the idea of combining face recognition with human body recognition, the embodiment of the invention therefore introduces gun-dome matching logic into the approach service, the purpose of which is to bind the human body track captured by the gun camera to the human body track captured by the dome camera.

Referring to fig. 14, fig. 14 is a schematic flow chart of binding a human body track obtained from a field-type gun camera to a human body track obtained from a field-type dome camera according to an embodiment of the present invention; the steps shown in fig. 14 are described below. Steps S302 to S314 process the human body track data obtained by the field-type gun camera, and steps S315 to S321 process the human body track data obtained by the field-type dome camera.

In step S301, the human body trajectory data captured by the field-type camera, including the human body trajectory obtained by the field-type gun camera and the human body trajectory obtained by the field-type dome camera, is received.

(1) Processing of human body trajectories obtained by the field-type gun camera

In step S302, because the dome camera has a wide field of view and its processing takes time, the process waits before continuing.

In step S303, the average feature of the human body trajectory obtained by the field-type gun camera and the time window corresponding to the trajectory are obtained.

In step S304, the number paid_id of the field-type gun camera that captured the human body trajectory is acquired.

In step S305, whether the field-type gun camera is included in the gun-ball cache pool (paiMap) is determined according to the number paid_id of the field-type gun camera that captured the human body trajectory; if so, step S306 is executed; if not, step S314 is executed.

In step S306, all the corresponding field-type dome cameras are obtained according to the number paid_id of the field-type gun camera.

In step S307, all the corresponding field-type dome cameras are traversed, and all cached human body trajectories are obtained according to the number camera_id of each field-type dome camera.

In step S308, for each cached human body trajectory obtained by the field-type dome camera, the average feature of the trajectory and its corresponding time window are calculated.

In step S309, whether the time window of the human body trajectory obtained by the field-type gun camera intersects the time window of the human body trajectory obtained by the field-type dome camera is determined; if the time windows intersect, step S310 is executed.

In step S310, the Euclidean distance between the average feature of the human body trajectory obtained by the field-type gun camera and the average feature of the human body trajectory obtained by the field-type dome camera is calculated.

In step S311, each human body trajectory obtained by the field-type dome camera whose distance is smaller than the distance threshold is placed in a candidate queue.

In step S312, the traversal continues over all the human body trajectories obtained by the field-type dome cameras.

In step S313, the candidate queue is sorted by the score calculated from the features, and the best candidate is selected as the matching result.

In step S314, the processing procedure is ended.
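Steps S305 to S313 can be sketched as follows. This is only an illustration under assumptions: the cache pool is modelled as a nested dictionary (gun camera number, then dome camera number, then track number), the ranking score is taken to be the Euclidean distance itself (smaller is better), and the names Track, gun_ball_pool and match_gun_track are hypothetical.

```python
import math
from dataclasses import dataclass
from typing import Dict, Optional, Sequence, Tuple


@dataclass
class Track:
    trace_id: str
    avg_feature: Sequence[float]
    window: Tuple[float, float]  # (start time, end time)


# Assumed layout: paid_id -> camera_id -> trace_id -> cached dome camera track.
Pool = Dict[str, Dict[str, Dict[str, Track]]]


def windows_intersect(a: Tuple[float, float], b: Tuple[float, float]) -> bool:
    """Two closed time windows intersect when neither ends before the other starts."""
    return a[0] <= b[1] and b[0] <= a[1]


def match_gun_track(gun_track: Track, paid_id: str, gun_ball_pool: Pool,
                    distance_threshold: float) -> Optional[Track]:
    """Return the best-matching dome camera track, or None if there is none."""
    if paid_id not in gun_ball_pool:                            # S305 -> S314
        return None
    candidates = []
    for camera_id, cache in gun_ball_pool[paid_id].items():     # S306, S307
        for dome_track in cache.values():                       # S308, S312
            if not windows_intersect(gun_track.window, dome_track.window):  # S309
                continue
            distance = math.dist(gun_track.avg_feature, dome_track.avg_feature)  # S310
            if distance < distance_threshold:                   # S311
                candidates.append((distance, dome_track))
    if not candidates:
        return None
    candidates.sort(key=lambda c: c[0])                         # S313 (assumed score)
    return candidates[0][1]
```

Checking the time windows before computing any distance keeps the number of feature comparisons small, since only tracks that could show the same person at the same time are compared.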

(2) Processing of human body trajectories obtained by the field-type dome camera

In step S315, whether the field-type dome camera exists in the gun-ball cache pool paiMap is determined according to the number paid_id of the field-type dome camera; if not, step S316 is executed; if so, step S317 is executed.

In step S316, the number of the field-type dome camera is registered in the gun-ball cache pool paiMap.

In step S317, whether the information of the field-type dome camera exists in the gun-ball queue is determined according to the number camera_id of the field-type dome camera; if not, step S318 is executed; if so, step S319 is executed.

In step S318, the information of the field-type dome camera is registered in the gun-ball queue.

In step S319, whether the human body trajectory already exists in the cache corresponding to the field-type dome camera is determined according to the number trace_id of the human body trajectory; if not, step S320 is executed.

In step S320, the new human body trajectory information is registered in the cache corresponding to the field-type dome camera.

In step S321, the processing procedure is ended.
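The registration flow of steps S315 to S320 can be sketched with a nested dictionary. The layout (paid_id, then camera_id, then trace_id) and the function name register_dome_track are assumptions made for illustration; the text does not specify how the cache pool or the gun-ball queue is stored.

```python
from typing import Any, Dict

# Assumed layout: paid_id -> camera_id -> trace_id -> track information.
Pool = Dict[str, Dict[str, Dict[str, Any]]]


def register_dome_track(pool: Pool, paid_id: str, camera_id: str,
                        trace_id: str, track_info: Any) -> bool:
    """Register a dome camera track; return True only if it was newly added."""
    cameras = pool.setdefault(paid_id, {})      # S315, S316: register in the pool if absent
    cache = cameras.setdefault(camera_id, {})   # S317, S318: register the camera if absent
    if trace_id in cache:                       # S319: track already cached
        return False
    cache[trace_id] = track_info                # S320: register the new track
    return True
```

Because each level is created only when missing, repeated reports for the same camera or the same track leave the existing cache entries untouched.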

It should be noted that steps S302 to S314 and steps S315 to S321 may be executed simultaneously; that is, the human body trajectories obtained by the field-type gun camera and the human body trajectories obtained by the field-type dome camera may be processed at the same time.

Therefore, through gun-ball matching, the human body track obtained by the field-type gun camera can be bound to the human body track obtained by the field-type dome camera, so that both can be registered in the seed library of the retrieval service. Even if the store service retrieves a dome camera track (that is, a human body track obtained by the field-type dome camera), the corresponding human body track obtained by the field-type gun camera can still be found through the binding relationship, which effectively improves the recall rate of re-id retrieval. Without the gun-ball matching logic, the seed library would contain only a small number of gun camera human body tracks, so a retrieval based on a store-entry track might find no match. After gun-ball matching, the human body track obtained by the field-type gun camera and the human body track obtained by the field-type dome camera can be registered as human body tracks of the same customer in the mall, according to the following formula:

wherein the two symbols in the formula denote, respectively, the human body track obtained by the field-type dome camera and the human body track obtained by the field-type gun camera.

2. Store service

The store service mainly performs feature extraction, attribute extraction and persistence on the reported human body tracks captured by the store-type dome cameras, and finally obtains the seed track of the customer in the whole mall by searching the field seed library.

The reporting service mainly reports the field tracks (that is, the human body tracks obtained by the field-type gun and dome cameras), the store-entry tracks (that is, the human body tracks obtained by the store-type dome cameras) and the seed tracks retrieved from the store-entry tracks to the computing module. Similarly, the human body processing module also supports configurable and scalable functions; for example, when only the passenger volume of a store is of interest, subsequent operations such as feature extraction and retrieval can be skipped. In addition, among the CV microservices, human body feature extraction obtains the features corresponding to a human body track; human body attribute extraction obtains attributes such as clothing and bags corresponding to a human body track; and human body retrieval calculates the similarity between a store-entry track and the seed tracks in the seed library, and is configured as either passenger flow retrieval or customer identity retrieval through different retrieval thresholds.

The corresponding field track can be retrieved from a store-entry track because the track of a customer entering and leaving a store overlaps with the walking track of the customer in the mall as a whole. For example, the track of a customer near the store entrance captured by the store-type dome camera overlaps with the track of the customer entering and leaving the store captured by the field-type dome camera. The track of entering and leaving the store can therefore be matched to the mall-wide track, which yields both the identity of the customer and the walking track of the customer in the whole mall.

Referring to fig. 15, fig. 15 is a schematic flowchart of retrieving the seed track library based on a store-entry track according to an embodiment of the present invention. As shown in fig. 15, passenger flow retrieval mainly aims to obtain an identity for each track of entering or leaving a store (that is, only different customers need to be distinguished), so that the downstream data processing module can deduplicate the tracks and obtain the number of people; no threshold needs to be set, and the N results with the highest retrieval scores are returned directly. Identity retrieval aims to obtain the mall-wide track corresponding to a track of entering or leaving a store and, from it, the identity profile of the customer, so a threshold needs to be set. The specific retrieval process is as follows. Suppose the identity of customer A needs to be determined. The seed track library (which stores in advance the walking tracks of different customers in the mall) is retrieved based on the store-entry tracks of customer A at several different stores. When the proportion of retrievals whose highest returned score exceeds the score threshold reaches the set threshold, the identity of the customer is determined from the matched walking track of the customer in the mall. For example, with a threshold of 80%, if the store-entry tracks of customer A at 5 different stores are used to retrieve the seed track library, the highest returned score must exceed the retrieval score threshold in at least 4 of the 5 retrievals for the identity to be determined. When only the passenger volume of a store needs to be determined, no threshold is set, and the identity of the customer is determined directly from the walking track with the highest retrieval score. Further, as shown in fig. 15, when a dome camera track is retrieved based on the store-entry track, the dome camera track can be mapped to a gun camera track through the gun-ball matching logic described above, and thus:

wherein trace_id_shop is the store-entry track. Combining formula (1-3), obtained from the gun-ball matching logic in the field service, gives the relationship between the store-entry track and the human body track obtained by the field-type gun camera; combining formula (1-1), obtained from the human body binding logic in the algorithm module, gives the mapping between the store-entry track and the face track; and finally, combining formula (1-2), from the profiling and library-merging logic of the face processing module, links the store-entry track to the permanent identity of the customer, thus giving:

trace_id_shop→face_id (1-5)
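The two retrieval modes described before the mapping formulas can be sketched as follows. The data shapes and function names are hypothetical, and returning the most frequently matched seed track when the vote passes is an assumption; the text only states that the identity is determined from the matched walking track.

```python
from collections import Counter
from typing import List, Optional, Tuple

Hit = Tuple[str, float]  # (seed track id, retrieval score); assumed shape


def passenger_flow_retrieval(hits: List[Hit], n: int) -> List[Hit]:
    """Passenger flow retrieval: no threshold, return the N highest-scoring hits."""
    return sorted(hits, key=lambda h: h[1], reverse=True)[:n]


def identity_retrieval(per_store_hits: List[List[Hit]], score_threshold: float,
                       vote_threshold: float = 0.8) -> Optional[str]:
    """Identity retrieval over the store-entry tracks of one customer.

    Each inner list holds the hits of one retrieval (one store). The identity
    is accepted only if the share of retrievals whose best score exceeds
    score_threshold reaches vote_threshold (e.g. 4 of 5 for 80%).
    """
    if not per_store_hits:
        return None
    confident = []
    for hits in per_store_hits:
        if not hits:
            continue
        seed_id, score = max(hits, key=lambda h: h[1])
        if score > score_threshold:
            confident.append(seed_id)
    if len(confident) / len(per_store_hits) < vote_threshold:
        return None
    # Assumption: pick the seed track matched most often among confident retrievals.
    return Counter(confident).most_common(1)[0][0]
```

Requiring agreement across several stores trades some recall for a lower risk of attaching a profile to the wrong customer, whereas passenger flow counting can tolerate weaker matches.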

Meanwhile, when the human body processing module has many tasks, a peak clipping and valley filling strategy can be introduced; that is, requests are buffered through caches such as Redis and Kafka so that load peaks are smoothed, which ensures the high availability of the system.
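The idea of peak clipping and valley filling can be shown with a small in-process sketch. Here a plain deque stands in for the Redis or Kafka buffer, and the class name RequestBuffer and the per-tick capacity are hypothetical; no real Redis or Kafka client calls are shown.

```python
from collections import deque
from typing import Deque, List


class RequestBuffer:
    """Buffers bursts of requests and releases them at a bounded rate."""

    def __init__(self, max_per_tick: int) -> None:
        self.max_per_tick = max_per_tick      # assumed processing capacity per tick
        self._pending: Deque[str] = deque()   # stand-in for the external cache or queue

    def submit(self, request: str) -> None:
        """Accept a request immediately, even during a burst."""
        self._pending.append(request)

    def drain_tick(self) -> List[str]:
        """Release at most max_per_tick requests, oldest first."""
        batch: List[str] = []
        while self._pending and len(batch) < self.max_per_tick:
            batch.append(self._pending.popleft())
        return batch

    def backlog(self) -> int:
        """Number of requests still waiting."""
        return len(self._pending)
```

A burst of five requests with a capacity of two per tick is processed over three ticks, so the downstream module never sees more than two requests at once and the idle ticks after the burst absorb the backlog.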

Referring to fig. 16, fig. 16 is a schematic structural diagram of the data processing module according to an embodiment of the present invention. As shown in fig. 16, the data processing module is generally implemented on a big data computing platform such as Hadoop, and mainly performs statistics and calculation on the data reported by the human body processing module and the face processing module. It obtains the number of visitors and the number of visits for the mall, and the number of visitors and the number of visits for each store; archives the identity profile of the customer, such as ID, age and gender, based on the face ID; links the face track ID to the identity ID of the customer; and associates the store-entry track reported by the human body processing module with the face track through the retrieved seed track to obtain the identity ID of the customer, that is, the walking route of the customer through the whole mall. The formula is as follows:

trace_id_reid＝trace_id_face

wherein trace_id_face is a face track obtained by the field-type gun camera, and attributes denotes its attribute information: for example, in the attributes of trace_id_face, mass_zone_type represents the type of the shopping mall in which the face track of the customer was captured, mass_zone_id represents the number of the shopping mall, and in_out_mass represents entering or leaving the shopping mall; face_id is the permanent identity of the customer; trace_id_reid is a human body track obtained by a field-type camera; and trace_id_shop is a store-entry track, for which shop_zone_type represents the type of the store, shop_zone_id represents the number of the store, and in_out_shop represents entering or leaving the store.
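The association from a store-entry track to the permanent identity of a customer can be read as a chain of lookups. The sketch below assumes each binding relationship is stored as a dictionary; the parameter names are hypothetical, and treating a seed track with no dome-to-gun entry as already being a gun camera track is an assumption.

```python
from typing import Dict, Optional


def resolve_face_id(trace_id_shop: str,
                    shop_to_seed: Dict[str, str],
                    dome_to_gun: Dict[str, str],
                    body_to_face: Dict[str, str],
                    face_to_identity: Dict[str, str]) -> Optional[str]:
    """Follow store-entry track -> seed track -> gun camera track -> face track -> face_id."""
    seed = shop_to_seed.get(trace_id_shop)      # result of retrieving the seed library
    if seed is None:
        return None
    gun_track = dome_to_gun.get(seed, seed)     # gun-ball binding; keep seed if already a gun track
    face_track = body_to_face.get(gun_track)    # body-to-face binding from the algorithm module
    if face_track is None:
        return None
    return face_to_identity.get(face_track)     # face track to permanent identity
```

If any link in the chain is missing, the function returns None, which mirrors the case where a store-entry track cannot be tied to a known customer.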

Of course, functions of the data processing module can also be added or removed according to the actual requirements of the user, giving the same dynamic scalability of functions as the preceding modules.

In other embodiments, because the embodiment of the present invention establishes customer identity mainly through face recognition, relatively expensive gun cameras need to be deployed in the mall. This greatly reduces the appeal for malls that are cost-sensitive, where installing gun cameras is difficult, or that have no need to query customer identities across days.

In addition, it should be noted that the framework implementing the passenger flow information processing method provided by the embodiment of the present invention can also be extended to deployments in other industries, such as intelligent security, intelligent communities and intelligent catering.

The embodiment of the invention has the following beneficial effects:

1. Wide applicability: the method and system are suitable for the current mainstream hardware platforms, including PCs and servers, have rich and flexible functions, and suit the main shopping mall scenarios.

2. The functions of the modules involved in the embodiment of the present invention are specific and independent, and the modules are decoupled from one another. In addition, components such as Redis and Kafka are used for peak clipping and valley filling, which ensures the high availability of the system while making balanced use of computing resources such as CPUs, GPUs and VPUs.

3. High practicability: a complete intelligent retail passenger flow analysis framework combining face recognition and human body recognition is provided, which can supply rich passenger flow information to malls and stores and thereby guide malls in making more scientific tenant-recruitment strategies.

Embodiments of the present invention provide a storage medium having stored therein executable instructions, which when executed by a processor, will cause the processor to perform methods provided by embodiments of the present invention, for example, passenger flow information processing methods as shown in fig. 3-4.

In some embodiments, the storage medium may be a memory such as FRAM, ROM, PROM, EPROM, EEPROM, flash memory, magnetic surface memory, optical disk, or CD-ROM; or may be any device including one or any combination of the above memories.

In some embodiments, executable instructions may be written in any form of programming language (including compiled or interpreted languages), in the form of programs, software modules, scripts or code, and may be deployed in any form, including as a stand-alone program or as a module, component, subroutine, or other unit suitable for use in a computing environment.

By way of example, executable instructions may, but need not, correspond to files in a file system, and may be stored in a portion of a file that holds other programs or data, such as in one or more scripts stored in a HyperText Markup Language (HTML) document, in a single file dedicated to the program in question, or in multiple coordinated files (e.g., files that store one or more modules, subprograms, or portions of code).

By way of example, executable instructions may be deployed to be executed on one computing device or on multiple computing devices at one site or distributed across multiple sites and interconnected by a communication network.

In summary, the embodiment of the invention has the following beneficial effects:

3. High practicability: a complete intelligent retail passenger flow analysis framework combining face recognition and human body recognition is provided, which can supply rich passenger flow information to malls and stores, thereby guiding malls in making more scientific tenant-recruitment strategies and facilitating personalized commodity recommendation by stores.

The above description is only an example of the present invention, and is not intended to limit the scope of the present invention. Any modification, equivalent replacement, and improvement made within the spirit and scope of the present invention are included in the protection scope of the present invention.

Claims

1. A passenger flow information processing method, characterized by comprising:

2. The method of claim 1, wherein establishing a first binding relationship between a first human body trajectory and a first face trajectory belonging to the same visitor comprises:

determining a distance between head key point coordinates of a visitor based on human body snapshot and head key point coordinates of the visitor based on human face snapshot in any one first image of the first image set;

when the distance is smaller than a distance threshold value, determining that the visitor based on human body snapshot and the visitor based on human face snapshot are the same visitor, and establishing a first binding relationship between a first human body track and a first human face track of the same visitor.

3. The method of claim 1, wherein establishing a second binding relationship between each of the first face tracks and an identity of the corresponding visitor comprises:

performing the following for each of the plurality of first face trajectories:

performing feature extraction on the face image of the visitor corresponding to the first face track to obtain the face feature of the visitor;

matching the face features of the visitor with a plurality of face features of bound identities in an identity library; when the maximum similarity of the matching is smaller than a similarity threshold, determining that the visitor is a new visitor, adding a new identity for the new visitor, and establishing a second binding relationship between the newly added identity and the first face track;

and when the matched maximum similarity is larger than the similarity threshold value, determining that the visitor is the visitor who has appeared, and establishing a second binding relationship between the identity bound by the face features with the maximum similarity and the first face track.

4. The method of claim 1, wherein matching the plurality of second human body trajectories with the plurality of first human body trajectories to establish a third binding relationship between the first human body trajectory and the second human body trajectory belonging to the same visitor comprises:

performing the following for each of the plurality of first human body trajectories:

comparing the time window of the first human body trajectory with the time window of each of the second human body trajectories to determine a plurality of second human body trajectories intersecting the time window of the first human body trajectory;

and matching the average characteristics of the first human body track with the average characteristics of each intersected second human body track, determining the second human body track with the maximum matching similarity as the human body track belonging to the same visitor as the first human body track, and establishing a third binding relationship between the second human body track with the maximum similarity and the first human body track.

5. The method of claim 4, wherein prior to matching the average features of the first human body trajectory with the average features of each of the intersecting second human body trajectories, the method further comprises:

extracting the features of a plurality of human body images corresponding to the first human body track to obtain a plurality of human body features;

sequencing the human body characteristics of the plurality of human body images according to the time sequence of snapshot, sequentially determining the similarity between the first human body characteristic and the subsequent human body characteristic in the sequencing, and keeping the longest sequence of which the similarity is greater than a similarity threshold value;

and determining the average value of a plurality of human body characteristics included in the longest sequence as the average characteristic of the first human body track.

6. The method of claim 1, wherein determining the visitor's trajectory in the first area based on the visitor's third binding comprises:

determining a first human body track and a second human body track in a third binding relationship of the visitor as a walking track of the visitor in the first area;

the determining the identity corresponding to the visitor based on the bound identity in the first binding relationship and the second binding relationship of the visitor comprises:

and inquiring a first face track bound in the first binding relation based on a first human body track bound in the third binding relation of a second human body track of the visitor, and determining an identity corresponding to the visitor based on an identity bound in the second binding relation of the first face track.

7. The method according to any one of claims 1 to 6, further comprising:

acquiring a third image set acquired for human bodies appearing in each second area; wherein the first area is a common area in a target area to be analyzed; the second areas are a plurality of sub-areas divided from the target area;

identifying a plurality of third body trajectories from the third set of images;

registering the first human body track and the second human body track in the third binding relationship to a seed track library;

retrieving the seed trajectory library based on the third body trajectory to determine an identity of a visitor to each of the second areas or to determine a volume of passenger for each of the second areas.

8. The method according to any one of claims 1 to 6, further comprising:

dividing each day into a plurality of time intervals according to set time granularity, and identifying the first image set acquired in any one of the time intervals to obtain a plurality of first face tracks and a plurality of first human body tracks in any one of the time intervals;

determining the passenger flow volume of the first area in any time interval based on a plurality of first face tracks or a plurality of first human body tracks in any time interval, wherein the passenger flow volume comprises the number of visitors and the number of visits;

and accumulating the passenger flow volume of the first area in a plurality of time intervals to obtain the total passenger flow volume of the first area every day.

9. The method according to any one of claims 1 to 6, further comprising:

identifying, based on the first set of images, a behavior of a visitor entering and exiting the first area and an attribute of the visitor;

when the visitor with the specific attribute is identified to enter the first area, an operator of the first area is informed, so that the operator sends a corresponding reminding message to a terminal associated with the visitor with the specific attribute.

10. A passenger flow information processing apparatus, characterized in that the apparatus comprises: