Disclosure of Invention
The embodiments of the present application provide a method, a system, an electronic device and a storage medium for cross-mirror (that is, cross-camera) pedestrian trajectory tracking, so as to at least solve the problem in the related art that cross-mirror pedestrian trajectory tracking implemented through pedestrian re-identification has low accuracy and poor real-time performance.
In a first aspect, an embodiment of the present application provides a method for cross-mirror pedestrian trajectory tracking, where the method includes:
an image acquisition device captures a portrait image and forms primary structured information from the portrait image, a timestamp and position information of the image acquisition device, wherein the portrait image comprises a pedestrian image and a face image;
a site server acquires the primary structured information of a plurality of image acquisition devices within a site range, acquires a portrait recognition confidence of a target pedestrian according to the primary structured information, and judges whether the portrait recognition confidence is greater than a face absolute confidence threshold; if so, the site server stores the primary structured information under a target ID corresponding to the target pedestrian; if not, the site server acquires a multi-modal fusion confidence of the target pedestrian and judges whether the multi-modal fusion confidence is greater than a multi-modal fusion confidence threshold, and if so, stores the primary structured information under the target ID; the site server obtains secondary structured information from the plurality of pieces of primary structured information under the target ID, wherein the secondary structured information indicates the trajectory of the target pedestrian within the site range;
and a cloud server acquires the secondary structured information of a plurality of site servers and integrates the plurality of pieces of secondary structured information according to the timestamps and the position information of the image acquisition devices to obtain a real-time pedestrian trajectory corresponding to the target ID.
In some embodiments, the obtaining a multi-modal fusion confidence level of the target pedestrian, and determining whether the multi-modal fusion confidence level is greater than a multi-modal fusion confidence threshold value includes:
the site server acquires the absolute path confidence, the pedestrian re-identification confidence and the pedestrian attribute similarity of the target pedestrian according to the primary structured information, and acquires the multi-modal fusion confidence of the target pedestrian according to the absolute path confidence, the pedestrian re-identification confidence, the pedestrian attribute similarity and the portrait recognition confidence;
and judging whether the multi-modal fusion confidence coefficient is greater than a multi-modal fusion confidence threshold value, and if so, storing the primary structured information into the target ID.
In some embodiments, the obtaining, by the site server, the absolute path confidence of the target pedestrian according to the primary structured information includes:
the site server obtains a topological structure diagram of the plurality of image acquisition devices within the site range, wherein the topological structure diagram takes the positions of the image acquisition devices as nodes and the paths between the image acquisition devices as edges;
obtaining a distance-time confidence according to a truncated normal distribution curve and the topological structure diagram, wherein the distance-time confidence follows the truncated normal distribution curve;
and obtaining a spatial position confidence according to the topological structure diagram, wherein the absolute path confidence comprises the distance-time confidence and the spatial position confidence.
In some embodiments, the obtaining, by the site server, the portrait recognition confidence level, the pedestrian re-recognition confidence level, and the pedestrian attribute similarity of the target pedestrian according to the primary structured information includes:
acquiring face features of the target pedestrian and a face feature library of a search library, and taking the cosine distance between the face features and the face feature library as the portrait recognition confidence;
acquiring human body features of the target pedestrian and a human body feature library of the search library, and taking the cosine distance between the human body features and the human body feature library as the pedestrian re-identification confidence;
and taking the product of the gender similarity, the ethnicity similarity and the posture similarity as the pedestrian attribute similarity.
In some embodiments, after the image acquisition device captures the portrait image, the method further comprises:
judging whether the illuminance, the sharpness, the target size, the completeness and the occluded area of the portrait image meet corresponding threshold conditions, and forming the primary structured information from the portrait image that meets the threshold conditions, the timestamp and the position information of the image acquisition device.
In some embodiments, before the image acquisition device captures the portrait image, the method further comprises:
constructing a search library;
after the image acquisition device captures the portrait image, the method further comprises:
judging whether the target pedestrian in the portrait image is successfully matched against the search library, and if not, storing the portrait image in the search library and creating a target ID for the target pedestrian.
In some embodiments, after obtaining the plurality of pieces of primary structured information of the target ID, the method further includes:
calculating the distance between the features of each piece of primary structured information of the target ID and the features of the corresponding target ID in the search library, and adding to the target ID in the search library the primary structured information whose similarity is lower than a preset value.
In a second aspect, the present application provides a cross-mirror pedestrian trajectory tracking system, which includes an image acquisition device, a site server and a cloud server,
the image acquisition device is configured to capture a portrait image and form primary structured information from the portrait image, a timestamp and position information of the image acquisition device, wherein the portrait image comprises a pedestrian image and a face image;
the site server is configured to acquire the primary structured information of a plurality of image acquisition devices within a site range, acquire a portrait recognition confidence of a target pedestrian according to the primary structured information, and judge whether the portrait recognition confidence is greater than a face absolute confidence threshold; if so, to store the primary structured information under a target ID of the target pedestrian; if not, to acquire a multi-modal fusion confidence of the target pedestrian and judge whether the multi-modal fusion confidence is greater than a multi-modal fusion confidence threshold, and if so, to store the primary structured information under the target ID; and to obtain secondary structured information from the plurality of pieces of primary structured information under the target ID, wherein the secondary structured information indicates the trajectory of the target pedestrian within the site range;
the cloud server is configured to acquire the secondary structured information of a plurality of site servers and integrate the plurality of pieces of secondary structured information according to the timestamps and the position information of the image acquisition devices to obtain a real-time pedestrian trajectory corresponding to the target ID.
In a third aspect, an embodiment of the present application provides an electronic device, including a memory, a processor, and a computer program stored on the memory and executable on the processor, where the processor executes the computer program to implement the method for tracking a pedestrian trajectory across mirrors according to the first aspect.
In a fourth aspect, embodiments of the present application provide a storage medium having a computer program stored thereon, where the computer program is executed by a processor to implement the method for cross-mirror pedestrian trajectory tracking as described in the first aspect above.
Compared with the related art, in the method for cross-mirror pedestrian trajectory tracking provided by the embodiments of the present application, an image acquisition device captures a portrait image and forms primary structured information from the portrait image, a timestamp and position information of the image acquisition device, wherein the portrait image comprises a pedestrian image and a face image. A site server acquires the primary structured information of a plurality of image acquisition devices within a site range, acquires a portrait recognition confidence of a target pedestrian according to the primary structured information, and judges whether the portrait recognition confidence is greater than a face absolute confidence threshold; if so, the site server stores the primary structured information under a target ID corresponding to the target pedestrian; if not, the site server acquires a multi-modal fusion confidence of the target pedestrian, judges whether the multi-modal fusion confidence is greater than a multi-modal fusion confidence threshold, and if so, stores the primary structured information under the target ID. The site server obtains secondary structured information from the plurality of pieces of primary structured information under the target ID, wherein the secondary structured information indicates the trajectory of the target pedestrian within the site range. A cloud server acquires the secondary structured information of a plurality of site servers and integrates the plurality of pieces of secondary structured information according to the timestamps and the position information of the image acquisition devices to obtain a real-time pedestrian trajectory corresponding to the target ID. This solves the problem of low accuracy and poor real-time performance of pedestrian trajectory tracking, and improves both the accuracy and the real-time performance of pedestrian trajectory tracking.
Detailed Description
In order to make the objects, technical solutions and advantages of the present application more apparent, the present application will be described and illustrated below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the present application and are not intended to limit the present application. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments provided in the present application without any inventive step are within the scope of protection of the present application. Moreover, it should be appreciated that in the development of any such actual implementation, as in any engineering or design project, numerous implementation-specific decisions must be made to achieve the developers' specific goals, such as compliance with system-related and business-related constraints, which may vary from one implementation to another.
Reference in the specification to "an embodiment" means that a particular feature, structure, or characteristic described in connection with the embodiment can be included in at least one embodiment of the specification. The appearances of the phrase in various places in the specification are not necessarily all referring to the same embodiment, nor are separate or alternative embodiments mutually exclusive of other embodiments. Those of ordinary skill in the art will explicitly and implicitly appreciate that the embodiments described herein may be combined with other embodiments without conflict.
Unless defined otherwise, technical or scientific terms used herein have the ordinary meaning understood by those of ordinary skill in the art to which this application belongs. The words "a," "an," "the," and similar words in this application do not denote a limitation of quantity and may refer to the singular or the plural. The terms "including," "comprising," "having," and any variations thereof, as used in this application, are intended to cover non-exclusive inclusion; for example, a process, method, system, article, or apparatus that comprises a list of steps or modules (elements) is not limited to the listed steps or elements, but may include other steps or elements not expressly listed or inherent to such process, method, article, or apparatus. The words "connected," "coupled," and the like in this application are not limited to physical or mechanical connections, but may include electrical connections, whether direct or indirect. "A plurality" herein means two or more. "And/or" describes an association relationship of associated objects and indicates that three relationships may exist; for example, "A and/or B" may mean: A exists alone, A and B exist simultaneously, or B exists alone. The terms "first," "second," "third," and the like herein merely distinguish similar objects and do not denote a particular ordering of the objects.
This embodiment provides a method for cross-mirror pedestrian trajectory tracking. Fig. 1 is a flowchart of the method for cross-mirror pedestrian trajectory tracking according to an embodiment of the present application. As shown in Fig. 1, the method includes the following steps:
Step S101: the image acquisition device captures a portrait image and forms primary structured information from the portrait image, a timestamp and position information of the image acquisition device, wherein the portrait image comprises a pedestrian image and a face image. In this embodiment, the image acquisition device is a smart camera on which pedestrian detection and tracking algorithms are deployed in advance. Pedestrian detection uses computer vision technology to judge whether pedestrians are present in an image or video sequence and to locate them accurately, and pedestrian tracking is realized by associating the tracks of the pedestrians. Pedestrian detection may be implemented with a YOLO algorithm, and pedestrian tracking with the DeepSORT tracking algorithm. The pedestrian image and the face image are captured through detection and tracking, the capture time and the position information of the capturing smart camera are recorded, and this information is packaged into primary structured information and uploaded to the site server to which the smart camera belongs;
Step S102: the site server acquires the primary structured information of a plurality of image acquisition devices within the site range, acquires the portrait recognition confidence of the target pedestrian according to the primary structured information, and judges whether the portrait recognition confidence is greater than the face absolute confidence threshold; if so, the site server stores the primary structured information under the target ID corresponding to the target pedestrian; if not, the site server acquires the multi-modal fusion confidence of the target pedestrian, judges whether the multi-modal fusion confidence is greater than the multi-modal fusion confidence threshold, and if so, stores the primary structured information under the target ID; the site server obtains secondary structured information from the plurality of pieces of primary structured information under the target ID, wherein the secondary structured information indicates the trajectory of the target pedestrian within the site range. In this embodiment, a plurality of smart cameras are deployed under each site server; each smart camera acquires primary structured information and uploads it to the site server, which processes it. For each piece of primary structured information, the site server acquires the portrait recognition confidence of the target pedestrian and selects the pieces whose portrait recognition confidence is greater than the face absolute confidence threshold. For the pieces whose portrait recognition confidence is lower than the face absolute confidence threshold, the site server acquires the multi-modal fusion confidence of the target pedestrian and selects the pieces whose multi-modal fusion confidence is greater than the multi-modal fusion confidence threshold. The selected pieces are stored under the target ID. Each site server performs this processing, so each site server finally obtains a plurality of pieces of primary structured information under the target ID, and these pieces together are called secondary structured information;
Step S103: the cloud server acquires the secondary structured information of the plurality of site servers and integrates the plurality of pieces of secondary structured information according to the timestamps and the position information of the image acquisition devices to obtain the real-time pedestrian trajectory corresponding to the target ID. Illustratively, a first site server has a first smart camera, and a second site server has a second smart camera and a third smart camera. The first smart camera captures three portrait images (portrait images one, two and three), the second smart camera captures two portrait images (portrait images four and five), and the third smart camera captures two portrait images (portrait images six and seven). The first smart camera uploads the primary structured information corresponding to its three portrait images to the first site server, and the second and third smart cameras upload the primary structured information corresponding to their four portrait images to the second site server. The first site server evaluates its three portrait images: the portrait recognition confidence of the target pedestrian in portrait image three is greater than the face absolute confidence threshold, and the multi-modal fusion confidence of the target pedestrian in portrait image one is greater than the multi-modal fusion confidence threshold, so the third primary structured information (corresponding to portrait image three) and the first primary structured information (corresponding to portrait image one) are stored under the target ID. The second site server evaluates its four portrait images: the portrait recognition confidence of the target pedestrian in portrait images four and seven is greater than the face absolute confidence threshold, so the fourth and seventh primary structured information are stored under the target ID. At this point, the first secondary structured information of the first site server consists of the first and third primary structured information under the target ID, and the second secondary structured information of the second site server consists of the fourth and seventh primary structured information under the target ID. The cloud server obtains the first secondary structured information from the first site server and the second secondary structured information from the second site server, and integrates the primary structured information they contain according to the timestamps and the position information of the image acquisition devices to obtain the real-time trajectory of the pedestrian corresponding to the target ID. For example, if the third, first, fourth and seventh primary structured information were recorded at nine in the morning, ten in the morning, twelve noon and two in the afternoon respectively, ordering them by time gives the real-time trajectory of the pedestrian corresponding to the target ID: place A → place D → place B → place C.
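As a non-limiting illustration, steps S101 to S103 and the worked example above can be sketched in Python as follows. The record fields, the two threshold values and the confidence scores are hypothetical and chosen only so that the example reproduces the outcome described above; the assignment of places A, D, B and C to the four accepted records is inferred from the stated trajectory order and is likewise an assumption.

```python
from dataclasses import dataclass
from typing import List

# Hypothetical threshold values; the source does not state concrete numbers.
FACE_ABS_THRESHOLD = 0.95
FUSION_THRESHOLD = 0.80


@dataclass
class PrimaryRecord:
    """One piece of primary structured information (fields are illustrative)."""
    image_id: str
    hour: float         # capture time, hours since midnight
    place: str          # position of the capturing camera
    face_conf: float    # portrait recognition confidence against the target ID
    fusion_conf: float  # multi-modal fusion confidence against the target ID


def accept(rec: PrimaryRecord) -> bool:
    """Two-stage decision of step S102: face first, fusion as fallback."""
    if rec.face_conf > FACE_ABS_THRESHOLD:
        return True
    return rec.fusion_conf > FUSION_THRESHOLD


def site_secondary(records: List[PrimaryRecord]) -> List[PrimaryRecord]:
    """Secondary structured information: accepted records of one site."""
    return [r for r in records if accept(r)]


def cloud_trajectory(secondaries: List[List[PrimaryRecord]]) -> List[str]:
    """Step S103: merge all sites and order the places by timestamp."""
    merged = [r for sec in secondaries for r in sec]
    merged.sort(key=lambda r: r.hour)
    return [r.place for r in merged]


# Illustrative data mirroring the worked example (scores are made up).
site_one = [
    PrimaryRecord("one", 10.0, "D", face_conf=0.60, fusion_conf=0.90),
    PrimaryRecord("two", 9.5, "X", face_conf=0.30, fusion_conf=0.20),
    PrimaryRecord("three", 9.0, "A", face_conf=0.99, fusion_conf=0.00),
]
site_two = [
    PrimaryRecord("four", 12.0, "B", face_conf=0.98, fusion_conf=0.00),
    PrimaryRecord("five", 12.5, "Y", face_conf=0.40, fusion_conf=0.30),
    PrimaryRecord("six", 13.0, "Z", face_conf=0.20, fusion_conf=0.10),
    PrimaryRecord("seven", 14.0, "C", face_conf=0.97, fusion_conf=0.00),
]

trajectory = cloud_trajectory([site_secondary(site_one), site_secondary(site_two)])
print(" -> ".join(trajectory))
```

In this sketch the fusion confidence is precomputed for every record for brevity; in the flow described above it would only need to be computed for records that fail the face threshold.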
Fig. 2 is a structural block diagram of the cross-mirror pedestrian trajectory tracking method according to an embodiment of the present application. As shown in Fig. 2, the "edge" denotes the edge devices, namely the plurality of smart cameras, the "end" denotes the plurality of site servers, and the "cloud" denotes the cloud server. In terms of computing-power layout, an "edge-end-cloud" combination is adopted to reduce the bandwidth pressure and the computing-power demand on the cloud server. In the related art, the deployed cameras are ordinary surveillance cameras without computing capability; they must transmit video streams, which requires high bandwidth, and all algorithm computation is performed on the cloud server, which results in slow computation and poor real-time performance. In the scheme of the present application, a plurality of site servers are arranged under the cloud server and a plurality of smart cameras under each site server. A smart camera only needs to upload primary structured information to the site server to which it belongs, each site server only needs to process the primary structured information of the smart cameras within its site range, and the cloud server only needs to integrate the secondary structured information of the target ID, so real-time performance can be improved even when the bandwidth and computing power available to the smart cameras and the cloud server are modest.
The site server judges whether a portrait image contains the target pedestrian according to whether the portrait recognition confidence is greater than the face absolute confidence threshold. In the related art, the target pedestrian is searched for using pedestrian re-identification alone; under heavy pedestrian flow, occlusion and complex environments, pedestrian features are lost or contaminated, the accuracy of pedestrian trajectory tracking drops sharply, and the requirements of practical deployment may not be met. In the present scheme, because the volume of training samples in the portrait recognition field keeps growing and the face is more representative of an individual, portrait recognition is more accurate than algorithms such as pedestrian re-identification, and when the recognition threshold is high the false recognition rate of portrait recognition is almost zero. The face absolute confidence threshold is determined by testing on a large number of test sets; when the recognition score is greater than the face absolute confidence threshold, the false recognition rate is less than 1e-9. Judging whether the portrait image contains the target pedestrian according to whether the portrait recognition confidence is greater than the face absolute confidence threshold therefore improves the accuracy of pedestrian trajectory tracking.
In some embodiments, Fig. 3 is a flowchart of another cross-mirror pedestrian trajectory tracking method according to an embodiment of the present application. As shown in Fig. 3, when it is judged that the portrait recognition confidence is not greater than the face absolute confidence threshold, the method includes the following steps:
Step S301: the site server acquires the absolute path confidence, the pedestrian re-identification confidence and the pedestrian attribute similarity of the target pedestrian according to the primary structured information, and acquires the multi-modal fusion confidence of the target pedestrian according to the absolute path confidence, the pedestrian re-identification confidence, the pedestrian attribute similarity and the portrait recognition confidence. In this embodiment, the multi-modal fusion confidence $S_{total}$ is obtained by Equation 1, wherein $S_{path}$ is the absolute path confidence, $S_{face\_recognition}$ is the portrait recognition confidence, $S_{person\_ReID}$ is the pedestrian re-identification confidence, and $S_{person\_attribute}$ is the pedestrian attribute similarity.
$\Phi_i$ is a confidence mapping function. Its purpose is to make the different measurement dimensions comparable, so that the same confidence score represents, as far as possible, the same degree of belief that two pedestrians are the same person. So that the multi-modal fusion confidence $S_{total}$ exhibits a barrel (shortest-stave) effect, that is, so that a very small confidence in any one dimension makes the overall pedestrian similarity confidence low, the confidences of all dimensions are fused by multiplication. $a_i$ is the weight of each confidence. The expression of $\Phi_i$ is given by Equation 2; that is, for the confidences of the different dimensions, $w_i$ is scaled to 0.9, and $w_i$ is fitted to a specific value in the actual service scenario.
Step S302, judging whether the multi-modal fusion confidence coefficient is larger than a multi-modal fusion confidence threshold value, if so, storing the primary structured information into the target ID.
Through steps S301 to S302, when the portrait recognition confidence is lower than the face absolute confidence threshold, portrait recognition may be wrong, and the method of this embodiment can then be used to judge whether the portrait image contains the target pedestrian, so as to prevent misjudgment. For example, when two people are twins, one wearing red clothes and the other wearing blue clothes, face recognition alone may confuse them. Judging whether the portrait image contains the target pedestrian according to whether the multi-modal fusion confidence is greater than the multi-modal fusion confidence threshold corrects such face recognition errors and improves the generalization performance and fault tolerance of the cross-mirror pedestrian trajectory tracking method.
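As a non-limiting illustration of the multiplicative fusion described in step S301, the following Python sketch shows one possible form. Equations 1 and 2 are not reproduced in this text, so both the weighted-product form used in `fuse` and the linear mapping used in `map_confidence` are assumptions consistent only with the verbal description (fusion by multiplication, weights $a_i$, and a mapping that sends $w_i$ to 0.9); they are not the expressions of the embodiment.

```python
import math
from typing import Dict

DIMENSIONS = ("path", "face_recognition", "person_ReID", "person_attribute")


def map_confidence(score: float, w: float) -> float:
    """Assumed mapping: linear, sends score == w to 0.9, clipped to [0, 1]."""
    if w <= 0:
        raise ValueError("w must be positive")
    return max(0.0, min(1.0, 0.9 * score / w))


def fuse(scores: Dict[str, float], w: Dict[str, float], a: Dict[str, float]) -> float:
    """Assumed fusion: product of mapped confidences raised to their weights.

    Because the terms are multiplied, a near-zero confidence in any single
    dimension pulls the fused confidence toward zero (the barrel effect).
    """
    total = 1.0
    for name in DIMENSIONS:
        total *= map_confidence(scores[name], w[name]) ** a[name]
    return total


# Illustrative values only.
w = {"path": 0.4, "face_recognition": 0.8, "person_ReID": 0.7, "person_attribute": 0.6}
a = {name: 1.0 for name in DIMENSIONS}

at_reference = fuse(dict(w), w, a)  # every score equals its w_i
one_dimension_fails = fuse({**w, "person_ReID": 0.0}, w, a)
print(round(at_reference, 4), one_dimension_fails)
```

With every score at its reference value $w_i$ and unit weights, each mapped term is 0.9 and the product is $0.9^4$; setting any one score to zero drives the fused value to zero.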
In some embodiments, the obtaining, by the site server, of the absolute path confidence of the target pedestrian according to the primary structured information includes: the site server obtains a topological structure diagram of the plurality of image acquisition devices within the site range, wherein the topological structure diagram takes the positions of the image acquisition devices as nodes and the paths between the image acquisition devices as edges; obtaining a distance-time confidence according to a truncated normal distribution curve and the topological structure diagram, wherein the distance-time confidence follows the truncated normal distribution curve; and obtaining a spatial position confidence according to the topological structure diagram, wherein the absolute path confidence comprises the distance-time confidence and the spatial position confidence.
In this embodiment, Fig. 4 is a schematic diagram of the topological structure diagram under a single site server according to an embodiment of the present application. As shown in Fig. 4, an undirected graph of the smart camera point locations can be constructed from the positions of the smart cameras on the floor plan and the walking passages. The nodes of the undirected graph are the smart cameras, and each edge is the distance between adjacent nodes. The topological structure diagram reflects whether two nodes are connected, the number of nodes on the shortest path between them, and the length of that shortest path, from which the absolute path confidence of the target pedestrian can be obtained. The absolute path confidence $S_{path}$ reflects the confidence in the physical position when a portrait image is matched against the search library. It comprises the distance-time confidence $S_{time}$ and the spatial position confidence $S_{location}$, where $S_{path} = S_{time} \cdot S_{location}$;
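As a non-limiting illustration, the topological structure diagram can be held as a weighted undirected graph and queried with Dijkstra's algorithm for the shortest path length and the nodes on that path. The edge list below is hypothetical; it is chosen only so that the shortest distances from a to d and from a to g agree with the 500 m and 200 m mentioned for Fig. 6, and it does not reproduce the actual figure.

```python
import heapq
from typing import Dict, List, Optional, Tuple

Graph = Dict[str, Dict[str, float]]


def build_graph(edges: List[Tuple[str, str, float]]) -> Graph:
    """Build an undirected graph: cameras are nodes, path lengths are weights."""
    graph: Graph = {}
    for u, v, dist in edges:
        graph.setdefault(u, {})[v] = dist
        graph.setdefault(v, {})[u] = dist
    return graph


def shortest_path(graph: Graph, src: str, dst: str) -> Optional[Tuple[float, List[str]]]:
    """Dijkstra's algorithm; returns (length, node list) or None if unreachable."""
    best = {src: 0.0}
    prev: Dict[str, str] = {}
    heap = [(0.0, src)]
    while heap:
        d, u = heapq.heappop(heap)
        if u == dst:
            break
        if d > best.get(u, float("inf")):
            continue  # stale heap entry
        for v, weight in graph.get(u, {}).items():
            nd = d + weight
            if nd < best.get(v, float("inf")):
                best[v] = nd
                prev[v] = u
                heapq.heappush(heap, (nd, v))
    if dst not in best:
        return None
    path = [dst]
    while path[-1] != src:
        path.append(prev[path[-1]])
    path.reverse()
    return best[dst], path


# Hypothetical edge list (metres); not the actual layout of Fig. 6.
graph = build_graph([
    ("a", "f", 100), ("a", "c", 200), ("c", "d", 300), ("a", "g", 200),
    ("f", "g", 300), ("d", "e", 100), ("g", "h", 200), ("b", "c", 100),
])

print(shortest_path(graph, "a", "d"))
print(shortest_path(graph, "a", "g"))
```

The number of intermediate cameras on the returned node list can feed the spatial position confidence, and the path length can feed the distance-time confidence.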
Illustratively, the distance-time confidence represents the probability density of a person moving a distance x within a given time interval t, and the distance-time confidence follows a truncated normal distribution. Fig. 5 is a schematic diagram of a truncated normal distribution curve according to an embodiment of the present application; as shown in Fig. 5, a and b denote the truncation interval. Fig. 6 is a schematic diagram of part of a topological structure diagram according to an embodiment of the present application; as shown in Fig. 6, a, b, c, d, e, f, g and h denote smart cameras, and 100, 200, etc. denote the path distances between two smart cameras. The shortest distance from point a to point d is 500 m, and the shortest distance from point a to point g is 200 m. If the distance-time confidence for a 1-minute time interval follows the distribution of Fig. 5, the distance-time confidence from point a to point g is 0.38 and the distance-time confidence from point a to point d is 0.23.
When the spatial position confidence is calculated, as shown in Fig. 6, assume that the capture rate of a smart camera is p. When the target pedestrian appears at point a, the maximum probability that the previous capture was at point f is p, and the maximum probability that the previous capture was at point d is p(1-p), because in the latter case the target pedestrian must be captured at point a without having been captured at point c. The spatial position confidence from point f to point a is therefore p, and the spatial position confidence from point d to point a is p(1-p). In the related art, by contrast, the pedestrian trajectory is constructed from camera coordinates, which does not match how pedestrian trajectories are described in the real world; the related art ignores the spatial position relationship of the cameras and the topological relationship among pedestrian paths, and scores well only in specific test environments.
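As a non-limiting illustration, the two components of the absolute path confidence can be sketched in Python as follows. The density below is the standard truncated normal density on the truncation interval; its parameters (mean, standard deviation and interval) are hypothetical, so the sketch does not reproduce the values 0.38 and 0.23 read from Fig. 5. The generalization of the spatial position confidence to an arbitrary number of missed cameras is an assumption extrapolated from the two cases p and p(1-p) given above.

```python
import math


def _std_normal_pdf(z: float) -> float:
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def _std_normal_cdf(z: float) -> float:
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def truncated_normal_pdf(x: float, mu: float, sigma: float, lo: float, hi: float) -> float:
    """Standard truncated normal density on [lo, hi]; zero outside it."""
    if not lo <= x <= hi:
        return 0.0
    mass = _std_normal_cdf((hi - mu) / sigma) - _std_normal_cdf((lo - mu) / sigma)
    return _std_normal_pdf((x - mu) / sigma) / (sigma * mass)


def spatial_confidence(p: float, missed_cameras: int) -> float:
    """Assumed generalization: captured now (p), missed at each camera between."""
    return p * (1.0 - p) ** missed_cameras


def path_confidence(s_time: float, s_location: float) -> float:
    """S_path = S_time * S_location, as stated in the description."""
    return s_time * s_location


# Illustrative parameters only: distance walked in one minute, in metres.
mu, sigma, lo, hi = 80.0, 60.0, 0.0, 300.0
p = 0.8
s_path_f_to_a = path_confidence(truncated_normal_pdf(100.0, mu, sigma, lo, hi),
                                spatial_confidence(p, 0))
print(s_path_f_to_a)
```

Note that a density value is not itself bounded by 1; an implementation may rescale it before fusing it with the other confidences, which the description leaves open.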
In some embodiments, the obtaining, by the site server, the image recognition confidence level, the pedestrian re-recognition confidence level, and the pedestrian attribute similarity of the target pedestrian according to the primary structured information includes:
acquiring the face features of a target pedestrian and a face feature library of a retrieval library, and taking the cosine distances of the face features and the face feature library as the confidence coefficient of face recognition; in this embodiment, the face feature library of the search library is a set of face features of all pedestrians on all pictures in the search library, and the face features of the target pedestrian and the face features of all pedestrians on all pictures in the search library are obtained through a face recognition model, and an exemplary cosine distance calculation formula is shown in formula 3, where Ai represents a face feature vector of the target pedestrian, and Bi represents a face feature vector of the search library:
acquiring the human body features of the target pedestrian and the human body feature library of the search library, and taking the cosine distance between the human body features and the human body feature library as the pedestrian re-identification confidence. Exemplarily, the pedestrian re-identification confidence is calculated in the same way as the face recognition confidence, using the cosine distance of formula 3, where Ai represents the human body feature vector of the target pedestrian and Bi represents a human body feature vector of the search library;
and taking the product of the gender similarity, the ethnicity similarity and the posture similarity as the pedestrian attribute similarity. In this embodiment, pedestrian attributes are also important features of a pedestrian and should contribute to the confidence judgment in pedestrian tracking. In the scheme of the present application, the gender, ethnicity and posture of the pedestrian are identified, and the classification results are converted into the pedestrian attribute similarity. Taking gender as an example, the gender similarity $S_{gender}$ is calculated as shown in formula 4 and represents the probability that pedestrians a and b have the same gender:
$$S_{gender} = P_a^{i} P_b^{i} + P_a^{j} P_b^{j} \quad (4)$$
where $P_a^{i}$ represents the confidence that the gender of pedestrian a is i, $P_b^{i}$ represents the confidence that the gender of pedestrian b is i, $P_a^{j}$ represents the confidence that the gender of pedestrian a is j, and $P_b^{j}$ represents the confidence that the gender of pedestrian b is j. The ethnicity similarity $S_{race}$ and the posture similarity $S_{body}$ are also calculated using formula 4. The pedestrian attribute similarity $S_{person\_attribute}$ is calculated as shown in formula 5; because the same person is defined as having the same gender, ethnicity and posture, multiplication is used:
$$S_{person\_attribute} = S_{gender} \cdot S_{race} \cdot S_{body} \quad (5)$$
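As a rough illustration of the three quantities above, the following Python sketch computes a cosine-based confidence between two feature vectors, a same-class probability for one attribute, and the product of the three attribute similarities. It is written under assumptions: the same-class probability is taken as the sum, over all classes, of the product of the two pedestrians' class confidences, which is one reading of formula 4 that also covers attributes with more than two classes; the function names and the dictionary representation of classifier outputs are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # assumption: a zero vector matches nothing
    return dot / (norm_a * norm_b)


def same_class_probability(conf_a, conf_b):
    """Probability that two pedestrians fall in the same class.

    conf_a and conf_b map class labels to classifier confidences.
    Assumption: the classes are mutually exclusive and the two
    classifications are independent.
    """
    return sum(conf_a[c] * conf_b.get(c, 0.0) for c in conf_a)


def attribute_similarity(s_gender, s_race, s_body):
    """Product of the gender, ethnicity and posture similarities."""
    return s_gender * s_race * s_body


if __name__ == "__main__":
    # Illustrative confidences only; not values from the application.
    gender_a = {"male": 0.9, "female": 0.1}
    gender_b = {"male": 0.8, "female": 0.2}
    s_gender = same_class_probability(gender_a, gender_b)
    print(s_gender)
    print(attribute_similarity(s_gender, 0.5, 1.0))
    print(cosine_similarity([1.0, 0.0], [1.0, 0.0]))
```

Because the attribute similarity is a product, a low similarity on any one attribute pulls the overall value down, which matches the stated reasoning that the same person must agree on gender, ethnicity and posture.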
In some embodiments, after the image acquisition device captures the portrait image, it is judged whether the illuminance, sharpness, target size, completeness and occluded area of the portrait image satisfy their corresponding threshold conditions, and the portrait image that satisfies the threshold conditions forms the primary structured information together with the timestamp and the position information of the image acquisition device. In this embodiment, the quality of the portrait image can seriously affect the accuracy of pedestrian trajectory tracking, so the portrait images need to be screened: only portrait images whose illuminance, sharpness, target size, completeness and occluded area all satisfy the threshold conditions are used to form the primary structured information.
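The screening step can be sketched as follows. This is a minimal illustration, assuming that quality scores have already been measured for each portrait image; the field names, the direction of each comparison (for example, that illuminance must lie within a range and that the occluded area must stay below a maximum) and the threshold values are assumptions, since the text only states that each quantity must satisfy its threshold condition.

```python
from dataclasses import dataclass


@dataclass
class PortraitQuality:
    illuminance: float     # assumed to be in lux
    sharpness: float       # assumed score in [0, 1]
    target_size: int       # assumed height of the person in pixels
    completeness: float    # assumed visible fraction of the body, [0, 1]
    occluded_ratio: float  # assumed occluded fraction of the target, [0, 1]


@dataclass
class QualityThresholds:
    min_illuminance: float
    max_illuminance: float
    min_sharpness: float
    min_target_size: int
    min_completeness: float
    max_occluded_ratio: float


def passes_quality(q, t):
    """True only if every quality measure satisfies its threshold."""
    return (
        t.min_illuminance <= q.illuminance <= t.max_illuminance
        and q.sharpness >= t.min_sharpness
        and q.target_size >= t.min_target_size
        and q.completeness >= t.min_completeness
        and q.occluded_ratio <= t.max_occluded_ratio
    )


def build_primary_record(image_id, quality, timestamp, camera_position, thresholds):
    """Return a primary structured record, or None if the image is rejected."""
    if not passes_quality(quality, thresholds):
        return None
    return {
        "image_id": image_id,
        "timestamp": timestamp,
        "camera_position": camera_position,
    }


# Hypothetical threshold values, chosen only for the example.
THRESHOLDS = QualityThresholds(50.0, 10000.0, 0.5, 64, 0.8, 0.3)

if __name__ == "__main__":
    good = PortraitQuality(300.0, 0.9, 128, 0.95, 0.05)
    dark = PortraitQuality(5.0, 0.9, 128, 0.95, 0.05)
    print(build_primary_record("img-1", good, 1700000000.0, (30.0, 120.0), THRESHOLDS))
    print(build_primary_record("img-2", dark, 1700000001.0, (30.0, 120.0), THRESHOLDS))
```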
In some embodiments, a search library is constructed before the image acquisition device captures the portrait image; after the image acquisition device captures the portrait image, it is judged whether the target pedestrian in the portrait image is successfully matched against the search library, and if the matching is unsuccessful, the portrait image is stored in the search library and a target ID is established for the target pedestrian. In this embodiment, the search library is initially empty. When a captured portrait image fails to match the search library, it is added to the search library. When the target pedestrian in a portrait image is successfully matched against the search library, the target pedestrian in the portrait image and the matched person in the search library are the same person. When portrait images from different smart cameras match the same picture in the search library, the same target pedestrian appears in the portrait images from those different smart cameras.
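The match-or-enroll behaviour of the search library can be sketched as below. This is a simplified illustration: matching is reduced to a single cosine-similarity threshold on one feature vector, whereas the application decides a match through the face recognition confidence and the multi-modal fusion confidence. The class name `SearchLibrary`, its methods and the threshold value are hypothetical.

```python
import math
from itertools import count


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


class SearchLibrary:
    """Starts empty; unmatched pedestrians are enrolled under a new target ID."""

    def __init__(self, match_threshold):
        self.match_threshold = match_threshold  # hypothetical single threshold
        self.features = {}                      # target ID -> list of feature vectors
        self._ids = count(1)

    def match(self, feature):
        """Return the best-matching target ID, or None if nothing matches."""
        best_id, best_sim = None, -1.0
        for target_id, stored in self.features.items():
            for ref in stored:
                sim = cosine_similarity(feature, ref)
                if sim > best_sim:
                    best_id, best_sim = target_id, sim
        if best_id is not None and best_sim > self.match_threshold:
            return best_id
        return None

    def match_or_enroll(self, feature):
        """Return (target_id, matched); enroll the feature when unmatched."""
        target_id = self.match(feature)
        if target_id is not None:
            return target_id, True
        target_id = next(self._ids)
        self.features[target_id] = [list(feature)]
        return target_id, False


if __name__ == "__main__":
    library = SearchLibrary(match_threshold=0.8)
    print(library.match_or_enroll([1.0, 0.0]))    # library is empty: enrolled
    print(library.match_or_enroll([0.99, 0.05]))  # similar view: same target ID
    print(library.match_or_enroll([0.0, 1.0]))    # different person: new target ID
```

Two captures from different cameras that return the same target ID correspond to the case in which portrait images from different smart cameras match the same entry in the search library.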
In some embodiments, after the plurality of pieces of primary structured information of the target ID are obtained, the distances between the features of each piece of primary structured information of the target ID and the features of the corresponding target ID in the search library are calculated, and the primary structured information whose similarity is lower than a preset value is added to the target ID in the search library. In this embodiment, the features of the target ID in the search library strongly influence face recognition and pedestrian re-identification: in an open scene, pedestrian postures and capture angles vary, and if, for example, the search library holds a frontal photograph of a pedestrian while the captured image is a side view, the recognition accuracy drops considerably. Therefore, in the present application, each pedestrian ID maintains N features when the search library is constructed. Illustratively, the target ID has three pieces of primary structured information corresponding to feature 1, feature 2 and feature 3, and has only one piece of structured information in the search library, corresponding to feature 4. When the features of the search library are updated, the distances between feature 4 and each of feature 1, feature 2 and feature 3 are calculated; if the similarity between feature 2 and feature 4 is found to be lower than the preset value, the primary structured information corresponding to feature 2 is merged into the target ID in the search library. The greater the distance, the lower the similarity, which indicates that the two features are features of the same target pedestrian captured at different angles. Selecting the feature with the lowest similarity supplements the existing features along a different dimension and can therefore improve the recognition accuracy. The features include face features and human body features.
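The feature update described above can be sketched as follows. The sketch makes several assumptions that the text does not spell out: similarity is measured by cosine similarity; when the target ID already stores several features, a candidate is compared against its most similar stored feature; candidates are added in order of increasing similarity until N features are stored. The function and parameter names are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def update_target_features(stored, candidates, preset, max_features):
    """Return the stored features of one target ID after the update.

    stored       -- non-empty list of features already in the search library
    candidates   -- features from the new primary structured information
    preset       -- only candidates with similarity below this value are added
    max_features -- N, the number of features kept per target ID
    """
    scored = []
    for feat in candidates:
        # Assumption: compare against the most similar stored feature.
        sim = max(cosine_similarity(feat, ref) for ref in stored)
        if sim < preset:
            scored.append((sim, feat))
    scored.sort(key=lambda item: item[0])  # least similar first
    room = max(0, max_features - len(stored))
    return stored + [feat for _, feat in scored[:room]]


if __name__ == "__main__":
    feature_4 = [1.0, 0.0]                             # already in the library
    new_feats = [[0.99, 0.1], [0.0, 1.0], [0.7, 0.7]]  # features 1, 2 and 3
    print(update_target_features([feature_4], new_feats, preset=0.8, max_features=3))
```

Keeping the least similar views first favours coverage of different capture angles when N is small.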
It should be noted that the steps illustrated in the above flows or in the flowcharts of the figures may be performed in a computer system, for example as a set of computer-executable instructions, and that, although a logical order is illustrated in the flowcharts, in some cases the steps illustrated or described may be performed in an order different from the one shown here.
The present embodiment further provides a system for cross-mirror pedestrian trajectory tracking, which is used to implement the foregoing embodiments and preferred embodiments; what has already been described is not repeated here. As used hereinafter, the terms "module," "unit," "subunit," and the like may refer to a combination of software and/or hardware that implements a predetermined function. Although the means described in the embodiments below are preferably implemented in software, an implementation in hardware, or in a combination of software and hardware, is also possible and contemplated.
Fig. 7 is a block diagram of a cross-mirror pedestrian trajectory tracking system according to an embodiment of the present application. As shown in fig. 7, the system includes an image acquisition device, a site server and a cloud server. The image acquisition device is configured to capture a human figure and form primary structured information according to the human figure, a timestamp and position information of the image acquisition device, where the human figure includes a pedestrian image and a face image. The site server is configured to acquire the primary structured information of a plurality of image acquisition devices within a site range, acquire the face recognition confidence of a target pedestrian according to the primary structured information, and judge whether the face recognition confidence is greater than a face absolute confidence threshold; if so, the primary structured information is stored into the target ID of the target pedestrian; if not, the multi-modal fusion confidence of the target pedestrian is acquired and it is judged whether the multi-modal fusion confidence is greater than a multi-modal fusion confidence threshold, and if so, the primary structured information is stored into the target ID. Secondary structured information is obtained from the plurality of pieces of primary structured information of the target ID, where the secondary structured information indicates the trajectory of the target pedestrian within the site range. The cloud server is configured to acquire the secondary structured information of a plurality of site servers and integrate the plurality of pieces of secondary structured information according to the timestamps and the position information of the image acquisition devices to obtain the real-time pedestrian trajectory corresponding to the target ID. The system thereby addresses the low accuracy and poor real-time performance of pedestrian trajectory tracking and improves both.
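The two-stage decision on the site server and the integration on the cloud server can be sketched as below for a single target ID. This is a simplified illustration under assumptions: both confidences are supplied as precomputed fields, although the system only needs the multi-modal fusion confidence when the face check fails; a trajectory is represented as a list of records sorted by timestamp; the record fields, function names and threshold values are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class PrimaryRecord:
    camera_id: str
    position: tuple           # position of the image acquisition device
    timestamp: float
    face_confidence: float    # face recognition confidence for the target ID
    fusion_confidence: float  # multi-modal fusion confidence for the target ID


def accept_for_target(rec, face_threshold, fusion_threshold):
    """Two-stage check: face confidence first, fusion confidence as fallback."""
    if rec.face_confidence > face_threshold:
        return True
    return rec.fusion_confidence > fusion_threshold


def site_trajectory(records, face_threshold, fusion_threshold):
    """Secondary structured information: accepted records in time order."""
    kept = [r for r in records
            if accept_for_target(r, face_threshold, fusion_threshold)]
    return sorted(kept, key=lambda r: r.timestamp)


def cloud_trajectory(site_trajectories):
    """Merge the trajectories of several sites into one time-ordered track."""
    merged = [r for traj in site_trajectories for r in traj]
    merged.sort(key=lambda r: r.timestamp)
    return [(r.timestamp, r.camera_id, r.position) for r in merged]


if __name__ == "__main__":
    site_1 = [
        PrimaryRecord("a", (0.0, 0.0), 3.0, 0.95, 0.10),  # accepted by face
        PrimaryRecord("f", (0.0, 1.0), 1.0, 0.40, 0.90),  # accepted by fusion
        PrimaryRecord("c", (1.0, 0.0), 2.0, 0.40, 0.30),  # rejected
    ]
    site_2 = [PrimaryRecord("x", (9.0, 9.0), 2.5, 0.97, 0.00)]
    # Hypothetical thresholds, chosen only for the example.
    tracks = [site_trajectory(s, 0.9, 0.8) for s in (site_1, site_2)]
    for point in cloud_trajectory(tracks):
        print(point)
```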
The above modules may be functional modules or program modules, and may be implemented by software or hardware. For a module implemented by hardware, the modules may be located in the same processor; or the modules can be respectively positioned in different processors in any combination.
The present embodiment also provides an electronic device comprising a memory having a computer program stored therein and a processor configured to execute the computer program to perform the steps of any of the above method embodiments.
Optionally, the electronic apparatus may further include a transmission device and an input/output device, wherein the transmission device is connected to the processor, and the input/output device is connected to the processor.
It should be noted that, for specific examples in this embodiment, reference may be made to examples described in the foregoing embodiments and optional implementations, and details of this embodiment are not described herein again.
In addition, in combination with the method for cross-mirror pedestrian trajectory tracking in the foregoing embodiments, the embodiments of the present application may provide a storage medium for implementation. The storage medium has a computer program stored thereon; the computer program, when executed by a processor, implements the method for cross-mirror pedestrian trajectory tracking of any of the above embodiments.
In one embodiment, a computer device is provided, which may be a terminal. The computer device includes a processor, a memory, a network interface, a display screen and an input device connected by a system bus. The processor of the computer device provides computing and control capabilities. The memory of the computer device includes a non-volatile storage medium and an internal memory. The non-volatile storage medium stores an operating system and a computer program. The internal memory provides an environment for running the operating system and the computer program stored in the non-volatile storage medium. The network interface of the computer device is used to communicate with an external terminal through a network connection. The computer program, when executed by the processor, implements a cross-mirror pedestrian trajectory tracking method. The display screen of the computer device may be a liquid crystal display or an electronic ink display, and the input device of the computer device may be a touch layer covering the display screen, a key, trackball or touchpad arranged on the housing of the computer device, or an external keyboard, touchpad, mouse or the like.
It will be understood by those skilled in the art that all or part of the processes of the methods of the embodiments described above can be implemented by a computer program instructing the relevant hardware; the computer program can be stored in a non-volatile computer-readable storage medium and, when executed, can include the processes of the embodiments of the methods described above. Any reference to memory, storage, database or other medium used in the embodiments provided herein may include non-volatile and/or volatile memory. Non-volatile memory can include read-only memory (ROM), programmable ROM (PROM), electrically programmable ROM (EPROM), electrically erasable programmable ROM (EEPROM), or flash memory. Volatile memory can include random access memory (RAM) or external cache memory. By way of illustration and not limitation, RAM is available in a variety of forms such as static RAM (SRAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), double data rate SDRAM (DDR SDRAM), enhanced SDRAM (ESDRAM), Synchlink DRAM (SLDRAM), Rambus direct RAM (RDRAM), direct Rambus dynamic RAM (DRDRAM), and Rambus dynamic RAM (RDRAM).
It should be understood by those skilled in the art that various features of the above-described embodiments can be combined in any combination, and for the sake of brevity, all possible combinations of features in the above-described embodiments are not described in detail, but rather, all combinations of features which are not inconsistent with each other should be construed as being within the scope of the present disclosure.
The above-mentioned embodiments express only several implementations of the present application, and their description is relatively specific and detailed, but they should not be construed as limiting the scope of the invention. It should be noted that a person skilled in the art can make several variations and modifications without departing from the concept of the present application, all of which fall within the scope of protection of the present application. Therefore, the protection scope of the present patent shall be subject to the appended claims.