WO2023093241A1 - Pedestrian re-identification method and device, and storage medium - Google Patents
- Publication number
- WO2023093241A1 (PCT/CN2022/120013)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- pedestrian
- target person
- person
- image
- features
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V20/00—Scenes; Scene-specific elements
- G06V20/40—Scenes; Scene-specific elements in video content
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V40/00—Recognition of biometric, human-related or animal-related patterns in image or video data
- G06V40/10—Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02T—CLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO TRANSPORTATION
- Y02T10/00—Road transport of goods or passengers
- Y02T10/10—Internal combustion engine [ICE] based vehicles
- Y02T10/40—Engine management systems
Definitions
- The embodiments of the present application relate to the technical field of image recognition, and in particular to a pedestrian re-identification method and device, and a storage medium.
- Pedestrian re-identification is an important technology in the field of computer recognition. The problem it solves is determining whether a specific person appears in a given picture or video sequence. Pedestrian re-identification technology is widely used in criminal investigation, video surveillance, behavior understanding, and other fields.
- However, existing pedestrian re-identification technology cannot accurately identify pedestrians who wear different clothing, or whose images differ in brightness, color temperature, and hue, so the overall recall rate is not high.
- The main purpose of the embodiments of the present application is to provide a pedestrian re-identification method, device, and storage medium that can improve the overall recall rate of pedestrian re-identification.
- An embodiment of the present application provides a pedestrian re-identification method, including: acquiring video data and a reference image of a target person; determining suspicious persons in the video data according to the reference image of the target person, and obtaining a candidate image of each suspicious person in the video data, wherein the apparent features of the suspicious person in the candidate image are the same as the apparent features of the target person in the reference image; and determining, according to the reference image of the target person, whether the suspicious person in the candidate image is the target person.
- An embodiment of the present application also provides a pedestrian re-identification device, including at least one processor and a memory communicatively connected to the at least one processor, wherein the memory stores instructions executable by the at least one processor, and the instructions, when executed by the at least one processor, enable the at least one processor to perform the above pedestrian re-identification method.
- An embodiment of the present application further provides a computer-readable storage medium storing a computer program that, when executed by a processor, implements the above pedestrian re-identification method.
- FIG. 1 is a schematic flowchart of a pedestrian re-identification method in an embodiment of the present invention.
- FIG. 2 is a schematic flowchart of another pedestrian re-identification method in an embodiment of the present invention.
- FIG. 3 is a schematic diagram of the unification of apparent features using a style transfer network in an embodiment of the present invention.
- FIG. 4 is an exemplary flowchart of a pedestrian re-identification method in an embodiment of the present invention.
- FIG. 5 is a schematic structural diagram of a pedestrian re-identification device in an embodiment of the present invention.
- Existing pedestrian re-identification methods mainly reach their judgment through a comprehensive analysis of appearance features, including clothing, brightness, color temperature, and hue.
- When the clothing of the same pedestrian remains unchanged, existing pedestrian re-identification networks can adapt to variations in individual posture, slight lighting changes, and changes in camera viewing angle.
- However, existing pedestrian re-identification methods are very sensitive to appearance features, so they cannot recognize the same pedestrian wearing different clothing. In addition, lighting conditions differ considerably between the acquisition areas of different cameras, which leads to large differences in the brightness, color temperature, and hue of images of the same pedestrian. Such changes in the apparent features of the same pedestrian reduce the accuracy of the re-identification method, and the overall recall rate of pedestrian re-identification is not high.
- The embodiment of this application proposes a pedestrian re-identification method, as shown in FIG. 1, including the following steps.
- Step S11: Obtain video data and a reference image of the target person.
- Multiple pieces of video data are obtained from multiple surveillance cameras, which may be distributed in different locations; each surveillance camera corresponds to one piece of video data.
- The target person is the pedestrian who needs to be tracked across the multiple pieces of video data.
- Step S12: Determine the suspicious persons in the video data according to the reference image of the target person, and obtain a candidate image of each suspicious person in the video data.
- The reference image of the target person includes the forensic features of the target person, and the suspicious persons in the video data can be determined according to these forensic features.
- The forensic features are, for example, gait features, body shape, and height.
- The suspicious persons in the video data can be determined according to the gait features of the target person in the reference image, according to the body shape of the target person in the reference image, or according to the height of the target person in the reference image.
- A candidate image in which the suspicious person appears in the video data is then obtained; the apparent features of the suspicious person in the candidate image are the same as the apparent features of the target person in the reference image.
- The apparent (appearance) features here refer to the clothing in the image and to its brightness, color temperature, hue, and so on.
- Step S13: Determine whether the suspicious person in the candidate image is the target person according to the reference image of the target person.
- In practice, the same pedestrian may wear different clothes, or the appearance features of the same pedestrian may change because illumination conditions differ greatly between the acquisition areas of different cameras.
- Because the appearance features of the suspicious person in the candidate image are the same as those of the target person in the reference image, the situation in which the target person cannot be accurately identified due to a change in appearance features is avoided, and the overall recall rate of pedestrian re-identification is improved.
- The pedestrian re-identification method proposed in this application first obtains the video data and the reference image of the target person, then determines the suspicious persons in the video data according to the reference image and obtains a candidate image of each suspicious person, in which the apparent features of the suspicious person are the same as those of the target person, and finally determines, according to the reference image, whether the suspicious person in the candidate image is the target person. The method thus takes into account that the same pedestrian may wear different clothes, or that the apparent features of the same pedestrian may change because of large differences in lighting conditions between cameras. It avoids failing to identify the target person because of such changes and improves the overall recall rate of pedestrian re-identification.
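- The three steps S11 to S13 can be sketched as a small pipeline. This is only an illustration under stated assumptions, not the patented implementation: the `Image` record, the `re_identify` function, and the three callables `screen`, `unify`, and `verify` are hypothetical names, a single number stands in for a forensic feature vector, and a clothing string stands in for the apparent features.

```python
from dataclasses import dataclass, replace
from typing import Callable, List


@dataclass(frozen=True)
class Image:
    person_id: str   # ground-truth identity, kept only so the toy result can be checked
    clothing: str    # stand-in for apparent features (clothing, brightness, hue, ...)
    gait: float      # stand-in for a forensic feature vector


def re_identify(
    reference: Image,
    frames: List[Image],
    screen: Callable[[Image, Image], bool],
    unify: Callable[[Image, Image], Image],
    verify: Callable[[Image, Image], bool],
) -> List[Image]:
    # S12 (first half): coarse screening of suspicious persons by forensic features.
    suspicious = [f for f in frames if screen(reference, f)]
    # S12 (second half): candidate images share the reference's apparent features.
    candidates = [unify(f, reference) for f in suspicious]
    # S13: decide which candidates actually show the target person.
    return [c for c in candidates if verify(reference, c)]


reference = Image("A", "red top", 0.50)
frames = [
    Image("A", "white top", 0.52),  # same person, different clothes
    Image("B", "red top", 0.60),    # different person, same clothes
    Image("C", "red top", 0.95),    # clearly different gait
]
matches = re_identify(
    reference,
    frames,
    screen=lambda ref, img: abs(ref.gait - img.gait) < 0.2,      # assumed loose threshold
    unify=lambda img, ref: replace(img, clothing=ref.clothing),  # assumed style unification
    verify=lambda ref, img: abs(ref.gait - img.gait) < 0.05,     # assumed strict threshold
)
print([m.person_id for m in matches])
```

- In this toy run, person B passes the coarse screening but is rejected at verification, while person A is kept even though the original clothing differs from the reference.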
- Another pedestrian re-identification method is proposed in this embodiment, as shown in FIG. 2, including the following steps.
- Step S21: Obtain video data and a reference image of the target person.
- Step S21 is the same as step S11 in the previous embodiment and, to avoid repetition, is not described again here.
- Step S22: Determine the forensic features of the target person according to the reference image of the target person, and screen suspicious persons in the video data according to the forensic features of the target person and the preset pedestrian activity trajectory database.
- The retrieval data used for pedestrian re-identification is video data. The data volume of video is very large, it contains a lot of redundant information, and it is unstructured.
- Each activity trajectory represents the continuous positions of a pedestrian in the video over a period of time.
- A database of pedestrian activity trajectories can be constructed by detecting pedestrian activity trajectories.
- In this way, video data is transformed from unstructured data into structured data that is easy to analyze and manage.
- An activity trajectory in this embodiment is, for example: from time A to time B, a pedestrian moves from point C to point D.
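- One possible structured record for such an activity trajectory is sketched below. The class name `Trajectory`, its fields, and the coordinate convention are assumptions made for illustration; the source only states that a trajectory captures a pedestrian moving from point C to point D between time A and time B.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Trajectory:
    track_id: int
    camera_id: str
    # (timestamp in seconds, x, y) samples, ordered by time
    points: List[Tuple[float, float, float]]

    def duration(self) -> float:
        """Time B minus time A."""
        return self.points[-1][0] - self.points[0][0]

    def summary(self) -> str:
        t0, x0, y0 = self.points[0]
        t1, x1, y1 = self.points[-1]
        return (
            f"track {self.track_id} on {self.camera_id}: "
            f"t={t0}..{t1}, ({x0}, {y0}) -> ({x1}, {y1})"
        )


trajectory = Trajectory(
    track_id=7,
    camera_id="cam-1",
    points=[(10.0, 1.0, 2.0), (11.0, 1.5, 2.0), (12.0, 3.0, 4.0)],
)
print(trajectory.summary())
```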
- A preliminary screening is performed on the pedestrians in the video data, and suspicious persons are identified from among them.
- The suspicious persons in the video data can be determined according to the forensic features in the reference image of the target person and the activity trajectory of each pedestrian in the preset pedestrian activity trajectory database. According to the activity trajectory of each pedestrian, the pedestrians in the video data whose forensic features are similar to those of the target person are determined to be suspicious persons. Forensic features include but are not limited to gait features, gender, age, height, body shape, skin color, hairstyle and hair color, face shape, and appendages. For example, the body shape of the target person can be determined from the reference image of the target person, the body shape of each pedestrian can be determined from the video data according to that pedestrian's activity trajectory, and the suspicious persons in the video data can be initially screened out by comparing body shapes.
- Screening the suspicious persons in the video data includes: obtaining the activity trajectory of each pedestrian from the preset pedestrian activity trajectory database; acquiring the continuous image frames corresponding to the activity trajectory from the video data; calculating the gait features of each pedestrian according to the continuous image frames; and determining that a pedestrian is a suspicious person according to the similarity between the gait features of that pedestrian and the gait features of the target person.
- The continuous image frames corresponding to the activity trajectory of each pedestrian are obtained from the preset pedestrian activity trajectory database, and the forensic features (e.g., gait) of each pedestrian are determined according to these continuous image frames; the forensic features (e.g., gait) of the target person are determined according to the reference image of the target person.
- A first preset threshold can be set. When the similarity between the forensic features of the target person and the forensic features of a pedestrian is greater than the first preset threshold, the pedestrian is preliminarily determined to be a suspicious person similar to the target person; when the similarity is less than the first preset threshold, the pedestrian is determined not to be similar to the target person and need not be considered further.
- The first preset threshold in this embodiment may be 0.5, and it may be set according to actual needs in practical applications.
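- The threshold test described above can be sketched as follows. The source does not name a similarity measure, so cosine similarity is an assumption here, as are the function names and the three-dimensional toy gait vectors; only the threshold value 0.5 comes from the text.

```python
import math
from typing import Dict, List, Sequence

FIRST_PRESET_THRESHOLD = 0.5  # value suggested in the embodiment


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity (an assumed measure; the source does not specify one)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def screen_suspicious(
    target: Sequence[float],
    pedestrians: Dict[str, Sequence[float]],
    threshold: float = FIRST_PRESET_THRESHOLD,
) -> List[str]:
    """Keep pedestrians whose forensic features exceed the similarity threshold."""
    return [
        pid
        for pid, features in pedestrians.items()
        if cosine_similarity(target, features) > threshold
    ]


target_gait = [1.0, 0.0, 1.0]
pedestrian_gaits = {
    "p1": [1.0, 0.1, 0.9],  # close to the target
    "p2": [0.0, 1.0, 0.0],  # orthogonal to the target
    "p3": [0.2, 1.0, 0.0],  # weakly similar
}
suspicious = screen_suspicious(target_gait, pedestrian_gaits)
print(suspicious)
```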
- Gait is one of the most commonly used forensic features. Compared with other biometric information such as the human face, gait has the advantages of being contactless, detectable at long distance, and difficult to disguise. However, gait recognition currently still suffers from low accuracy, and it is difficult to use gait recognition to verify the identity of pedestrians directly. Therefore, gait recognition can be used as a way to initially screen suspicious persons in video data.
- The trained gait recognition algorithm GaitSet can be used for gait feature recognition. By judging the similarity between the gait features of each pedestrian in the video data and the gait features of the target person, it is determined whether the pedestrian is a suspicious person similar to the target person.
- The pedestrian activity trajectory database preset in this embodiment can be obtained in the following manner: determining the activity trajectory of each pedestrian in the video data, where an activity trajectory represents the continuous positions of the pedestrian in the video over a period of time; clustering the activity trajectories, where all activity trajectories in one class belong to the same pedestrian; and constructing the pedestrian activity trajectory database from the clustered activity trajectories. Specifically, there are a large number of activity trajectories in the pedestrian activity trajectory database, and the detected activity trajectories may be trajectories of the same pedestrian in different cameras or trajectories of different pedestrians.
- Activity trajectories can be clustered based on forensic features (such as gait features): activity trajectories with similar forensic features are grouped into one class, and the activity trajectories in one class correspond to the same pedestrian.
- After clustering, the gait features of the target person can be compared with the gait features of the pedestrian corresponding to each class to determine whether that pedestrian is a suspicious person. This avoids comparing the gait features of the target person with the gait features of the pedestrians in each piece of video data one by one, which can greatly improve retrieval efficiency.
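- The efficiency argument can be illustrated with a small sketch. The source does not specify the clustering algorithm, so the greedy centroid-based grouping below, the cosine measure, the 0.9 clustering threshold, and all names are assumptions; the point shown is only that the target is compared once per cluster rather than once per trajectory.

```python
import math
from typing import Dict, List, Sequence


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def mean_vector(vectors: List[Sequence[float]]) -> List[float]:
    return [sum(column) / len(vectors) for column in zip(*vectors)]


def cluster_trajectories(
    features: Dict[str, Sequence[float]], threshold: float = 0.9
) -> List[List[str]]:
    """Greedy grouping: join the first cluster whose centroid is similar enough."""
    clusters: List[List[str]] = []
    for track_id, feature in features.items():
        for members in clusters:
            centroid = mean_vector([features[m] for m in members])
            if cosine(centroid, feature) > threshold:
                members.append(track_id)
                break
        else:  # no existing cluster matched, so start a new one
            clusters.append([track_id])
    return clusters


def suspicious_clusters(
    target: Sequence[float],
    features: Dict[str, Sequence[float]],
    clusters: List[List[str]],
    threshold: float = 0.5,
) -> List[List[str]]:
    """Compare the target against one centroid per cluster, not every trajectory."""
    return [
        members
        for members in clusters
        if cosine(target, mean_vector([features[m] for m in members])) > threshold
    ]


gait_features = {
    "cam1-t1": [1.0, 0.0],
    "cam2-t4": [0.95, 0.05],
    "cam1-t2": [0.0, 1.0],
    "cam3-t9": [0.1, 0.9],
}
clusters = cluster_trajectories(gait_features)
hits = suspicious_clusters([0.9, 0.1], gait_features, clusters)
print(clusters, hits)
```

- Here four trajectories collapse into two clusters, so the target is compared against two centroids instead of four individual trajectories.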
- Step S23: According to the suspicious person, determine the candidate image of the suspicious person from the preset pedestrian image query library.
- The preset pedestrian image query library is obtained in the following manner: obtaining the image frames of the corresponding activity trajectory according to the activity trajectory of each pedestrian in the preset pedestrian activity trajectory database; extracting representative images from the image frames; and unifying the apparent features of each pedestrian in all representative images to obtain the pedestrian image query library.
- The forensic features of the suspicious persons are similar to those of the target person, but the images corresponding to these suspicious persons may show the same individual with different appearance features, or different individuals with different appearance features.
- The clothing attribute is a typical appearance feature, and a change in the clothing attribute leads to a sharp drop in the recognition rate of existing pedestrian re-identification systems.
- This embodiment uses the style transfer network DG-Net++ to unify the clothing attributes of all images of suspicious persons. After conversion, all image frames of suspicious persons have the same clothing attributes.
- The images of all suspicious persons after attribute unification can be obtained from the pedestrian image query library. In the pedestrian image query library, each suspicious person is associated with the image frames in which that suspicious person appears, and all image frames of suspicious persons have the same clothing attribute.
- The image frames corresponding to each pedestrian activity trajectory include multiple frames; comparing each frame with the image of the target person one by one would be cumbersome and inefficient.
- Therefore, the continuous image frames corresponding to the activity trajectory of each suspicious person are obtained, representative images are extracted from these continuous image frames, and the representative image frames are used as candidate images of the suspicious person.
- The representative image frame is the image frame in which the most human body parts are exposed, and the apparent features of each pedestrian in all representative images are unified to obtain the pedestrian image query library.
- One or two candidate image frames may be extracted.
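- Selecting one or two representative frames can be sketched as a top-k choice over a per-frame visibility score. How "exposed body parts" are measured is not stated in the source; the sketch assumes a hypothetical count of visible body parts per frame (for example from a pose estimator) and breaks ties by earliest frame.

```python
from typing import Dict, List


def select_representative_frames(visible_parts: Dict[int, int], k: int = 2) -> List[int]:
    """Return the indices of the k frames with the most visible body parts.

    visible_parts maps a frame index to an assumed count of visible body parts.
    Ties are broken in favor of the earlier frame.
    """
    if k < 1:
        raise ValueError("k must be at least 1")
    ranked = sorted(visible_parts, key=lambda idx: (-visible_parts[idx], idx))
    return sorted(ranked[:k])


# Frame index -> number of visible body parts (toy values).
visible_parts = {0: 9, 1: 14, 2: 17, 3: 17, 4: 11}
two_frames = select_representative_frames(visible_parts, k=2)
one_frame = select_representative_frames(visible_parts, k=1)
print(two_frames, one_frame)
```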
- Step S24: Unify the apparent features of the suspicious person in the candidate image with the apparent features of the target person in the reference image of the target person.
- Unifying the apparent features of the suspicious person in the candidate image with the apparent features of the target person in the reference image includes: using a style transfer network to make the apparent features of the suspicious person in the candidate image the same as the apparent features of the target person in the reference image.
- The apparent features mainly include clothing attributes, such as clothing style and clothing color. The clothing style and clothing color of the candidate images of the suspicious persons and of the reference image of the target person are unified, so that all candidate images have the same apparent features as the reference image.
- For example, if the suspicious person in the candidate image wears a white short-sleeved top and black trousers, and the target person in the reference image wears a red short-sleeved top and gray shorts, the candidate image and the reference image can be unified so that the apparent features in all images are a red short-sleeved top and gray shorts, or so that they are all a white short-sleeved top and black trousers.
- The style transfer network (also called a style conversion network) can be used to unify the apparent features, so that the apparent features in all candidate images are the same as those in the reference image of the target person. This avoids later difficulty in deciding whether a suspicious person is the target person merely because their apparent features differ.
- After conversion, the suspicious persons in the different candidate images have clothing similar to that of the target person. For example, the target person is A with clothing attribute 1, so the target person and clothing attribute are recorded as A1; the candidate images contain two suspicious persons, B and C, with clothing attributes 2 and 3, recorded as B2 and C3.
- Using DG-Net++ to convert the clothing attributes of suspicious persons B and C to the style of A yields B1 and C1.
- A1, B1, and C1 then have similar clothing attributes.
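- The A1/B2/C3 notation can be made concrete with a toy model in which an image is reduced to an identity and a clothing attribute. The function `transfer_clothing` below is a hypothetical stand-in for a style transfer network such as DG-Net++; it does not call or imitate that network's actual interface and only shows the intended effect, namely that clothing changes while identity is preserved.

```python
from dataclasses import dataclass, replace
from typing import List


@dataclass(frozen=True)
class PersonImage:
    identity: str   # who is in the image (A, B, C, ...)
    clothing: int   # clothing attribute (1, 2, 3, ...)

    @property
    def tag(self) -> str:
        return f"{self.identity}{self.clothing}"


def transfer_clothing(image: PersonImage, style_source: PersonImage) -> PersonImage:
    """Stand-in for style transfer: copy the clothing attribute, keep the identity."""
    return replace(image, clothing=style_source.clothing)


target = PersonImage("A", 1)
candidates: List[PersonImage] = [PersonImage("B", 2), PersonImage("C", 3)]
unified = [transfer_clothing(c, target) for c in candidates]
tags = [target.tag] + [img.tag for img in unified]
print(tags)
```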
- The style transfer network is a generative adversarial network, which consists of two parts: a generator and a discriminator.
- The inputs to the network are the image of the pedestrian to be converted (i.e., the candidate image) and the target style image (i.e., the reference image of the target person).
- The apparent features in all candidate images can be converted to match the apparent features in the reference image of the target person, or the apparent features in the reference image of the target person can be converted to match those in the candidate images; the direction that reduces the workload can be chosen. Whichever approach is used in practice, it is only necessary to unify the apparent features in the candidate images of the suspicious persons and the reference image of the target person.
- Step S25: Determine whether the suspicious person in the candidate image is the target person according to the reference image of the target person.
- Before determining whether the suspicious person is the target person, the apparent features of the suspicious person in the candidate image are first unified with the apparent features of the target person in the reference image, and only then is it determined, according to the reference image, whether the suspicious person in the candidate image is the target person. This takes into account that the same pedestrian may wear different clothes, or that the apparent features of the same pedestrian may change because of large differences in lighting conditions between the acquisition areas of different cameras. It avoids failing to identify the target person accurately because of changes in appearance features and improves the overall recall rate of the pedestrian re-identification method.
- Determining whether the suspicious person in the candidate image is the target person according to the reference image of the target person includes: inputting the reference image of the target person into a pre-trained pedestrian re-identification network to obtain the forensic features of the target person, where the forensic features are features that are independent of the appearance features; inputting the candidate image into the pre-trained pedestrian re-identification network to obtain the forensic features of the suspicious person; obtaining the similarity between the forensic features of the suspicious person and the forensic features of the target person; and determining the target person from among the suspicious persons according to the similarity.
- Existing person re-identification networks mainly focus on the appearance features of pedestrians, and two pedestrians with similar clothing may have very similar features, so such networks cannot correctly distinguish them.
- The present invention uses a retrained pedestrian re-identification network.
- During training, the style conversion network is used to unify the apparent features of the pedestrians in the sample images so that the pedestrians in all sample images have the same apparent features, while the pedestrian labels are kept unchanged after the apparent features are changed. In this way the network learns forensic features that are independent of the apparent features of pedestrians.
- The forensic features include but are not limited to gait features, gender, age, height, body shape, skin color, hairstyle, hair color, face shape, and appendages.
- The pre-trained new pedestrian re-identification network is used to obtain the forensic features of the suspicious person in the candidate image and the forensic features of the target person in the reference image. Because the forensic features learned by the new pedestrian re-identification network are independent of the apparent features, whether the suspicious person is the target person can be determined by judging the similarity between the forensic features of the suspicious person and the forensic features of the target person.
- Alternatively, determining whether the suspicious person in the candidate image is the target person according to the reference image of the target person includes: inputting the reference image of the target person into a pre-trained pedestrian re-identification network to obtain the forensic features of the target person, where the forensic features are features that are independent of the appearance features; determining, according to the candidate image, the forensic features of the suspicious person from a preset pedestrian feature database, where the pedestrian feature database includes the forensic features of all pedestrians in the video data; and determining whether the suspicious person is the target person according to the similarity between the forensic features of the suspicious person and the forensic features of the target person.
- The pedestrian feature database is built in advance, and the forensic features of the suspicious persons in the existing video data are stored in it.
- All suspicious persons whose similarity is greater than a second preset threshold are first selected; these suspicious persons are then filtered using spatio-temporal information, and the suspicious persons remaining after filtering are determined to be the target person.
- The spatio-temporal information mentioned in this embodiment refers to time information and location information, that is, when and where a pedestrian appears. The time information and location information of each image frame in the video data of the monitoring system are known.
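- The two-stage filtering can be sketched as follows. The source does not say how spatio-temporal information is applied, so the rule used here is an assumption: a sighting is kept only if it could be reached from a known sighting of the target at a plausible walking speed. The second threshold value 0.8, the speed limit, and all names are likewise hypothetical.

```python
import math
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Sighting:
    person_id: str
    similarity: float  # forensic-feature similarity to the target
    time_s: float      # when the person appears
    x_m: float         # where the person appears (meters)
    y_m: float


def filter_matches(
    sightings: List[Sighting],
    anchor: Sighting,
    second_threshold: float = 0.8,  # assumed value; not given in the source
    max_speed_mps: float = 3.0,     # assumed plausible pedestrian speed
) -> List[Sighting]:
    kept = []
    for s in sightings:
        # Stage 1: similarity must exceed the second preset threshold.
        if s.similarity <= second_threshold:
            continue
        # Stage 2 (assumed rule): the place must be reachable in the elapsed time.
        elapsed = abs(s.time_s - anchor.time_s)
        distance = math.hypot(s.x_m - anchor.x_m, s.y_m - anchor.y_m)
        if distance <= max_speed_mps * elapsed:
            kept.append(s)
    return kept


anchor = Sighting("target", 1.0, time_s=0.0, x_m=0.0, y_m=0.0)
sightings = [
    Sighting("p1", 0.92, time_s=60.0, x_m=100.0, y_m=0.0),  # similar and reachable
    Sighting("p2", 0.95, time_s=10.0, x_m=500.0, y_m=0.0),  # similar but too far away
    Sighting("p3", 0.60, time_s=60.0, x_m=10.0, y_m=0.0),   # below the threshold
]
confirmed = filter_matches(sightings, anchor)
print([s.person_id for s in confirmed])
```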
- The present invention proposes a pedestrian re-identification system and method with a high recall rate.
- The pedestrians appearing in two videos may be the same individual with different appearance features, or they may be two different individuals.
- Candidate targets are first screened using forensic features; the appearance representations of the candidate targets are then transferred by style conversion so that the pedestrian appearance representations are unified.
- In this way, the impact of changes in appearance representation on the person re-identification system is eliminated, and the recall rate of the person re-identification system is improved.
- Style conversion can address not only the impact of changes in pedestrian clothing but also the impact of illumination changes on the appearance representation.
- The invention improves the generalization and robustness of the pedestrian re-identification network.
- The step division of the above methods is only for clarity of description. During implementation, steps may be combined into one step, or a step may be split into multiple steps; as long as the same logical relationship is included, they are within the scope of protection of this patent. Adding insignificant modifications to, or introducing insignificant designs into, the algorithm or process without changing its core design is also within the scope of protection of this patent.
- Pedestrian re-identification is one of the most commonly used technologies in criminal case investigation. Through pedestrian re-identification, investigators can analyze the activity trajectory of criminal suspects, which supports the analysis and detection of criminal cases. In recent years, criminal suspects have become increasingly aware of counter-surveillance. While committing a crime they usually cover their faces with masks and hats, and they may even change clothes between appearances in different cameras. As a result, when existing pedestrian re-identification systems are used to track a suspect, the target is easily lost and recognition accuracy is low.
- The high-recall pedestrian re-identification method proposed in this embodiment can greatly improve the recall rate of the pedestrian re-identification system; combined with spatio-temporal information (time information and position information) to filter out falsely detected targets, it can achieve accurate identification of targets.
- Step S31: Use pedestrian detection and pedestrian tracking to obtain pedestrian activity trajectories, and build a pedestrian activity trajectory database.
- This embodiment first uses the general object detection framework You Only Look Once v4 (YOLOv4) to detect pedestrians in the video.
- YOLOv4 has the advantages of high detection accuracy and fast inference speed.
- A multi-object tracking algorithm (Fair multi-object tracking, FairMOT for short) is then used to track each pedestrian, obtain the pedestrian's activity trajectory, and build a pedestrian activity trajectory database. Each activity trajectory represents the continuous positions of a pedestrian in the video over a period of time.
- The FairMOT algorithm achieves the highest accuracy on several multi-object tracking datasets.
- In this way, unstructured video data is transformed into structured video data that is easy to analyze and manage. Since the research focus of this embodiment is not pedestrian detection or pedestrian tracking, open-source object detection and pedestrian tracking algorithms can be used directly.
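- The conversion from per-frame tracker output to a structured trajectory database can be sketched as below. The sketch does not call YOLOv4 or FairMOT; it assumes a hypothetical tracker output of `(frame_index, track_id, x, y, w, h)` tuples with the box origin at the top-left corner, and the database layout is likewise an assumption.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

# Assumed tracker output: (frame_index, track_id, x, y, width, height)
TrackedBox = Tuple[int, int, float, float, float, float]


def build_trajectory_db(
    tracked_boxes: List[TrackedBox], fps: float, camera_id: str
) -> Dict[Tuple[str, int], dict]:
    """Group tracked boxes by track id into time-ordered trajectories."""
    by_track = defaultdict(list)
    for frame_idx, track_id, x, y, w, h in tracked_boxes:
        foot_point = (x + w / 2.0, y + h)  # bottom center of the bounding box
        by_track[track_id].append((frame_idx / fps, foot_point))

    db = {}
    for track_id, samples in by_track.items():
        samples.sort(key=lambda sample: sample[0])
        db[(camera_id, track_id)] = {
            "start_s": samples[0][0],
            "end_s": samples[-1][0],
            "positions": [point for _, point in samples],
        }
    return db


tracked = [
    (0, 1, 10.0, 20.0, 4.0, 8.0),
    (25, 1, 12.0, 20.0, 4.0, 8.0),
    (25, 2, 50.0, 20.0, 4.0, 8.0),
    (50, 1, 14.0, 20.0, 4.0, 8.0),
]
db = build_trajectory_db(tracked, fps=25.0, camera_id="cam-1")
print(db[("cam-1", 1)])
```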
- Step S32: Cluster all trajectories of suspected objects based on forensic features to obtain the appearance feature library of candidate objects.
- There are a large number of activity trajectories in the pedestrian activity trajectory database, and the detected activity trajectories may be trajectories of the same pedestrian in different cameras or trajectories of different pedestrians.
- Activity trajectories can be clustered based on forensic features (such as gait features): activity trajectories with similar forensic features are grouped into one class, and the activity trajectories in one class correspond to the same person.
- The trained gait recognition algorithm GaitSet is used for gait recognition. However, gait recognition currently still suffers from low accuracy, and it is difficult to use it to verify the identity of pedestrians directly, so this example uses gait features only as the basis for initial screening.
- When the identity of a pedestrian needs to be verified, the gait features of each person are first extracted after clustering, and the activity trajectories whose gait similarity to the target person exceeds a certain threshold are then retrieved from the activity trajectory database to screen out suspicious persons.
- The gait similarity threshold in this embodiment may be 0.5, and it may be set according to actual needs in practical applications.
- The video data can then be used to select key representative images for each suspicious person's activity trajectory to form an appearance feature library.
- Step S33: Use the style transfer network to normalize the clothing attributes of the images in the appearance feature library to a standard style to obtain candidate images, and form a pedestrian image query library.
- The forensic features of the suspicious persons in the images in the appearance feature library are similar to those of the target person, but these suspicious persons may be the same individual with different clothing attributes, or different individuals with different clothing attributes.
- Clothing style is a typical clothing attribute, and a change in clothing attributes leads to a sharp drop in the recognition rate of existing person re-identification systems.
- this embodiment uses the style conversion network DG-Net++ to unify the clothing attributes of different images in the appearance feature library. After conversion, candidate images of suspicious persons are obtained, and all candidate images of suspicious persons form Pedestrian image query library.
- Step S34 Retrain the pedestrian re-identification network, and use the retrained pedestrian re-identification network to extract features from the pictures in the pedestrian image query library, and add the features to the pedestrian feature library.
- Step S35 Search in the pedestrian feature database according to the features of the target person.
- the pedestrian re-identification network structure used in the present invention is UPTP-ReID, and the network has achieved high accuracy in multiple test benchmarks.
- UPTP-ReID the style transfer network to carry out the clothing attributes of the pedestrians in the collected sample images and keep the labels of the pedestrians unchanged after the dressing, so that the pedestrian re-identification network can learn forensic features that are not related to the clothing attributes of pedestrians.
- the pedestrian feature database is pre-built, and the forensic features of suspicious persons in the existing video data are stored in the database.
- Step S36: Combine spatiotemporal information to filter out misidentified false positive samples, and output the final retrieval result.
- the position information and time information of the monitoring devices in the monitoring system are known, so the position information and time information of each frame in the video data are also known, and false positive samples can therefore be filtered by position and time. For example, within a short period of time, a criminal suspect cannot appear in two monitoring devices that are far apart.
- the high-recall pedestrian re-identification method proposed in this embodiment first clusters all activity trajectories of pedestrians based on forensic features to obtain the appearance feature library of suspicious persons; then uses the style transfer network to unify the clothing attributes of all suspicious persons to form a complete pedestrian image query library; derives the pedestrian feature library from the pedestrian image query library and searches it; and finally combines position information and time information to filter out misidentified false positive samples and outputs the final retrieval result.
- an embodiment of the present invention also relates to a pedestrian re-identification apparatus which, as shown in FIG. 5, includes at least one processor 401 and a memory 402 communicatively connected to the at least one processor 401, wherein the memory 402 stores instructions executable by the at least one processor 401, and the instructions are executed by the at least one processor 401 so that the at least one processor 401 can perform the above pedestrian re-identification method.
- the memory 402 and the processor 401 are connected by a bus, and the bus may include any number of interconnected buses and bridges, and the bus connects one or more processors 401 and various circuits of the memory 402 together.
- the bus may also connect together various other circuits such as peripherals, voltage regulators, and power management circuits, all of which are well known in the art and therefore will not be further described herein.
- the bus interface provides an interface between the bus and the transceivers.
- a transceiver may be a single element or multiple elements, such as multiple receivers and transmitters, providing means for communicating with various other devices over a transmission medium.
- the data processed by the processor 401 is transmitted over the wireless medium through the antenna; the antenna also receives data and passes it to the processor 401.
- the processor 401 is responsible for managing the bus and general processing, and may also provide various functions including timing, peripheral interfacing, voltage regulation, power management, and other control functions. The memory 402 may be used to store data used by the processor 401 when performing operations.
- An embodiment of the present invention also provides a computer-readable storage medium storing a computer program, the computer program implementing the above pedestrian re-identification method when executed by a processor.
- that is, those skilled in the art will understand that all or some of the steps of the methods in the above embodiments can be completed by a program instructing the relevant hardware. The program is stored in a storage medium and includes several instructions that cause a device (which may be a single-chip microcomputer, a chip, etc.) or a processor to execute all or some of the steps of the methods described in the embodiments of the present application.
- the aforementioned storage media include various media that can store program code, such as a USB flash drive, a removable hard disk, read-only memory (ROM), random access memory (RAM), a magnetic disk, or an optical disc.
- an embodiment of the present invention also provides a computer program product, the computer program product includes a computer program stored on a non-transitory computer-readable storage medium, the computer program includes program instructions, and when the program instructions are executed by a computer, the computer is caused to execute the method in any of the above method embodiments.
Landscapes
- Engineering & Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Multimedia (AREA)
- Theoretical Computer Science (AREA)
- Human Computer Interaction (AREA)
- Closed-Circuit Television Systems (AREA)
- Image Analysis (AREA)
Abstract
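Embodiments of the present invention relate to the technical field of image recognition and disclose a pedestrian re-identification method, comprising: acquiring video data and a reference image of a target person; determining suspicious persons in the video data according to the reference image of the target person, and acquiring a candidate image of each suspicious person in the video data, wherein the appearance features of the suspicious person in the candidate image are the same as the appearance features of the target person in the reference image; and determining, according to the reference image of the target person, whether the suspicious person in the candidate image is the target person.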
Description
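Embodiments of the present application relate to the technical field of image recognition, and in particular to a pedestrian re-identification method and apparatus, and a storage medium.
With the development of society, safety in public places is receiving more and more attention, and intelligent video surveillance and analysis systems are widely used. Identity technologies such as face recognition are widely applied to video analysis and understanding, but when the target person is far from the camera, the surveillance picture is blurred, or the face is occluded, authenticating a pedestrian's identity with a face recognition system may fail completely. In such cases, a pedestrian re-identification system can be used to identify the pedestrian. Pedestrian re-identification is an important technology in the field of computer recognition; the problem it solves is retrieving whether a specific person is present in a given image or video sequence. Pedestrian re-identification is widely used in criminal investigation, video surveillance, behavior understanding, and other fields.
In some cases, pedestrian re-identification technology cannot identify pedestrians accurately when they wear different clothing, or when appearance features of the pedestrian image such as brightness, color temperature, and hue change, so the overall recall is low.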
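Summary of the Invention
The main purpose of the embodiments of the present application is to provide a pedestrian re-identification method and apparatus, and a storage medium, that can improve the overall recall of pedestrian re-identification.
To achieve the above purpose, an embodiment of the present application provides a pedestrian re-identification method, comprising: acquiring video data and a reference image of a target person; determining suspicious persons in the video data according to the reference image of the target person, and acquiring a candidate image of each suspicious person in the video data, wherein the appearance features of the suspicious person in the candidate image are the same as the appearance features of the target person in the reference image; and determining, according to the reference image of the target person, whether the suspicious person in the candidate image is the target person.
To achieve the above purpose, an embodiment of the present application further provides a pedestrian re-identification apparatus, comprising at least one processor and a memory communicatively connected to the at least one processor, wherein the memory stores instructions executable by the at least one processor, and the instructions are executed by the at least one processor so that the at least one processor can perform the above pedestrian re-identification method.
To achieve the above purpose, an embodiment of the present application further provides a computer-readable storage medium storing a computer program, the computer program implementing the above pedestrian re-identification method when executed by a processor.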
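One or more embodiments are illustrated by way of example with the figures in the corresponding drawings; these illustrations do not limit the embodiments.
FIG. 1 is a schematic flowchart of a pedestrian re-identification method in an embodiment of the present invention;
FIG. 2 is a schematic flowchart of another pedestrian re-identification method in an embodiment of the present invention;
FIG. 3 is a schematic diagram of unifying appearance features with a style transfer network in an embodiment of the present invention;
FIG. 4 is an example flowchart of the pedestrian re-identification method in an embodiment of the present invention;
FIG. 5 is a schematic structural diagram of a pedestrian re-identification apparatus in an embodiment of the present invention.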
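To make the purposes, technical solutions, and advantages of the embodiments of the present application clearer, the embodiments are described in detail below with reference to the drawings. Those of ordinary skill in the art will understand that many technical details are given in the embodiments so that the reader can better understand the application. However, the technical solutions claimed in the present application can be implemented even without these technical details and the various changes and modifications based on the following embodiments. The division into embodiments below is for convenience of description and does not limit the specific implementation of the application; the embodiments may be combined with and refer to one another where they do not contradict each other.
Current pedestrian re-identification methods mainly make their judgment by analyzing appearance features as a whole; appearance features include clothing, brightness, color temperature, hue, and so on. Within a short time span, the clothing of the same pedestrian stays the same, and existing pedestrian re-identification networks can cope with individual pose, slight illumination changes, and changes of camera viewpoint. Over a longer time span, for example one month later, the clothing of the same pedestrian may change. Because existing pedestrian re-identification methods are very sensitive to appearance features, they then cannot identify the same pedestrian wearing different clothing. In addition, illumination differs considerably between the areas covered by different cameras, which also causes appearance features of images of the same pedestrian, such as brightness, color temperature, and hue, to differ considerably. When the appearance features of the same pedestrian change, the accuracy of the pedestrian re-identification method drops and the overall recall is low.
To mitigate the impact of changes in pedestrians' appearance features on pedestrian re-identification and to improve the stability and generalization of the method, an embodiment of the present application proposes a pedestrian re-identification method, shown in FIG. 1, comprising the following steps.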
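Step S11: Acquire video data and a reference image of the target person.
Specifically, multiple pieces of video data are acquired from multiple surveillance cameras, which may be located at different positions; each surveillance camera corresponds to one piece of video data. The target person is the pedestrian to be tracked across these pieces of video data.
Step S12: Determine the suspicious persons in the video data according to the reference image of the target person, and acquire a candidate image of each suspicious person in the video data.
Specifically, the reference image of the target person contains the target person's forensic features, and the suspicious persons in the video data can be determined from these forensic features. Examples of forensic features are gait features, body shape, and height. In this embodiment, the suspicious persons in the video data may be determined from the target person's gait features in the reference image, from the target person's body shape in the reference image, or from the target person's height in the reference image. After the suspicious persons in the video data are determined, the candidate images in which each suspicious person appears are acquired from the video data; the appearance features of the suspicious person in a candidate image are the same as the appearance features of the target person in the reference image. Appearance features here refer to the clothing, brightness, color temperature, hue, and so on in the image.
Step S13: Determine, according to the reference image of the target person, whether the suspicious person in the candidate image is the target person.
This embodiment takes full account of the cases in which the same pedestrian wears different clothing, or in which the appearance features of the same pedestrian change because illumination differs considerably between the areas covered by different cameras. Because the appearance features of the suspicious person in the candidate image are the same as those of the target person in the reference image, failures to identify the target person caused by changed appearance features are avoided, and the overall recall of pedestrian re-identification is improved.
The pedestrian re-identification method proposed in the present application first acquires video data and a reference image of the target person; then determines the suspicious persons in the video data according to the reference image and acquires a candidate image of each suspicious person, in which the appearance features of the suspicious person are the same as those of the target person; and then determines, according to the reference image, whether the suspicious person in the candidate image is the target person. The method takes full account of the cases in which the same pedestrian wears different clothing or in which differing illumination between cameras changes the appearance features of the same pedestrian, avoids failures to identify the target person caused by changed appearance features, and improves the overall recall of pedestrian re-identification.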
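This embodiment also proposes another pedestrian re-identification method, shown in FIG. 2, comprising the following steps.
Step S21: Acquire video data and a reference image of the target person.
Step S21 is the same as step S11 in the previous embodiment and, to avoid repetition, is not described again here.
Step S22: Determine the forensic features of the target person according to the reference image of the target person, and screen the suspicious persons in the video data according to the target person's forensic features and a preset pedestrian activity trajectory database.
The data searched in pedestrian re-identification is video data, which is very large, contains a great deal of redundant information, and is unstructured. To retrieve suspicious persons quickly from massive data, the pedestrians in the video must be detected in advance and tracked to obtain their activity trajectories; each activity trajectory represents the continuous positions of a pedestrian in the video over a period of time. The video data contains a large number of pedestrians, and a pedestrian activity trajectory database can be built by detecting their activity trajectories. In this way, the video data is converted from unstructured data into structured data that is easy to analyze and manage. An example of an activity trajectory in this embodiment is: from time A to time B, the pedestrian moves from point C to point D.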
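As an illustration only, the structured trajectory data described above might be represented as follows. The record layout and the names `TrackPoint`, `Trajectory`, and `TrajectoryDatabase` are assumptions made for this sketch; the application does not specify a schema.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2) in pixels


@dataclass
class TrackPoint:
    timestamp: float  # seconds since the start of the video (assumed unit)
    box: Box          # pedestrian bounding box in that frame


@dataclass
class Trajectory:
    """Hypothetical record: one pedestrian's continuous positions in one camera."""
    track_id: int
    camera_id: str
    points: List[TrackPoint] = field(default_factory=list)

    def add(self, timestamp: float, box: Box) -> None:
        self.points.append(TrackPoint(timestamp, box))

    def time_span(self) -> Tuple[float, float]:
        """Return (start, end), i.e. "from time A to time B"."""
        times = [p.timestamp for p in self.points]
        return min(times), max(times)

    def endpoints(self):
        """Return the box centres at the first and last observation ("point C", "point D")."""
        ordered = sorted(self.points, key=lambda p: p.timestamp)
        return _centre(ordered[0].box), _centre(ordered[-1].box)


def _centre(box: Box) -> Tuple[float, float]:
    x1, y1, x2, y2 = box
    return ((x1 + x2) / 2.0, (y1 + y2) / 2.0)


class TrajectoryDatabase:
    """In-memory stand-in for the pedestrian activity trajectory database."""

    def __init__(self) -> None:
        self._by_camera: Dict[str, List[Trajectory]] = {}

    def insert(self, trajectory: Trajectory) -> None:
        self._by_camera.setdefault(trajectory.camera_id, []).append(trajectory)

    def all(self) -> List[Trajectory]:
        return [t for group in self._by_camera.values() for t in group]
```

Storing one record per trajectory rather than per frame is what turns the raw video into data that can be queried by person, camera, and time.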
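To improve retrieval efficiency, this embodiment first performs an initial screening of the pedestrians in the video data and determines the suspicious persons among them.
In this embodiment, the suspicious persons in the video data may be determined jointly from the forensic features in the reference image of the target person and the activity trajectory of each pedestrian in the preset pedestrian activity trajectory database. Based on each pedestrian's activity trajectory, the pedestrians in the video data whose forensic features are similar to those of the target person are determined to be suspicious persons. Forensic features include, but are not limited to, gait features, gender, age, height, body shape, skin color, hairstyle, hair color, face shape, and accessories. For example, the body shape of the target person can be determined from the reference image, the body shape of each pedestrian can then be determined from the video data according to that pedestrian's activity trajectory, and the suspicious persons in the video data can be screened out initially by comparing body shapes.
In this embodiment, screening the suspicious persons in the video data according to the target person's forensic features and the preset pedestrian activity trajectory database includes: acquiring the activity trajectory of each pedestrian from the preset pedestrian activity trajectory database; acquiring from the video data, according to each pedestrian's activity trajectory, the consecutive image frames corresponding to the trajectory; computing the gait features of each pedestrian from the consecutive image frames; and determining a pedestrian to be a suspicious person according to the similarity between the target person's forensic features and that pedestrian's forensic features.
In this embodiment, the consecutive image frames corresponding to each pedestrian's activity trajectory are acquired via the preset pedestrian activity trajectory database, and each pedestrian's forensic features (for example gait) are determined from those frames; the target person's forensic features (for example gait) are determined from the reference image of the target person. A first preset threshold may be set. When the similarity between the target person's forensic features and a pedestrian's forensic features is greater than the first preset threshold, the pedestrian is initially determined to be a suspicious person similar to the target person; when the similarity is less than the first preset threshold, the pedestrian is determined not to be similar to the target person and need not be considered. Note that the first preset threshold must be chosen appropriately: too large a threshold may exclude the target person's own trajectories from the suspicious persons, while too small a threshold may screen out too many suspicious persons and increase the complexity of the subsequent steps. The first preset threshold in this embodiment may be 0.5 and, in practical applications, may be set according to actual needs.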
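The threshold-based initial screening can be sketched as follows. This is a minimal sketch under stated assumptions: the application does not name a similarity measure, so cosine similarity between feature vectors is assumed here, and `screen_suspects` and its parameters are hypothetical names.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length feature vectors (assumed measure)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def screen_suspects(target_feature, pedestrian_features, threshold=0.5):
    """Return {pedestrian_id: similarity} for pedestrians above the first preset threshold.

    pedestrian_features maps a pedestrian id to its forensic (e.g. gait) feature vector.
    """
    suspects = {}
    for pedestrian_id, feature in pedestrian_features.items():
        similarity = cosine_similarity(target_feature, feature)
        if similarity > threshold:
            suspects[pedestrian_id] = similarity
    return suspects
```

Raising `threshold` shrinks the suspect set at the risk of dropping the true target; lowering it keeps recall high but passes more candidates to the later, more expensive steps.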
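Optionally, gait is one of the most commonly used forensic features. Compared with other biometric information such as faces, gait has the advantages of being contactless, detectable at a distance, and hard to disguise. However, gait recognition still suffers from low accuracy, and it is difficult to verify a pedestrian's identity directly with it, so gait recognition can be used as a way of initially screening the suspicious persons in the video data. This embodiment may use the trained gait recognition algorithm GaitSet to recognize gait features, and determines whether a pedestrian is a suspicious person similar to the target person by evaluating the similarity between the gait features of each pedestrian in the video data and the gait features of the target person.
The preset pedestrian activity trajectory database in this embodiment may be obtained as follows: determine the activity trajectory of each pedestrian in the video data, where an activity trajectory represents the continuous positions of the pedestrian in the video over a period of time; cluster the activity trajectories according to forensic features, where all activity trajectories in one class belong to the same pedestrian; and build the pedestrian activity trajectory database from the clustered activity trajectories. Specifically, the pedestrian activity trajectory database contains a large number of activity trajectories; the detected trajectories may be trajectories of the same pedestrian in different cameras, or trajectories of different pedestrians. To improve retrieval efficiency, the activity trajectories can first be clustered based on forensic features (such as gait features): trajectories with similar forensic features are grouped into one class, and the pedestrians corresponding to the trajectories in one class are treated as the same pedestrian.
After the activity trajectories in the preset pedestrian activity trajectory database have been clustered, the gait features of the target person can be compared with the gait features of the pedestrian corresponding to each class to determine whether that pedestrian is a suspicious person. This avoids comparing the target person's gait features one by one with the gait features of every pedestrian in every piece of video data, and can greatly improve retrieval efficiency.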
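The cluster-then-compare idea can be illustrated with the sketch below. The application does not name a clustering algorithm, so a simple greedy centroid clustering is assumed here purely for illustration; `merge_threshold`, `cluster_trajectories`, and `suspicious_clusters` are hypothetical names, and the gait vectors would in practice come from a gait model such as GaitSet.

```python
import math


def _cos(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


def _mean(vectors):
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def cluster_trajectories(gait_features, merge_threshold=0.8):
    """Greedy clustering (an assumed stand-in): each trajectory joins the most
    similar existing cluster if the similarity exceeds merge_threshold,
    otherwise it starts a new cluster.

    gait_features maps trajectory id -> gait feature vector.
    """
    clusters = []
    for trajectory_id, feature in gait_features.items():
        best, best_similarity = None, merge_threshold
        for cluster in clusters:
            similarity = _cos(feature, cluster["centroid"])
            if similarity > best_similarity:
                best, best_similarity = cluster, similarity
        if best is None:
            clusters.append({"members": [trajectory_id],
                             "features": [feature],
                             "centroid": list(feature)})
        else:
            best["members"].append(trajectory_id)
            best["features"].append(feature)
            best["centroid"] = _mean(best["features"])
    return clusters


def suspicious_clusters(target_gait, clusters, threshold=0.5):
    """Compare the target once per cluster instead of once per trajectory."""
    return [c["members"] for c in clusters
            if _cos(target_gait, c["centroid"]) > threshold]
```

With N trajectories grouped into K classes, the query compares the target against K centroids rather than N trajectories, which is where the efficiency gain described above comes from.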
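Step S23: Determine the candidate images of the suspicious persons from a preset pedestrian image query library according to the suspicious persons.
The preset pedestrian image query library in this embodiment is obtained as follows: acquire, according to the activity trajectory of each pedestrian in the preset pedestrian activity trajectory database, the image frames corresponding to the trajectory; extract representative images from the image frames of each pedestrian's trajectory; and unify the appearance features of each pedestrian in all representative images to obtain the pedestrian image query library.
Specifically, the forensic features of the suspicious persons are similar to those of the target person, but the images of these suspicious persons may show the same individual with different appearance features, or different individuals with different appearance features. Clothing attributes are a typical appearance feature, and a change of clothing attributes leads to a sharp drop in the recognition rate of existing pedestrian re-identification systems. To eliminate the influence of clothing attributes on pedestrian re-identification, this embodiment uses the style transfer network DG-Net++ to unify the clothing attributes of the images of all suspicious persons. After conversion, the image frames of all suspicious persons have the same clothing attributes, and the pedestrian image query library is obtained from these unified images. In the pedestrian image query library, each suspicious person is associated with the image frames in which that person appears, and the image frames of all suspicious persons have the same clothing attributes.
Optionally, because the image frames corresponding to each pedestrian's activity trajectory comprise many frames, comparing every frame with the image of the target person one by one would be cumbersome and inefficient. In this embodiment, after the suspicious persons are determined, the consecutive image frames corresponding to each suspicious person's activity trajectory are acquired. To improve retrieval efficiency, representative image frames are extracted from the consecutive frames as the candidate images of the suspicious person; a representative image frame is the frame in which the most body parts are exposed. The appearance features of each pedestrian in all representative images are unified to obtain the pedestrian image query library, in which each suspicious person is associated with the representative images in which that person appears, and the representative images of all suspicious persons have the same clothing attributes. The number of candidate images extracted may be one or two frames.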
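One way to pick the "frame in which the most body parts are exposed" is sketched below. It assumes, as an illustration only, that each frame comes with per-keypoint confidence scores from some pose estimator; the application does not say how body-part visibility is measured, and `select_representative_frames` and its parameters are hypothetical.

```python
def select_representative_frames(frames, max_frames=2, min_confidence=0.5):
    """Return the indices of the frames with the most visible body keypoints.

    frames: list of (frame_index, keypoint_confidences) pairs, where
    keypoint_confidences is a list of scores in [0, 1], one per body keypoint
    (an assumed input format).
    """
    def visible_count(item):
        _, confidences = item
        return sum(1 for c in confidences if c >= min_confidence)

    # sorted() is stable, so frames with equal counts keep their time order.
    ranked = sorted(frames, key=visible_count, reverse=True)
    return [frame_index for frame_index, _ in ranked[:max_frames]]
```

The default `max_frames=2` mirrors the statement that one or two candidate frames may be extracted per trajectory.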
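Step S24: Unify the appearance features of the suspicious person in the candidate image with the appearance features of the target person in the reference image of the target person.
Specifically, unifying the appearance features of the suspicious person in the candidate image with the appearance features of the target person in the reference image includes: using a style transfer network to make the appearance features of the suspicious person in the candidate image the same as the appearance features of the target person in the reference image.
In this embodiment, before determining whether the suspicious person in a candidate image is the target person, the appearance features of the target person's reference image and of the suspicious person's candidate image must be unified, which removes the loss of re-identification accuracy caused by differing appearance features. Appearance features here mainly comprise clothing attributes, which include clothing style, clothing color, and so on. Appearance features such as clothing style and clothing color are unified between the candidate images of the suspicious persons and the reference image of the target person, so that all candidate images and the reference image have the same appearance features. For example, if the appearance features of the suspicious person in the candidate image are a white short-sleeved top and black trousers, and the appearance features of the target person in the reference image are a red short-sleeved top and gray shorts, the appearance features in both images can be unified to a red short-sleeved top and gray shorts, or both can be unified to a white short-sleeved top and black trousers.
As shown in FIG. 3, a style transfer network can be used in this embodiment to unify the appearance features, so that the appearance features in all candidate images are the same as those in the reference image of the target person. This avoids the difficulty of later deciding whether a suspicious person is the target person when their appearance features differ. After the clothing attributes are unified, the suspicious persons in the different candidate images have clothing similar to that of the target person. For example, let the target person be A with clothing attribute 1, written A1; let the suspicious persons in the candidate images be B and C with clothing attributes 2 and 3, written B2 and C3. After DG-Net++ converts the clothing attributes of B and C to the style of A, B1 and C1 are obtained. A1, B1, and C1 have similar clothing attributes.
The style transfer network is a generative adversarial network consisting of a generator and a discriminator. The pedestrian image to be converted (the candidate image) and the target style image (the reference image of the target person) are input to the generator, and the generator outputs the pedestrian with a style similar to that of the target style image.
Of course, in practical applications, the appearance features in all candidate images can be converted to match those in the reference image of the target person, or the appearance features in the reference image can be converted to match those in the candidate images, which can reduce the workload. Whichever approach is adopted, it suffices to unify the appearance features of the suspicious persons' candidate images and the target person's reference image.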
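Step S25: Determine, according to the reference image of the target person, whether the suspicious person in the candidate image is the target person.
In this embodiment, before determining whether a suspicious person is the target person, the appearance features of the suspicious person in the candidate image are first unified with those of the target person in the reference image, and only then is it determined from the reference image whether the suspicious person in the candidate image is the target person. This takes full account of the cases in which the same pedestrian wears different clothing or in which differing illumination between cameras changes the appearance features of the same pedestrian, avoids failures to identify the target person caused by changed appearance features, and improves the overall recall of the pedestrian re-identification method.
In one example, the reference image of the target person is input into a pre-trained pedestrian re-identification network to obtain the target person's forensic features, which are features unrelated to appearance features; the candidate image is input into the pre-trained pedestrian re-identification network to obtain the suspicious person's forensic features; the similarity between the suspicious person's forensic features and the target person's forensic features is obtained; and the target person is determined from among the suspicious persons according to the similarity.
Existing pedestrian re-identification networks mainly attend to pedestrians' appearance features; two pedestrians with similar clothing may have very similar features, so the pedestrian's identity cannot be recognized correctly. The present invention uses a retrained pedestrian re-identification network. When retraining the network, a style transfer network is first used to unify the appearance features of the pedestrians in the sample images so that the pedestrians in all sample images have the same appearance features, while the label of each pedestrian is kept unchanged after the appearance features are changed. In this way the network can learn forensic features unrelated to pedestrians' appearance features; these forensic features include, but are not limited to, gait features, gender, age, height, body shape, skin color, hairstyle, hair color, face shape, and accessories. In this embodiment, the new pre-trained pedestrian re-identification network is used to obtain the forensic features of the suspicious person in each candidate image and the forensic features of the target person in the reference image. Because the forensic features learned by the new network are unrelated to appearance features, whether a suspicious person is the target person can be determined by evaluating the similarity between the suspicious person's forensic features and the target person's forensic features.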
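The label-preserving preparation of the retraining set can be sketched as follows. The `transfer` callable is a placeholder for the style transfer network (the application uses DG-Net++, whose actual interface is not shown here), and `unify_training_set` is a hypothetical helper name.

```python
def unify_training_set(samples, standard_style, transfer):
    """Re-dress every training image in one standard style, keeping identity labels.

    samples:        list of (image, identity_label) pairs.
    standard_style: the style image (or style code) every sample is converted to.
    transfer:       callable (image, style) -> image; a stand-in for the
                    style transfer network, supplied by the caller.
    """
    unified = []
    for image, identity_label in samples:
        converted = transfer(image, standard_style)
        # The label is deliberately untouched: same person, new clothing.
        unified.append((converted, identity_label))
    return unified
```

Because every converted sample shows the same clothing, the only signal left to separate identities during retraining is whatever does not depend on clothing, which is the stated goal of learning forensic features.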
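In another example, determining according to the reference image of the target person whether the suspicious person in the candidate image is the target person includes: inputting the reference image of the target person into a pre-trained pedestrian re-identification network to obtain the target person's forensic features, which are features unrelated to appearance features; determining the suspicious person's forensic features from a preset pedestrian feature library according to the candidate image, where the pedestrian feature library includes the forensic features of all pedestrians in the video data; and determining whether the suspicious person is the target person according to the similarity between the suspicious person's forensic features and the target person's forensic features.
In this embodiment, the pedestrian feature library is built in advance and the forensic features of the suspicious persons in the existing video data are stored in it. When the target person needs to be searched for, it is not necessary to convert the appearance features of all the video data, which can greatly improve efficiency.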
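A pre-built feature library with a ranked lookup might look like the sketch below. The class name `PedestrianFeatureLibrary`, the use of normalized vectors with a dot-product score, and the `top_k` parameter are all assumptions for illustration; the application only states that features are stored in advance and searched later.

```python
import math


class PedestrianFeatureLibrary:
    """In-memory stand-in for the pedestrian feature library."""

    def __init__(self):
        self._features = {}  # person id -> unit-length feature vector

    def add(self, person_id, feature):
        """Store a feature once, at build time, already normalized."""
        norm = math.sqrt(sum(x * x for x in feature))
        if norm == 0.0:
            raise ValueError("feature vector must be non-zero")
        self._features[person_id] = [x / norm for x in feature]

    def search(self, query_feature, top_k=5):
        """Return up to top_k (person_id, similarity) pairs, best first."""
        norm = math.sqrt(sum(x * x for x in query_feature))
        if norm == 0.0:
            return []
        query = [x / norm for x in query_feature]
        scored = [(person_id, sum(a * b for a, b in zip(query, feature)))
                  for person_id, feature in self._features.items()]
        scored.sort(key=lambda item: item[1], reverse=True)
        return scored[:top_k]
```

Feature extraction and style conversion happen once when the library is built; a query then costs only one feature extraction for the reference image plus the comparisons.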
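In some examples, to avoid false positive samples produced by the pedestrian re-identification network when determining the target person, this embodiment first screens out all suspicious persons whose similarity is greater than a second preset threshold, then filters these suspicious persons using spatiotemporal information, and determines the suspicious persons remaining after filtering to be the target person. This makes the pedestrian re-identification method more accurate. For example, within a short period of time, the target person cannot appear in two monitoring devices that are far apart. The spatiotemporal information referred to in this embodiment is time information and position information, that is, when and where the pedestrian appears; the time information and position information of every frame in the video data of the monitoring system are known.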
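The spatiotemporal filter can be sketched as a reachability check. This is one possible reading, not the application's stated rule: the highest-scoring sighting is assumed to be trustworthy, and any other sighting that would require moving faster than an assumed maximum speed is discarded. The field names and `max_speed_mps` are hypothetical.

```python
import math


def filter_by_spacetime(sightings, max_speed_mps=10.0):
    """Drop sightings that are physically unreachable from the best-scoring one.

    sightings: list of dicts with keys
        "id"    - sighting identifier
        "t"     - timestamp in seconds
        "pos"   - (x, y) camera position in metres
        "score" - re-identification similarity
    """
    if not sightings:
        return []
    anchor = max(sightings, key=lambda s: s["score"])  # assumed reliable
    kept = []
    for sighting in sightings:
        elapsed = abs(sighting["t"] - anchor["t"])
        distance = math.dist(sighting["pos"], anchor["pos"])
        # Reachable only if the distance can be covered in the elapsed time.
        if distance <= max_speed_mps * elapsed:
            kept.append(sighting)
    return kept
```

Two sightings at the same instant in different places give an elapsed time of zero and a positive distance, so the lower-scoring one is removed, matching the example of a person who cannot be in two distant cameras within a short period.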
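In summary, the present invention proposes a high-recall pedestrian re-identification system and method. A video surveillance system contains a large number of pedestrians, and the pedestrians appearing in two video segments may be the same individual with differing appearance features, or two different individuals. To re-identify pedestrians accurately from massive pedestrian data, candidate targets are first screened using forensic features; style transfer is then used to transfer the appearance representation between candidate targets, which unifies the pedestrians' appearance representation, eliminates the influence of appearance changes on the pedestrian re-identification system, and improves its recall. Style transfer addresses not only the influence of changes in pedestrians' clothing but also the influence of illumination changes on the appearance representation; the present invention improves the generalization and robustness of the pedestrian re-identification network.
The division of the above methods into steps is only for clarity of description. In implementation, steps may be merged into one step, or a step may be split into multiple steps; as long as the same logical relationship is included, such variants fall within the protection scope of this patent. Adding insignificant modifications to, or introducing insignificant designs into, the algorithm or the process without changing the core design of the algorithm and the process also falls within the protection scope of this patent.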
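An example is given below in connection with the above embodiments. Pedestrian re-identification is one of the most commonly used technologies in criminal investigation. With it, investigators can reconstruct the activity trajectory of a criminal suspect, which provides strong support for analyzing and solving criminal cases. In recent years, suspects have become increasingly aware of counter-surveillance: while committing a crime they usually cover their faces with masks, hats, and similar items, and may even keep changing their clothing between cameras. As a result, when an existing pedestrian re-identification system is used to track a suspect, the target is easily lost and recognition accuracy is low. The high-recall pedestrian re-identification method proposed in this embodiment can therefore greatly improve the recall of the pedestrian re-identification system; combined with spatiotemporal information (time information and position information) to filter out falsely detected targets, it can identify the target accurately. The flow is shown in FIG. 4 and comprises the following steps.
Step S31: Use pedestrian detection and pedestrian tracking to obtain pedestrian activity trajectories, and build a pedestrian activity trajectory database.
To quickly retrieve candidate objects from massive data, this embodiment first uses the general-purpose object detection framework YOLOv4 (You Only Look Once, version 4) to detect the pedestrians in the video; YOLOv4 has the advantages of high detection accuracy and fast inference speed. Once the pedestrians to be tracked have been obtained, a multi-object tracking algorithm (Fair multi-object tracking, FairMOT for short) is used to track each pedestrian, obtain that pedestrian's activity trajectory, and build the pedestrian activity trajectory database; each activity trajectory represents the continuous positions of a pedestrian in the video over a period of time. The FairMOT algorithm achieves the highest accuracy on several multi-object tracking datasets. In this way, unstructured video data is converted into structured video data that is easy to analyze and manage. Since the research focus of this embodiment is not pedestrian detection or pedestrian tracking, open-source object detection and pedestrian tracking algorithms can be used directly.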
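To show how per-frame detections become trajectories, the sketch below links bounding boxes across frames by greedy intersection-over-union (IoU) matching. This is a deliberately simple stand-in, not YOLOv4 or FairMOT: the detector output is assumed to be given as lists of boxes, and `link_detections` and `iou_threshold` are hypothetical names.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ax1, ay1, ax2, ay2 = a
    bx1, by1, bx2, by2 = b
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    return inter / union if union > 0 else 0.0


def link_detections(frames, iou_threshold=0.3):
    """Greedy IoU association (a simplified stand-in for a real tracker).

    frames: list of lists of boxes, one list per frame, in time order.
    Returns {track_id: [(frame_index, box), ...]}.
    """
    tracks = {}
    active = {}  # track id -> box from the previous frame
    next_id = 0
    for frame_index, boxes in enumerate(frames):
        unmatched = dict(active)
        still_active = {}
        for box in boxes:
            best_id, best_iou = None, iou_threshold
            for track_id, last_box in unmatched.items():
                overlap = iou(box, last_box)
                if overlap > best_iou:
                    best_id, best_iou = track_id, overlap
            if best_id is None:
                best_id = next_id  # no overlap: start a new trajectory
                next_id += 1
                tracks[best_id] = []
            else:
                del unmatched[best_id]  # each track takes one box per frame
            tracks[best_id].append((frame_index, box))
            still_active[best_id] = box
        active = still_active  # tracks missing in this frame are closed
    return tracks
```

A production tracker also uses appearance embeddings and motion models to survive occlusion; this sketch closes a track as soon as it misses a frame, which is one reason the later clustering step is needed to rejoin fragments of the same person.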
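Step S32: Cluster all trajectories of suspected objects based on forensic features to obtain the appearance feature library of candidate objects.
The pedestrian activity trajectory database contains a large number of activity trajectories; the detected trajectories may be trajectories of the same pedestrian in different cameras, or trajectories of different pedestrians. To improve retrieval efficiency, the activity trajectories can first be clustered based on forensic features (such as gait features): trajectories with similar forensic features are grouped into one class, and the pedestrians corresponding to the trajectories in one class are treated as the same person. This embodiment uses the trained gait recognition algorithm GaitSet for gait recognition. However, gait recognition still suffers from low accuracy, and it is difficult to verify a pedestrian's identity directly with it, so this example uses gait features only as the basis for initial screening. When a pedestrian's identity needs to be verified, the gait features of each person after clustering are first extracted, and then the activity trajectories whose gait similarity to the target person exceeds a certain threshold are matched from the activity trajectory database to screen out suspicious persons. The gait similarity threshold in this embodiment may be 0.5 and, in practical applications, may be set according to actual needs.
After the suspicious persons are determined, key representative images can be selected from the video data for each suspicious person's activity trajectory to form the appearance feature library.
Step S33: Use the style transfer network to normalize the clothing attributes of the images in the appearance feature library to a standard style to obtain candidate images, and form the pedestrian image query library.
The forensic features of the suspicious persons in the images in the appearance feature library are similar to those of the target person, but these suspicious persons may be the same individual with different clothing attributes, or different individuals with different clothing attributes. Clothing style is a typical clothing attribute, and a change of clothing attributes leads to a sharp drop in the recognition rate of existing pedestrian re-identification systems. To eliminate the influence of clothing attributes on pedestrian re-identification, this embodiment uses the style transfer network DG-Net++ to unify the clothing attributes of the different images in the appearance feature library. After conversion, the candidate images of the suspicious persons are obtained, and the candidate images of all suspicious persons form the pedestrian image query library.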
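Step S34: Retrain the pedestrian re-identification network, use the retrained network to extract features from the images in the pedestrian image query library, and add the features to the pedestrian feature library.
Step S35: Search the pedestrian feature library according to the features of the target person.
Existing pedestrian re-identification networks mainly attend to pedestrians' clothing attributes; two pedestrians with similar clothing attributes may have very similar features, so the pedestrian's identity cannot be recognized correctly. The present invention therefore retrains the pedestrian re-identification network. The pedestrian re-identification network structure used in the present invention is UPTP-ReID, which has achieved high accuracy on multiple test benchmarks. Before training, the style transfer network is used to convert the clothing attributes of the pedestrians in the collected sample images while keeping each pedestrian's label unchanged after the change of clothing, so that the pedestrian re-identification network can learn forensic features unrelated to the pedestrians' clothing attributes.
In this embodiment, the pedestrian feature library is built in advance and the forensic features of the suspicious persons in the existing video data are stored in it. When the target person needs to be searched for, it is not necessary to convert the appearance features of all the video data, which can greatly improve efficiency.
Step S36: Combine spatiotemporal information to filter out misidentified false positive samples, and output the final retrieval result.
Because the monitoring system contains massive video data and a very large number of pedestrians, false positive samples are easily produced even with the style transfer network and the retrained pedestrian re-identification network. The position information and time information of the monitoring devices in the monitoring system are known, so the position information and time information of every frame in the video data are also known, and false positive samples can therefore be filtered by position and time. For example, within a short period of time, a criminal suspect cannot appear in two monitoring devices that are far apart.
The high-recall pedestrian re-identification method proposed in this embodiment first clusters all activity trajectories of pedestrians based on forensic features to obtain the appearance feature library of suspicious persons; then uses the style transfer network to unify the clothing attributes of all suspicious persons to form a complete pedestrian image query library; derives the pedestrian feature library from the pedestrian image query library and searches it; and finally combines position information and time information to filter out misidentified false positive samples and outputs the final retrieval result.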
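An embodiment of the present invention also relates to a pedestrian re-identification apparatus which, as shown in FIG. 5, includes at least one processor 401 and a memory 402 communicatively connected to the at least one processor 401, wherein the memory 402 stores instructions executable by the at least one processor 401, and the instructions are executed by the at least one processor 401 so that the at least one processor 401 can perform the above pedestrian re-identification method.
The memory 402 and the processor 401 are connected by a bus. The bus may include any number of interconnected buses and bridges, and connects the various circuits of the one or more processors 401 and the memory 402 together. The bus may also connect various other circuits such as peripherals, voltage regulators, and power management circuits, all of which are well known in the art and are therefore not described further here. A bus interface provides an interface between the bus and a transceiver. The transceiver may be a single element or multiple elements, such as multiple receivers and transmitters, providing a unit for communicating with various other apparatuses over a transmission medium. Data processed by the processor 401 is transmitted over a wireless medium through an antenna; the antenna also receives data and passes it to the processor 401.
The processor 401 is responsible for managing the bus and general processing, and may also provide various functions including timing, peripheral interfacing, voltage regulation, power management, and other control functions. The memory 402 may be used to store data used by the processor 401 when performing operations.
An embodiment of the present invention also provides a computer-readable storage medium storing a computer program, the computer program implementing the above pedestrian re-identification method when executed by a processor.
That is, those skilled in the art will understand that all or some of the steps of the methods in the above embodiments can be completed by a program instructing the relevant hardware. The program is stored in a storage medium and includes several instructions that cause a device (which may be a single-chip microcomputer, a chip, etc.) or a processor to execute all or some of the steps of the methods described in the embodiments of the present application. The aforementioned storage media include various media that can store program code, such as a USB flash drive, a removable hard disk, read-only memory (ROM), random access memory (RAM), a magnetic disk, or an optical disc.
In addition, an embodiment of the present invention provides a computer program product comprising a computer program stored on a non-transitory computer-readable storage medium, the computer program comprising program instructions which, when executed by a computer, cause the computer to execute the method in any of the above method embodiments.
Those of ordinary skill in the art will understand that the above embodiments are specific embodiments for implementing the present invention, and that in practical applications various changes in form and detail may be made to them without departing from the spirit and scope of the present invention.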
Claims (10)
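- A pedestrian re-identification method, comprising: acquiring video data and a reference image of a target person; determining suspicious persons in the video data according to the reference image of the target person, and acquiring a candidate image of each suspicious person in the video data, wherein the appearance features of the suspicious person in the candidate image are the same as the appearance features of the target person in the reference image; and determining, according to the reference image of the target person, whether the suspicious person in the candidate image is the target person.
- The pedestrian re-identification method according to claim 1, wherein determining suspicious persons in the video data according to the reference image of the target person and acquiring a candidate image of each suspicious person in the video data comprises: determining forensic features of the target person according to the reference image of the target person; screening the suspicious persons in the video data according to the forensic features of the target person and a preset pedestrian activity trajectory database, wherein the preset pedestrian activity trajectory database stores the activity trajectories of all pedestrians in the video data, and an activity trajectory represents the continuous positions of a pedestrian in the video over a period of time; determining the candidate image of the suspicious person from a preset pedestrian image query library according to the suspicious person, wherein the pedestrian image query library stores the image frames in which each pedestrian appears in the video data, and the appearance features of the pedestrians in all image frames of the pedestrian image query library are the same; and unifying the appearance features of the suspicious person in the candidate image with the appearance features of the target person in the reference image of the target person.
- The pedestrian re-identification method according to claim 2, wherein screening the suspicious persons in the video data according to the forensic features of the target person and the preset pedestrian activity trajectory database comprises: acquiring the activity trajectory of each pedestrian from the preset pedestrian activity trajectory database; acquiring from the video data, according to the activity trajectory of each pedestrian, the consecutive image frames corresponding to the activity trajectory; computing the forensic features of each pedestrian from the consecutive image frames; and determining the pedestrian to be the suspicious person according to the similarity between the forensic features of the target person and the forensic features of each pedestrian.
- The pedestrian re-identification method according to claim 2, wherein the preset pedestrian activity trajectory database is obtained by: determining the activity trajectory of each pedestrian in the video data, the activity trajectory representing the continuous positions of the pedestrian in the video over a period of time; clustering the activity trajectories according to forensic features, wherein all activity trajectories in one class belong to the same pedestrian; and building the pedestrian activity trajectory database from the clustered activity trajectories.
- The pedestrian re-identification method according to claim 2, wherein the preset pedestrian image query library is obtained by: acquiring, according to the activity trajectory of each pedestrian in the preset pedestrian activity trajectory database, the image frames corresponding to the activity trajectory; extracting representative images from the image frames of the activity trajectory corresponding to each pedestrian; and unifying the appearance features of each pedestrian in all the representative images to obtain the pedestrian image query library.
- The pedestrian re-identification method according to claim 5, wherein unifying the appearance features of the suspicious person in the candidate image with the appearance features of the target person in the reference image of the target person comprises: using a style transfer network to make the appearance features of the suspicious person in the candidate image the same as the appearance features of the target person in the reference image of the target person.
- The pedestrian re-identification method according to claim 1, wherein determining, according to the reference image of the target person, whether the suspicious person in the candidate image is the target person comprises: inputting the reference image of the target person into a pre-trained pedestrian re-identification network to obtain the forensic features of the target person, the forensic features being features unrelated to the appearance features; inputting the candidate image into the pre-trained pedestrian re-identification network to obtain the forensic features of the suspicious person; obtaining the similarity between the forensic features of the suspicious person and the forensic features of the target person; and determining the target person from among the suspicious persons according to the similarity.
- The pedestrian re-identification method according to claim 1, wherein determining, according to the reference image of the target person, whether the suspicious person in the candidate image is the target person comprises: inputting the reference image of the target person into a pre-trained pedestrian re-identification network to obtain the forensic features of the target person, the forensic features being features unrelated to the appearance features; determining the forensic features of the suspicious person from a preset pedestrian feature library according to the candidate image, the pedestrian feature library including the forensic features of all pedestrians in the video data; and determining whether the suspicious person is the target person according to the similarity between the forensic features of the suspicious person and the forensic features of the target person.
- A pedestrian re-identification apparatus, comprising at least one processor and a memory communicatively connected to the at least one processor, wherein the memory stores instructions executable by the at least one processor, and the instructions are executed by the at least one processor so that the at least one processor can perform the pedestrian re-identification method according to any one of claims 1 to 8.
- A computer-readable storage medium storing a computer program, wherein the computer program, when executed by a processor, implements the pedestrian re-identification method according to any one of claims 1 to 8.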
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202111435697.1 | 2021-11-29 | ||
CN202111435697.1A CN116189026A (zh) | 2021-11-29 | 2021-11-29 | 行人重识别方法及装置、存储介质 |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2023093241A1 true WO2023093241A1 (zh) | 2023-06-01 |
Family
ID=86438882
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/CN2022/120013 WO2023093241A1 (zh) | 2021-11-29 | 2022-09-20 | 行人重识别方法及装置、存储介质 |
Country Status (2)
Country | Link |
---|---|
CN (1) | CN116189026A (zh) |
WO (1) | WO2023093241A1 (zh) |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN117710903A (zh) * | 2024-02-05 | 2024-03-15 | 南京信息工程大学 | 一种基于ReID和Yolov5双模型下的可视化特定行人追踪方法及系统 |
Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN102819578A (zh) * | 2012-07-24 | 2012-12-12 | 武汉大千信息技术有限公司 | 视频侦查嫌疑目标分析系统及方法 |
CN109359552A (zh) * | 2018-09-21 | 2019-02-19 | 中山大学 | 一种高效的跨摄像头行人双向跟踪方法 |
CN113449550A (zh) * | 2020-03-25 | 2021-09-28 | 华为技术有限公司 | 人体重识别数据处理的方法、人体重识别的方法和装置 |
-
2021
- 2021-11-29 CN CN202111435697.1A patent/CN116189026A/zh active Pending
-
2022
- 2022-09-20 WO PCT/CN2022/120013 patent/WO2023093241A1/zh unknown
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN117710903A (zh) * | 2024-02-05 | 2024-03-15 | 南京信息工程大学 | 一种基于ReID和Yolov5双模型下的可视化特定行人追踪方法及系统 |
CN117710903B (zh) * | 2024-02-05 | 2024-05-03 | 南京信息工程大学 | 一种基于ReID和Yolov5双模型下的可视化特定行人追踪方法及系统 |
Also Published As
Publication number | Publication date |
---|---|
CN116189026A (zh) | 2023-05-30 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
121 | Ep: the epo has been informed by wipo that ep was designated in this application |
Ref document number: 22897327 Country of ref document: EP Kind code of ref document: A1 |
|
NENP | Non-entry into the national phase |
Ref country code: DE |