WO2023207197A1 - Target re-identification method, apparatus, device and computer-readable storage medium - Google Patents

Target re-identification method, apparatus, device and computer-readable storage medium

Info

Publication number
WO2023207197A1
WO2023207197A1 · PCT/CN2022/143487 · CN2022143487W
Authority
WO
WIPO (PCT)
Prior art keywords: target, image, frame, identified, appearance
Application number
PCT/CN2022/143487
Other languages
English (en)
French (fr)
Inventor
何烨林
魏新明
肖嵘
Original Assignee
深圳云天励飞技术股份有限公司
Application filed by 深圳云天励飞技术股份有限公司 filed Critical 深圳云天励飞技术股份有限公司
Publication of WO2023207197A1

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 - Pattern recognition
    • G06F18/20 - Analysing
    • G06F18/22 - Matching criteria, e.g. proximity measures
    • G06F18/25 - Fusion techniques
    • G06F18/253 - Fusion techniques of extracted features

Definitions

  • The present application belongs to the field of image recognition technology, and in particular relates to a target re-identification method, apparatus, device and computer-readable storage medium.
  • Target re-identification technology is an important technical means in fields such as intelligent security, missing target search and case investigation. For example, in the field of missing target search, it is used to find a missing target based on images of the target.
  • In the related art, target re-identification is mainly based on deep learning technology.
  • First, the YOLO detection algorithm is used to detect each image in the multi-frame images included in multiple video files, and the detected targets are cropped and saved to form a historical image library; then a feature extraction network extracts the feature vector of each image in the cropped historical image library; finally, the image in the historical image library whose feature vector is closest to that of the target image to be identified is found by comparison.
  • However, the recognition accuracy of the target image to be identified in the related art is low.
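For illustration only, the related-art pipeline described above (detect, crop, featurize, nearest-neighbor search) can be sketched as follows; `detect_and_crop` and `extract_features` are hypothetical stubs standing in for the YOLO detector and the deep feature extraction network:

```python
from math import dist

def detect_and_crop(frame):
    """Return cropped target regions from one video frame (stubbed;
    a real system would run a YOLO-style detector here)."""
    return [frame]  # assume the whole frame is one crop

def extract_features(crop):
    """Return a fixed-length feature vector for a crop (stubbed;
    a real system would run a deep feature extraction network)."""
    return [float(sum(crop)) % 7.0, float(len(crop))]

def build_history_library(frames):
    """Detect, crop and featurize every frame into a historical image library."""
    library = []
    for frame in frames:
        for crop in detect_and_crop(frame):
            library.append(extract_features(crop))
    return library

def nearest_match(query_vec, library):
    """Find the library entry whose feature vector is closest to the query."""
    return min(library, key=lambda v: dist(v, query_vec))
```

As the application notes, matching purely on such feature vectors is sensitive to clothing color, which is the weakness the claimed method addresses.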
  • This application provides a target re-identification method, device, equipment and computer-readable storage medium, which can avoid inaccurate re-identification of the target image to be identified due to the different colors of clothing between the target image to be identified and the target images in the historical image library.
  • this application provides a target re-identification method, including:
  • acquiring multiple frames of images, each of which includes the target to be identified;
  • inputting a single frame of the multi-frame images into an appearance feature extraction network to obtain appearance features of the target to be identified, where the appearance features of the target to be identified are independent of the color of the single-frame image, and the appearance feature extraction network is used to eliminate the interference of image color on the target's appearance; and
  • determining, based on the appearance features of the target to be identified, an image matching the target to be identified from a historical image library.
  • This application inputs a single frame of the multi-frame images into the appearance feature extraction network to obtain appearance features of the target to be identified that are independent of the color of the single-frame image, and then determines, based on those appearance features, an image matching the target to be identified from the historical image library. This method avoids the problem of low re-identification accuracy caused by differences in clothing color between the target image to be identified and the target images in the historical image library, and thus improves the accuracy of re-identification.
  • the present application provides a target re-identification device, which is used to perform the method in the above-mentioned first aspect or any possible implementation of the first aspect.
  • the device includes:
  • the first acquisition module is used to acquire multiple frames of images, each of which includes the target to be identified;
  • the second acquisition module is used to input a single frame of the multi-frame images into the appearance feature extraction network to obtain the appearance features of the target to be identified, where the appearance features of the target to be identified are independent of the color of the single-frame image, and the appearance feature extraction network is used to eliminate the interference of image color on the target's appearance;
  • a recognition module configured to determine an image matching the target to be recognized from a historical image library based on the appearance characteristics of the target to be recognized.
  • this application provides a target re-identification device, which includes a memory and a processor.
  • the memory is used to store instructions; the processor executes the instructions stored in the memory, so that the device performs the target re-identification method in the first aspect or any possible implementation of the first aspect.
  • In a fourth aspect, a computer-readable storage medium is provided. Instructions are stored in the computer-readable storage medium; when the instructions are run on a computer, they cause the computer to execute the target re-identification method of the first aspect or any possible implementation of the first aspect.
  • In a fifth aspect, a computer program product containing instructions is provided that, when run on a device, causes the device to execute the target re-identification method of the first aspect or any possible implementation of the first aspect.
  • Figure 1 is a schematic flowchart of a target re-identification method provided by an embodiment of the present application
  • Figure 2 is a schematic flowchart of a process of generating an appearance feature extraction network provided by an embodiment of the present application
  • FIG. 3 is a schematic flowchart of a target re-identification method provided by an embodiment of the present application.
  • Figure 4 is a schematic flowchart of a target re-identification method provided by an embodiment of the present application.
  • Figure 5a is a schematic flowchart of a target re-identification method provided by an embodiment of the present application.
  • Figure 5b is a schematic diagram of determining the first score provided by an embodiment of the present application.
  • Figure 5c is a schematic diagram of determining the second score provided by an embodiment of the present application.
  • Figure 6 is a schematic flowchart of a target re-identification method provided by an embodiment of the present application.
  • Figure 7 is a schematic structural diagram of a target re-identification device provided by an embodiment of the present application.
  • Figure 8 is a schematic structural diagram of a target re-identification device provided by an embodiment of the present application.
  • the term "if" may be interpreted as "when", "once", "in response to determining" or "in response to detecting", depending on the context. Similarly, the phrase "if determined" or "if [the described condition or event] is detected" may be interpreted, depending on the context, to mean "once determined", "in response to a determination", "once [the described condition or event] is detected" or "in response to detection of [the described condition or event]".
  • This application provides a target re-identification method, device, equipment and computer-readable storage medium.
  • the method can be implemented through a target re-identification apparatus and can be used in scenarios such as case investigation, missing person search and intelligent security.
  • the target re-identification apparatus is communicatively connected with a target re-identification device.
  • the target re-identification device can communicate with the target re-identification apparatus through an application (APP), a web page, a public account or a mini program within an application, so that the device and the apparatus can exchange data.
  • Users can achieve target re-identification through a target re-identification device that communicates with the target re-identification apparatus.
  • the target re-identification device refers to the equipment used by the user when performing target re-identification.
  • the target re-identification device can be a device with display hardware and corresponding software support, such as smartphones, tablets, desktop computers, laptops, wearable devices, handheld devices, and vehicle-mounted devices.
  • the embodiments of this application do not place any restrictions on the specific type of target re-identification equipment.
  • the target re-identification method provided by the embodiment of the present application will be described in detail below in conjunction with the target re-identification device.
  • Figure 1 shows a schematic flowchart of a target re-identification method provided by an embodiment of the present application.
  • the target re-identification method provided by this application may include:
  • the target re-identification device may display a search page of the target re-identification apparatus.
  • the search page is used to display an entry for inputting multi-frame images and to display search results. This application does not limit the specific implementation of the search page.
  • the search page may include a control for inputting videos or images, and the control is used to trigger a search by inputting videos or images.
  • the user can perform target re-identification on the search page. The target re-identification device converts the user's operation into a search request and sends the search request to the target re-identification apparatus. The search request is used to indicate that the user wants to perform target re-identification.
  • This application does not limit the specific implementation method of the user's search request.
  • the multi-frame images are obtained by tracking the target to be identified in the given video data through the target tracking algorithm.
  • the target re-identification device may store the target tracking algorithm in the target re-identification device and/or the storage device.
  • the storage device can communicate with the target re-identification device, so that the target re-identification device can obtain the target tracking algorithm from the storage device.
  • This application does not limit the storage method and specific type of storage devices.
  • the target tracking algorithm is a DeepSORT algorithm.
  • the given video data can be directly given by the user, or it can be extracted from the video data collected by image collection equipment such as surveillance cameras and video cameras.
  • the camera may include a single camera, dual cameras, or three cameras, or the camera may be set to a wide-angle camera or a telephoto camera, which is not limited in the embodiments of the present application.
  • In some embodiments, the multi-frame images come from video data given directly by the user; specifically, each frame of the given multi-frame images includes the target to be identified.
  • In other embodiments, the multi-frame images are extracted from video data collected by image collection devices such as surveillance cameras and video cameras; specifically, each frame of the collected multi-frame images includes the target to be identified.
  • In some embodiments, each frame of the multi-frame images includes a target to be identified whose identity is known.
  • In other embodiments, each frame of the multi-frame images includes a target to be identified whose identity is unknown.
  • the targets to be identified include but are not limited to the human body.
  • the target re-identification device is a mobile phone and the target re-identification device is an applet.
  • the applet can display the search page of the target re-identification device.
  • the user can enter multiple frames of images in which each frame of the image includes the missing person A to be identified on the search page of the mini program.
  • the target re-identification device can obtain the search request. Therefore, the target re-identification device can execute the search request, first input a single frame image among the multi-frame images into the appearance feature extraction network, and extract the appearance features of the target to be recognized.
  • the appearance feature extraction network is designed in advance, and the appearance feature extraction network is used to eliminate the interference of the color of the image on the appearance of the target.
  • the appearance feature extraction network can be insensitive to the color of clothing when extracting appearance features.
  • the target re-identification device may store the pre-designed appearance feature extraction network in the target re-identification device and/or the storage device.
  • appearance features may include but are not limited to one or more of hair, tops, bottoms, gender, whether a backpack is carried, whether a hat is worn, and shoes.
  • appearance features include five features: hair, tops, bottoms, gender, and shoes.
  • the target re-identification device is a mobile phone and the target re-identification device is an applet.
  • the applet can display the search page of the target re-identification device.
  • the mini program retrieves the appearance feature extraction network from the storage device to extract missing person A's hair, top, bottoms, gender and shoe features from the images.
  • the historical image library can be an image library with multiple historical target images pre-stored in the target re-identification device and/or the storage device, or it can consist of images corresponding to video data collected by multiple image collection devices, such as surveillance cameras and video cameras, that are communicatively connected to the target re-identification device.
  • In some embodiments, images matching the target to be identified are matched from the pre-stored image library of multiple historical target images.
  • In other embodiments, images matching the target to be identified are matched from the images corresponding to the video data collected by the communicatively connected image collection devices.
  • the target re-identification device is a mobile phone and the target re-identification device is an applet.
  • the applet can display the search page of the target re-identification device.
  • the mini program retrieves the appearance feature extraction network from the storage device to extract missing person A's hair, top, bottoms, gender and shoe features.
  • the applet then retrieves the historical image library from the storage device, searches the historical image library for images that match missing person A based on the hair, top, bottoms, gender and shoe features, and displays the matching images found in the historical image library on the search page.
  • In summary, the target re-identification method inputs a single frame of the multi-frame images into the appearance feature extraction network to obtain appearance features of the target to be identified that are independent of the color of the single-frame image, and then determines, based on those appearance features, the image that matches the target to be identified from the historical image library.
  • Because the appearance feature extraction network eliminates the interference of image color on the target's appearance, the extracted appearance features are independent of color. This avoids the problem of low re-identification accuracy caused by differences in clothing color between the target image to be identified and the target images in the historical image library.
  • this application also provides a generation process of a pre-designed appearance feature extraction network.
  • the target re-identification device uses the appearance feature extraction network to eliminate the interference of the color of the image on the appearance of the target.
  • the appearance feature extraction network generation process can be generated by a feature network generation system or other feasible network generation systems, which will not be described again here.
  • Figure 2 shows a schematic flow chart of a process of generating an appearance feature extraction network provided by an embodiment of the present application.
  • the process of generating an appearance feature extraction network may include:
  • the sample images are multiple frames.
  • the sample image may be an image prepared for shooting in advance, or may be selected from existing images in which each frame includes a sample target.
  • S202: Perform data augmentation on each pixel in each frame of the sample images to obtain multi-frame enhanced images.
  • multiple frames of enhanced images are obtained by exchanging the color values of each color channel of each pixel in the sample image of each frame.
  • the sample image is an RGB image
  • the color values of the RGB color channels of each pixel in the RGB image are exchanged to obtain five images in addition to the original RGB image: an RBG image, a GRB image, a GBR image, a BRG image and a BGR image.
  • In some embodiments, in addition to exchanging the color values of each color channel of each pixel in each frame of the sample images, the gray value of each pixel in each frame of the sample images is also converted to obtain the multi-frame enhanced images.
  • Converting the gray value of each pixel in each frame of the sample images means randomly changing the gray value of each pixel.
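The channel-exchange augmentation described above can be sketched as follows; for simplicity the image is represented as a nested list of (R, G, B) pixel tuples rather than any particular image library's format:

```python
from itertools import permutations

def channel_swap_augment(image):
    """Given an RGB image as a nested list of (r, g, b) pixel tuples,
    return the five channel-permuted copies (RBG, GRB, GBR, BRG, BGR),
    excluding the original RGB arrangement."""
    augmented = []
    for order in permutations(range(3)):
        if order == (0, 1, 2):  # skip the original RGB order
            continue
        augmented.append(
            [[tuple(pixel[c] for c in order) for pixel in row] for row in image]
        )
    return augmented
```

The optional gray-value conversion mentioned above could be applied in the same loop by additionally perturbing each pixel's intensity with a random offset.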
  • S203 Determine the appearance characteristics of the sample target according to the multi-frame enhanced image.
  • the appearance characteristics of the sample target have nothing to do with the color of the multi-frame enhanced image.
  • In the generation process, the sample images are first obtained, and data augmentation is performed on each pixel in each frame of the sample images to obtain multi-frame enhanced images. The appearance characteristics of the sample target are determined based on the multi-frame enhanced images, and these appearance characteristics are independent of the color of the multi-frame enhanced images. The original feature extraction network is then trained according to the appearance characteristics of the sample target to obtain the appearance feature extraction network. Since data augmentation is performed on each pixel of the sample images during the generation process, the network trained in this way can eliminate the interference of image color on the target's appearance.
  • the target re-identification device can also introduce gait features when implementing target re-identification, combining the gait features with the appearance features to determine the image matching the target to be identified from the historical image library.
  • gait features identify a target through its posture and movements while walking.
  • Gait recognition has the characteristics of being difficult to disguise and being insensitive to clothing.
  • Figure 3 shows a schematic flowchart of a target re-identification method provided by an embodiment of the present application.
  • the target re-identification method provided by this application may include:
  • S301 and S302 are respectively similar to the implementation methods of S101 and S102 in the embodiment shown in Figure 1, and will not be described again in this application.
  • Temporally consecutive frame images are images that are continuous in time sequence.
  • the temporally consecutive frame images are obtained by intercepting images that are consecutive in time sequence from the multi-frame images.
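A minimal sketch of intercepting temporally consecutive frames from the multi-frame sequence; the start index and frame count are illustrative parameters, not specified by the application:

```python
def intercept_consecutive(frames, start, count):
    """Intercept `count` temporally consecutive frames from the
    multi-frame sequence, beginning at index `start`. Frames are
    assumed to be stored in time order."""
    return frames[start:start + count]
```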
  • the target re-identification device is a mobile phone and the target re-identification device is an applet.
  • the applet can display the search page of the target re-identification device.
  • the mini program intercepts consecutive frames of images from the multiple frames of missing person A to be identified.
  • the gait characteristics of the target to be identified are used to represent the posture and movements of the target to be identified when walking.
  • the target re-identification device extracts the gait features of the target to be identified through the gait feature extraction network based on the continuous frame images.
  • the target re-identification device stores the gait feature extraction network in the target re-identification device and/or the storage device.
  • the gait features of the target to be identified can be obtained.
  • the formula for extracting gait features through the gait feature extraction network is:
  • f_i = F(X_i)
  • where f_i represents the gait feature of the continuous frame images X_i of the target with identity i, obtained through the conversion function of the gait feature extraction network; F represents the conversion function of the gait feature extraction network; X_i represents the continuous frame images of the target to be identified whose identity is i; n represents the total number of frames of the continuous frame images; P represents the identity of the target to be identified; and P_i represents the target to be identified whose identity is i.
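As an illustrative sketch of f_i = F(X_i): one common (assumed, not claimed) choice of the conversion function F is to aggregate per-frame embeddings of the n consecutive frames by mean pooling; `frame_embedding` below is a hypothetical stand-in for the per-frame part of the gait network:

```python
def frame_embedding(frame):
    """Hypothetical per-frame embedding; a real gait network would map
    each silhouette frame to a learned feature vector."""
    return [float(sum(frame)), float(max(frame))]

def gait_feature(frames):
    """F(X_i): aggregate the per-frame embeddings of the n consecutive
    frames into a single gait feature by mean pooling over time."""
    n = len(frames)
    embs = [frame_embedding(f) for f in frames]
    return [sum(e[k] for e in embs) / n for k in range(len(embs[0]))]
```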
  • the target re-identification device is a mobile phone and the target re-identification device is an applet.
  • the applet can display the search page of the target re-identification device.
  • the mini program calls the gait feature extraction network from the storage device to extract missing person A's walking movement and posture features from the consecutive frame images.
  • the target re-identification device is a mobile phone and the target re-identification device is an applet.
  • the applet can display the search page of the target re-identification device.
  • the mini program retrieves the appearance feature extraction network from the storage device to extract missing person A's hair, top, bottoms, gender and shoe features from each frame of the multi-frame images.
  • the applet retrieves the gait feature extraction network from the storage device to extract missing person A's walking movement and posture features from the consecutive frame images.
  • Based on these appearance and gait features, the applet searches the historical image library for human body images that match missing person A, and displays the matching human body images found in the historical image library on the search page.
  • In summary, the target re-identification device inputs a single frame of the multi-frame images into the appearance feature extraction network to obtain appearance features of the target to be identified that are independent of the color of the single-frame image, determines the gait features of the target to be identified from the consecutive frame images, and then determines the image matching the target to be identified from the historical image library based on both the appearance features and the gait features.
  • Since gait features are insensitive to clothing, this also avoids the problem of low re-identification accuracy caused by differences in clothing color between the target image to be identified and the target images in the historical image library.
  • the target re-identification device can also acquire the gait characteristics of the target to be recognized from the continuous frame images after acquiring temporally continuous consecutive frame images from the multi-frame images.
  • FIG. 4 shows a schematic flowchart of a target re-identification method provided by an embodiment of the present application.
  • the target re-identification method provided by this application may include:
  • the foreground area includes the target to be identified, and the background area does not include the target to be identified.
  • the foreground refers to the area of the image that contains the subject;
  • the background refers to the area behind the subject, which serves to set off the subject.
  • For example, in an image of a human body standing on a beach, the area including the human body is the foreground, and the sea area behind the human body, excluding the human body, is the background.
  • Separating the foreground area and the background area means separating the area that includes the target to be identified from the area that does not include it.
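As a simplified illustration of separating the foreground area from the background area, the sketch below marks pixels of a grayscale frame that differ from a reference background frame; this differencing scheme and its threshold are assumptions for illustration, since a production system would typically use a learned segmentation or a statistical background model:

```python
def foreground_mask(frame, background, threshold=25):
    """Mark pixels that differ from the reference background frame by
    more than `threshold` as foreground (1); the rest are background (0).
    Both inputs are grayscale images as nested lists of intensities."""
    return [
        [1 if abs(p - b) > threshold else 0 for p, b in zip(frow, brow)]
        for frow, brow in zip(frame, background)
    ]

def apply_mask(frame, mask):
    """Keep only foreground pixels; background pixels are zeroed out."""
    return [
        [p if m else 0 for p, m in zip(frow, mrow)]
        for frow, mrow in zip(frame, mask)
    ]
```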
  • the target re-identification device is a mobile phone and the target re-identification device is an applet.
  • the applet can display the search page of the target re-identification device.
  • the mini program intercepts consecutive frame images from the multi-frame images, and then separates, in the consecutive frame images, the area including missing person A from the area not including missing person A.
  • S402: Use the foreground area of each frame of the continuous frame images as the continuous-frame foreground images.
  • the target re-identification device is a mobile phone and the target re-identification device is an applet.
  • the applet can display the search page of the target re-identification device.
  • the mini program intercepts consecutive frame images from the multi-frame images, separates the area including missing person A from the area not including missing person A in the consecutive frame images, and uses the area including missing person A as the continuous-frame foreground images.
  • the target re-identification device is a mobile phone and the target re-identification device is an applet.
  • the applet can display the search page of the target re-identification device.
  • the mini program intercepts consecutive frame images from the multi-frame images, separates the area including missing person A from the area not including missing person A, and uses the area including missing person A as the continuous-frame foreground images.
  • the applet then retrieves the gait feature extraction network from the storage device and extracts missing person A's walking movement and posture features from the continuous-frame foreground images.
  • In summary, the target re-identification device separates the foreground area and the background area of each frame of the continuous frame images, uses the foreground area of each frame as the continuous-frame foreground images, and determines the gait features of the target to be identified based on the continuous-frame foreground images. Extracting the foreground images from the continuous frame images and then extracting the gait features from the foreground images avoids interference from the background area and helps obtain the gait features of the target to be identified quickly.
  • When the target re-identification device combines gait features and appearance features to determine images matching the target to be identified from the historical image library, it can use a variety of methods, such as matching by similarity.
  • the target re-identification device can combine the similarity between the gait features of the target to be identified and the gait features of the target in each frame of the historical image library with the similarity between the appearance features of the target to be identified and the appearance features of the target in each frame of the historical image library, to determine the image matching the target to be identified from the historical image library.
  • similarity refers to the degree of similarity between two targets.
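One possible way to combine the two similarities is a weighted sum per candidate image; the 50/50 weighting below is an assumption for illustration, as the application leaves the exact combination rule open:

```python
def fused_score(app_sim, gait_sim, w_app=0.5, w_gait=0.5):
    """Combine appearance similarity and gait similarity into one
    matching score. The equal weighting is an illustrative assumption."""
    return w_app * app_sim + w_gait * gait_sim

def best_match(candidates):
    """candidates: list of (image_id, appearance_similarity, gait_similarity).
    Return the image id with the highest fused score."""
    return max(candidates, key=lambda c: fused_score(c[1], c[2]))[0]
```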
  • Figure 5a shows a schematic flowchart of a target re-identification method provided by an embodiment of the present application.
  • the target re-identification method provided by this application may include:
  • S501, S502, S503 and S504 are respectively similar to the implementation methods of S301, S302, S303 and S304 in the embodiment shown in Figure 3, and will not be described again in this application.
  • the appearance features and gait features of the target in each frame of the historical image database may be extracted in advance, or may be extracted by the target re-identification device during the target re-identification process.
  • the target re-identification device is a mobile phone and the target re-identification device is an applet.
  • the applet can display the search page of the target re-identification device.
  • the mini program can obtain, from the historical image library, the hair, top, bottoms, gender and shoe features of the human body in each frame of image, as well as its movement and posture features while walking.
  • S506: Determine a plurality of first scores based on the appearance features of the target to be identified and the appearance features of the target in each frame of image.
  • Each first score represents the similarity between the appearance features of the target to be identified and the appearance features of the target in one frame of image.
  • the similarity between the appearance features of the target to be recognized and the appearance features of the target in each frame of the image refers to the cosine similarity between the appearance features of the target to be recognized and the appearance features of the target in each frame of the image.
  • cosine similarity is a similarity obtained by calculating the cosine value of the angle between the vectors corresponding to two features.
  • cosine similarity is calculated as the dot product of the vectors corresponding to the two features divided by the product of the moduli of the two vectors. The greater the cosine similarity, the smaller the angle between the vectors corresponding to the two features and the closer the two features are; conversely, the farther apart they are.
  • the cosine similarity between the appearance features of the target to be identified and the appearance features of the target in each frame of image is calculated as:

    s′ = (f′ᵢ · f′ⱼ) / (‖f′ᵢ‖ ‖f′ⱼ‖)

  • where s′ represents the cosine similarity between the appearance features of the target to be identified and the appearance features of the target in each frame of image, f′ᵢ represents the vector corresponding to the appearance features of the target to be identified, f′ⱼ represents the vector corresponding to the appearance features of the target in each frame of image, and ‖f′ᵢ‖ ‖f′ⱼ‖ represents the product of the modulus of the vector corresponding to the appearance features of the target to be identified and the modulus of the vector corresponding to the appearance features of the target in each frame of image.
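The cosine-similarity calculation above can be sketched as follows; this is a minimal illustration, where the function name `cosine_similarity` and the toy feature vectors are ours, not the application's:

```python
import math

def cosine_similarity(f_i, f_j):
    """Dot product of the two feature vectors divided by the product of their moduli."""
    dot = sum(a * b for a, b in zip(f_i, f_j))
    norm_i = math.sqrt(sum(a * a for a in f_i))
    norm_j = math.sqrt(sum(b * b for b in f_j))
    return dot / (norm_i * norm_j)

# Toy appearance-feature vectors: parallel vectors give similarity 1.0,
# orthogonal vectors give 0.0.
query = [1.0, 2.0, 2.0]
gallery = [2.0, 4.0, 4.0]
print(cosine_similarity(query, gallery))  # → 1.0 (parallel vectors)
```

The same function applies unchanged to the gait-feature similarity s described below, since both scores are cosine similarities over feature vectors.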
  • in a specific embodiment, assume the target re-identification device is a mobile phone and the target re-identification apparatus is a mini program; the mini program can display the search page of the target re-identification apparatus.
  • in the missing-person search scenario, the mini program can extract the appearance features of the target in each frame of image from the historical image library, and calculate the cosine similarity between the hair, top, bottom, gender and shoe features of missing person A to be identified and the hair, top, bottom, gender and shoe features of the human body in each frame of image in the historical image library.
  • the first column is the appearance features of missing person A to be identified, through which the vector corresponding to the appearance features of missing person A can be obtained; the second column is the appearance features of the human body in each frame of image in the historical image library, through which the vectors corresponding to those appearance features can be obtained; the cosine similarity between the appearance features of missing person A and the appearance features of the human body in each frame of image is then calculated from the vector corresponding to missing person A's appearance features and the vectors corresponding to the appearance features of the human body in each frame of image, which gives the first score in the third column.
  • each second score is used to indicate the similarity between the gait features of the target to be identified and the gait features of the target in each frame of image.
  • each second score represents the similarity between the gait characteristics of the target to be recognized and the gait characteristics of the target in each frame of image.
  • the similarity between the gait characteristics of the target to be identified and the gait characteristics of the target in each frame of the image refers to the cosine similarity between the gait characteristics of the target to be identified and the gait characteristics of the target in each frame of the image.
  • the cosine similarity between the gait features of the target to be identified and the gait features of the target in each frame of image is calculated as:

    s = (fᵢ · fⱼ) / (‖fᵢ‖ ‖fⱼ‖)

  • where s represents the cosine similarity between the gait features of the target to be identified and the gait features of the target in each frame of image, fᵢ represents the vector corresponding to the gait features of the target to be identified, fⱼ represents the vector corresponding to the gait features of the target in each frame of image in the historical image library, and ‖fᵢ‖ ‖fⱼ‖ represents the product of the modulus of the vector corresponding to the gait features of the target to be identified and the modulus of the vector corresponding to the gait features of the target in each frame of image.
  • in a specific embodiment, assume the target re-identification device is a mobile phone and the target re-identification apparatus is a mini program; the mini program can display the search page of the target re-identification apparatus.
  • in the missing-person search scenario, the mini program can extract the gait features of the human body in each frame of image from the historical image library, and calculate the cosine similarity between the walking movements and postures of missing person A to be identified and the walking movements and postures of the human body in each frame of image in the historical image library.
  • the first column is the gait features of missing person A to be identified, through which the vector corresponding to the gait features of missing person A can be obtained; the second column is the gait features of the human body in each frame of image in the historical image library, through which the vectors corresponding to those gait features can be obtained; the cosine similarity between the gait features of missing person A and the gait features of the human body in each frame of image is then calculated from the vector corresponding to missing person A's gait features and the vectors corresponding to the gait features of the human body in each frame of image, which gives the second score in the third column.
  • the first score and the second score corresponding to each frame of the image in the historical image database are matched, and then the first score and the second score are fused to obtain multiple scores.
  • the two cosine similarities are fused using:

    s_fusion = fusion(s; α₀, β₀) × fusion(s′; α₁, β₁)

  • where s_fusion represents the value of the score obtained by fusing the first score and the second score; fusion represents the conversion function, through which the value range of the cosine similarity can be converted to between 0 and 1; α and β represent the parameters of the conversion function; α₀ and β₀ represent the parameters of the conversion function for the cosine similarity between the gait features of the target to be identified and the gait features of the target in each frame of image; and α₁ and β₁ represent the parameters of the conversion function for the cosine similarity between the appearance features of the target to be identified and the appearance features of the target in each frame of image.
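The score fusion above can be sketched as follows. The text does not give the exact form of the conversion function `fusion`; a sigmoid parameterized by a scale α and a shift β is assumed here purely for illustration, and the parameter values are ours:

```python
import math

def fusion(s, alpha, beta):
    """Conversion function mapping a cosine similarity into (0, 1).
    The exact form is not specified in the text; a sigmoid with
    scale alpha and shift beta is assumed for illustration."""
    return 1.0 / (1.0 + math.exp(-alpha * (s - beta)))

def fused_score(s_gait, s_appearance,
                alpha0=8.0, beta0=0.5,   # assumed gait-branch parameters
                alpha1=8.0, beta1=0.5):  # assumed appearance-branch parameters
    # s_fusion = fusion(s; a0, b0) * fusion(s'; a1, b1)
    return fusion(s_gait, alpha0, beta0) * fusion(s_appearance, alpha1, beta1)

print(fused_score(0.9, 0.8))  # high similarity on both branches: score near 1
print(fused_score(0.9, 0.1))  # one weak branch pulls the fused score down
```

Because the two converted values are multiplied, a frame only scores highly when both the gait branch and the appearance branch agree with the target to be identified.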
  • in a specific embodiment, assume the target re-identification device is a mobile phone and the target re-identification apparatus is a mini program; the mini program can display the search page of the target re-identification apparatus.
  • in the missing-person search scenario, the mini program can fuse the first score and the second score corresponding to each frame of image to obtain multiple scores.
  • N is preset; for example, N can be 50. Taking the first N scores from the multiple scores then means taking the first 50 scores from the multiple scores.
  • in a specific embodiment, assume the target re-identification device is a mobile phone and the target re-identification apparatus is a mini program; the mini program can display the search page of the target re-identification apparatus.
  • in the missing-person search scenario, the mini program can fuse the first score and the second score corresponding to each frame of image to obtain multiple scores, sort the score values from largest to smallest, and take the top N scores.
  • taking 100 images as an example, the first score and the second score corresponding to each frame of image are matched (for example, the first score corresponding to image 1 is aligned with the second score corresponding to image 1) and fused to obtain 100 scores, and the top 50 scores are taken from the 100 scores.
  • in a specific embodiment, assume the target re-identification device is a mobile phone and the target re-identification apparatus is a mini program; the mini program can display the search page of the target re-identification apparatus.
  • in the missing-person search scenario, the mini program can fuse the first score and the second score corresponding to each frame of image to obtain multiple scores, sort the score values from largest to smallest, select the images corresponding to the top N scores as the images that match the target to be identified, and display them on the search page of the mini program.
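The sort-and-take-top-N step above can be sketched as follows; the image identifiers, score values and the function name `top_n_matches` are hypothetical:

```python
def top_n_matches(scores_by_image, n):
    """Sort fused scores from largest to smallest and return the
    images corresponding to the top N scores."""
    ranked = sorted(scores_by_image.items(), key=lambda kv: kv[1], reverse=True)
    return [image_id for image_id, _score in ranked[:n]]

# Hypothetical fused scores for four gallery images.
scores = {"img_001": 0.42, "img_002": 0.91, "img_003": 0.77, "img_004": 0.18}
print(top_n_matches(scores, 2))  # → ['img_002', 'img_003']
```

Returning the top N images rather than a single best match is what lets the method surface several candidates similar to the target to be identified.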
  • in this application, the target re-identification device inputs a single frame of the multi-frame images into the appearance feature extraction network to obtain appearance features of the target to be identified that are independent of the color of the single frame image, and determines the gait features of the target to be identified from the consecutive frame images; it then determines a plurality of first scores based on the appearance features of the target to be identified and the appearance features of the target in each frame of image, determines a plurality of second scores based on the gait features of the target to be identified and the gait features of the target in each frame of image, fuses the scores, and takes the images corresponding to the first N scores, in order from largest to smallest, as the images that match the target to be identified.
  • combining appearance similarity and gait similarity can improve the accuracy of the recognition; at the same time, taking the images corresponding to the first N scores as images that match the target to be identified yields a larger number of images similar to the target to be identified, which avoids missing images similar to the target to be identified and makes the recognition more user-friendly.
  • the target re-identification device when acquiring appearance features, sequentially inputs each frame of the multi-frame images into the appearance feature extraction network to obtain multiple appearance features, and then fuses the multiple appearance features.
  • FIG. 6 shows a schematic flowchart of a target re-identification method provided by an embodiment of the present application.
  • the target re-identification method provided by this application may include:
  • S601 is similar to the implementation method of S102 in the embodiment shown in Figure 1, and will not be described again in this application.
  • the formula for extracting appearance features through the appearance feature extraction network is:

    f′ᵢ = G(Kᵢ), Kᵢ = {kᵢʲ | j = 1, …, n}

  • where f′ᵢ represents the appearance features, obtained through the conversion function of the appearance feature extraction network, of the multi-frame images Kᵢ whose target to be identified has identity i; G represents the conversion function of the appearance feature extraction network; Kᵢ represents the multi-frame images of the target to be identified with identity i; n represents the total number of frames of the multi-frame images; P represents the identity of the target to be identified; Pᵢ represents the target to be identified with identity i; and kᵢʲ represents the j-th frame image of Pᵢ.
  • S602. Fuse the multiple appearance features to obtain the appearance features of the target to be identified.
  • in this application, the target re-identification device inputs each frame of the multi-frame images into the appearance feature extraction network to obtain multiple appearance features, and fuses the multiple appearance features to obtain the appearance features of the target to be identified.
  • the obtained appearance features can be ensured to be more accurate, thereby further ensuring the accuracy of the recognition.
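The multi-frame fusion described above can be sketched as follows. The text does not specify the fusion operator, so element-wise averaging of the per-frame feature vectors is assumed here for illustration, and all names and vectors are ours:

```python
def fuse_appearance_features(per_frame_features):
    """Fuse per-frame appearance feature vectors into a single vector.
    The fusion operator is not specified in the text; element-wise
    averaging over frames is assumed for illustration."""
    n = len(per_frame_features)
    dim = len(per_frame_features[0])
    return [sum(f[k] for f in per_frame_features) / n for k in range(dim)]

# Three hypothetical per-frame feature vectors for the same target.
frames = [[1.0, 2.0], [3.0, 2.0], [2.0, 5.0]]
print(fuse_appearance_features(frames))  # → [2.0, 3.0]
```

Averaging over frames smooths out per-frame noise (occlusion, pose), which is the stated motivation for fusing multiple appearance features rather than relying on a single frame.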
  • this application also provides a target re-identification device.
  • FIG. 7 shows a schematic block diagram of a target re-identification device provided by an embodiment of the present application.
  • a target re-identification device provided by an embodiment of the present application includes a first acquisition module 701 , a second acquisition module 702 and an identification module 703 .
  • the first acquisition module 701 is used to acquire multiple frames of images, each of which includes a target to be identified;
  • the second acquisition module 702 is used to input a single frame of the multi-frame images into the appearance feature extraction network to obtain the appearance features of the target to be identified, where the appearance features of the target to be identified are independent of the color of the single frame image, and the appearance feature extraction network is used to eliminate the interference of the color of the image on the appearance of the target;
  • the recognition module 703 is configured to determine an image that matches the target to be recognized from the historical image library according to the appearance characteristics of the target to be recognized.
  • in some embodiments, a feature network generation system is used to: acquire at least one frame of sample image, where each frame of sample image includes a sample target; perform data enhancement on each pixel of each frame of sample image to obtain multi-frame enhanced images; determine the appearance features of the sample target according to the multi-frame enhanced images; and train the original feature extraction network according to the appearance features of the sample target to obtain the appearance feature extraction network.
  • the feature network generation system is specifically used to obtain the multi-frame enhanced images either by exchanging the color values of the color channels of each pixel in each frame of sample image, or by exchanging the color values of the color channels and converting the gray value of each pixel in each frame of sample image.
  • the target re-identification device 700 further includes: a third acquisition module (not shown in Figure 7).
  • the third acquisition module is used for:
  • the gait characteristics of the target to be identified are determined, and the gait characteristics of the target to be identified are used to represent the posture and movements of the target to be identified when walking.
  • the identification module 703 is specifically used to:
  • an image matching the target to be identified is determined from the historical image library.
  • the third acquisition module is specifically used for:
  • the foreground area includes the target to be recognized, and the background area does not include the target to be recognized;
  • the gait characteristics of the target to be identified are determined.
  • the identification module 703 is specifically used to:
  • a plurality of first scores are determined according to the appearance features of the target to be identified and the appearance features of the target in each frame of image, where each first score is used to indicate the similarity between the appearance features of the target to be identified and the appearance features of the target in each frame of image;
  • a plurality of second scores are determined according to the gait features of the target to be identified and the gait features of the target in each frame of image, where each second score is used to indicate the similarity between the gait features of the target to be identified and the gait features of the target in each frame of image;
  • the first N scores are taken from the multiple scores, N is a positive integer;
  • the image matching the target to be identified is the image corresponding to the first N scores.
  • the second acquisition module 702 is used for:
  • Each frame of the multiple frames of images is input into the appearance feature extraction network to obtain multiple appearance features, each appearance feature having nothing to do with the color of the corresponding image of each frame;
  • the multiple appearance features are fused to obtain the appearance features of the target to be identified.
  • the device 700 of the present application can be implemented by an application-specific integrated circuit (ASIC) or a programmable logic device (PLD). The above PLD can be a complex programmable logic device (CPLD), a field-programmable gate array (FPGA), generic array logic (GAL), or any combination thereof.
  • the target re-identification method shown in Figure 1 can also be implemented through software, in which case the device 700 and its respective modules can also be software modules.
  • FIG 8 is a schematic structural diagram of a target re-identification device provided by this application.
  • the device 800 includes a processor 801, a memory 802, a communication interface 803 and a bus 804.
  • the processor 801, the memory 802, and the communication interface 803 communicate through the bus 804. Communication can also be achieved through other means such as wireless transmission.
  • the memory 802 is used to store instructions, and the processor 801 is used to execute the instructions stored in the memory 802.
  • the memory 802 stores program code 8021, and the processor 801 can call the program code 8021 stored in the memory 802 to execute the target re-identification method shown in Figure 1.
  • the processor 801 may be a CPU, or other general-purpose processors, digital signal processors (DSPs), application-specific integrated circuits (ASICs), field-programmable gate arrays (FPGAs) or other programmable logic devices, discrete gate or transistor logic devices, discrete hardware components, etc.
  • a general-purpose processor can be a microprocessor or any conventional processor, etc.
  • the memory 802 may include read-only memory and random access memory and provides instructions and data to the processor 801 .
  • Memory 802 may also include non-volatile random access memory.
  • the memory 802 may be volatile memory or non-volatile memory, or may include both volatile and non-volatile memory.
  • non-volatile memory can be read-only memory (ROM), programmable ROM (PROM), erasable programmable read-only memory (erasable PROM, EPROM), electrically erasable programmable read-only memory (EEPROM) or flash memory.
  • volatile memory can be random access memory (RAM), which is used as an external cache. By way of example rather than limitation, many forms of RAM are available, such as static random access memory (static RAM, SRAM), dynamic random access memory (DRAM), synchronous dynamic random access memory (SDRAM), double data rate synchronous dynamic random access memory (double data rate SDRAM, DDR SDRAM), enhanced synchronous dynamic random access memory (enhanced SDRAM, ESDRAM), synchlink dynamic random access memory (synchlink DRAM, SLDRAM) and direct rambus random access memory (direct rambus RAM, DR RAM).
  • bus 804 may also include a power bus, a control bus, a status signal bus, etc. However, for the sake of clarity, the various buses are labeled bus 804 in FIG. 8 .
  • the device 800 may correspond to the device 700 in the present application, and may correspond to the device in the method shown in Figure 1 of the present application. When the device 800 corresponds to the device in the method shown in Figure 1, the above and other operations and/or functions of each module in the device 800 are respectively intended to implement the operation steps of the method executed by the device in Figure 1; for the sake of brevity, they are not described again here.
  • This application also provides a computer-readable storage medium storing a computer program; when the computer program is executed by a processor, the steps in each of the above method embodiments can be implemented.
  • This application also provides a computer program product; when a mobile device executes the computer program product, the steps in each of the above method embodiments can be implemented.
  • the sequence numbers of the steps in the above embodiments do not imply an order of execution; the execution order of each process should be determined by its function and internal logic, and should not constitute any limitation on the implementation of this application.
  • for convenience of description, the internal structure of the above device may be divided into different functional units or modules to complete all or part of the functions described above.
  • Each functional unit and module in the embodiment can be integrated into one processing unit, or each unit can exist physically alone, or two or more units can be integrated into one unit.
  • the above-mentioned integrated unit can be hardware-based. It can also be implemented in the form of software functional units.
  • the specific names of each functional unit and module are only for the convenience of distinguishing each other and are not used to limit the scope of protection of the present application.
  • For the specific working processes of the units and modules in the above system please refer to the corresponding processes in the foregoing method embodiments, and will not be described again here.
  • the disclosed devices/network devices and methods can be implemented in other ways.
  • the device/network equipment embodiments described above are only illustrative.
  • the division of the above modules or units is only a logical function division; in actual implementation, there may be other division methods, for example, multiple units or components may be combined or integrated into another system, or some features may be ignored or not implemented.
  • the coupling or direct coupling or communication connection between components shown or discussed may be indirect coupling or communication connection through some interfaces, devices or units, and may be in electrical, mechanical or other forms.
  • the units described above as separate components may or may not be physically separated.
  • the components shown as units may or may not be physical units, that is, they may be located in one place, or they may be distributed to multiple network units. Some or all of the units can be selected according to actual needs to achieve the purpose of the solution of this application.

Landscapes

  • Engineering & Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Artificial Intelligence (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Image Analysis (AREA)

Abstract

This application belongs to the field of image recognition technology and provides a target re-identification method, apparatus, device and computer-readable storage medium. The method includes: acquiring multiple frames of images, each frame containing the target to be identified; inputting a single frame of the multi-frame images into an appearance feature extraction network to obtain appearance features of the target to be identified, where the appearance features are independent of the color of the single frame image and the appearance feature extraction network is used to eliminate the interference of image color on the target's appearance; and determining, from a historical image library, an image that matches the target to be identified according to its appearance features. The method avoids the low re-identification accuracy caused by clothing-color differences between the image of the target to be identified and the target images in the historical image library.

Description

Target re-identification method, apparatus, device and computer-readable storage medium

This application claims priority to the Chinese patent application No. 202210458101.8, entitled "Target re-identification method, apparatus, device and computer-readable storage medium", filed with the China Patent Office on April 28, 2022, the entire contents of which are incorporated herein by reference.

Technical field

This application belongs to the field of image recognition technology, and in particular relates to a target re-identification method, apparatus, device and computer-readable storage medium.

Background

Target re-identification technology is an important technical means in fields such as intelligent security, missing-target search and case investigation. For example, in the field of missing-target search, it is used to find a missing target based on images of that target.

At present, related technologies mainly perform target re-identification based on deep learning: first, the Yolo detection algorithm detects each image among the multiple frames contained in several video files, and the detections are cropped and saved to form a historical image library; then a feature extraction network extracts a feature vector for each cropped image in the historical image library; finally, the image in the historical image library whose feature vector is closest to that of the image of the target to be identified is found by comparison.

However, because the clothing color in the image of the target to be identified may differ from that in the target images in the historical image library, the related technologies have low recognition accuracy for the image of the target to be identified.

Summary

This application provides a target re-identification method, apparatus, device and computer-readable storage medium, which can avoid the low re-identification accuracy caused by clothing-color differences between the image of the target to be identified and the target images in the historical image library.
In a first aspect, this application provides a target re-identification method, including:

acquiring multiple frames of images, each frame containing the target to be identified;

inputting a single frame of the multi-frame images into an appearance feature extraction network to obtain appearance features of the target to be identified, where the appearance features of the target to be identified are independent of the color of the single frame image, and the appearance feature extraction network is used to eliminate the interference of image color on the target's appearance;

determining, from a historical image library, an image that matches the target to be identified according to the appearance features of the target to be identified.

In this application, a single frame of the multi-frame images is input into the appearance feature extraction network to obtain appearance features of the target to be identified that are independent of the color of the single frame image, and an image matching the target to be identified is determined from the historical image library according to these appearance features. This avoids the low re-identification accuracy caused by clothing-color differences between the image of the target to be identified and the target images in the historical image library, and improves the accuracy of re-identification.

In a second aspect, this application provides a target re-identification apparatus for performing the method in the first aspect or any possible implementation of the first aspect. Specifically, the apparatus includes:

a first acquisition module, used to acquire multiple frames of images, each frame containing the target to be identified;

a second acquisition module, used to input a single frame of the multi-frame images into an appearance feature extraction network to obtain appearance features of the target to be identified, where the appearance features are independent of the color of the single frame image, and the appearance feature extraction network is used to eliminate the interference of image color on the target's appearance;

an identification module, used to determine, from a historical image library, an image that matches the target to be identified according to the appearance features of the target to be identified.

In a third aspect, this application provides a target re-identification device including a memory and a processor. The memory is used to store instructions; the processor executes the instructions stored in the memory, causing the device to perform the target re-identification method in the first aspect or any possible implementation of the first aspect.

In a fourth aspect, a computer-readable storage medium is provided, storing instructions which, when run on a computer, cause the computer to perform the target re-identification method in the first aspect or any possible implementation of the first aspect.

In a fifth aspect, a computer program product containing instructions is provided which, when run on a device, causes the device to perform the target re-identification method in the first aspect or any possible implementation of the first aspect.

It can be understood that, for the beneficial effects of the second to fifth aspects above, reference may be made to the relevant description of the first aspect, which is not repeated here.
Brief description of the drawings

To explain the technical solutions in this application more clearly, the drawings needed in the description of the embodiments or the prior art are briefly introduced below. Obviously, the drawings described below are only some embodiments of this application; for those of ordinary skill in the art, other drawings can be obtained from these drawings without creative effort.

Figure 1 is a schematic flowchart of a target re-identification method provided by an embodiment of this application;

Figure 2 is a schematic flowchart of a target re-identification method provided by an embodiment of this application;

Figure 3 is a schematic flowchart of a target re-identification method provided by an embodiment of this application;

Figure 4 is a schematic flowchart of a target re-identification method provided by an embodiment of this application;

Figure 5a is a schematic flowchart of a target re-identification method provided by an embodiment of this application;

Figure 5b is a schematic diagram of determining a first score provided by an embodiment of this application;

Figure 5c is a schematic diagram of determining a second score provided by an embodiment of this application;

Figure 6 is a schematic flowchart of a target re-identification method provided by an embodiment of this application;

Figure 7 is a schematic structural diagram of a target re-identification apparatus provided by an embodiment of this application;

Figure 8 is a schematic structural diagram of a target re-identification device provided by an embodiment of this application.
Detailed description

In the following description, specific details such as particular system structures and technologies are set forth for the purpose of illustration rather than limitation, in order to provide a thorough understanding of this application. However, it should be clear to those skilled in the art that this application can also be implemented in other embodiments without these specific details. In other cases, detailed descriptions of well-known systems, apparatuses, circuits and methods are omitted so that unnecessary details do not obscure the description of this application.

It should be understood that, when used in the specification and appended claims of this application, the term "comprising" indicates the presence of the described features, integers, steps, operations, elements and/or components, but does not exclude the presence or addition of one or more other features, integers, steps, operations, elements, components and/or collections thereof.

It should also be understood that the term "and/or" used in the specification and appended claims of this application refers to and includes any and all possible combinations of one or more of the associated listed items.

As used in the specification and appended claims of this application, the term "if" may be interpreted, depending on the context, as "when" or "once" or "in response to determining" or "in response to detecting". Similarly, the phrase "if it is determined" or "if [the described condition or event] is detected" may be interpreted, depending on the context, as meaning "once determined" or "in response to determining" or "once [the described condition or event] is detected" or "in response to detecting [the described condition or event]".

In addition, in the description of the specification and appended claims of this application, the terms "first", "second", "third", etc. are used only to distinguish descriptions and cannot be understood as indicating or implying relative importance.

References to "one embodiment" or "some embodiments" in this specification mean that a particular feature, structure or characteristic described in connection with the embodiment is included in one or more embodiments of this application. Thus, the phrases "in one embodiment", "in some embodiments", "in some other embodiments", "in still other embodiments", etc. appearing in different places in this specification do not necessarily all refer to the same embodiment, but mean "one or more but not all embodiments", unless otherwise specifically emphasized. The terms "comprising", "including", "having" and their variations all mean "including but not limited to", unless otherwise specifically emphasized.
This application provides a target re-identification method, apparatus, device and computer-readable storage medium. The method can be implemented by a target re-identification apparatus and is applied in scenarios such as case investigation, missing-person search and intelligent security.

The target re-identification apparatus is communicatively connected with the target re-identification device. For example, the target re-identification apparatus can communicate with the target re-identification device through an application (APP), a web page, an official account within an application, a mini program, etc., so that the apparatus and the device can exchange information. A user can perform target re-identification through the target re-identification apparatus that communicates with the target re-identification device.

The target re-identification apparatus refers to what the user uses when performing target re-identification. The target re-identification device can be a device with display hardware and corresponding software support, such as a smartphone, tablet computer, desktop computer, laptop, wearable device, handheld device or in-vehicle device. The embodiments of this application do not place any restriction on the specific type of the target re-identification device.

Based on the above scenario description, the target re-identification method provided by the embodiments of this application is described in detail below in connection with the target re-identification apparatus.
Please refer to Figure 1, which shows a schematic flowchart of a target re-identification method provided by an embodiment of this application.

As shown in Figure 1, the target re-identification method provided by this application may include:

S101. Acquire multiple frames of images, each frame containing the target to be identified.

In some implementations, the target re-identification device can display the search page of the target re-identification apparatus. The search page is used to display an entry for importing multiple frames of images and to display search results. This application does not limit the specific implementation of the search page. In some embodiments, the search page may include a control for inputting a video or images, which is used to trigger a search based on the input video or images.

The user can perform target re-identification on the search page of the target re-identification apparatus. The target re-identification device can thus send the user's search request to the target re-identification apparatus, where the search request indicates that the user wants to perform target re-identification.

For example, after receiving a user operation, such as a click, double click or long press, on the control for inputting a video or images, the target re-identification device can convert the operation into the user's search request and send it to the target re-identification apparatus. This application does not limit the specific implementation of the user's search request.

The multiple frames of images are acquired by tracking the target to be identified in given video data through a target tracking algorithm.

The target re-identification apparatus can store the target tracking algorithm in the target re-identification apparatus and/or a storage device.

The storage device can communicate with the target re-identification apparatus, so that the target re-identification apparatus can obtain the appearance features extracted through the appearance feature extraction network from the storage device. This application does not limit the storage method or the specific type of the storage device.

In some embodiments, the target tracking algorithm is the DeepSORT algorithm.

The given video data can be given directly by the user, or can be extracted from video data collected by image acquisition devices such as surveillance cameras and video cameras.

The camera can include a single camera, a dual camera or a triple camera, or the camera can be a wide-angle camera or a telephoto camera; the embodiments of this application do not limit this.

In some embodiments, the multiple frames of images come from video data given directly by the user. Specifically, in the given multiple frames of images, each frame contains the target to be identified.

In other embodiments, the multiple frames of images are extracted from video data collected by image acquisition devices such as surveillance cameras and video cameras. Specifically, in the collected multiple frames of images, each frame contains the target to be identified.

In some implementations, when the identity of the target to be identified is known, each frame of the multiple frames contains the target of known identity to be identified.

In some implementations, when the identity of the target to be identified is unknown, each frame of the multiple frames contains the target of unknown identity to be identified.

The target to be identified includes, but is not limited to, a human body.

In a specific embodiment, assume the target re-identification device is a mobile phone and the target re-identification apparatus is a mini program; the mini program can display the search page of the target re-identification apparatus. In the missing-person search scenario, the user can input, on the search page of the mini program, multiple frames of images each of which contains missing person A to be identified.
S102. Input a single frame of the multi-frame images into an appearance feature extraction network to obtain appearance features of the target to be identified, where the appearance features of the target to be identified are independent of the color of the single frame image, and the appearance feature extraction network is used to eliminate the interference of image color on the target's appearance.

Based on S101, the target re-identification apparatus can obtain the search request. The apparatus can then execute the search request: first, it inputs a single frame of the multi-frame images into the appearance feature extraction network to extract the appearance features of the target to be identified.

The appearance feature extraction network is designed in advance and is used to eliminate the interference of image color on the target's appearance; when extracting appearance features, it is insensitive to clothing color.

The target re-identification apparatus can store the pre-designed appearance feature extraction network in the target re-identification apparatus and/or a storage device.

In some implementations, the appearance features may include, but are not limited to, one or more of hair, top, bottom, gender, whether carrying a backpack, whether wearing a hat, and shoes. For example, the appearance features include five items: hair, top, bottom, gender and shoes.

In a specific embodiment, assume the target re-identification device is a mobile phone and the target re-identification apparatus is a mini program; the mini program can display the search page of the target re-identification apparatus. In the missing-person search scenario, after the user inputs, on the search page of the mini program, multiple frames of images each of which contains missing person A to be identified, the mini program retrieves the appearance feature extraction network from the storage device and extracts the hair, top, bottom, gender and shoe features of missing person A.

S103. Determine, from a historical image library, an image that matches the target to be identified according to the appearance features of the target to be identified.

The images in the historical image library can be an image library with multiple historical target images pre-stored in the target re-identification apparatus/storage device, or can be images corresponding to video data collected by multiple image acquisition devices, such as surveillance cameras and video cameras, that are communicatively connected with the target re-identification device.

In some implementations, when the identity of the target to be identified is known, the matching image can be found in a pre-stored image library with multiple historical target images.

In other embodiments, when the identity of the target to be identified is unknown, the matching image can be found among images corresponding to video data collected by multiple image acquisition devices, such as surveillance cameras and video cameras, that are communicatively connected with the target re-identification device.

In a specific embodiment, assume the target re-identification device is a mobile phone and the target re-identification apparatus is a mini program; the mini program can display the search page of the target re-identification apparatus. In the missing-person search scenario, after the user inputs, on the search page of the mini program, multiple frames of images each of which contains missing person A to be identified, the mini program retrieves the appearance feature extraction network from the storage device and extracts the hair, top, bottom, gender and shoe features of missing person A, then retrieves the historical image library from the storage device, searches the historical image library for images matching missing person A according to these features, and displays the matching images found in the historical image library on the search page.

In the target re-identification method provided by this application, a single frame of the multi-frame images is input into the appearance feature extraction network to obtain appearance features of the target to be identified that are independent of the color of the single frame image, and an image matching the target to be identified is determined from the historical image library according to these appearance features. With the help of the appearance feature extraction network, which eliminates the interference of image color on the target's appearance, color-independent appearance features of the target to be identified are extracted from the single frame image. Since these appearance features are independent of color, the low re-identification accuracy caused by clothing-color differences between the image of the target to be identified and the target images in the historical image library can be avoided.
Based on the description of the embodiment shown in Figure 1 above, this application also provides a process for generating the pre-designed appearance feature extraction network.

Below, with reference to Figure 2, the specific implementation of the process of generating the appearance feature extraction network in this application is described in detail.

Based on the description of S102 in Figure 1, when acquiring the appearance features in a single frame image, the target re-identification apparatus uses the appearance feature extraction network to eliminate the interference of image color on the target's appearance.

The appearance feature extraction network can be generated by a feature network generation system, or by other feasible network generation systems, which is not described again here.

Please refer to Figure 2, which shows a schematic flowchart of a process for generating an appearance feature extraction network provided by an embodiment of this application.

As shown in Figure 2, the process of generating the appearance feature extraction network provided by this application may include:

S201. Acquire at least one frame of sample image, each frame of sample image containing a sample target.

In some embodiments, there are multiple frames of sample images.

The sample images can be images prepared by shooting in advance, or can be selected from existing images each frame of which contains the sample target.

S202. Perform data enhancement on each pixel of each frame of sample image to obtain multi-frame enhanced images.

In some embodiments, the multi-frame enhanced images are obtained by exchanging the color values of the color channels of each pixel in each frame of sample image.

For example, when the sample image is an RGB image, the color values of the RGB color channels of each pixel in the RGB image are exchanged to obtain five images in addition to the original RGB image: an RBG image, a GRB image, a GBR image, a BRG image and a BGR image.

In other embodiments, the multi-frame enhanced images are obtained by exchanging the color values of the color channels of each pixel in each frame of sample image and performing gray-value conversion on each pixel in each frame of sample image.

Performing gray-value conversion on each pixel of each frame of sample image means randomly changing the gray value of each pixel.

For example, each pixel with a gray value of 10 in the sample image is converted into a pixel with a gray value of 20 or 8, and each pixel with a gray value of 40 is converted into a pixel with a gray value of 50 or 45.
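The channel-exchange augmentation in S202 can be sketched as follows; images are represented here as nested lists of (r, g, b) pixel tuples for simplicity, and the function name is ours:

```python
from itertools import permutations

def channel_permutation_augment(image):
    """Given an RGB image as a nested list of (r, g, b) pixels, return the
    five channel-permuted variants (RBG, GRB, GBR, BRG, BGR) described
    in S202, excluding the original RGB ordering."""
    augmented = []
    for perm in permutations(range(3)):
        if perm == (0, 1, 2):  # skip the original RGB ordering
            continue
        augmented.append(
            [[tuple(px[i] for i in perm) for px in row] for row in image]
        )
    return augmented

# A single-pixel "image" makes the five permutations easy to see.
tiny = [[(10, 20, 30)]]
for variant in channel_permutation_augment(tiny):
    print(variant[0][0])
```

Training on all permuted variants of the same sample forces the network to score them identically, which is what makes the learned appearance features insensitive to color.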
S203. Determine the appearance features of the sample target according to the multi-frame enhanced images, where the appearance features of the sample target are independent of the colors of the multi-frame enhanced images.

S204. Train the original feature extraction network according to the appearance features of the sample target to obtain the appearance feature extraction network.

In this application, in the process of generating the appearance feature extraction network, the feature network generation system first acquires sample images, performs data enhancement on each pixel of each frame of sample image to obtain multi-frame enhanced images, determines the appearance features of the sample target according to the multi-frame enhanced images (these appearance features being independent of the colors of the enhanced images), and trains the original feature extraction network according to the appearance features of the sample target to obtain the appearance feature extraction network. Since data enhancement is applied to every pixel of the sample images during generation, the network trained through this process can be used to eliminate the interference of image color on the target's appearance.
Based on the description of the embodiment shown in Figure 1, when performing target re-identification, the target re-identification apparatus can also introduce gait features and combine them with the appearance features to determine the image matching the target to be identified from the historical image library.

Gait features are intended for identity recognition through posture and movements while walking. Gait recognition has the characteristics of being difficult to disguise and insensitive to clothing.

Below, with reference to Figure 3, the specific implementation of the above process by the target re-identification apparatus is described in detail.

Please refer to Figure 3, which shows a schematic flowchart of a target re-identification method provided by an embodiment of this application.

As shown in Figure 3, the target re-identification method provided by this application may include:

S301. Acquire multiple frames of images, each frame containing the target to be identified.

S302. Input a single frame of the multi-frame images into an appearance feature extraction network to obtain appearance features of the target to be identified, where the appearance features are independent of the color of the single frame image, and the appearance feature extraction network is used to eliminate the interference of image color on the target's appearance.

S301 and S302 are respectively similar to the implementations of S101 and S102 in the embodiment shown in Figure 1, and are not described again here.

S303. From the multiple frames of images, acquire temporally consecutive frame images.

Temporally consecutive frame images are images that are consecutive in the time sequence.

In some implementations, the temporally consecutive frame images are obtained by extracting the images that are consecutive in the time sequence from the multiple frames of images.

In a specific embodiment, assume the target re-identification device is a mobile phone and the target re-identification apparatus is a mini program; the mini program can display the search page of the target re-identification apparatus. In the missing-person search scenario, after the user inputs, on the search page of the mini program, multiple frames of images each of which contains missing person A to be identified, the mini program extracts consecutive frame images from the multiple frames of images of missing person A.

S304. Determine the gait features of the target to be identified according to the consecutive frame images, where the gait features of the target to be identified are used to represent the posture and movements of the target to be identified while walking.

In some embodiments, the target re-identification apparatus extracts the gait features of the target to be identified through a gait feature extraction network according to the consecutive frame images.

The target re-identification apparatus stores the gait feature extraction network in the target re-identification apparatus and/or a storage device.

In some implementations, the gait features of the target to be identified can be obtained by inputting the multiple frames of images into the gait feature extraction network.
In some implementations, the formula for extracting gait features through the gait feature extraction network is:

    fᵢ = F(Xᵢ), Xᵢ = {xᵢʲ | j = 1, …, n}

where fᵢ represents the gait features, obtained through the conversion function of the gait feature extraction network, of the consecutive frame images Xᵢ whose target to be identified has identity i; F represents the conversion function of the gait feature extraction network; Xᵢ represents the consecutive frame images of the target to be identified with identity i; n represents the total number of consecutive frames; P represents the identity of the target to be identified; Pᵢ represents the target to be identified with identity i; and xᵢʲ represents the j-th frame image of Pᵢ in the consecutive frame images.
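The formula above can be illustrated with a toy stand-in for F; the real F is a trained gait network, so the per-frame embedding and temporal averaging below are our illustrative assumptions, not the application's actual architecture:

```python
def frame_embedding(frame):
    """Toy per-frame embedding standing in for a real gait-network layer:
    here, simply the row-wise sums of a binary silhouette matrix."""
    return [sum(row) for row in frame]

def gait_features(consecutive_frames):
    """Toy F: embed each frame x_i^j of X_i, then average over the n frames."""
    n = len(consecutive_frames)
    embeddings = [frame_embedding(x) for x in consecutive_frames]
    dim = len(embeddings[0])
    return [sum(e[k] for e in embeddings) / n for k in range(dim)]

# X_i: two hypothetical 2x2 binary silhouette frames of the same target.
X_i = [[[1, 0], [1, 1]],
       [[0, 1], [1, 1]]]
print(gait_features(X_i))  # → [1.0, 2.0]
```

The point of operating on the whole sequence Xᵢ rather than a single frame is that gait is a temporal pattern: only consecutive frames capture the walking movements and posture.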
在一个具体的实施例中,假设目标重识别设备为手机,目标重识别装置为小程序,小程序可以显示目标重识别装置的查找页面。在查找失踪人口场景中,用户在小程序的查找页面中输入每帧图像均包括有待识别失踪人口甲的多帧图像后,小程序从存储设备中调取步态特征提取网络从连续帧图像中提取失踪人口甲的走路时的动作与姿态特征。
S305、根据所述待识别目标的外观特征和所述待识别目标的步态特征,从所述历史图像库中确定与所述待识别目标相匹配的图像。
在一个具体的实施例中,假设目标重识别设备为手机,目标重识别装置为小程序,小程序可以显示目标重识别装置的查找页面。在查找失踪人口场景中,用户在小程序的查找页面中输入每帧图像均包括有待识别失踪人口甲的多帧图像后,小程序从存储设备中调取外观特征提取网络提取包括有失踪人口甲的多帧图像中每帧图像的头发、上衣、下装、性别和鞋子特征,同时,小程序从从存储设备中调取步态特征提取网络从连续帧图像中提取包括有失踪人口甲的走路时的动作与姿态特征,小程序根据失踪人口甲的头发、上衣、下装、性别和鞋子特征以及走路时的动作与姿态特征结合从历史图像库中查 找与失踪人口甲相匹配的人体图像,并在查找页面显示从历史图像库中查找到的与失踪人口甲相匹配的人体图像。
本申请中，目标重识别装置通过将多帧图像中的单帧图像输入到外观特征提取网络中，得到与单帧图像的颜色无关的待识别目标的外观特征，并通过连续帧图像确定待识别目标的步态特征，再根据待识别目标的外观特征和待识别目标的步态特征，从历史图像库中确定与待识别目标相匹配的图像。结合外观特征和步态特征从历史图像库中确定与待识别目标相匹配的图像，提高了识别的效率，也提高了识别的准确率；同时，由于步态识别具有不容易伪装和对衣着服饰不敏感的特点，因而也可以避免因待识别目标图像与历史图像库中的目标图像的衣物颜色不同，而导致的对待识别目标图像重识别准确率低的问题。
基于图3所示实施例的描述,目标重识别装置还可以在将从多帧图像中获取时间连续的连续帧图像后,从连续帧图像中获取待识别目标的步态特征。
下面,结合图4,详细介绍目标重识别装置执行上述过程的具体实现方式。
请参阅图4,图4示出了本申请一实施例提供的目标重识别方法的流程示意图。
如图4所示,本申请提供的目标重识别方法可以包括:
S401、将所述连续帧图像中的每帧图像的前背景区域和背景区域进行分离,所述前背景区域中包括所述待识别目标,所述背景区域中不包括所述待识别目标。
其中，前背景（即前景）是指图像画面中包括主体的区域；背景是指位于主体后面、用以陪衬主体的区域，具有烘托主体的作用。例如，一张人体站在海边的图像中，包括人体的区域为前背景，不包括人体、位于人体后面的大海区域为背景。
将前背景区域和背景区域进行分离是指将包括有待识别目标的区域和不包括待识别目标的区域分离。
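S401的前背景/背景分离可以借助一张二值掩码完成（掩码通常由分割模型给出，此处作为输入假设）。下面是一个示意性的Python片段：

```python
import numpy as np

def split_foreground_background(frame, mask):
    """按二值掩码将一帧图像分离为前背景区域（包括待识别目标）
    与背景区域（不包括待识别目标）。
    mask 假设由分割模型给出，1 表示目标像素，0 表示背景像素。"""
    mask3 = mask[..., None].astype(bool)       # 扩展到三通道
    foreground = np.where(mask3, frame, 0)     # 前背景区域
    background = np.where(mask3, 0, frame)     # 背景区域
    return foreground, background
```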
在一个具体的实施例中，假设目标重识别设备为手机，目标重识别装置为小程序，小程序可以显示目标重识别装置的查找页面。在查找失踪人口场景中，用户在小程序的查找页面中输入每帧图像均包括有待识别失踪人口甲的多帧图像后，小程序从多帧图像中截取连续帧图像，再将连续帧图像中包括有失踪人口甲的区域和不包括有失踪人口甲的区域分离。
S402、将所述连续帧图像中的每帧图像的前背景区域作为连续帧前背景图像。
在一个具体的实施例中，假设目标重识别设备为手机，目标重识别装置为小程序，小程序可以显示目标重识别装置的查找页面。在查找失踪人口场景中，用户在小程序的查找页面中输入每帧图像均包括有待识别失踪人口甲的多帧图像后，小程序从多帧图像中截取连续帧图像，再将连续帧图像中包括有失踪人口甲的区域和不包括有失踪人口甲的区域分离，并将包括有失踪人口甲的区域作为连续帧前背景图像。
S403、根据所述连续帧前背景图像,确定所述待识别目标的步态特征。
在一个具体的实施例中,假设目标重识别设备为手机,目标重识别装置为小程序,小程序可以显示目标重识别装置的查找页面。在查找失踪人口场景中,用户在小程序的查找页面中输入每帧图像均包括有待识别失踪人口甲的多帧图像后,小程序从多帧图像中截取连续帧图像,再将连续帧图像中包括有失踪人口甲的区域和不包括有失踪人口甲的区域分离,并将包括有失踪人口甲的区域作为连续帧前背景图像,小程序从存储设备中调取步态特征提取网络从连续帧前背景图像中提取失踪人口甲的走路时的动作与姿态特征。
本申请中,目标重识别装置通过将所述连续帧图像中的每帧图像的前背景区域和背景区域进行分离,将所述连续帧图像中的每帧图像的前背景区域作为连续帧前背景图像,根据所述连续帧前背景图像,确定所述待识别目标的步态特征。从连续帧图像中提取连续帧前背景图像,再从连续帧前背景图像中提取步态特征,避免了背景区域的干扰,有利于快速获取待识别目标的步态特征。
基于图4所示实施例的描述,目标重识别装置在将步态特征与外观特征结合来从历史图像库中确定与待识别目标相匹配的图像时,可以通过多种方式,比如,引入相似度进行匹配。
下面,结合图5a,详细介绍目标重识别装置执行上述过程的具体实现方式。
基于S304的描述，目标重识别装置可引入相似度，通过待识别目标的步态特征与历史图像库中每帧图像的步态特征的相似度，以及待识别目标的外观特征与历史图像库中每帧图像的外观特征的相似度结合，来从历史图像库中确定与待识别目标相匹配的图像。
其中,相似度是指两个目标之间的相近程度。
请参阅图5a,图5a示出了本申请一实施例提供的目标重识别方法的流程示意图。
如图5a所示,本申请提供的目标重识别方法可以包括:
S501、获取多帧图像,每帧图像中均包括待识别目标。
S502、将所述多帧图像中的单帧图像输入到外观特征提取网络中,得到所述待识别目标的外观特征,所述待识别目标的外观特征与所述单帧图像的颜色无关,所述外观特征提取网络用于消除图像的颜色对目标的外观的干扰。
S503、从所述多帧图像中,获取时间连续的连续帧图像。
S504、根据所述连续帧图像,确定所述待识别目标的步态特征。
其中,S501、S502、S503和S504分别与图3所示实施例中的S301、S302、S303和S304实现方法类似,本申请此处不再赘述。
S505、在所述历史图像库中,确定每帧图像中的目标的外观特征和步态特征。
其中,历史图像库的每帧图像中的目标的外观特征和步态特征可以为预先提取的,也可以为在目标重识别过程中由目标重识别装置进行提取的。
在一个具体的实施例中,假设目标重识别设备为手机,目标重识别装置为小程序,小程序可以显示目标重识别装置的查找页面。在查找失踪人口场景中,用户在小程序的查找页面中输入每帧图像均包括有待识别失踪人口甲的多帧图像后,小程序可以从历史图像库中获取每帧图像中的人体的头发、上衣、下装、性别和鞋子特征以及走路时的动作与姿态特征。
S506、根据所述待识别目标的外观特征与所述每帧图像中的目标的外观特征,确定多个第一评分,每个第一评分用于指示所述待识别目标的外观特征与所述每帧图像中的目标的外观特征之间的相似度。
其中,每个第一评分表示待识别目标的外观特征与每帧图像中的目标的外观特征的相似度。
在一些实施例中，待识别目标的外观特征与每帧图像中的目标的外观特征的相似度是指待识别目标的外观特征与每帧图像中的目标的外观特征的余弦相似度。
其中,余弦相似度,又称为余弦相似性,是通过计算两个特征对应的向量的夹角余弦值来得到的相似度。
余弦相似度为两个特征对应的向量的内积除以两个向量的模的乘积。余弦相似度越大，说明两个特征对应的向量之间的夹角越小，两个特征对应的向量越接近；反之，则越远。
在一些实施例中，待识别目标的外观特征与每帧图像中的目标的外观特征的余弦相似度计算公式为：

s′ = (f′_i · f′_j) / (‖f′_i‖‖f′_j‖)

其中，s′ 表示待识别目标的外观特征与每帧图像中的目标的外观特征的余弦相似度；f′_i 表示待识别目标的外观特征对应的向量；f′_j 表示每帧图像中的目标的外观特征对应的向量；‖f′_i‖‖f′_j‖ 表示两个向量的模的乘积。
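按上式，余弦相似度可用几行Python直接计算（仅为示意）：

```python
import numpy as np

def cosine_similarity(f_i, f_j):
    """余弦相似度 s' = (f_i · f_j) / (‖f_i‖‖f_j‖)，取值范围 [-1, 1]。"""
    return float(np.dot(f_i, f_j) /
                 (np.linalg.norm(f_i) * np.linalg.norm(f_j)))
```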
在一个具体的实施例中,假设目标重识别设备为手机,目标重识别装置为小程序,小程序可以显示目标重识别装置的查找页面。在查找失踪人口场景中,用户在小程序的查找页面中输入每帧图像均包括有待识别失踪人口甲的多帧图像后,小程序可以从历史图像库中提取每帧图像中的目标的外观特征,并计算待识别失踪人口甲的头发、上衣、下装、性别和鞋子特征与历史图像库中每帧图像中的人体的头发、上衣、下装、性别和鞋子特征的余弦相似度。
在历史图像库中有100帧图像时,计算待识别失踪人口甲的头发、上衣、下装、性别和鞋子特征与100帧图像中每帧图像中的人体的头发、上衣、下装、性别和鞋子特征的余弦相似度。
如图5b所示，第1列为待识别失踪人口甲的外观特征，可以通过待识别失踪人口甲的外观特征得到待识别失踪人口甲的外观特征对应的向量；第2列为历史图像库中每帧图像中的人体的外观特征，可以通过历史图像库中每帧图像中的人体的外观特征得到历史图像库中每帧图像中的人体的外观特征对应的向量；通过待识别失踪人口甲的外观特征对应的向量和历史图像库中每帧图像中的人体的外观特征对应的向量计算出待识别失踪人口甲的外观特征与历史图像库中每帧图像中的人体的外观特征的余弦相似度，即第3列的第一评分。
S507、根据所述待识别目标的步态特征与所述每帧图像中的目标的步态特征,确定多个第二评分,每个第二评分用于指示所述待识别目标的步态特征与所述每帧图像中的目标的步态特征之间的相似度。
其中,每个第二评分表示待识别目标的步态特征与每帧图像中的目标的步态特征的相似度。
在一些实施例中,待识别目标的步态特征与每帧图像中的目标的步态特征的相似度是指待识别目标的步态特征与每帧图像中的目标的步态特征的余弦相似度。
在一些实施例中，待识别目标的步态特征与每帧图像中的目标的步态特征的余弦相似度计算公式为：

s = (f_i · f_j) / (‖f_i‖‖f_j‖)

其中，s 表示待识别目标的步态特征与每帧图像中的目标的步态特征的余弦相似度；f_i 表示待识别目标的步态特征对应的向量；f_j 表示历史图像库中每帧图像中的目标的步态特征对应的向量；‖f_i‖‖f_j‖ 表示两个向量的模的乘积。
在一个具体的实施例中,假设目标重识别设备为手机,目标重识别装置为小程序,小程序可以显示目标重识别装置的查找页面。在查找失踪人口场景中,用户在小程序的查找页面中输入每帧图像均包括有待识别失踪人口甲的多帧图像后,小程序可以从历史图像库中提取每帧图像中的人体的步态特征,并计算待识别失踪人口甲的走路时的动作与姿态和历史图像库中每帧图像中的人体的走路时的动作与姿态的余弦相似度。
在历史图像库中有100帧图像时,计算待识别失踪人口甲的走路时的动作与姿态和100帧图像中每帧图像中的人体的走路时的动作与姿态的余弦相似度。
如图5c所示，第1列为待识别失踪人口甲的步态特征，可以通过待识别失踪人口甲的步态特征得到待识别失踪人口甲的步态特征对应的向量；第2列为历史图像库中每帧图像中的人体的步态特征，可以通过历史图像库中每帧图像中的人体的步态特征得到历史图像库中每帧图像中的人体的步态特征对应的向量；通过待识别失踪人口甲的步态特征对应的向量和历史图像库中每帧图像中的人体的步态特征对应的向量计算出待识别失踪人口甲的步态特征与历史图像库中每帧图像中的人体的步态特征的余弦相似度，即第3列的第二评分。
S508、对所述多个第一评分和所述多个第二评分进行融合,得到多个评分。
在一些实施例中，将历史图像库中每帧图像对应的第一评分和第二评分相对应，再进行第一评分和第二评分的融合，得到多个评分。
由于余弦相似度的取值范围在-1~1之间,因此,余弦相似度的融合不能单纯使用两个余弦相似度相乘。
在一些实施例中，使用下式对两个余弦相似度进行融合：

s_fusion = fusion(s; λ_0, γ_0) · fusion(s′; λ_1, γ_1)

其中，s_fusion 表示将第一评分和第二评分融合得到的评分的值；fusion 表示转换函数，可以将余弦相似度的值域转换为0到1之间；λ 和 γ 表示转换函数的参数，λ_0 和 γ_0 表示作用于步态特征余弦相似度 s 的转换函数的参数，λ_1 和 γ_1 表示作用于外观特征余弦相似度 s′ 的转换函数的参数。
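由于fusion的具体定义在原文中以图片形式给出、此处不可见，下面的Python草图假设其为一种常见的sigmoid式映射 1/(1+exp(-λ(s-γ)))，它能把[-1, 1]上的余弦相似度映射到0到1之间；λ、γ的取值亦为示例假设：

```python
import numpy as np

def fusion(s, lam, gamma):
    """假设的转换函数：把余弦相似度 s ∈ [-1, 1] 映射到 (0, 1)。"""
    return 1.0 / (1.0 + np.exp(-lam * (s - gamma)))

def fuse_scores(s_gait, s_app, lam0=8.0, gamma0=0.2, lam1=8.0, gamma1=0.2):
    """s_fusion = fusion(s; λ0, γ0) · fusion(s'; λ1, γ1)。
    四个参数取值仅为示例假设，实际需按数据调参。"""
    return fusion(s_gait, lam0, gamma0) * fusion(s_app, lam1, gamma1)
```

两个相似度都高时融合评分接近1，任一相似度低时评分被显著压低，避免了直接相乘在负值区间产生的符号问题。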
在一个具体的实施例中,假设目标重识别设备为手机,目标重识别装置为小程序,小程序可以显示目标重识别装置的查找页面。在查找失踪人口场景中,用户在小程序的查找页面中输入每帧图像均包括有待识别失踪人口甲的多帧图像后,小程序可以将每帧图像对应的第一评分和第二评分进行融合,得到多个评分。
在历史图像库中有100帧图像时,将100帧图像中每帧图像对应的第一评分和第二评分相对应,比如先将图像1对应的第一评分和图像1对应的第二评分对齐,再进行融合。
S509、按照评分由大到小的顺序,从所述多个评分中取前N个评分,N为正整数。
其中,N为预先设置的,比如,N可以为50。在N为50时,从多个评分取前N个评分为从多个评分取前50个评分。
在一个具体的实施例中,假设目标重识别设备为手机,目标重识别装置为小程序,小程序可以显示目标重识别装置的查找页面。在查找失踪人口场景中,用户在小程序的查找页面中输入每帧图像均包括有待识别失踪人口甲的多帧图像后,小程序可以将每帧图像对应的第一评分和第二评分进行融合,得到多个评分,并将评分的值按照由大到小的顺序排序,并从中取前N个评分。
在历史图像库中有100帧图像时,将100帧图像中每帧图像对应的第一评分和第二评分相对应,比如将图像1对应的第一评分和图像1对应的第二评分对齐,进行融合,得到100个评分,从100个评分中取前50个评分。
S510、从所述历史图像库中,获取所述前N个评分对应的图像。
在一个具体的实施例中,假设目标重识别设备为手机,目标重识别装置为小程序,小程序可以显示目标重识别装置的查找页面。在查找失踪人口场景中,用户在小程序的查找页面中输入每帧图像均包括有待识别失踪人口甲的多帧图像后,小程序可以将每帧图像对应的第一评分和第二评分进行融合,得到多个评分,并将评分的值按照由大到小的顺序排序,从中取前N个评分对应的图像。
在历史图像库中有100帧图像时,将100帧图像中每帧图像对应的第一评分和第二评分相对应,比如将图像1对应的第一评分和图像1对应的第二评分对齐,进行融合,得到100个评分,从100个评分中取前50个评分对应的图像。
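S509至S511的"按评分由大到小的顺序取前N个评分对应的图像"可以概括为如下Python草图（返回前N个评分对应的图像索引，索引与历史图像库中的图像一一对应）：

```python
import numpy as np

def top_n_matches(scores, n):
    """按评分由大到小的顺序，返回前 N 个评分对应的图像索引。
    评分总数不足 N 时返回全部索引。"""
    order = np.argsort(scores)[::-1]  # 由大到小排序的索引
    return order[:n].tolist()
```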
S511、确定与所述待识别目标相匹配的图像为所述前N个评分对应的图像。
在一个具体的实施例中，假设目标重识别设备为手机，目标重识别装置为小程序，小程序可以显示目标重识别装置的查找页面。在查找失踪人口场景中，用户在小程序的查找页面中输入每帧图像均包括有待识别失踪人口甲的多帧图像后，小程序可以将每帧图像对应的第一评分和第二评分进行融合，得到多个评分，并将评分的值按照由大到小的顺序排序，从中取前N个评分对应的图像作为与待识别目标相匹配的图像，在小程序的查找页面上进行显示。
本申请中，目标重识别装置将多帧图像中的单帧图像输入到外观特征提取网络中，得到与单帧图像的颜色无关的待识别目标的外观特征，并通过连续帧图像确定待识别目标的步态特征；根据待识别目标的外观特征与每帧图像中的目标的外观特征，确定多个第一评分，根据待识别目标的步态特征与每帧图像中的目标的步态特征，确定多个第二评分，对多个第一评分和多个第二评分进行融合，得到多个评分，按照评分由大到小的顺序，获取前N个评分对应的图像，作为与待识别目标相匹配的图像。通过两个评分进行融合，再将融合得到的前N个评分对应的图像作为与待识别目标相匹配的图像，可以达到提高识别的准确率的效果；同时，将前N个评分对应的图像作为与待识别目标相匹配的图像，得到的与待识别目标相似的图像数量较多，可以避免漏掉与待识别目标相似的图像，保证识别的人性化。
基于图1所示实施例的描述,目标重识别装置在获取外观特征时,将多帧图像中的每帧图像依次输入外观特征提取网络,得到多个外观特征,再将多个外观特征融合。
下面,结合图6,详细介绍目标重识别装置执行上述过程的具体实现方式。
请参阅图6,图6示出了本申请一实施例提供的目标重识别方法的流程示意图。
如图6所示,本申请提供的目标重识别方法可以包括:
S601、将所述多帧图像中的每帧图像均输入到所述外观特征提取网络中,分别得到多个外观特征,每个外观特征与对应的每帧图像的颜色无关。
其中,S601与图1所示实施例中的S102实现方法类似,本申请此处不再赘述。
在一些实施方式中，通过外观特征提取网络提取外观特征的公式为：

f′_i = G(K_i)

K_i = {k_1^{P_i}, k_2^{P_i}, …, k_n^{P_i}}

其中，f′_i 表示通过外观特征提取网络的转换函数得到的身份为 i 的待识别目标的多帧图像 K_i 的外观特征；G 表示外观特征提取网络的转换函数；K_i 表示包括身份为 i 的待识别目标的多帧图像；n 表示多帧图像的总帧数；P 表示待识别目标的身份，P_i 表示身份为 i 的待识别目标；k_j^{P_i} 表示多帧图像中身份为 i 的待识别目标 P_i 的第 j 帧图像。
S602、对所述多个外观特征进行融合,得到所述待识别目标的外观特征。
本申请中,目标重识别装置将所述多帧图像中的每帧图像均输入到外观特征提取网络中,分别得到多个外观特征,对所述多个外观特征进行融合,得到所述待识别目标的外观特征。通过将多帧图像中每帧图像对应的外观特征进行融合,得到待识别目标的外观特征的方式,可以保证得到的外观特征更加准确,以进一步确保识别的准确率。
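S602未限定多个外观特征的具体融合方式。下面的Python片段假设采用逐维平均后做L2归一化的融合方式，仅作示意：

```python
import numpy as np

def fuse_appearance_features(features):
    """将多帧图像各自的外观特征融合为待识别目标的外观特征。
    融合方式（逐维平均 + L2 归一化）为本示例的假设。"""
    f = np.mean(np.stack(features), axis=0)  # 逐维平均
    return f / np.linalg.norm(f)             # L2 归一化，便于后续余弦相似度计算
```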
对应于上述图1所示实施例所述的一种目标重识别方法,本申请还提供了一种目标重识别装置。
下面,结合图7,对本申请一实施例提供的目标重识别装置进行详细说明。
请参阅图7,图7示出了本申请一实施例提供的目标重识别装置的示意性框图。
如图7所示,本申请一实施例提供的目标重识别装置,包括第一获取模块701、第二获取模块702和识别模块703。
第一获取模块701,用于获取多帧图像,每帧图像中均包括待识别目标;
第二获取模块702,用于将所述多帧图像中的单帧图像输入到外观特征提取网络中,得到所述待识别目标的外观特征,所述待识别目标的外观特征与所述单帧图像的颜色无关,所述外观特征提取网络用于消除图像的颜色对目标的外观的干扰;
识别模块703,用于根据所述待识别目标的外观特征,从历史图像库中确定与所述待识别目标相匹配的图像。
在一些实施例中,特征网络生成系统,用于:
获取至少一帧样本图像,每帧所述样本图像中均包括样本目标;
对每帧所述样本图像中的每个像素点进行数据增强,得到多帧增强图像;
根据所述多帧增强图像,确定所述样本目标的外观特征,所述样本目标的外观特征与所述多帧增强图像的颜色无关;
根据所述样本目标的外观特征,对原始特征提取网络进行训练,得到所 述外观特征提取网络。
在一些实施例中,特征网络生成系统,具体用于:
对每帧所述样本图像中的每个像素点分别进行各个颜色通道的颜色值交换和进行灰度值转换,得到多帧增强图像;
或者,对每帧所述样本图像中的每个像素点进行各个颜色通道的颜色值交换,得到多帧增强图像。
在一些实施例中,目标重识别装置700还包括:第三获取模块(图7中未进行示意)。
第三获取模块,用于:
从所述多帧图像中,获取时间连续的连续帧图像;
根据所述连续帧图像,确定所述待识别目标的步态特征,所述待识别目标的步态特征用于表示所述待识别目标在走路时的姿态和动作。
在一些实施例中,识别模块703,具体用于:
根据所述待识别目标的外观特征和所述待识别目标的步态特征,从所述历史图像库中确定与所述待识别目标相匹配的图像。
在一些实施例中,第三获取模块,具体用于:
将所述连续帧图像中的每帧图像的前背景区域和背景区域进行分离,所述前背景区域中包括所述待识别目标,所述背景区域中不包括所述待识别目标;
将所述连续帧图像中的每帧图像的前背景区域作为连续帧前背景图像;
根据所述连续帧前背景图像,确定所述待识别目标的步态特征。
在一些实施例中,识别模块703,具体用于:
在所述历史图像库中,确定每帧图像中的目标的外观特征和步态特征;
根据所述待识别目标的外观特征与所述每帧图像中的目标的外观特征,确定多个第一评分,每个第一评分用于指示所述待识别目标的外观特征与所述每帧图像中的目标的外观特征之间的相似度;
根据所述待识别目标的步态特征与所述每帧图像中的目标的步态特征,确定多个第二评分,每个第二评分用于指示所述待识别目标的步态特征与所述每帧图像中的目标的步态特征之间的相似度;
对所述多个第一评分和所述多个第二评分进行融合,得到多个评分;
按照评分由大到小的顺序,从所述多个评分中取前N个评分,N为正整数;
从所述历史图像库中,获取所述前N个评分对应的图像;
确定与所述待识别目标相匹配的图像为所述前N个评分对应的图像。
在一些实施例中,第二获取模块702,用于:
将所述多帧图像中的每帧图像均输入到所述外观特征提取网络中,分别得到多个外观特征,每个外观特征与对应的每帧图像的颜色无关;
对所述多个外观特征进行融合,得到所述待识别目标的外观特征。
应理解的是，本申请的装置700可以通过专用集成电路（application-specific integrated circuit，ASIC）实现，或可编程逻辑器件（programmable logic device，PLD）实现，上述PLD可以是复杂可编程逻辑器件（complex programmable logic device，CPLD）、现场可编程门阵列（field-programmable gate array，FPGA）、通用阵列逻辑（generic array logic，GAL）或其任意组合。也可以通过软件实现图1所示的目标重识别方法，当通过软件实现图1所示的目标重识别方法时，装置700及其各个模块也可以为软件模块。
图8为本申请提供的一种目标重识别设备的结构示意图。如图8所示,其中设备800包括处理器801、存储器802、通信接口803和总线804。其中,处理器801、存储器802、通信接口803通过总线804进行通信,也可以通过无线传输等其他手段实现通信。该存储器802用于存储指令,该处理器801用于执行该存储器802存储的指令。该存储器802存储程序代码8021,且处理器801可以调用存储器802中存储的程序代码8021执行图1所示的目标重识别方法。
应理解，在本申请中，处理器801可以是CPU，处理器801还可以是其他通用处理器、数字信号处理器（DSP）、专用集成电路（ASIC）、现场可编程门阵列（FPGA）或者其他可编程逻辑器件、分立门或者晶体管逻辑器件、分立硬件组件等。通用处理器可以是微处理器或者是任何常规的处理器等。
该存储器802可以包括只读存储器和随机存取存储器，并向处理器801提供指令和数据。存储器802还可以包括非易失性随机存取存储器。该存储器802可以是易失性存储器或非易失性存储器，或可包括易失性和非易失性存储器两者。其中，非易失性存储器可以是只读存储器（read-only memory，ROM）、可编程只读存储器（programmable ROM，PROM）、可擦除可编程只读存储器（erasable PROM，EPROM）、电可擦除可编程只读存储器（electrically EPROM，EEPROM）或闪存。易失性存储器可以是随机存取存储器（random access memory，RAM），其用作外部高速缓存。通过示例性但不是限制性说明，许多形式的RAM可用，例如静态随机存取存储器（static RAM，SRAM）、动态随机存取存储器（DRAM）、同步动态随机存取存储器（synchronous DRAM，SDRAM）、双倍数据速率同步动态随机存取存储器（double data rate SDRAM，DDR SDRAM）、增强型同步动态随机存取存储器（enhanced SDRAM，ESDRAM）、同步连接动态随机存取存储器（synchlink DRAM，SLDRAM）和直接内存总线随机存取存储器（direct rambus RAM，DR RAM）。
该总线804除包括数据总线之外,还可以包括电源总线、控制总线和状态信号总线等。但是为了清楚说明起见,在图8中将各种总线都标为总线804。
应理解,根据本申请的设备800可对应于本申请中的装置700,并可以对应于本申请图1所示方法中的设备,当设备800对应于图1所示方法中的设备时,设备800中的各个模块的上述和其它操作和/或功能分别为了实现图1中的由设备执行的方法的操作步骤,为了简洁,在此不再赘述。
本申请还提供了一种计算机可读存储介质，所述计算机可读存储介质存储有计算机程序，所述计算机程序被处理器执行时可实现上述各个方法实施例中的步骤。
本申请提供了一种计算机程序产品，当计算机程序产品在移动终端上运行时，使得移动终端执行时可实现上述各个方法实施例中的步骤。
应理解,上述实施例中各步骤的序号的大小并不意味着执行顺序的先后,各过程的执行顺序应以其功能和内在逻辑确定,而不应对本申请的实施过程构成任何限定。
需要说明的是,上述装置/单元之间的信息交互、执行过程等内容,由于与本申请方法实施例基于同一构思,其具体功能及带来的技术效果,具体可参见方法实施例部分,此处不再赘述。
所属领域的技术人员可以清楚地了解到,为了描述的方便和简洁,仅以上述各功能单元、模块的划分进行举例说明,实际应用中,可以根据需要而 将上述功能分配由不同的功能单元、模块完成,即将上述装置的内部结构划分成不同的功能单元或模块,以完成以上描述的全部或者部分功能。实施例中的各功能单元、模块可以集成在一个处理单元中,也可以是各个单元单独物理存在,也可以两个或两个以上单元集成在一个单元中,上述集成的单元既可以采用硬件的形式实现,也可以采用软件功能单元的形式实现。另外,各功能单元、模块的具体名称也只是为了便于相互区分,并不用于限制本申请的保护范围。上述系统中单元、模块的具体工作过程,可以参考前述方法实施例中的对应过程,在此不再赘述。
在上述实施例中,对各个实施例的描述都各有侧重,某个实施例中没有详述或记载的部分,可以参见其它实施例的相关描述。
本领域普通技术人员可以意识到,结合本文中所公开的实施例描述的各示例的单元及算法步骤,能够以电子硬件、或者计算机软件和电子硬件的结合来实现。这些功能究竟以硬件还是软件方式来执行,取决于技术方案的特定应用和设计约束条件。专业技术人员可以对每个特定的应用来使用不同方法来实现所描述的功能,但是这种实现不应认为超出本申请的范围。
在本申请所提供的实施例中,应该理解到,所揭露的装置/网络设备和方法,可以通过其它的方式实现。例如,以上所描述的装置/网络设备实施例仅仅是示意性的,例如,上述模块或单元的划分,仅仅为一种逻辑功能划分,实际实现时可以有另外的划分方式,例如多个单元或组件可以结合或者可以集成到另一个系统,或一些特征可以忽略,或不执行。另一点,所显示或讨论的相互之间的耦合或直接耦合或通讯连接可以是通过一些接口,装置或单元的间接耦合或通讯连接,可以是电性,机械或其它的形式。
上述作为分离部件说明的单元可以是或者也可以不是物理上分开的,作为单元显示的部件可以是或者也可以不是物理单元,即可以位于一个地方,或者也可以分布到多个网络单元上。可以根据实际的需要选择其中的部分或者全部单元来实现本申请方案的目的。
以上所述实施例仅用以说明本申请的技术方案,而非对其限制;尽管参照前述实施例对本申请进行了详细的说明,本领域的普通技术人员应当理解:其依然可以对前述各实施例所记载的技术方案进行修改,或者对其中部分技术特征进行等同替换;而这些修改或者替换,并不使相应技术方案的本质脱 离本申请各实施例技术方案的精神和范围,均应包含在本申请的保护范围之内。

Claims (10)

  1. 一种目标重识别方法,其特征在于,包括:
    获取多帧图像,每帧图像中均包括待识别目标;
    将所述多帧图像中的单帧图像输入到外观特征提取网络中,得到所述待识别目标的外观特征,所述待识别目标的外观特征与所述单帧图像的颜色无关,所述外观特征提取网络用于消除图像的颜色对目标的外观的干扰;
    根据所述待识别目标的外观特征,从历史图像库中确定与所述待识别目标相匹配的图像。
  2. 如权利要求1所述的方法,其特征在于,生成所述外观特征提取网络的过程,包括:
    获取至少一帧样本图像,每帧所述样本图像中均包括样本目标;
    对每帧所述样本图像中的每个像素点进行数据增强,得到多帧增强图像;
    根据所述多帧增强图像,确定所述样本目标的外观特征,所述样本目标的外观特征与所述多帧增强图像的颜色无关;
    根据所述样本目标的外观特征,对原始特征提取网络进行训练,得到所述外观特征提取网络。
  3. 如权利要求2所述的方法,其特征在于,所述对每帧所述样本图像中的每个像素点进行数据增强,得到多帧增强图像,包括:
    对每帧所述样本图像中的每个像素点分别进行各个颜色通道的颜色值交换和进行灰度值转换,得到多帧增强图像;
    或者,对每帧所述样本图像中的每个像素点进行各个颜色通道的颜色值交换,得到多帧增强图像。
  4. 如权利要求1所述的方法,其特征在于,所述方法还包括:
    从所述多帧图像中,获取时间连续的连续帧图像;
    根据所述连续帧图像,确定所述待识别目标的步态特征,所述待识别目标的步态特征用于表示所述待识别目标在走路时的姿态和动作;
    所述根据所述待识别目标的外观特征，从历史图像库中确定与所述待识别目标相匹配的图像，包括：
    根据所述待识别目标的外观特征和所述待识别目标的步态特征,从所述历史图像库中确定与所述待识别目标相匹配的图像。
  5. 如权利要求4所述的方法,其特征在于,所述根据所述连续帧图像,确定所述待识别目标的步态特征,包括:
    将所述连续帧图像中的每帧图像的前背景区域和背景区域进行分离,所述前背景区域中包括所述待识别目标,所述背景区域中不包括所述待识别目标;
    将所述连续帧图像中的每帧图像的前背景区域作为连续帧前背景图像;
    根据所述连续帧前背景图像,确定所述待识别目标的步态特征。
  6. 如权利要求4或5所述的方法,其特征在于,所述根据所述待识别目标的外观特征和所述待识别目标的步态特征,从所述历史图像库中确定与所述待识别目标相匹配的图像,包括:
    在所述历史图像库中,确定每帧图像中的目标的外观特征和步态特征;
    根据所述待识别目标的外观特征与所述每帧图像中的目标的外观特征,确定多个第一评分,每个第一评分用于指示所述待识别目标的外观特征与所述每帧图像中的目标的外观特征之间的相似度;
    根据所述待识别目标的步态特征与所述每帧图像中的目标的步态特征,确定多个第二评分,每个第二评分用于指示所述待识别目标的步态特征与所述每帧图像中的目标的步态特征之间的相似度;
    对所述多个第一评分和所述多个第二评分进行融合,得到多个评分;
    按照评分由大到小的顺序,从所述多个评分中取前N个评分,N为正整数;
    从所述历史图像库中,获取所述前N个评分对应的图像;
    确定与所述待识别目标相匹配的图像为所述前N个评分对应的图像。
  7. 如权利要求1至5中任一所述的方法，其特征在于，所述将所述多帧图像中的单帧图像输入到外观特征提取网络中，得到所述待识别目标的外观特征，包括：
    将所述多帧图像中的每帧图像均输入到所述外观特征提取网络中,分别得到多个外观特征,每个外观特征与对应的每帧图像的颜色无关;
    对所述多个外观特征进行融合,得到所述待识别目标的外观特征。
  8. 一种目标重识别装置,其特征在于,包括:
    第一获取模块,用于获取多帧图像,每帧图像中均包括待识别目标;
    第二获取模块,用于将所述多帧图像中的单帧图像输入到外观特征提取网络中,得到所述待识别目标的外观特征,所述待识别目标的外观特征与所述单帧图像的颜色无关,所述外观特征提取网络用于消除图像的颜色对目标的外观的干扰;
    识别模块,用于根据所述待识别目标的外观特征,从历史图像库中确定与所述待识别目标相匹配的图像。
  9. 一种目标重识别设备,包括存储器、处理器以及存储在所述存储器中并可在所述处理器上运行的计算机程序,其特征在于,所述处理器执行所述计算机程序时实现如权利要求1至7任一项所述的方法。
  10. 一种计算机可读存储介质,所述计算机可读存储介质存储有计算机程序,其特征在于,所述计算机程序被处理器执行时实现如权利要求1至7任一项所述的方法。
PCT/CN2022/143487 2022-04-28 2022-12-29 目标重识别方法、装置、设备和计算机可读存储介质 WO2023207197A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202210458101.8A CN114707614A (zh) 2022-04-28 2022-04-28 目标重识别方法、装置、设备和计算机可读存储介质
CN202210458101.8 2022-04-28



Also Published As

Publication number Publication date
CN114707614A (zh) 2022-07-05

