WO2022041830A1 - Pedestrian re-identification method and apparatus - Google Patents

Pedestrian re-identification method and apparatus

Info

Publication number
WO2022041830A1
Authority
WO
WIPO (PCT)
Prior art keywords
feature
head
character
target image
shoulders
Prior art date
Application number
PCT/CN2021/092020
Other languages
English (en)
French (fr)
Inventor
何凌霄
徐博强
廖星宇
刘武
梅涛
周伯文
Original Assignee
北京京东尚科信息技术有限公司
北京京东世纪贸易有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 北京京东尚科信息技术有限公司 and 北京京东世纪贸易有限公司
Priority to EP21859677.3A (published as EP4137991A4)
Priority to US18/013,795 (published as US20230334890A1)
Publication of WO2022041830A1

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00Scenes; Scene-specific elements
    • G06V20/50Context or environment of the image
    • G06V20/52Surveillance or monitoring of activities, e.g. for recognising suspicious objects
    • G06V20/53Recognition of crowd images, e.g. recognition of crowd congestion
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/0464Convolutional networks [CNN, ConvNet]
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • G06N3/084Backpropagation, e.g. using gradient descent
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/40Extraction of image or video features
    • G06V10/42Global feature extraction by analysis of the whole pattern, e.g. using frequency domain transformations or autocorrelation
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/40Extraction of image or video features
    • G06V10/44Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; Connectivity analysis, e.g. of connected components
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/40Extraction of image or video features
    • G06V10/56Extraction of image or video features relating to colour
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/70Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/70Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/74Image or video pattern matching; Proximity measures in feature spaces
    • G06V10/75Organisation of the matching processes, e.g. simultaneous or sequential comparisons of image or video features; Coarse-fine approaches, e.g. multi-scale approaches; using context analysis; Selection of dictionaries
    • G06V10/751Comparing pixel values or logical combinations thereof, or feature values having positional relevance, e.g. template matching
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/70Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/82Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks

Definitions

  • the present disclosure relates to the field of image recognition technology, and in particular to the field of computer vision technology; specifically, it relates to a pedestrian re-identification method, apparatus, electronic device, and computer-readable medium.
  • Pedestrian re-identification is a cross-camera retrieval of people through computer vision technology.
  • Existing pedestrian re-identification models mainly rely on attributes such as the color and style of pedestrians' clothes; when these attributes cannot be distinguished, for example in low-light images or when pedestrians all wear black clothes, the recognition performance is severely degraded.
  • Embodiments of the present disclosure propose a pedestrian re-identification method, apparatus, electronic device, and computer-readable medium.
  • an embodiment of the present disclosure provides a pedestrian re-identification method, the method including: collecting a target image set including at least two target images, each target image including at least one person; extracting the global features and head and shoulders features of each person in each target image in the target image set, where the global feature is the overall appearance feature and the head and shoulders feature is the feature of the head and shoulders; determining the characterizing feature of each person in each target image based on the global feature and head and shoulders feature of each person in each target image; and determining the same person in different target images based on the characterizing feature of each person in each target image.
  • determining the characterizing feature of each person in each target image based on the global feature and head and shoulders feature of each person in each target image includes: for each person in each target image, connecting the weighted features of both the global feature and the head and shoulders feature of the person to obtain the characterizing feature of the person.
  • connecting the weighted features of both the global feature and the head and shoulders feature of the person to obtain the characterizing feature of the person includes: for each person in each target image, if a reference identification feature exists in the global feature of the person, obtaining the weight value of the global feature and the weight value of the head and shoulders feature corresponding to the reference identification feature; and connecting the weighted features of the global feature and the head and shoulders feature of the person to obtain the characterizing feature of the person.
  • connecting the weighted features of both the global feature and the head and shoulders feature of the person to obtain the characterizing feature of the person further includes: if no reference identification feature exists in the global feature of the person, obtaining a preset weight value of the global feature and a preset weight value of the head and shoulders feature; and connecting the weighted features of the global feature and the head and shoulders feature of the person to obtain the characterizing feature of the person.
  • the reference identification feature includes that the brightness of the person image is less than a preset brightness value or the color of the person's clothes is black.
  • determining the characterizing feature of each person in each target image based on the global feature and head and shoulders feature of each person in each target image includes: for each person in each target image, inputting the global feature of the person into a trained weight adaptive model to obtain the weight value of the global feature of the person output by the weight adaptive model; calculating the weight value of the head and shoulders feature of the person based on the weight value of the global feature of the person; and connecting the weighted features of both the head and shoulders feature and the global feature of the person to obtain the characterizing feature of the person.
  • determining the same person in different target images based on the characterizing features of each character in each target image includes: calculating the distance between the characterizing features of any two characters in the different target images to form a distance matrix; Based on the distance matrix, the same person in different target images is determined.
  • the above-mentioned extracting the global features of each person in each target image in the target image set includes: inputting the image of each person in each target image into a trained global deep learning model to obtain the global features of each person in each target image output by the trained global deep learning model.
  • the above-mentioned extracting the head and shoulders features of each person in each target image in the target image set includes: inputting the image of each person in each target image into a trained head and shoulders localization model to obtain the head and shoulders area image in the image of each person in each target image output by the trained head and shoulders localization model; and inputting the head and shoulders area image in the image of each person in each target image into a trained head and shoulders deep learning model to obtain the head and shoulders features of each person in each target image output by the trained head and shoulders deep learning model.
  • a pedestrian re-identification device includes: a collection unit configured to collect a target image set including at least two target images, each target image including at least one person; an extraction unit configured to extract the global feature and head and shoulders feature of each person in each target image in the target image set, where the global feature is the overall appearance feature and the head and shoulders feature is the feature of the head and shoulders; a characterization unit configured to determine the characterizing feature of each person in each target image based on the global feature and head and shoulders feature of each person in each target image; and a determining unit configured to determine the same person in different target images based on the characterizing feature of each person in each target image.
  • the above-mentioned characterizing unit includes: a characterizing module configured to, for each character in each target image, connect the weighted features of both the global feature and the head and shoulders feature of the character to obtain the characterizing feature of the character.
  • the above-mentioned characterization module includes: a first acquisition sub-module configured to, for each person in each target image, acquire the weight value of the global feature and the weight value of the head and shoulders feature corresponding to the reference identification feature if a reference identification feature exists in the global feature of the person; and a first connection sub-module configured to connect the weighted features of the global feature and the head and shoulders feature of the person to obtain the characterizing feature of the person.
  • the above-mentioned characterization module further includes: a second acquisition sub-module configured to, for each person in each target image, obtain the preset weight value of the global feature and the preset weight value of the head and shoulders feature if no reference identification feature exists in the global feature of the person; and a second connection sub-module configured to connect the weighted features of the global feature and the head and shoulders feature of the person to obtain the characterizing feature of the person.
  • the reference identification feature includes that the brightness of the person image is less than a preset brightness value or the color of the person's clothes is black.
  • the above-mentioned characterization unit includes: an acquisition module configured to, for each person in each target image, input the global feature of the person into the trained weight adaptive model and obtain the weight value of the global feature of the person output by the weight adaptive model; an obtaining module configured to calculate the weight value of the head and shoulders feature of the person based on the weight value of the global feature of the person; and a connection module configured to connect the weighted features of both the head and shoulders feature and the global feature of the person to obtain the characterizing feature of the person.
  • the above determination unit includes: a calculation module configured to calculate the distance between the characterizing features of any two persons in different target images to form a distance matrix; and a determination module configured to determine the same person in different target images based on the distance matrix.
  • the above-mentioned extraction unit includes: a global extraction module configured to input the image of each person in each target image into the trained global deep learning model and obtain the global features of each person in each target image output by the trained global deep learning model.
  • the above-mentioned extraction unit includes: an image extraction module configured to input the image of each person in each target image into the trained head and shoulders localization model and obtain the head and shoulders area images in the images of each person in each target image output by the trained head and shoulders localization model; and a head and shoulders extraction module configured to input the head and shoulders area images in the images of each person in each target image into the trained head and shoulders deep learning model and obtain the head and shoulders features of each person in each target image output by the trained head and shoulders deep learning model.
  • embodiments of the present disclosure provide an electronic device, the electronic device including: one or more processors; and a storage device on which one or more programs are stored; when the one or more programs are executed by the one or more processors, the one or more processors implement the method described in any implementation of the first aspect.
  • an embodiment of the present disclosure provides a computer-readable medium on which a computer program is stored, and when the program is executed by a processor, implements the method described in any of the implementation manners of the first aspect.
  • the pedestrian re-identification method and device first collect a target image set including at least two target images; secondly, the global features and head and shoulders features of each person in each target image in the target image set are extracted; then the characterizing feature of each person in each target image is determined based on the global features and head and shoulders features of each person in each target image; finally, based on the characterizing features of each person in each target image, the same person in different target images is determined.
  • the same person in different target images is determined by the global characteristics of the person and the head and shoulders characteristics, which improves the person recognition effect.
  • FIG. 1 is an exemplary system architecture diagram to which an embodiment of the present disclosure may be applied;
  • FIG. 2 is a flowchart of an embodiment of a pedestrian re-identification method according to the present disclosure
  • FIG. 3 is a flowchart of one embodiment of a method for obtaining a character’s characterization feature according to the present disclosure
  • FIG. 4 is a flowchart of one embodiment of a method for determining characterization features of respective characters of respective target images according to the present disclosure
  • FIG. 5 is a schematic diagram of a specific implementation scenario of the pedestrian re-identification method according to the present disclosure;
  • FIG. 6 is a flowchart of one embodiment of a method for determining the same person in different target images according to the present disclosure;
  • FIG. 7 is a schematic structural diagram of an embodiment of a pedestrian re-identification device according to the present disclosure;
  • FIG. 8 is a schematic structural diagram of an electronic device suitable for implementing embodiments of the present disclosure.
  • FIG. 1 illustrates an exemplary system architecture 100 to which the method of person re-identification of the present disclosure may be applied.
  • the system architecture 100 may include terminal devices 101 , 102 , and 103 , a network 104 and a server 105 .
  • the network 104 is a medium used to provide a communication link between the terminal devices 101 , 102 , 103 and the server 105 .
  • the network 104 may include various connection types, and may typically include wireless communication links and the like.
  • the terminal devices 101, 102, and 103 interact with the server 105 through the network 104 to receive or send messages and the like.
  • Various communication client applications may be installed on the terminal devices 101 , 102 and 103 , such as instant messaging tools, email clients, and the like.
  • the terminal devices 101, 102, and 103 may be hardware or software; when the terminal devices 101, 102, and 103 are hardware, they may be user devices with communication and control functions, and the above-mentioned user devices can communicate with the server 105.
  • when the terminal devices 101, 102, and 103 are software, they can be installed in the above-mentioned user devices; they can be implemented as multiple pieces of software or software modules (for example, software or software modules for providing distributed services), or as a single piece of software or a single software module. There is no specific limitation here.
  • the server 105 may be a server that provides various services, such as an image server that provides support for the image processing systems on the terminal devices 101 , 102 , and 103 .
  • the image server can analyze and process the relevant information of each target image in the network, and feed back the processing results (such as the pedestrian re-identification strategy) to the terminal device.
  • the server may be hardware or software.
  • the server can be implemented as a distributed server cluster composed of multiple servers, or can be implemented as a single server.
  • when the server is software, it can be implemented as a plurality of software or software modules (for example, software or software modules for providing distributed services), or as a single software or software module. There is no specific limitation here.
  • the pedestrian re-identification method provided by the embodiments of the present disclosure is generally executed by the server 105 .
  • terminal devices, networks and servers in FIG. 1 are merely illustrative. There can be any number of terminal devices, networks and servers according to implementation needs.
  • FIG. 2 shows a process 200 of an embodiment of the pedestrian re-identification method according to the present disclosure, and the pedestrian re-identification method includes the following steps:
  • Step 201 collecting a target image set including at least two target images.
  • each target image includes at least one person.
  • the execution subject (such as a server, a terminal device) on which the pedestrian re-identification method runs may acquire at least two target images by real-time acquisition or by means of memory reading, and combine the acquired target images into a target image collection.
  • the person in the target image can be one or more.
  • how to determine the same person from different target images is the main work of the execution subject of the pedestrian re-identification method.
  • the target image can also be an image obtained from a surveillance video; due to the camera resolution and shooting angle, a high-quality face picture usually cannot be obtained from the target image, so the execution subject may identify the same person in different target images through means other than face recognition.
  • the target image can also be an image obtained at night or in abnormal weather (rain, fog, strong wind, etc.); in such an image, the persons usually cannot be identified by the color, style, etc. of their clothes, either because of the dark environment around them or because they are all wearing black clothes.
  • Step 202 Extract the global features and head and shoulders features of each person in each target image in the target image set.
  • the global feature is the overall feature of the appearance
  • the head and shoulders feature is the feature of the head and shoulders.
  • the image of each person in each target image may be extracted, and, based on the image of each person, the overall appearance feature of each person, that is, the global feature, is extracted; the global feature is the feature of the person's appearance in the target image and includes, for example, the color and style of the person's clothes and/or shoes.
  • the global features may include the style of the clothing and/or shoes of the person in the target image.
  • the global features of each person in each target image can be extracted through a global deep learning model or a pedestrian re-identification model, where the pedestrian re-identification model is an existing mature model such as Spindle Net or the Multiple Granularity Network (MGN).
  • the global deep learning model is a model established based on deep learning algorithms in order to achieve global feature extraction.
  • the global deep learning model can include: a ResNet50 network structure, a global average pooling layer, and a convolutional layer, as shown in Figure 5.
  • the image of each person in each target image first passes through the ResNet50 network structure to extract features, then through the global average pooling layer and a convolutional layer with a 1*1 convolution kernel in turn for feature dimension reduction, finally yielding the global feature f_g.
  • the ResNet50 network structure is a residual network with 50 layers.
  • the residual network is a deep convolutional network.
  • the residual network is easier to optimize and can improve the accuracy by increasing a considerable depth.
  • the core idea of the residual network is to mitigate the side effects of increasing depth, so that network performance can be improved by simply increasing the network depth.
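  • as an illustrative, non-limiting sketch (not part of the original disclosure), the global deep learning model described above (ResNet50 backbone, global average pooling, and a 1*1 convolutional layer producing f_g) could look roughly as follows in PyTorch; the torchvision ResNet-50 and the output dimension of 256 are assumptions for illustration only.

```python
import torch
import torch.nn as nn
import torchvision.models as models

class GlobalFeatureBranch(nn.Module):
    """Sketch of the global-feature branch: ResNet50 -> GAP -> 1x1 conv -> f_g."""
    def __init__(self, feat_dim: int = 256):
        super().__init__()
        backbone = models.resnet50(weights=None)
        # keep all layers up to (but not including) the final avgpool/fc
        self.backbone = nn.Sequential(*list(backbone.children())[:-2])
        self.gap = nn.AdaptiveAvgPool2d(1)            # global average pooling
        self.reduce = nn.Conv2d(2048, feat_dim, 1)    # 1x1 conv for dimension reduction

    def forward(self, person_image: torch.Tensor) -> torch.Tensor:
        x = self.backbone(person_image)    # (N, 2048, H/32, W/32)
        x = self.gap(x)                    # (N, 2048, 1, 1)
        return self.reduce(x).flatten(1)   # (N, feat_dim), the global feature f_g
```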
  • extracting the global features of each person in each target image in the target image set includes:
  • the images of each person in each target image are input into the trained global deep learning model, and the global features of each person in each target image output by the trained global deep learning model are obtained.
  • the trained global deep learning model is a pre-trained deep learning model.
  • to train the global deep learning model, multiple person images are first collected as training samples and the preset global features corresponding to the person images are determined; the error of the global deep learning model is then determined from the difference between the global features detected by the model for the training samples and the preset global features corresponding to the training samples, and the parameters of the global deep learning model are iteratively adjusted by means of error back propagation so that the error gradually decreases; when the error meets a preset convergence condition, the parameter adjustment can be stopped, and a trained global deep learning model is obtained.
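  • a minimal sketch of the error back propagation training loop described above is given below, assuming PyTorch; the choice of mean-squared-error loss against the preset global features, the Adam optimizer, and the learning rate are illustrative assumptions and are not specified in the disclosure.

```python
import torch

def train_global_model(model, loader, epochs: int = 10, lr: float = 1e-3):
    """Iteratively adjust model parameters by error back propagation."""
    optimizer = torch.optim.Adam(model.parameters(), lr=lr)
    criterion = torch.nn.MSELoss()
    for _ in range(epochs):
        for person_images, preset_global_features in loader:
            optimizer.zero_grad()
            detected = model(person_images)                     # detected global features
            loss = criterion(detected, preset_global_features)  # model error
            loss.backward()                                     # error back propagation
            optimizer.step()                                    # parameter adjustment
    return model
```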
  • the method for extracting global features implemented by this optional implementation adopts a global deep learning model to perform global feature extraction on images of each person in the target image, which improves the efficiency of global feature extraction and ensures the reliability of global feature extraction.
  • the execution subject may extract the image of each person in each target image and, based on the image of each person, determine the head and shoulders area image in the image of each person; the features of the head and shoulders, that is, the head and shoulders features, are then extracted from the head and shoulders area images. The head and shoulders features are material or attribute features of the person related to the head or shoulders, for example gender, face, hairstyle, glasses, shoulder shape, scarf, neck thickness, etc. It should be noted that for low-quality target images, the obtained head and shoulders features of the person may include gender, hairstyle, glasses, shoulder shape, scarf, etc.
  • the head and shoulders features of each person in each target image can be extracted through the head and shoulders localization model and the head and shoulders deep learning model.
  • the head and shoulders positioning model can include: a ResNet18 network structure and a fully connected layer. After the training of the head and shoulders positioning model is completed, the image of each person in each target image is first input into the ResNet18 network structure to extract features, and then the coordinates of the rectangular frame of the head and shoulders area in the image of each person in each target image are output through the fully connected layer.
  • the coordinates of the rectangular frame of the head and shoulders area include: coordinates of the vertices of the upper left corner and the lower right corner of the rectangular frame, and the head and shoulders area image in the image of each person in each target image can be obtained through the coordinates of the rectangular frame of the head and shoulders area.
  • the ResNet18 network structure is a residual network with 18 layers.
  • the residual network is a deep convolutional network. The residual network is easier to optimize and can improve the accuracy by increasing a considerable depth.
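  • the head and shoulders positioning model described above (a ResNet18 network structure followed by a fully connected layer that outputs the rectangular-frame coordinates) could be sketched as follows; this is an assumption-laden illustration in PyTorch, and the cropping helper is a hypothetical utility, not part of the disclosure.

```python
import torch
import torch.nn as nn
import torchvision.models as models

class HeadShoulderLocator(nn.Module):
    """Sketch: ResNet18 features -> fully connected layer -> (x1, y1, x2, y2)."""
    def __init__(self):
        super().__init__()
        backbone = models.resnet18(weights=None)
        # replace the classification head with a 4-dim regression head for the box corners
        backbone.fc = nn.Linear(backbone.fc.in_features, 4)
        self.net = backbone

    def forward(self, person_image: torch.Tensor) -> torch.Tensor:
        # coordinates of the upper-left and lower-right vertices of the head-shoulders box
        return self.net(person_image)

def crop_head_shoulders(person_image_np, box):
    """Crop the head and shoulders area image using the predicted box (pixel coordinates)."""
    x1, y1, x2, y2 = [int(v) for v in box]
    return person_image_np[y1:y2, x1:x2]
```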
  • the head and shoulders deep learning model can include: a ResNet50 network structure, three attention modules, three generalized average pooling layers, and three convolutional layers, as shown in Figure 5.
  • the head and shoulders area images in the images of each person in each target image output by the head and shoulders positioning model are first processed by ResNet50 to obtain the original head and shoulders features, and the obtained original head and shoulders features are divided horizontally into three blocks; each of the three horizontally divided features is weighted on its high-response parts in the spatial dimension and the channel dimension by its corresponding attention module, then passed in turn through a generalized average pooling layer and a convolutional layer with a 1*1 convolution kernel for feature dimension reduction, and the three reduced features are connected in the channel dimension to obtain the head and shoulders feature f_h.
  • the attention module includes two modules, one is an attention module in spatial dimension, and the other is an attention module in channel dimension.
  • the attention modules of spatial dimension and channel dimension strengthen the weights of high-response spatial regions and channels respectively, so that the features learned by the network are more concentrated in meaningful and discriminative parts, and the distinguishability and robustness of features are increased.
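  • the head and shoulders deep learning model described above (ResNet50, a horizontal split into three blocks, spatial and channel attention, generalized average pooling, 1*1 convolutions, and channel-wise concatenation into f_h) might be sketched as below; the simplified attention blocks, the GeM exponent p=3, and the 256-dimensional reduction are illustrative assumptions.

```python
import torch
import torch.nn as nn
import torchvision.models as models

class SimpleAttention(nn.Module):
    """Simplified re-weighting of high-response channels and spatial positions."""
    def __init__(self, channels: int):
        super().__init__()
        self.channel_gate = nn.Sequential(
            nn.AdaptiveAvgPool2d(1), nn.Conv2d(channels, channels, 1), nn.Sigmoid())
        self.spatial_gate = nn.Sequential(nn.Conv2d(channels, 1, 1), nn.Sigmoid())

    def forward(self, x):
        x = x * self.channel_gate(x)   # channel-dimension attention
        x = x * self.spatial_gate(x)   # spatial-dimension attention
        return x

def gem_pool(x, p: float = 3.0, eps: float = 1e-6):
    """Generalized average (GeM) pooling over the spatial dimensions."""
    return x.clamp(min=eps).pow(p).mean(dim=(-2, -1), keepdim=True).pow(1.0 / p)

class HeadShoulderBranch(nn.Module):
    def __init__(self, feat_dim: int = 256):
        super().__init__()
        backbone = models.resnet50(weights=None)
        self.backbone = nn.Sequential(*list(backbone.children())[:-2])
        self.attn = nn.ModuleList([SimpleAttention(2048) for _ in range(3)])
        self.reduce = nn.ModuleList([nn.Conv2d(2048, feat_dim, 1) for _ in range(3)])

    def forward(self, head_shoulder_image: torch.Tensor) -> torch.Tensor:
        x = self.backbone(head_shoulder_image)   # (N, 2048, H', W')
        stripes = torch.chunk(x, 3, dim=2)       # horizontal split into three blocks
        parts = []
        for stripe, attn, reduce in zip(stripes, self.attn, self.reduce):
            s = attn(stripe)                     # attention weighting
            s = gem_pool(s)                      # (N, 2048, 1, 1)
            parts.append(reduce(s).flatten(1))   # 1x1 conv dimension reduction
        return torch.cat(parts, dim=1)           # head and shoulders feature f_h
```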
  • extracting the head and shoulders features of each person in each target image in the target image set includes: inputting the image of each person in each target image into a trained head and shoulders localization model to obtain the head and shoulders area images in the images of each person in each target image output by the trained head and shoulders localization model; and inputting the head and shoulders area images in the images of each person in each target image into a trained head and shoulders deep learning model to obtain the head and shoulders features of each person in each target image output by the trained head and shoulders deep learning model.
  • the trained head and shoulders localization model is used to locate the head and shoulder region images in the images of each person in each target image, which is a pre-trained model.
  • to train the head and shoulders positioning model, multiple person images are first collected as training samples and the coordinates of the preset rectangular frame of the head and shoulders area corresponding to each person image are determined; the error of the head and shoulders positioning model is then determined from the difference between the coordinates of the head and shoulders rectangular frame detected by the model for the training samples and the coordinates of the preset rectangular frame corresponding to the training samples, and the parameters of the head and shoulders positioning model are iteratively adjusted by means of error back propagation so that the error gradually decreases; when the error meets a preset convergence condition, the parameter adjustment can be stopped, and the trained head and shoulders positioning model is obtained.
  • the trained head and shoulders deep learning model is a pre-trained deep learning model.
  • to train the head and shoulders deep learning model, multiple head and shoulders area images are first collected as training samples and the preset head and shoulders features corresponding to the head and shoulders area images are determined; the error of the head and shoulders deep learning model is then determined from the difference between the head and shoulders features detected by the model for the training samples and the preset head and shoulders features corresponding to the training samples, and the parameters of the head and shoulders deep learning model are iteratively adjusted by means of error back propagation so that the error gradually decreases; when the error meets a preset convergence condition, the parameter adjustment can be stopped, and the trained head and shoulders deep learning model is obtained.
  • in this way, the head and shoulders positioning model is used to locate the image of each person in the target image and obtain the head and shoulders area image in the image of each person; the head and shoulders deep learning model then extracts the head and shoulders features from the head and shoulders area image, which improves the extraction efficiency of the head and shoulders features and ensures the reliability of the extraction of the head and shoulders features.
  • Step 203 based on the global features and head and shoulders features of each person in each target image, determine the characterizing feature of each person in each target image.
  • the characterization feature is a feature that expresses the substantive characteristics of each person, and each person in each target image can be distinguished by the characterization feature.
  • determining the characterizing feature of each person in each target image based on the global feature and head and shoulders feature of each person in each target image may include: directly connecting the global feature and the head and shoulders feature together to obtain the characterizing feature.
  • when each target image is blurry, each person in the target image is wearing black clothes, or the global features of each person cannot be identified from the target image, determining the characterizing feature of each person in each target image based on the global features and head and shoulders features of each person in each target image includes: for each person in each target image, connecting the weighted features of both the global feature and the head and shoulders feature of the person to obtain the characterizing feature of the person.
  • the weighted feature is a feature obtained by multiplying the feature with its corresponding feature weight value respectively.
  • the feature weight value corresponding to each feature can be obtained through various channels (for example, a feature and feature weight value correspondence table or a trained weight model).
  • the feature and feature weight value correspondence table may be a one-to-one correspondence table between features and feature weight values made in advance by an operator.
  • the trained weight model is a pre-trained feature and feature weight value relationship model, which can output different feature weight values for different features.
  • the weighted feature of the global feature is the feature obtained by multiplying the global feature by the weight value of the global feature
  • the weighted feature of the head and shoulders feature is the feature obtained by multiplying the head and shoulders feature by the weight value of the head and shoulders feature.
  • the characteristic feature of each character can be obtained quickly and conveniently. And the obtained characterization features can effectively represent the characteristics of people, and improve the reliability of subsequent pedestrian re-identification.
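  • a minimal sketch of forming the characterizing feature from the weighted features is shown below; the example weight values are placeholders for values read from a feature and feature weight value correspondence table or output by a trained weight model.

```python
import torch

def characterizing_feature(f_g: torch.Tensor, f_h: torch.Tensor,
                           w_g: float, w_h: float) -> torch.Tensor:
    """Multiply each feature by its weight value and connect (concatenate) them."""
    return torch.cat([w_g * f_g, w_h * f_h], dim=-1)

# e.g. when a reference identification feature is present, the head and shoulders
# feature could be emphasized (weights are illustrative only):
# f = characterizing_feature(f_g, f_h, w_g=0.3, w_h=0.7)
```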
  • Step 204 Determine the same person in different target images based on the characteristic features of each person in each target image.
  • the characterizing features of the persons in different target images in the target image set are compared for similarity, and the persons corresponding to two characterizing features whose similarity is greater than a preset similarity are determined to be the same person; for example, if the similarity between any two characterizing features is more than 80%, it is determined that the two characterizing features are the same feature, and the persons with the two characterizing features are the same person.
  • the pedestrian re-identification method first collects a target image set including at least two target images; secondly, it extracts the global features and head and shoulders features of each person in each target image in the target image set; then, based on the global features and head and shoulders features of each person in each target image, it determines the characterizing feature of each person in each target image; finally, based on the characterizing features of each person in each target image, it determines the same person in different target images. By determining the same person in different target images from the person's global features and head and shoulders features, the effect of person recognition is improved.
  • FIG. 3 shows a process 300 of one embodiment of the method for obtaining a person's characterizing feature according to the present disclosure.
  • the method for determining the characterization features of each person in each target image includes the following steps:
  • Step 301 for each person in each target image, determine whether there is a reference identification feature in the global feature of the person; if the determination result is yes, then step 302 is executed. If the judgment result is no, then step 305 is executed.
  • the reference identification feature is a feature, within the global feature, that characterizes the situation in which the persons in the target image cannot be distinguished by the global feature.
  • for example, when each person in the target image is dressed in black, or the environment where each person in the target image is located is dark as a whole, it is impossible to effectively distinguish the color, style, etc. of each person's clothing.
  • the head and shoulders feature includes a lot of information on the head or shoulders that can distinguish persons, such as gender, face, hairstyle, glasses, etc., and the head and shoulders features of each person can still be collected even when each person in the target image is dressed in black or the target image is dark overall.
  • the reference identification feature includes: the brightness of the person image is less than a preset brightness value or the color of the person's clothes is black.
  • the preset brightness value is a brightness value set in advance; when the brightness of the person image in a target image is less than the preset brightness value, it means that the image is dark, such as a low-light image, and the global features of the person cannot be effectively distinguished.
  • setting the reference identification feature as the brightness of the person image being less than the preset brightness value or the color of the person's clothes being black provides an effective distinguishing condition for obtaining the weight values of the head and shoulders feature and the global feature.
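  • purely as a hypothetical illustration of how such a reference identification feature could be checked (the disclosure does not prescribe an implementation), the mean brightness of the person image can be compared with the preset brightness value, and a dark-clothing heuristic can approximate the judgment that the clothes are black; the thresholds below are arbitrary assumptions.

```python
import numpy as np

def has_reference_identification_feature(person_image: np.ndarray,
                                         brightness_threshold: float = 60.0,
                                         black_threshold: float = 50.0) -> bool:
    """person_image: H x W x 3 array with values in [0, 255]."""
    gray = person_image.mean(axis=2)                          # rough per-pixel luminance
    too_dark = gray.mean() < brightness_threshold             # brightness below preset value
    torso = gray[gray.shape[0] // 4: 3 * gray.shape[0] // 4]  # rough clothing region
    clothes_black = torso.mean() < black_threshold            # clothes appear black
    return bool(too_dark or clothes_black)
```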
  • step 302 the weight value of the global feature and the weight value of the head and shoulders feature corresponding to the reference identification feature are obtained, and then step 303 is executed.
  • the execution subject on which the pedestrian re-identification method runs can obtain the weight value of the global feature and the weight value of the head and shoulders feature corresponding to the reference identification feature by looking up a correspondence table between reference identification features and feature weight values, or by using a trained weight model.
  • the correspondence table between the reference identification feature and the feature weight value may be a table of correspondence between the reference identification feature and the weight value of the global feature and the weight value of the head and shoulders feature made in advance by the operator. By looking up the correspondence table between the reference recognition feature and the feature weight value, the weight value of the global feature and the weight value of the head and shoulders feature under a certain reference recognition feature can be obtained.
  • when a reference identification feature exists, the global features cannot effectively identify people, but the head and shoulders features can; therefore, the weight value of the global feature corresponding to the reference identification feature is smaller than the weight value of the head and shoulders feature corresponding to the reference identification feature. The weight value of the global feature and the weight value of the head and shoulders feature corresponding to the reference identification feature can change as the global feature changes: when the global feature cannot distinguish the person normally, its influence on the characterizing feature is kept small.
  • the weight value of the global feature corresponding to the reference recognition feature is smaller than the weight value of the head and shoulder feature corresponding to the reference recognition feature, which can effectively highlight the proportion of the head and shoulder feature in all the features of the character.
  • Step 303 connect the weighted features of the global feature and the head-shoulder feature of the character to obtain the characterizing feature of the character, and then execute step 304 .
  • the weighted feature of the global feature is a feature obtained by multiplying the weight value of the global feature and the global feature;
  • the weighted feature of the head and shoulders feature is a feature obtained by multiplying the weight value of the head and shoulders feature by the head and shoulders feature.
  • in this way, the weight value of the global feature and the weight value of the head and shoulders feature corresponding to the reference identification feature are obtained, and the weighted global feature and the weighted head and shoulders feature of the person are connected to obtain the characterizing feature of the person; therefore, the salience of the head and shoulders feature can be highlighted, the head and shoulders feature can be used as the key component reflecting the characteristics of the person, the effect of pedestrian re-identification is ensured, and the identification effectiveness is improved.
  • Step 304 end.
  • step 305 the preset weight value of the global feature and the weight value of the head and shoulders feature are acquired, and then step 306 is performed.
  • the preset weight value of the global feature and the weight value of the head and shoulders feature are two preset fixed values at this time, for example, both are 0.5.
  • Step 306 connect the weighted features of the global feature and the head and shoulders feature of the character to obtain the characterizing feature of the character, and then step 304 is executed.
  • in this way, the preset weight value of the global feature and the preset weight value of the head and shoulders feature are acquired, and the weighted global feature and the weighted head and shoulders feature of the person are connected to obtain the characterizing feature of the person, thereby keeping a balance between the global feature and the head and shoulders feature, so that the obtained characterizing feature can effectively represent the identifying features of the person and improve the identification efficiency of pedestrian re-identification.
  • in summary, when a reference identification feature exists in the global feature of a person, the weight value of the global feature and the weight value of the head and shoulders feature corresponding to the reference identification feature are obtained, and the weighted global feature and the weighted head and shoulders feature of the person are connected to obtain the characterizing feature of the person; otherwise, the preset weight values are used in the same way.
  • FIG. 4 shows a flow 400 of an embodiment of the method for determining the characterizing features of each person in each target image of the present disclosure.
  • the method for determining the characterization features of each person in each target image includes the following steps:
  • Step 401 for each character in each target image, input the global feature of the character into the trained weight adaptive model, and obtain the weight value of the global feature of the character output by the weight adaptive model.
  • the input of the weight adaptive model is the global feature f g
  • the output of the weight adaptive model is the weight value of the global feature.
  • the weight adaptive model may include two fully connected layers connected in sequence: the first fully connected layer judges, based on the global feature, whether the reference identification feature is present; the second fully connected layer receives the identification result of the first fully connected layer and outputs the weight value of the global feature.
  • the trained weight adaptive model may be a pre-trained deep learning model.
  • to train the weight adaptive model, multiple global features and the head and shoulders feature corresponding to each global feature are first collected, and the collected global features are used as training samples (the global features may be those obtained during the training of the global deep learning model in the preceding embodiment, and the head and shoulders features corresponding to each global feature may be those obtained during the training of the head and shoulders deep learning model in the preceding embodiment); the preset characterizing feature corresponding to each global feature and the preset judgment result corresponding to the output of the first fully connected layer are determined; a characterizing feature is calculated from the weight value of the global feature predicted by the weight adaptive model for the training sample and the head and shoulders feature corresponding to that global feature; the error of the weight adaptive model is determined from the difference between the calculated characterizing feature and the preset characterizing feature corresponding to the training sample and the difference between the judgment result of the first fully connected layer and the preset judgment result corresponding to the training sample; and the parameters of the weight adaptive model are iteratively adjusted by means of error back propagation so that the error gradually decreases. When the error meets a preset convergence condition, the parameter adjustment can be stopped, and the trained weight adaptive model is obtained.
  • the weight adaptation module may assign different and optimal weights to the global feature and the head and shoulders feature respectively according to the global feature of the input image. For example, the weight adaptive module first judges whether the input image belongs to "low light, people wearing black clothes" through global features. If it does, then the weight adaptive module will give a larger weight value to the head and shoulders feature.
  • Step 402 Calculate the weight value of the head and shoulders feature of the character based on the weight value of the global feature of the character.
  • the sum of the weight value of the head and shoulders feature and the weight value of the global feature is 1. After the weight value of the global feature is obtained, the weight value of the head and shoulders feature can be obtained by subtracting the weight value of the global feature from 1.
  • Step 403 connecting the weighted features of the head and shoulders feature of the character and the global feature of the character to obtain the characterizing feature of the character.
  • the character's characteristic feature is f.
  • the global feature f_g and the head and shoulders feature f_h are fused to obtain the person's characterizing feature f.
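  • a hedged sketch of the weight adaptive model and the fusion into the characterizing feature f follows; the hidden width, the ReLU and sigmoid activations, and the concatenation-based fusion are assumptions used only to make the example runnable.

```python
import torch
import torch.nn as nn

class WeightAdaptiveModel(nn.Module):
    """Two fully connected layers: the first judges the reference identification
    feature from the global feature, the second outputs the global-feature weight."""
    def __init__(self, global_dim: int, hidden_dim: int = 64):
        super().__init__()
        self.fc1 = nn.Linear(global_dim, hidden_dim)
        self.fc2 = nn.Linear(hidden_dim, 1)

    def forward(self, f_g: torch.Tensor) -> torch.Tensor:
        return torch.sigmoid(self.fc2(torch.relu(self.fc1(f_g))))  # w_g in (0, 1)

def fuse(f_g: torch.Tensor, f_h: torch.Tensor, model: WeightAdaptiveModel) -> torch.Tensor:
    w_g = model(f_g)     # weight value of the global feature
    w_h = 1.0 - w_g      # weight value of the head and shoulders feature (the two sum to 1)
    return torch.cat([w_g * f_g, w_h * f_h], dim=-1)  # characterizing feature f
```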
  • the method for determining the characterizing feature of each person in each target image uses the weight adaptive module to assign a weight value to the global feature, obtains the weight value of the head and shoulders feature from the weight value of the global feature, and connects the weighted features of both the head and shoulders feature and the global feature of the person to obtain the characterizing feature of the person. Therefore, the weight adaptive module realizes the adaptive distribution of the weight values of the head and shoulders feature and the global feature, which improves the adaptability of the characterizing feature and improves the recognition effect of pedestrian re-identification.
  • FIG. 6 illustrates a process 600 of the present disclosure for determining the same person in different target images.
  • the method for determining the same person in different target images includes the following steps:
  • Step 601 Calculate the distance between the characterizing features of any two characters in different target images to form a distance matrix.
  • for example, the Euclidean distance between the characterizing features of any two persons can be calculated by the Euclidean distance formula; another distance measure between the characterizing features of any two persons may also be used.
  • the position of each person in each target image in the target image set is marked, and a distance matrix is formed based on the position of each person.
  • the distance between two characterizing features is represented in the distance matrix as d_ij, where i and j are the positions of the persons corresponding to the two characterizing features in different target images; for example, i being 1103 indicates the sorting position of the third person in the 11th target image.
  • Step 602 based on the distance matrix, determine the same person in different target images.
  • whether the characters in different target images are the same is related to the distance between the characterizing features of the characters.
  • when the distance between the characterizing features of two persons in the distance matrix is within the preset distance value, it can be determined that the two persons are the same person.
  • alternatively, the distance values between the characterizing features of any two persons in the distance matrix can be sorted from small to large, and the persons corresponding to the distance values in the top preset positions (for example, the top 5 positions) are selected as candidates for the same person; among these candidates, the smaller the distance value, the higher the similarity of the persons and the greater the possibility that they are considered to be the same person.
  • the method for determining the same person in different target images calculates the distance between the characterizing features of any two persons in different target images to form a distance matrix and, based on the distance matrix, determines the same person in different target images; therefore, the distance between the characterizing features of the persons, and thus the same person in the target images, can be determined conveniently and quickly through the distance matrix, which improves the efficiency of pedestrian re-identification.
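  • a brief sketch of steps 601 and 602 is given below: pairwise Euclidean distances d_ij between the characterizing features of persons from two target images form the distance matrix, and persons whose distance falls below a preset distance value are matched; the threshold value is an illustrative assumption.

```python
import torch

def match_same_persons(features_a: torch.Tensor, features_b: torch.Tensor,
                       max_distance: float = 1.0):
    """features_a: (Na, D) characterizing features from one image;
    features_b: (Nb, D) characterizing features from another image."""
    dist = torch.cdist(features_a, features_b, p=2)   # distance matrix, entry (i, j) = d_ij
    matches = []
    for i in range(dist.shape[0]):
        j = int(torch.argmin(dist[i]))                # closest candidate
        if float(dist[i, j]) < max_distance:          # within the preset distance value
            matches.append((i, j, float(dist[i, j])))
    return matches
```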
  • the present disclosure provides an embodiment of a pedestrian re-identification device, which corresponds to the method embodiment shown in FIG. 2 , and the device can be specifically applied in various electronic devices.
  • an embodiment of the present disclosure provides a pedestrian re-identification device 700 .
  • the device 700 includes a collection unit 701 , an extraction unit 702 , a characterizing unit 703 and a determination unit 704 .
  • the collection unit 701 may be configured to collect a target image set including at least two target images, each target image including at least one person.
  • the extraction unit 702 may be configured to extract global features and head and shoulders features of each person in each target image in the target image set, wherein the global feature is the overall appearance feature, and the head and shoulders feature is the features of the head and shoulders.
  • the characterizing unit 703 may be configured to determine the characterizing feature of each character in each target image based on the global feature and head and shoulders feature of each character in each target image.
  • the determining unit 704 may be configured to determine the same person in different target images based on the characteristic features of each person in each target image.
  • for the specific processing of the collection unit 701, the extraction unit 702, the characterization unit 703, and the determination unit 704 and the technical effects they bring, reference may be made respectively to step 201, step 202, step 203, and step 204 in the embodiment corresponding to FIG. 2.
  • the above-mentioned characterization unit 703 includes: a characterization module (not shown in the figure).
  • the characterization module may be configured to, for each character in each target image, connect the weighted features of both the global feature and the head and shoulders feature of the character to obtain the characterizing feature of the character.
  • the above-mentioned characterization module further includes: a first acquisition sub-module (not shown in the figure) and a first connection sub-module (not shown in the figure).
  • the first acquisition sub-module can be configured to, for each person in each target image, obtain the weight value of the global feature and the weight value of the head and shoulders feature corresponding to the reference identification feature if a reference identification feature exists in the global feature of the person.
  • the first connecting sub-module may be configured to connect the weighted features of both the global feature and the head and shoulders feature of the character to obtain the characterizing feature of the character.
  • the above-mentioned characterization module further includes: a second acquisition sub-module (not shown in the figure) and a second connection sub-module (not shown in the figure).
  • the second acquisition sub-module is configured to, for each person in each target image, obtain the preset weight value of the global feature and the preset weight value of the head and shoulders feature if no reference identification feature exists in the global feature of the person; the second connection sub-module is configured to connect the weighted features of both the global feature and the head and shoulders feature of the person to obtain the characterizing feature of the person.
  • the above-mentioned reference identification feature includes that the brightness of the person image is less than a preset brightness value or the color of the person's clothes is black.
  • the above-mentioned characterizing unit 703 includes: an acquisition module (not shown in the figure), an obtaining module (not shown in the figure), and a connecting module (not shown in the figure).
  • the acquisition module can be configured to input the global feature of the character into the trained weight adaptive model for each character of each target image, and obtain the weight value of the global feature of the character output by the weight adaptive model.
  • the obtaining module may be configured to calculate the weight value of the head and shoulders feature of the character based on the weight value of the global feature of the character.
  • the connecting module is configured to connect the weighted features of the head and shoulders feature of the character and the global feature of the character to obtain the characterizing feature of the character.
  • the above determination unit 704 includes: a calculation module (not shown in the figure) and a determination module (not shown in the figure).
  • the calculation module can be configured to calculate the distance between the characterizing features of any two characters in different target images to form a distance matrix.
  • the determination module may be configured to determine the same person in different target images based on the distance matrix.
  • the above-mentioned extraction unit 702 includes: a global extraction module (not shown in the figure).
  • the global extraction module can be configured to input the images of each person in each target image into the trained global deep learning model, and obtain the global features of each person in each target image output by the trained global deep learning model.
  • the above-mentioned extraction unit 702 includes: an image extraction module (not shown in the figure) and a head and shoulders extraction module (not shown in the figure).
  • the image extraction module can be configured to input the images of each person in each target image into the trained head and shoulders localization model, and obtain the head and shoulders region images in the images of each person in each target image output by the trained head and shoulders localization model.
  • the head and shoulders extraction module can be configured to input the head and shoulders region images in the images of each person in each target image into the trained head and shoulders deep learning model, and obtain the head and shoulders features of each person in each target image output by the trained head and shoulders deep learning model.
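The sketch below strings these two modules together: a localization model (ResNet-18 plus a fully-connected layer, as described elsewhere in this disclosure) predicts a head and shoulders box, the region is cropped, and a feature model built on ResNet-50 extracts the head and shoulders feature f_h. The normalized box format, the crop size, and the omission of the attention and generalized-mean pooling details are simplifying assumptions.

```python
import torch
import torch.nn as nn
from torchvision import models

class HeadShoulderLocalizer(nn.Module):
    """Sketch: ResNet-18 features plus a fully-connected layer that outputs a
    head-and-shoulders box as normalized (x1, y1, x2, y2) corner coordinates."""
    def __init__(self):
        super().__init__()
        backbone = models.resnet18(weights=None)
        backbone.fc = nn.Linear(backbone.fc.in_features, 4)
        self.net = backbone

    def forward(self, person_images: torch.Tensor) -> torch.Tensor:
        return torch.sigmoid(self.net(person_images))   # normalized box coordinates, shape [batch, 4]

def crop_head_shoulders(image: torch.Tensor, box: torch.Tensor) -> torch.Tensor:
    """Crop one image (C, H, W) with a normalized box; resizing to 128x128 is an assumption."""
    _, h, w = image.shape
    x1, y1, x2, y2 = (box * torch.tensor([w, h, w, h])).long().tolist()
    crop = image[:, y1:max(y2, y1 + 1), x1:max(x2, x1 + 1)]
    return nn.functional.interpolate(crop[None], size=(128, 128))[0]

class HeadShoulderFeatureNet(nn.Module):
    """Sketch of the head-and-shoulders feature branch built on ResNet-50; the
    attention modules and generalized-mean pooling are omitted for brevity."""
    def __init__(self, out_dim: int = 256):
        super().__init__()
        backbone = models.resnet50(weights=None)
        self.features = nn.Sequential(*list(backbone.children())[:-2])
        self.pool = nn.AdaptiveAvgPool2d(1)
        self.reduce = nn.Conv2d(2048, out_dim, kernel_size=1)

    def forward(self, crops: torch.Tensor) -> torch.Tensor:
        return self.reduce(self.pool(self.features(crops))).flatten(1)   # f_h, shape [batch, out_dim]
```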
  • in the pedestrian re-identification apparatus provided by the embodiments of the present disclosure, first, the acquisition unit 701 acquires a target image set including at least two target images; secondly, the extraction unit 702 extracts the global features and head and shoulders features of each person in each target image in the target image set; then, the characterizing unit 703 determines the characterizing feature of each person in each target image based on the global feature and head and shoulders feature of each person in each target image; finally, the determining unit 704 determines the same person in different target images based on the characterizing feature of each person in each target image. Thus, in the process of pedestrian re-identification, the same person in different target images is determined from both the global feature and the head and shoulders feature of the person, which improves the person recognition effect.
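Putting the units together, the following sketch mirrors the acquisition, extraction, characterization, and determination flow as plain Python functions; `extract_global`, `extract_head_shoulder`, and `fuse` are hypothetical stand-ins for the trained models discussed above, and the distance threshold is an assumption.

```python
import numpy as np

def re_identify(target_images, extract_global, extract_head_shoulder, fuse,
                max_distance: float = 0.5):
    """Sketch of the overall flow for a target image set:
    1) extract global and head-and-shoulders features per person,
    2) fuse them into characterizing features,
    3) match persons across different target images via a distance matrix.
    Each item of target_images is a non-empty list of person crops; the three
    callables stand in for the trained models described above."""
    per_image_feats = []
    for persons in target_images:
        feats = [fuse(extract_global(p), extract_head_shoulder(p)) for p in persons]
        per_image_feats.append(np.stack(feats))

    matches = []
    for a in range(len(per_image_feats)):
        for b in range(a + 1, len(per_image_feats)):
            # pairwise Euclidean distances between persons of images a and b
            d = np.linalg.norm(per_image_feats[a][:, None] - per_image_feats[b][None], axis=-1)
            for i, j in zip(*np.where(d < max_distance)):   # assumed threshold
                matches.append((a, int(i), b, int(j)))       # same person across images a and b
    return matches
```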
  • referring now to FIG. 8, a schematic structural diagram of an electronic device 800 suitable for implementing embodiments of the present disclosure is shown.
  • the electronic device 800 may include a processing device (e.g., a central processing unit, a graphics processor, etc.) 801, which may execute various appropriate actions and processes according to a program stored in a read-only memory (ROM) 802 or a program loaded from a storage device 808 into a random access memory (RAM) 803. The RAM 803 also stores various programs and data required for the operation of the electronic device 800.
  • the processing device 801, the ROM 802, and the RAM 803 are connected to each other through a bus 804.
  • An input/output (I/O) interface 805 is also connected to bus 804 .
  • the following devices can be connected to the I/O interface 805: input devices 806 including, for example, a touch screen, a touchpad, a keyboard, a mouse, etc.; output devices 807 including, for example, a liquid crystal display (LCD), speakers, vibrators, etc.; storage devices 808 including, for example, magnetic tapes, hard disks, etc.; and communication devices 809.
  • Communication means 809 may allow electronic device 800 to communicate wirelessly or by wire with other devices to exchange data. While FIG. 8 shows an electronic device 800 having various means, it should be understood that not all of the illustrated means are required to be implemented or provided. More or fewer devices may alternatively be implemented or provided. Each block shown in FIG. 8 can represent one device, and can also represent multiple devices as required.
  • embodiments of the present disclosure include a computer program product comprising a computer program carried on a computer-readable medium, the computer program containing program code for performing the method illustrated in the flowchart.
  • the computer program may be downloaded and installed from the network via the communication device 809, or from the storage device 808, or from the ROM 802.
  • when the computer program is executed by the processing device 801, the above-described functions defined in the methods of the embodiments of the present disclosure are executed.
  • the computer-readable medium of the embodiments of the present disclosure may be a computer-readable signal medium or a computer-readable storage medium, or any combination of the above two.
  • the computer-readable storage medium can be, for example, but not limited to, an electrical, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the above. More specific examples of computer-readable storage media may include, but are not limited to: an electrical connection with one or more wires, a portable computer disk, a hard disk, a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.
  • a computer-readable storage medium may be any tangible medium that contains or stores a program that can be used by or in conjunction with an instruction execution system, apparatus, or device.
  • a computer-readable signal medium may include a data signal in baseband or propagated as part of a carrier wave, carrying computer-readable program code therein. Such propagated data signals may take a variety of forms, including but not limited to electromagnetic signals, optical signals, or any suitable combination of the foregoing.
  • a computer-readable signal medium can also be any computer-readable medium other than a computer-readable storage medium that can transmit, propagate, or transport the program for use by or in connection with the instruction execution system, apparatus, or device.
  • the program code contained on the computer-readable medium can be transmitted by any suitable medium, including but not limited to: electric wire, optical cable, RF (Radio Frequency, radio frequency), etc., or any suitable combination of the above.
  • the above-mentioned computer-readable medium may be included in the above-mentioned server; or may exist alone without being assembled into the server.
  • the above-mentioned computer-readable medium carries one or more programs, and when the above-mentioned one or more programs are executed by the server, the server is caused to: acquire a target image set including at least two target images, each target image including at least one character; extract the global features and head and shoulders features of each character of each target image in the target image set, wherein the global feature is the overall appearance feature and the head and shoulders feature is the feature of the head and shoulders; determine the characterizing feature of each character of each target image based on the global feature and head and shoulders feature of each character of each target image; and determine the same character in different target images based on the characterizing feature of each character of each target image.
  • computer program code for carrying out the operations of embodiments of the present disclosure may be written in one or more programming languages, or a combination thereof, including object-oriented programming languages such as Java, Smalltalk, and C++, as well as conventional procedural programming languages such as the "C" language or similar programming languages.
  • the program code may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on the remote computer or server.
  • the remote computer may be connected to the user's computer through any kind of network, including a local area network (LAN) or a wide area network (WAN), or may be connected to an external computer (for example, through the Internet using an Internet service provider).
  • each block in the flowchart or block diagrams may represent a module, program segment, or portion of code that contains one or more executable instructions for implementing the specified logical functions.
  • the functions noted in the blocks may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved.
  • each block of the block diagrams and/or flowchart illustrations, and combinations of blocks in the block diagrams and/or flowchart illustrations, can be implemented in dedicated hardware-based systems that perform the specified functions or operations, or can be implemented in a combination of dedicated hardware and computer instructions.
  • the units involved in the embodiments of the present disclosure may be implemented in software or hardware.
  • the described unit may also be provided in the processor, for example, it may be described as: a processor including an acquisition unit, an extraction unit, a characterization unit and a determination unit.
  • the names of these units do not constitute a limitation of the unit itself in some cases; for example, the acquisition unit may also be described as a unit "configured to acquire a target image set including at least two target images, each target image including at least one character".

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • Evolutionary Computation (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Computing Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Software Systems (AREA)
  • General Health & Medical Sciences (AREA)
  • Biophysics (AREA)
  • General Engineering & Computer Science (AREA)
  • Molecular Biology (AREA)
  • Mathematical Physics (AREA)
  • Data Mining & Analysis (AREA)
  • Computational Linguistics (AREA)
  • Biomedical Technology (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Databases & Information Systems (AREA)
  • Medical Informatics (AREA)
  • Human Computer Interaction (AREA)
  • Image Analysis (AREA)

Abstract

The present application discloses a pedestrian re-identification method and device. A specific embodiment of the method includes: acquiring a target image set including at least two target images, each target image including at least one person; extracting the global feature and the head and shoulders feature of each person in each target image in the target image set, wherein the global feature is an overall appearance feature and the head and shoulders feature is a feature of the head and shoulders; determining the characterizing feature of each person in each target image based on the global feature and the head and shoulders feature of each person in each target image; and determining the same person in different target images based on the characterizing feature of each person in each target image. This embodiment improves the recognition effect of pedestrian re-identification.

Description

行人重识别方法和装置
本专利申请要求于2020年08月25日提交的、申请号为202010863443.9、发明名称为“行人重识别方法和装置”的中国专利申请的优先权,该申请的全文以引用的方式并入本申请中。
技术领域
本公开涉及图像识别技术领域,具体涉及计算机视觉技术领域,尤其涉及一种行人重识别方法、装置、电子设备和计算机可读介质。
背景技术
行人重识别是通过计算机视觉技术对人物进行跨摄像头检索,现有行人重识别模型主要依赖行人衣服的颜色、样式等属性,针对无法辨别行人衣服的颜色、样式的场景,现有行人重识别模型的识别性能会严重下降。
发明内容
本公开的实施例提出了行人重识别方法、装置、电子设备和计算机可读介质。
第一方面,本公开的实施例提供了一种行人重识别方法,该方法包括:采集包括至少两个目标图像的目标图像集合,每个目标图像包括至少一个人物;提取目标图像集合中各个目标图像的各个人物的全局特征以及头肩特征,其中,全局特征为外观整体特征,头肩特征为头部和肩部的特征;基于各个目标图像的各个人物的全局特征和头肩特征,确定各个目标图像的各个人物的表征特征;基于各个目标图像的各个人物的表征特征,确定不同目标图像中的相同人物。
在一些实施例中,上述基于各个目标图像的各个人物的全局特征和头肩特征,确定各个目标图像的各个人物的表征特征,包括:针对 各个目标图像的每一个人物,连接该人物的全局特征和头肩特征两者的加权特征,得到该人物的表征特征。
在一些实施例中,针对各个目标图像的每一个人物,连接该人物的全局特征和头肩特征两者的加权特征,得到该人物的表征特征,包括:针对各个目标图像的每一个人物,若该人物的全局特征中存在参考识别特征,获取与该参考识别特征相对应的全局特征的权重值、头肩特征的权重值;连接该人物的全局特征和头肩特征两者的加权特征,得到该人物的表征特征。
在一些实施例中,针对各个目标图像的每一个人物,连接该人物的全局特征和头肩特征两者的加权特征,得到该人物的表征特征,还包括:若该人物的全局特征中不存在参考识别特征,获取预设的全局特征的权重值、头肩特征的权重值;连接该人物的全局特征和头肩特征两者的加权特征,得到该人物的表征特征。
在一些实施例中,参考识别特征包括人物图像的亮度小于预设亮度值或人物衣服的颜色为黑色。
在一些实施例中,基于各个目标图像的各个人物的全局特征和头肩特征,确定各个目标图像的各个人物的表征特征,包括:针对各个目标图像的每一个人物,将该人物的全局特征输入已训练的权重自适应模型,得到权重自适应模型输出该人物的全局特征的权重值;基于该人物的全局特征的权重值,计算该人物的头肩特征的权重值;连接该人物的头肩特征与该人物的全局特征两者的加权特征,得到该人物的表征特征。
在一些实施例中,上述基于各个目标图像的各个人物的表征特征,确定不同目标图像中的相同人物,包括:计算不同目标图像中任意两个人物的表征特征之间的距离,形成距离矩阵;基于距离矩阵,确定不同目标图像中的相同人物。
在一些实施例中,上述提取目标图像集合中各个目标图像的各个人物的全局特征,包括:将各个目标图像的各个人物的图像输入已训练的全局深度学习模型,得到已训练的全局深度学习模型输出的各个目标图像的各个人物的全局特征。
在一些实施例中,上述提取目标图像集合中各个目标图像的各个人物的头肩特征,包括:将各个目标图像的各个人物的图像输入已训练的头肩定位模型,得到已训练的头肩定位模块输出的各个目标图像的各个人物的图像中的头肩部区域图像;将各个目标图像的各个人物的图像中头肩部区域图像输入已训练的头肩深度学习模型,得到已训练的头肩深度学习模型输出的各个目标图像的各个人物的头肩特征。
第二方面,本公开的实施例提供了一种行人重识别装置,该装置包括:采集单元,被配置成采集包括至少两个目标图像的目标图像集合,每个目标图像包括至少一个人物;提取单元,被配置成提取目标图像集合中各个目标图像的各个人物的全局特征以及头肩特征,其中,全局特征为外观整体特征,头肩特征为头部和肩部的特征;表征单元,被配置成基于各个目标图像的各个人物的全局特征和头肩特征,确定各个目标图像的各个人物的表征特征;确定单元,被配置成基于各个目标图像的各个人物的表征特征,确定不同目标图像中的相同人物。
在一些实施例中,上述表征单元包括:表征模块,被配置成针对各个目标图像的每一个人物,连接该人物的全局特征和头肩特征两者的加权特征,得到该人物的表征特征。
在一些实施例中,上述表征模块包括:第一获取子模块,被配置成针对各个目标图像的每一个人物,若该人物的全局特征中存在参考识别特征,获取与该参考识别特征相对应的全局特征的权重值、头肩特征的权重值;第一连接子模块,被配置成连接该人物的全局特征和头肩特征两者的加权特征,得到该人物的表征特征。
在一些实施例中,上述表征模块还包括:第二获取子模块,被配置成针对各个目标图像的每一个人物,若该人物的全局特征中不存在参考识别特征,获取预设的全局特征的权重值、头肩特征的权重值;第二连接子模块,被配置成连接该人物的全局特征和头肩特征两者的加权特征,得到该人物的表征特征。
在一些实施例中,参考识别特征包括人物图像的亮度小于预设亮度值或人物衣服的颜色为黑色。
在一些实施例中,上述表征单元包括:获取模块,被配置成针对 各个目标图像的每一个人物,将该人物的全局特征输入已训练的权重自适应模型,得到权重自适应模型输出的该人物的全局特征的权重值;得到模块,被配置成基于该人物的全局特征的权重值,计算该人物的头肩特征的权重值;连接模块,被配置成连接该人物的头肩特征与该人物的全局特征两者的加权特征,得到该人物的表征特征。
在一些实施例中,上述确定单元包括:计算模块,被配置成计算不同目标图像中任意两个人物的表征特征之间的距离,形成距离矩阵;确定模块,被配置成基于距离矩阵,确定不同目标图像中的相同人物。
在一些实施例中,上述提取单元包括:全局提取模块,被配置成将各个目标图像的各个人物的图像输入已训练的全局深度学习模型,得到已训练的全局深度学习模型输出的各个目标图像的各个人物的全局特征。
在一些实施例中,上述提取单元包括:图像提取模块,被配置成将各个目标图像的各个人物的图像输入已训练的头肩定位模型,得到已训练的头肩定位模块输出的各个目标图像的各个人物的图像中的头肩部区域图像;头肩提取模块,被配置成将各个目标图像的各个人物的图像中头肩部区域图像输入已训练的头肩深度学习模型,得到已训练的头肩深度学习模型输出的各个目标图像的各个人物的头肩特征。
第三方面,本公开的实施例提供了一种电子设备,该电子设备包括:一个或多个处理器;存储装置,其上存储有一个或多个程序;当一个或多个程序被一个或多个处理器执行,使得一个或多个处理器实现如第一方面中任一实现方式描述的方法。
第四方面,本公开的实施例提供了一种计算机可读介质,其上存储有计算机程序,该程序被处理器执行时实现如第一方面中任一实现方式描述的方法。
本公开的实施例提供的行人重识别方法和装置,首先采集包括至少两个目标图像的目标图像集合;其次提取目标图像集合中各个目标图像的各个人物的全局特征以及头肩特征;然后基于各个目标图像的各个人物的全局特征和头肩特征,确定各个目标图像的各个人物的表征特征;最后基于各个目标图像的各个人物的表征特征,确定不同目 标图像中的相同人物,由此在行人重识别过程中,通过人物全局特征和头肩特征确定不同目标图像中的相同人物,提高了人物识别效果。
附图说明
通过阅读参照以下附图所作的对非限制性实施例所作的详细描述,本公开的其它特征、目的和优点将会变得更明显:
图1是本公开的一个实施例可以应用于其中的示例性系统架构图;
图2是根据本公开的行人重识别方法的一个实施例的流程图;
图3是根据本公开的得到人物的表征特征的方法的一个实施例的流程图;
图4是根据本公开的确定各个目标图像的各个人物的表征特征的方法的一个实施例的流程图;
图5是根据本公开的行人重识别方法的一个具体实施场景的示意图;
图6是根据本公开的确定不同目标图像中的相同人物的流程图;
图7是根据本公开的行人重识别装置的一个实施例的结构示意图;
图8是适于用来实现本公开的实施例的电子设备的结构示意图。
具体实施方式
下面结合附图和实施例对本公开作进一步的详细说明。可以理解的是,此处所描述的具体实施例仅仅用于解释相关发明,而非对该发明的限定。另外还需要说明的是,为了便于描述,附图中仅示出了与有关发明相关的部分。
需要说明的是,在不冲突的情况下,本公开中的实施例及实施例中的特征可以相互组合。下面将参考附图并结合实施例来详细说明本公开。
图1示出了可以应用本公开的行人重识别方法的示例性系统架构100。
如图1所示,系统架构100可以包括终端设备101、102、103,网络104和服务器105。网络104用以在终端设备101、102、103和服务器105之间提供通信链路的介质。网络104可以包括各种连接类型,通常可以包括无线通信链路等等。
终端设备101、102、103通过网络104与服务器105交互,以接收或发送消息等。终端设备101、102、103上可以安装有各种通讯客户端应用,例如即时通信工具、邮箱客户端等。
终端设备101、102、103可以是硬件,也可以是软件;当终端设备101、102、103为硬件时,可以是具有通信和控制功能的用户设备,上述用户设置可与服务器105进行通信。当终端设备101、102、103为软件时,可以安装在上述用户设备中;终端设备101、102、103可以实现成多个软件或软件模块(例如用来提供分布式服务的软件或软件模块),也可以实现成单个软件或软件模块。在此不做具体限定。
服务器105可以是提供各种服务的服务器,例如为终端设备101、102、103上图像处理系统提供支持的图像服务器。图像服务器可以对网络中各目标图像的相关信息进行分析处理,并将处理结果(如行人重识别策略)反馈给终端设备。
需要说明的是,服务器可以是硬件,也可以是软件。当服务器为硬件时,可以实现成多个服务器组成的分布式服务器集群,也可以实现成单个服务器。当服务器为软件时,可以实现成多个软件或软件模块(例如用来提供分布式服务的软件或软件模块),也可以实现成单个软件或软件模块。在此不做具体限定。
需要说明的是,本公开的实施例所提供的行人重识别方法一般由服务器105执行。
应该理解,图1中的终端设备、网络和服务器的数目仅仅是示意性的。根据实现需要,可以具有任意数目的终端设备、网络和服务器。
如图2,示出了根据本公开的行人重识别方法的一个实施例的流程200,该行人重识别方法包括以下步骤:
步骤201,采集包括至少两个目标图像的目标图像集合。
其中,每个目标图像包括至少一个人物。
本实施例中,行人重识别方法运行于其上的执行主体(比如服务器、终端设备)可以通过实时获取或者通过内存读取的方式获取至少两个目标图像,并将获取到的目标图像组成目标图像集合。
目标图像中的人物可以是一个,也可以是多个,在不同目标图像中具有相同的人物时,如何从不同目标图像中确定相同的人物,是行人重识别方法运行于其上的执行主体的主要工作。
进一步地,目标图像还可以是从监控视频中获取到的图像,该目标图像由于相机分辨率和拍摄角度的缘故,通常无法得到质量非常高的人脸图片,因此执行主体可以通过人脸识别之外的手段确定不同目标图像中相同的人物。
更进一步地,目标图像还可以是在黑夜或者非正常天气(阴雨、雾、大风等)得到的图像,该目标图像由于人物周围环境或者均穿着黑衣的缘故,通常无法通过该目标图像中人物的衣服的颜色、样式等,辨别人物。
步骤202,提取目标图像集合中各个目标图像的各个人物的全局特征以及头肩特征。
其中,全局特征为外观整体特征,头肩特征为头部和肩部的特征。
本实施例中,在提取目标图像集合中各个目标图像的各个人物的全局特征之前,可以提取各个目标图像的各个人物的图像,基于各个人物的图像在所在的目标图像中的特点,提取各个人物在各自的目标图像的外观整体特征—全局特征,该全局特征为人物在目标图像中所体现外观的特征,比如,全局特征包括:人物在目标图像中的衣着或/和鞋子的颜色、样式等。需要说明的是,对于目标图像是在黑夜或者非正常天气的到的目标图像,全局特征可以包括:人物在目标图像中的衣着或/和鞋子的样式等。
具体地,可以通过全局深度学习模型或行人重识别模型提取各个目标图像的各个人物的全局特征,其中,行人重识别模型为现有成熟的模型,具体可以包括:主轴网络(Spindle Net),多粒度网络(Multiple Granularity Network,简称MGN)等。
全局深度学习模型是为了实现全局特征提取而基于深度学习算法建立的模型。全局深度学习模型可以包括:一个ResNet50网络结构、一个全局平均池化层以及一个卷积层,如图5所示。各个目标图像的各个人物的图像首先经过ResNet50网络结构提取特征,然后依次通过全局平均池化层和卷积核为1*1的卷积层进行特征降维,最终得到全局特征f g。其中,ResNet50网络结构是具有50层的残差网络,残差网络是一种深度卷积网络,残差网络更容易优化,并且能够通过增加相当的深度来提高准确率。残差网络的核心技术是解决了增加深度带来的副作用,这样能够通过单纯地增加网络深度,来提高网络性能。
在本实施例的一些可选实现方式中,提取目标图像集合中各个目标图像的各个人物的全局特征,包括:
将各个目标图像的各个人物的图像输入已训练的全局深度学习模型,得到已训练的全局深度学习模型输出的各个目标图像的各个人物的全局特征。
本可选实现方式中,已训练的全局深度学习模型是预先训练完成的深度学习模型。在训练全局深度学习模型时,首先收集多个人物图像作为训练样本,确定人物图像对应的预设的全局特征,根据全局深度学习模型对训练样本的全局特征检测结果与训练样本对应的预设的全局特征之间的差异确定全局深度学习模型的误差,利用误差反向传播的方式迭代调整全局深度学习模型的参数,使其误差逐步缩小。在全局深度学习模型的误差收敛至一定的范围内或迭代的次数达到预设的次数阈值时可以停止调整参数,得到训练完成的全局深度学习模型。
本可选实现方式实现的提取全局特征的方法,采用全局深度学习模型,对目标图像的各个人物的图像进行全局特征提取,提高了全局特征提取的效率,保证了全局特征提取的可靠性。
本实施例中,在上述提取全局特征的同时,执行主体可以提取各个目标图像的各个人物的图像,基于各个目标图像的各个人物的图像,确定各个人物的图像中的头肩区域图像;然后,基于各个人物的图像中的头肩区域图像,提取头部和肩部的特征—头肩特征,头肩特征为与人物的头部或肩部相关的人物的物质或属性特征,比如,头肩特征 包括:性别、人脸、发型、眼镜、肩型、围巾、脖子粗细等等,需要说明的是,对于质量较低的目标图像,得到的人物的头肩特征可以包括:性别、发型、眼镜、肩型、围巾等。
具体地,可以通过头肩定位模型和头肩深度学习模型提取各个目标图像的各个人物的头肩特征。其中,头肩定位模型可以包括:一个ResNet18网络结构和一个全连接层。在头肩定位模型训练完成之后,将各个目标图像的各个人物的图像首先输入ResNet18网络结构提取特征,再通过全连接层输出各个目标图像的各个人物的图像中的头肩部区域矩形框的坐标。头肩部区域矩形框的坐标包括:矩形框的左上角和右下角顶点的坐标,通过头肩部区域矩形框的坐标可以得到各个目标图像的各个人物的图像中的头肩部区域图像。其中,ResNet18网络结构是具有18层的残差网络,残差网络是一种深度卷积网络,残差网络更容易优化,并且能够通过增加相当的深度来提高准确率。
头肩深度学习模型可以包括:一个ResNet50网络结构、三个注意力模块、三个广义平均池化层以及三个卷积层,如图5所示。在头肩深度学习模型训练完成之后,将头肩定位模型输出的各个目标图像的各个人物的图像中的头肩部区域图像经过ResNet50提取特征得到头肩原始特征,将得到的头肩原始特征水平分为三块;水平划分的三块特征中每块特征都通过各自对应的注意力模块对空间维度和通道维度中的高响应部分进行加权,然后每块特征都依次通过广义平均池化层和一个卷积核为1*1的卷积层进行特征降维,将降维后的三个特征在通道维度进行连接得到头肩特征f h。本实施例中,注意力模块包括两个模块,一个是空间维度的注意力模块,另一个是通道维度的注意力模块。空间维度和通道维度的注意力模块分别对高响应的空间区域和通道进行权重加强,让网络学习的特征更集中在有意义,有区分性的部位,增加特征的区分性和鲁棒性。
在本实施例的一些可选实现方式中,提取目标图像集合中各个目标图像的各个人物的头肩特征,包括:将各个目标图像的各个人物的图像输入已训练的头肩定位模型,得到已训练的头肩定位模块输出的各个目标图像的各个人物的图像中的头肩部区域图像;将各个目标图 像的各个人物的图像中头肩部区域图像输入已训练的头肩深度学习模型,得到已训练的头肩深度学习模型输出的各个目标图像的各个人物的头肩特征。
本可选实现方式中,已训练的头肩定位模型用于定位各个目标图像的各个人物的图像中的头肩部区域图像,其是预先训练完成的模型。在训练头肩定位模型时,首先收集多个人物图像作为训练样本,确定人物图像对应的预设的头肩部区域矩形框的坐标,根据头肩定位模型对训练样本的头肩部区域矩形框的坐标检测结果与训练样本对应的预设的头肩部区域矩形框的坐标之间的差异确定头肩定位模型的误差,利用误差反向传播的方式迭代调整头肩定位模型的参数,使其误差逐步缩小。在头肩定位模型的误差收敛至一定的范围内或迭代的次数达到预设的次数阈值时可以停止调整参数,得到训练完成的头肩定位模型。
本可选实现方式中,已训练的头肩深度学习模型是预先训练完成的深度学习模型。在训练头肩深度学习模型时,首先收集多个头肩部区域图像作为训练样本,确定头肩部区域图像对应的预设的头肩特征,根据头肩深度学习模型对训练样本的头肩特征检测结果与训练样本对应的预设的头肩特征之间的差异确定头肩深度学习模型的误差,利用误差反向传播的方式迭代调整头肩深度学习模型的参数,使其误差逐步缩小。在头肩深度学习模型的误差收敛至一定的范围内或迭代的次数达到预设的次数阈值时可以停止调整参数,得到训练完成的头肩深度学习模型。
本可选实现方式实现的提取头肩特征的方法,采用头肩定位模型对目标图像的各个人物的图像进行定位得到目标图像的各个人物的图像中的头肩部分区域图像;采用头肩深度学习模型,对头肩部分区域图像进行头肩特征的提取,提高了头肩特征的提取效率,保证了提取头肩特征的可靠性。
步骤203,基于各个目标图像的各个人物的全局特征和头肩特征,确定各个目标图像的各个人物的表征特征。
本实施例中,表征特征为表现各个人物实质性特点的特征,通过 表征特征可以区别各个目标图像的每个人物。
针对目标图像的特点,确定表征特征的方式不同,比如在各个目标图像较清晰、可以识别到各个人物的全局特征时,基于各个目标图像的各个人物的全局特征和头肩特征,确定各个目标图像的各个人物的表征特征包括:将全局特征和头肩特征直接连接在一起,得到表征特征。
又如,在各个目标图像较模糊,或者目标图像中各个人物均穿着黑色衣服、或者目标图像无法辨别到各个人物的全局特征时,在本实施例的一些可选实现方式中,基于各个目标图像的各个人物的全局特征和头肩特征,确定各个目标图像的各个人物的表征特征包括:针对各个目标图像的每一个人物,连接该人物的全局特征和头肩特征两者的加权特征,得到该人物的表征特征。
本可选实现方式中,加权特征是特征分别与其对应的特征权重值相乘之后得到的特征。而各个特征对应的特征权重值可以通过多种渠道(例如,特征与特征权重值对应表或已训练的权重模型)获得。其中,特征与特征权重值对应表可以是由操作人员预先做出的特征与特征权重值一一对应的表。已训练的权重模型是预先训练的特征与特征权重值关系模型,其可以针对不同特征输出不同的特征权重值。
具体地,全局特征的加权特征是全局特征与全局特征的权重值相乘之后得到的特征;头肩特征的加权特征是头肩特征与头肩特征的权重值相乘之后得到的特征。
本可选实现方式中,通过连接目标图像中每个人物的全局特征和头肩特征的加权特征,可以快速、方便地得到每个人物的表征特征。并且得到的表征特征可以有效地表现人物的特征,提高了后续行人重识别的可靠性。
步骤204,基于各个目标图像的各个人物的表征特征,确定不同目标图像中的相同人物。
本实施例中,将目标图像集合中不同目标图像的各个人物的表征特征进行相似性比对,相似度为预设相似度以上的两个表征特征各自对应的人物确定为相同人物,比如,任意两个表征特征之间相似度为 80%以上,确定该两个表征特征是相同的特征,则具有该两个表征特征的人物为相同的人物。
本公开的实施例提供的行人重识别方法,首先采集包括至少两个目标图像的目标图像集合;其次提取目标图像集合中各个目标图像的各个人物的全局特征以及头肩特征;然后基于各个目标图像的各个人物的全局特征和头肩特征,确定各个目标图像的各个人物的表征特征;最后基于各个目标图像的各个人物的表征特征,确定不同目标图像中的相同人物,由此在行人重识别过程中,通过由人物全局特征和头肩特征确定不同目标图像中的相同人物,提高了人物识别效果。
在目标图像中具有参考识别特征(如低光照)无法很好区分目标图像中的人物时,为了得到更好的表征特征,在本实施例的一些可选实现方式中,进一步参考图3,其示出了本公开的得到人物的表征特征的方法一个实施例的流程300。该确定各个目标图像的各个人物的表征特征的方法,包括以下步骤:
步骤301,针对各个目标图像的每一个人物,判断该人物的全局特征中是否存在参考识别特征;若判断结果为是,之后执行步骤302。若判断结果为否,之后执行步骤305。
本实施例中,参考识别特征表征为无法区分目标图像中人物的特征,也叫全局特征的特征,比如,目标图像中各个人物均是黑衣人,或者目标图像中各个人物所在环境整体较暗,无法有效辨别各个人物的衣着的颜色、样式等。
本实施例中,头肩特征是包含很多头部或肩部可区分人物的信息,比如性别、人脸、发型、眼镜等,且在目标图像中各个人物均是黑衣人,或者目标图像中各个人物整体较暗的条件下头肩特征也可以采集到。
本实施例的一些可选实现方式中,参考识别特征包括:人物图像的亮度小于预设亮度值或人物衣服的颜色为黑色。
本可选实现方式中,预设亮度值为预先设置的亮度值,在各个目标图像中人物图像小于预设亮度值时,说明图像较暗,比如低光照图 像,无法有效区分人物的全局特征。
本可选实现方式,将参考识别特征设置为人物图像的亮度小于预设亮度值或者人物衣服的颜色为黑色,可以为获取头肩特征和全局特征的权重值提供了有效地区分条件。
步骤302,获取与该参考识别特征相对应的全局特征的权重值、头肩特征的权重值,之后执行步骤303。
本实施例中,行人重识别方法运行于其上的执行主体可以通过查找参考识别特征与特征权重值对应表或已训练的权重模型获得参考识别特征相对应的全局特征的权重值、头肩特征的权重值。其中,参考识别特征与特征权重值对应表可以是由操作人员预先作出的参考识别特征分别与全局特征的权重值、头肩特征的权重值之间对应关系的表。通过查找参考识别特征与特征权重值对应表,可以得到在某一参考识别特征下的全局特征的权重值、头肩特征的权重值。
进一步地,由于全局特征中存在参考识别特征,此时全局特征无法有效识别人物,而头肩特征则可以有效识别人物,因此,参考识别特征相对应的全局特征的权重值小于参考识别特征对应的头肩特征的权重值,参考识别特征相对应的全局特征的权重值与头肩特征的权重值,可随全局特征的变化而变化。
本实施例中,由于目标图像中存在参考识别特征,确定人物衣服的颜色为黑色或目标图像中各个人物所在环境整体较暗,此时全局特征可能无法正常区分人物;由于参考识别特征存在对头肩特征的影响不大,将参考识别特征相对应的全局特征的权重值小于参考识别特征对应的头肩特征的权重值,可以有效突出头肩特征在人物的所有特征中的比例。
步骤303,连接该人物的全局特征和头肩特征两者的加权特征,得到该人物的表征特征,之后执行步骤304。
本实施例中,全局特征的加权特征是全局特征与全局特征的权重值相乘之后得到的特征;头肩特征的加权特征是头肩特征与头肩特征的权重值相乘之后得到的特征。通过连接目标图像中每个人物的全局特征和头肩特征的加权特征,可以在全局特征存在参考识别特征时, 快速、方便地得到的每个人物的表征特征。
本实施例中,在全局特征中存在参考识别特征时,获取与该参考识别特征相对应的全局特征的权重值、头肩特征的权重值,并连接该人物的全局特征和头肩特征两者的加权特征,得到该人物的表征特征,由此,可以突出头肩特征的显著性,将头肩特征作为重点对象,体现了人物的特征,保证了行人重识别的效果,提高了行人的辨识效率。
步骤304,结束。
步骤305,获取预设的全局特征的权重值、头肩特征的权重值,之后执行步骤306。
本实施例中,由于各个目标图像的每个人物的全局特征中不存在参考识别特征,说明各个目标图像的各个人物的全局特征与头肩特征均可以有效反映目标图像中各个人物可识别特征,并且全局特征可以有效保证人物在目标图像中全局效果,头肩特征可以有效保证人物在目标图像中的局部效果。因此,预设的全局特征的权重值、头肩特征的权重值此时为预设的两个固定值,比如,两者均为0.5。
步骤306,连接该人物的全局特征和头肩特征两者的加权特征,得到该人物的表征特征,之后执行步骤304。
本实施例中,在目标图像的每个人物的全局特征中不存在参考识别特征时,获取预设的全局特征的权重值、头肩特征的权重值,并连接该人物的全局特征和头肩特征两者的加权特征,得到该人物的表征特征,由此,可以突出全局特征与头肩特征均衡性,使得到的表征特征可以有效表现人物的识别特征,提高了行人重识别的辨识效率。
综上,本实施例提供的得到人物的表征特征的方法,在全局特征中不存在参考识别特征时,获取与该参考识别特征相对应的全局特征的权重值、头肩特征的权重值,并连接该人物的全局特征和头肩特征两者的加权特征,得到该人物的表征特征;在目标图像的每个人物的全局特征中不存在参考识别特征时,获取预设的全局特征的权重值、头肩特征的权重值,并连接该人物的全局特征和头肩特征两者的加权特征,得到该人物的表征特征;由此,在具有参考识别特征时,可以突出头肩特征的显著性,将头肩特征作为重点对象;在不具有参考识 别特征,合理均衡全局特征和头肩特征得到表征特征,保证了行人重识别的效果,提高了行人重识别的辨识效率。
在本实施例的一些可选实现方式中,进一步参考图4,其示出了本公开的确定各个目标图像的各个人物的表征特征的方法的一个实施例的流程400。该确定各个目标图像的各个人物的表征特征的方法,包括以下步骤:
步骤401,针对各个目标图像的每一个人物,将该人物的全局特征输入已训练的权重自适应模型,得到权重自适应模型输出的该人物的全局特征的权重值。
本实施例中,如图5所示,权重自适应模型的输入为全局特征f g,权重自适应模型的输出为全局特征的权重值。
具体地,权重自适应模型可以包括:依次连接的两个全连接层,第一全连接层可以基于全局特征判断是否具有参考识别特征;第二个全连接层接收第一个全连接层识别结果,输出全局特征的权重值。
本可选实现方式中,已训练的权重自适应模型可以是预先训练完成的深度学习模型。在训练权重自适应模型时,首先收集多个全局特征以及与各个全局特征相对应的头肩特征,将收集的多个全局特征作为训练样本(其中,多个全局特征可以是由前面实施例中全局深度学习模型训练过程中得到的全局特征,而与各个全局特征相对应的头肩特征可以是由前述实施例中头肩深度学习模型训练过程中得到的头肩特征),确定全局特征对应的预设的表征特征以及第一个全连接层的输出值对应的预设的判断结果,由根据权重自适应模型对训练样本的全局特征的权重值的预测结果以及与各个全局特征对应的头肩特征计算表征特征,由计算得到的表征特征与训练样本对应的预设的表征特征之间的差异以及第一个全连接层的判断结果与训练样本对应的预设的判断结果之间的差异,确定权重自适应模型的误差,利用误差反向传播的方式迭代调整权重自适应模型的参数,使其误差逐步缩小。在权重自适应模型的误差收敛至一定的范围内或迭代的次数达到预设的次数阈值时可以停止调整参数,得到训练完成的权重自适应模型。
本实施例中,权重自适应模块可以根据输入图片的全局特征分别赋给全局特征和头肩特征以不同的、最优的权重。比如,权重自适应模块首先通过全局特征判断输入的图片属不属于“低光照、人物均穿黑色衣服”,如果属于,那么权重自适应模块会给与头肩特征更大的权重值。
步骤402,基于该人物的全局特征的权重值,计算该人物的头肩特征的权重值。
本实施例中,头肩特征的权重值与全局特征的权重值之和为1,在得到全局特征的权重值之后,采用1减去全局特征的权重值,可以得到头肩特征的权重值。
步骤403,连接该人物的头肩特征与该人物的全局特征两者的加权特征,得到该人物的表征特征。
具体地,在图5中,人物的表征特征为f。通过权重自适应模块,全局特征f g与头肩特征f h融合得到人物表征特征f。
本实施例提供的确定各个目标图像的各个人物的表征特征的方法,采用权重自适应模块为全局特征分配全局特征的权重值,由全局特征权重值得到头肩特征的权重值,连接该人物的头肩特征与全局特征两者的加权特征,得到该人物的表征特征。从而采用权重自适应模块实现了对头肩特征和全局特征权重值自适应分配,提高了表征特征的自适应性,提高了行人重识别的识别效果。
在本实施例的一些可选实现方式中,进一步参考图6,其示出了本公开的确定不同目标图像中的相同人物的流程600。该确定不同目标图像中的相同人物的方法,包括以下步骤:
步骤601,计算不同目标图像中任意两个人物的表征特征之间的距离,形成距离矩阵。
本实施例中，可以通过欧氏距离计算公式计算任意两个人物的表征特征之间的欧氏距离，或者通过余弦距离计算公式计算任意两个人物的表征特征之间的余弦距离。
为目标图像集合中各个目标图像的各个人物进行位置标注,基于 各个人物的位置,形成距离矩阵。具体地,两个表征特征的距离在距离矩阵的具体表示形式为:d ij,其中,两个表征特征对应的人物在不同目标图像集合中的位置分别为i,j;例如,i为1103,则表示在第11张目标图像中第三人物排序位置。
步骤602,基于距离矩阵,确定不同目标图像中的相同人物。
本实施例中,不同目标图像中人物是否相同与人物之间的表征特征的距离有关,在距离矩阵中两个人物的表征特征之间的距离值为预设距离值之内时,可以确定两个人物为相同人物。可选地,还可以将距离矩阵中任意两个人物的表征特征之间的距离值由小到大进行排序,选取排名前预设位(例如,前5位)的距离值相对应的各个人物的表征特征作为相同人物的判断对象,在该对象中距离值越小代表人物相似度越高,认为是同一个人物的可能性就越大。
通过距离矩阵可以方便、快捷地的找到不同目标图像中的相同人物。
本实施例提供的确定不同目标图像中的相同人物的方法,计算不同目标图像中任意两个任务的表征特征之间的距离,形成距离矩阵;基于距离矩阵,确定不同目标图像中的相同人物,由此通过距离矩阵可以方便、快捷地确定各个人物表征特征之间的距离,确定目标图像中的相同人物,提高了行人重识别的效率。
进一步参考图7,作为对上述各图所示方法的实现,本公开提供了行人重识别装置的一个实施例,该装置实施例与图2所示的方法实施例相对应,该装置具体可以应用于各种电子设备中。
如图7所示,本公开的实施例提供了一种行人重识别装置700,该装置700包括:采集单元701、提取单元702、表征单元703和确定单元704。其中,采集单元701,可以被配置成采集包括至少两个目标图像的目标图像集合,每个目标图像包括至少一个人物。提取单元702,可以被配置成提取目标图像集合中各个目标图像的各个人物的全局特征以及头肩特征,其中,全局特征为外观整体特征,头肩特征为头部和肩部的特征。表征单元703,可以被配置成基于各个目标图像 的各个人物的全局特征和头肩特征,确定各个目标图像的各个人物的表征特征。确定单元704,可以被配置成基于各个目标图像的各个人物的表征特征,确定不同目标图像中的相同人物。
在本实施例中,行人重识别装置700中,采集单元701、提取单元702、表征单元703和确定单元704的具体处理及其所带来的技术效果可分别参考图2对应实施例中的步骤201、步骤202、步骤203、步骤204。
在一些实施例中,上述表征单元703包括:表征模块(图中未示出)。表征模块可以被配置成针对各个目标图像的每一个人物,连接该人物的全局特征和头肩特征两者的加权特征,得到该人物的表征特征。
在一些实施例中,上述表征模块还包括:第一获取子模块(图中未示出)、第一连接子模块(图中未示出)。第一获取子模块,可以被配置成针对各个目标图像的每一个人物,若该人物的全局特征中存在参考识别特征,获取与该参考识别特征相对应的全局特征的权重值、头肩特征的权重值。第一连接子模块,可以被配置成连接该人物的全局特征和头肩特征两者的加权特征,得到该人物的表征特征。
在一些实施例中,上述表征模块还包括:第二获取子模块(图中未示出)、第二连接子模块(图中未示出)。第二获取子模块,被配置成针对各个目标图像的每一个人物,若该人物的全局特征中不存在参考识别特征,获取预设的全局特征的权重值、头肩特征的权重值;第二连接子模块,被配置成连接该人物的全局特征和头肩特征两者的加权特征,得到该人物的表征特征。
在一些实施例中,上述参考识别特征包括人物图像的亮度小于预设亮度值或人物衣服的颜色为黑色。
在一些实施例中,上述表征单元703包括:获取模块(图中未示出)、得到模块(图中未示出)、连接模块(图中未示出)。获取模块,可以被配置成针对各个目标图像的每一个人物,将该人物的全局特征输入已训练的权重自适应模型,得到权重自适应模型输出的该人物的全局特征的权重值。得到模块,可以被配置成基于该人物的全局特征的权重值,计算该人物的头肩特征的权重值。连接模块,被配置成连 接该人物的头肩特征与该人物的全局特征两者的加权特征,得到该人物的表征特征。
在一些实施例中,上述确定单元704包括:计算模块(图中未示出)、确定模块(图中未示出)。计算模块,可以被配置成计算不同目标图像中任意两个人物的表征特征之间的距离,形成距离矩阵。确定模块,可以被配置成基于距离矩阵,确定不同目标图像中的相同人物。
在一些实施例中,上述提取单元702包括:全局提取模块(图中未示出)。全局提取模块,可以被配置成将各个目标图像的各个人物的图像输入已训练的全局深度学习模型,得到已训练的全局深度学习模型输出的各个目标图像的各个人物的全局特征。
在一些实施例中,上述提取单元702包括:图像提取模块(图中未示出)、头肩提取模块(图中未示出),图像提取模块,可以被配置成将各个目标图像的各个人物的图像输入已训练的头肩定位模型,得到已训练的头肩定位模块输出的各个目标图像的各个人物的图像中的头肩部区域图像。头肩提取模块,可以被配置成将各个目标图像的各个人物的图像中头肩部区域图像输入已训练的头肩深度学习模型,得到已训练的头肩深度学习模型输出的各个目标图像的各个人物的头肩特征。
本公开的实施例提供的行人重识别装置,首先,采集单元701采集包括至少两个目标图像的目标图像集合;其次,提取单元702提取目标图像集合中各个目标图像的各个人物的全局特征以及头肩特征;然后,表征单元703基于各个目标图像的各个人物的全局特征和头肩特征,确定各个目标图像的各个人物的表征特征;最后,确定单元704基于各个目标图像的各个人物的表征特征,确定不同目标图像中的相同人物,由此在行人重识别过程中,通过由人物全局特征和头肩特征确定不同目标图像中的相同人物,提高了人物识别效果。
下面参考图8,其示出了适于用来实现本公开的实施例的电子设备800的结构示意图。
如图8所示,电子设备800可以包括处理装置(例如中央处理器、 图形处理器等)801,其可以根据存储在只读存储器(ROM)802中的程序或者从存储装置808加载到随机访问存储器(RAM)803中的程序而执行各种适当的动作和处理。在RAM 803中,还存储有电子设备800操作所需的各种程序和数据。处理装置801、ROM 802以及RAM 803通过总线804彼此相连。输入/输出(I/O)接口805也连接至总线804。
通常,以下装置可以连接至I/O接口805:包括例如触摸屏、触摸板、键盘、鼠标、等的输入装置806;包括例如液晶显示器(LCD,Liquid Crystal Display)、扬声器、振动器等的输出装置807;包括例如磁带、硬盘等的存储装置808;以及通信装置809。通信装置809可以允许电子设备800与其他设备进行无线或有线通信以交换数据。虽然图8示出了具有各种装置的电子设备800,但是应理解的是,并不要求实施或具备所有示出的装置。可以替代地实施或具备更多或更少的装置。图8中示出的每个方框可以代表一个装置,也可以根据需要代表多个装置。
特别地,根据本公开的实施例,上文参考流程图描述的过程可以被实现为计算机软件程序。例如,本公开的实施例包括一种计算机程序产品,其包括承载在计算机可读介质上的计算机程序,该计算机程序包含用于执行流程图所示的方法的程序代码。在这样的实施例中,该计算机程序可以通过通信装置809从网络上被下载和安装,或者从存储装置808被安装,或者从ROM 802被安装。在该计算机程序被处理装置801执行时,执行本公开的实施例的方法中限定的上述功能。
需要说明的是,本公开的实施例的计算机可读介质可以是计算机可读信号介质或者计算机可读存储介质或者是上述两者的任意组合。计算机可读存储介质例如可以是——但不限于——电、磁、光、电磁、红外线、或半导体的系统、装置或器件,或者任意以上的组合。计算机可读存储介质的更具体的例子可以包括但不限于:具有一个或多个导线的电连接、便携式计算机磁盘、硬盘、随机访问存储器(RAM)、只读存储器(ROM)、可擦式可编程只读存储器(EPROM或闪存)、光纤、便携式紧凑磁盘只读存储器(CD-ROM)、光存储器件、磁存储 器件、或者上述的任意合适的组合。在本公开的实施例中,计算机可读存储介质可以是任何包含或存储程序的有形介质,该程序可以被指令执行系统、装置或者器件使用或者与其结合使用。而在本公开的实施例中,计算机可读信号介质可以包括在基带中或者作为载波一部分传播的数据信号,其中承载了计算机可读的程序代码。这种传播的数据信号可以采用多种形式,包括但不限于电磁信号、光信号或上述的任意合适的组合。计算机可读信号介质还可以是计算机可读存储介质以外的任何计算机可读介质,该计算机可读信号介质可以发送、传播或者传输用于由指令执行系统、装置或者器件使用或者与其结合使用的程序。计算机可读介质上包含的程序代码可以用任何适当的介质传输,包括但不限于:电线、光缆、RF(Radio Frequency,射频)等等,或者上述的任意合适的组合。
上述计算机可读介质可以是上述服务器中所包含的;也可以是单独存在,而未装配入该服务器中。上述计算机可读介质承载有一个或者多个程序,当上述一个或者多个程序被该服务器执行时,使得该服务器:采集包括至少两个目标图像的目标图像集合,每个目标图像包括至少一个人物;提取目标图像集合中各个目标图像的各个人物的全局特征以及头肩特征,其中,全局特征为外观整体特征,头肩特征为头部和肩部的特征;基于各个目标图像的各个人物的全局特征和头肩特征,确定各个目标图像的各个人物的表征特征;基于各个目标图像的各个人物的表征特征,确定不同目标图像中的相同人物。
可以以一种或多种程序设计语言或其组合来编写用于执行本公开的实施例的操作的计算机程序代码,程序设计语言包括面向对象的程序设计语言—诸如Java、Smalltalk、C++,还包括常规的过程式程序设计语言—诸如“C”语言或类似的程序设计语言。程序代码可以完全地在用户计算机上执行、部分地在用户计算机上执行、作为一个独立的软件包执行、部分在用户计算机上部分在远程计算机上执行、或者完全在远程计算机或服务器上执行。在涉及远程计算机的情形中,远程计算机可以通过任意种类的网络——包括局域网(LAN)或广域网(WAN)—连接到用户计算机,或者,可以连接到外部计算机(例如利 用因特网服务提供商来通过因特网连接)。
附图中的流程图和框图,图示了按照本公开的各种实施例的系统、方法和计算机程序产品的可能实现的体系架构、功能和操作。在这点上,流程图或框图中的每个方框可以代表一个模块、程序段、或代码的一部分,该模块、程序段、或代码的一部分包含一个或多个用于实现规定的逻辑功能的可执行指令。也应当注意,在有些作为替换的实现中,方框中所标注的功能也可以以不同于附图中所标注的顺序发生。例如,两个接连地表示的方框实际上可以基本并行地执行,它们有时也可以按相反的顺序执行,这依所涉及的功能而定。也要注意的是,框图和/或流程图中的每个方框、以及框图和/或流程图中的方框的组合,可以用执行规定的功能或操作的专用的基于硬件的系统来实现,或者可以用专用硬件与计算机指令的组合来实现。
描述于本公开的实施例中所涉及到的单元可以通过软件的方式实现,也可以通过硬件的方式来实现。所描述的单元也可以设置在处理器中,例如,可以描述为:一种处理器,包括采集单元、提取单元、表征单元和确定单元。其中,这些单元的名称在某种情况下并不构成对该单元本身的限定,例如,采集单元还可以被描述为“被配置成采集包括至少两个目标图像的目标图像集合,每个目标图像包括至少一个人物”的单元。
以上描述仅为本公开的较佳实施例以及对所运用技术原理的说明。本领域技术人员应当理解,本公开的实施例中所涉及的发明范围,并不限于上述技术特征的特定组合而成的技术方案,同时也应涵盖在不脱离上述发明构思的情况下,由上述技术特征或其等同特征进行任意组合而形成的其它技术方案。例如上述特征与本公开的实施例中公开的(但不限于)具有类似功能的技术特征进行互相替换而形成的技术方案。

Claims (20)

  1. 一种行人重识别方法,所述方法包括:
    采集包括至少两个目标图像的目标图像集合,每个目标图像包括至少一个人物;
    提取所述目标图像集合中各个目标图像的各个人物的全局特征以及头肩特征,其中,所述全局特征为外观整体特征,所述头肩特征为头部和肩部的特征;
    基于各个目标图像的各个人物的所述全局特征和所述头肩特征,确定各个目标图像的各个人物的表征特征;以及
    基于各个目标图像的各个人物的所述表征特征,确定不同目标图像中的相同人物。
  2. 根据权利要求1所述的方法,其中,所述基于各个目标图像的各个人物的所述全局特征和所述头肩特征,确定各个目标图像的各个人物的表征特征,包括:
    针对各个目标图像的每一个人物,连接该人物的所述全局特征和所述头肩特征两者的加权特征,得到该人物的表征特征。
  3. 根据权利要求2所述的方法,其中,针对各个目标图像的每一个人物,连接该人物的所述全局特征和所述头肩特征两者的加权特征,得到该人物的表征特征,包括:
    针对各个目标图像的每一个人物,若该人物的全局特征中存在参考识别特征,获取与该参考识别特征相对应的全局特征的权重值、头肩特征的权重值;以及
    连接该人物的所述全局特征和所述头肩特征两者的加权特征,得到该人物的表征特征。
  4. 根据权利要求3所述的方法,其中,针对各个目标图像的每一个人物,连接该人物的所述全局特征和所述头肩特征两者的加权特征, 得到该人物的表征特征,还包括:
    若该人物的全局特征中不存在参考识别特征,获取预设的全局特征的权重值、头肩特征的权重值;以及
    连接该人物的所述全局特征和所述头肩特征两者的加权特征,得到该人物的表征特征。
  5. 根据权利要求3或4所述的方法,其中,所述参考识别特征包括人物图像的亮度小于预设亮度值或人物衣服的颜色为黑色。
  6. 根据权利要求1所述的方法,其中,所述基于各个目标图像的各个人物的所述全局特征和所述头肩特征,确定各个目标图像的各个人物的表征特征,包括:
    针对各个目标图像的每一个人物,将该人物的所述全局特征输入已训练的权重自适应模型,得到所述权重自适应模型输出的该人物的所述全局特征的权重值;
    基于该人物的所述全局特征的权重值,计算该人物的头肩特征的权重值;以及
    连接该人物的所述头肩特征与该人物的所述全局特征两者的加权特征,得到该人物的表征特征。
  7. 根据权利要求1-6任一项所述的方法,其中,所述基于各个目标图像的各个人物的所述表征特征,确定不同目标图像中的相同人物,包括:
    计算不同目标图像中任意两个人物的所述表征特征之间的距离,形成距离矩阵;以及
    基于所述距离矩阵,确定不同目标图像中的相同人物。
  8. 根据权利要求1-6任一项所述的方法,其中,所述提取所述目标图像集合中各个目标图像的各个人物的全局特征,包括:
    将各个目标图像的各个人物的图像输入已训练的全局深度学习模 型,得到所述已训练的全局深度学习模型输出的各个目标图像的各个人物的全局特征。
  9. 根据权利要求1-6任一项所述的方法,其中,所述提取所述目标图像集合中各个目标图像的各个人物的头肩特征,包括:
    将各个目标图像的各个人物的图像输入已训练的头肩定位模型,得到所述已训练的头肩定位模块输出的各个目标图像的各个人物的图像中的头肩部区域图像;以及
    将各个目标图像的各个人物的图像中头肩部区域图像输入已训练的头肩深度学习模型,得到所述已训练的头肩深度学习模型输出的各个目标图像的各个人物的头肩特征。
  10. 一种行人重识别装置,所述装置包括:
    采集单元,被配置成采集包括至少两个目标图像的目标图像集合,每个目标图像包括至少一个人物;
    提取单元,被配置成提取所述目标图像集合中各个目标图像的各个人物的全局特征以及头肩特征,其中,所述全局特征为外观整体特征,所述头肩特征为头部和肩部的特征;
    表征单元,被配置成基于各个目标图像的各个人物的所述全局特征和所述头肩特征,确定各个目标图像的各个人物的表征特征;以及
    确定单元,被配置成基于各个目标图像的各个人物的所述表征特征,确定不同目标图像中的相同人物。
  11. 根据权利要求10所述的装置,其中,所述表征单元包括:
    表征模块,被配置成针对各个目标图像的每一个人物,连接该人物的所述全局特征和所述头肩特征两者的加权特征,得到该人物的表征特征。
  12. 根据权利要求11所述的装置,其中,所述表征模块还包括:
    第一获取子模块,被配置成针对各个目标图像的每一个人物,若该人物的全局特征中存在参考识别特征,获取与该参考识别特征相对 应的全局特征的权重值、头肩特征的权重值;以及
    第一连接子模块,被配置成连接该人物的所述全局特征和所述头肩特征两者的加权特征,得到该人物的表征特征。
  13. 根据权利要求12所述的装置,其中,所述表征模块还包括:
    第二获取子模块,被配置成针对各个目标图像的每一个人物,若该人物的全局特征中不存在参考识别特征,获取预设的全局特征的权重值、头肩特征的权重值;以及
    第二连接子模块,被配置成连接该人物的所述全局特征和所述头肩特征两者的加权特征,得到该人物的表征特征。
  14. 根据权利要求12或13所述的装置,其中,所述参考识别特征包括人物图像的亮度小于预设亮度值或人物衣服的颜色为黑色。
  15. 根据权利要求10所述的装置,其中,所述表征单元包括:
    获取模块,被配置成针对各个目标图像的每一个人物,将该人物的所述全局特征输入已训练的权重自适应模型,得到所述权重自适应模型输出的该人物的所述全局特征的权重值;
    得到模块,被配置成基于该人物的所述全局特征的权重值,计算该人物的头肩特征的权重值;以及
    连接模块,被配置成连接该人物的所述头肩特征与该人物的所述全局特征两者的加权特征,得到该人物的表征特征。
  16. 根据权利要求10-15任一项所述的装置,其中,所述确定单元包括:
    计算模块,被配置成计算不同目标图像中任意两个人物的所述表征特征之间的距离,形成距离矩阵;
    确定模块,被配置成基于所述距离矩阵,确定不同目标图像中的相同人物。
  17. 根据权利要求10-15任一项所述的装置,其中,所述提取单 元包括:
    全局提取模块,被配置成将各个目标图像的各个人物的图像输入已训练的全局深度学习模型,得到所述已训练的全局深度学习模型输出的各个目标图像的各个人物的全局特征。
  18. 根据权利要求10-15任一项所述的装置,其中,所述提取单元包括:
    图像提取模块,被配置成将各个目标图像的各个人物的图像输入已训练的头肩定位模型,得到所述已训练的头肩定位模块输出的各个目标图像的各个人物的图像中的头肩部区域图像;以及
    头肩提取模块,被配置成将各个目标图像的各个人物的图像中头肩部区域图像输入已训练的头肩深度学习模型,得到所述已训练的头肩深度学习模型输出的各个目标图像的各个人物的头肩特征。
  19. 一种电子设备,包括:
    一个或多个处理器;
    存储装置,其上存储有一个或多个程序;
    当所述一个或多个程序被所述一个或多个处理器执行,使得所述一个或多个处理器实现如权利要求1-9中任一所述的方法。
  20. 一种计算机可读介质,其上存储有计算机程序,其中,该程序被处理器执行时实现如权利要求1-9中任一所述的方法。
PCT/CN2021/092020 2020-08-25 2021-05-07 行人重识别方法和装置 WO2022041830A1 (zh)

Priority Applications (2)

Application Number Priority Date Filing Date Title
EP21859677.3A EP4137991A4 (en) 2020-08-25 2021-05-07 PEDESTRIAN RECOGNITION METHOD AND DEVICE
US18/013,795 US20230334890A1 (en) 2020-08-25 2021-05-07 Pedestrian re-identification method and device

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202010863443.9A CN112307886A (zh) 2020-08-25 2020-08-25 行人重识别方法和装置
CN202010863443.9 2020-08-25

Publications (1)

Publication Number Publication Date
WO2022041830A1 true WO2022041830A1 (zh) 2022-03-03

Family

ID=74483220

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2021/092020 WO2022041830A1 (zh) 2020-08-25 2021-05-07 行人重识别方法和装置

Country Status (4)

Country Link
US (1) US20230334890A1 (zh)
EP (1) EP4137991A4 (zh)
CN (1) CN112307886A (zh)
WO (1) WO2022041830A1 (zh)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115631509A (zh) * 2022-10-24 2023-01-20 智慧眼科技股份有限公司 一种行人再识别方法、装置、计算机设备及存储介质
CN116311105A (zh) * 2023-05-15 2023-06-23 山东交通学院 一种基于样本间上下文指导网络的车辆重识别方法

Families Citing this family (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112307886A (zh) * 2020-08-25 2021-02-02 北京京东尚科信息技术有限公司 行人重识别方法和装置
CN112818896B (zh) * 2021-02-18 2023-04-07 支付宝(杭州)信息技术有限公司 生物识别方法、装置和电子设备
CN113239784B (zh) * 2021-05-11 2022-09-30 广西科学院 一种基于空间序列特征学习的行人重识别系统及方法
CN113221764B (zh) * 2021-05-18 2023-04-28 安徽工程大学 一种快速行人再识别方法
CN113269070B (zh) * 2021-05-18 2023-04-07 重庆邮电大学 融合全局和局部特征的行人重识别方法、存储器及处理器
CN113284106B (zh) * 2021-05-25 2023-06-06 浙江商汤科技开发有限公司 距离检测方法及装置

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109784166A (zh) * 2018-12-13 2019-05-21 北京飞搜科技有限公司 行人重识别的方法及装置
CN110543841A (zh) * 2019-08-21 2019-12-06 中科视语(北京)科技有限公司 行人重识别方法、系统、电子设备及介质
CN112307886A (zh) * 2020-08-25 2021-02-02 北京京东尚科信息技术有限公司 行人重识别方法和装置

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109389589A (zh) * 2018-09-28 2019-02-26 百度在线网络技术(北京)有限公司 用于统计人数的方法和装置
CN110070073A (zh) * 2019-05-07 2019-07-30 国家广播电视总局广播电视科学研究院 基于注意力机制的全局特征和局部特征的行人再识别方法

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109784166A (zh) * 2018-12-13 2019-05-21 北京飞搜科技有限公司 行人重识别的方法及装置
CN110543841A (zh) * 2019-08-21 2019-12-06 中科视语(北京)科技有限公司 行人重识别方法、系统、电子设备及介质
CN112307886A (zh) * 2020-08-25 2021-02-02 北京京东尚科信息技术有限公司 行人重识别方法和装置

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
See also references of EP4137991A4 *
XU BOQIANG BOQIANG.XU@CRIPAC.IA.AC.CN; HE LINGXIAO HELINGXIAO3@JD.COM; LIAO XINGYU LIAOXINGYU5@JD.COM; LIU WU LIUWU1@JD.COM; SUN Z: "Black Re-ID A Head-shoulder Descriptor for the Challenging Problem of Person Re-Identification", PROCEEDINGS OF THE 28TH ACM INTERNATIONAL CONFERENCE ON MULTIMEDIA, ACMPUB27, NEW YORK, NY, USA, 12 October 2020 (2020-10-12) - 16 October 2020 (2020-10-16), New York, NY, USA , pages 673 - 681, XP058478628, ISBN: 978-1-4503-7988-5, DOI: 10.1145/3394171.3414056 *

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115631509A (zh) * 2022-10-24 2023-01-20 智慧眼科技股份有限公司 一种行人再识别方法、装置、计算机设备及存储介质
CN116311105A (zh) * 2023-05-15 2023-06-23 山东交通学院 一种基于样本间上下文指导网络的车辆重识别方法
CN116311105B (zh) * 2023-05-15 2023-09-19 山东交通学院 一种基于样本间上下文指导网络的车辆重识别方法

Also Published As

Publication number Publication date
EP4137991A4 (en) 2024-04-03
EP4137991A1 (en) 2023-02-22
CN112307886A (zh) 2021-02-02
US20230334890A1 (en) 2023-10-19

Similar Documents

Publication Publication Date Title
WO2022041830A1 (zh) 行人重识别方法和装置
CN108229504B (zh) 图像解析方法及装置
CN108898086B (zh) 视频图像处理方法及装置、计算机可读介质和电子设备
JP6994588B2 (ja) 顔特徴抽出モデル訓練方法、顔特徴抽出方法、装置、機器および記憶媒体
WO2020199484A1 (zh) 基于视频的轨迹跟踪方法、装置、计算机设备及存储介质
CN111368685B (zh) 关键点的识别方法、装置、可读介质和电子设备
US9129191B2 (en) Semantic object selection
CN108197618B (zh) 用于生成人脸检测模型的方法和装置
CN112954450B (zh) 视频处理方法、装置、电子设备和存储介质
CN112861575A (zh) 一种行人结构化方法、装置、设备和存储介质
CN111797653A (zh) 基于高维图像的图像标注方法和装置
CN112381104B (zh) 一种图像识别方法、装置、计算机设备及存储介质
TW202026948A (zh) 活體檢測方法、裝置以及儲存介質
CN110163076A (zh) 一种图像数据处理方法和相关装置
CN108229418B (zh) 人体关键点检测方法和装置、电子设备、存储介质和程序
CN111738120B (zh) 人物识别方法、装置、电子设备及存储介质
WO2021031954A1 (zh) 对象数量确定方法、装置、存储介质与电子设备
CN114612987B (zh) 一种表情识别方法及装置
US20200019789A1 (en) Information generating method and apparatus applied to terminal device
CN113784171A (zh) 视频数据处理方法、装置、计算机系统及可读存储介质
CN110110666A (zh) 目标检测方法和装置
CN114332993A (zh) 人脸识别方法、装置、电子设备及计算机可读存储介质
CN116704432A (zh) 基于分布不确定性的多模态特征迁移人群计数方法及装置
CN112101479B (zh) 一种发型识别方法及装置
CN115019057A (zh) 图像特征提取模型确定方法及装置、图像识别方法及装置

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 21859677

Country of ref document: EP

Kind code of ref document: A1

ENP Entry into the national phase

Ref document number: 2021859677

Country of ref document: EP

Effective date: 20221115

NENP Non-entry into the national phase

Ref country code: DE