WO2023173544A1 - Person re-identification method and apparatus based on artificial intelligence, and device and storage medium - Google Patents

Person re-identification method and apparatus based on artificial intelligence, and device and storage medium

Info

Publication number
WO2023173544A1
Authority
WO
WIPO (PCT)
Prior art keywords
feature
vector
initial
human body
target
Application number
PCT/CN2022/090156
Other languages
French (fr)
Chinese (zh)
Inventor
郑喜民
朱翌
舒畅
陈又新
Original Assignee
平安科技(深圳)有限公司
Application filed by 平安科技(深圳)有限公司
Publication of WO2023173544A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G06F18/241 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G06F18/2415 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on parametric or probabilistic models, e.g. based on likelihood ratio or false acceptance rate versus a false rejection rate
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/25 Fusion techniques
    • G06F18/253 Fusion techniques of extracted features
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks

Definitions

  • This application relates to the field of artificial intelligence technology, and in particular to a pedestrian re-identification method, device, equipment and storage medium based on artificial intelligence.
  • Pedestrian re-identification, also known as person re-identification, is a technology that uses computer vision to determine whether a specific pedestrian is present in an image or video sequence. It is widely regarded as a sub-problem of image retrieval: given a monitored pedestrian image, the task is to retrieve whether the human body image database contains human body images of the same pedestrian.
  • The pedestrian re-identification task first trains a feature extraction network: a loss value is obtained from the cosine metric distance computed on the extracted features, and an optimizer repeatedly updates the network parameters based on that loss value so that the network learns.
  • Features close to the input layer often contain more local, detailed semantic information, while features close to the output layer often contain higher-level global semantic information.
  • The main purpose of this application is to provide a method, device, equipment and storage medium for pedestrian re-identification based on artificial intelligence, aiming to solve the technical problem that existing pedestrian re-identification methods match only the feature vector of the last layer of the feature extraction network against the human body images in the human body image database, without considering the underlying features, which results in low accuracy of pedestrian re-identification.
  • This application proposes a pedestrian re-identification method based on artificial intelligence, which includes:
  • inputting the target image into a preset feature extraction model to obtain the feature vector to be analyzed output by each feature output module, wherein the feature extraction model includes a feature pyramid and a plurality of the feature output modules, and the feature pyramid is connected to each of the feature output modules respectively;
  • the pedestrian re-identification result is determined.
  • This application also proposes a pedestrian re-identification device based on artificial intelligence, which includes:
  • an image acquisition module, used to acquire a target image;
  • a to-be-analyzed feature vector determination module, used to input the target image into a preset feature extraction model to obtain the feature vector to be analyzed output by each feature output module, wherein the feature extraction model includes a feature pyramid and a plurality of the feature output modules, and the feature pyramid is connected to each of the feature output modules respectively;
  • a classification probability prediction result determination module, used to input each of the feature vectors to be analyzed into a preset classification prediction module to perform classification probability prediction and obtain the classification probability prediction result;
  • a similar human body image set determination module, configured to determine a similar human body image set from a preset human body image library based on the target feature vector and a preset number of similar images, wherein the target feature vector is any one of the feature vectors to be analyzed;
  • a soft voting score determination module, configured to perform, for each human body image in each of the similar human body image sets, a weighted sum of each of the classification probability prediction results and the weight of each of the classification prediction modules to obtain a soft voting score;
  • a pedestrian re-identification result determination module, used to determine the pedestrian re-identification result according to each of the soft voting scores.
  • This application also proposes a computer device, including a memory and a processor.
  • The memory stores a computer program.
  • When the processor executes the computer program, it implements an artificial intelligence-based pedestrian re-identification method, wherein the method includes the following steps:
  • inputting the target image into a preset feature extraction model to obtain the feature vector to be analyzed output by each feature output module, wherein the feature extraction model includes a feature pyramid and a plurality of the feature output modules, and the feature pyramid is connected to each of the feature output modules respectively;
  • the pedestrian re-identification result is determined.
  • This application also proposes a computer-readable storage medium on which a computer program is stored.
  • When the computer program is executed by a processor, it implements an artificial intelligence-based pedestrian re-identification method, wherein the method includes the following steps:
  • inputting the target image into a preset feature extraction model to obtain the feature vector to be analyzed output by each feature output module, wherein the feature extraction model includes a feature pyramid and a plurality of the feature output modules, and the feature pyramid is connected to each of the feature output modules respectively;
  • the pedestrian re-identification result is determined.
  • The artificial intelligence-based pedestrian re-identification method, device, equipment and storage medium of this application improve the accuracy of pedestrian re-identification.
  • Figure 1 is a schematic flow chart of a pedestrian re-identification method based on artificial intelligence according to an embodiment of the present application;
  • Figure 2 is a schematic structural block diagram of an artificial intelligence-based pedestrian re-identification device according to an embodiment of the present application;
  • Figure 3 is a schematic structural block diagram of a computer device according to an embodiment of the present application.
  • An embodiment of the present application provides a pedestrian re-identification method based on artificial intelligence. Please refer to Figure 1 for details.
  • The method includes:
  • S2 Input the target image into a preset feature extraction model to obtain the feature vector to be analyzed output by each feature output module, wherein the feature extraction model includes a feature pyramid and a plurality of the feature output modules, and the feature pyramid is connected to each of the feature output modules respectively;
  • S3 Input each feature vector to be analyzed into the preset classification prediction module to perform classification probability prediction, and obtain the classification probability prediction result;
  • S4 Determine a set of similar human body images from the preset human body image library according to the target feature vector and the preset number of similar images, where the target feature vector is any one of the feature vectors to be analyzed;
  • The feature pyramid of the feature extraction model is connected to each feature output module respectively, so that feature vectors to be analyzed with different levels of semantic information are obtained. This lets the method attend both to low-level features such as clothing color and texture and to high-level global semantic information, which improves the accuracy of pedestrian re-identification. In addition, the soft voting score is determined within the similar human body image set selected using the target feature vector and the preset number of similar images, which further improves the accuracy of pedestrian re-identification.
  • The target image can be obtained from a database, received as input from a user, or obtained from a third-party application.
  • The target image is an image that needs to be retrieved in the human body image database.
  • The target image is an image taken of a human body.
  • The target image is input into a preset feature extraction model, and each feature output module of the feature extraction model outputs a feature vector to be analyzed.
  • The feature pyramid includes a bottom-up feature extraction link, a top-down feature fusion link and horizontal connections. The horizontal connections are the connections between the bottom-up feature extraction link and the top-down feature fusion link.
  • Each feature fusion layer of the top-down feature fusion link of the feature pyramid is connected to one of the feature output modules, and the feature fusion layers correspond to the feature output modules one-to-one.
  • The bottom-up feature extraction link is the forward pass of the network. During the forward pass, the size of the feature map changes after some network layers and stays the same after others; network layers that produce feature maps of the same size are grouped into one feature extraction layer.
  • The top-down feature fusion link uses upsampling to enlarge high-level feature maps proportionally.
  • A horizontal connection fuses the feature vector obtained by upsampling with the feature vector of the same size output by the bottom-up feature extraction link, after the latter has undergone channel expansion by a 1*1 convolution kernel, to obtain the fused feature vector.
  • The fused feature vector is output to the feature output module.
  • each feature vector to be analyzed is input into a preset classification prediction module for classification probability prediction, and the predicted probability vector is used as the classification probability prediction result.
  • the number of vector elements in the classification probability prediction result is the same as the number of human body images in the human body image database. That is to say, the vector elements in the classification probability prediction result correspond to the human body images in the human body image database.
  • the value of the vector element in the classification probability prediction result is the probability that the human body in the target image is similar to the human body in the human body image corresponding to the vector element.
  • the number of classification probability prediction results is the same as the number of feature vectors to be analyzed.
  • the classification prediction module is a fully connected layer using softmax activation function.
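  • As a minimal sketch of such a classification prediction module, the following pure-Python code applies one fully connected layer followed by softmax. The function names, the toy weights and the 3-image library size are illustrative assumptions, not values from the application.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

def classify(feature, weights, biases):
    """Fully connected layer followed by softmax.

    feature: list of d floats (one feature vector to be analyzed)
    weights: one row of d floats per human body image in the library
    biases:  one float per human body image in the library
    """
    logits = [sum(w * x for w, x in zip(row, feature)) + b
              for row, b in zip(weights, biases)]
    return softmax(logits)

# Hypothetical toy values: a 2-dimensional feature and a 3-image library.
feature = [1.0, 2.0]
weights = [[0.5, 0.0], [0.0, 0.5], [0.1, 0.1]]
biases = [0.0, 0.0, 0.0]
probs = classify(feature, weights, biases)
```

  • The output has one element per human body image in the library and sums to 1, which matches the description of the classification probability prediction result as a probability vector.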
  • Any one of the feature vectors to be analyzed is used as the target feature vector. According to the target feature vector, the most similar human body images are determined from the preset human body image library as the similar human body image set, and the number of images in the similar human body image set is equal to the number of similar images.
  • The human body image library includes image identifiers, human body images, and the feature vectors corresponding to the human body images.
  • Human body images are images taken of the human body.
  • For example, when the number of classification probability prediction results is 4: the vector element value corresponding to the image to be voted on in the first classification probability prediction result (output by the first classification prediction module) is multiplied by the weight of the first classification prediction module to obtain the first score; the vector element value corresponding to the image to be voted on in the second classification probability prediction result (output by the second classification prediction module) is multiplied by the weight of the second classification prediction module to obtain the second score; the vector element value corresponding to the image to be voted on in the third classification probability prediction result (output by the third classification prediction module) is multiplied by the weight of the third classification prediction module to obtain the third score; and the vector element value corresponding to the image to be voted on in the fourth classification probability prediction result (output by the fourth classification prediction module) is multiplied by the weight of the fourth classification prediction module to obtain the fourth score. The first score, the second score, the third score and the fourth score are then added, and the sum is used as the soft voting score corresponding to the image to be voted on.
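  • The four-module example above can be sketched as a weighted sum per candidate image. The probability vectors, module weights and candidate indices below are hypothetical values chosen only to make the arithmetic easy to follow.

```python
def soft_voting_scores(prob_results, module_weights, candidate_ids):
    """Weighted sum of per-module probabilities for each candidate image.

    prob_results:   one probability vector per classification prediction module
    module_weights: one weight per classification prediction module
    candidate_ids:  library indices of the images in the similar image set
    """
    return {
        image_id: sum(w * probs[image_id]
                      for w, probs in zip(module_weights, prob_results))
        for image_id in candidate_ids
    }

# Hypothetical values: 4 modules, a library of 3 images, 2 candidates.
prob_results = [
    [0.7, 0.2, 0.1],
    [0.6, 0.3, 0.1],
    [0.2, 0.5, 0.3],
    [0.5, 0.4, 0.1],
]
module_weights = [0.4, 0.3, 0.2, 0.1]
scores = soft_voting_scores(prob_results, module_weights, [0, 1])
best_image = max(scores, key=scores.get)
```

  • With these numbers, image 0 scores 0.28 + 0.18 + 0.04 + 0.05 = 0.55 and image 1 scores 0.08 + 0.09 + 0.10 + 0.04 = 0.31, so image 0 has the largest soft voting score.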
  • the human body image corresponding to the soft voting score with the largest value in each of the soft voting scores is used as the hit image of the pedestrian re-identification result.
  • the above-mentioned step of inputting the target image into a preset feature extraction model to obtain the feature vector to be analyzed output by each feature output module includes:
  • S22 Input the i-1th feature initial vector into the i-th feature extraction layer of the feature pyramid to obtain the i-th feature initial vector, where i is greater than 1 and less than n+1, n is an integer greater than 2;
  • S25 Input the m-th fused feature vector into the m-th feature output module for feature output, and obtain the m-th feature vector to be analyzed, where m is greater than 0 and less than n.
  • This embodiment implements feature extraction using the bottom-up feature extraction link, the top-down feature fusion link and the horizontal connections, and outputs one feature vector to be analyzed for each fused feature vector. The resulting feature vectors to be analyzed carry semantic information of different levels, so that both low-level features such as clothing color and texture and high-level global semantic information are taken into account, thereby improving the accuracy of pedestrian re-identification.
  • the 1st to nth feature extraction layers of the feature pyramid can use a Resnet50 network (residual network).
  • the target image is input into the first feature extraction layer of the feature pyramid, and the feature vector output by the first feature extraction layer is used as the first feature initial vector.
  • i is greater than 1 and less than n+1, and n is an integer greater than 2.
  • the number of feature initial vectors is n.
  • the nth feature initial vector is input into the first feature fusion layer of the feature pyramid for channel expansion, and the feature vector obtained by channel expansion is used as the first fusion feature vector.
  • the k-1th fusion feature vector and the n-k+1th feature initial vector are input into the k-th feature fusion layer of the feature pyramid for element addition to achieve feature fusion.
  • the feature vector obtained by feature fusion is used as the k-th fused feature vector, and k is greater than 1 and less than n, that is, the number of fused feature vectors is n-1.
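  • The index bookkeeping described above can be checked with a short sketch. The function name and the dictionary layout are illustrative assumptions; only the index relations (fused vector 1 from initial vector n, fused vector k from fused vector k-1 and initial vector n-k+1, with 1 < k < n) come from the text.

```python
def fusion_schedule(n):
    """Return, for each fused feature vector, the inputs it is built from.

    Follows the indexing in the text: fused vector 1 comes from initial
    vector n, and fused vector k (1 < k < n) comes from fused vector k-1
    and initial vector n-k+1.
    """
    if n <= 2:
        raise ValueError("n must be an integer greater than 2")
    schedule = {1: {"initial": n, "fused": None}}
    for k in range(2, n):
        schedule[k] = {"initial": n - k + 1, "fused": k - 1}
    return schedule

plan = fusion_schedule(4)
```

  • For n = 4 the schedule contains three fused feature vectors, consistent with the stated count of n-1; under this indexing the first feature initial vector is not consumed by any fusion layer.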
  • the above-mentioned step of inputting the n-th feature initial vector into the first feature fusion layer of the feature pyramid for feature processing to obtain the first fused feature vector includes:
  • S231 Use the channel expansion convolution kernel of the first feature fusion layer to perform channel expansion on the nth feature initial vector to obtain the first fusion feature vector, wherein the channel expansion convolution kernel is a 1*1 convolution kernel;
  • The above-mentioned step of inputting the k-1th fusion feature vector and the n-k+1th feature initial vector into the k-th feature fusion layer of the feature pyramid for feature fusion to obtain the k-th fused feature vector includes:
  • This embodiment uses channel expansion so that fusion is performed on features with the same number of channels, which provides a basis for obtaining feature vectors to be analyzed with different levels of semantic information.
  • The channel expansion convolution kernel of the first feature fusion layer is used to convolve the nth feature initial vector to achieve channel expansion, and the feature vector obtained by convolution is used as the first fusion feature vector.
  • The channel expansion convolution kernel of the kth feature fusion layer is used to convolve the n-k+1th feature initial vector to achieve channel expansion, and the feature vector obtained by convolution is used as the kth channel expansion feature.
  • The nearest neighbor interpolation processing sub-layer of the k-th feature fusion layer is used to perform nearest neighbor interpolation on the k-1th fusion feature vector to achieve proportional enlargement, and the enlarged feature vector is used as the kth equal-scale enlargement feature.
  • The amplification ratio of the nearest neighbor interpolation processing sub-layer is 2.
  • A 1*1 convolution kernel is used to fuse the k-th channel expansion feature and the k-th equal-scale enlargement feature, and the data obtained by the fusion processing is used as the k-th fusion feature vector.
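  • A minimal pure-Python sketch of one fusion layer follows. It assumes feature maps stored as nested lists indexed [channel][row][column] and assumes the final fusion is the element addition mentioned for the k-th feature fusion layer; any further 1*1 convolution applied during fusion is omitted. All names and toy values are hypothetical.

```python
def conv1x1(fmap, kernel):
    """1*1 convolution: mix channels at every pixel.

    fmap:   list of C_in channels, each an H x W list of lists
    kernel: C_out rows of C_in weights
    """
    height, width = len(fmap[0]), len(fmap[0][0])
    return [
        [[sum(w * fmap[c][y][x] for c, w in enumerate(row))
          for x in range(width)]
         for y in range(height)]
        for row in kernel
    ]

def upsample_nearest_2x(fmap):
    """Nearest neighbor interpolation with an amplification ratio of 2."""
    out = []
    for channel in fmap:
        rows = []
        for row in channel:
            wide = [v for v in row for _ in range(2)]
            rows.append(wide)
            rows.append(list(wide))
        out.append(rows)
    return out

def add_maps(a, b):
    """Element-wise addition of two feature maps of identical shape."""
    return [[[p + q for p, q in zip(ra, rb)] for ra, rb in zip(ca, cb)]
            for ca, cb in zip(a, b)]

def fuse(prev_fused, initial, kernel):
    """k-th fusion layer: expand channels, upsample, then add."""
    expanded = conv1x1(initial, kernel)
    enlarged = upsample_nearest_2x(prev_fused)
    return add_maps(expanded, enlarged)

# Hypothetical toy shapes: prev_fused has 2 channels of size 1x1;
# initial has 1 channel of size 2x2 and is expanded to 2 channels.
prev_fused = [[[1.0]], [[2.0]]]
initial = [[[1.0, 2.0], [3.0, 4.0]]]
kernel = [[1.0], [0.5]]
fused = fuse(prev_fused, initial, kernel)
```

  • Channel expansion is what makes the addition well defined: after the 1*1 convolution the lower-level map has the same channel count as the upsampled higher-level map.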
  • the above-mentioned step of inputting the m-th fused feature vector into the m-th feature output module for feature output to obtain the m-th feature vector to be analyzed includes:
  • S251 Use the aliasing effect elimination layer of the mth feature output module to perform aliasing effect elimination on the mth fusion feature vector to obtain the mth aliasing effect eliminated feature vector, wherein the aliasing effect elimination layer is a 3*3 convolution kernel;
  • S252 Input the m-th aliasing effect eliminated feature vector into the pooling layer of the m-th feature output module for pooling processing to obtain the m-th feature vector to be analyzed.
  • To eliminate the aliasing effect, this embodiment uses a 3*3 convolution kernel for convolution, thereby improving the accuracy of the feature vector to be analyzed.
  • the aliasing effect elimination layer of the mth feature output module is used to convolve the mth fusion feature vector, and the convolved data is used as the mth aliasing effect eliminated feature vector.
  • convolution kernels of other sizes can also be used for convolution to eliminate aliasing effects, which is not limited here.
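  • The feature output module can be sketched as a 3*3 convolution followed by pooling. This sketch makes two simplifying assumptions that the text does not state: each channel is filtered independently with its own 3*3 kernel (no cross-channel mixing), and the pooling layer is global average pooling. Zero padding is used so that the map size is preserved.

```python
def conv3x3_same(channel, kernel):
    """3*3 convolution on one H x W channel with zero padding.

    Uses the cross-correlation convention common in deep learning.
    """
    height, width = len(channel), len(channel[0])
    out = [[0.0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            acc = 0.0
            for dy in (-1, 0, 1):
                for dx in (-1, 0, 1):
                    yy, xx = y + dy, x + dx
                    if 0 <= yy < height and 0 <= xx < width:
                        acc += kernel[dy + 1][dx + 1] * channel[yy][xx]
            out[y][x] = acc
    return out

def global_average_pool(fmap):
    """Reduce each channel to its mean, giving one value per channel."""
    return [sum(sum(row) for row in ch) / (len(ch) * len(ch[0]))
            for ch in fmap]

def feature_output(fused, kernels):
    """Filter every channel with its own 3*3 kernel, then pool."""
    smoothed = [conv3x3_same(ch, k) for ch, k in zip(fused, kernels)]
    return global_average_pool(smoothed)

# Hypothetical input: one 2x2 channel and an identity kernel.
identity = [[0.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 0.0]]
vector = feature_output([[[1.0, 2.0], [3.0, 4.0]]], [identity])
```

  • With the identity kernel the convolution leaves the map unchanged, so the pooled output is simply the channel mean; a learned kernel would smooth the artifacts introduced by upsampling before pooling.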
  • the above-mentioned step of determining a set of similar human body images from a preset human body image library based on the target feature vector and the preset number of similar images includes:
  • S42 From the first similarities, find those with the largest values, as many as the number of similar images, as an initial similarity set;
  • This embodiment first finds the human body images most similar to the target feature vector, as many as the number of similar images, then averages the feature vectors of the found human body images together with the target feature vector to obtain the adjusted feature vector, then finds the human body images most similar to the adjusted feature vector, again as many as the number of similar images, and uses the found human body images as the similar human body image set. Because the human body images most similar to the target feature vector have high confidence, the averaging introduces little noise.
  • Computing the adjusted feature vector as this average, and then retrieving the human body images most similar to the adjusted feature vector, improves the overall recall rate.
  • The feature vectors of the human body images corresponding to the initial similarity set and the target feature vector are gathered to obtain a set to be calculated; the average of the feature vectors in the set to be calculated is computed, and the result is used as the adjusted feature vector.
  • The number of the second similarities is equal to the number of similar images.
  • The human body images corresponding to the target similarity set are the human body images in the human body image library that are most similar to the features represented by the target feature vector; therefore, the human body images corresponding to the target similarity set are used as the similar human body image set.
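  • The two-pass retrieval described above can be sketched as follows. Cosine similarity is assumed as the similarity measure, and the library contents, identifiers and function names are hypothetical.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def top_k(query, library, k):
    """Ids of the k library images most similar to the query."""
    ranked = sorted(library, key=lambda i: cosine(query, library[i]),
                    reverse=True)
    return ranked[:k]

def similar_image_set(target, library, k):
    """Retrieve, average the hits with the target, then retrieve again."""
    first_hits = top_k(target, library, k)
    pool = [library[i] for i in first_hits] + [target]
    adjusted = [sum(col) / len(pool) for col in zip(*pool)]
    return top_k(adjusted, library, k)

# Hypothetical library: image id -> stored feature vector.
library = {
    "a": [1.0, 0.0],
    "b": [0.9, 0.1],
    "c": [0.0, 1.0],
    "d": [0.7, 0.7],
}
result = similar_image_set([1.0, 0.05], library, 2)
```

  • Here the first pass selects images a and b, the adjusted feature vector is the mean of their feature vectors and the target feature vector, and the second pass over the library returns the final similar human body image set.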
  • the above-mentioned step of determining the pedestrian re-identification result based on each of the soft voting scores includes:
  • S61 Find the soft voting score with the largest value from each of the soft voting scores, and obtain the target score;
  • S62 Determine whether the target score is greater than the preset score threshold;
  • The human body image corresponding to a soft voting score greater than the preset score threshold is used as the hit image of the pedestrian re-identification result, thereby improving the accuracy of the determined pedestrian re-identification result.
  • The soft voting score with the largest value is found from each of the soft voting scores, and the found soft voting score is used as the target score.
  • If the target score is greater than the preset score threshold, the soft voting score with the largest value exceeds the threshold, so the recognition result of the pedestrian re-identification result is determined to be success, and the human body image corresponding to the target score is used as the hit image of the pedestrian re-identification result.
  • If the target score is less than or equal to the preset score threshold, the soft voting score with the largest value does not exceed the threshold, so the recognition result of the pedestrian re-identification result is determined to be failure.
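  • The thresholded decision in S61 and S62 can be sketched in a few lines. The result dictionary layout, the scores and the threshold values are illustrative assumptions.

```python
def reid_result(soft_scores, threshold):
    """Pick the best-scoring image and accept it only above the threshold.

    soft_scores: mapping from image id to soft voting score
    threshold:   preset score threshold
    """
    best_id = max(soft_scores, key=soft_scores.get)
    target_score = soft_scores[best_id]
    if target_score > threshold:
        return {"success": True, "hit_image": best_id}
    return {"success": False, "hit_image": None}

# Hypothetical scores and thresholds.
accepted = reid_result({"a": 0.55, "b": 0.31}, threshold=0.5)
rejected = reid_result({"a": 0.55, "b": 0.31}, threshold=0.55)
```

  • The comparison is strict, so a target score exactly equal to the threshold is treated as a failure, as the text specifies.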
  • Before the above-mentioned step of inputting the target image into a preset feature extraction model to obtain the feature vector to be analyzed output by each feature output module, the method further includes:
  • S71 Obtain an initial model and a training sample set, where the initial model includes an initial feature pyramid, multiple feature output initial modules and multiple classification prediction initial modules; the initial feature pyramid is connected to each of the feature output initial modules, each feature output initial module is connected to a classification prediction initial module, and the feature output initial modules and the classification prediction initial modules correspond one to one;
  • The weight-based weak classifier integration method and the training sample set are used to train the initial model, and the trained initial model is used as the target model, where the initial feature pyramid of the target model serves as the feature pyramid, the feature output initial module of the target model serves as the feature output module, and the classification prediction initial module of the target model serves as the classification prediction module.
  • This embodiment uses a weight-based weak classifier integration method to train the initial model, thereby increasing the weight of classification prediction initial modules with a low error rate in each round of training while reducing the weight of classification prediction initial modules with a high error rate, so that the classification prediction initial modules perform better on misclassified data.
  • training samples include: sample images and classification probability calibration values.
  • the sample images are images taken of the human body.
  • the classification probability calibration value is an accurate calibration result of whether the human body in the sample image and the human body in each human body image in the human body image library are the same person.
  • The weight-based weak classifier integration method and the training sample set are used to train the initial model, and the trained initial model is used as the target model, thereby increasing the weight of classification prediction initial modules with a low error rate in each round of training and reducing the weight of classification prediction initial modules with a high error rate.
  • the value calculated by Q y1 updates the weight of the y1th classification prediction initial module in the reverse-ordered classification prediction initial module set.
  • the value calculated by Q y2 updates the weight of the y2th classification prediction initial module in the reverse-ordered classification prediction initial module set.
  • a is set to 1.1.
  • This application also proposes a pedestrian re-identification device based on artificial intelligence.
  • The device includes:
  • the image acquisition module 100, used to acquire a target image;
  • the to-be-analyzed feature vector determination module 200, used to input the target image into a preset feature extraction model to obtain the feature vector to be analyzed output by each feature output module, wherein the feature extraction model includes a feature pyramid and a plurality of the feature output modules, and the feature pyramid is connected to each of the feature output modules respectively;
  • the classification probability prediction result determination module 300, used to input each of the feature vectors to be analyzed into a preset classification prediction module to perform classification probability prediction and obtain the classification probability prediction result;
  • the similar human body image set determination module 400, used to determine a similar human body image set from the preset human body image library according to the target feature vector and the preset number of similar images, wherein the target feature vector is any one of the feature vectors to be analyzed;
  • the soft voting score determination module 500, configured to perform, for each human body image in each of the similar human body image sets, a weighted sum of each classification probability prediction result and the weight of each classification prediction module to obtain a soft voting score;
  • the pedestrian re-identification result determination module 600, used to determine the pedestrian re-identification result according to each of the soft voting scores.
  • The feature pyramid of the feature extraction model is connected to each feature output module respectively, so that feature vectors to be analyzed with different levels of semantic information are obtained. This lets the device attend both to low-level features such as clothing color and texture and to high-level global semantic information, which improves the accuracy of pedestrian re-identification. In addition, the soft voting score is determined within the similar human body image set selected using the target feature vector and the preset number of similar images, which further improves the accuracy of pedestrian re-identification.
  • An embodiment of the present application also provides a computer device.
  • The computer device may be a server, and its internal structure may be as shown in FIG. 3.
  • The computer device includes a processor, a memory, a network interface and a database connected through a system bus. The processor of the computer device provides computing and control capabilities.
  • The memory of the computer device includes a non-volatile storage medium and an internal memory.
  • The non-volatile storage medium stores an operating system, a computer program and a database. The internal memory provides an environment for running the operating system and the computer program stored in the non-volatile storage medium.
  • The database of the computer device is used to store data used by the artificial intelligence-based pedestrian re-identification method.
  • The network interface of the computer device is used to communicate with external terminals through a network connection.
  • The computer program is executed by the processor to implement a pedestrian re-identification method based on artificial intelligence.
  • The artificial intelligence-based pedestrian re-identification method includes: obtaining a target image; inputting the target image into a preset feature extraction model to obtain a feature vector to be analyzed output by each feature output module, wherein the feature extraction model includes a feature pyramid and a plurality of the feature output modules, and the feature pyramid is connected to each of the feature output modules respectively; inputting each of the feature vectors to be analyzed into a preset classification prediction module for classification probability prediction to obtain classification probability prediction results; determining a set of similar human body images from the preset human body image library according to the target feature vector and the preset number of similar images, wherein the target feature vector is any one of the feature vectors to be analyzed; for each human body image in each similar human body image set, performing a weighted sum of each classification probability prediction result and the weight of each classification prediction module to obtain a soft voting score; and determining the pedestrian re-identification result according to each soft voting score.
  • the feature pyramid of the feature extraction model is connected to each feature output module respectively, and feature vectors to be analyzed with different levels of semantic information are obtained, thereby achieving attention to low-level features such as clothing color and texture and high-level global semantic information, and improving The accuracy of pedestrian re-identification is improved; and the soft voting score is determined within the range of similar human image sets determined based on the target feature vector and the preset number of similar images, which further improves the accuracy of pedestrian re-identification.
  • An embodiment of the present application also provides a computer-readable storage medium.
  • The computer-readable storage medium may be non-volatile or volatile.
  • A computer program is stored thereon. When executed by a processor, the computer program implements an artificial intelligence-based pedestrian re-identification method, including the steps of: obtaining a target image; inputting the target image into a preset feature extraction model to obtain a feature vector to be analyzed output by each feature output module, wherein the feature extraction model includes a feature pyramid and a plurality of the feature output modules, and the feature pyramid is connected to each of the feature output modules; inputting each of the feature vectors to be analyzed into a preset classification prediction module for classification probability prediction to obtain classification probability prediction results; determining a similar human body image set from the preset human body image library according to the target feature vector and the preset number of similar images, wherein the target feature vector is any one of the feature vectors to be analyzed; for each human body image in each of the similar human body image sets, performing a weighted summation of each of the classification probability prediction results and the weight of each of the classification prediction modules to obtain a soft voting score; and determining the pedestrian re-identification result according to each of the soft voting scores.
  • In the method implemented above, the feature pyramid of the feature extraction model is connected to each feature output module, so feature vectors to be analyzed with different levels of semantic information are obtained. The method therefore attends both to low-level features such as clothing color and texture and to high-level global semantic information, which improves the accuracy of pedestrian re-identification. Determining the soft voting scores only within the similar human body image set, which is selected according to the target feature vector and the preset number of similar images, further improves the accuracy.
  • Non-volatile memory may include read-only memory (ROM), programmable ROM (PROM), electrically programmable ROM (EPROM), electrically erasable programmable ROM (EEPROM), or flash memory.
  • Volatile memory may include random access memory (RAM) or external cache memory.
  • RAM is available in many forms, such as static RAM (SRAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), double data rate SDRAM (DDR SDRAM), enhanced SDRAM (ESDRAM), Synchlink DRAM (SLDRAM), Rambus direct RAM (RDRAM), direct Rambus dynamic RAM (DRDRAM), and Rambus dynamic RAM (RDRAM).

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • General Engineering & Computer Science (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • General Physics & Mathematics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Evolutionary Biology (AREA)
  • Computing Systems (AREA)
  • Molecular Biology (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • General Health & Medical Sciences (AREA)
  • Probability & Statistics with Applications (AREA)
  • Computational Linguistics (AREA)
  • Biophysics (AREA)
  • Biomedical Technology (AREA)
  • Health & Medical Sciences (AREA)
  • Image Analysis (AREA)

Abstract

The present application relates to the technical field of artificial intelligence. Disclosed are a person re-identification method and apparatus based on artificial intelligence, and a device and a storage medium. The method comprises: inputting a target image into a preset feature extraction model, so as to obtain a feature vector to be analyzed that is output by each feature output module; inputting, into a preset classification prediction module, each feature vector to be analyzed, so as to perform classification probability prediction, and obtaining a classification probability prediction result; according to a target feature vector and a preset number of similar images, determining sets of similar human body images from a preset human body image library, wherein the target feature vector is any feature vector to be analyzed; for each human body image in each set of similar human body images, performing weighted summation on each classification probability prediction result and the weight of each classification prediction module, so as to obtain a soft voting score; and determining a person re-identification result according to the soft voting scores. Attention is paid to both low-level features, such as the color and texture of clothes, and high-level global semantic information, thereby improving the accuracy of person re-identification.

Description

Pedestrian re-identification method, apparatus, device and storage medium based on artificial intelligence
This application claims priority to Chinese patent application No. 202210256790.4, filed with the China Patent Office on March 16, 2022 and entitled "Pedestrian re-identification method, apparatus, device and storage medium based on artificial intelligence", the entire contents of which are incorporated herein by reference.
Technical field
This application relates to the field of artificial intelligence technology, and in particular to a pedestrian re-identification method, apparatus, device and storage medium based on artificial intelligence.
Background
Pedestrian re-identification (person re-identification) is a technology that uses computer vision to determine whether a specific pedestrian is present in an image or video sequence. It is widely regarded as a sub-problem of image retrieval: given a surveillance image of a pedestrian, the task is to determine whether a human body image library contains human body images of the same pedestrian.
Generally, a pedestrian re-identification task first trains a feature extraction network. A loss value is obtained from the cosine metric distance computed on the features, and an optimizer continually updates the network parameters according to the loss value so that the network learns.
In the feature extraction network of a pedestrian re-identification task, features close to the input layer tend to contain more semantic information about local details, while features close to the output layer tend to contain higher-level global semantic information. The inventors found that existing pedestrian re-identification methods often use only the feature vector of the last layer of the feature extraction network to match against the human body images in the human body image library and do not consider low-level features. Because small objects carry little pixel information, they are easily lost during downsampling in the feature extraction network, so the accuracy of pedestrian re-identification is low.
Technical problem
The main purpose of this application is to provide a pedestrian re-identification method, apparatus, device and storage medium based on artificial intelligence, aiming to solve the technical problem that existing pedestrian re-identification methods use only the feature vector of the last layer of the feature extraction network to match against the human body images in the human body image library and do not consider low-level features, which results in low re-identification accuracy.
Technical solution
The main purpose of this application is to provide a pedestrian re-identification method, apparatus, device and storage medium based on artificial intelligence, aiming to solve the technical problem that existing pedestrian re-identification methods use only the feature vector of the last layer of the feature extraction network to match against the human body images in the human body image library and do not consider low-level features, which results in low re-identification accuracy.
To achieve the above purpose, this application proposes a pedestrian re-identification method based on artificial intelligence, the method comprising:
obtaining a target image;
inputting the target image into a preset feature extraction model to obtain a feature vector to be analyzed output by each feature output module, wherein the feature extraction model comprises a feature pyramid and a plurality of the feature output modules, and the feature pyramid is connected to each of the feature output modules;
inputting each of the feature vectors to be analyzed into a preset classification prediction module for classification probability prediction to obtain a classification probability prediction result;
determining a similar human body image set from a preset human body image library according to a target feature vector and a preset number of similar images, wherein the target feature vector is any one of the feature vectors to be analyzed;
for each human body image in each of the similar human body image sets, performing a weighted summation of each of the classification probability prediction results and the weight of each of the classification prediction modules to obtain a soft voting score;
determining a pedestrian re-identification result according to each of the soft voting scores.
This application also proposes a pedestrian re-identification apparatus based on artificial intelligence, the apparatus comprising:
an image acquisition module, configured to obtain a target image;
a to-be-analyzed feature vector determination module, configured to input the target image into a preset feature extraction model to obtain a feature vector to be analyzed output by each feature output module, wherein the feature extraction model comprises a feature pyramid and a plurality of the feature output modules, and the feature pyramid is connected to each of the feature output modules;
a classification probability prediction result determination module, configured to input each of the feature vectors to be analyzed into a preset classification prediction module for classification probability prediction to obtain a classification probability prediction result;
a similar human body image set determination module, configured to determine a similar human body image set from a preset human body image library according to a target feature vector and a preset number of similar images, wherein the target feature vector is any one of the feature vectors to be analyzed;
a soft voting score determination module, configured to perform, for each human body image in each of the similar human body image sets, a weighted summation of each of the classification probability prediction results and the weight of each of the classification prediction modules to obtain a soft voting score;
a pedestrian re-identification result determination module, configured to determine a pedestrian re-identification result according to each of the soft voting scores.
This application also proposes a computer device comprising a memory and a processor, wherein the memory stores a computer program, and the processor, when executing the computer program, implements a pedestrian re-identification method based on artificial intelligence, the method comprising the following steps:
obtaining a target image;
inputting the target image into a preset feature extraction model to obtain a feature vector to be analyzed output by each feature output module, wherein the feature extraction model comprises a feature pyramid and a plurality of the feature output modules, and the feature pyramid is connected to each of the feature output modules;
inputting each of the feature vectors to be analyzed into a preset classification prediction module for classification probability prediction to obtain a classification probability prediction result;
determining a similar human body image set from a preset human body image library according to a target feature vector and a preset number of similar images, wherein the target feature vector is any one of the feature vectors to be analyzed;
for each human body image in each of the similar human body image sets, performing a weighted summation of each of the classification probability prediction results and the weight of each of the classification prediction modules to obtain a soft voting score;
determining a pedestrian re-identification result according to each of the soft voting scores.
This application also proposes a computer-readable storage medium on which a computer program is stored, wherein the computer program, when executed by a processor, implements a pedestrian re-identification method based on artificial intelligence, the method comprising the following steps:
obtaining a target image;
inputting the target image into a preset feature extraction model to obtain a feature vector to be analyzed output by each feature output module, wherein the feature extraction model comprises a feature pyramid and a plurality of the feature output modules, and the feature pyramid is connected to each of the feature output modules;
inputting each of the feature vectors to be analyzed into a preset classification prediction module for classification probability prediction to obtain a classification probability prediction result;
determining a similar human body image set from a preset human body image library according to a target feature vector and a preset number of similar images, wherein the target feature vector is any one of the feature vectors to be analyzed;
for each human body image in each of the similar human body image sets, performing a weighted summation of each of the classification probability prediction results and the weight of each of the classification prediction modules to obtain a soft voting score;
determining a pedestrian re-identification result according to each of the soft voting scores.
Beneficial effects
The artificial intelligence-based pedestrian re-identification method, apparatus, device and storage medium of this application improve the accuracy of pedestrian re-identification.
Description of the drawings
Figure 1 is a schematic flowchart of a pedestrian re-identification method based on artificial intelligence according to an embodiment of this application;
Figure 2 is a schematic structural block diagram of a pedestrian re-identification apparatus based on artificial intelligence according to an embodiment of this application;
Figure 3 is a schematic structural block diagram of a computer device according to an embodiment of this application.
Best mode for carrying out the invention
To solve the above problem, an embodiment of this application provides a pedestrian re-identification method based on artificial intelligence. Referring to Figure 1, the method comprises:
S1: obtaining a target image;
S2: inputting the target image into a preset feature extraction model to obtain a feature vector to be analyzed output by each feature output module, wherein the feature extraction model comprises a feature pyramid and a plurality of the feature output modules, and the feature pyramid is connected to each of the feature output modules;
S3: inputting each of the feature vectors to be analyzed into a preset classification prediction module for classification probability prediction to obtain a classification probability prediction result;
S4: determining a similar human body image set from a preset human body image library according to a target feature vector and a preset number of similar images, wherein the target feature vector is any one of the feature vectors to be analyzed;
S5: for each human body image in each of the similar human body image sets, performing a weighted summation of each of the classification probability prediction results and the weight of each of the classification prediction modules to obtain a soft voting score;
S6: determining a pedestrian re-identification result according to each of the soft voting scores.
In this embodiment, the feature pyramid of the feature extraction model is connected to each feature output module, so feature vectors to be analyzed with different levels of semantic information are obtained. The method therefore attends both to low-level features such as clothing color and texture and to high-level global semantic information, which improves the accuracy of pedestrian re-identification. Determining the soft voting scores only within the similar human body image set, which is selected according to the target feature vector and the preset number of similar images, further improves the accuracy.
For S1, the target image may be obtained from a database, from user input, or from a third-party application.
The target image is the image to be searched for in the human body image library. It is an image captured of a human body.
For S2, the target image is input into the preset feature extraction model, and each feature output module of the feature extraction model outputs one feature vector to be analyzed.
The feature pyramid includes a bottom-up feature extraction link, a top-down feature fusion link, and lateral connections, where a lateral connection links the bottom-up feature extraction link to the top-down feature fusion link.
Each feature fusion layer of the top-down feature fusion link of the feature pyramid is connected to one feature output module, and the feature fusion layers correspond one-to-one to the feature output modules.
The bottom-up feature extraction link is the forward pass of the network. During the forward pass, the size of the feature map changes after some network layers and stays the same after others; network layers that produce feature maps of the same size are grouped into one feature extraction layer.
The top-down feature fusion link enlarges high-level feature maps proportionally by upsampling.
A lateral connection fuses the feature vector obtained by upsampling with the feature vector obtained by passing the same-sized feature map output by the bottom-up feature extraction link through the channel expansion of a 1*1 convolution kernel, yielding a fused feature vector. The fused feature vector is output to the feature output module.
For S3, each feature vector to be analyzed is input into the preset classification prediction module for classification probability prediction, and the predicted probability vector is taken as the classification probability prediction result.
The number of vector elements in a classification probability prediction result equals the number of human body images in the human body image library; that is, the vector elements correspond one-to-one to the human body images in the library.
The value of a vector element in a classification probability prediction result is the probability that the human body in the target image is similar to the human body in the human body image corresponding to that vector element.
It can be understood that the number of classification probability prediction results equals the number of feature vectors to be analyzed.
The classification prediction module is a fully connected layer with a softmax activation function.
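As a minimal sketch of the classification prediction module described above (a fully connected layer followed by softmax), the pure-Python example below maps one feature vector to a probability vector with one element per library image. The function names, the weight layout, and the toy numbers are illustrative assumptions; the source does not specify them.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(value - peak) for value in logits]
    total = sum(exps)
    return [e / total for e in exps]


def classify(feature, weights, biases):
    """Fully connected layer followed by softmax.

    feature: list of d floats (one feature vector to be analyzed).
    weights: one row of d floats per library image (hypothetical layout).
    biases:  one float per library image.
    """
    logits = [
        sum(w * x for w, x in zip(row, feature)) + bias
        for row, bias in zip(weights, biases)
    ]
    return softmax(logits)


# Toy example: a 2-dimensional feature and a library of 3 images.
probs = classify(
    [1.0, 2.0],
    [[0.5, 0.0], [0.0, 0.5], [0.5, 0.5]],
    [0.0, 0.0, 0.0],
)
```

Each element of `probs` plays the role of one vector element of a classification probability prediction result; in the described method, one such vector would be produced per feature vector to be analyzed.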
For S4, any one of the feature vectors to be analyzed is taken as the target feature vector. According to the target feature vector, the human body images whose human bodies are most similar are determined from the preset human body image library and form the similar human body image set; the number of images in the similar human body image set equals the preset number of similar images.
The human body image library includes image identifiers, human body images, and the feature vector corresponding to each human body image. A human body image is an image captured of a human body.
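The retrieval in S4 can be sketched as a top-k search over the feature vectors stored in the library. The source does not state which similarity measure is used for this step; cosine similarity is assumed here only because the background section mentions a cosine metric distance. The names `similar_image_set` and `library`, and the toy vectors, are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def similar_image_set(target, library, num_similar):
    """Return the identifiers of the num_similar most similar library images.

    target:  the target feature vector.
    library: dict mapping image identifier -> stored feature vector.
    """
    ranked = sorted(
        library,
        key=lambda image_id: cosine_similarity(target, library[image_id]),
        reverse=True,
    )
    return ranked[:num_similar]


# Toy library of three images with 2-dimensional features.
library = {"img_a": [1.0, 0.0], "img_b": [0.0, 1.0], "img_c": [1.0, 1.0]}
top2 = similar_image_set([1.0, 0.1], library, 2)
```

A full sort is the simplest choice for a sketch; for a large library a partial selection such as `heapq.nlargest` would avoid sorting every entry.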
For S5, the similar human body image sets are merged (set union) and deduplicated to obtain a target human body image set. Any human body image in the target human body image set is taken as the image to be voted on. The vector element values corresponding to the image to be voted on in each classification probability prediction result are weighted by the weights of the respective classification prediction modules and summed, and the resulting value is the soft voting score of the image to be voted on.
For example, suppose there are 4 classification probability prediction results, where the j-th result is output by the j-th classification prediction module. For each j from 1 to 4, the vector element value corresponding to the image to be voted on in the j-th result is multiplied by the weight of the j-th classification prediction module to obtain the j-th score. The 1st, 2nd, 3rd and 4th scores are then added, and the sum is the soft voting score of the image to be voted on.
For S6, the human body image corresponding to the largest soft voting score is taken as the hit image of the pedestrian re-identification result.
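The soft voting of S5 and the selection of S6 can be sketched as follows. The sketch assumes that library images are addressed by integer index and that there may be several similar image sets to merge; the function name, the module weights, and the probabilities are made-up illustrative values, not figures from the source.

```python
def soft_vote(similar_sets, prob_results, module_weights):
    """Compute soft voting scores and pick the hit image.

    similar_sets:   list of lists of library indices (similar image sets).
    prob_results:   one probability vector per classification module.
    module_weights: one weight per classification module.
    """
    # S5: merge the similar sets and drop duplicates.
    candidates = sorted(set().union(*similar_sets))
    # S5: weighted sum of each module's probability for the candidate.
    scores = {
        index: sum(
            weight * probs[index]
            for weight, probs in zip(module_weights, prob_results)
        )
        for index in candidates
    }
    # S6: the image with the largest soft voting score is the hit.
    hit = max(scores, key=scores.get)
    return scores, hit


similar_sets = [[0, 2], [2, 3]]
prob_results = [
    [0.5, 0.1, 0.3, 0.1],  # output of module 1
    [0.2, 0.1, 0.6, 0.1],  # output of module 2
]
module_weights = [0.4, 0.6]
scores, hit = soft_vote(similar_sets, prob_results, module_weights)
```

Image 1 never receives a score because it is in no similar set, which mirrors the point that voting is restricted to the retrieved candidates.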
In one embodiment, the above step of inputting the target image into a preset feature extraction model to obtain a feature vector to be analyzed output by each feature output module includes:
S21: inputting the target image into the 1st feature extraction layer of the feature pyramid to obtain the 1st feature initial vector;
S22: inputting the (i-1)-th feature initial vector into the i-th feature extraction layer of the feature pyramid to obtain the i-th feature initial vector, where i is greater than 1 and less than n+1, and n is an integer greater than 2;
S23: inputting the n-th feature initial vector into the 1st feature fusion layer of the feature pyramid for feature processing to obtain the 1st fused feature vector;
S24: inputting the (k-1)-th fused feature vector and the (n-k+1)-th feature initial vector into the k-th feature fusion layer of the feature pyramid for feature fusion to obtain the k-th fused feature vector, where k is greater than 1 and less than n;
S25: inputting the m-th fused feature vector into the m-th feature output module for feature output to obtain the m-th feature vector to be analyzed, where m is greater than 0 and less than n.
This embodiment performs feature extraction with the bottom-up feature extraction link, the top-down feature fusion link, and the lateral connections, and outputs one feature vector to be analyzed for each fused feature vector. Feature vectors to be analyzed with different levels of semantic information are thereby obtained, so the method attends both to low-level features such as clothing color and texture and to high-level global semantic information, which improves the accuracy of pedestrian re-identification.
The 1st to n-th feature extraction layers of the feature pyramid may use a ResNet50 network (residual network).
对于S21,将所述目标图像输入所述特征金字塔的第1个特征提取层,将第1个特征提取层输出的特征向量作为第1个特征初始向量。For S21, the target image is input into the first feature extraction layer of the feature pyramid, and the feature vector output by the first feature extraction layer is used as the first feature initial vector.
对于S22,将第i-1个所述特征初始向量输入所述特征金字塔的第i个所述特征提取层,将第i个特征提取层输出的特征向量作为第i个所述特征初始向量,其中,i大于1并且小于n+1,n是个大于2的整数。For S22, input the i-1th feature initial vector into the i-th feature extraction layer of the feature pyramid, and use the feature vector output by the i-th feature extraction layer as the i-th feature initial vector, Among them, i is greater than 1 and less than n+1, and n is an integer greater than 2.
也就是说,特征初始向量的数量为n。That is, the number of feature initial vectors is n.
对于S23,将第n个所述特征初始向量输入所述特征金字塔的第1个特征融合层进行通道扩展,将通道扩展得到的特征向量作为第1个融合特征向量。For S23, the nth feature initial vector is input into the first feature fusion layer of the feature pyramid for channel expansion, and the feature vector obtained by channel expansion is used as the first fusion feature vector.
对于S24,将第k-1个所述融合特征向量和第n-k+1个所述特征初始向量输入所述特征金字塔的第k个所述特征融合层进行元素相加以实现特征融合,将特征融合得到的特征向量作为第k个所述融合特征向量,k是大于1并且小于n,也就是说,融合特征向量的数量为n-1。For S24, the k-1th fusion feature vector and the n-k+1th feature initial vector are input into the k-th feature fusion layer of the feature pyramid for element addition to achieve feature fusion. The feature vector obtained by feature fusion is used as the k-th fused feature vector, and k is greater than 1 and less than n, that is, the number of fused feature vectors is n-1.
对于S25,将第m个所述融合特征向量输入第m个所述特征输出模块进行池化,将池化得到的特征向量作为第m个所述待分析特征向量,m大于0并且小于n,也就是说,待分析特征向量的数量为n-1。For S25, input the m-th fused feature vector into the m-th feature output module for pooling, and use the pooled feature vector as the m-th feature vector to be analyzed, where m is greater than 0 and less than n, In other words, the number of feature vectors to be analyzed is n-1.
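The index bookkeeping of S21 to S25 can be illustrated with a minimal sketch. It treats every layer as an opaque callable, so it only shows which vector feeds which layer; the function name `run_feature_pyramid` and the callable signatures are assumptions made for illustration, not part of the application.

```python
from typing import Callable, List, Optional, Sequence, TypeVar

T = TypeVar("T")


def run_feature_pyramid(
    image: T,
    extraction_layers: Sequence[Callable[[T], T]],
    fusion_layers: Sequence[Callable[[Optional[T], T], T]],
    output_modules: Sequence[Callable[[T], T]],
) -> List[T]:
    """Hypothetical driver: n extraction layers, n-1 fusion layers, n-1 output modules."""
    n = len(extraction_layers)
    if n <= 2 or len(fusion_layers) != n - 1 or len(output_modules) != n - 1:
        raise ValueError("expected n > 2 extraction layers and n-1 fusion/output modules")

    # S21/S22: bottom-up chain; initial[i - 1] is the i-th initial feature vector.
    initial: List[T] = []
    current = image
    for layer in extraction_layers:
        current = layer(current)
        initial.append(current)

    # S23: the 1st fusion layer only sees the n-th (topmost) initial feature vector.
    fused: List[T] = [fusion_layers[0](None, initial[n - 1])]

    # S24: the k-th fusion layer combines fused vector k-1 with initial vector n-k+1.
    for k in range(2, n):
        fused.append(fusion_layers[k - 1](fused[k - 2], initial[n - k]))

    # S25: one feature vector to be analyzed per fused feature vector.
    return [module(vec) for module, vec in zip(output_modules, fused)]
```

With n extraction layers the sketch returns n-1 outputs, matching the counts stated for S24 and S25; the 1st initial feature vector is never used as a lateral input.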
In one embodiment, the above step of inputting the n-th initial feature vector into the 1st feature fusion layer of the feature pyramid for feature processing to obtain the 1st fused feature vector includes:
S231: performing channel expansion on the n-th initial feature vector using the channel expansion convolution kernel of the 1st feature fusion layer to obtain the 1st fused feature vector, where the channel expansion convolution kernel is a 1*1 convolution kernel;
The step of inputting the (k-1)-th fused feature vector and the (n-k+1)-th initial feature vector into the k-th feature fusion layer of the feature pyramid for feature fusion to obtain the k-th fused feature vector includes:
S241: performing channel expansion on the (n-k+1)-th initial feature vector using the channel expansion convolution kernel of the k-th feature fusion layer to obtain the k-th channel-expanded feature;
S242: inputting the (k-1)-th fused feature vector into the nearest-neighbor interpolation sub-layer of the k-th feature fusion layer for proportional enlargement to obtain the k-th proportionally enlarged feature;
S243: fusing the k-th channel-expanded feature and the k-th proportionally enlarged feature to obtain the k-th fused feature vector.
In this embodiment, channel expansion ensures that the features being fused have the same number of channels, which provides the basis for obtaining feature vectors to be analyzed that carry semantic information at different levels.
For S231, the channel expansion convolution kernel of the 1st feature fusion layer is used to convolve the n-th initial feature vector so as to expand its channels, and the feature vector obtained by the convolution is taken as the 1st fused feature vector.
For S241, the channel expansion convolution kernel of the k-th feature fusion layer is used to convolve the (n-k+1)-th initial feature vector so as to expand its channels, and the feature vector obtained by the convolution is taken as the k-th channel-expanded feature.
For S242, the nearest-neighbor interpolation sub-layer of the k-th feature fusion layer performs nearest-neighbor interpolation on the (k-1)-th fused feature vector to enlarge it proportionally, and the enlarged feature vector is taken as the k-th proportionally enlarged feature.
Optionally, the enlargement ratio of the nearest-neighbor interpolation sub-layer is 2.
For S243, the k-th channel-expanded feature and the k-th proportionally enlarged feature are added element-wise, and the result is taken as the k-th fused feature vector.
Optionally, a 1*1 convolution kernel is used to fuse the k-th channel-expanded feature and the k-th proportionally enlarged feature, and the result of the fusion is taken as the k-th fused feature vector.
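A minimal sketch of one fusion layer (S241 to S243) is given below. It uses plain nested lists in `[channels][height][width]` layout to stay dependency-free; the function names and the weight layout `[out_channels][in_channels]` are illustrative assumptions, and the element-wise addition variant of S243 is the one shown.

```python
from typing import List

Tensor = List[List[List[float]]]  # [channels][height][width]


def conv1x1(x: Tensor, weight: List[List[float]]) -> Tensor:
    """1*1 convolution: mixes channels at each pixel; weight is [out_channels][in_channels]."""
    height, width = len(x[0]), len(x[0][0])
    return [
        [[sum(w_row[c] * x[c][i][j] for c in range(len(x))) for j in range(width)]
         for i in range(height)]
        for w_row in weight
    ]


def upsample_nearest(x: Tensor, scale: int = 2) -> Tensor:
    """Nearest-neighbor enlargement: every pixel is repeated `scale` times per axis."""
    return [
        [[row[j // scale] for j in range(len(row) * scale)]
         for row in channel for _ in range(scale)]
        for channel in x
    ]


def add_elementwise(a: Tensor, b: Tensor) -> Tensor:
    """Element-wise sum of two tensors with identical shapes."""
    return [
        [[p + q for p, q in zip(row_a, row_b)] for row_a, row_b in zip(ch_a, ch_b)]
        for ch_a, ch_b in zip(a, b)
    ]


def fuse(prev_fused: Tensor, lateral: Tensor, weight: List[List[float]], scale: int = 2) -> Tensor:
    expanded = conv1x1(lateral, weight)            # S241: channel expansion
    enlarged = upsample_nearest(prev_fused, scale)  # S242: proportional enlargement
    return add_elementwise(expanded, enlarged)      # S243: element-wise addition
```

The addition only works when both inputs have the same channel count and spatial size, which is why the lateral input is channel-expanded and the previous fused vector is enlarged before they are combined.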
In one embodiment, the above step of inputting the m-th fused feature vector into the m-th feature output module for feature output to obtain the m-th feature vector to be analyzed includes:
S251: performing aliasing elimination on the m-th fused feature vector using the aliasing elimination layer of the m-th feature output module to obtain the m-th de-aliased feature vector, where the aliasing elimination layer is a 3*3 convolution kernel;
S252: inputting the m-th de-aliased feature vector into the pooling layer of the m-th feature output module for pooling to obtain the m-th feature vector to be analyzed.
Upsampling in the top-down feature fusion path of the feature pyramid introduces an aliasing effect. To address this, this embodiment applies a convolution with a 3*3 kernel, which improves the accuracy of the feature vectors to be analyzed.
For S251, the aliasing elimination layer of the m-th feature output module is used to convolve the m-th fused feature vector, and the result of the convolution is taken as the m-th de-aliased feature vector.
It can be understood that convolution kernels of other sizes may also be used to eliminate the aliasing effect, which is not limited here.
For S252, the m-th de-aliased feature vector is input into the pooling layer of the m-th feature output module for max pooling, and the result of the max pooling is taken as the m-th feature vector to be analyzed.
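The feature output module of S251 and S252 can be sketched as a 3*3 convolution followed by max pooling. Two details below are assumptions not stated in the text: the convolution uses zero padding of 1 so the spatial size is preserved, and the max pooling is global (one value per channel). The function names are illustrative.

```python
from typing import List

Tensor = List[List[List[float]]]        # [channels][height][width]
Kernels = List[List[List[List[float]]]]  # [out_channels][in_channels][3][3]


def conv3x3(x: Tensor, weight: Kernels) -> Tensor:
    """3*3 convolution with zero padding of 1 (padding is an assumption)."""
    channels, height, width = len(x), len(x[0]), len(x[0][0])

    def pixel(c: int, i: int, j: int) -> float:
        # Positions outside the feature map read as zero.
        return x[c][i][j] if 0 <= i < height and 0 <= j < width else 0.0

    return [
        [[sum(kernel[c][di][dj] * pixel(c, i + di - 1, j + dj - 1)
              for c in range(channels) for di in range(3) for dj in range(3))
          for j in range(width)]
         for i in range(height)]
        for kernel in weight
    ]


def global_max_pool(x: Tensor) -> List[float]:
    """Global max pooling: keeps the largest activation of each channel (assumed variant)."""
    return [max(max(row) for row in channel) for channel in x]


def feature_output(fused: Tensor, weight: Kernels) -> List[float]:
    de_aliased = conv3x3(fused, weight)  # S251: aliasing elimination
    return global_max_pool(de_aliased)   # S252: max pooling
```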
In one embodiment, the above step of determining a similar human body image set from a preset human body image library according to the target feature vector and the preset number of similar images includes:
S41: calculating the similarity between the target feature vector and the feature vector corresponding to each human body image in the human body image library to obtain first similarities;
S42: selecting, from the first similarities, those with the largest values, as many as the number of similar images, as an initial similarity set;
S43: averaging the feature vectors of the human body images corresponding to the initial similarity set together with the target feature vector to obtain an adjusted feature vector;
S44: calculating the similarity between the adjusted feature vector and the feature vector corresponding to each human body image in the human body image library to obtain second similarities;
S45: selecting, from the second similarities, those with the largest values, as many as the number of similar images, to obtain a target similarity set;
S46: taking the human body images corresponding to the target similarity set as the similar human body image set.
This embodiment first finds the human body images most similar to the target feature vector, as many as the preset number of similar images, and averages their feature vectors together with the target feature vector to obtain the adjusted feature vector. It then finds the human body images most similar to the adjusted feature vector, again as many as the preset number of similar images, and takes them as the similar human body image set. Because the images retrieved in the first pass have high confidence and little noise, averaging them with the target feature vector and retrieving again with the adjusted feature vector improves overall recall.
For S41, the cosine similarity between the target feature vector and the feature vector corresponding to each human body image in the human body image library is calculated, and each calculated cosine similarity is taken as one first similarity.
For S42, the first similarities with the largest values are selected and taken as the initial similarity set, where the number of first similarities in the initial similarity set equals the number of similar images.
For S43, the feature vectors of the human body images corresponding to the initial similarity set and the target feature vector are collected into a set to be calculated; the feature vectors in this set are averaged, and the result is taken as the adjusted feature vector.
For S44, the cosine similarity between the adjusted feature vector and the feature vector corresponding to each human body image in the human body image library is calculated, and each calculated cosine similarity is taken as one second similarity.
For S45, the second similarities with the largest values are selected and taken as the target similarity set, where the number of second similarities in the target similarity set equals the number of similar images.
For S46, the human body images corresponding to the target similarity set are the images in the human body image library whose features are most similar to the target feature vector; these images are therefore taken as the similar human body image set.
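Steps S41 to S46 amount to a two-pass retrieval in which the query is re-estimated from its own top results. The sketch below is a minimal version under stated assumptions: feature vectors are flat lists of floats, the library is a list indexed by image position, and the function names are hypothetical.

```python
import math
from typing import List, Sequence


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero length."""
    dot = sum(p * q for p, q in zip(a, b))
    norm = math.sqrt(sum(p * p for p in a)) * math.sqrt(sum(q * q for q in b))
    return dot / norm if norm else 0.0


def top_k(query: Sequence[float], library: Sequence[Sequence[float]], k: int) -> List[int]:
    """Indices of the k library vectors most similar to the query."""
    sims = [cosine(query, vec) for vec in library]
    return sorted(range(len(library)), key=lambda i: sims[i], reverse=True)[:k]


def similar_image_set(target: Sequence[float],
                      library: Sequence[Sequence[float]],
                      k: int) -> List[int]:
    first = top_k(target, library, k)                        # S41, S42
    pool = [library[i] for i in first] + [target]            # S43: set to be calculated
    adjusted = [sum(col) / len(pool) for col in zip(*pool)]  # S43: component-wise mean
    return top_k(adjusted, library, k)                       # S44, S45, S46
```

Here `k` plays the role of the preset number of similar images, and the returned indices identify the similar human body image set.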
In one embodiment, the above step of determining the person re-identification result according to the soft voting scores includes:
S61: finding the soft voting score with the largest value among the soft voting scores to obtain a target score;
S62: judging whether the target score is greater than a preset score threshold;
S63: if so, determining that the recognition outcome of the person re-identification result is success, and taking the human body image corresponding to the target score as the hit image of the person re-identification result;
S64: if not, determining that the recognition outcome of the person re-identification result is failure.
In this embodiment, the human body image corresponding to a soft voting score greater than the preset score threshold is taken as the hit image of the person re-identification result, which improves the accuracy of the determined result.
For S61, the soft voting score with the largest value is found among the soft voting scores and taken as the target score.
For S63, "if so" means that the target score, that is, the largest soft voting score, is greater than the preset score threshold. The recognition outcome of the person re-identification result is therefore determined to be success, and the human body image corresponding to the target score is taken as the hit image of the person re-identification result.
For S64, "if not" means that the target score, that is, the largest soft voting score, is less than or equal to the preset score threshold. The recognition outcome of the person re-identification result is therefore determined to be failure.
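The decision rule of S61 to S64 is small enough to sketch directly. The dictionary layout (image identifier mapped to soft voting score) and the shape of the returned result are assumptions for illustration, and the sketch expects at least one score.

```python
from typing import Dict, Optional, TypedDict


class ReIdResult(TypedDict):
    success: bool
    hit: Optional[str]


def decide(scores: Dict[str, float], threshold: float) -> ReIdResult:
    """Hypothetical helper: `scores` maps an image id to its soft voting score."""
    image_id, target_score = max(scores.items(), key=lambda item: item[1])  # S61
    if target_score > threshold:                                            # S62
        return {"success": True, "hit": image_id}                           # S63
    return {"success": False, "hit": None}                                  # S64
```

Note that the comparison is strict: a target score exactly equal to the threshold counts as failure, as stated for S64.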
In one embodiment, before the above step of inputting the target image into the preset feature extraction model to obtain the feature vector to be analyzed output by each feature output module, the method further includes:
S71: obtaining an initial model and a training sample set, where the initial model includes an initial feature pyramid, a plurality of initial feature output modules, and a plurality of initial classification prediction modules; the initial feature pyramid is connected to each of the initial feature output modules; and the initial feature output modules are connected to the initial classification prediction modules in one-to-one correspondence;
S72: training the initial model using a weight-based weak classifier ensemble method and the training sample set, and taking the trained initial model as a target model, where the initial feature pyramid of the target model serves as the feature pyramid, the initial feature output modules of the target model serve as the feature output modules, and the initial classification prediction modules of the target model serve as the classification prediction modules.
This embodiment trains the initial model with a weight-based weak classifier ensemble method, so that in each training round the weights of initial classification prediction modules with low error rates are increased and the weights of those with high error rates are decreased, which makes the initial classification prediction modules perform better on misclassified data.
For S71, each training sample includes a sample image and classification probability calibration values. The sample image is an image of a human body. The classification probability calibration values are the ground-truth labels indicating whether the person in the sample image is the same person as the one in each human body image of the human body image library.
For S72, the initial model is trained using the weight-based weak classifier ensemble method and the training sample set, and the trained initial model is taken as the target model, so that in each training round the weights of initial classification prediction modules with low error rates are increased and the weights of those with high error rates are decreased.
If the number of initial classification prediction modules is n-1, x is obtained by rounding (n-1)/2 down. For the training sample set, the prediction accuracy of each initial classification prediction module is calculated. The initial classification prediction modules are then sorted in reverse order by prediction accuracy to obtain the reverse-sorted set of initial classification prediction modules. The first weight update formula is used to update the weights of the 1st to x-th modules of the reverse-sorted set, and the second weight update formula is used to update the weights of the (x+1)-th to (n-1)-th modules of the reverse-sorted set. The first weight update formula is $Q_{y1} = q_{y1} \cdot a^{x-y1+1}$ and the second weight update formula is $Q_{y2} = q_{y2} / a^{x-y2+1}$, where $a$ is a hyperparameter; $Q_{y1}$ is the updated weight of the y1-th module of the reverse-sorted set, y1 is greater than 0 and less than x+1, and $q_{y1}$ is the current weight of that module; $Q_{y2}$ is the updated weight of the y2-th module of the reverse-sorted set, y2 is greater than x and less than n, and $q_{y2}$ is the current weight of that module.
The value calculated for $Q_{y1}$ is used to update the weight of the y1-th initial classification prediction module of the reverse-sorted set.
The value calculated for $Q_{y2}$ is used to update the weight of the y2-th initial classification prediction module of the reverse-sorted set.
Optionally, a is set to 1.1.
Referring to Figure 2, this application also proposes a person re-identification apparatus based on artificial intelligence. The apparatus includes:
an image acquisition module 100, configured to acquire a target image;
a to-be-analyzed feature vector determination module 200, configured to input the target image into a preset feature extraction model to obtain the feature vector to be analyzed output by each feature output module, where the feature extraction model includes a feature pyramid and a plurality of the feature output modules, and the feature pyramid is connected to each of the feature output modules;
a classification probability prediction result determination module 300, configured to input each feature vector to be analyzed into a preset classification prediction module for classification probability prediction to obtain a classification probability prediction result;
a similar human body image set determination module 400, configured to determine a similar human body image set from a preset human body image library according to a target feature vector and a preset number of similar images, where the target feature vector is any one of the feature vectors to be analyzed;
a soft voting score determination module 500, configured to, for each human body image in each similar human body image set, compute a weighted sum of the classification probability prediction results using the weights of the classification prediction modules to obtain a soft voting score;
a person re-identification result determination module 600, configured to determine the person re-identification result according to the soft voting scores.
In this embodiment, the feature pyramid of the feature extraction model is connected to each feature output module, which yields feature vectors to be analyzed that carry semantic information at different levels. The model therefore attends both to low-level features such as clothing color and texture and to high-level global semantic information, which improves the accuracy of person re-identification. Determining the soft voting scores only within the similar human body image sets, which are selected according to the target feature vector and the preset number of similar images, further improves the accuracy of person re-identification.
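The weighted sum performed by the soft voting score determination module can be sketched as follows. The data layout is an assumption: each classification prediction module is taken to return a mapping from image identifier to predicted probability, and `candidates` stands for the images of a similar human body image set. The text does not say whether the module weights are normalized, so the sketch uses them as given.

```python
from typing import Dict, Iterable, Sequence


def soft_voting_scores(
    predictions: Sequence[Dict[str, float]],
    weights: Sequence[float],
    candidates: Iterable[str],
) -> Dict[str, float]:
    """Hypothetical helper: predictions[j][image_id] is module j's probability for image_id."""
    if len(predictions) != len(weights):
        raise ValueError("one weight is required per classification prediction module")
    # Weighted sum over modules, computed only for the candidate images.
    return {
        image_id: sum(w * probs[image_id] for w, probs in zip(weights, predictions))
        for image_id in candidates
    }
```

The resulting mapping from image identifier to soft voting score is the input that the result determination module 600 compares against the score threshold.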
Referring to Figure 3, an embodiment of this application also provides a computer device. The computer device may be a server, and its internal structure may be as shown in Figure 3. The computer device includes a processor, a memory, a network interface, and a database connected through a system bus. The processor of the computer device provides computing and control capabilities. The memory of the computer device includes a non-volatile storage medium and an internal memory. The non-volatile storage medium stores an operating system, a computer program, and a database. The internal memory provides an environment for running the operating system and the computer program stored in the non-volatile storage medium. The database of the computer device stores the data used by the artificial intelligence-based person re-identification method. The network interface of the computer device is used to communicate with external terminals over a network connection. When executed by the processor, the computer program implements a person re-identification method based on artificial intelligence. The method includes: acquiring a target image; inputting the target image into a preset feature extraction model to obtain the feature vector to be analyzed output by each feature output module, where the feature extraction model includes a feature pyramid and a plurality of the feature output modules, and the feature pyramid is connected to each of the feature output modules; inputting each feature vector to be analyzed into a preset classification prediction module for classification probability prediction to obtain a classification probability prediction result; determining a similar human body image set from a preset human body image library according to a target feature vector and a preset number of similar images, where the target feature vector is any one of the feature vectors to be analyzed; for each human body image in each similar human body image set, computing a weighted sum of the classification probability prediction results using the weights of the classification prediction modules to obtain a soft voting score; and determining the person re-identification result according to the soft voting scores.
In this embodiment, the feature pyramid of the feature extraction model is connected to each feature output module, which yields feature vectors to be analyzed that carry semantic information at different levels. The model therefore attends both to low-level features such as clothing color and texture and to high-level global semantic information, which improves the accuracy of person re-identification. Determining the soft voting scores only within the similar human body image sets, which are selected according to the target feature vector and the preset number of similar images, further improves the accuracy of person re-identification.
An embodiment of this application also provides a computer-readable storage medium, which may be non-volatile or volatile and on which a computer program is stored. When executed by a processor, the computer program implements a person re-identification method based on artificial intelligence, including the steps of: acquiring a target image; inputting the target image into a preset feature extraction model to obtain the feature vector to be analyzed output by each feature output module, where the feature extraction model includes a feature pyramid and a plurality of the feature output modules, and the feature pyramid is connected to each of the feature output modules; inputting each feature vector to be analyzed into a preset classification prediction module for classification probability prediction to obtain a classification probability prediction result; determining a similar human body image set from a preset human body image library according to a target feature vector and a preset number of similar images, where the target feature vector is any one of the feature vectors to be analyzed; for each human body image in each similar human body image set, computing a weighted sum of the classification probability prediction results using the weights of the classification prediction modules to obtain a soft voting score; and determining the person re-identification result according to the soft voting scores.
In the person re-identification method executed as described above, the feature pyramid of the feature extraction model is connected to each feature output module, which yields feature vectors to be analyzed that carry semantic information at different levels. The method therefore attends both to low-level features such as clothing color and texture and to high-level global semantic information, which improves the accuracy of person re-identification. Determining the soft voting scores only within the similar human body image sets, which are selected according to the target feature vector and the preset number of similar images, further improves the accuracy of person re-identification.
Those of ordinary skill in the art will understand that all or part of the processes in the methods of the above embodiments can be carried out by a computer program instructing the relevant hardware. The computer program may be stored in a computer-readable storage medium and, when executed, may include the processes of the method embodiments above. Any reference to memory, storage, a database, or other media provided in this application and used in the embodiments may include non-volatile and/or volatile memory. Non-volatile memory may include read-only memory (ROM), programmable ROM (PROM), electrically programmable ROM (EPROM), electrically erasable programmable ROM (EEPROM), or flash memory. Volatile memory may include random access memory (RAM) or external cache memory. By way of illustration and not limitation, RAM is available in many forms, such as static RAM (SRAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), double data rate SDRAM (DDR SDRAM), enhanced SDRAM (ESDRAM), Synchlink DRAM (SLDRAM), Rambus direct RAM (RDRAM), direct Rambus dynamic RAM (DRDRAM), and Rambus dynamic RAM (RDRAM).
It should be noted that, in this document, the terms "include", "comprise", and any variants thereof are intended to cover a non-exclusive inclusion, so that a process, apparatus, article, or method that includes a series of elements includes not only those elements but also other elements not expressly listed, or elements inherent to such a process, apparatus, article, or method. Without further limitation, an element introduced by the phrase "including a ..." does not exclude the presence of additional identical elements in the process, apparatus, article, or method that includes that element.
The above are only preferred embodiments of this application and do not limit its patent scope. Any equivalent structure or equivalent process transformation made using the contents of the description and drawings of this application, or any direct or indirect application in other related technical fields, is likewise included in the scope of patent protection of this application.

Claims (20)

  1. An artificial intelligence-based person re-identification method, wherein the method comprises:
    acquiring a target image;
    inputting the target image into a preset feature extraction model to obtain a feature vector to be analyzed output by each feature output module, wherein the feature extraction model comprises a feature pyramid and a plurality of the feature output modules, and the feature pyramid is connected to each of the feature output modules;
    inputting each of the feature vectors to be analyzed into a preset classification prediction module for classification probability prediction to obtain a classification probability prediction result;
    determining a similar human body image set from a preset human body image library according to a target feature vector and a preset number of similar images, wherein the target feature vector is any one of the feature vectors to be analyzed;
    for each human body image in each of the similar human body image sets, performing a weighted summation of the classification probability prediction results and the weights of the classification prediction modules to obtain a soft voting score;
    determining a person re-identification result according to the soft voting scores.
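The weighted summation in the soft voting step can be illustrated with a minimal sketch. The claim does not fix a data layout, so everything here is an assumption: `head_probs[j][cid]` is taken to be the probability that classification prediction module `j` assigns to the identity of candidate image `cid`, and `head_weights[j]` is that module's weight. The function and variable names are hypothetical.

```python
def soft_vote_scores(head_probs, head_weights, candidate_ids):
    """Return {candidate_id: weighted sum of per-module probabilities}.

    head_probs[j][cid] is the probability that module j assigns to the
    identity of candidate image cid (hypothetical layout, not from the source).
    """
    if len(head_probs) != len(head_weights):
        raise ValueError("one weight is needed per classification module")
    return {
        cid: sum(w * probs[cid] for w, probs in zip(head_weights, head_probs))
        for cid in candidate_ids
    }


# Two modules with equal weights voting on two candidate images.
head_probs = [{"img_a": 0.7, "img_b": 0.3}, {"img_a": 0.4, "img_b": 0.6}]
scores = soft_vote_scores(head_probs, [0.5, 0.5], ["img_a", "img_b"])
```

With equal weights the score reduces to the mean probability across modules; unequal weights let a more reliable module contribute more to the final score.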
  2. The artificial intelligence-based person re-identification method according to claim 1, wherein the step of inputting the target image into a preset feature extraction model to obtain a feature vector to be analyzed output by each feature output module comprises:
    inputting the target image into the 1st feature extraction layer of the feature pyramid to obtain the 1st initial feature vector;
    inputting the (i-1)-th initial feature vector into the i-th feature extraction layer of the feature pyramid to obtain the i-th initial feature vector, wherein i is greater than 1 and less than n+1, and n is an integer greater than 2;
    inputting the n-th initial feature vector into the 1st feature fusion layer of the feature pyramid for feature processing to obtain the 1st fused feature vector;
    inputting the (k-1)-th fused feature vector and the (n-k+1)-th initial feature vector into the k-th feature fusion layer of the feature pyramid for feature fusion to obtain the k-th fused feature vector, wherein k is greater than 1 and less than n;
    inputting the m-th fused feature vector into the m-th feature output module for feature output to obtain the m-th feature vector to be analyzed, wherein m is greater than 0 and less than n.
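The index bookkeeping in claim 2 is easy to misread, so the following sketch only enumerates which inputs feed which layer for a given n. It follows the index ranges exactly as stated (k from 2 to n-1, m from 1 to n-1); the function name and the tuple layout are hypothetical, and no actual feature computation is performed.

```python
def pyramid_wiring(n):
    """Enumerate the layer wiring of claim 2 for n feature extraction layers.

    Returns (fusion, outputs) where fusion[k] = (previous fused index or None,
    initial feature index) and outputs lists the feature output module indices.
    """
    if n <= 2:
        raise ValueError("n must be an integer greater than 2")
    # The 1st fusion layer consumes only the n-th (deepest) initial vector.
    fusion = {1: (None, n)}
    # The k-th fusion layer consumes fused vector k-1 and initial vector n-k+1.
    for k in range(2, n):
        fusion[k] = (k - 1, n - k + 1)
    # The m-th output module consumes the m-th fused vector, 0 < m < n.
    outputs = list(range(1, n))
    return fusion, outputs


fusion, outputs = pyramid_wiring(4)
```

For n = 4 this yields three fusion layers and three feature output modules, with the top-down path pairing each fused vector with the next shallower initial feature vector.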
  3. The artificial intelligence-based person re-identification method according to claim 2, wherein the step of inputting the n-th initial feature vector into the 1st feature fusion layer of the feature pyramid for feature processing to obtain the 1st fused feature vector comprises:
    performing channel expansion on the n-th initial feature vector using the channel expansion convolution kernel of the 1st feature fusion layer to obtain the 1st fused feature vector, wherein the channel expansion convolution kernel is a 1*1 convolution kernel;
    and the step of inputting the (k-1)-th fused feature vector and the (n-k+1)-th initial feature vector into the k-th feature fusion layer of the feature pyramid for feature fusion to obtain the k-th fused feature vector comprises:
    performing channel expansion on the (n-k+1)-th initial feature vector using the channel expansion convolution kernel of the k-th feature fusion layer to obtain the k-th channel expansion feature;
    inputting the (k-1)-th fused feature vector into the nearest-neighbor interpolation sub-layer of the k-th feature fusion layer for proportional upscaling to obtain the k-th proportionally upscaled feature;
    fusing the k-th channel expansion feature and the k-th proportionally upscaled feature to obtain the k-th fused feature vector.
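One fusion step of claim 3 can be sketched in plain Python on nested lists shaped `[channel][row][column]`. Two points are assumptions rather than statements from the claim: the upscaling factor is taken to be 2, and "fusing" is implemented as element-wise addition. All function names are hypothetical.

```python
def conv1x1(fmap, weights):
    """1*1 convolution: each output channel is a weighted sum of input channels."""
    h, w = len(fmap[0]), len(fmap[0][0])
    return [
        [[sum(wc * fmap[c][y][x] for c, wc in enumerate(row)) for x in range(w)]
         for y in range(h)]
        for row in weights  # one weight row per output channel
    ]


def upsample_nearest(fmap, scale=2):
    """Nearest-neighbor upscaling: each pixel is repeated scale x scale times."""
    return [
        [[ch[y // scale][x // scale] for x in range(len(ch[0]) * scale)]
         for y in range(len(ch) * scale)]
        for ch in fmap
    ]


def fuse(lateral, top_down):
    """Element-wise addition of two maps of identical shape (assumed fusion)."""
    return [
        [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(ca, cb)]
        for ca, cb in zip(lateral, top_down)
    ]


initial = [[[1, 2], [3, 4]]]           # 1 channel, 2x2 (shallower level)
prev_fused = [[[10]], [[20]]]          # 2 channels, 1x1 (deeper level)
lateral = conv1x1(initial, [[1], [2]])  # expand 1 channel to 2 channels
fused = fuse(lateral, upsample_nearest(prev_fused))
```

The 1*1 convolution brings the shallower feature to the channel count of the deeper one, and the upscaling brings the deeper feature to the spatial size of the shallower one, so the two can be combined position by position.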
  4. The artificial intelligence-based person re-identification method according to claim 2, wherein the step of inputting the m-th fused feature vector into the m-th feature output module for feature output to obtain the m-th feature vector to be analyzed comprises:
    performing aliasing effect elimination on the m-th fused feature vector using the aliasing effect elimination layer of the m-th feature output module to obtain the m-th aliasing-eliminated feature vector, wherein the aliasing effect elimination layer is a 3*3 convolution kernel;
    inputting the m-th aliasing-eliminated feature vector into the pooling layer of the m-th feature output module for pooling to obtain the m-th feature vector to be analyzed.
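The feature output module of claim 4 can be sketched as a 3*3 convolution followed by pooling. The claim does not state the padding scheme or the pooling type, so zero padding and global average pooling are assumptions here, and the convolution is applied per channel for brevity. The function names are hypothetical.

```python
def conv3x3_same(channel, kernel):
    """3*3 convolution (cross-correlation) over one channel with zero padding."""
    h, w = len(channel), len(channel[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            acc = 0.0
            for dy in (-1, 0, 1):
                for dx in (-1, 0, 1):
                    yy, xx = y + dy, x + dx
                    # Positions outside the map contribute zero.
                    if 0 <= yy < h and 0 <= xx < w:
                        acc += kernel[dy + 1][dx + 1] * channel[yy][xx]
            out[y][x] = acc
    return out


def global_avg_pool(fmap):
    """Collapse each channel to its mean, giving one value per channel."""
    return [sum(sum(row) for row in ch) / (len(ch) * len(ch[0])) for ch in fmap]


ones_kernel = [[1, 1, 1], [1, 1, 1], [1, 1, 1]]
smoothed = [conv3x3_same(ch, ones_kernel) for ch in [[[1, 2], [3, 4]]]]
vector = global_avg_pool(smoothed)
```

The convolution smooths the artifacts that nearest-neighbor upscaling introduces into the fused map, and the pooling turns the resulting map into a fixed-length vector suitable for the classification prediction module.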
  5. The artificial intelligence-based person re-identification method according to claim 1, wherein the step of determining a similar human body image set from a preset human body image library according to a target feature vector and a preset number of similar images comprises:
    calculating the similarity between the target feature vector and the feature vector corresponding to each human body image in the human body image library to obtain first similarities;
    selecting, from the first similarities, the largest first similarities in a number equal to the number of similar images, as an initial similarity set;
    averaging the feature vectors of the human body images corresponding to the initial similarity set together with the target feature vector to obtain an adjusted feature vector;
    calculating the similarity between the adjusted feature vector and the feature vector corresponding to each human body image in the human body image library to obtain second similarities;
    selecting, from the second similarities, the largest second similarities in a number equal to the number of similar images, to obtain a target similarity set;
    taking the human body images corresponding to the target similarity set as the similar human body image set.
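The two-pass retrieval of claim 5 can be sketched as follows. The claim does not name a similarity measure, so cosine similarity is an assumption; the gallery is modeled as a hypothetical dict from image id to feature vector, and all names are illustrative.

```python
import math


def cosine(a, b):
    """Cosine similarity; returns 0.0 if either vector has zero norm."""
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    if na == 0.0 or nb == 0.0:
        return 0.0
    return sum(x * y for x, y in zip(a, b)) / (na * nb)


def top_k(query, gallery, k):
    """Ids of the k gallery images most similar to query, best first."""
    ranked = sorted(gallery, key=lambda i: cosine(query, gallery[i]), reverse=True)
    return ranked[:k]


def retrieve_with_adjustment(target, gallery, k):
    """First pass, average the hits with the target, then second pass."""
    initial = top_k(target, gallery, k)
    vectors = [gallery[i] for i in initial] + [target]
    adjusted = [sum(col) / len(vectors) for col in zip(*vectors)]
    return top_k(adjusted, gallery, k)


gallery = {"a": [1.0, 0.0], "b": [0.9, 0.1], "c": [0.0, 1.0]}
similar = retrieve_with_adjustment([1.0, 0.0], gallery, 2)
```

Averaging the first-pass hits with the target moves the query toward the cluster of likely matches, so the second pass is less sensitive to noise in a single target feature vector.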
  6. The artificial intelligence-based person re-identification method according to claim 1, wherein the step of determining a person re-identification result according to the soft voting scores comprises:
    finding the largest soft voting score among the soft voting scores to obtain a target score;
    determining whether the target score is greater than a preset score threshold;
    if so, determining that the person re-identification result is a success, and taking the human body image corresponding to the target score as the hit image of the person re-identification result;
    if not, determining that the person re-identification result is a failure.
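The decision rule of claim 6 reduces to an arg-max followed by a threshold test. A minimal sketch, assuming scores are held in a dict keyed by image id; the result layout and the names are hypothetical.

```python
def decide(scores, threshold):
    """Pick the best-scoring image and accept it only above the threshold."""
    if not scores:
        return {"success": False, "hit": None}
    best = max(scores, key=scores.get)
    # The claim requires the score to be strictly greater than the threshold.
    if scores[best] > threshold:
        return {"success": True, "hit": best}
    return {"success": False, "hit": None}


accepted = decide({"img_a": 0.55, "img_b": 0.45}, threshold=0.5)
rejected = decide({"img_a": 0.55, "img_b": 0.45}, threshold=0.6)
```

The threshold keeps the method from reporting a match when the target person is simply absent from the image library, in which case even the best candidate has a low score.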
  7. The artificial intelligence-based person re-identification method according to claim 1, wherein before the step of inputting the target image into a preset feature extraction model to obtain a feature vector to be analyzed output by each feature output module, the method further comprises:
    acquiring an initial model and a training sample set, wherein the initial model comprises an initial feature pyramid, a plurality of initial feature output modules and a plurality of initial classification prediction modules, the initial feature pyramid is connected to each of the initial feature output modules, the initial feature output modules are connected to the initial classification prediction modules, and the initial feature output modules and the initial classification prediction modules are in one-to-one correspondence;
    training the initial model using a weight-based weak classifier ensemble method and the training sample set, and taking the trained initial model as a target model, wherein the initial feature pyramid of the target model serves as the feature pyramid, the initial feature output modules of the target model serve as the feature output modules, and the initial classification prediction modules of the target model serve as the classification prediction modules.
  8. An artificial intelligence-based person re-identification apparatus, wherein the apparatus comprises:
    an image acquisition module, configured to acquire a target image;
    a to-be-analyzed feature vector determination module, configured to input the target image into a preset feature extraction model to obtain a feature vector to be analyzed output by each feature output module, wherein the feature extraction model comprises a feature pyramid and a plurality of the feature output modules, and the feature pyramid is connected to each of the feature output modules;
    a classification probability prediction result determination module, configured to input each of the feature vectors to be analyzed into a preset classification prediction module for classification probability prediction to obtain a classification probability prediction result;
    a similar human body image set determination module, configured to determine a similar human body image set from a preset human body image library according to a target feature vector and a preset number of similar images, wherein the target feature vector is any one of the feature vectors to be analyzed;
    a soft voting score determination module, configured to, for each human body image in each of the similar human body image sets, perform a weighted summation of the classification probability prediction results and the weights of the classification prediction modules to obtain a soft voting score;
    a person re-identification result determination module, configured to determine a person re-identification result according to the soft voting scores.
  9. A computer device, comprising a memory and a processor, the memory storing a computer program, wherein the processor, when executing the computer program, implements an artificial intelligence-based person re-identification method, and the artificial intelligence-based person re-identification method comprises the following steps:
    acquiring a target image;
    inputting the target image into a preset feature extraction model to obtain a feature vector to be analyzed output by each feature output module, wherein the feature extraction model comprises a feature pyramid and a plurality of the feature output modules, and the feature pyramid is connected to each of the feature output modules;
    inputting each of the feature vectors to be analyzed into a preset classification prediction module for classification probability prediction to obtain a classification probability prediction result;
    determining a similar human body image set from a preset human body image library according to a target feature vector and a preset number of similar images, wherein the target feature vector is any one of the feature vectors to be analyzed;
    for each human body image in each of the similar human body image sets, performing a weighted summation of the classification probability prediction results and the weights of the classification prediction modules to obtain a soft voting score;
    determining a person re-identification result according to the soft voting scores.
  10. The computer device according to claim 9, wherein the step of inputting the target image into a preset feature extraction model to obtain a feature vector to be analyzed output by each feature output module comprises:
    inputting the target image into the 1st feature extraction layer of the feature pyramid to obtain the 1st initial feature vector;
    inputting the (i-1)-th initial feature vector into the i-th feature extraction layer of the feature pyramid to obtain the i-th initial feature vector, wherein i is greater than 1 and less than n+1, and n is an integer greater than 2;
    inputting the n-th initial feature vector into the 1st feature fusion layer of the feature pyramid for feature processing to obtain the 1st fused feature vector;
    inputting the (k-1)-th fused feature vector and the (n-k+1)-th initial feature vector into the k-th feature fusion layer of the feature pyramid for feature fusion to obtain the k-th fused feature vector, wherein k is greater than 1 and less than n;
    inputting the m-th fused feature vector into the m-th feature output module for feature output to obtain the m-th feature vector to be analyzed, wherein m is greater than 0 and less than n.
  11. The computer device according to claim 10, wherein the step of inputting the n-th initial feature vector into the 1st feature fusion layer of the feature pyramid for feature processing to obtain the 1st fused feature vector comprises:
    performing channel expansion on the n-th initial feature vector using the channel expansion convolution kernel of the 1st feature fusion layer to obtain the 1st fused feature vector, wherein the channel expansion convolution kernel is a 1*1 convolution kernel;
    and the step of inputting the (k-1)-th fused feature vector and the (n-k+1)-th initial feature vector into the k-th feature fusion layer of the feature pyramid for feature fusion to obtain the k-th fused feature vector comprises:
    performing channel expansion on the (n-k+1)-th initial feature vector using the channel expansion convolution kernel of the k-th feature fusion layer to obtain the k-th channel expansion feature;
    inputting the (k-1)-th fused feature vector into the nearest-neighbor interpolation sub-layer of the k-th feature fusion layer for proportional upscaling to obtain the k-th proportionally upscaled feature;
    fusing the k-th channel expansion feature and the k-th proportionally upscaled feature to obtain the k-th fused feature vector.
  12. The computer device according to claim 10, wherein the step of inputting the m-th fused feature vector into the m-th feature output module for feature output to obtain the m-th feature vector to be analyzed comprises:
    performing aliasing effect elimination on the m-th fused feature vector using the aliasing effect elimination layer of the m-th feature output module to obtain the m-th aliasing-eliminated feature vector, wherein the aliasing effect elimination layer is a 3*3 convolution kernel;
    inputting the m-th aliasing-eliminated feature vector into the pooling layer of the m-th feature output module for pooling to obtain the m-th feature vector to be analyzed.
  13. The computer device according to claim 9, wherein the step of determining a similar human body image set from a preset human body image library according to a target feature vector and a preset number of similar images comprises:
    calculating the similarity between the target feature vector and the feature vector corresponding to each human body image in the human body image library to obtain first similarities;
    selecting, from the first similarities, the largest first similarities in a number equal to the number of similar images, as an initial similarity set;
    averaging the feature vectors of the human body images corresponding to the initial similarity set together with the target feature vector to obtain an adjusted feature vector;
    calculating the similarity between the adjusted feature vector and the feature vector corresponding to each human body image in the human body image library to obtain second similarities;
    selecting, from the second similarities, the largest second similarities in a number equal to the number of similar images, to obtain a target similarity set;
    taking the human body images corresponding to the target similarity set as the similar human body image set.
  14. The computer device according to claim 9, wherein the step of determining a person re-identification result according to the soft voting scores comprises:
    finding the largest soft voting score among the soft voting scores to obtain a target score;
    determining whether the target score is greater than a preset score threshold;
    if so, determining that the person re-identification result is a success, and taking the human body image corresponding to the target score as the hit image of the person re-identification result;
    if not, determining that the person re-identification result is a failure.
  15. The computer device according to claim 9, wherein before the step of inputting the target image into a preset feature extraction model to obtain a feature vector to be analyzed output by each feature output module, the method further comprises:
    acquiring an initial model and a training sample set, wherein the initial model comprises an initial feature pyramid, a plurality of initial feature output modules and a plurality of initial classification prediction modules, the initial feature pyramid is connected to each of the initial feature output modules, the initial feature output modules are connected to the initial classification prediction modules, and the initial feature output modules and the initial classification prediction modules are in one-to-one correspondence;
    training the initial model using a weight-based weak classifier ensemble method and the training sample set, and taking the trained initial model as a target model, wherein the initial feature pyramid of the target model serves as the feature pyramid, the initial feature output modules of the target model serve as the feature output modules, and the initial classification prediction modules of the target model serve as the classification prediction modules.
  16. A computer-readable storage medium having a computer program stored thereon, wherein the computer program, when executed by a processor, implements an artificial intelligence-based person re-identification method, and the artificial intelligence-based person re-identification method comprises the following steps:
    acquiring a target image;
    inputting the target image into a preset feature extraction model to obtain a feature vector to be analyzed output by each feature output module, wherein the feature extraction model comprises a feature pyramid and a plurality of the feature output modules, and the feature pyramid is connected to each of the feature output modules;
    inputting each of the feature vectors to be analyzed into a preset classification prediction module for classification probability prediction to obtain a classification probability prediction result;
    determining a similar human body image set from a preset human body image library according to a target feature vector and a preset number of similar images, wherein the target feature vector is any one of the feature vectors to be analyzed;
    for each human body image in each of the similar human body image sets, performing a weighted summation of the classification probability prediction results and the weights of the classification prediction modules to obtain a soft voting score;
    determining a person re-identification result according to the soft voting scores.
  17. The computer-readable storage medium according to claim 16, wherein the step of inputting the target image into a preset feature extraction model to obtain a feature vector to be analyzed output by each feature output module comprises:
    inputting the target image into the 1st feature extraction layer of the feature pyramid to obtain the 1st initial feature vector;
    inputting the (i-1)-th initial feature vector into the i-th feature extraction layer of the feature pyramid to obtain the i-th initial feature vector, wherein i is greater than 1 and less than n+1, and n is an integer greater than 2;
    inputting the n-th initial feature vector into the 1st feature fusion layer of the feature pyramid for feature processing to obtain the 1st fused feature vector;
    inputting the (k-1)-th fused feature vector and the (n-k+1)-th initial feature vector into the k-th feature fusion layer of the feature pyramid for feature fusion to obtain the k-th fused feature vector, wherein k is greater than 1 and less than n;
    inputting the m-th fused feature vector into the m-th feature output module for feature output to obtain the m-th feature vector to be analyzed, wherein m is greater than 0 and less than n.
  18. The computer-readable storage medium according to claim 17, wherein the step of inputting the n-th initial feature vector into the 1st feature fusion layer of the feature pyramid for feature processing to obtain the 1st fused feature vector comprises:
    performing channel expansion on the n-th initial feature vector using the channel expansion convolution kernel of the 1st feature fusion layer to obtain the 1st fused feature vector, wherein the channel expansion convolution kernel is a 1*1 convolution kernel;
    and the step of inputting the (k-1)-th fused feature vector and the (n-k+1)-th initial feature vector into the k-th feature fusion layer of the feature pyramid for feature fusion to obtain the k-th fused feature vector comprises:
    performing channel expansion on the (n-k+1)-th initial feature vector using the channel expansion convolution kernel of the k-th feature fusion layer to obtain the k-th channel expansion feature;
    inputting the (k-1)-th fused feature vector into the nearest-neighbor interpolation sub-layer of the k-th feature fusion layer for proportional upscaling to obtain the k-th proportionally upscaled feature;
    fusing the k-th channel expansion feature and the k-th proportionally upscaled feature to obtain the k-th fused feature vector.
  19. The computer-readable storage medium according to claim 16, wherein the step of determining a similar human body image set from a preset human body image library according to a target feature vector and a preset number of similar images comprises:
    calculating the similarity between the target feature vector and the feature vector corresponding to each human body image in the human body image library to obtain first similarities;
    selecting, from the first similarities, the largest first similarities in a number equal to the number of similar images, as an initial similarity set;
    averaging the feature vectors of the human body images corresponding to the initial similarity set together with the target feature vector to obtain an adjusted feature vector;
    calculating the similarity between the adjusted feature vector and the feature vector corresponding to each human body image in the human body image library to obtain second similarities;
    selecting, from the second similarities, the largest second similarities in a number equal to the number of similar images, to obtain a target similarity set;
    taking the human body images corresponding to the target similarity set as the similar human body image set.
  20. The computer-readable storage medium according to claim 16, wherein the step of determining a person re-identification result according to the soft voting scores comprises:
    finding the largest soft voting score among the soft voting scores to obtain a target score;
    determining whether the target score is greater than a preset score threshold;
    if so, determining that the person re-identification result is a success, and taking the human body image corresponding to the target score as the hit image of the person re-identification result;
    if not, determining that the person re-identification result is a failure.
PCT/CN2022/090156 2022-03-16 2022-04-29 Person re-identification method and apparatus based on artificial intelligence, and device and storage medium WO2023173544A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202210256790.4A CN114639165B (en) 2022-03-16 2022-03-16 Pedestrian re-identification method, device, equipment and storage medium based on artificial intelligence
CN202210256790.4 2022-03-16

Publications (1)

Publication Number Publication Date
WO2023173544A1 true WO2023173544A1 (en) 2023-09-21

Family

ID=81948634

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2022/090156 WO2023173544A1 (en) 2022-03-16 2022-04-29 Person re-identification method and apparatus based on artificial intelligence, and device and storage medium

Country Status (2)

Country Link
CN (1) CN114639165B (en)
WO (1) WO2023173544A1 (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116052220B (en) * 2023-02-07 2023-11-24 北京多维视通技术有限公司 Pedestrian re-identification method, device, equipment and medium
CN117522454B (en) * 2024-01-05 2024-04-16 北京文安智能技术股份有限公司 Staff identification method and system

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110263697A (en) * 2019-06-17 2019-09-20 哈尔滨工业大学(深圳) Pedestrian re-identification method, device and medium based on unsupervised learning
CN110532884A (en) * 2019-07-30 2019-12-03 平安科技(深圳)有限公司 Pedestrian re-identification method, device and computer readable storage medium
CN111709331A (en) * 2020-06-03 2020-09-25 江南大学 Pedestrian re-identification method based on multi-granularity information interaction model
CN111783576A (en) * 2020-06-18 2020-10-16 西安电子科技大学 Pedestrian re-identification method based on improved YOLOv3 network and feature fusion
US20210150194A1 (en) * 2017-07-04 2021-05-20 Peking University Shenzhen Graduate School Image feature extraction method for person re-identification
CN112906614A (en) * 2021-03-08 2021-06-04 中南大学 Pedestrian re-identification method and device based on attention guidance and storage medium

Family Cites Families (6)

Publication number Priority date Publication date Assignee Title
CN111259836A (en) * 2020-01-20 2020-06-09 浙江大学 Video pedestrian re-identification method based on dynamic graph convolution representation
CN112232203B (en) * 2020-10-15 2024-05-28 平安科技(深圳)有限公司 Pedestrian recognition method and device, electronic equipment and storage medium
CN112396036B (en) * 2020-12-09 2023-08-08 中山大学 Method for re-identifying blocked pedestrians by combining space transformation network and multi-scale feature extraction
CN112926531B (en) * 2021-04-01 2023-09-26 深圳市优必选科技股份有限公司 Feature information extraction method, model training method, device and electronic equipment
CN113269190B (en) * 2021-07-21 2021-10-12 中国平安人寿保险股份有限公司 Data classification method and device based on artificial intelligence, computer equipment and medium
CN113642505B (en) * 2021-08-25 2023-04-18 四川大学 Facial expression recognition method and device based on feature pyramid

Also Published As

Publication number Publication date
CN114639165B (en) 2024-05-10
CN114639165A (en) 2022-06-17

Legal Events

Date Code Title Description
121 EP: the EPO has been informed by WIPO that EP was designated in this application

Ref document number: 22931571

Country of ref document: EP

Kind code of ref document: A1