WO2023134071A1 - Person re-identification method and apparatus, electronic device and storage medium - Google Patents

Person re-identification method and apparatus, electronic device and storage medium

Info

Publication number
WO2023134071A1
Authority
WO
WIPO (PCT)
Prior art keywords
pedestrian
sample
identified
body part
feature
Prior art date
Application number
PCT/CN2022/089991
Other languages
French (fr)
Chinese (zh)
Inventor
郑喜民
翟尤
舒畅
陈又新
Original Assignee
平安科技(深圳)有限公司
Application filed by 平安科技(深圳)有限公司
Publication of WO2023134071A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/40 Extraction of image or video features
    • G06V10/42 Global feature extraction by analysis of the whole pattern, e.g. using frequency domain transformations or autocorrelation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/40 Extraction of image or video features
    • G06V10/44 Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; Connectivity analysis, e.g. of connected components
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/40 Extraction of image or video features
    • G06V10/62 Extraction of image or video features relating to a temporal dimension, e.g. time-based feature extraction; Pattern tracking
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/70 Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/77 Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
    • G06V10/80 Fusion, i.e. combining data from various sources at the sensor level, preprocessing level, feature extraction level or classification level
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/70 Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/82 Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00 Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10 Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands

Definitions

  • the present application relates to the technical field of artificial intelligence, in particular to a pedestrian re-identification method, device, electronic equipment and storage medium.
  • Person re-identification, also known as pedestrian re-identification, is a technology that uses computer vision to determine whether a specific pedestrian is present in an image or video sequence. It is widely considered to be a sub-problem of image retrieval.
  • existing video-based methods perform pose estimation on each frame and then conduct pedestrian re-identification.
  • This application proposes a pedestrian re-identification method, device, electronic equipment, and storage medium. By dividing the pedestrian to be identified into multiple body parts and performing separate calculations for each body part, the accuracy of pedestrian re-identification is improved.
  • the first aspect of the present application provides a pedestrian re-identification method, the method comprising:
  • the re-identification model contains multiple channel attention modules and multiple position attention modules.
  • a second aspect of the present application provides an electronic device, the electronic device includes a memory and a processor, the memory is used to store at least one computer-readable instruction, and the processor is used to execute the at least one computer-readable instruction to implement the following steps:
  • the re-identification model contains multiple channel attention modules and multiple position attention modules.
  • a third aspect of the present application provides a computer-readable storage medium, the computer-readable storage medium stores at least one computer-readable instruction, and when the at least one computer-readable instruction is executed by a processor, the following steps are implemented:
  • the re-identification model contains multiple channel attention modules and multiple position attention modules.
  • a fourth aspect of the present application provides a pedestrian re-identification device, the device comprising:
  • an acquisition module configured to acquire a first image sequence of a pedestrian to be identified, input the first image sequence into a preset posture recognition network, and obtain a first local feature of each body part of the pedestrian to be identified;
  • a first input module configured to input the first image sequence into a preset multi-layer convolutional neural network to obtain the first global feature of the pedestrian to be identified;
  • a fusion module configured to fuse the first local feature of each body part of the pedestrian to be identified with the first global feature for the first time to obtain the second local feature of each body part of the pedestrian to be identified;
  • a second input module configured to input the second local features of each body part of the pedestrian to be identified into a pre-trained pedestrian re-identification model, and receive the pedestrian re-identification result output by the pedestrian re-identification model, wherein the pedestrian re-identification model includes multiple channel attention modules and multiple position attention modules.
  • the pedestrian re-identification method, device, electronic equipment and storage medium described in this application improve the accuracy of pedestrian re-identification.
  • FIG. 1 is a flow chart of a pedestrian re-identification method provided in Embodiment 1 of the present application.
  • FIG. 2 is a structural diagram of a pedestrian re-identification device provided in Embodiment 2 of the present application.
  • FIG. 3 is a schematic structural diagram of an electronic device provided in Embodiment 3 of the present application.
  • FIG. 1 is a flow chart of a pedestrian re-identification method provided in Embodiment 1 of the present application.
  • the pedestrian re-identification method can be applied to electronic devices.
  • the pedestrian re-identification function provided by the method of this application can be integrated directly on the electronic device, or run on the electronic device in the form of a software development kit (SDK).
  • the embodiments of the present application may acquire and process relevant data based on artificial intelligence technology.
  • artificial intelligence (AI) is the theory, method, technology and application system that uses digital computers, or machines controlled by digital computers, to simulate, extend and expand human intelligence, perceive the environment, acquire knowledge, and use knowledge to obtain the best results.
  • Artificial intelligence basic technologies generally include technologies such as sensors, dedicated artificial intelligence chips, cloud computing, distributed storage, big data processing technology, operation/interaction systems, and mechatronics.
  • Artificial intelligence software technology mainly includes several major directions such as computer vision technology, robot technology, biometric technology, speech processing technology, natural language processing technology, machine learning, and deep learning.
  • the pedestrian re-identification method specifically includes the following steps. According to different requirements, the order of the steps in the flow chart can be changed, and some of them can be omitted.
  • the first image sequence refers to multiple consecutive frame images extracted from the captured video of the pedestrian to be identified.
  • the above-mentioned first image sequence can also be stored in a blockchain node.
  • the posture recognition network can be preset. The posture recognition network can be an AlphaPose model, which adopts the RMPE framework. That framework consists of a symmetric spatial transformer network (SSTN), parametric pose non-maximum suppression (PNMS), and a pose-guided proposals generator (PGPG). The AlphaPose model is prior art and is not described in detail in this embodiment.
  • inputting the first image sequence into the preset posture recognition network to obtain the first local features of each body part of the pedestrian to be identified includes:
  • the first image sequence is input into the preset posture recognition network, and each image in the first image sequence is detected in the preset posture recognition network to extract the body parts of the pedestrian to be identified;
  • when extracting the first local features of the body parts of the pedestrian to be identified, 18 body parts can be preset, for example: nose, right eye, left eye, right ear, left ear, right shoulder, left shoulder, right elbow, left elbow, right wrist, left wrist, right hip, left hip, right knee, left knee, right ankle, left ankle, and neck. Each body part of the pedestrian to be identified is calculated separately to extract the first local feature corresponding to that body part.
  • by dividing the pedestrian to be identified into body parts and extracting the posture features of each body part, this embodiment avoids the situation in which, because a body part of the pedestrian is occluded, the features of the occluder are used as the features of the pedestrian. This ensures the accuracy of the extracted first local features of each body part of the pedestrian to be identified.
  • the first local features of each body part of the pedestrian to be identified are taken into account in the subsequent re-identification, which in turn improves the accuracy of pedestrian re-identification.
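  • As a rough illustration of per-part feature extraction, the sketch below pools one feature vector per body part from a feature map, using keypoint coordinates and confidences of the kind a pose estimator returns. The function name `part_local_features`, the Gaussian pooling window, and the `sigma` parameter are assumptions made for this example; the application does not specify how the first local features are computed.

```python
import numpy as np

# The 18 body parts listed in the text, in the order given there.
BODY_PARTS = [
    "nose", "right_eye", "left_eye", "right_ear", "left_ear",
    "right_shoulder", "left_shoulder", "right_elbow", "left_elbow",
    "right_wrist", "left_wrist", "right_hip", "left_hip",
    "right_knee", "left_knee", "right_ankle", "left_ankle", "neck",
]


def part_local_features(frame_feats, keypoints, sigma=2.0):
    """Pool one feature vector per body part from a (C, H, W) feature map.

    keypoints is an (18, 3) array of (x, y, confidence) in feature-map
    coordinates. Gaussian pooling scaled by the confidence is an assumption
    of this sketch, not something stated in the source.
    """
    _, height, width = frame_feats.shape
    ys, xs = np.mgrid[0:height, 0:width]
    feats = {}
    for name, (x, y, conf) in zip(BODY_PARTS, keypoints):
        # Normalised Gaussian window centred on the keypoint.
        mask = np.exp(-((xs - x) ** 2 + (ys - y) ** 2) / (2 * sigma ** 2))
        mask /= mask.sum()
        # A low-confidence (e.g. occluded) part contributes a damped feature.
        feats[name] = conf * (frame_feats * mask).sum(axis=(1, 2))
    return feats
```

  • Because each part is pooled separately, a part with near-zero confidence yields a near-zero vector instead of injecting occluder features into a single whole-body descriptor.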
  • a multi-layer convolutional neural network can be preset, and feature extraction is performed on each image in the first image sequence of the pedestrian to be identified through the preset multi-layer convolutional neural network to obtain the first global feature of the pedestrian to be identified.
  • the inputting the first image sequence into a preset multi-layer convolutional neural network to obtain the first global feature of the pedestrian to be identified includes:
  • the first image sequence is input into a preset deep residual network ResNet50 for human detection, and the first global feature of the pedestrian to be recognized is obtained.
  • residual learning is performed on each picture in the first image sequence through the convolutional layers in the deep residual network ResNet50. A residual network is easier to optimize and can gain accuracy from increased depth.
  • the deep residual network addresses the degradation problem that arises when a network is made deeper, so the performance of the network can be improved by increasing its depth, thereby improving the accuracy of the acquired first global feature.
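  • The residual idea can be sketched in a few lines. The block below is a simplified stand-in that uses dense matrices instead of the convolutions in ResNet50; `residual_block`, `w1` and `w2` are names chosen for this illustration only.

```python
import numpy as np


def residual_block(x, w1, w2):
    """Compute relu(F(x) + x), where F is a small two-layer transform.

    Dense matrix products stand in for the convolutional layers of a real
    ResNet bottleneck; that simplification is an assumption of this sketch.
    """
    def relu(t):
        return np.maximum(t, 0.0)

    fx = relu(x @ w1) @ w2  # residual branch F(x)
    return relu(fx + x)     # identity shortcut added before the activation
```

  • With all-zero weights the residual branch vanishes and the block passes a non-negative input through unchanged, which is why adding such blocks need not make a deeper network worse than a shallower one.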
  • the first local feature of each body part is made more accurate.
  • fusing the first local feature of each body part of the pedestrian to be identified with the first global feature to obtain the second local feature of each body part of the pedestrian to be identified includes:
  • the pedestrian re-identification model can be pre-trained. After the second local features of the multiple body parts of the pedestrian to be identified are obtained, they are input into the pre-trained pedestrian re-identification model to obtain the pedestrian re-identification result. The result is either that the pedestrian to be identified in the first image sequence is the same pedestrian, or that the pedestrian to be identified in the first image sequence is not the same pedestrian.
  • the pre-training process of the pedestrian re-identification model includes:
  • if the test pass rate is greater than or equal to a preset pass rate threshold, it is determined that the training of the pedestrian re-identification model is complete; if the test pass rate is less than the preset pass rate threshold, the second sample data set is updated to obtain a new training set, and the new training set is input into the preset neural network to retrain the pedestrian re-identification model.
  • the second image sequence of each pedestrian sample can be obtained in advance, wherein most of the second image sequences contain multiple consecutive frame images, and the pedestrian re-identification model is trained on the multiple second image sequences of the multiple pedestrian samples.
  • the first sample data set refers to the first fusion of the third local features of the multiple body parts of each pedestrian sample with the first global feature, which ensures the effectiveness of the sample images input to the channel attention module and the position attention module.
  • the second sample data set refers to a second fusion of the target channel attention result, the target position attention result and the second global feature of each pedestrian sample.
  • the training set and the test set are divided from the second sample data set, and the division rule can be set in advance. For example, the training set accounts for 70% of the second sample data set and the test set accounts for 30%.
  • the pass rate threshold can be set in advance, for example to 98%. When the test pass rate is greater than or equal to 98%, the pedestrian re-identification model is considered to have passed training and training ends. When the test pass rate is less than 98%, the second sample data set is updated to increase the number of second image sequences in the training set, the new training set is input into the preset neural network for training, and the above steps are repeated until the test pass rate is greater than or equal to 98%.
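  • The split-and-retrain procedure can be outlined as follows. This is a minimal sketch: `train_fn`, `eval_fn` and `more_samples_fn` are hypothetical callbacks standing in for the preset neural network, the test-pass-rate computation and the data update step, and the `max_rounds` guard is an addition of this example so that the loop always terminates.

```python
import random


def split_dataset(samples, train_ratio=0.7, seed=0):
    """Shuffle and split samples into a training set and a test set."""
    items = list(samples)
    random.Random(seed).shuffle(items)
    cut = int(round(len(items) * train_ratio))
    return items[:cut], items[cut:]


def train_until_pass(samples, train_fn, eval_fn, more_samples_fn,
                     threshold=0.98, max_rounds=10):
    """Retrain on an enlarged data set until the test pass rate is reached."""
    samples = list(samples)
    for round_idx in range(max_rounds):
        train_set, test_set = split_dataset(samples)
        model = train_fn(train_set)           # hypothetical training callback
        pass_rate = eval_fn(model, test_set)  # fraction of test cases passed
        if pass_rate >= threshold:
            return model, pass_rate, round_idx + 1
        # Below the threshold: enlarge the sample data set and try again.
        samples = samples + list(more_samples_fn())
    raise RuntimeError("pass rate threshold not reached within max_rounds")
```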
  • Results include:
  • each body part corresponds to a channel attention module and a position attention module
  • a plurality of fourth local features of multiple body parts of each pedestrian sample are respectively input into the corresponding channel attention module and the corresponding position attention module for weighting processing, and the channel of each body part of each pedestrian sample is obtained attention results and position attention results;
  • a first average value is calculated over the multiple channel attention results of the multiple body parts of each pedestrian sample, and the first average value is determined as the target channel attention result of the corresponding pedestrian sample; a second average value is calculated over the multiple position attention results of the multiple body parts of each pedestrian sample, and the second average value is determined as the target position attention result of the corresponding pedestrian sample.
  • the channel attention module and the position attention module are used to focus on the meaningful features in the fourth local features of each body part of each pedestrian sample.
  • the channel attention module can use two pooling methods, global average pooling and maximum pooling, to obtain the meaningful fourth local features in each body part of each pedestrian sample.
  • the position attention module can apply maximum pooling and average pooling, concatenate the two pooling results, and input them to a convolutional layer; after a weight coefficient is obtained from the Sigmoid activation function, the meaningful fourth local features in each body part of each pedestrian sample are determined.
  • the first average value is obtained by averaging the multiple channel attention results of the multiple body parts of each pedestrian sample; the second average value is obtained by averaging the multiple position attention results of the multiple body parts of each pedestrian sample.
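  • The two attention modules and the averaging step can be sketched with NumPy as below. This follows the pooling, concatenation, convolution and Sigmoid steps described above, but the projection matrices `projs`, the kernels `kernels`, and the way the channel weights are formed from the two pooled vectors are assumptions of this example; the application does not give the layer details.

```python
import numpy as np


def sigmoid(t):
    return 1.0 / (1.0 + np.exp(-t))


def channel_attention(feat, proj):
    """Reweight the channels of a (C, H, W) feature map.

    Global average and max pooling give two C-vectors; a shared (C, C)
    projection followed by a sigmoid yields per-channel weights (assumed form).
    """
    avg = feat.mean(axis=(1, 2))
    mx = feat.max(axis=(1, 2))
    weights = sigmoid(proj @ avg + proj @ mx)
    return feat * weights[:, None, None]


def position_attention(feat, kernel):
    """Reweight spatial positions: pool over channels, concatenate, convolve."""
    pooled = np.stack([feat.max(axis=0), feat.mean(axis=0)])  # (2, H, W)
    k = kernel.shape[-1]
    pad = k // 2
    padded = np.pad(pooled, ((0, 0), (pad, pad), (pad, pad)))
    height, width = feat.shape[1:]
    logits = np.empty((height, width))
    for i in range(height):
        for j in range(width):
            # One (2, k, k) kernel applied at every position, zero padded.
            logits[i, j] = np.sum(padded[:, i:i + k, j:j + k] * kernel)
    return feat * sigmoid(logits)[None, :, :]


def target_attention(part_feats, projs, kernels):
    """Average the per-part attention results into the two target results."""
    channel = [channel_attention(f, p) for f, p in zip(part_feats, projs)]
    position = [position_attention(f, k) for f, k in zip(part_feats, kernels)]
    return np.mean(channel, axis=0), np.mean(position, axis=0)
```

  • Each body part gets its own projection and kernel here, mirroring the statement that every body part corresponds to one channel attention module and one position attention module.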
  • performing the second fusion of the target channel attention result, the target position attention result and the second global feature of each pedestrian sample to obtain the third global feature of each pedestrian sample includes:
  • the second fusion refers to combining the target channel attention result of each pedestrian sample with the target position attention result and multiplying the combination by the second global feature of that pedestrian sample, to obtain a new feature emphasized by dual attention on body parts, i.e. the third global feature of each pedestrian sample.
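  • A minimal sketch of the second fusion is given below. The text does not say how the two attention results are "combined"; element-wise addition is assumed here purely for illustration, and `second_fusion` is a name invented for this example.

```python
import numpy as np


def second_fusion(target_channel, target_position, global_feat):
    """Combine the two target attention results and scale the global feature.

    Element-wise addition as the combination step is an assumption of this
    sketch; the multiplication with the global feature follows the text.
    """
    combined = target_channel + target_position
    return combined * global_feat
```

  • All three inputs are assumed to share a shape (or to be broadcastable to one), so the result has the shape of the global feature.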
  • the second sample data set used in training the pedestrian re-identification model adopts the third global feature emphasized by the dual attention on body parts. This ensures the accuracy of the features in the training set, so that the trained pedestrian re-identification model is better optimized, thereby improving the accuracy of pedestrian re-identification.
  • the third local feature of each body part is calculated for each pedestrian sample. Calculating each body part of each pedestrian sample separately reduces the number of difficult samples with small inter-class distance and makes it possible to distinguish different pedestrians with similar appearances, thereby improving the accuracy of pedestrian re-identification. Calculating on each recognized body part of a pedestrian also prevents occluders from entering the overall calculation, reduces the influence of occluders, and improves the accuracy of subsequent pedestrian re-identification.
  • the first position coordinates and the first confidence level of each body part of the pedestrian to be identified are acquired through the posture recognition network, and the first local feature of each body part is obtained according to the first position coordinates and the first confidence level. The first global feature of the pedestrian to be identified is then obtained through the preset multi-layer convolutional neural network, and the first local feature of each body part is multiplied by the first global feature to obtain the second local feature of each body part. The first local feature of each body part is input to the channel attention module and the position attention module respectively to obtain the target channel attention result and the target position attention result of the pedestrian to be identified. The target channel attention result and the target position attention result are combined and multiplied by the first global feature to obtain the global feature of the pedestrian to be identified that is emphasized by dual attention.
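  • The first fusion described above (each part's first local feature multiplied by the first global feature) can be written as a one-line mapping. The dictionary layout and the name `first_fusion` are assumptions of this sketch, as is the use of element-wise multiplication over equally shaped vectors.

```python
import numpy as np


def first_fusion(local_feats, global_feat):
    """Multiply each body part's first local feature by the first global feature.

    local_feats maps a body-part name to a feature vector with the same
    shape as global_feat (an assumption of this sketch).
    """
    return {part: feat * global_feat for part, feat in local_feats.items()}
```

  • The returned dictionary holds one second local feature per body part, which is what the pre-trained pedestrian re-identification model takes as input.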
  • the pedestrian re-identification method described in this embodiment acquires the first image sequence of the pedestrian to be identified and inputs it into the preset posture recognition network to obtain the first local feature of each body part of the pedestrian to be identified. On the one hand, this avoids the situation in which the features of an occluder are used as the features of the pedestrian because a body part of the pedestrian is occluded, which ensures the accuracy of the extracted first local features. On the other hand, the first local feature of each body part is fused with the first global feature for the first time to obtain the second local feature of each body part, which makes the local features of each body part more accurate. Finally, the second local features of the multiple body parts of the pedestrian to be identified are input into the pre-trained pedestrian re-identification model.
  • the fourth local features of each body part of each pedestrian sample are input to the channel attention module and the position attention module respectively and weighted according to the posture weight of each body part, so that the channel attention result and the position attention result of each body part are more accurate, thus improving the accuracy of pedestrian re-identification.
  • FIG. 2 is a structural diagram of a pedestrian re-identification device provided in Embodiment 2 of the present application.
  • the pedestrian re-identification device 20 may include a plurality of functional modules composed of program code segments.
  • the program codes of each program segment in the pedestrian re-identification device 20 can be stored in the memory of the electronic device, and executed by the at least one processor to perform the pedestrian re-identification function (see FIG. 1 for details).
  • the pedestrian re-identification device 20 can be divided into multiple functional modules according to the functions it performs.
  • the functional modules may include: an acquisition module 201 , a first input module 202 , a fusion module 203 and a second input module 204 .
  • the module referred to in this application refers to a series of computer-readable instruction segments that can be executed by at least one processor and can complete fixed functions, and are stored in a memory. In this embodiment, the functions of each module will be described in detail in subsequent embodiments.
  • the acquisition module 201 is configured to acquire a first image sequence of a pedestrian to be identified, input the first image sequence into a preset posture recognition network, and obtain a first local feature of each body part of the pedestrian to be identified, wherein the pedestrian to be identified includes multiple body parts.
  • the first input module 202 is configured to input the first image sequence into a preset multi-layer convolutional neural network to obtain the first global feature of the pedestrian to be identified.
  • the fusion module 203 is configured to fuse the first local feature of each body part of the pedestrian to be identified with the first global feature for the first time to obtain the second local feature of each body part of the pedestrian to be identified.
  • the second input module 204 is configured to input the second local features of each body part of the pedestrian to be identified into a pre-trained pedestrian re-identification model, and receive the pedestrian re-identification result output by the pedestrian re-identification model, wherein the pedestrian re-identification model includes multiple channel attention modules and multiple position attention modules.
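  • The way the four modules chain together can be illustrated with a small container class. The class name `PedestrianReIdDevice` and its four callables are placeholders for this example; real modules would wrap the posture recognition network, the convolutional backbone, the fusion step and the trained model.

```python
from dataclasses import dataclass
from typing import Any, Callable


@dataclass
class PedestrianReIdDevice:
    """Hypothetical wiring of modules 201 to 204 as plain callables."""
    pose_network: Callable[[Any], Any]  # used by acquisition module 201
    backbone: Callable[[Any], Any]      # used by first input module 202
    fuse: Callable[[Any, Any], Any]     # used by fusion module 203
    reid_model: Callable[[Any], Any]    # used by second input module 204

    def run(self, image_sequence):
        local = self.pose_network(image_sequence)     # first local features
        global_feat = self.backbone(image_sequence)   # first global feature
        second_local = self.fuse(local, global_feat)  # second local features
        return self.reid_model(second_local)          # re-identification result
```

  • Keeping each stage behind a callable makes it straightforward to swap one component, for example a different pose estimator, without touching the rest of the pipeline.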
  • the pedestrian re-identification device described in this embodiment acquires the first image sequence of the pedestrian to be identified and inputs it into the preset posture recognition network to obtain the first local feature of each body part of the pedestrian to be identified. On the one hand, this avoids the situation in which the features of an occluder are used as the features of the pedestrian because a body part of the pedestrian is occluded, which ensures the accuracy of the extracted first local features. On the other hand, the first local feature of each body part is fused with the first global feature for the first time to obtain the second local feature of each body part, which makes the local features of each body part more accurate. Finally, the second local features of the multiple body parts of the pedestrian to be identified are input into the pre-trained pedestrian re-identification model.
  • the fourth local features of each body part of each pedestrian sample are input to the channel attention module and the position attention module respectively and weighted according to the posture weight of each body part, so that the channel attention result and the position attention result of each body part are more accurate, thus improving the accuracy of pedestrian re-identification.
  • the electronic device 3 includes a memory 31 , at least one processor 32 , at least one communication bus 33 and a transceiver 34 .
  • the structure of the electronic device shown in FIG. 3 does not limit the embodiments of the present application. It can be a bus structure or a star structure, and the electronic device 3 may include more or fewer hardware or software components than shown, or a different arrangement of components.
  • the electronic device 3 is an electronic device that can automatically perform numerical calculation and/or information processing according to preset or stored instructions, and its hardware includes but is not limited to microprocessors, application-specific integrated circuits, programmable gate arrays, digital processors, embedded devices, etc.
  • the electronic device 3 may also include a client device, which includes but is not limited to any electronic product that can interact with the client through a keyboard, mouse, remote control, touch pad, or voice-activated device, for example, personal computers, tablets, smartphones, digital cameras, etc.
  • the electronic device 3 is only an example, and other existing or future electronic products that can be adapted to this application should also be included in the scope of protection of this application, and are included here by reference.
  • the memory 31 is used to store program codes and various data, such as the pedestrian re-identification device 20 installed in the electronic device 3, and to provide high-speed, automatic access to programs or data during the operation of the electronic device 3.
  • the memory 31 comprises non-volatile memory and volatile memory, such as read-only memory (ROM), programmable read-only memory (PROM), erasable programmable read-only memory (EPROM), one-time programmable read-only memory (OTPROM), electrically erasable programmable read-only memory (EEPROM), compact disc read-only memory (CD-ROM) or other optical disk storage, disk storage, tape storage, or any other computer-readable medium that can be used to carry or store data.
  • the at least one processor 32 may be composed of integrated circuits, for example a single packaged integrated circuit, or multiple packaged integrated circuits with the same or different functions, including one or a combination of multiple central processing units (CPU), microprocessors, digital processing chips, graphics processors, and various control chips.
  • the at least one processor 32 is the control core (control unit) of the electronic device 3. It uses various interfaces and lines to connect the components of the entire electronic device 3, and executes the functions of the electronic device 3 and processes data by running or executing the programs or modules stored in the memory 31 and calling the data stored in the memory 31.
  • the at least one communication bus 33 is configured to realize connection and communication between the memory 31 and the at least one processor 32 and so on.
  • the electronic device 3 may also include a power supply (such as a battery) for supplying power to various components.
  • the power supply may be logically connected to the at least one processor 32 through a power management device, thereby realizing functions such as charge management, discharge management, and power consumption management.
  • the power supply may also include one or more DC or AC power supplies, recharging devices, power failure detection circuits, power converters or inverters, power status indicators and other arbitrary components.
  • the electronic device 3 may also include various sensors, Bluetooth modules, Wi-Fi modules, etc., which will not be repeated here.
  • the above-mentioned integrated units implemented in the form of software function modules can be stored in a computer-readable storage medium.
  • the above-mentioned software function modules are stored in a storage medium and include several instructions that cause a computer device (which may be a personal computer, electronic device, network device, etc.) or a processor to execute parts of the methods described in the various embodiments of the present application.
  • the at least one processor 32 can execute the operating system of the electronic device 3 as well as the installed applications (such as the pedestrian re-identification device 20), program codes, etc., for example, the modules mentioned above.
  • Program codes are stored in the memory 31 , and the at least one processor 32 can invoke the program codes stored in the memory 31 to execute related functions.
  • the various modules described in FIG. 2 are program codes stored in the memory 31 and executed by the at least one processor 32, so as to implement the functions of the various modules to achieve the purpose of pedestrian re-identification.
  • the memory 31 stores a plurality of computer-readable instructions, and the plurality of computer-readable instructions are executed by the at least one processor 32 to implement the pedestrian re-identification function.
  • the program code may be divided into one or more modules/units, and the one or more modules/units are stored in the memory 31 and executed by the processor 32 to carry out the present application.
  • the one or more modules/units may be a series of computer-readable instruction segments capable of accomplishing specific functions, and the instruction segments are used to describe the execution process of the computer program in the electronic device 3 .
  • the program code can be divided into an acquisition module 201 , a first input module 202 , a fusion module 203 and a second input module 204 .
  • the computer-readable storage medium may be non-volatile or volatile.
  • the computer-readable storage medium may mainly include a program storage area and a data storage area, wherein the program storage area may store an operating system, an application program required by at least one function, etc.; the data storage area may store data created through use of the blockchain node, etc.
  • A blockchain, essentially a decentralized database, is a series of data blocks linked to each other using cryptographic methods. Each data block contains a batch of network transaction information, which is used to verify the validity of that information (anti-counterfeiting) and to generate the next block.
  • the blockchain can include the underlying platform of the blockchain, the platform product service layer, and the application service layer.
  • the modules described as separate components may or may not be physically separated, and the components displayed as modules may or may not be physical units, and may be located in one place or distributed to multiple network units. Part or all of the modules can be selected according to actual needs to achieve the purpose of the solution of this embodiment.
  • each functional module in each embodiment of the present application may be integrated into one processing unit, each unit may exist separately physically, or two or more units may be integrated into one unit.
  • the above-mentioned integrated units can be implemented in the form of hardware, or in the form of hardware plus software function modules.

Abstract

The present application relates to the technical field of artificial intelligence, and provides a person re-identification method and apparatus, an electronic device and a storage medium. The method comprises: inputting a first image sequence into a preset posture identification network so as to obtain a first local feature of each body part of a person to be identified; inputting the first image sequence into a preset multi-layer convolutional neural network so as to obtain a first global feature of said person; performing first fusion on the first local features of the plurality of body parts and the first global feature so as to obtain a second local feature of each body part; and inputting the second local features of the plurality of body parts of said person into a pre-trained person re-identification model, and outputting a person re-identification result. According to the present application, the accuracy of person re-identification is improved. The present application further relates to blockchain technology, and the first image sequence is stored in a blockchain node.

Description

Pedestrian re-identification method and apparatus, electronic device, and storage medium
This application claims priority to Chinese patent application No. 202210033877.5, filed with the China Patent Office on January 12, 2022 and entitled "Pedestrian re-identification method, apparatus, electronic device and storage medium", the entire content of which is incorporated herein by reference.
Technical Field
The present application relates to the technical field of artificial intelligence, and in particular to a pedestrian re-identification method and apparatus, an electronic device, and a storage medium.
Background
Person re-identification, also called pedestrian re-identification, is a technique that uses computer vision to determine whether a specific pedestrian is present in an image or a video sequence, and it is widely regarded as a sub-problem of image retrieval. In the prior art, pose estimation on video is performed on each frame, and pedestrian re-identification is carried out afterwards.
However, owing to the complexity of human motion, changes in illumination, background interference, and similar factors, the inventors found that the prior art cannot handle occlusion well when it performs pose processing on each frame, which leads to low pedestrian re-identification accuracy.
Therefore, it is necessary to propose a method that can perform pedestrian re-identification quickly and accurately.
Summary
The present application proposes a pedestrian re-identification method and apparatus, an electronic device, and a storage medium. By dividing the pedestrian to be identified into multiple body parts and computing on each body part separately, the accuracy of pedestrian re-identification is improved.
A first aspect of the present application provides a pedestrian re-identification method, the method comprising:
acquiring a first image sequence of a pedestrian to be identified, and inputting the first image sequence into a preset posture recognition network to obtain a first local feature of each body part of the pedestrian to be identified;
inputting the first image sequence into a preset multi-layer convolutional neural network to obtain a first global feature of the pedestrian to be identified;
performing a first fusion of the first local feature of each body part of the pedestrian to be identified with the first global feature to obtain a second local feature of each body part of the pedestrian to be identified;
inputting the second local feature of each body part of the pedestrian to be identified into a pre-trained pedestrian re-identification model, and receiving a pedestrian re-identification result output by the pedestrian re-identification model, wherein the pedestrian re-identification model contains multiple channel attention modules and multiple position attention modules.
A second aspect of the present application provides an electronic device, the electronic device comprising a memory and a processor, the memory being configured to store at least one computer-readable instruction, and the processor being configured to execute the at least one computer-readable instruction to implement the following steps:
acquiring a first image sequence of a pedestrian to be identified, and inputting the first image sequence into a preset posture recognition network to obtain a first local feature of each body part of the pedestrian to be identified;
inputting the first image sequence into a preset multi-layer convolutional neural network to obtain a first global feature of the pedestrian to be identified;
performing a first fusion of the first local feature of each body part of the pedestrian to be identified with the first global feature to obtain a second local feature of each body part of the pedestrian to be identified;
inputting the second local feature of each body part of the pedestrian to be identified into a pre-trained pedestrian re-identification model, and receiving a pedestrian re-identification result output by the pedestrian re-identification model, wherein the pedestrian re-identification model contains multiple channel attention modules and multiple position attention modules.
A third aspect of the present application provides a computer-readable storage medium, the computer-readable storage medium storing at least one computer-readable instruction which, when executed by a processor, implements the following steps:
acquiring a first image sequence of a pedestrian to be identified, and inputting the first image sequence into a preset posture recognition network to obtain a first local feature of each body part of the pedestrian to be identified;
inputting the first image sequence into a preset multi-layer convolutional neural network to obtain a first global feature of the pedestrian to be identified;
performing a first fusion of the first local feature of each body part of the pedestrian to be identified with the first global feature to obtain a second local feature of each body part of the pedestrian to be identified;
inputting the second local feature of each body part of the pedestrian to be identified into a pre-trained pedestrian re-identification model, and receiving a pedestrian re-identification result output by the pedestrian re-identification model, wherein the pedestrian re-identification model contains multiple channel attention modules and multiple position attention modules.
A fourth aspect of the present application provides a pedestrian re-identification apparatus, the apparatus comprising:
an acquisition module, configured to acquire a first image sequence of a pedestrian to be identified, and input the first image sequence into a preset posture recognition network to obtain a first local feature of each body part of the pedestrian to be identified;
a first input module, configured to input the first image sequence into a preset multi-layer convolutional neural network to obtain a first global feature of the pedestrian to be identified;
a fusion module, configured to perform a first fusion of the first local feature of each body part of the pedestrian to be identified with the first global feature to obtain a second local feature of each body part of the pedestrian to be identified;
a second input module, configured to input the second local feature of each body part of the pedestrian to be identified into a pre-trained pedestrian re-identification model, and receive a pedestrian re-identification result output by the pedestrian re-identification model, wherein the pedestrian re-identification model contains multiple channel attention modules and multiple position attention modules.
The pedestrian re-identification method and apparatus, electronic device, and storage medium described in the present application improve the accuracy of pedestrian re-identification.
Brief Description of the Drawings
FIG. 1 is a flowchart of the pedestrian re-identification method provided in Embodiment 1 of the present application.
FIG. 2 is a structural diagram of the pedestrian re-identification apparatus provided in Embodiment 2 of the present application.
FIG. 3 is a schematic structural diagram of the electronic device provided in Embodiment 3 of the present application.
Detailed Description
To make the above objects, features, and advantages of the present application easier to understand, the present application is described in detail below with reference to the accompanying drawings and specific embodiments. It should be noted that, where no conflict arises, the embodiments of the present application and the features in those embodiments may be combined with one another.
Unless otherwise defined, all technical and scientific terms used herein have the same meaning as commonly understood by those skilled in the technical field to which the present application belongs. The terms used in the specification of the present application are intended only to describe specific embodiments and are not intended to limit the present application.
Embodiment 1
FIG. 1 is a flowchart of the pedestrian re-identification method provided in Embodiment 1 of the present application.
In this embodiment, the pedestrian re-identification method may be applied to an electronic device. For an electronic device that needs to perform pedestrian re-identification, the pedestrian re-identification function provided by the method of the present application may be integrated directly on the electronic device, or may run on the electronic device in the form of a software development kit (SDK).
The embodiments of the present application may acquire and process relevant data based on artificial intelligence technology. Artificial intelligence (AI) is the theory, methods, technology, and application systems that use a digital computer, or a machine controlled by a digital computer, to simulate, extend, and expand human intelligence, perceive the environment, acquire knowledge, and use that knowledge to obtain the best results.
Basic artificial intelligence technologies generally include sensors, dedicated artificial intelligence chips, cloud computing, distributed storage, big data processing, operation/interaction systems, and mechatronics. Artificial intelligence software technologies mainly include computer vision, robotics, biometrics, speech processing, natural language processing, and machine learning/deep learning.
As shown in FIG. 1, the pedestrian re-identification method specifically includes the following steps. Depending on requirements, the order of the steps in the flowchart may be changed, and some steps may be omitted.
S11: Acquire a first image sequence of a pedestrian to be identified, and input the first image sequence into a preset posture recognition network to obtain a first local feature of each body part of the pedestrian to be identified, wherein the pedestrian to be identified has multiple body parts.
In this embodiment, the first image sequence refers to multiple consecutive frame images extracted from a captured video of the pedestrian to be identified.
It should be emphasized that, to further ensure the privacy and security of the first image sequence, the first image sequence may also be stored in a node of a blockchain.
In this embodiment, the posture recognition network may be set in advance. The posture recognition network may be an AlphaPose model, which adopts the RMPE framework and consists of a symmetric spatial transformer network (SSTN), parametric pose non-maximum suppression (PNMS), and a pose-guided proposals generator (PGPG). The AlphaPose model is prior art and is not described in detail in this embodiment.
In an optional embodiment, inputting the first image sequence into the preset posture recognition network to obtain the first local feature of each body part of the pedestrian to be identified includes:
inputting the first image sequence into the preset posture recognition network, and detecting each image in the first image sequence in the preset posture recognition network to extract the body parts of the pedestrian to be identified;
acquiring first position coordinates and a first confidence level of each body part of the pedestrian to be identified;
performing vector conversion on the first position coordinates and the first confidence level of each body part of the pedestrian to be identified to obtain the first local feature of the corresponding body part of the pedestrian to be identified.
In this embodiment, when the first local features of the body parts of the pedestrian to be identified are extracted, 18 body parts may be preset. For example, the body parts may include: nose, right eye, left eye, right ear, left ear, right shoulder, left shoulder, right elbow, left elbow, right wrist, left wrist, right hip, left hip, right knee, left knee, right ankle, left ankle, and neck. Each body part of the pedestrian to be identified is computed separately, and the first local feature corresponding to each body part is extracted.
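By way of illustration only, the vector conversion described above might be sketched in Python as follows. The detection format (a mapping from part name to an `(x, y, confidence)` tuple), the normalisation by image size, the zero vector for undetected parts, and the names `BODY_PARTS` and `local_features` are assumptions made for this sketch; the present application does not specify them.

```python
from typing import Dict, List, Tuple

# The 18 body parts listed in the description, in the order given there.
BODY_PARTS: List[str] = [
    "nose", "right_eye", "left_eye", "right_ear", "left_ear",
    "right_shoulder", "left_shoulder", "right_elbow", "left_elbow",
    "right_wrist", "left_wrist", "right_hip", "left_hip",
    "right_knee", "left_knee", "right_ankle", "left_ankle", "neck",
]

# Hypothetical detection format: part name -> (x, y, confidence).
Detection = Dict[str, Tuple[float, float, float]]


def local_features(detection: Detection, width: int, height: int) -> Dict[str, List[float]]:
    """Turn each part's position and confidence into a small vector.

    Coordinates are normalised by the image size (an assumption). A part
    that the pose network did not report gets an all-zero vector, so an
    occluded part contributes no signal of its own.
    """
    features: Dict[str, List[float]] = {}
    for part in BODY_PARTS:
        if part in detection:
            x, y, conf = detection[part]
            features[part] = [x / width, y / height, conf]
        else:
            features[part] = [0.0, 0.0, 0.0]
    return features
```

Calling `local_features` on one frame yields one vector per body part; a real system would presumably repeat this for every frame of the first image sequence.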
In this embodiment, the pedestrian to be identified is divided into body parts and the posture feature of each body part is extracted. This avoids treating the features of an occluding object as features of the pedestrian when one of the pedestrian's body parts is occluded, and so ensures the accuracy of the extracted first local feature of each body part. Because the subsequent pedestrian re-identification process takes the first local feature of each body part into account, the accuracy of pedestrian re-identification is improved.
S12: Input the first image sequence into a preset multi-layer convolutional neural network to obtain a first global feature of the pedestrian to be identified.
In this embodiment, the multi-layer convolutional neural network may be set in advance. The preset multi-layer convolutional neural network performs feature extraction on each image in the first image sequence of the pedestrian to be identified, thereby obtaining the first global feature of the pedestrian to be identified.
In an optional embodiment, inputting the first image sequence into the preset multi-layer convolutional neural network to obtain the first global feature of the pedestrian to be identified includes:
inputting the first image sequence into a preset deep residual network ResNet50 for human body detection to obtain the first global feature of the pedestrian to be identified.
In this embodiment, the convolutional layers of the deep residual network ResNet50 perform residual learning on each image in the first image sequence. A residual network is easier to optimize and can gain accuracy from added depth: it addresses the degradation problem that otherwise arises when a deep network is made deeper, so network performance can be improved by increasing depth, which in turn improves the accuracy of the first global feature obtained.
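The idea of residual learning can be illustrated with a minimal Python sketch. This is not ResNet50 itself: the `transform` argument stands in for the stacked convolutional layers, and the names `relu` and `residual_block` are hypothetical. The sketch only shows the identity shortcut, in which a block outputs `relu(F(x) + x)` so that its layers need to learn only the residual `F(x)`.

```python
from typing import Callable, List

Vector = List[float]


def relu(v: Vector) -> Vector:
    """Element-wise rectified linear unit."""
    return [max(0.0, x) for x in v]


def residual_block(x: Vector, transform: Callable[[Vector], Vector]) -> Vector:
    """Return relu(transform(x) + x).

    `transform` is a placeholder for the block's learned layers. If it
    outputs zeros, the block reduces to relu(x), which is why adding such
    blocks does not have to degrade a deeper network.
    """
    fx = transform(x)
    if len(fx) != len(x):
        raise ValueError("identity shortcut needs matching dimensions")
    return relu([a + b for a, b in zip(fx, x)])
```

With a transform that returns all zeros, the block passes non-negative inputs through unchanged, which is the property that makes very deep stacks of such blocks easier to optimize.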
S13: Perform a first fusion of the first local feature of each body part of the pedestrian to be identified with the first global feature to obtain a second local feature of each body part of the pedestrian to be identified.
In this embodiment, fusing the first global feature of the pedestrian to be identified with the first local feature of each body part makes the local feature of each body part more accurate.
In an optional embodiment, performing the first fusion of the first local feature of each body part of the pedestrian to be identified with the first global feature to obtain the second local feature of each body part of the pedestrian to be identified includes:
calculating the product of the first local feature of each of the multiple body parts of the pedestrian to be identified and the first global feature of the pedestrian to be identified to obtain the second local feature of the corresponding body part of the pedestrian to be identified.
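A minimal sketch of this first fusion is given below. The present application states only that a product is calculated; the sketch assumes an element-wise product between vectors of equal length, and the function name `fuse_first` is hypothetical.

```python
from typing import Dict, List


def fuse_first(local_feats: Dict[str, List[float]], global_feat: List[float]) -> Dict[str, List[float]]:
    """Multiply each body part's local feature by the global feature.

    Assumes (not stated in the source) that both features have already
    been projected to the same dimension, so the product is element-wise.
    """
    fused: Dict[str, List[float]] = {}
    for part, vec in local_feats.items():
        if len(vec) != len(global_feat):
            raise ValueError(f"dimension mismatch for body part {part!r}")
        fused[part] = [l * g for l, g in zip(vec, global_feat)]
    return fused
```

The result keeps one vector per body part, which matches the description of a second local feature for each body part.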
S14: Input the second local feature of each body part of the pedestrian to be identified into a pre-trained pedestrian re-identification model, and receive a pedestrian re-identification result output by the pedestrian re-identification model, wherein the pedestrian re-identification model contains multiple channel attention modules and multiple position attention modules.
In this embodiment, the pedestrian re-identification model may be trained in advance. After the second local features of the multiple body parts of the pedestrian to be identified are obtained, the second local features are input into the pre-trained pedestrian re-identification model to obtain a pedestrian re-identification result. The pedestrian re-identification result is either that the pedestrian to be identified in the first image sequence is the same pedestrian, or that the pedestrian to be identified in the first image sequence is not the same pedestrian.
Specifically, the pre-training process of the pedestrian re-identification model includes:
acquiring a second image sequence of each pedestrian sample, wherein each pedestrian sample has multiple body parts;
inputting the second image sequence of each pedestrian sample into the preset posture recognition network to obtain second position coordinates and a second confidence level of the multiple body parts of each pedestrian sample;
acquiring a third local feature of the corresponding body part according to the second position coordinates and the second confidence level of each body part of each pedestrian sample;
inputting the second image sequence of each pedestrian sample into the preset multi-layer convolutional neural network to obtain a second global feature of each pedestrian sample;
performing a first fusion of the third local feature of each body part of each pedestrian sample with the first global feature to obtain a fourth local feature of each body part of each pedestrian sample;
taking the fourth local features of the multiple body parts of each pedestrian sample as a first sample data set;
inputting the first sample data set into the channel attention modules and the position attention modules respectively for processing to obtain a target channel attention result and a target position attention result of each pedestrian sample;
performing a second fusion of the target channel attention result, the target position attention result, and the second global feature of each pedestrian sample to obtain a third global feature of each pedestrian sample;
taking the multiple third global features of the multiple pedestrian samples as a second sample data set;
dividing the second sample data set into a training set and a test set;
inputting the training set into a preset neural network for training to obtain the pedestrian re-identification model;
inputting the test set into the pedestrian re-identification model for testing, and calculating a test pass rate;
if the test pass rate is greater than or equal to a preset pass rate threshold, determining that training of the pedestrian re-identification model is complete; if the test pass rate is less than the preset pass rate threshold, updating the second sample data set to obtain a new training set, and inputting the new training set into the preset neural network to retrain the pedestrian re-identification model.
In this embodiment, the second image sequence of each pedestrian sample may be acquired in advance, wherein the second image sequences contain multiple consecutive frame images, and the pedestrian re-identification model is trained on the multiple second image sequences of the multiple pedestrian samples.
In this embodiment, the first sample data set is obtained by performing the first fusion of the third local features of the multiple body parts of each pedestrian sample with the first global feature, which ensures the validity of the sample images input into the channel attention modules and the position attention modules.
In this embodiment, the second sample data set is obtained by performing the second fusion of the target channel attention result, the target position attention result, and the second global feature of each pedestrian sample.
In this embodiment, the training set and the test set are divided from the second sample data set according to a division rule that may be set in advance. For example, the training set may account for 70% of the second sample data set and the test set for 30%.
In this embodiment, the pass rate threshold may be set in advance. For example, the pass rate threshold may be set to 98%. When the test pass rate is greater than or equal to 98%, the pedestrian re-identification model is determined to have passed training, and training ends. When the test pass rate is less than 98%, the second sample data set is updated to increase the number of second image sequences in the training set, yielding a new training set; the new training set is input into the preset neural network for training, and the above steps are repeated until the test pass rate is greater than or equal to 98%.
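The division rule and the pass-rate check described above can be sketched in Python as follows. The shuffling with a fixed seed, the interpretation of the pass rate as the fraction of correct predictions on the test set, and the names `split_dataset`, `pass_rate`, and `training_finished` are assumptions for this sketch; the 70%/30% split and the 98% threshold are the example values given in this embodiment.

```python
import random
from typing import List, Sequence, Tuple, TypeVar

T = TypeVar("T")


def split_dataset(samples: Sequence[T], train_ratio: float = 0.7, seed: int = 0) -> Tuple[List[T], List[T]]:
    """Shuffle the samples and split them into a training set and a test set."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)
    cut = round(len(shuffled) * train_ratio)
    return shuffled[:cut], shuffled[cut:]


def pass_rate(predictions: Sequence[bool], labels: Sequence[bool]) -> float:
    """Fraction of test samples whose prediction matches the label."""
    if not labels or len(predictions) != len(labels):
        raise ValueError("predictions and labels must be non-empty and equal in length")
    correct = sum(p == y for p, y in zip(predictions, labels))
    return correct / len(labels)


def training_finished(rate: float, threshold: float = 0.98) -> bool:
    """Training ends once the test pass rate reaches the preset threshold."""
    return rate >= threshold
```

When `training_finished` returns `False`, the embodiment calls for enlarging the second sample data set and training again; that outer loop is omitted here because it depends on how new samples are collected.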
Further, inputting the first sample data set into the channel attention modules and the position attention modules respectively for processing to obtain the target channel attention result and the target position attention result of each pedestrian sample includes:
acquiring the fourth local feature of each body part of each pedestrian sample from the first sample data set, wherein each body part corresponds to one channel attention module and one position attention module;
inputting the multiple fourth local features of the multiple body parts of each pedestrian sample into the corresponding channel attention modules and the corresponding position attention modules respectively for weighting to obtain a channel attention result and a position attention result of each body part of each pedestrian sample;
calculating a first average value of the multiple channel attention results of the multiple body parts of each pedestrian sample and determining the first average value as the target channel attention result of the corresponding pedestrian, and calculating a second average value of the multiple position attention results of the multiple body parts of each pedestrian sample and determining the second average value as the target position attention result of the corresponding pedestrian.
In this embodiment, the channel attention modules and the position attention modules are used to focus on the meaningful features within the fourth local feature of each body part of each pedestrian sample. The channel attention module may use both global average pooling and max pooling to obtain the meaningful fourth local features of each body part of each pedestrian sample. The position attention module may apply max pooling and average pooling, concatenate the max pooling and average pooling results, and input them into a convolutional layer; after weight coefficients are obtained with a Sigmoid activation function, the meaningful fourth local features of each body part of each pedestrian sample are determined.
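The two attention modules can be sketched, in a heavily simplified form, as follows. This sketch assumes a feature map laid out as `[channel][row][column]`; it omits the learned layers of the channel attention module and replaces the convolutional layer of the position attention module with a 1x1 convolution whose weights `w_max`, `w_avg`, and `bias` are hypothetical parameters. It is meant only to show which pooling operations feed the Sigmoid weights, not to reproduce the model of the present application.

```python
import math
from typing import List

FeatureMap = List[List[List[float]]]  # indexed as [channel][row][col]


def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))


def channel_attention(fmap: FeatureMap) -> List[float]:
    """One weight per channel from global average pooling and max pooling.

    Simplification: the pooled values are summed and passed straight to
    the Sigmoid, with no learned layers in between.
    """
    weights = []
    for channel in fmap:
        values = [v for row in channel for v in row]
        avg_pool = sum(values) / len(values)
        max_pool = max(values)
        weights.append(sigmoid(avg_pool + max_pool))
    return weights


def position_attention(fmap: FeatureMap, w_max: float = 1.0,
                       w_avg: float = 1.0, bias: float = 0.0) -> List[List[float]]:
    """One weight per spatial position.

    Max pooling and average pooling are taken across channels, treated as
    a two-channel map, and combined by a 1x1 convolution (an assumption)
    before the Sigmoid.
    """
    n_channels = len(fmap)
    rows, cols = len(fmap[0]), len(fmap[0][0])
    weights = []
    for r in range(rows):
        row_weights = []
        for c in range(cols):
            across = [fmap[k][r][c] for k in range(n_channels)]
            max_pool = max(across)
            avg_pool = sum(across) / n_channels
            row_weights.append(sigmoid(w_max * max_pool + w_avg * avg_pool + bias))
        weights.append(row_weights)
    return weights
```

In keeping with the description, the two functions are independent: the same input is passed to each, and the channel weights and position weights are produced separately.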
In this embodiment, during training of the pedestrian re-identification model, once the fourth local feature of each body part of each pedestrian sample has been obtained, it is input into the channel attention module and the position attention module respectively and weighted according to the pose weight of each body part. This makes the channel attention result and the position attention result of each body part more precise, which in turn ensures the accuracy of the target channel attention result and the target position attention result of each pedestrian sample.
In this embodiment, the first average value is obtained by averaging the channel attention results of the multiple body parts of each pedestrian sample; the second average value is obtained by averaging the position attention results of the multiple body parts of each pedestrian sample.
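The averaging step can be illustrated with a minimal sketch. It assumes that each body part's attention result has been flattened into a vector of equal length; the function name `mean_over_parts` is hypothetical and does not appear in the application.

```python
def mean_over_parts(per_part_results):
    """Element-wise mean of equally sized vectors, one vector per body part."""
    if not per_part_results:
        raise ValueError("at least one body part result is required")
    length = len(per_part_results[0])
    if any(len(result) != length for result in per_part_results):
        raise ValueError("all body part results must have the same length")
    num_parts = len(per_part_results)
    return [
        sum(result[i] for result in per_part_results) / num_parts
        for i in range(length)
    ]


# Hypothetical channel attention results for three body parts of one sample.
channel_results = [[0.2, 0.8], [0.4, 0.6], [0.6, 0.4]]
target_channel_attention = mean_over_parts(channel_results)
```

The same function would serve for the second average value when given the position attention results instead.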
Further, performing the second fusion on the target channel attention result, the target position attention result, and the second global feature of each pedestrian sample to obtain the third global feature of each pedestrian sample includes:
Computing the product of the target channel attention result, the target position attention result, and the second global feature of each pedestrian sample to obtain the third global feature of each pedestrian sample.
In this embodiment, the second fusion refers to combining the target channel attention result of each pedestrian sample with its target position attention result and then multiplying the combination by the second global feature of that pedestrian sample, yielding a new feature emphasized by dual attention over body parts, namely the third global feature of each pedestrian sample.
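The second fusion can be sketched as an element-wise product. The shapes are an assumption of this sketch, since the application does not state them: the target channel attention result holds one weight per channel (length C), the target position attention result is an H x W map, and the second global feature is a C x H x W map. The function name `second_fusion` is hypothetical.

```python
def second_fusion(channel_result, position_result, global_feature):
    """Multiply a C x H x W global feature by channel and position weights.

    channel_result:  list of C weights
    position_result: H x W nested list of weights
    global_feature:  C x H x W nested list
    """
    return [
        [
            [
                channel_result[c] * position_result[i][j] * value
                for j, value in enumerate(row)
            ]
            for i, row in enumerate(channel_map)
        ]
        for c, channel_map in enumerate(global_feature)
    ]


# Toy input: 2 channels, 1 x 2 spatial grid.
third_global = second_fusion(
    [0.5, 1.0],
    [[1.0, 0.0]],
    [[[2.0, 4.0]], [[6.0, 8.0]]],
)
# third_global == [[[1.0, 0.0]], [[6.0, 0.0]]]
```

Positions or channels with a weight near zero are suppressed in the fused feature, which is the sense in which the result is emphasized by both kinds of attention.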
In this embodiment, the second sample data set used to train the pedestrian re-identification model consists of third global features emphasized by dual attention over body parts. This ensures the precision of the features in the training set, yields a better-optimized pedestrian re-identification model, and thereby improves the accuracy of pedestrian re-identification.
In this embodiment, when the pedestrian re-identification model is created, a third local feature is computed for each body part of each pedestrian sample. Computing each body part separately reduces the number of hard samples with a small inter-class distance, that is, different pedestrians with similar appearance can be distinguished well, which improves the accuracy of pedestrian re-identification. In addition, for a pedestrian to be identified whose body is partly occluded, computing each body part separately keeps the occluders out of the overall computation, reduces their influence, and thereby improves the accuracy of subsequent pedestrian re-identification.
In this embodiment, for the first image sequence of the pedestrian to be identified, the first position coordinates and the first confidence of each body part are obtained through the pose recognition network, and the first local feature of each body part is derived from them. The first global feature of the pedestrian to be identified is then obtained through the preset multi-layer convolutional neural network, and the first local feature of each body part is multiplied by the first global feature to obtain the second local feature of each body part. The first local feature of each body part is input into the channel attention module and the position attention module respectively to obtain the target channel attention result and the target position attention result of the pedestrian to be identified; these two results are combined and multiplied by the first global feature of each pedestrian sample, yielding a global feature of the pedestrian to be identified that is emphasized by dual attention over body parts.
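The first fusion mentioned above (multiplying each body part's first local feature by the first global feature) can be sketched as follows. The sketch assumes that local and global features are flattened to vectors of the same length and that the product is element-wise; the body part names and the function name `first_fusion` are hypothetical.

```python
def first_fusion(local_features, global_feature):
    """Return the second local feature of each body part.

    local_features: dict mapping a body part name to its first local feature
    global_feature: first global feature, same length as each local feature
    """
    fused = {}
    for part, local in local_features.items():
        if len(local) != len(global_feature):
            raise ValueError(f"feature length mismatch for body part {part!r}")
        fused[part] = [l * g for l, g in zip(local, global_feature)]
    return fused


# Toy input: the all-zero vector stands for a body part that was not detected.
second_local = first_fusion(
    {"head": [1.0, 0.5], "legs": [0.0, 0.0]},
    [2.0, 4.0],
)
# second_local == {"head": [2.0, 2.0], "legs": [0.0, 0.0]}
```

In this toy setup a body part whose local feature is all zeros contributes nothing after fusion, which mirrors the stated goal of keeping occluded regions out of the overall computation.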
In summary, in the pedestrian re-identification method of this embodiment: first, the first image sequence of the pedestrian to be identified is acquired and input into the preset pose recognition network to obtain the first local feature of each body part of the pedestrian to be identified. This avoids taking the features of an occluder as features of the pedestrian when one of the pedestrian's body parts is occluded, and ensures the accuracy of the extracted first local feature of each body part. Second, the first local feature of each body part of the pedestrian to be identified is fused with the first global feature for the first time to obtain the second local feature of each body part, which makes the first local feature of each body part more precise. Finally, the second local features of the multiple body parts of the pedestrian to be identified are input into the pre-trained pedestrian re-identification model. Because the model contains multiple channel attention modules and multiple position attention modules, the fourth local feature of each body part of each pedestrian sample is input into the channel attention module and the position attention module respectively and weighted according to the pose weight of each body part, which makes the channel attention result and the position attention result of each body part more precise and thereby improves the accuracy of pedestrian re-identification.
Embodiment 2
FIG. 2 is a structural diagram of the pedestrian re-identification apparatus provided in Embodiment 2 of the present application.
In some embodiments, the pedestrian re-identification apparatus 20 may include multiple functional modules composed of program code segments. The program code of each segment in the pedestrian re-identification apparatus 20 may be stored in the memory of the electronic device and executed by the at least one processor to perform the pedestrian re-identification function (see the description of FIG. 1 for details).
In this embodiment, the pedestrian re-identification apparatus 20 may be divided into multiple functional modules according to the functions it performs. The functional modules may include: an acquisition module 201, a first input module 202, a fusion module 203, and a second input module 204. A module as referred to in this application is a series of computer-readable instruction segments that can be executed by at least one processor, can perform a fixed function, and are stored in the memory. The functions of the modules will be detailed in the subsequent embodiments.
The acquisition module 201 is configured to acquire a first image sequence of a pedestrian to be identified and input the first image sequence into a preset pose recognition network to obtain a first local feature of each body part of the pedestrian to be identified, wherein the pedestrian to be identified has multiple body parts.
The first input module 202 is configured to input the first image sequence into a preset multi-layer convolutional neural network to obtain a first global feature of the pedestrian to be identified.
The fusion module 203 is configured to perform a first fusion of the first local feature of each body part of the pedestrian to be identified with the first global feature to obtain a second local feature of each body part of the pedestrian to be identified.
The second input module 204 is configured to input the second local feature of each body part of the pedestrian to be identified into a pre-trained pedestrian re-identification model and to receive the pedestrian re-identification result output by the model, wherein the pedestrian re-identification model contains multiple channel attention modules and multiple position attention modules.
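As a rough sketch of how the four modules above could be wired together in software, the following class takes the three networks as injected callables. The class name, method names, feature types, and the element-wise form of the fusion are assumptions made for illustration; the application only defines the modules functionally.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Sequence

Vector = List[float]


@dataclass
class PedestrianReIdApparatus:
    """Hypothetical wiring of modules 201 to 204; the networks are injected."""

    pose_network: Callable[[Sequence], Dict[str, Vector]]
    backbone: Callable[[Sequence], Vector]
    reid_model: Callable[[Dict[str, Vector]], str]

    def acquire(self, images: Sequence) -> Dict[str, Vector]:
        # Module 201: first local feature of each body part.
        return self.pose_network(images)

    def first_input(self, images: Sequence) -> Vector:
        # Module 202: first global feature.
        return self.backbone(images)

    def fuse(self, local: Dict[str, Vector], global_feature: Vector) -> Dict[str, Vector]:
        # Module 203: first fusion, sketched as an element-wise product.
        return {
            part: [l * g for l, g in zip(feature, global_feature)]
            for part, feature in local.items()
        }

    def second_input(self, second_local: Dict[str, Vector]) -> str:
        # Module 204: query the pre-trained re-identification model.
        return self.reid_model(second_local)

    def run(self, images: Sequence) -> str:
        local = self.acquire(images)
        global_feature = self.first_input(images)
        return self.second_input(self.fuse(local, global_feature))
```

Injecting the networks as callables keeps the sketch independent of any particular deep learning framework and lets each module be replaced or tested in isolation.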
In summary, in the pedestrian re-identification apparatus of this embodiment: first, the first image sequence of the pedestrian to be identified is acquired and input into the preset pose recognition network to obtain the first local feature of each body part of the pedestrian to be identified. This avoids taking the features of an occluder as features of the pedestrian when one of the pedestrian's body parts is occluded, and ensures the accuracy of the extracted first local feature of each body part. Second, the first local feature of each body part of the pedestrian to be identified is fused with the first global feature for the first time to obtain the second local feature of each body part, which makes the first local feature of each body part more precise. Finally, the second local features of the multiple body parts of the pedestrian to be identified are input into the pre-trained pedestrian re-identification model. Because the model contains multiple channel attention modules and multiple position attention modules, the fourth local feature of each body part of each pedestrian sample is input into the channel attention module and the position attention module respectively and weighted according to the pose weight of each body part, which makes the channel attention result and the position attention result of each body part more precise and thereby improves the accuracy of pedestrian re-identification.
Embodiment 3
FIG. 3 is a schematic structural diagram of the electronic device provided in Embodiment 3 of the present application. In a preferred embodiment of the present application, the electronic device 3 includes a memory 31, at least one processor 32, at least one communication bus 33, and a transceiver 34.
Those skilled in the art should understand that the structure of the electronic device shown in FIG. 3 does not limit the embodiments of the present application. It may be a bus structure or a star structure, and the electronic device 3 may include more or fewer hardware or software components than shown, or a different arrangement of components.
In some embodiments, the electronic device 3 is an electronic device capable of automatically performing numerical computation and/or information processing according to preset or stored instructions; its hardware includes, but is not limited to, microprocessors, application-specific integrated circuits, programmable gate arrays, digital processors, and embedded devices. The electronic device 3 may also include a client device, which includes, but is not limited to, any electronic product capable of human-computer interaction with a user through a keyboard, mouse, remote control, touch pad, voice-control device, or the like, for example a personal computer, tablet computer, smartphone, or digital camera.
It should be noted that the electronic device 3 is only an example; other existing or future electronic products that are adaptable to the present application shall also fall within its scope of protection and are incorporated herein by reference.
In some embodiments, the memory 31 is used to store program code and various data, for example the pedestrian re-identification apparatus 20 installed in the electronic device 3, and provides high-speed, automatic access to programs and data while the electronic device 3 is running. The memory 31 includes non-volatile memory and volatile memory, such as read-only memory (ROM), programmable read-only memory (PROM), erasable programmable read-only memory (EPROM), one-time programmable read-only memory (OTPROM), electrically erasable programmable read-only memory (EEPROM), compact disc read-only memory (CD-ROM) or other optical disc storage, magnetic disk storage, magnetic tape storage, or any other computer-readable medium that can be used to carry or store data.
In some embodiments, the at least one processor 32 may consist of integrated circuits, for example a single packaged integrated circuit or multiple packaged integrated circuits with the same or different functions, including a combination of one or more central processing units (CPUs), microprocessors, digital processing chips, graphics processors, and various control chips. The at least one processor 32 is the control unit of the electronic device 3: it connects the components of the electronic device 3 through various interfaces and lines, and performs the functions of the electronic device 3 and processes data by running or executing the programs or modules stored in the memory 31 and by calling the data stored in the memory 31.
In some embodiments, the at least one communication bus 33 is configured to provide connection and communication between the memory 31, the at least one processor 32, and other components.
Although not shown, the electronic device 3 may also include a power supply (such as a battery) that powers the components. Optionally, the power supply may be logically connected to the at least one processor 32 through a power management apparatus, so that charging, discharging, and power consumption are managed through the power management apparatus. The power supply may also include any of one or more DC or AC power sources, a recharging apparatus, a power failure detection circuit, a power converter or inverter, a power status indicator, and similar components. The electronic device 3 may further include various sensors, a Bluetooth module, a Wi-Fi module, and the like, which are not described here.
It should be understood that the embodiments are for illustration only, and the scope of the patent application is not limited by this structure.
The integrated unit implemented in the form of software functional modules may be stored in a computer-readable storage medium. The software functional modules are stored in a storage medium and include several instructions that cause a computer device (which may be a personal computer, an electronic device, a network device, or the like) or a processor to execute parts of the methods described in the embodiments of the present application.
In a further embodiment, with reference to FIG. 2, the at least one processor 32 may execute the operating device of the electronic device 3 as well as the installed application programs (such as the pedestrian re-identification apparatus 20), program code, and the like, for example the modules described above.
Program code is stored in the memory 31, and the at least one processor 32 can call the program code stored in the memory 31 to perform the related functions. For example, the modules described in FIG. 2 are program code stored in the memory 31 and executed by the at least one processor 32, so as to implement the functions of the modules and achieve pedestrian re-identification.
In one embodiment of the present application, the memory 31 stores multiple computer-readable instructions, which are executed by the at least one processor 32 to implement the pedestrian re-identification function.
Exemplarily, the program code may be divided into one or more modules/units, which are stored in the memory 31 and executed by the processor 32 to carry out the present application. The one or more modules/units may be a series of computer-readable instruction segments capable of performing specific functions, and the instruction segments describe the execution process of the computer program in the electronic device 3. For example, the program code may be divided into the acquisition module 201, the first input module 202, the fusion module 203, and the second input module 204.
Specifically, for the way the at least one processor 32 implements the above instructions, reference may be made to the description of the relevant steps in the embodiment corresponding to FIG. 1, which is not repeated here.
In the several embodiments provided in this application, it should be understood that the disclosed apparatus and method may be implemented in other ways. For example, the apparatus embodiments described above are merely illustrative: the division into modules is only a division by logical function, and other divisions are possible in an actual implementation.
Further, the computer-readable storage medium may be non-volatile or volatile.
Further, the computer-readable storage medium may mainly include a program storage area and a data storage area. The program storage area may store an operating system, an application program required by at least one function, and the like; the data storage area may store data created through use of a blockchain node, and the like.
The blockchain referred to in this application is a new application mode of computer technologies such as distributed data storage, peer-to-peer transmission, consensus mechanisms, and encryption algorithms. A blockchain is essentially a decentralized database: a chain of data blocks linked to one another by cryptographic methods, where each data block contains the information of a batch of network transactions and is used to verify the validity of that information (anti-counterfeiting) and to generate the next block. A blockchain may include an underlying blockchain platform, a platform product service layer, an application service layer, and so on.
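The idea of data blocks linked by cryptographic methods can be illustrated with a toy hash chain. This is a generic sketch and not the storage scheme of the application: the block layout, the use of SHA-256, and the function names are all assumptions, and consensus and peer-to-peer transmission are omitted.

```python
import hashlib
import json


def make_block(transactions, previous_hash):
    """Build a block whose hash covers its transactions and its predecessor."""
    payload = json.dumps(
        {"transactions": transactions, "previous_hash": previous_hash},
        sort_keys=True,
    )
    return {
        "transactions": transactions,
        "previous_hash": previous_hash,
        "hash": hashlib.sha256(payload.encode("utf-8")).hexdigest(),
    }


def is_valid_chain(chain):
    """Check every block's own hash and its link to the previous block."""
    for index, block in enumerate(chain):
        expected = make_block(block["transactions"], block["previous_hash"])["hash"]
        if block["hash"] != expected:
            return False
        if index > 0 and block["previous_hash"] != chain[index - 1]["hash"]:
            return False
    return True


genesis = make_block(["feature batch 0"], "0" * 64)
second = make_block(["feature batch 1"], genesis["hash"])
chain = [genesis, second]
```

Because each block's hash depends on the previous block's hash, altering the contents of an earlier block without recomputing every later hash makes the validity check fail.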
The modules described as separate components may or may not be physically separate, and the components shown as modules may or may not be physical units; they may be located in one place or distributed over multiple network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
In addition, the functional modules in the embodiments of the present application may be integrated into one processing unit, each unit may exist physically on its own, or two or more units may be integrated into one unit. The integrated unit may be implemented in the form of hardware, or in the form of hardware plus software functional modules.
It is obvious to those skilled in the art that the present application is not limited to the details of the above exemplary embodiments, and that it can be implemented in other specific forms without departing from its spirit or essential characteristics. The embodiments should therefore be regarded in every respect as exemplary and non-restrictive. The scope of the present application is defined by the appended claims rather than by the above description, and all changes that fall within the meaning and range of equivalents of the claims are intended to be embraced in the present application. No reference sign in a claim should be construed as limiting the claim concerned. Furthermore, the word "comprising" does not exclude other units, and the singular does not exclude the plural. Multiple units or apparatuses stated in the present application may also be implemented by a single unit or apparatus through software or hardware. Words such as "first" and "second" are used to denote names and do not indicate any particular order.
Finally, it should be noted that the above embodiments are intended only to illustrate the technical solutions of the present application and not to limit them. Although the present application has been described in detail with reference to the preferred embodiments, those of ordinary skill in the art should understand that the technical solutions of the present application may be modified or replaced with equivalents without departing from their spirit and scope.

Claims (20)

  1. A pedestrian re-identification method, wherein the method comprises:
    acquiring a first image sequence of a pedestrian to be identified, and inputting the first image sequence into a preset pose recognition network to obtain a first local feature of each body part of the pedestrian to be identified, wherein the pedestrian to be identified has multiple body parts;
    inputting the first image sequence into a preset multi-layer convolutional neural network to obtain a first global feature of the pedestrian to be identified;
    performing a first fusion of the first local feature of each body part of the pedestrian to be identified with the first global feature to obtain a second local feature of each body part of the pedestrian to be identified;
    inputting the second local feature of each body part of the pedestrian to be identified into a pre-trained pedestrian re-identification model, and receiving a pedestrian re-identification result output by the pedestrian re-identification model, wherein the pedestrian re-identification model contains multiple channel attention modules and multiple position attention modules.
  2. The pedestrian re-identification method according to claim 1, wherein inputting the first image sequence into the preset pose recognition network to obtain the first local feature of each body part of the pedestrian to be identified comprises:
    inputting the first image sequence into the preset pose recognition network, and detecting each image in the first image sequence in the preset pose recognition network to extract the body parts of the pedestrian to be identified;
    acquiring first position coordinates and a first confidence of each body part of the pedestrian to be identified;
    performing vector conversion on the first position coordinates and the first confidence of each body part of the pedestrian to be identified to obtain the first local feature of the corresponding body part of the pedestrian to be identified.
  3. The pedestrian re-identification method according to claim 1, wherein performing the first fusion of the first local feature of each body part of the pedestrian to be identified with the first global feature to obtain the second local feature of each body part of the pedestrian to be identified comprises:
    computing the product of the first local feature of each of the multiple body parts of the pedestrian to be identified and the first global feature of the pedestrian to be identified to obtain the second local feature of the corresponding body part of the pedestrian to be identified.
  4. The pedestrian re-identification method according to claim 1, wherein, before the second local feature of each body part of the pedestrian to be identified is input into the pre-trained pedestrian re-identification model, the method further comprises:
    acquiring a second image sequence of each pedestrian sample, wherein each pedestrian sample includes a plurality of body parts;
    inputting the second image sequence of each pedestrian sample into the preset posture recognition network to obtain second position coordinates and second confidence levels of the plurality of body parts of each pedestrian sample;
    obtaining a third local feature of the corresponding body part according to the second position coordinates and the second confidence level of each body part of each pedestrian sample;
    inputting the second image sequence of each pedestrian sample into the preset multi-layer convolutional neural network to obtain a second global feature of each pedestrian sample;
    performing a first fusion of the third local feature of each body part of each pedestrian sample with the first global feature to obtain a fourth local feature of each body part of each pedestrian sample;
    taking the fourth local features of the plurality of body parts of each pedestrian sample as a first sample data set;
    inputting the first sample data set into the channel attention module and the position attention module respectively for processing to obtain a target channel attention result and a target position attention result of each pedestrian sample;
    performing a second fusion of the target channel attention result, the target position attention result and the second global feature of each pedestrian sample to obtain a third global feature of each pedestrian sample;
    taking the plurality of third global features of the plurality of pedestrian samples as a second sample data set;
    dividing a training set and a test set out of the second sample data set;
    inputting the training set into a preset neural network for training to obtain the pedestrian re-identification model;
    inputting the test set into the pedestrian re-identification model for testing, and calculating a test pass rate;
    if the test pass rate is greater than or equal to a preset pass rate threshold, determining that training of the pedestrian re-identification model is complete; and if the test pass rate is less than the preset pass rate threshold, updating the second sample data to obtain a new training set, and inputting the new training set into the preset neural network to retrain the pedestrian re-identification model.
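The last four steps of claim 4 (split, train, test, retrain below a threshold) can be sketched as follows. This is only an illustrative sketch: the names `split_dataset`, `pass_rate`, `train_until_pass`, `train_fn`, `update_fn`, the 20% test ratio, the 0.9 threshold and the `max_rounds` cap are all assumptions introduced here, and the claim does not specify how the sample data is updated or how a test sample is judged to pass.

```python
import random


def split_dataset(samples, test_ratio=0.2, seed=0):
    """Shuffle a copy of the samples and split it into (train, test)."""
    rng = random.Random(seed)
    shuffled = list(samples)
    rng.shuffle(shuffled)
    n_test = max(1, int(len(shuffled) * test_ratio))
    return shuffled[n_test:], shuffled[:n_test]


def pass_rate(model, test_set):
    """Fraction of test samples whose predicted identity equals the label."""
    correct = sum(1 for feature, label in test_set if model(feature) == label)
    return correct / len(test_set)


def train_until_pass(dataset, train_fn, update_fn, threshold=0.9, max_rounds=10):
    """Train and test; retrain on updated data while the pass rate is below threshold.

    train_fn(train_set) -> model and update_fn(dataset) -> dataset are
    hypothetical callables supplied by the caller. max_rounds is an added
    safety cap that is not part of the claim.
    """
    model, rate = None, 0.0
    for _ in range(max_rounds):
        train_set, test_set = split_dataset(dataset)
        model = train_fn(train_set)
        rate = pass_rate(model, test_set)
        if rate >= threshold:  # training is considered complete
            break
        dataset = update_fn(dataset)  # obtain data for a new training set
    return model, rate
```

Here each sample is assumed to be a `(feature, label)` pair, where the feature stands in for a third global feature and the label for a pedestrian identity.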
  5. The pedestrian re-identification method according to claim 4, wherein inputting the first sample data set into the channel attention module and the position attention module respectively for processing to obtain the target channel attention result and the target position attention result of each pedestrian sample comprises:
    obtaining the fourth local feature of each body part of each pedestrian sample from the first sample data set, wherein each body part corresponds to one channel attention module and one position attention module;
    inputting the plurality of fourth local features of the plurality of body parts of each pedestrian sample into the corresponding channel attention modules and the corresponding position attention modules respectively for weighting to obtain a channel attention result and a position attention result of each body part of each pedestrian sample;
    calculating a first average of the plurality of channel attention results of the plurality of body parts of each pedestrian sample and determining the first average as the target channel attention result of the corresponding pedestrian, and calculating a second average of the plurality of position attention results of the plurality of body parts of each pedestrian sample and determining the second average as the target position attention result of the corresponding pedestrian.
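The per-part weighting and averaging in claim 5 can be sketched as below. The attention modules themselves are not defined in the claim, so they are modelled here as hypothetical callables keyed by body part; `mean_vectors` and `target_attention` are illustrative names, and representing each attention result as a flat list of floats is an assumption.

```python
def mean_vectors(vectors):
    """Element-wise mean of a non-empty list of equally sized vectors."""
    count = len(vectors)
    return [sum(values) / count for values in zip(*vectors)]


def target_attention(part_features, channel_modules, position_modules):
    """Apply each body part's own attention modules, then average over parts.

    part_features:    {part_name: fourth local feature (list of floats)}
    channel_modules:  {part_name: callable}, one module per body part
    position_modules: {part_name: callable}, one module per body part
    """
    channel_results = [channel_modules[p](f) for p, f in part_features.items()]
    position_results = [position_modules[p](f) for p, f in part_features.items()]
    # First average -> target channel attention result;
    # second average -> target position attention result.
    return mean_vectors(channel_results), mean_vectors(position_results)
```

In a real model the callables would be learned attention layers; the averaging step is the only part taken directly from the claim.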
  6. The pedestrian re-identification method according to claim 4, wherein performing the second fusion of the target channel attention result, the target position attention result and the second global feature of each pedestrian sample to obtain the third global feature of each pedestrian sample comprises:
    calculating the product of the target channel attention result, the target position attention result and the second global feature of each pedestrian sample to obtain the third global feature of each pedestrian sample.
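Claim 6 defines the second fusion only as a "product" of three quantities. One possible reading, assumed here purely for illustration, is an element-wise product of three equal-length vectors; `fuse_product` is a hypothetical helper name.

```python
def fuse_product(*features):
    """Element-wise product of one or more equal-length feature vectors."""
    length = len(features[0])
    fused = [1.0] * length
    for feature in features:
        if len(feature) != length:
            raise ValueError("all features must have the same length")
        fused = [a * b for a, b in zip(fused, feature)]
    return fused


# Assumed toy inputs: channel attention, position attention, second global feature.
third_global = fuse_product([0.5, 1.0], [2.0, 0.5], [4.0, 6.0])
```

If the attention results and the global feature have different shapes in practice, some broadcasting or projection would be needed; the claim leaves that open.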
  7. The pedestrian re-identification method according to claim 1, wherein inputting the first image sequence into the preset multi-layer convolutional neural network to obtain the first global feature of the pedestrian to be identified comprises:
    inputting the first image sequence into a preset deep residual network ResNet50 for human body detection to obtain the first global feature of the pedestrian to be identified.
  8. An electronic device, wherein the electronic device comprises a memory and a processor, the memory is configured to store at least one computer-readable instruction, and the processor is configured to execute the at least one computer-readable instruction to implement the following steps:
    acquiring a first image sequence of a pedestrian to be identified, and inputting the first image sequence into a preset posture recognition network to obtain a first local feature of each body part of the pedestrian to be identified, wherein the pedestrian to be identified includes a plurality of body parts;
    inputting the first image sequence into a preset multi-layer convolutional neural network to obtain a first global feature of the pedestrian to be identified;
    performing a first fusion of the first local feature of each body part of the pedestrian to be identified with the first global feature to obtain a second local feature of each body part of the pedestrian to be identified;
    inputting the second local feature of each body part of the pedestrian to be identified into a pre-trained pedestrian re-identification model, and receiving a pedestrian re-identification result output by the pedestrian re-identification model, wherein the pedestrian re-identification model includes a plurality of channel attention modules and a plurality of position attention modules.
  9. The electronic device according to claim 8, wherein, when the processor executes the at least one computer-readable instruction to input the first image sequence into the preset posture recognition network and obtain the first local feature of each body part of the pedestrian to be identified, the steps specifically comprise:
    inputting the first image sequence into the preset posture recognition network, and detecting each image in the first image sequence in the preset posture recognition network to extract the body parts of the pedestrian to be identified;
    acquiring first position coordinates and a first confidence level of each body part of the pedestrian to be identified;
    performing vector conversion on the first position coordinates and the first confidence level of each body part of the pedestrian to be identified to obtain the first local feature of the corresponding body part of the pedestrian to be identified.
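The claim does not say how the position coordinates and confidence level are converted into a vector. A minimal sketch, assuming 2D pixel coordinates that are normalised by the image size and concatenated with the confidence, might look like this; `part_to_vector` and its parameters are hypothetical.

```python
def part_to_vector(x, y, confidence, image_width, image_height):
    """Build a 3-element local feature [x_norm, y_norm, confidence].

    Normalising by the image size is an illustrative choice, not something
    stated in the claim.
    """
    if image_width <= 0 or image_height <= 0:
        raise ValueError("image dimensions must be positive")
    return [x / image_width, y / image_height, confidence]


# Assumed example: a keypoint at pixel (64, 128) in a 128x256 image.
head_feature = part_to_vector(64, 128, 0.9, 128, 256)
```

A learned embedding could replace this fixed encoding; the sketch only shows the shape of the data flowing out of the posture recognition step.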
  10. The electronic device according to claim 8, wherein, when the processor executes the at least one computer-readable instruction to perform the first fusion of the first local feature of each body part of the pedestrian to be identified with the first global feature and obtain the second local feature of each body part of the pedestrian to be identified, the steps specifically comprise:
    calculating the product of the first local feature of each of the plurality of body parts of the pedestrian to be identified and the first global feature of the pedestrian to be identified to obtain the second local feature of the corresponding body part of the pedestrian to be identified.
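The first fusion is applied once per body part. The sketch below assumes, for illustration only, that each local feature has the same length as the global feature and that "product" means an element-wise product; `fuse_parts_with_global` is a hypothetical name.

```python
def fuse_parts_with_global(local_features, global_feature):
    """Return {part: local * global (element-wise)} for every body part."""
    fused = {}
    for part, local in local_features.items():
        if len(local) != len(global_feature):
            raise ValueError(f"feature length mismatch for part {part!r}")
        fused[part] = [l * g for l, g in zip(local, global_feature)]
    return fused


# Assumed toy inputs: two body parts and a 3-element global feature.
second_local = fuse_parts_with_global(
    {"head": [0.5, 0.5, 1.0], "torso": [1.0, 2.0, 0.0]},
    [2.0, 4.0, 3.0],
)
```

The resulting dictionary holds one second local feature per body part, which is what the re-identification model then receives.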
  11. The electronic device according to claim 8, wherein, before the second local feature of each body part of the pedestrian to be identified is input into the pre-trained pedestrian re-identification model, the processor executes the at least one computer-readable instruction to further implement the following steps:
    acquiring a second image sequence of each pedestrian sample, wherein each pedestrian sample includes a plurality of body parts;
    inputting the second image sequence of each pedestrian sample into the preset posture recognition network to obtain second position coordinates and second confidence levels of the plurality of body parts of each pedestrian sample;
    obtaining a third local feature of the corresponding body part according to the second position coordinates and the second confidence level of each body part of each pedestrian sample;
    inputting the second image sequence of each pedestrian sample into the preset multi-layer convolutional neural network to obtain a second global feature of each pedestrian sample;
    performing a first fusion of the third local feature of each body part of each pedestrian sample with the first global feature to obtain a fourth local feature of each body part of each pedestrian sample;
    taking the fourth local features of the plurality of body parts of each pedestrian sample as a first sample data set;
    inputting the first sample data set into the channel attention module and the position attention module respectively for processing to obtain a target channel attention result and a target position attention result of each pedestrian sample;
    performing a second fusion of the target channel attention result, the target position attention result and the second global feature of each pedestrian sample to obtain a third global feature of each pedestrian sample;
    taking the plurality of third global features of the plurality of pedestrian samples as a second sample data set;
    dividing a training set and a test set out of the second sample data set;
    inputting the training set into a preset neural network for training to obtain the pedestrian re-identification model;
    inputting the test set into the pedestrian re-identification model for testing, and calculating a test pass rate;
    if the test pass rate is greater than or equal to a preset pass rate threshold, determining that training of the pedestrian re-identification model is complete; and if the test pass rate is less than the preset pass rate threshold, updating the second sample data to obtain a new training set, and inputting the new training set into the preset neural network to retrain the pedestrian re-identification model.
  12. The electronic device according to claim 11, wherein, when the processor executes the at least one computer-readable instruction to input the first sample data set into the channel attention module and the position attention module respectively for processing and obtain the target channel attention result and the target position attention result of each pedestrian sample, the steps specifically comprise:
    obtaining the fourth local feature of each body part of each pedestrian sample from the first sample data set, wherein each body part corresponds to one channel attention module and one position attention module;
    inputting the plurality of fourth local features of the plurality of body parts of each pedestrian sample into the corresponding channel attention modules and the corresponding position attention modules respectively for weighting to obtain a channel attention result and a position attention result of each body part of each pedestrian sample;
    calculating a first average of the plurality of channel attention results of the plurality of body parts of each pedestrian sample and determining the first average as the target channel attention result of the corresponding pedestrian, and calculating a second average of the plurality of position attention results of the plurality of body parts of each pedestrian sample and determining the second average as the target position attention result of the corresponding pedestrian.
  13. The electronic device according to claim 11, wherein, when the processor executes the at least one computer-readable instruction to perform the second fusion of the target channel attention result, the target position attention result and the second global feature of each pedestrian sample and obtain the third global feature of each pedestrian sample, the steps specifically comprise:
    calculating the product of the target channel attention result, the target position attention result and the second global feature of each pedestrian sample to obtain the third global feature of each pedestrian sample.
  14. A computer-readable storage medium, wherein the computer-readable storage medium stores at least one computer-readable instruction, and the at least one computer-readable instruction, when executed by a processor, implements the following steps:
    acquiring a first image sequence of a pedestrian to be identified, and inputting the first image sequence into a preset posture recognition network to obtain a first local feature of each body part of the pedestrian to be identified, wherein the pedestrian to be identified includes a plurality of body parts;
    inputting the first image sequence into a preset multi-layer convolutional neural network to obtain a first global feature of the pedestrian to be identified;
    performing a first fusion of the first local feature of each body part of the pedestrian to be identified with the first global feature to obtain a second local feature of each body part of the pedestrian to be identified;
    inputting the second local feature of each body part of the pedestrian to be identified into a pre-trained pedestrian re-identification model, and receiving a pedestrian re-identification result output by the pedestrian re-identification model, wherein the pedestrian re-identification model includes a plurality of channel attention modules and a plurality of position attention modules.
  15. The storage medium according to claim 14, wherein, when the at least one computer-readable instruction is executed by the processor to input the first image sequence into the preset posture recognition network and obtain the first local feature of each body part of the pedestrian to be identified, the steps specifically comprise:
    inputting the first image sequence into the preset posture recognition network, and detecting each image in the first image sequence in the preset posture recognition network to extract the body parts of the pedestrian to be identified;
    acquiring first position coordinates and a first confidence level of each body part of the pedestrian to be identified;
    performing vector conversion on the first position coordinates and the first confidence level of each body part of the pedestrian to be identified to obtain the first local feature of the corresponding body part of the pedestrian to be identified.
  16. The storage medium according to claim 14, wherein, when the at least one computer-readable instruction is executed by the processor to perform the first fusion of the first local feature of each body part of the pedestrian to be identified with the first global feature and obtain the second local feature of each body part of the pedestrian to be identified, the steps specifically comprise:
    calculating the product of the first local feature of each of the plurality of body parts of the pedestrian to be identified and the first global feature of the pedestrian to be identified to obtain the second local feature of the corresponding body part of the pedestrian to be identified.
  17. The storage medium according to claim 14, wherein, before the second local feature of each body part of the pedestrian to be identified is input into the pre-trained pedestrian re-identification model, the at least one computer-readable instruction, when executed by the processor, further implements the following steps:
    acquiring a second image sequence of each pedestrian sample, wherein each pedestrian sample includes a plurality of body parts;
    inputting the second image sequence of each pedestrian sample into the preset posture recognition network to obtain second position coordinates and second confidence levels of the plurality of body parts of each pedestrian sample;
    obtaining a third local feature of the corresponding body part according to the second position coordinates and the second confidence level of each body part of each pedestrian sample;
    inputting the second image sequence of each pedestrian sample into the preset multi-layer convolutional neural network to obtain a second global feature of each pedestrian sample;
    performing a first fusion of the third local feature of each body part of each pedestrian sample with the first global feature to obtain a fourth local feature of each body part of each pedestrian sample;
    taking the fourth local features of the plurality of body parts of each pedestrian sample as a first sample data set;
    inputting the first sample data set into the channel attention module and the position attention module respectively for processing to obtain a target channel attention result and a target position attention result of each pedestrian sample;
    performing a second fusion of the target channel attention result, the target position attention result and the second global feature of each pedestrian sample to obtain a third global feature of each pedestrian sample;
    taking the plurality of third global features of the plurality of pedestrian samples as a second sample data set;
    dividing a training set and a test set out of the second sample data set;
    inputting the training set into a preset neural network for training to obtain the pedestrian re-identification model;
    inputting the test set into the pedestrian re-identification model for testing, and calculating a test pass rate;
    if the test pass rate is greater than or equal to a preset pass rate threshold, determining that training of the pedestrian re-identification model is complete; and if the test pass rate is less than the preset pass rate threshold, updating the second sample data to obtain a new training set, and inputting the new training set into the preset neural network to retrain the pedestrian re-identification model.
  18. The storage medium according to claim 17, wherein, when the at least one computer-readable instruction is executed by the processor to input the first sample data set into the channel attention module and the position attention module respectively for processing and obtain the target channel attention result and the target position attention result of each pedestrian sample, the steps specifically comprise:
    obtaining the fourth local feature of each body part of each pedestrian sample from the first sample data set, wherein each body part corresponds to one channel attention module and one position attention module;
    inputting the plurality of fourth local features of the plurality of body parts of each pedestrian sample into the corresponding channel attention modules and the corresponding position attention modules respectively for weighting to obtain a channel attention result and a position attention result of each body part of each pedestrian sample;
    calculating a first average of the plurality of channel attention results of the plurality of body parts of each pedestrian sample and determining the first average as the target channel attention result of the corresponding pedestrian, and calculating a second average of the plurality of position attention results of the plurality of body parts of each pedestrian sample and determining the second average as the target position attention result of the corresponding pedestrian.
  19. The storage medium according to claim 17, wherein, when the at least one computer-readable instruction is executed by the processor to perform the second fusion of the target channel attention result, the target position attention result and the second global feature of each pedestrian sample and obtain the third global feature of each pedestrian sample, the steps specifically comprise:
    calculating the product of the target channel attention result, the target position attention result and the second global feature of each pedestrian sample to obtain the third global feature of each pedestrian sample.
  20. A pedestrian re-identification apparatus, wherein the apparatus comprises:
    an acquisition module, configured to acquire a first image sequence of a pedestrian to be identified and input the first image sequence into a preset posture recognition network to obtain a first local feature of each body part of the pedestrian to be identified, wherein the pedestrian to be identified includes a plurality of body parts;
    a first input module, configured to input the first image sequence into a preset multi-layer convolutional neural network to obtain a first global feature of the pedestrian to be identified;
    a fusion module, configured to perform a first fusion of the first local feature of each body part of the pedestrian to be identified with the first global feature to obtain a second local feature of each body part of the pedestrian to be identified;
    a second input module, configured to input the second local feature of each body part of the pedestrian to be identified into a pre-trained pedestrian re-identification model and receive a pedestrian re-identification result output by the pedestrian re-identification model, wherein the pedestrian re-identification model includes a plurality of channel attention modules and a plurality of position attention modules.
PCT/CN2022/089991 2022-01-12 2022-04-28 Person re-identification method and apparatus, electronic device and storage medium WO2023134071A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202210033877.5 2022-01-12
CN202210033877.5A CN114359970A (en) 2022-01-12 2022-01-12 Pedestrian re-identification method and device, electronic equipment and storage medium

Publications (1)

Publication Number Publication Date
WO2023134071A1 true WO2023134071A1 (en) 2023-07-20

Family

ID=81109084

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2022/089991 WO2023134071A1 (en) 2022-01-12 2022-04-28 Person re-identification method and apparatus, electronic device and storage medium

Country Status (2)

Country Link
CN (1) CN114359970A (en)
WO (1) WO2023134071A1 (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116665309A (en) * 2023-07-26 2023-08-29 山东睿芯半导体科技有限公司 Method, device, chip and terminal for identifying walking gesture features

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114359970A (en) * 2022-01-12 2022-04-15 平安科技(深圳)有限公司 Pedestrian re-identification method and device, electronic equipment and storage medium
CN114783003B (en) 2022-06-23 2022-09-20 之江实验室 Pedestrian re-identification method and device based on local feature attention
CN115527168A (en) * 2022-10-08 2022-12-27 通号通信信息集团有限公司 Pedestrian re-identification method, storage medium, database editing method, and storage medium

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20170256057A1 (en) * 2016-03-04 2017-09-07 Disney Enterprises, Inc. Systems and Methods for Re-Identifying Objects in Images
CN110008913A (en) * 2019-04-08 2019-07-12 南京工业大学 The pedestrian's recognition methods again merged based on Attitude estimation with viewpoint mechanism
CN110543841A (en) * 2019-08-21 2019-12-06 中科视语(北京)科技有限公司 Pedestrian re-identification method, system, electronic device and medium
CN111582154A (en) * 2020-05-07 2020-08-25 浙江工商大学 Pedestrian re-identification method based on multitask skeleton posture division component
CN114359970A (en) * 2022-01-12 2022-04-15 平安科技(深圳)有限公司 Pedestrian re-identification method and device, electronic equipment and storage medium

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108764065B (en) * 2018-05-04 2020-12-08 华中科技大学 Pedestrian re-recognition feature fusion aided learning method
CN110175527B (en) * 2019-04-29 2022-03-25 北京百度网讯科技有限公司 Pedestrian re-identification method and device, computer equipment and readable medium
CN111401265B (en) * 2020-03-19 2020-12-25 重庆紫光华山智安科技有限公司 Pedestrian re-identification method and device, electronic equipment and computer-readable storage medium
CN113449671A (en) * 2021-07-08 2021-09-28 北京科技大学 Multi-scale and multi-feature fusion pedestrian re-identification method and device

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
HUO, LIJUAN: "Research on Re-identification of Occluded Pedestrians Based on Self-attention and Human Body Pose", WANFANG, 16 December 2021 (2021-12-16) *

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116665309A (en) * 2023-07-26 2023-08-29 山东睿芯半导体科技有限公司 Method, device, chip and terminal for identifying walking gesture features
CN116665309B (en) * 2023-07-26 2023-11-14 山东睿芯半导体科技有限公司 Method, device, chip and terminal for identifying walking gesture features

Also Published As

Publication number Publication date
CN114359970A (en) 2022-04-15

Similar Documents

Publication Publication Date Title
WO2023134071A1 (en) Person re-identification method and apparatus, electronic device and storage medium
KR102014385B1 (en) Method and apparatus for learning surgical image and recognizing surgical action based on learning
KR101864380B1 (en) Surgical image data learning system
KR102298412B1 (en) Surgical image data learning system
CN108205655B (en) Key point prediction method and device, electronic equipment and storage medium
US20190392587A1 (en) System for predicting articulated object feature location
WO2019127108A1 (en) Key-point guided human attribute recognition using statistic correlation models
CN108491823B (en) Method and device for generating human eye recognition model
CN113052149B (en) Video abstract generation method and device, computer equipment and medium
US20220139061A1 (en) Model training method and apparatus, keypoint positioning method and apparatus, device and medium
CN109544516B (en) Image detection method and device
CN111222379A (en) Hand detection method and device
WO2021217937A1 (en) Posture recognition model training method and device, and posture recognition method and device
CN113192175A (en) Model training method and device, computer equipment and readable storage medium
WO2023273297A1 (en) Multi-modality-based living body detection method and apparatus, electronic device, and storage medium
CN114625923A (en) Training method of video retrieval model, video retrieval method, device and equipment
CN112115790A (en) Face recognition method and device, readable storage medium and electronic equipment
CN113569671A (en) Abnormal behavior alarm method and device
CN116453226A (en) Human body posture recognition method and device based on artificial intelligence and related equipment
CN114783597B (en) Method and device for diagnosing multi-class diseases, electronic equipment and storage medium
CN113325950B (en) Function control method, device, equipment and storage medium
CN115471863A (en) Three-dimensional posture acquisition method, model training method and related equipment
WO2022159200A1 (en) Action recognition using pose data and machine learning
CN113963413A (en) Epidemic situation investigation method and device based on artificial intelligence, electronic equipment and medium
Lee Implementation of an interactive interview system using hand gesture recognition

Legal Events

Date Code Title Description
121 EP: The EPO has been informed by WIPO that EP was designated in this application

Ref document number: 22919712

Country of ref document: EP

Kind code of ref document: A1