CN117541979A - Pedestrian re-recognition method and model training method based on global feature learning - Google Patents

Publication number
CN117541979A
CN117541979A CN202311424960.6A CN202311424960A CN117541979A CN 117541979 A CN117541979 A CN 117541979A CN 202311424960 A CN202311424960 A CN 202311424960A CN 117541979 A CN117541979 A CN 117541979A
Authority
CN
China
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202311424960.6A
Other languages
Chinese (zh)
Inventor
蒋召
周靖宇
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shenzhen Xumi Yuntu Space Technology Co Ltd
Original Assignee
Shenzhen Xumi Yuntu Space Technology Co Ltd

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00 Scenes; Scene-specific elements
    • G06V20/50 Context or environment of the image
    • G06V20/52 Surveillance or monitoring of activities, e.g. for recognising suspicious objects
    • G06V20/53 Recognition of crowd images, e.g. recognition of crowd congestion
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00 Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10 Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16 Human faces, e.g. facial parts, sketches or expressions
    • G06V40/172 Classification, e.g. identification
    • G06V40/173 Classification, e.g. identification face re-identification, e.g. recognising unknown faces across different face tracks
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02T CLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO TRANSPORTATION
    • Y02T10/00 Road transport of goods or passengers
    • Y02T10/10 Internal combustion engine [ICE] based vehicles
    • Y02T10/40 Engine management systems

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • Theoretical Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Oral & Maxillofacial Surgery (AREA)
  • Human Computer Interaction (AREA)
  • Image Analysis (AREA)

Abstract

The application relates to the technical field of image processing and provides a pedestrian re-recognition method and a model training method based on global feature learning. The pedestrian re-recognition method comprises: obtaining a target picture sample; extracting a global feature map of the target picture sample and segmenting the global feature map into a plurality of local feature maps; inputting the plurality of local feature maps into a global feature learning network to obtain global relation features, wherein the global feature learning network comprises at least 1 global average pooling layer, at least 3 convolution layers, and a plurality of global maximum pooling layers, and the number of global maximum pooling layers is greater than the number of local feature maps; and obtaining a pedestrian re-recognition classification result according to the global relation features. By capturing global and local relations through global feature learning, the method extracts more effective features and improves the accuracy of pedestrian re-recognition results in complex scenes.

Description

Pedestrian re-recognition method and model training method based on global feature learning
Technical Field
The application relates to the technical field of image processing, in particular to a pedestrian re-recognition method and a model training method based on global feature learning.
Background
Because surveillance video rarely captures clear face information of pedestrians, mature face recognition technology is difficult to apply effectively. Pedestrian re-recognition (Re-Id) technology has therefore developed rapidly and become a research hotspot at home and abroad. Pedestrian re-recognition, also called pedestrian re-identification, is a technology for retrieving a target pedestrian across scenes and across cameras. It can search for a specific pedestrian directly in surveillance images or surveillance video according to information such as clothing and posture, without requiring clear face information. It improves pedestrian retrieval efficiency and greatly reduces time and labor costs. In practical applications, when the detected pedestrian body is incomplete, false recognition easily occurs; that is, the model is not robust enough. A common solution is to add multi-scene features as feature enhancement from the data perspective. How to capture the global feature relations in the feature map, extract more robust features, and significantly improve the accuracy of the pedestrian re-recognition algorithm when pedestrians are incompletely detected is a technical problem to be solved.
Disclosure of Invention
In view of this, the embodiments of the present application provide a pedestrian re-recognition method and a model training method based on global feature learning, together with corresponding devices, electronic equipment and storage media, so as to solve the problem in the prior art that the accuracy of the pedestrian re-recognition algorithm is poor when pedestrians are incompletely detected.
In a first aspect of an embodiment of the present application, a pedestrian re-recognition method based on global feature learning is provided, including:
obtaining a target picture sample;
extracting a global feature map of the target picture sample, and segmenting the global feature map into a plurality of local feature maps;
inputting the plurality of local feature maps into a global feature learning network to obtain global relation features;
and obtaining a pedestrian re-recognition classification result according to the global relation features.
In a second aspect of the embodiments of the present application, a training method for a pedestrian re-recognition model based on global feature learning is provided, including:
constructing a global feature learning network, and constructing a pedestrian re-identification model by utilizing a feature extraction network and the global feature learning network;
acquiring a training picture sample, inputting the training picture sample into the feature extraction network, acquiring a global feature map of the training picture sample, and segmenting the global feature map into a plurality of local feature maps;
inputting the plurality of local feature maps into the global feature learning network to obtain global relation features;
inputting the global relation features into a classification layer to obtain a pedestrian re-recognition classification result;
and determining a target loss function according to the pedestrian re-recognition classification result, and iteratively updating the model parameters of the pedestrian re-recognition model until a preset iteration termination condition is reached, so as to obtain the trained pedestrian re-recognition model.
In a third aspect of the embodiments of the present application, there is provided a pedestrian re-recognition device based on global feature learning, including:
a target picture sample acquisition module configured to acquire a target picture sample;
the global feature map acquisition module is configured to extract a global feature map of the target picture sample and segment the global feature map into a plurality of local feature maps;
the global relation feature acquisition module is configured to input the plurality of local feature maps into a global feature learning network to acquire global relation features;
and the pedestrian re-recognition result acquisition module is configured to acquire pedestrian re-recognition classification results according to the global relation features.
In a fourth aspect of embodiments of the present application, there is provided an electronic device comprising a memory, a processor and a computer program stored in the memory and executable on the processor, the processor implementing the steps of the method of the first or second aspect when the computer program is executed.
In a fifth aspect of embodiments of the present application, there is provided a computer readable storage medium storing a computer program which, when executed by a processor, implements the steps of the method of the first or second aspect.
Compared with the prior art, the beneficial effects of the embodiments of the present application at least include the following. A global feature learning network is constructed, and a pedestrian re-recognition model is built from a feature extraction network and the global feature learning network. A training picture sample is acquired and input into the feature extraction network to obtain a global feature map, which is segmented into a plurality of local feature maps. The local feature maps are input into the global feature learning network to obtain global relation features, and the global relation features are input into a classification layer to obtain a pedestrian re-recognition classification result. A target loss function is determined according to the classification result, and the model parameters of the pedestrian re-recognition model are updated iteratively until a preset iteration termination condition is reached. Because the method captures the global and local relations of the sample through global feature learning, it extracts more effective features and significantly improves the accuracy of pedestrian re-recognition results in complex scenes.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present application, the following description will briefly introduce the drawings that are needed in the embodiments or the description of the prior art, it is obvious that the drawings in the following description are only some embodiments of the present application, and that other drawings may be obtained according to these drawings without inventive effort for a person skilled in the art.
Fig. 1 is a first flowchart of a pedestrian re-recognition method based on global feature learning provided in an embodiment of the present application;
Fig. 2 is a schematic diagram of a global feature learning network structure provided in an embodiment of the present application;
Fig. 3 is a second flowchart of a pedestrian re-recognition method based on global feature learning provided in an embodiment of the present application;
Fig. 4 is a flowchart of a training method of a pedestrian re-recognition model based on global feature learning provided in an embodiment of the present application;
Fig. 5 is a schematic structural diagram of a pedestrian re-recognition device based on global feature learning provided in an embodiment of the present application;
Fig. 6 is a schematic structural diagram of a training device for a pedestrian re-recognition model based on global feature learning provided in an embodiment of the present application;
Fig. 7 is a schematic structural diagram of an electronic device provided in an embodiment of the present application.
Detailed Description
In the following description, for purposes of explanation and not limitation, specific details are set forth, such as particular system configurations, techniques, etc. in order to provide a thorough understanding of the embodiments of the present application. It will be apparent, however, to one skilled in the art that the present application may be practiced in other embodiments that depart from these specific details. In other instances, detailed descriptions of well-known systems, devices, circuits, and methods are omitted so as not to obscure the description of the present application with unnecessary detail.
A pedestrian re-recognition method and a model training method based on global feature learning according to embodiments of the present application, and corresponding apparatuses, electronic devices, and storage media will be described in detail below with reference to the accompanying drawings.
As described in the background, automatically searching for specific pedestrians in massive amounts of surveillance video using computer vision technology has become an inevitable trend. Because surveillance video rarely captures clear face information of pedestrians, mature face recognition technology is difficult to apply effectively. Pedestrian re-recognition (Re-Id) technology has therefore developed rapidly and become a research hotspot at home and abroad. Pedestrian re-recognition, also called pedestrian re-identification, is a technology for retrieving a target pedestrian across scenes and across cameras. It can search for a specific pedestrian directly in surveillance images or surveillance video according to information such as clothing and posture, without requiring clear face information. It improves pedestrian retrieval efficiency and greatly reduces time and labor costs.
In early studies of pedestrian re-recognition, extracting low-level visual features such as color and texture was the mainstream approach to pedestrian feature extraction. However, the design of such methods is very complex, the acquired features are mostly low-level features that cope poorly with complex problems such as scene changes (background, posture, etc.) and occlusion, and the recognition accuracy is generally low. With the rise of deep learning, deep-learning-based methods have gradually replaced traditional feature extraction and become the mainstream research direction of pedestrian re-recognition. At present, convolutional neural networks are used to handle pedestrian re-recognition tasks and achieve good recognition results. Such deep learning methods are based on feature extraction and are generally divided into global feature learning and local feature learning.
A global feature learning network generally performs feature extraction on the entire image and extracts one global feature representation per pedestrian image. Improving the network's resistance to background noise and extracting fine-grained global features are the common goals of current global feature learning, and they can be pursued by introducing an attention mechanism or by constructing a multi-scale network structure. An attention mechanism can effectively suppress noise information and improve the discriminative power of global feature learning. However, the attention mechanisms used by current methods still have problems: (1) the attention operates on a single dimension, modeling either the spatial domain or the channel domain of the feature map but not several dimensions at the same time; (2) the range of the attention dependency is short and its granularity is coarse, so fine-grained long-range dependencies in the feature map are difficult to capture. A multi-scale network can extract comprehensive and fine global features, but it often has multiple branches, a complex structure, and a large number of parameters and computations.
Local feature learning makes the network focus on local regions of a pedestrian image and uses the image information of those regions for feature learning. Models based on local feature learning capture local differences between pedestrians well and extract fine-grained local pedestrian features, which alleviates, to a certain extent, difficulties in pedestrian re-recognition such as occlusion and posture change. Typical local feature learning methods at present include picture slicing and human body posture key points.
Local feature learning can acquire fine-grained local information about pedestrians and alleviates, to a certain extent, problems such as inconsistent posture and occlusion, but it still has some problems: (1) local features lack a description of the pedestrian as a whole and generally need the assistance of global features; (2) picture slicing requires standardized, aligned pedestrian pictures, otherwise its effect is greatly reduced; (3) local feature learning methods that use posture information rely on an additional posture estimation model, whose performance directly affects the pedestrian re-recognition result; (4) the model is relatively complex and often requires multiple branches to extract local features.
Through investigation, the inventors found that in the prior art, when the detected pedestrian body is incomplete, false recognition easily occurs; that is, the model is not robust enough. A common solution is to add multi-scene features as feature enhancement from the data perspective. However, how to capture the global relations in the feature map, extract more robust features, and significantly improve the accuracy of the pedestrian re-recognition algorithm when pedestrians are incompletely detected remains a technical problem to be solved.
Fig. 1 is a flowchart of a pedestrian re-recognition method based on global feature learning. The method comprises the following steps:
S101: obtaining a target picture sample.
S102: extracting a global feature map of the target picture sample, and segmenting the global feature map into a plurality of local feature maps.
S103: inputting the plurality of local feature maps into a global feature learning network to obtain global relation features.
S104: obtaining a pedestrian re-recognition classification result according to the global relation features.
In some embodiments, the global feature map is characterized by a spatial dimension and/or a channel dimension, the spatial dimension including a length and a width.
In some embodiments, splitting the global feature map into a plurality of local feature maps includes splitting the global feature map into a plurality of local feature maps according to the length of the spatial dimension.
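The segmentation step can be illustrated with a minimal NumPy sketch. It assumes the global feature map is stored as an array of shape (H, W, C) and that the "length" of the spatial dimension corresponds to the height axis H, as in the training description; the function name `split_along_height` and the toy array sizes are hypothetical and are not taken from the application.

```python
import numpy as np

def split_along_height(feature_map: np.ndarray, num_parts: int) -> list:
    """Split an (H, W, C) feature map into `num_parts` horizontal stripes."""
    if feature_map.ndim != 3:
        raise ValueError("expected a feature map of shape (H, W, C)")
    # np.array_split also tolerates heights not divisible by num_parts.
    return np.array_split(feature_map, num_parts, axis=0)

# Toy global feature map with H=6, W=4, C=2 (illustrative values only).
global_map = np.arange(6 * 4 * 2, dtype=np.float32).reshape(6, 4, 2)
local_maps = split_along_height(global_map, num_parts=3)
print([m.shape for m in local_maps])  # [(2, 4, 2), (2, 4, 2), (2, 4, 2)]
```

Each stripe keeps the full width and all channels, so concatenating the stripes along the height axis recovers the original global feature map.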
In some embodiments, as shown in Fig. 2, the global feature learning network includes at least 1 global average pooling layer, at least 3 convolution layers, and a plurality of global maximum pooling layers; the number of global maximum pooling layers is greater than the number of local feature maps.
In some embodiments, as shown in Fig. 3, inputting the plurality of local feature maps into the global feature learning network to obtain the global relation features includes:
S311: inputting each local feature map into a corresponding global maximum pooling layer to obtain a plurality of first global maximum pooling features.
S312: splicing the first global maximum pooling features, and inputting the spliced result into the global average pooling layer and a global maximum pooling layer respectively, to obtain a first global average pooling feature and a second global maximum pooling feature.
S313: subtracting the first global average pooling feature from the second global maximum pooling feature and inputting the difference into a first convolution layer; inputting the second global maximum pooling feature into a second convolution layer; and splicing the outputs of the two convolution layers to obtain a first global splicing feature.
S314: inputting the first global splicing feature into a third convolution layer to obtain the global relation features.
In some embodiments, obtaining the pedestrian re-recognition classification result according to the global relation features includes: inputting the global relation features into a classification layer to obtain the pedestrian re-recognition classification result.
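As an illustration of the classification layer, the sketch below assumes a single linear layer followed by softmax over pedestrian identities. The application does not specify the layer's internal form, so the function `classify`, the weight shapes, and the random toy inputs are all assumptions made for this example.

```python
import numpy as np

def classify(global_relation_feature, weights, bias):
    """Linear classification layer followed by softmax (assumed form)."""
    logits = weights @ global_relation_feature + bias
    shifted = logits - logits.max()  # subtract the max for numerical stability
    probs = np.exp(shifted) / np.exp(shifted).sum()
    return int(np.argmax(probs)), probs

rng = np.random.default_rng(0)
feature = rng.standard_normal(8)   # toy global relation feature
W = rng.standard_normal((5, 8))    # 5 hypothetical pedestrian identities
b = np.zeros(5)
pred_id, probs = classify(feature, W, b)
print(pred_id, probs.round(3))
```

The predicted identity is the index with the highest probability; the probabilities sum to 1 by construction.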
The working process of the global feature learning network in the pedestrian re-recognition method of the embodiments of the present application is described in detail below. First, global maximum pooling is applied to each segmented local feature to extract its most effective features. Second, the globally max-pooled features are spliced, and global average pooling and global maximum pooling are each computed on the spliced features. Third, the global average pooling result is subtracted from the global maximum pooling result; the resulting feature is not only effective but also free of noise interference, and it is then transformed by a convolution layer. Fourth, the global maximum pooling result itself, before subtraction, is passed through another convolution layer. Finally, the outputs of the third and fourth steps are spliced and fed into a further convolution layer to compute the final global feature.
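The data flow described above can be sketched in NumPy under several assumptions that the application does not spell out: the per-part pooled vectors are "spliced" by stacking them along a new part axis, the second pooling stage pools across that axis, and each convolution layer acting on a pooled vector is modeled as a 1x1 convolution, that is, a plain matrix product without bias, normalization, or activation. The names `global_feature_learning`, `w1`, `w2`, `w3` and all sizes are hypothetical.

```python
import numpy as np

def global_feature_learning(local_maps, w1, w2, w3):
    """Forward pass over a list of (h, W, C) local feature maps (sketch)."""
    # Step 1: one global max pooling per local map -> vector of length C.
    part_feats = [m.max(axis=(0, 1)) for m in local_maps]
    # Step 2: splice (stack) the part vectors, then pool across the parts.
    stacked = np.stack(part_feats, axis=0)   # shape (N, C)
    avg_feat = stacked.mean(axis=0)          # global average pooling, (C,)
    max_feat = stacked.max(axis=0)           # global maximum pooling, (C,)
    # Step 3: (max - avg) goes through the first "convolution" (1x1 -> matmul).
    branch_diff = w1 @ (max_feat - avg_feat)
    # Step 4: the max-pooled feature goes through the second convolution.
    branch_max = w2 @ max_feat
    # Step 5: splice both branches and apply the third convolution.
    spliced = np.concatenate([branch_diff, branch_max])
    return w3 @ spliced

rng = np.random.default_rng(0)
C, D, D_out = 4, 3, 5                        # toy channel sizes
local_maps = [rng.standard_normal((2, 6, C)) for _ in range(3)]
w1 = rng.standard_normal((D, C))
w2 = rng.standard_normal((D, C))
w3 = rng.standard_normal((D_out, 2 * D))
relation_feature = global_feature_learning(local_maps, w1, w2, w3)
print(relation_feature.shape)  # (5,)
```

Under this reading, N local feature maps use N + 1 max pooling operations (one per part plus one across parts), which is consistent with the statement that the number of global maximum pooling layers exceeds the number of local feature maps.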
Fig. 4 is a schematic flow chart of a training method of a pedestrian re-recognition model based on global feature learning. The method comprises the following steps:
S401: constructing a global feature learning network, and constructing a pedestrian re-recognition model by using a feature extraction network and the global feature learning network.
S402: acquiring a training picture sample, inputting the training picture sample into the feature extraction network to obtain a global feature map of the training picture sample, and segmenting the global feature map into a plurality of local feature maps.
S403: inputting the plurality of local feature maps into the global feature learning network to obtain global relation features.
S404: inputting the global relation features into a classification layer to obtain a pedestrian re-recognition classification result.
S405: determining a target loss function according to the pedestrian re-recognition classification result, and iteratively updating the model parameters of the pedestrian re-recognition model until a preset iteration termination condition is reached, so as to obtain the trained pedestrian re-recognition model.
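The iterative update in S405 can be sketched as a simple training loop. To stay self-contained, this sketch only trains an assumed softmax classification layer on precomputed toy features with plain gradient descent; a full implementation would also back-propagate into the feature extraction network and the global feature learning network. The termination condition used here (a maximum iteration count or a loss threshold) is one possible choice, and `train_classifier`, `lr`, `max_iters`, and `loss_tol` are hypothetical names. The data is synthetic and purely illustrative.

```python
import numpy as np

def softmax(logits):
    shifted = logits - logits.max(axis=1, keepdims=True)
    e = np.exp(shifted)
    return e / e.sum(axis=1, keepdims=True)

def cross_entropy(probs, labels):
    picked = probs[np.arange(len(labels)), labels]
    return float(-np.log(picked + 1e-12).mean())

def train_classifier(features, labels, num_ids, lr=0.3, max_iters=200, loss_tol=0.05):
    rng = np.random.default_rng(0)
    W = 0.01 * rng.standard_normal((features.shape[1], num_ids))
    b = np.zeros(num_ids)
    history = []
    for _ in range(max_iters):            # termination: iteration budget ...
        probs = softmax(features @ W + b)
        loss = cross_entropy(probs, labels)
        history.append(loss)
        if loss < loss_tol:               # ... or loss below a preset threshold
            break
        # Gradient of mean cross-entropy w.r.t. logits: (probs - one_hot) / N.
        grad = probs.copy()
        grad[np.arange(len(labels)), labels] -= 1.0
        grad /= len(labels)
        W -= lr * features.T @ grad
        b -= lr * grad.sum(axis=0)
    return W, b, history

# Synthetic, well-separated toy features for 3 identities (illustration only).
rng = np.random.default_rng(1)
labels = np.repeat(np.arange(3), 20)
centers = 3.0 * np.eye(3, 6)
features = centers[labels] + 0.1 * rng.standard_normal((60, 6))
W, b, history = train_classifier(features, labels, num_ids=3)
preds = np.argmax(features @ W + b, axis=1)
print(len(history), history[0] > history[-1], (preds == labels).mean())
```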
In some embodiments, the global feature learning network includes at least 1 global average pooling layer, at least 3 convolution layers, and a plurality of global maximum pooling layers; the number of global maximum pooling layers is greater than the number of local feature maps.
The implementation process of the model training method in the embodiments of the present application is described in detail below. First, features are extracted from the input picture by the feature extraction network. The quality of this network strongly affects the final result: the better the feature extraction network, the more effective the extracted features. Second, feature extraction yields a global feature map expressed as (H, W, C), where H is the height of the global feature map, W is its width, and C is its number of channels; that is, (H, W) is the spatial dimension of the global feature map. In the embodiments of the present application, the global feature map is segmented in the spatial dimension, that is, along the height dimension or the width dimension. In one embodiment, the global feature map is segmented uniformly along the height dimension H, for example into 3 parts, and each part is treated as a local feature. Third, the segmented features are fed into the global feature learning network, which integrates the local features and outputs global relation features; the global relation features are then fed into a classification layer to compute the pedestrian re-recognition classification result. Finally, the loss is computed from the output classification result, and the model parameters are updated by back-propagation according to the loss. The loss function determined according to the pedestrian re-recognition classification result includes a cross entropy loss function.
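For reference, the cross entropy loss mentioned above is commonly written as follows. This is the standard form under the assumption of a softmax classification layer over K pedestrian identities and a batch of N training samples; the symbols z, p, y, K, and N are introduced here for illustration and do not appear in the application.

```latex
p_{i,k} = \frac{\exp(z_{i,k})}{\sum_{j=1}^{K} \exp(z_{i,j})},
\qquad
\mathcal{L}_{\mathrm{CE}} = -\frac{1}{N} \sum_{i=1}^{N} \log p_{i,\,y_i}
```

Here z_{i,k} is the classification layer's output (logit) for sample i and identity k, and y_i is the ground-truth identity label of sample i.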
In the embodiments of the present application, a global feature learning network is constructed, and a pedestrian re-recognition model is built from a feature extraction network and the global feature learning network. A training picture sample is acquired and input into the feature extraction network to obtain a global feature map, which is segmented into a plurality of local feature maps. The local feature maps are input into the global feature learning network to obtain global relation features, and the global relation features are input into a classification layer to obtain a pedestrian re-recognition classification result. A target loss function is determined according to the classification result, and the model parameters of the pedestrian re-recognition model are updated iteratively until a preset iteration termination condition is reached. Because the method captures the global and local relations of the sample through global feature learning, it extracts more effective features and significantly improves the accuracy of pedestrian re-recognition results in complex scenes.
Any combination of the above optional solutions may be adopted to form an optional embodiment of the present application, which is not described herein in detail.
The following are device embodiments of the present application, which may be used to perform the method embodiments of the present application. For details not disclosed in the device embodiments of the present application, please refer to the method embodiments of the present application.
Fig. 5 is a schematic diagram of a pedestrian re-recognition device based on global feature learning according to an embodiment of the present application. As shown in fig. 5, the pedestrian re-recognition device based on global feature learning includes:
a target picture sample acquisition module 501 configured to acquire a target picture sample;
the global feature map obtaining module 502 is configured to extract a global feature map of the target picture sample, and segment the global feature map into a plurality of local feature maps;
a global relationship feature obtaining module 503 configured to input a plurality of the local feature maps to a global feature learning network to obtain global relationship features;
the pedestrian re-recognition result obtaining module 504 is configured to obtain a pedestrian re-recognition classification result according to the global relationship feature.
It should be understood that the pedestrian re-recognition device based on global feature learning in the embodiments of the present application can also execute the methods shown in Figs. 1 to 3 and implement the functions of the examples shown in Figs. 1 to 3, which are not described again here. Meanwhile, the sequence numbers of the steps in the above embodiments do not imply an order of execution; the execution order of each process should be determined by its function and internal logic, and should not limit the implementation of the embodiments of the present application.
Fig. 6 is a schematic diagram of a training device for a pedestrian re-recognition model based on global feature learning according to an embodiment of the present application. As shown in fig. 6, the pedestrian re-recognition model training device based on global feature learning includes:
the pedestrian re-recognition model construction module 601 is configured to construct a global feature learning network, and construct a pedestrian re-recognition model by using the feature extraction network and the global feature learning network;
the training picture sample obtaining module 602 is configured to obtain a training picture sample, input the training picture sample to the feature extraction network, obtain a global feature map of the training picture sample, and segment the global feature map into a plurality of local feature maps;
a global relationship feature obtaining module 603 configured to input a plurality of the local feature maps to the global feature learning network to obtain global relationship features;
the pedestrian re-recognition result obtaining module 604 is configured to input the global relationship features to the classification layer to obtain a pedestrian re-recognition classification result;
the pedestrian re-recognition model training module 605 is configured to determine, according to the pedestrian re-recognition classification result, a target loss function to iteratively update the model parameters of the pedestrian re-recognition model until a preset iteration termination condition is reached, so as to obtain the trained pedestrian re-recognition model.
It should be understood that the pedestrian re-recognition model training device based on global feature learning in the embodiments of the present application can also execute the method shown in Fig. 4 and implement the functions of the example shown in Fig. 4, which are not described again here. Meanwhile, the sequence numbers of the steps in the above embodiments do not imply an order of execution; the execution order of each process should be determined by its function and internal logic, and should not limit the implementation of the embodiments of the present application.
Fig. 7 is a schematic diagram of an electronic device 7 provided in an embodiment of the present application. As shown in fig. 7, the electronic device 7 of this embodiment includes: a processor 701, a memory 702 and a computer program 703 stored in the memory 702 and executable on the processor 701. The steps of the various method embodiments described above are implemented by the processor 701 when executing the computer program 703. Alternatively, the processor 701, when executing the computer program 703, performs the functions of the modules/units of the apparatus embodiments described above.
The electronic device 7 may be a desktop computer, a notebook computer, a palm computer, a cloud server, or the like. The electronic device 7 may include, but is not limited to, a processor 701 and a memory 702. It will be appreciated by those skilled in the art that fig. 7 is merely an example of the electronic device 7 and does not limit the electronic device 7, which may include more or fewer components than shown, or different components.
The memory 702 may be an internal storage unit of the electronic device 7, for example, a hard disk or a memory of the electronic device 7. The memory 702 may also be an external storage device of the electronic device 7, for example, a plug-in hard disk, a Smart Media Card (SMC), a Secure Digital (SD) Card, a Flash Card (Flash Card) or the like provided on the electronic device 7. The memory 702 may also include both internal storage units and external storage devices of the electronic device 7. The memory 702 is used to store computer programs and other programs and data required by the electronic device.
The processor 701 may be a central processing unit (Central Processing Unit, CPU) or another general purpose processor, a digital signal processor (Digital Signal Processor, DSP), an application specific integrated circuit (Application Specific Integrated Circuit, ASIC), a field programmable gate array (Field-Programmable Gate Array, FPGA) or other programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, or the like. The processor 701 reads the corresponding computer program from the non-volatile memory into memory and runs it, forming the pedestrian re-recognition device or the pedestrian re-recognition model training device at the logic level. The processor executes the programs stored in the memory and specifically performs the following operations:
obtaining a target picture sample;
extracting a global feature map of the target picture sample, and segmenting the global feature map into a plurality of local feature maps;
inputting the plurality of local feature maps into a global feature learning network to obtain global relation features;
and obtaining a pedestrian re-recognition classification result according to the global relation features.
Or,
constructing a global feature learning network, and constructing a pedestrian re-recognition model by utilizing a feature extraction network and the global feature learning network;
acquiring a training picture sample, inputting the training picture sample into the feature extraction network, acquiring a global feature map of the training picture sample, and segmenting the global feature map into a plurality of local feature maps;
inputting the plurality of local feature maps into the global feature learning network to obtain global relation features;
inputting the global relation features into a classification layer to obtain a pedestrian re-recognition classification result;
and determining a target loss function according to the pedestrian re-recognition classification result, and iteratively updating the model parameters of the pedestrian re-recognition model until a preset iteration termination condition is reached, so as to obtain the trained pedestrian re-recognition model.
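As a rough illustration of the segmentation step in the operations above, the following sketch splits a toy global feature map into horizontal stripes and max-pools each stripe. It is a minimal sketch under stated assumptions, not the implementation of the embodiments: the names `split_by_height` and `global_max_pool`, the nested-list layout `[channel][row][column]`, the choice of the row axis as the splitting direction, and the requirement that the height divide evenly are all illustrative.

```python
from typing import List

# Hypothetical layout: a feature map is indexed as [channel][row][column].
FeatureMap = List[List[List[float]]]


def split_by_height(global_map: FeatureMap, num_parts: int) -> List[FeatureMap]:
    """Split a C x H x W feature map into num_parts equal horizontal stripes."""
    height = len(global_map[0])
    if num_parts <= 0 or height % num_parts != 0:
        raise ValueError("height must be divisible by a positive num_parts")
    stripe = height // num_parts
    return [
        [channel[i * stripe:(i + 1) * stripe] for channel in global_map]
        for i in range(num_parts)
    ]


def global_max_pool(local_map: FeatureMap) -> List[float]:
    """Reduce each channel of a local feature map to its maximum value."""
    return [max(max(row) for row in channel) for channel in local_map]


# Toy 2-channel, 4 x 2 map standing in for the feature extraction network output.
toy_map = [
    [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0], [7.0, 8.0]],
    [[0.5, 0.1], [0.2, 0.9], [0.3, 0.4], [0.8, 0.7]],
]
local_maps = split_by_height(toy_map, num_parts=2)
pooled = [global_max_pool(m) for m in local_maps]
print(pooled)  # [[4.0, 0.9], [8.0, 0.8]]
```

Each pooled vector has one entry per channel, so the number of local feature maps only changes how many such vectors are passed on to the global feature learning network.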
The method disclosed in the embodiments shown in figs. 1 to 3 of the present specification or the method disclosed in the embodiment shown in fig. 4 may be applied to the processor 701 or implemented by the processor 701. The processor 701 may be an integrated circuit chip having signal processing capabilities. In implementation, the steps of the above method may be performed by integrated logic circuits of hardware in the processor or by instructions in the form of software. The above-described processor may implement or perform the methods, steps, and logic blocks disclosed in the embodiments of the present specification. A general purpose processor may be a microprocessor, or the processor may be any conventional processor or the like. The steps of a method disclosed in connection with the embodiments of the present specification may be executed directly by a hardware decoding processor, or by a combination of hardware and software modules in a decoding processor. The software modules may be located in a storage medium well known in the art, such as a random access memory, flash memory, read only memory, programmable read only memory, electrically erasable programmable memory, or registers. The storage medium is located in the memory, and the processor reads the information in the memory and, in combination with its hardware, performs the steps of the above method.
Of course, in addition to the software implementation, the electronic device of the embodiments of the present disclosure does not exclude other implementations, such as a logic device or a combination of software and hardware, that is, the execution subject of the following processing flow is not limited to each logic unit, but may also be hardware or a logic device.
It will be apparent to those skilled in the art that, for convenience and brevity of description, only the above-described division of the functional units and modules is illustrated, and in practical application, the above-described functional distribution may be performed by different functional units and modules according to needs, i.e. the internal structure of the apparatus is divided into different functional units or modules to perform all or part of the above-described functions. The functional units and modules in the embodiment may be integrated in one processing unit, or each unit may exist alone physically, or two or more units may be integrated in one unit, where the integrated units may be implemented in a form of hardware or a form of a software functional unit.
The integrated modules/units, if implemented in the form of software functional units and sold or used as stand-alone products, may be stored in a computer readable storage medium. Based on such understanding, all or part of the flow in the methods of the above embodiments of the present application may be implemented by a computer program instructing related hardware; the computer program may be stored in a computer readable storage medium and, when executed by a processor, may implement the steps of the respective method embodiments described above. The computer program may comprise computer program code, which may be in source code form, object code form, an executable file, some intermediate form, or the like. The computer readable medium may include: any entity or device capable of carrying computer program code, a recording medium, a U disk, a removable hard disk, a magnetic disk, an optical disk, a computer memory, a Read-Only Memory (ROM), a Random Access Memory (RAM), an electrical carrier signal, a telecommunications signal, a software distribution medium, and so forth. It should be noted that the content of the computer readable medium may be appropriately increased or decreased according to the requirements of legislation and patent practice in the jurisdiction; for example, in some jurisdictions, according to legislation and patent practice, the computer readable medium does not include electrical carrier signals and telecommunication signals.
The present specification also provides a computer-readable storage medium storing one or more programs, the one or more programs comprising instructions which, when executed by a portable electronic device comprising a plurality of application programs, enable the portable electronic device to perform the method of the embodiments shown in figs. 1 to 3 or the method of the embodiment shown in fig. 4, and in particular to perform the following method:
obtaining a target picture sample;
extracting a global feature map of the target picture sample, and segmenting the global feature map into a plurality of local feature maps;
inputting the plurality of local feature maps into a global feature learning network to obtain global relation features;
and obtaining a pedestrian re-recognition classification result according to the global relation features.
Or,
constructing a global feature learning network, and constructing a pedestrian re-recognition model by utilizing a feature extraction network and the global feature learning network;
acquiring a training picture sample, inputting the training picture sample into the feature extraction network, acquiring a global feature map of the training picture sample, and segmenting the global feature map into a plurality of local feature maps;
inputting the plurality of local feature maps into the global feature learning network to obtain global relation features;
inputting the global relation features into a classification layer to obtain a pedestrian re-recognition classification result;
and determining a target loss function according to the pedestrian re-recognition classification result, and iteratively updating the model parameters of the pedestrian re-recognition model until a preset iteration termination condition is reached, so as to obtain the trained pedestrian re-recognition model.
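The iterative update in the last step above can be sketched as follows. The specification does not name the target loss function, the optimizer, or the termination condition, so everything here is an assumption made for illustration: a softmax cross-entropy loss, plain gradient descent on a linear classification layer, and a termination rule of "loss below `loss_tol` or `max_iters` reached". The function names and the toy features are hypothetical, and the feature extraction and global feature learning networks are replaced by fixed input vectors.

```python
import math
from typing import List, Sequence, Tuple


def softmax(logits: Sequence[float]) -> List[float]:
    """Numerically stable softmax over a list of logits."""
    peak = max(logits)
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def predict(weights: List[List[float]], x: Sequence[float]) -> int:
    """Return the index of the class with the largest linear score."""
    scores = [sum(w * v for w, v in zip(row, x)) for row in weights]
    return scores.index(max(scores))


def train_classifier(
    features: List[List[float]],
    labels: List[int],
    num_classes: int,
    lr: float = 0.5,
    max_iters: int = 200,
    loss_tol: float = 0.05,
) -> Tuple[List[List[float]], List[float]]:
    """Train a linear classification layer with cross-entropy (assumed loss)."""
    dim = len(features[0])
    weights = [[0.0] * dim for _ in range(num_classes)]
    history: List[float] = []
    for _ in range(max_iters):  # assumed termination: iteration budget
        grads = [[0.0] * dim for _ in range(num_classes)]
        loss = 0.0
        for x, y in zip(features, labels):
            probs = softmax([sum(w * v for w, v in zip(row, x)) for row in weights])
            loss -= math.log(probs[y])
            for k in range(num_classes):
                err = probs[k] - (1.0 if k == y else 0.0)
                for j in range(dim):
                    grads[k][j] += err * x[j]
        loss /= len(features)
        history.append(loss)
        if loss < loss_tol:  # assumed termination: loss threshold
            break
        for k in range(num_classes):
            for j in range(dim):
                weights[k][j] -= lr * grads[k][j] / len(features)
    return weights, history


# Toy "global relation features" for two identities (hypothetical data).
toy_features = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
toy_labels = [0, 0, 1, 1]
trained, losses = train_classifier(toy_features, toy_labels, num_classes=2)
print([predict(trained, x) for x in toy_features])
```

In a full model the gradient would also flow back into the feature extraction network and the global feature learning network; the sketch updates only the classification layer to keep the loop readable.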
In summary, the above description is only a preferred embodiment of the present invention, and is not intended to limit the scope of the present invention. Any modification, equivalent replacement, improvement, etc. made within the spirit and principle of the present specification should be included in the protection scope of the present specification.
The system, apparatus, module or unit set forth in the above embodiments may be implemented in particular by a computer chip or entity, or by a product having a certain function. One typical implementation is a computer. In particular, the computer may be, for example, a personal computer, a laptop computer, a cellular telephone, a camera phone, a smart phone, a personal digital assistant, a media player, a navigation device, an email device, a game console, a tablet computer, a wearable device, or a combination of any of these devices.
Computer readable media, including both permanent and non-permanent, removable and non-removable media, may implement information storage by any method or technology. The information may be computer readable instructions, data structures, modules of a program, or other data. Examples of storage media for a computer include, but are not limited to, phase change memory (PRAM), static random access memory (SRAM), dynamic random access memory (DRAM), other types of random access memory (RAM), read only memory (ROM), electrically erasable programmable read only memory (EEPROM), flash memory or other memory technology, compact disc read only memory (CD-ROM), digital versatile discs (DVD) or other optical storage, magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage devices, or any other non-transmission medium, which can be used to store information that can be accessed by a computing device. Computer-readable media, as defined herein, does not include transitory computer-readable media (transmission media), such as modulated data signals and carrier waves.
It should also be noted that the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising a ..." does not exclude the presence of other like elements in a process, method, article or apparatus that comprises the element.
In this specification, the embodiments are described in a progressive manner; for identical or similar parts, the embodiments may be referred to one another, and each embodiment mainly describes its differences from the other embodiments. In particular, since the system embodiments are substantially similar to the method embodiments, their description is relatively brief; for relevant details, refer to the corresponding parts of the description of the method embodiments.
The above embodiments are only for illustrating the technical solution of the present application, and are not limiting thereof; although the present application has been described in detail with reference to the foregoing embodiments, it should be understood by those of ordinary skill in the art that: the technical scheme described in the foregoing embodiments can be modified or some technical features thereof can be replaced by equivalents; such modifications and substitutions do not depart from the spirit and scope of the technical solutions of the embodiments of the present application, and are intended to be included in the scope of the present application.

Claims (10)

1. The pedestrian re-recognition method based on global feature learning is characterized by comprising the following steps of:
obtaining a target picture sample;
extracting a global feature map of the target picture sample, and segmenting the global feature map into a plurality of local feature maps;
inputting the plurality of local feature maps into a global feature learning network to obtain global relation features;
and obtaining a pedestrian re-recognition classification result according to the global relation features.
2. The method according to claim 1, wherein the global feature map is characterized by a spatial dimension and/or a channel dimension, the spatial dimension comprising a length and a width; and/or splitting the global feature map into a plurality of local feature maps comprises splitting the global feature map into a plurality of local feature maps according to the length of the spatial dimension.
3. The method of claim 1, wherein the global feature learning network comprises at least 1 global average pooling layer, at least 3 convolution layers, and a plurality of global maximum pooling layers; the number of global maximum pooling layers is greater than the number of local feature maps.
4. The method according to claim 3, wherein inputting the plurality of local feature maps into a global feature learning network to obtain global relation features comprises:
correspondingly inputting the local feature maps to the global maximum pooling layers to obtain a plurality of first global maximum pooling features;
splicing the plurality of first global maximum pooling features, and inputting the spliced result into the global average pooling layer and the global maximum pooling layer respectively, to obtain a first global average pooling feature and a second global maximum pooling feature;
inputting the difference between the first global average pooling feature and the second global maximum pooling feature into a first convolution layer, and splicing the output with the result of inputting the second global maximum pooling feature into a second convolution layer, to obtain a first global splicing feature;
and inputting the first global splicing feature into a third convolution layer to obtain the global relation features.
5. The method according to any one of claims 1 to 4, wherein obtaining a pedestrian re-recognition classification result according to the global relation features comprises: inputting the global relation features into a classification layer to obtain the pedestrian re-recognition classification result.
6. The pedestrian re-recognition model training method based on global feature learning is characterized by comprising the following steps of:
constructing a global feature learning network, and constructing a pedestrian re-recognition model by utilizing a feature extraction network and the global feature learning network;
acquiring a training picture sample, inputting the training picture sample into the feature extraction network, acquiring a global feature map of the training picture sample, and segmenting the global feature map into a plurality of local feature maps;
inputting the plurality of local feature maps into the global feature learning network to obtain global relation features;
inputting the global relation features into a classification layer to obtain a pedestrian re-recognition classification result;
and determining a target loss function according to the pedestrian re-recognition classification result, and iteratively updating the model parameters of the pedestrian re-recognition model until a preset iteration termination condition is reached, so as to obtain the trained pedestrian re-recognition model.
7. The method of claim 6, wherein the global feature learning network comprises at least 1 global average pooling layer, at least 3 convolution layers, and a plurality of global maximum pooling layers; the number of global maximum pooling layers is greater than the number of local feature maps.
8. A pedestrian re-recognition device based on global feature learning, comprising:
a target picture sample acquisition module configured to acquire a target picture sample;
a global feature map acquisition module configured to extract a global feature map of the target picture sample and segment the global feature map into a plurality of local feature maps;
a global relation feature acquisition module configured to input the plurality of local feature maps into a global feature learning network to acquire global relation features;
and a pedestrian re-recognition result acquisition module configured to acquire a pedestrian re-recognition classification result according to the global relation features.
9. An electronic device comprising a memory, a processor and a computer program stored in the memory and executable on the processor, characterized in that the processor, when executing the computer program, implements the steps of the method according to any one of claims 1 to 5 or of the method according to any one of claims 6 to 7.
10. A computer readable storage medium storing a computer program, characterized in that the computer program, when executed by a processor, implements the steps of the method according to any one of claims 1 to 5 or of the method according to any one of claims 6 to 7.
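For readers following the data flow recited in claim 4, a minimal sketch is given here. It is illustrative only and makes several assumptions that the claims do not state: each first global maximum pooling feature is a plain vector with one entry per channel, "splicing" stacks those vectors, the second-stage pooling runs across the stacked parts, the subtraction is taken as average minus maximum, and each convolution layer is modelled as a 1 x 1 convolution, which on a pooled vector reduces to a matrix-vector product. The names `matvec` and `global_relation_feature` and the toy weights are hypothetical.

```python
from typing import List

Vector = List[float]
Matrix = List[List[float]]


def matvec(weight: Matrix, x: Vector) -> Vector:
    """Apply a 1 x 1 convolution to a pooled vector (a matrix-vector product)."""
    return [sum(w * v for w, v in zip(row, x)) for row in weight]


def global_relation_feature(
    part_features: List[Vector], conv1: Matrix, conv2: Matrix, conv3: Matrix
) -> Vector:
    """Combine per-part max-pooled vectors into one global relation feature."""
    channels = len(part_features[0])
    count = len(part_features)
    # Second-stage pooling across the stacked parts, per channel.
    avg_feat = [sum(p[c] for p in part_features) / count for c in range(channels)]
    max_feat = [max(p[c] for p in part_features) for c in range(channels)]
    # Assumed sign convention: average minus maximum.
    diff = [a - m for a, m in zip(avg_feat, max_feat)]
    # Splice the two branch outputs, then apply the third layer.
    spliced = matvec(conv1, diff) + matvec(conv2, max_feat)
    return matvec(conv3, spliced)


# Two parts, two channels; identity weights make the flow easy to check by hand.
parts = [[4.0, 1.0], [8.0, 3.0]]
identity = [[1.0, 0.0], [0.0, 1.0]]
sum_branches = [[1.0, 0.0, 1.0, 0.0], [0.0, 1.0, 0.0, 1.0]]
feature = global_relation_feature(parts, identity, identity, sum_branches)
print(feature)  # [6.0, 2.0]
```

With identity weights in the first two layers and a third layer that adds the two branches, the output equals the per-channel average of the parts, which is a convenient sanity check before learned weights are substituted.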
CN202311424960.6A 2023-10-27 2023-10-27 Pedestrian re-recognition method and model training method based on global feature learning Pending CN117541979A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202311424960.6A CN117541979A (en) 2023-10-27 2023-10-27 Pedestrian re-recognition method and model training method based on global feature learning

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202311424960.6A CN117541979A (en) 2023-10-27 2023-10-27 Pedestrian re-recognition method and model training method based on global feature learning

Publications (1)

Publication Number Publication Date
CN117541979A true CN117541979A (en) 2024-02-09

Family

ID=89781552

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202311424960.6A Pending CN117541979A (en) 2023-10-27 2023-10-27 Pedestrian re-recognition method and model training method based on global feature learning

Country Status (1)

Country Link
CN (1) CN117541979A (en)

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination