CN111832348A - Pedestrian re-identification method based on pixel and channel attention mechanism - Google Patents
Pedestrian re-identification method based on pixel and channel attention mechanism
- Publication number
- CN111832348A (application CN201910310802.5A)
- Authority
- CN
- China
- Prior art keywords
- channel
- pedestrian
- pixel
- information
- features
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V40/00—Recognition of biometric, human-related or animal-related patterns in image or video data
- G06V40/10—Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
- G06V40/103—Static body considered as a whole, e.g. static pedestrian or occupant recognition
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/22—Matching criteria, e.g. proximity measures
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/40—Extraction of image or video features
- G06V10/44—Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; Connectivity analysis, e.g. of connected components
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/40—Extraction of image or video features
- G06V10/56—Extraction of image or video features relating to colour
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Multimedia (AREA)
- Data Mining & Analysis (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Bioinformatics & Cheminformatics (AREA)
- Evolutionary Computation (AREA)
- Evolutionary Biology (AREA)
- General Engineering & Computer Science (AREA)
- Bioinformatics & Computational Biology (AREA)
- Artificial Intelligence (AREA)
- Life Sciences & Earth Sciences (AREA)
- Human Computer Interaction (AREA)
- Image Analysis (AREA)
Abstract
The invention discloses a pedestrian re-identification method based on a pixel and channel attention mechanism, comprising the following steps: extracting the global features of the pedestrian from the person's bounding box (search box); evenly dividing the pedestrian picture into two parts and into three parts, and extracting the pedestrian's local features from each; and matching the extracted person features against the person information in the Gallery to find the required person information. The method extracts features with channel and pixel attention modules, effectively reducing the influence of background information on the retrieval result; meanwhile, middle-layer supervision is designed for the neural network, and during feature extraction a multi-loss function supervises the middle-layer feature information to accelerate network convergence. The pedestrian re-identification network based on the channel attention mechanism, the pixel attention mechanism and middle-layer supervision can effectively delete redundant information in the person bounding box, so that person information is effectively aggregated and retrieval precision is significantly improved.
Description
Technical Field
The invention relates to a pedestrian re-identification method, in particular to a pedestrian re-identification method based on a pixel and channel attention mechanism, and belongs to the technical field of image processing.
Background
At present, criminal activity at home and abroad poses a serious threat to the sustained, stable development of society. Places with heavy foot traffic, such as shopping malls, stations, airports and pedestrian streets, are covered by surveillance equipment of all sizes, yet accurately finding the person or information one needs in that surveillance data remains a great challenge. In criminal investigation work especially, the police must find a suspect in large volumes of long-duration surveillance footage, learn the situation in time and bring the suspect under control. However, the surveillance data is enormous, its content is complicated, and camera fields of view are narrow, so finding the target person quickly and accurately is very difficult. Although face recognition technology is now mature and widely applied in many fields, the resolution and shooting angle of surveillance cameras often prevent the capture of clear, usable face images, so person information cannot be retrieved with face recognition alone.
To solve the problem of person retrieval under such complex conditions, pedestrian re-identification (person re-identification) technology has emerged. This technology uses a computer to retrieve person information and can save substantial manpower and material resources. With the development of deep learning, deep-learning-based re-identification has become the mainstream of pedestrian re-identification technology. Existing deep-learning-based re-identification methods fall mainly into five categories: methods based on representation learning, metric learning, local features, video sequences, and GAN-based image generation.
These methods are widely used in person re-identification research, but each has problems. Representation-learning methods use global features as the feature vector, so many detail features are lost during extraction and errors occur in the retrieval result. Metric-learning methods compare the similarity distance between two pictures through a neural network, and how to compute inter-picture similarity accurately is still an open research topic. Local-feature methods divide a person picture into several parts along the vertical direction and extract local features from each part; however, because of the person's posture and similar factors, the division is often inaccurate, which seriously affects system accuracy. Video-sequence-based re-identification still needs further exploration of how to remove redundant frames. At present, pictures generated by GAN-based methods can only be used as negative samples, and their distortion is relatively serious.
Beyond these method-specific drawbacks, low camera resolution, occlusion, and variations in viewpoint, posture and illumination all harm a re-identification system. Current deep-learning-based pedestrian re-identification methods use pooling to reduce the dimensionality of extracted features, but whether max pooling or average pooling is used, every channel and every pixel of the picture is treated identically. In particular, a bounding box (search box) contains both person information and background information, which the neural network cannot distinguish, so background information is extracted as part of the person's features and can strongly degrade the accuracy of the whole re-identification system. Effectively reducing the influence of background information on re-identification is therefore a major challenge.
To effectively reduce the influence of background information on the retrieval result, the invention proposes extracting features with channel and pixel attention modules. Before max pooling and average pooling, the channel and pixel attention modules are applied to delete redundant information and improve the effectiveness of the picture feature vector. Meanwhile, the invention extracts the pedestrian's global and local features with a neural network, designs middle-layer supervision for the network, and uses a multi-loss function to supervise middle-layer feature information during extraction, accelerating network convergence and improving retrieval precision.
Disclosure of Invention
The invention mainly aims to provide a pedestrian re-identification method based on a pixel and channel attention mechanism so as to overcome the defects in the prior art.
In order to achieve the purpose, the technical scheme adopted by the invention comprises the following steps:
extracting the global features of the pedestrian from the person's bounding box (search box);
evenly dividing the pedestrian picture into two parts and into three parts, and extracting the pedestrian's local features from each;
and matching the extracted person features against the person information in the Gallery to find the required person information.
Preferably, the global and local features of the pedestrian are extracted with a neural network; the extracted global features include color and edge features, and the extracted local features include the color and edge features of different regions of the pedestrian along the vertical direction.
Preferably, in extracting the pedestrian's local features, the channel attention module and the pixel attention module aggregate the person feature information; the extracted person features are the feature information obtained by aggregation through the neural network, and the person information in the Gallery is the feature information output after the Gallery pictures are input into the trained model.
Preferably, the extracting global and local features of the pedestrian based on the neural network specifically includes:
using a ResNet-50 network as the base network to extract picture features, keeping only its first three layers; then dividing the whole network into three branches: the first branch extracts global features of the image, the second divides the feature tensor into two parts along the vertical direction, and the third divides it into three parts; the channel attention module then aggregates the feature information and deletes redundant channel information; max pooling then reduces the dimensions; finally, a 1 × 1 convolutional layer reduces the feature vector from 2048 to 256 dimensions;
the first three layers of the ResNet network, i.e., layer1, layer2 and layer3, are followed by middle-layer supervision, in which pixel attention modules reduce the value of background pixels and increase the value of person pixels.
Preferably, the channel attention module is implemented as follows:
let the input tensor have size H × W × C, denoted X = [x_1, x_2, …, x_C], where H is the image height, W the image width and C the channels;
the first step: reduce the dimension of each channel's feature information, the reduced feature of channel c being denoted F_c:
F_c = (1/(H · W)) Σ_{i=1}^{H} Σ_{j=1}^{W} x_c(i, j)    (1)
where x_c(i, j) is the value at position (i, j) on channel c; the formula averages the tensor within each channel, achieving feature aggregation;
the second step: filter each channel and delete redundant information:
ω_c = f_1(F_c)    (2)
where ω_c is the weight given to channel c, F_c is the tensor value of channel c, and f_1 is the filtering operation;
the third step: perform the dimension-raising operation:
Z = f_2(ω)    (3)
where ω = [ω_1, ω_2, …] is the vector of filtered channel weights, Z_c is the final weight of channel c, and f_2 is the dimension-raising function, representing a convolution operation;
the fourth step: weight the source tensor:
X_result-c(i, j) = Z_c · x_c(i, j)    (4)
preferably, the pixel attention module is implemented as follows:
let the input tensor have size H × W × C, denoted Y = [y_1, y_2, …, y_C], where H is the image height, W the image width and C the channels;
the first step: compress the number of channels to 1 by the following formula (5) for subsequent processing:
D(i, j) = (1/C) Σ_{c=1}^{C} y_c(i, j)    (5)
the second step: rearrange the tensor values:
E_α = g_0(D), α = 3 · j + i    (6)
the third step: screen the values:
{I_1, I_2, …, I_N} = g_1({η_1, η_2, …, η_α} · {E_1, E_2, …, E_α})    (7)
{J_1, J_2, …, J_α} = g_2({γ_1, γ_2, …, γ_N} · {I_1, I_2, …, I_N})    (8)
the fourth step: restore the obtained vector to the original map size:
K = g_4(J)    (9)
the fifth step: assign a weight to each pixel:
Y_result-c(i, j) = K(i, j) · Y(i, j)    (10)
compared with the prior art, the invention has the advantages that:
(1) features are extracted with the channel and pixel attention modules, which are applied before the max pooling and average pooling operations to delete redundant information and improve the effectiveness of the picture feature vector; meanwhile, the invention extracts the pedestrian's global and local features with a neural network, designs middle-layer supervision for the network, and uses a multi-loss function to supervise middle-layer feature information during extraction, accelerating network convergence and improving retrieval precision;
(2) the invention provides an innovative pedestrian re-identification network based on the channel attention mechanism, the pixel attention mechanism and middle-layer supervision; the network effectively deletes redundant information in the person bounding box, so that person information is effectively aggregated and retrieval precision is significantly improved;
(3) the invention verifies the experimental effect on the three data sets Market1501, DukeMTMC-reID and CUHK03-NP; the results show that, compared with other methods, the proposed re-identification network markedly improves both the CMC and mAP indexes, especially on the CUHK03-NP data set.
Drawings
FIG. 1 is a schematic diagram of a main workflow of pedestrian re-identification in an exemplary embodiment of the present invention;
FIG. 2 is a schematic diagram of a re-identification network structure including a channel and pixel attention mechanism in an exemplary embodiment of the invention;
FIG. 3 is a block diagram of a channel attention module in accordance with an exemplary embodiment of the present invention;
FIG. 4 is an attention map of a channel attention module in an exemplary embodiment of the invention;
FIG. 5 is a block diagram of a pixel attention module in accordance with an exemplary embodiment of the present invention;
FIG. 6 is an attention map of a pixel attention module in an exemplary embodiment of the invention;
FIG. 7 is a diagram illustrating the retrieval results on the data sets Market1501, DukeMTMC-reID and CUHK03-NP in accordance with an exemplary embodiment of the present invention.
Detailed Description
In view of the deficiencies in the prior art, the inventors of the present invention have made extensive studies and extensive practices to provide technical solutions of the present invention. The technical solution, its implementation and principles, etc. will be further explained as follows.
Referring to fig. 1, where CA denotes the channel attention module and PA the pixel attention module, the pedestrian re-identification method based on a pixel and channel attention mechanism comprises:
first, extracting the pedestrian's global features from the person's bounding box;
then evenly dividing the pedestrian picture into two parts and into three parts, and extracting the pedestrian's local features from each, aggregating the pedestrian feature information with the channel and pixel attention modules in the process;
and then matching the extracted person features against the person information in the Gallery to find the required person information.
The extracted global features of the pedestrian mainly include color and edge features; the local features are the color, edge and similar features of different regions of the pedestrian along the vertical direction.
The person information in the Gallery specifically refers to the feature information output after the Gallery pictures are input into the trained model. The extracted person features are the feature information obtained by aggregation through the neural network.
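The matching step above can be sketched as follows. This is an illustrative example, not part of the claimed method: the function name, the use of cosine similarity, and the toy feature vectors are all assumptions; the patent only specifies that extracted query features are matched against pre-computed Gallery features.

```python
import numpy as np

def retrieve(query_feat, gallery_feats, gallery_ids, top_k=5):
    """Rank gallery identities by cosine similarity to the query feature.

    `query_feat` and the rows of `gallery_feats` stand in for the
    aggregated feature vectors produced by the trained network.
    """
    q = query_feat / np.linalg.norm(query_feat)
    g = gallery_feats / np.linalg.norm(gallery_feats, axis=1, keepdims=True)
    scores = g @ q                # cosine similarity per gallery entry
    order = np.argsort(-scores)   # best match first
    return [gallery_ids[i] for i in order[:top_k]]

# Toy example: a gallery of 3 identities; the query is closest to "B".
gallery = np.array([[1.0, 0.0], [0.6, 0.8], [0.0, 1.0]])
ranking = retrieve(np.array([0.5, 0.9]), gallery, ["A", "B", "C"], top_k=3)
```

In practice the Euclidean distance over L2-normalized features gives the same ranking as cosine similarity, so either choice fits the matching step described here.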
The invention extracts global and local features with a neural network. Fig. 2 shows the structure of the re-identification network with the channel and pixel attention mechanisms; in the diagram, the upper of the three branches off the main network extracts the person's global features, while the middle and lower branches extract the person's local features.
The specific details of the overall neural network are described below:
(1) Overall network structure, shown in fig. 2. In the figure, PA is the pixel attention module, CA is the channel attention module, Triplet_Loss is the triplet loss function, CrossEntropy Loss is the cross-entropy loss function, and Sum_Loss is the total loss function. The network uses ResNet-50 as the base network for extracting picture features. Unlike the base network, only the first three layers of ResNet-50 are used, after which the network splits into three branches. The first branch extracts global features of the image; the second branch divides the feature tensor into two parts along the vertical direction; the third divides it into three parts. The channel attention module then aggregates the feature information and deletes redundant channel information, max pooling reduces the dimensions, and finally a 1 × 1 convolutional layer reduces the feature vector from 2048 to 256 dimensions. Also as shown in fig. 2, middle-layer supervision is added after layer1, layer2 and layer3; there the pixel attention module reduces the value of background pixels and increases the value of person pixels. The dimensions of the network feature maps are given in table 1.
| Number | Module | Feature size | Dimension |
|---|---|---|---|
| 1 | Layer1 | 96×32 | 256 |
| 2 | Layer2 | 48×16 | 512 |
| 3 | Layer3 | 24×8 | 1024 |
| 4 | Branch_Global | 12×4 | 2048 |
| 5 | Branch_Part1 | 24×8 | 2048 |
| 6 | Branch_Part2 | 24×8 | 2048 |
| 7 | Channel Attention-1 | 12×4 | 2048 |
| 8 | Channel Attention-2 | 24×8 | 2048 |
| 9 | Channel Attention-3 | 24×8 | 2048 |
| 10 | Pixel Attention-1 | 96×32 | 256 |
| 11 | Pixel Attention-2 | 48×16 | 512 |
| 12 | Pixel Attention-3 | 24×8 | 1024 |

Table 1. Network feature-map information; the resolution of the input picture is set to 384 × 128.
(2) The channel attention module is structured as shown in fig. 3.
Previously, CNN-based networks implicitly gave every channel of a tensor the same weight, which does not match reality: with equal weights, redundant channel information cannot be deleted, noise enters the final feature vector, and the retrieval result suffers. The key to a channel attention mechanism is how to give each channel a different weight. Fig. 3 shows the channel attention model structure designed here.
As shown in fig. 3, AvgPool2d is an adaptive pooling layer and Conv2d is a convolutional layer. Let the input tensor have size H × W × C, denoted X = [x_1, x_2, …, x_C]. The first step reduces the dimension of each channel's feature information; the reduced feature of channel c is denoted F_c:

F_c = (1/(H · W)) Σ_{i=1}^{H} Σ_{j=1}^{W} x_c(i, j)    (1)

where x_c(i, j) is the value at position (i, j) on channel c. The formula averages the tensor within each channel, achieving feature aggregation.

Each channel is then filtered and redundant information deleted:

ω_c = f_1(F_c)    (2)

where ω_c is the weight given to channel c, F_c is the tensor value of channel c, and f_1 is the filtering operation.

A dimension-raising operation follows:

Z = f_2(ω)    (3)

where ω = [ω_1, ω_2, …] is the vector of filtered channel weights, Z_c is the final weight of channel c, and f_2 is the dimension-raising function, implemented as a convolution in the structure diagram. Finally, the source tensor is weighted:

X_result-c(i, j) = Z_c · x_c(i, j)    (4)
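The four steps above can be sketched as an SE-style module in PyTorch. This is an illustrative reading, not the patent's exact implementation: global average pooling plays the role of Eq. (1), a channel-reducing 1×1 convolution stands in for the filter f1 of Eq. (2), a channel-restoring 1×1 convolution for the dimension-raising f2 of Eq. (3), and a sigmoid weight is applied back to the source tensor as in Eq. (4). The reduction ratio `r` is an assumption.

```python
import torch
import torch.nn as nn

class ChannelAttention(nn.Module):
    """SE-style sketch of the channel attention steps (1)-(4)."""
    def __init__(self, channels, r=16):
        super().__init__()
        self.pool = nn.AdaptiveAvgPool2d(1)              # F_c: one value per channel
        self.f1 = nn.Conv2d(channels, channels // r, 1)  # filter / reduce channels
        self.f2 = nn.Conv2d(channels // r, channels, 1)  # raise back to C channels
        self.act = nn.ReLU(inplace=True)

    def forward(self, x):
        w = self.pool(x)               # B x C x 1 x 1, Eq. (1)
        w = self.act(self.f1(w))       # Eq. (2)
        z = torch.sigmoid(self.f2(w))  # final per-channel weight Z_c in (0, 1), Eq. (3)
        return x * z                   # reweight the source tensor, Eq. (4)

ca = ChannelAttention(2048)
x = torch.randn(2, 2048, 12, 4)   # matches the Branch_Global size in table 1
out = ca(x)
```

Because the weight z lies in (0, 1), every output value is no larger in magnitude than the corresponding input value, which is how low-weight (background-dominated) channels get suppressed.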
Fig. 4 is an attention map of the channel attention module, in which "Input image" is the input image of the model; as the overall network structure diagram shows, the channel attention module is used in the upper, middle and lower branches of the main network. "No-CA1" is the attention feature map without the channel attention model, and "CA1" is the attention feature map after adding it.
the right 6 images in fig. 4 show the feature aggregation effect of the model after the attention module is used, and the highlight part in the figure represents that the features of the part have important influence on the retrieval result. We can see that after using CA (channel attention module), the neural network can effectively delete the background information, the character features are strengthened, and the search result is positively influenced.
(3) The pixel attention module is shown in fig. 5.
In the present invention, the pixel attention module is applied in the middle supervision branches. As with channel attention, let the input tensor have size H × W × C, denoted Y = [y_1, y_2, …, y_C]. The first step compresses the number of channels to 1 for subsequent processing:

D(i, j) = (1/C) Σ_{c=1}^{C} y_c(i, j)    (5)

The tensor values are then rearranged, as shown in fig. 5:

E_α = g_0(D), α = 3 · j + i    (6)

Then, similar to channel attention, the values are screened:

{I_1, I_2, …, I_N} = g_1({η_1, η_2, …, η_α} · {E_1, E_2, …, E_α})    (7)

{J_1, J_2, …, J_α} = g_2({γ_1, γ_2, …, γ_N} · {I_1, I_2, …, I_N})    (8)

The resulting vector is then restored to the original map size:

K = g_4(J)    (9)

Finally, each pixel is assigned a weight:

Y_result-c(i, j) = K(i, j) · Y(i, j)    (10)
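The pixel attention idea above can be sketched as follows. This is a deliberately simplified illustration: the rearrangement and screening operations g0 through g4 of Eqs. (6)-(9) are not fully specified by the text, so this sketch collapses them into a small convolution. What it preserves is the overall flow: channels compressed to a single map (Eq. 5), a per-pixel weight K(i, j) produced, and every spatial position of the source tensor rescaled by it (Eq. 10).

```python
import torch
import torch.nn as nn

class PixelAttention(nn.Module):
    """Simplified sketch of pixel attention: compress channels to 1,
    refine, and reweight every pixel."""
    def __init__(self, channels):
        super().__init__()
        self.compress = nn.Conv2d(channels, 1, kernel_size=1)   # D: one map, Eq. (5)
        self.refine = nn.Conv2d(1, 1, kernel_size=3, padding=1)  # stand-in for g0..g4

    def forward(self, y):
        d = self.compress(y)               # B x 1 x H x W
        k = torch.sigmoid(self.refine(d))  # per-pixel weight K(i, j) in (0, 1)
        return y * k                       # Eq. (10): background pixels get small weights

pa = PixelAttention(256)
weighted = pa(torch.randn(2, 256, 96, 32))  # matches the Pixel Attention-1 size in table 1
```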
FIG. 6 is an attention map of the pixel attention module. Similar to the channel attention maps, the pixel attention module is used in the layer1, layer2 and layer3 branches; "No-PA1" is the attention feature map without the pixel attention model, and "PA1" is the attention feature map after adding it. As fig. 6 makes clear, with the pixel attention module the environmental information is effectively suppressed and the person's feature information is further strengthened, improving the retrieval result.
The invention thus provides an innovative pedestrian re-identification network based on the channel attention mechanism, the pixel attention mechanism and middle-layer supervision. The network effectively deletes redundant information in the person bounding box, so that person information is effectively aggregated and retrieval precision is significantly improved.
(4) Technical effects of the invention
The invention mainly uses the three data sets Market1501, DukeMTMC-reID and CUHK03-NP to verify the experimental effect. Tables 2-4 show the comparison results on Market1501, DukeMTMC-reID and CUHK03-NP, respectively, where RK stands for the re-ranking algorithm.
TABLE 2 comparison of the results on the data set Market1501, RK stands for the re-ranking algorithm
TABLE 3 comparison of results on the data set DukeMTMC-reID, RK stands for re-ranking algorithm
TABLE 4 comparison of the results on the data set CUHK03-NP, RK stands for re-ranking algorithm
From tables 2-4 it can be seen that the re-identification network of the invention significantly improves both the CMC and mAP indexes compared with other methods, especially on the CUHK03-NP data set, where the accuracies on CUHK03-labeled and CUHK03-detected reach rank1/mAP of 80.9/78.7 and 78.9/76.4 respectively, far exceeding other re-ID methods.
Table 5 shows the results of ablation experiments testing three network structures (backbone, backbone + CA, and backbone + CA + PA) on the DukeMTMC-reID and CUHK03 data sets; the CA and PA modules proposed by the invention clearly improve the retrieval effect of the base neural network.
TABLE 5 ablation test results
FIG. 7 is a graph of the retrieval results on the data sets Market1501, DukeMTMC-reID and CUHK03-NP using the present invention.
It should be understood that the above-mentioned embodiments are merely illustrative of the technical concepts and features of the present invention, which are intended to enable those skilled in the art to understand the contents of the present invention and implement the present invention, and therefore, the protection scope of the present invention is not limited thereby. All equivalent changes and modifications made according to the spirit of the present invention should be covered within the protection scope of the present invention.
Claims (6)
1. A pedestrian re-identification method based on a pixel and channel attention mechanism is characterized by comprising the following steps:
extracting the global features of the pedestrians according to the retrieval frame of the person;
averagely dividing a pedestrian picture into two parts and three parts, and respectively extracting local features of pedestrians;
and matching the extracted person features with the person information in the Gallery to find out the required person information.
2. The method of claim 1 for pedestrian re-identification based on a pixel and channel attention mechanism, comprising: the method comprises the steps of extracting global and local features of the pedestrian based on a neural network, wherein the extracted global features of the pedestrian comprise color and edge features, and the extracted local features of the pedestrian comprise color and edge features of different areas of the pedestrian in the vertical direction.
3. The method of claim 2, comprising: a channel attention module and a pixel attention module are used to aggregate person feature information when extracting the pedestrian's local features; the extracted person features are the feature information obtained by aggregation through a neural network, and the person information in the Gallery is the feature information output after the Gallery pictures are input into the trained model.
4. The pedestrian re-identification method based on the pixel and channel attention mechanism according to claim 3, wherein the extracting global and local features of the pedestrian based on the neural network specifically comprises:
using a ResNet-50 network as the base network to extract picture features, keeping only its first three layers; then dividing the whole network into three branches: the first branch extracts global features of the image, the second divides the feature tensor into two parts along the vertical direction, and the third divides it into three parts; the channel attention module then aggregates the feature information and deletes redundant channel information; max pooling then reduces the dimensions; finally, a 1 × 1 convolutional layer reduces the feature vector from 2048 to 256 dimensions;
the first three layers of the ResNet network, i.e., layer1, layer2, and layer3, are followed by middle layer supervision, in which pixel attention modules are used to reduce the value of background pixels and increase the value of human pixels.
5. The pedestrian re-identification method based on the pixel and channel attention mechanism as claimed in claim 3, wherein the channel attention module is implemented as follows:
let the input tensor have size H × W × C, denoted X = [x_1, x_2, …, x_C], where H is the image height, W the image width and C the channels;
the first step: reduce the dimension of each channel's feature information, the reduced feature of channel c being denoted F_c:
F_c = (1/(H · W)) Σ_{i=1}^{H} Σ_{j=1}^{W} x_c(i, j)    (1)
where x_c(i, j) is the value at position (i, j) on channel c; the formula averages the tensor within each channel, achieving feature aggregation;
the second step: filter each channel and delete redundant information:
ω_c = f_1(F_c)    (2)
where ω_c is the weight given to channel c, F_c is the tensor value of channel c, and f_1 is the filtering operation;
the third step: perform the dimension-raising operation:
Z = f_2(ω)    (3)
where ω = [ω_1, ω_2, …] is the vector of filtered channel weights, Z_c is the final weight of channel c, and f_2 is the dimension-raising function, representing a convolution operation;
the fourth step: weight the source tensor:
X_result-c(i, j) = Z_c · x_c(i, j)    (4).
6. the method of claim 3, wherein the pixel attention module is implemented as follows:
let the input tensor have size H × W × C, denoted Y = [y_1, y_2, …, y_C], where H is the image height, W the image width and C the channels;
the first step: compress the number of channels to 1 by the following formula (5) for subsequent processing:
D(i, j) = (1/C) Σ_{c=1}^{C} y_c(i, j)    (5)
the second step: rearrange the tensor values:
E_α = g_0(D), α = 3 · j + i    (6)
the third step: screen the values:
{I_1, I_2, …, I_N} = g_1({η_1, η_2, …, η_α} · {E_1, E_2, …, E_α})    (7)
{J_1, J_2, …, J_α} = g_2({γ_1, γ_2, …, γ_N} · {I_1, I_2, …, I_N})    (8)
the fourth step: restore the obtained vector to the original map size, the map size being the size of the feature map:
K = g_4(J)    (9)
the fifth step: assign a weight to each pixel:
Y_result-c(i, j) = K(i, j) · Y(i, j)    (10).
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910310802.5A CN111832348B (en) | 2019-04-17 | 2019-04-17 | Pedestrian re-identification method based on pixel and channel attention mechanism |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910310802.5A CN111832348B (en) | 2019-04-17 | 2019-04-17 | Pedestrian re-identification method based on pixel and channel attention mechanism |
Publications (2)
Publication Number | Publication Date |
---|---|
CN111832348A true CN111832348A (en) | 2020-10-27 |
CN111832348B CN111832348B (en) | 2022-05-06 |
Family
ID=72914987
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201910310802.5A Active CN111832348B (en) | 2019-04-17 | 2019-04-17 | Pedestrian re-identification method based on pixel and channel attention mechanism |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN111832348B (en) |
Patent Citations (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN107832672A (en) * | 2017-10-12 | 2018-03-23 | Beihang University | Pedestrian re-identification method using pose information to design multiple loss functions |
CN108510012A (en) * | 2018-05-04 | 2018-09-07 | Sichuan University | Rapid target detection method based on multi-scale feature maps |
Non-Patent Citations (1)
Title |
---|
TIANSHENG GUO et al.: "Deep Network with Spatial and Channel Attention for Person Re-identification", 2018 IEEE Visual Communications and Image Processing (VCIP) * |
Cited By (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN112836646A (en) * | 2021-02-05 | 2021-05-25 | South China University of Technology | Video pedestrian re-identification method based on channel attention mechanism and application |
CN112836646B (en) * | 2021-02-05 | 2023-04-28 | South China University of Technology | Video pedestrian re-identification method based on channel attention mechanism and application |
CN112884680A (en) * | 2021-03-26 | 2021-06-01 | Nantong University | Single image defogging method using end-to-end neural network |
Also Published As
Publication number | Publication date |
---|---|
CN111832348B (en) | 2022-05-06 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN109829443B (en) | Video behavior identification method based on image enhancement and 3D convolution neural network | |
CN110348376B (en) | Pedestrian real-time detection method based on neural network | |
CN106682108B (en) | Video retrieval method based on multi-mode convolutional neural network | |
CN113065558A (en) | Lightweight small target detection method combined with attention mechanism | |
CN108304808A | Surveillance video object detection method based on spatio-temporal information and deep networks | |
CN110378849B (en) | Image defogging and rain removing method based on depth residual error network | |
CN107330390B (en) | People counting method based on image analysis and deep learning | |
CN109685045B (en) | Moving target video tracking method and system | |
CN109886159B (en) | Face detection method under non-limited condition | |
CN111709331B (en) | Pedestrian re-recognition method based on multi-granularity information interaction model | |
CN111832348B (en) | Pedestrian re-identification method based on pixel and channel attention mechanism | |
Ma et al. | Image-based air pollution estimation using hybrid convolutional neural network | |
TW201308254A | Motion detection method for complex scenes | |
CN113792606A (en) | Low-cost self-supervision pedestrian re-identification model construction method based on multi-target tracking | |
CN114627269A (en) | Virtual reality security protection monitoring platform based on degree of depth learning target detection | |
CN110866453B (en) | Real-time crowd steady state identification method and device based on convolutional neural network | |
CN112164010A (en) | Multi-scale fusion convolution neural network image defogging method | |
CN111507416A (en) | Smoking behavior real-time detection method based on deep learning | |
CN115171183A (en) | Mask face detection method based on improved yolov5 | |
CN105701515A (en) | Face super-resolution processing method and system based on double-layer manifold constraint | |
CN110751667A (en) | Method for detecting infrared dim small target under complex background based on human visual system | |
CN105930789A (en) | Human body behavior recognition based on logarithmic Euclidean space BOW (bag of words) model | |
CN117710888A (en) | Method and system for re-identifying blocked pedestrians | |
CN108597172A | Forest fire recognition method and device, electronic equipment, and storage medium | |
CN108764287A (en) | Object detection method and system based on deep learning and grouping convolution |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||