CN114332919A - Pedestrian detection method and device based on multi-spatial relationship perception and terminal equipment - Google Patents
- Publication number
- CN114332919A CN202111510823.5A
- Authority
- CN
- China
- Prior art keywords
- spatial relationship
- feature
- pedestrian detection
- relation
- perception
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Abstract
The invention discloses a pedestrian detection method, device and terminal equipment based on multi-spatial relationship perception, wherein the method comprises the following steps: step 1, collecting a pedestrian image data set, adjusting it to a fixed size, and training a model; step 2, adopting the YOLOX detection framework, inputting the image into the framework model, and first performing data enhancement on the image; step 3, inputting the enhanced image into a Focus module, slicing the image by pixel parity to obtain 4 images, and splicing them along the channel direction; step 4, inputting the spliced image into the backbone network of the YOLOX detection framework, to which three branches are connected; step 5, each branch comprising 2 parts, a multi-spatial relationship perception module and a detection head, where the multi-spatial relationship perception module effectively fuses global information and local information by combining the relationships between features in different spatial dimensions to obtain a multi-spatial relationship perception feature map. The method attends to global information while also extracting local information, and effectively fuses the two, thereby obtaining more discriminative feature information and improving pedestrian detection performance.
Description
Technical Field
The invention relates to the field of image recognition research, in particular to pedestrian detection, and specifically to a pedestrian detection method, a pedestrian detection device and terminal equipment based on multi-spatial relationship perception.
Background
With the continuous development of smart city construction, many new artificial intelligence technologies are being applied to intelligent transportation, intelligent government affairs, intelligent factories and the like. All of these applications ultimately serve people, so pedestrian detection is a prerequisite for many application technologies. However, real scenes are often complex: bodies overlap in dense crowds, objects occlude pedestrians, illumination changes strongly, and severe weather (rain, snow, etc.) blurs the picture. These real conditions increase the difficulty of pedestrian detection. Therefore, a pedestrian detection technology is needed that can mine features in the pedestrian regions of an image that are deep and discriminative enough to characterize pedestrians in various environments.
In the process of implementing the invention, the inventors found that the prior art has at least the following problems: most currently popular pedestrian detection technologies are based on convolutional neural networks (CNNs), and most CNN pedestrian detection models use limited receptive fields, making it difficult to learn rich structural patterns that incorporate global information; examples include using a CNN to detect and segment pedestrians to obtain the final position information, or combining a CNN with feature fusion for pedestrian detection. Although some methods take different receptive fields into account, they do not combine global and local information well. In addition, some approaches enhance the learning capability of the model by stacking network depth, which makes them very resource-intensive to train and deploy.
Disclosure of Invention
In order to overcome the defects of the prior art, the invention provides a pedestrian detection method, device and terminal equipment based on multi-spatial relationship perception. The technical scheme is as follows:
the invention provides a pedestrian detection method based on multi-spatial relationship perception, which comprises the following steps:
Step 1, a pedestrian image data set is collected, adjusted to a fixed size, and used to train a model.
Step 2, the YOLOX detection framework is adopted; the image is input into the framework model, and data enhancement is performed on the image.
Step 3, the enhanced image is input into a Focus module, sliced by pixel parity to obtain 4 images, and spliced along the channel direction.
Step 4, the spliced image is input into the backbone network of the YOLOX detection framework, to which three branches are connected; the three branches correspond to different receptive fields, and the three receptive fields can cover targets of different sizes.
Step 5, each branch comprises 2 parts, a multi-spatial relationship perception module and a detection head; the multi-spatial relationship perception module effectively fuses global information and local information by combining the relationships between features in different spatial dimensions to obtain a multi-spatial relationship perception feature map.
The workflow of the multi-spatial relationship perception module is as follows:
The feature map X input into the multi-spatial relationship perception module has dimensions H × W × C, where H is the height, W the width and C the number of channels;
(1) Constructing the relation feature map of the H × W space;
Within the H × W spatial range, the feature map X is decomposed into H × W feature vectors of length C. The relation information for mapping feature vector x_i to feature vector x_j is denoted r_{i,j} and calculated as

r_{i,j} = f_{H×W}(x_i, x_j),

where ψ_{H×W} and φ_{H×W} are the 2 embedding functions from which f_{H×W} is built, each consisting of a 1 × 1 convolutional layer, a Batch Normalization layer and a ReLU activation layer. Correspondingly, the relation information for mapping feature vector x_j to feature vector x_i is r_{j,i} = f_{H×W}(x_j, x_i); the pair (r_{i,j}, r_{j,i}) then describes the bidirectional relationship between feature vectors x_i and x_j. For a single direction, the relation information among all the feature vectors is calculated and stacked to obtain an affinity matrix M whose number of channels is H × W; the bidirectional relationship therefore yields two different affinity matrices M_1 and M_2, deeply mining the local information of the features.
The original global structure information is retained: specifically, after a 1 × 1 convolution is applied to the original feature map X, a global average pooling operation is performed along the channel direction to obtain a global structural feature map F. The global structural feature map F and the two affinity matrices are connected in series to obtain a feature matrix Y, according to the formula

Y = [pool(θ_{H×W}(X)), θ_{H×W}(M_1), θ_{H×W}(M_2)],

where pool denotes global average pooling and θ_{H×W} is an embedding function consisting of a 1 × 1 convolutional layer, a Batch Normalization layer and a ReLU activation layer, differing from ψ_{H×W} and φ_{H×W} only in the number of output activation nodes. The feature matrix Y is then passed through a 1 × 1 convolution to fuse all the global and local information it contains, giving the relation feature map of the H × W space.
(2) Constructing the relation feature map of the channel space C;
Similarly, within the channel space range, the feature map X is decomposed into C feature vectors of length H × W. The relation information for mapping feature vector x_a to feature vector x_b is

r_{a,b} = f_C(x_a, x_b),

where the embedding functions ψ_C and φ_C are consistent with ψ_{H×W} and φ_{H×W}, differing only in output dimensions. Affinity matrices are obtained by the same calculation method as in step 5(1); that is, the bidirectional relationship yields two different affinity matrices M'_1 and M'_2.
After a 1 × 1 convolution of the original feature map X, global average pooling is performed over the H × W dimensions to obtain a structural feature map F'. The structural feature map and the two affinity matrices are connected in series to obtain a feature matrix Y', calculated as

Y' = [pool(θ_C(X)), θ_C(M'_1), θ_C(M'_2)],

where θ_C is consistent with θ_{H×W}, differing only in output dimensions. The feature matrix Y' is passed through a 1 × 1 convolution to fuse all the global and local information it contains, thereby giving the relation feature map of the channel space C.
The relation feature maps of the H × W space and the channel space C are multiplied element-wise to obtain the multi-spatial relationship perception feature map.
Step 6, the multi-spatial relationship perception feature map is fed into the detection head; YOLOX decouples classification from coordinate localization, a 1 × 1 convolution reduces the channel dimension, and two lightweight branches are then attached for classification and regression respectively.
Preferably, the data enhancement of step 2 comprises random horizontal flipping of the image, color dithering, multi-scale enhancement and the Mosaic coordinate enhancement method.
Preferably, the receptive fields corresponding to the three branches in step 4 correspond to 8-fold, 16-fold and 32-fold downsampling respectively.
Preferably, in the training phase, the classification loss function adopts cross entropy, the regression loss function adopts the GIoU loss, and an L1-norm penalty is imposed on the predicted position information.
Compared with the prior art, this technical scheme has the following beneficial effects: through the multi-spatial relationship perception module, the relationships between features in different spatial dimensions are deeply mined; the model both attends to global information and extracts local information, effectively fuses the two, and links the feature information of different spaces with the relation information between features, so that the model learns more distinguishable and discriminative features, thereby obtaining more discriminative feature information and improving pedestrian detection accuracy.
Drawings
Fig. 1 is a flowchart of the multi-spatial relationship perception module according to an embodiment of the present disclosure.
Detailed Description
In order to clarify the technical solution and the working principle of the present invention, the embodiments of the present disclosure are described in further detail below with reference to the accompanying drawings. All of the optional technical solutions below may be combined arbitrarily to form optional embodiments of the present disclosure, which are not repeated here.
The terms "step 1," "step 2," "step 3," and the like in the description and claims of this application and the above-described drawings are used for distinguishing between similar elements and not necessarily for describing a particular sequential or chronological order. It should be understood that the data so used may be interchanged under appropriate circumstances such that the embodiments of the application described herein may be practiced in sequences other than those described herein.
In a first aspect: the embodiment of the disclosure provides a pedestrian detection method based on multi-spatial relationship perception, which comprises the following steps:
Step 1, a pedestrian image data set is collected, adjusted to a fixed size, and used to train the model.
Step 2, the YOLOX detection framework is adopted; the framework has a simple structure, requires no manually set anchor boxes, and is convenient to train and deploy. The image is input into the framework model and data enhancement is performed on it. Preferably, the data enhancement of step 2 comprises random horizontal flipping, color dithering, multi-scale enhancement, the Mosaic coordinate enhancement method and the like, so as to enlarge the training set and improve the generalization capability of the model, as sketched below.
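As an illustration only, a minimal sketch of the flip and color-dithering parts of such a pipeline using torchvision is given below; the parameter values are assumptions rather than the patent's settings, and multi-scale and Mosaic enhancement (which must also remap the ground-truth boxes) are typically implemented at the data-loader level and are omitted here:

```python
import torchvision.transforms as T

# Minimal sketch of the step-2 photometric/geometric augmentations.
# Parameter values are illustrative assumptions; for detection, the
# horizontal flip must also mirror the box coordinates, which a
# dataset-level wrapper would handle.
train_transform = T.Compose([
    T.RandomHorizontalFlip(p=0.5),                                # random horizontal flipping
    T.ColorJitter(brightness=0.4, contrast=0.4, saturation=0.4),  # color dithering
    T.ToTensor(),                                                 # HWC uint8 -> CHW float
])
```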
Step 3, the enhanced image is input into a Focus module, sliced by pixel parity to obtain 4 images, and spliced along the channel direction. The Focus module thus downsamples without increasing the amount of calculation while retaining more complete image information; the slicing is sketched below.
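A sketch of the parity slicing, assuming the usual Focus layout from the YOLO family (the exact ordering of the four slices is an implementation detail):

```python
import torch

def focus_slice(x: torch.Tensor) -> torch.Tensor:
    """Parity slicing of the Focus module: (B, C, H, W) -> (B, 4C, H/2, W/2).

    Pixels are sampled at the four (odd/even row, odd/even column)
    combinations, giving 4 half-resolution images that are spliced along
    the channel direction; every input pixel is kept, so the downsampling
    adds no computation and loses no image information.
    """
    return torch.cat([x[..., 0::2, 0::2],   # even rows, even columns
                      x[..., 1::2, 0::2],   # odd rows,  even columns
                      x[..., 0::2, 1::2],   # even rows, odd columns
                      x[..., 1::2, 1::2]],  # odd rows,  odd columns
                     dim=1)

# e.g. a (1, 3, 640, 640) batch becomes (1, 12, 320, 320)
```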
Step 4, the spliced image is input into the backbone network of the YOLOX detection framework, to which three branches are connected; the three branches correspond to different receptive fields, and together the three receptive fields cover targets of different sizes. Preferably, the receptive fields of the three branches correspond to 8-fold, 16-fold and 32-fold downsampling respectively.
Step 5, each branch comprises 2 parts, a multi-spatial relationship perception module and a detection head; the multi-spatial relationship perception module effectively fuses global information and local information by combining the relationships between features in different spatial dimensions, obtaining a multi-spatial relationship perception feature map.
Fig. 1 is a flowchart of the multi-spatial relationship perception module; with reference to the figure, its workflow is as follows:
The feature map X input into the multi-spatial relationship perception module has dimensions H × W × C, where H is the height, W the width and C the number of channels;
(1) Constructing the relation feature map of the H × W space;
Within the H × W spatial range, the feature map X is decomposed into H × W feature vectors of length C. The relation information for mapping feature vector x_i to feature vector x_j is denoted r_{i,j} and calculated as

r_{i,j} = f_{H×W}(x_i, x_j),

where ψ_{H×W} and φ_{H×W} are the 2 embedding functions from which f_{H×W} is built, each consisting of a 1 × 1 convolutional layer, a Batch Normalization layer and a ReLU activation layer. Correspondingly, the relation information for mapping feature vector x_j to feature vector x_i is r_{j,i} = f_{H×W}(x_j, x_i); the pair (r_{i,j}, r_{j,i}) then describes the bidirectional relationship between feature vectors x_i and x_j. For a single direction, the relation information among all the feature vectors is calculated and stacked to obtain an affinity matrix M whose number of channels is H × W; the bidirectional relationship therefore yields two different affinity matrices M_1 and M_2, deeply mining the local information of the features.
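The text specifies the embedding functions (1 × 1 convolution, Batch Normalization, ReLU) but not the exact form of f_{H×W}; the sketch below assumes a dot product of the two embeddings, one common choice for such relation functions, and all module names, parameter names and channel widths are illustrative assumptions:

```python
import torch
import torch.nn as nn

def embed(c_in: int, c_out: int) -> nn.Sequential:
    # Embedding function as described: 1x1 conv + Batch Normalization + ReLU.
    return nn.Sequential(nn.Conv2d(c_in, c_out, 1),
                         nn.BatchNorm2d(c_out),
                         nn.ReLU(inplace=True))

class SpatialAffinity(nn.Module):
    """Bidirectional affinity matrices M1, M2 of the H x W space (sketch)."""

    def __init__(self, channels: int, embed_dim: int = 64):
        super().__init__()
        self.psi = embed(channels, embed_dim)   # psi_{HxW}
        self.phi = embed(channels, embed_dim)   # phi_{HxW}

    def forward(self, x: torch.Tensor):
        b, c, h, w = x.shape
        n = h * w                                   # number of feature vectors
        p = self.psi(x).flatten(2).transpose(1, 2)  # (B, N, E)
        q = self.phi(x).flatten(2)                  # (B, E, N)
        r = torch.bmm(p, q)                         # r[b, i, j] = f(x_i, x_j)
        m1 = r.transpose(1, 2).reshape(b, n, h, w)  # stack of r_{i,j}: N = HxW channels
        m2 = r.reshape(b, n, h, w)                  # stack of r_{j,i}: reverse direction
        return m1, m2
```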
In order to simultaneously exploit the global information of the features, the original global structure information also needs to be retained. Specifically, after a 1 × 1 convolution is applied to the original feature map X, a global average pooling operation is performed along the channel direction to obtain a global structural feature map F. The global structural feature map F and the two affinity matrices are then connected in series to obtain a feature matrix Y, according to the formula

Y = [pool(θ_{H×W}(X)), θ_{H×W}(M_1), θ_{H×W}(M_2)],

where pool denotes global average pooling and θ_{H×W} is an embedding function consisting of a 1 × 1 convolutional layer, a Batch Normalization layer and a ReLU activation layer, differing from ψ_{H×W} and φ_{H×W} only in the number of output activation nodes. The feature matrix Y is then passed through a 1 × 1 convolution to fuse all the global and local information it contains, giving the relation feature map of the H × W space.
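Continuing the sketch above (reusing embed and SpatialAffinity), one way to assemble F, Y and the fused relation map for the H × W space; because θ_{H×W} is applied both to X (C input channels) and to the affinity matrices (H × W input channels), two instances with different input widths are used here, and all channel widths are assumptions:

```python
class SpatialRelationMap(nn.Module):
    """Relation feature map of the H x W space (sketch).

    F = channel-direction global average pooling of theta(X);
    Y = [F, theta(M1), theta(M2)] concatenated along channels;
    a final 1x1 convolution fuses the global and local information.
    """

    def __init__(self, channels: int, h: int, w: int, mid: int = 64):
        super().__init__()
        n = h * w
        self.affinity = SpatialAffinity(channels, mid)
        self.theta_x = embed(channels, mid)   # theta_{HxW} applied to X
        self.theta_m = embed(n, mid)          # theta_{HxW} applied to M1, M2
        self.fuse = nn.Conv2d(1 + 2 * mid, channels, 1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        m1, m2 = self.affinity(x)
        f = self.theta_x(x).mean(dim=1, keepdim=True)              # F: pool over channels
        y = torch.cat([f, self.theta_m(m1), self.theta_m(m2)], 1)  # feature matrix Y
        return self.fuse(y)                                        # relation map, C channels
```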
(2) Constructing the relation feature map of the channel space C;
Similarly, within the channel space range, the feature map X is decomposed into C feature vectors of length H × W. The relation information for mapping feature vector x_a to feature vector x_b is

r_{a,b} = f_C(x_a, x_b),

where the embedding functions ψ_C and φ_C are consistent with ψ_{H×W} and φ_{H×W}, differing only in output dimensions. Affinity matrices are obtained by the same calculation method as in step 5(1); that is, the bidirectional relationship yields two different affinity matrices M'_1 and M'_2.
Different from the structural feature map F obtained above, in this step, after a 1 × 1 convolution of the original feature map X, global average pooling is performed over the H × W dimensions to obtain a structural feature map F'. The structural feature map and the two affinity matrices are connected in series to obtain a feature matrix Y', calculated as

Y' = [pool(θ_C(X)), θ_C(M'_1), θ_C(M'_2)],

where θ_C is consistent with θ_{H×W}, differing only in output dimensions. The feature matrix Y' is passed through a 1 × 1 convolution to fuse all the global and local information it contains, thereby giving the relation feature map of the channel space C.
The relation feature maps of the H × W space and the channel space C are multiplied element-wise to obtain the multi-spatial relationship perception feature map, which contains, fully fused, the global and local information of the features in different spatial dimensions, improving the effectiveness and discriminative power of the features. A sketch of this branch and of the final product is given below.
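A matching sketch for the channel-space branch and the final element-wise product, reusing embed from above. How Y' is shaped for the 1 × 1 fusion is not fully specified in the text, so the per-channel-descriptor reading below (the fused channel map broadcasts over H × W before the product) is an assumption, and the ψ_C, φ_C embeddings are folded away for brevity:

```python
class ChannelRelationMap(nn.Module):
    """Relation feature map of the channel space C (sketch)."""

    def __init__(self, channels: int):
        super().__init__()
        self.theta_c = embed(channels, channels)        # theta_C applied to X
        self.fuse = nn.Conv1d(2 * channels + 1, 1, 1)   # 1x1 fusion of Y'

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, c, h, w = x.shape
        v = x.flatten(2)                                      # C vectors of length H*W
        r = torch.bmm(v, v.transpose(1, 2))                   # (B, C, C) pairwise relations
        m1, m2 = r, r.transpose(1, 2)                         # M1', M2': the two directions
        f = self.theta_c(x).flatten(2).mean(2, keepdim=True)  # F': pool over HxW -> (B, C, 1)
        y = torch.cat([f, m1, m2], dim=2)                     # Y': (B, C, 2C+1)
        g = self.fuse(y.transpose(1, 2))                      # (B, 1, C)
        return g.transpose(1, 2).unsqueeze(-1)                # (B, C, 1, 1)

# Multi-spatial relationship perception feature map: element-wise product of
# the two relation maps (the channel map broadcasts over H and W here).
# out = spatial_map(x) * channel_map(x)
```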
Step 6, the multi-spatial relationship perception feature map is fed into the detection head. Unlike traditional YOLO-series detection heads, which couple classification and coordinate localization together during training, YOLOX decouples the two: a 1 × 1 convolution reduces the channel dimension, and two lightweight branches are then attached, performing classification and regression respectively, which effectively improves the convergence speed of the model.
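A sketch of such a decoupled head, reusing the embed helper from above; the channel width and the placement of the objectness output are assumptions in the style of YOLOX:

```python
class DecoupledHead(nn.Module):
    """YOLOX-style decoupled head (sketch): one 1x1 conv reduces channels,
    then two lightweight branches classify and regress separately."""

    def __init__(self, in_channels: int, num_classes: int, width: int = 256):
        super().__init__()
        self.stem = nn.Conv2d(in_channels, width, 1)     # channel reduction
        self.cls_branch = nn.Sequential(embed(width, width),
                                        nn.Conv2d(width, num_classes, 1))
        self.reg_branch = nn.Sequential(embed(width, width),
                                        nn.Conv2d(width, 4 + 1, 1))  # box (4) + objectness (1)

    def forward(self, x: torch.Tensor):
        x = self.stem(x)
        return self.cls_branch(x), self.reg_branch(x)
```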
Preferably, in the training phase, the classification loss function adopts cross entropy, the regression loss function adopts the GIoU loss, and an L1-norm penalty is imposed on the predicted position information.
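For reference, a minimal sketch of the GIoU regression loss named here, for boxes in (x1, y1, x2, y2) form; the classification term would be a standard cross entropy, and the L1 penalty a torch.nn.functional.l1_loss on the predicted coordinates:

```python
import torch

def giou_loss(pred: torch.Tensor, target: torch.Tensor) -> torch.Tensor:
    """GIoU loss for (N, 4) boxes in (x1, y1, x2, y2) form (minimal sketch)."""
    # intersection
    ix1 = torch.max(pred[:, 0], target[:, 0])
    iy1 = torch.max(pred[:, 1], target[:, 1])
    ix2 = torch.min(pred[:, 2], target[:, 2])
    iy2 = torch.min(pred[:, 3], target[:, 3])
    inter = (ix2 - ix1).clamp(min=0) * (iy2 - iy1).clamp(min=0)
    # union
    area_p = (pred[:, 2] - pred[:, 0]) * (pred[:, 3] - pred[:, 1])
    area_t = (target[:, 2] - target[:, 0]) * (target[:, 3] - target[:, 1])
    union = area_p + area_t - inter
    iou = inter / union.clamp(min=1e-7)
    # smallest enclosing box
    cw = torch.max(pred[:, 2], target[:, 2]) - torch.min(pred[:, 0], target[:, 0])
    ch = torch.max(pred[:, 3], target[:, 3]) - torch.min(pred[:, 1], target[:, 1])
    c_area = (cw * ch).clamp(min=1e-7)
    giou = iou - (c_area - union) / c_area
    return (1.0 - giou).mean()
```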
In a second aspect, based on the same technical concept, the disclosed embodiments provide a pedestrian detection apparatus based on multi-spatial relationship perception that can implement or execute the pedestrian detection method based on multi-spatial relationship perception according to any one of all possible implementations.
Preferably, the device comprises a data acquisition unit, a first data processing unit, a second data processing unit and a result acquisition unit;
the data acquisition unit is used for executing step 1 of the pedestrian detection method based on multi-spatial relationship perception according to any one of all possible implementations;
the first data processing unit is used for executing steps 2 and 3 of the pedestrian detection method based on multi-spatial relationship perception according to any one of all possible implementations;
the second data processing unit is used for executing steps 4 and 5 of the pedestrian detection method based on multi-spatial relationship perception according to any one of all possible implementations;
the result obtaining unit is used for executing step 6 of the pedestrian detection method based on multi-spatial relationship perception according to any one of all possible implementations.
It should be noted that the embodiment above illustrates only one division of the functional modules of the pedestrian detection apparatus based on multi-spatial relationship perception; in practical applications, the functions may be assigned to different functional modules as needed, i.e. the internal structure of the device may be divided into different functional modules to complete all or part of the functions described above. In addition, the embodiment of the apparatus and the embodiment of the method belong to the same concept; the specific implementation process is detailed in the method embodiments and is not repeated here.
In a third aspect, an embodiment of the present disclosure provides a terminal device comprising the pedestrian detection apparatus based on multi-spatial relationship perception according to any one of all possible implementations.
The invention has been described above by way of example with reference to the accompanying drawings. It should be understood that the invention is not limited to the specific embodiments described above: numerous insubstantial modifications made in accordance with the principles and technical scheme of the invention, and direct applications of its conception and technical scheme to other occasions without improvement, all fall within the protection scope of the invention.
Claims (7)
1. A pedestrian detection method based on multi-spatial relationship perception is characterized by comprising the following steps:
step 1, acquiring a pedestrian image data set, adjusting the pedestrian image data set to a fixed size, and training a model;
step 2, adopting the YOLOX detection framework, inputting the image into the framework model, and first performing data enhancement on the image;
step 3, inputting the enhanced image into a Focus module, slicing the image by pixel parity to obtain 4 images, and splicing them along the channel direction;
step 4, inputting the spliced image into the backbone network of the YOLOX detection framework, wherein three branches are connected to the backbone network, the three branches correspond to different receptive fields, and the three receptive fields can cover targets of different sizes;
step 5, each branch comprising 2 parts, a multi-spatial relationship perception module and a detection head, wherein the multi-spatial relationship perception module effectively fuses global information and local information by combining the relationships between features in different spatial dimensions to obtain a multi-spatial relationship perception feature map;
the workflow of the multi-spatial relationship perception module is as follows:
the feature map X input into the multi-spatial relationship perception module has dimensions H × W × C, where H is the height, W the width and C the number of channels;
(1) constructing the relation feature map of the H × W space;
within the H × W spatial range, the feature map X is decomposed into H × W feature vectors of length C; the relation information for mapping feature vector x_i to feature vector x_j is denoted r_{i,j} and calculated as r_{i,j} = f_{H×W}(x_i, x_j), wherein ψ_{H×W} and φ_{H×W} are the 2 embedding functions from which f_{H×W} is built, each consisting of a 1 × 1 convolutional layer, a Batch Normalization layer and a ReLU activation layer; correspondingly, the relation information for mapping feature vector x_j to feature vector x_i is r_{j,i} = f_{H×W}(x_j, x_i), and the pair (r_{i,j}, r_{j,i}) describes the bidirectional relationship between feature vectors x_i and x_j; for a single direction, the relation information among all the feature vectors is calculated and stacked to obtain an affinity matrix M whose number of channels is H × W, so that the bidirectional relationship yields two different affinity matrices M_1 and M_2, deeply mining the local information of the features;
the original global structure information is retained; specifically, after a 1 × 1 convolution is applied to the original feature map X, a global average pooling operation is performed along the channel direction to obtain a global structural feature map F, and the global structural feature map F and the two affinity matrices are connected in series to obtain a feature matrix Y according to the formula Y = [pool(θ_{H×W}(X)), θ_{H×W}(M_1), θ_{H×W}(M_2)], wherein pool denotes global average pooling and θ_{H×W} is an embedding function consisting of a 1 × 1 convolutional layer, a Batch Normalization layer and a ReLU activation layer, differing from ψ_{H×W} and φ_{H×W} only in the number of output activation nodes; the feature matrix Y is then passed through a 1 × 1 convolution to fuse all the global and local information it contains, so as to obtain the relation feature map of the H × W space;
(2) constructing the relation feature map of the channel space C;
similarly, within the channel space range, the feature map X is decomposed into C feature vectors of length H × W; the relation information for mapping feature vector x_a to feature vector x_b is r_{a,b} = f_C(x_a, x_b), wherein the embedding functions ψ_C and φ_C are consistent with ψ_{H×W} and φ_{H×W}, differing only in output dimensions; affinity matrices are obtained by the same calculation method as in step 5(1), i.e. the bidirectional relationship yields two different affinity matrices M'_1 and M'_2;
after a 1 × 1 convolution of the original feature map X, global average pooling is performed over the H × W dimensions to obtain a structural feature map F'; the structural feature map and the two affinity matrices are connected in series to obtain a feature matrix Y', calculated as Y' = [pool(θ_C(X)), θ_C(M'_1), θ_C(M'_2)], wherein θ_C is consistent with θ_{H×W}, differing only in output dimensions; the feature matrix Y' is passed through a 1 × 1 convolution to fuse all the global and local information it contains, so as to obtain the relation feature map of the channel space C;
multiplying the relation feature maps of the H × W space and the channel space C element-wise to obtain the multi-spatial relationship perception feature map;
and step 6, feeding the multi-spatial relationship perception feature map into the detection head, wherein YOLOX decouples classification from coordinate localization, a 1 × 1 convolution reduces the channel dimension, and two lightweight branches are then attached, performing classification and regression respectively.
2. The pedestrian detection method based on multi-spatial relationship perception according to claim 1, wherein the data enhancement of step 2 comprises random horizontal flipping of the image, color dithering, multi-scale enhancement and the Mosaic coordinate enhancement method.
3. The pedestrian detection method based on multi-spatial relationship perception according to claim 1, wherein the receptive fields corresponding to the three branches in step 4 correspond to 8-fold, 16-fold and 32-fold downsampling respectively.
4. The pedestrian detection method based on multi-spatial relationship perception according to any one of claims 1 to 3, wherein, in the training stage, the classification loss function adopts cross entropy, the regression loss function adopts the GIoU loss, and an L1-norm penalty is applied to the predicted position information.
5. A pedestrian detection device based on multi-spatial relationship perception, characterized in that the device implements the pedestrian detection method based on multi-spatial relationship perception according to any one of claims 1 to 4.
6. The pedestrian detection device based on multi-spatial relationship perception according to claim 5, wherein the device comprises a data acquisition unit, a first data processing unit, a second data processing unit and a result acquisition unit;
the data acquisition unit is used for executing step 1 of the pedestrian detection method based on multi-spatial relationship perception according to any one of claims 1 to 4;
the first data processing unit is used for executing steps 2 and 3 of the pedestrian detection method based on multi-spatial relationship perception according to any one of claims 1 to 4;
the second data processing unit is used for executing steps 4 and 5 of the pedestrian detection method based on multi-spatial relationship perception according to any one of claims 1 to 4;
the result obtaining unit is used for executing step 6 of the pedestrian detection method based on multi-spatial relationship perception according to any one of claims 1 to 4.
7. A terminal device, characterized in that the terminal device comprises the pedestrian detection device based on multi-spatial relationship perception according to claim 5 or 6.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---
CN202111510823.5A CN114332919B (en) | 2021-12-11 | | Pedestrian detection method and device based on multi-spatial relationship sensing and terminal equipment
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---
CN202111510823.5A CN114332919B (en) | 2021-12-11 | | Pedestrian detection method and device based on multi-spatial relationship sensing and terminal equipment
Publications (2)
Publication Number | Publication Date |
---|---
CN114332919A (en) | 2022-04-12
CN114332919B (en) | 2024-10-29
Patent Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110796239A (en) * | 2019-10-30 | 2020-02-14 | 福州大学 | Deep learning target detection method based on channel and space fusion perception |
CN111369543A (en) * | 2020-03-07 | 2020-07-03 | 北京工业大学 | Rapid pollen particle detection algorithm based on dual self-attention module |
CN112733693A (en) * | 2021-01-04 | 2021-04-30 | 武汉大学 | Multi-scale residual error road extraction method for global perception high-resolution remote sensing image |
CN113505640A (en) * | 2021-05-31 | 2021-10-15 | 东南大学 | Small-scale pedestrian detection method based on multi-scale feature fusion |
CN113567984A (en) * | 2021-07-30 | 2021-10-29 | 长沙理工大学 | Method and system for detecting artificial small target in SAR image |
Non-Patent Citations (1)
Title |
---|
NIE Wei; CAO Yue; ZHU Dongxue; ZHU Yixuan; HUANG Linyi: "Behavior recognition algorithm based on edge-aware learning network under complex surveillance background", Computer Applications and Software, no. 08, 12 August 2020 (2020-08-12) *
Cited By (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---
CN114663861A (en) * | 2022-05-17 | 2022-06-24 | 山东交通学院 | Vehicle re-identification method based on dimension decoupling and non-local relation |
CN115082855A (en) * | 2022-06-20 | 2022-09-20 | 安徽工程大学 | Pedestrian occlusion detection method based on improved YOLOX algorithm |
CN115082855B (en) * | 2022-06-20 | 2024-07-12 | 安徽工程大学 | Pedestrian shielding detection method based on improved YOLOX algorithm |
CN115311690A (en) * | 2022-10-08 | 2022-11-08 | 广州英码信息科技有限公司 | End-to-end pedestrian structural information and dependency relationship detection method thereof |
CN115311690B (en) * | 2022-10-08 | 2022-12-23 | 广州英码信息科技有限公司 | End-to-end pedestrian structural information and dependency relationship detection method thereof |
Similar Documents
Publication | Publication Date | Title |
---|---|---
CN105488517B (en) | A kind of vehicle brand type identifier method based on deep learning | |
CN113344806A (en) | Image defogging method and system based on global feature fusion attention network | |
CN108803617A (en) | Trajectory predictions method and device | |
CN110728209A (en) | Gesture recognition method and device, electronic equipment and storage medium | |
CN111325165A (en) | Urban remote sensing image scene classification method considering spatial relationship information | |
CN116052016A (en) | Fine segmentation detection method for remote sensing image cloud and cloud shadow based on deep learning | |
CN105100640A (en) | Local registration parallel video stitching method and local registration parallel video stitching system | |
CN112489050A (en) | Semi-supervised instance segmentation algorithm based on feature migration | |
CN104850857A (en) | Trans-camera pedestrian target matching method based on visual space significant constraints | |
CN115035298A (en) | City streetscape semantic segmentation enhancement method based on multi-dimensional attention mechanism | |
CN111191704B (en) | Foundation cloud classification method based on task graph convolutional network | |
CN115861883A (en) | Multi-target detection tracking method | |
CN109919832A (en) | One kind being used for unpiloted traffic image joining method | |
CN114943893A (en) | Feature enhancement network for land coverage classification | |
CN112149526A (en) | Lane line detection method and system based on long-distance information fusion | |
CN114693966A (en) | Target detection method based on deep learning | |
CN111680640B (en) | Vehicle type identification method and system based on domain migration | |
CN117351360A (en) | Remote sensing image road extraction method based on attention mechanism improvement | |
CN114332919A (en) | Pedestrian detection method and device based on multi-spatial relationship perception and terminal equipment | |
CN113870129B (en) | Video rain removing method based on space perception and time difference learning | |
CN114037922B (en) | Aerial image segmentation method based on hierarchical context network | |
CN114332919B (en) | Pedestrian detection method and device based on multi-spatial relationship sensing and terminal equipment | |
CN115546667A (en) | Real-time lane line detection method for unmanned aerial vehicle scene | |
Xia et al. | Research on Traffic Accident Detection Based on Vehicle Perspective | |
CN111160255A (en) | Fishing behavior identification method and system based on three-dimensional convolutional network |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant |