CN114332919A - Pedestrian detection method and device based on multi-spatial relationship perception and terminal equipment

Info

Publication number
CN114332919A
CN114332919A (application CN202111510823.5A)
Authority
CN
China
Prior art keywords
spatial relationship
feature
pedestrian detection
relation
perception
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202111510823.5A
Other languages
Chinese (zh)
Inventor
Jiang Feng (姜峰)
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Nanjing Xingzheyi Intelligent Transportation Technology Co ltd
Original Assignee
Nanjing Xingzheyi Intelligent Transportation Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Nanjing Xingzheyi Intelligent Transportation Technology Co., Ltd.
Priority to CN202111510823.5A
Publication of CN114332919A
Legal status: Pending

Landscapes

  • Image Analysis (AREA)

Abstract

The invention discloses a pedestrian detection method, device and terminal equipment based on multi-spatial relationship perception. The method comprises the following steps: step 1, collecting a pedestrian image data set and resizing it to a fixed size for training a model; step 2, adopting the YOLOX detection framework, inputting the images into the framework model and first performing data enhancement on them; step 3, inputting the enhanced images into a Focus module, slicing each image by row/column parity to obtain 4 sub-images, and splicing them along the channel direction; step 4, inputting the spliced images into the backbone network of the YOLOX detection framework, where three branches are connected to the backbone network; and step 5, each branch comprises 2 parts, a multi-spatial relationship perception module and a detection head, wherein the multi-spatial relationship perception module effectively fuses global information and local information by combining the relationships between features in different spatial dimensions to obtain a multi-spatial relationship perception feature map. The method attends to global information while also extracting local information, and effectively fuses the two, thereby obtaining more discriminative feature information and improving pedestrian detection performance.

Description

Pedestrian detection method and device based on multi-spatial relationship perception and terminal equipment
Technical Field
The invention relates to the field of image recognition, in particular to pedestrian detection, and specifically to a pedestrian detection method, device and terminal equipment based on multi-spatial relationship perception.
Background
With the continuous development of smart city construction, many new artificial intelligence technologies have been applied to intelligent transportation, intelligent government affairs, intelligent factories and the like. All of these applications ultimately serve people, so pedestrian detection is a prerequisite for many of them. However, real scenes are often complex: bodies overlap in dense crowds, objects occlude pedestrians, illumination changes strongly, and severe weather (rain, snow, etc.) blurs the picture. These conditions increase the difficulty of pedestrian detection. There is therefore a need for a pedestrian detection technology that can mine deep, discriminative features from the pedestrian regions of an image, sufficient to characterize pedestrians in various environments.
In the process of implementing the invention, the inventor found at least the following problems in the prior art. Most popular pedestrian detection technologies are based on convolutional neural networks (CNNs), and most CNN pedestrian detection models use limited receptive fields, making it difficult to learn rich structural patterns that combine global information; for example, some use a CNN to detect and segment pedestrians to obtain the final position information, and others combine a CNN with feature fusion for pedestrian detection. Although some methods take different receptive fields into account, they do not combine global and local information well. In addition, some approaches enhance the learning capability of the model by stacking network depth, which consumes substantial resources in training and deployment.
Disclosure of Invention
In order to overcome the defects of the prior art, the invention provides a pedestrian detection method, device and terminal equipment based on multi-spatial relationship perception. The technical scheme is as follows:
the invention provides a pedestrian detection method based on multi-spatial relationship perception, which comprises the following steps:
Step 1, acquiring a pedestrian image data set and adjusting it to a fixed size for training a model.
Step 2, adopting the YOLOX detection framework, inputting the image into the framework model, and first performing data enhancement on the image.
Step 3, inputting the image after data enhancement into a Focus module, slicing the image by parity to obtain 4 sub-images, and splicing them along the channel direction.
Step 4, inputting the spliced image into the backbone network of the YOLOX detection framework, to which three branches are connected; the three branches correspond to different receptive fields, and the three receptive fields can cover targets of different sizes.
Step 5, each branch comprises 2 parts: a multi-spatial relationship perception module and a detection head. The multi-spatial relationship perception module effectively fuses global information and local information by combining the relationships between features in different spatial dimensions, obtaining a multi-spatial relationship perception feature map.
The workflow of the multi-spatial relationship perception module is as follows:
A feature map X input into the multi-spatial relationship perception module has dimensions H×W×C, where H is the height, W is the width, and C is the number of channels;
(1) Constructing a relation feature map over the H×W space;
Within the H×W spatial range, the feature map X is decomposed into H×W feature vectors of length C. The relation information that maps feature vector x_i to feature vector x_j is denoted r_{i,j} and is calculated as

r_{i,j} = f_{H×W}(x_i, x_j),

where the pairwise relation function f_{H×W} is built from two embedding functions ψ_{H×W} and φ_{H×W}, each consisting of a 1×1 convolutional layer, a Batch Normalization layer and a ReLU activation layer. Correspondingly, the relation information that maps feature vector x_j to feature vector x_i is r_{j,i} = f_{H×W}(x_j, x_i), so the pair (r_{i,j}, r_{j,i}) describes the bidirectional relationship between feature vectors x_i and x_j. For each direction, the relation information among all feature vectors is computed and stacked to obtain an affinity matrix with H×W channels; the bidirectional relationship therefore yields two different affinity matrices M_1 and M_2, which deeply mine the local information of the features.

The original global structure information is also retained: specifically, a 1×1 convolution is applied to the original feature map X, followed by a global average pooling operation along the channel direction, giving a global structure feature map F. The global structure feature map F and the two affinity matrices are concatenated to obtain a feature matrix Y:

Y = [pool(θ_{H×W}(X)), θ_{H×W}(M_1), θ_{H×W}(M_2)],

where pool denotes global average pooling and θ_{H×W}, like ψ_{H×W} and φ_{H×W}, consists of a 1×1 convolutional layer, a Batch Normalization layer and a ReLU activation layer, but with a different number of output activation nodes. The feature matrix Y is passed through a 1×1 convolution to fuse all of the global and local information it contains, yielding the relation feature map of the H×W space.
(2) Constructing a relation feature map over the channel space C;

Similarly, within the channel space range, the feature map X is decomposed into C feature vectors of length H×W. The relation information that maps feature vector x_a to feature vector x_b is

r_{a,b} = f_C(x_a, x_b),

where the embedding functions ψ_C and φ_C are consistent with ψ_{H×W} and φ_{H×W} and differ only in output dimensions. Using the same calculation as in step 5(1), the bidirectional relationship yields two different affinity matrices M'_1 and M'_2.

A 1×1 convolution is applied to the original feature map X, followed by global average pooling over the H×W dimensions, giving a structure feature map F'. The structure feature map and the two affinity matrices are concatenated to obtain a feature matrix Y':

Y' = [pool(θ_C(X)), θ_C(M'_1), θ_C(M'_2)],

where θ_C is consistent with θ_{H×W} and differs only in output dimensions. The feature matrix Y' is passed through a 1×1 convolution to fuse all of the global and local information it contains, yielding the relation feature map of the channel space C.

The relation feature maps of the H×W space and the channel space C are multiplied element-wise to obtain the multi-spatial relationship perception feature map.
Step 6, feeding the multi-spatial relationship perception feature map into a detection head. YOLOX decouples classification from coordinate positioning: a 1×1 convolution reduces the channel dimension, and two lightweight branches are then connected to perform classification and regression respectively.
Preferably, the data enhancement in step 2 comprises random horizontal flipping of the image, color dithering, multi-scale enhancement and the mosaic enhancement method.
Preferably, the receptive fields corresponding to the three branches in step 4 correspond to 8-times, 16-times and 32-times downsampling respectively.
Preferably, in the training phase, the classification loss function adopts cross entropy, the regression loss function adopts the GIoU loss, and an L1-norm penalty is imposed on the predicted position information.
Compared with the prior art, the technical scheme has the following beneficial effects: through the multi-spatial relationship perception module, the relationships between features are deeply mined in different spatial dimensions; the model attends to global information while also extracting local information, and effectively fuses the two, establishing relationships between the feature information of different spaces and the relation information between features, so that the model learns more distinctive and discriminative features, thereby obtaining more discriminative feature information and improving pedestrian detection accuracy.
Drawings
Fig. 1 is a flowchart of a multi-spatial relationship perception module according to an embodiment of the present disclosure.
Detailed Description
In order to clarify the technical solution and working principle of the present invention, the embodiments of the present disclosure are described in further detail below with reference to the accompanying drawings. The optional technical solutions described may be combined arbitrarily to form optional embodiments of the present disclosure, and details are not repeated herein.
The terms "step 1," "step 2," "step 3," and the like in the description and claims of this application and the above-described drawings are used for distinguishing between similar elements and not necessarily for describing a particular sequential or chronological order. It should be understood that the data so used may be interchanged under appropriate circumstances such that the embodiments of the application described herein may be practiced in sequences other than those described herein.
In a first aspect: the embodiment of the disclosure provides a pedestrian detection method based on multi-spatial relationship perception, which comprises the following steps:
step 1, acquiring a pedestrian image data set, and adjusting the pedestrian image data set to a fixed size to train a model.
Step 2, the YOLOX detection framework is adopted; the framework has a simple structure, does not require manually set anchor boxes, and is convenient to train and deploy. The image is input into the framework model and data enhancement is performed on it first. Preferably, the data enhancement in step 2 includes random horizontal flipping of the image, color dithering, multi-scale enhancement, the mosaic enhancement method and the like, so as to enlarge the training set and improve the generalization capability of the model, as illustrated by the sketch below.
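As a concrete illustration of the first two of these enhancements, a minimal sketch using torchvision-style transforms is given below; the mosaic and multi-scale enhancements are normally implemented inside the detection data loader (YOLOX ships its own) and are only indicated by comments here. All parameter values are illustrative assumptions, not values fixed by the patent.

```python
import torchvision.transforms as T

# A hedged sketch of the image-level augmentations named above. In a real
# detection pipeline a horizontal flip must also flip the box coordinates,
# and mosaic / multi-scale enhancement are applied at the data-loader level.
augment = T.Compose([
    T.RandomHorizontalFlip(p=0.5),               # random horizontal flipping
    T.ColorJitter(brightness=0.4, contrast=0.4,
                  saturation=0.4, hue=0.1),      # color dithering
    T.ToTensor(),
])
```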
Step 3, the image after data enhancement is input into a Focus module, which slices the image by row/column parity into 4 sub-images and then splices them along the channel direction. The Focus module thereby downsamples without increasing the amount of computation while retaining more complete image information, as shown in the sketch below.
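The parity slicing performed by the Focus module can be sketched as follows (a minimal PyTorch illustration; the real YOLOX Focus module additionally applies a convolution after the concatenation):

```python
import torch

def focus_slice(x: torch.Tensor) -> torch.Tensor:
    """Parity slicing: (B, C, H, W) -> (B, 4C, H/2, W/2).

    Every second pixel is taken along rows and columns (even/odd combinations),
    giving 4 sub-images that are spliced along the channel direction, i.e.
    downsampling by 2 without discarding any pixel information.
    """
    return torch.cat([
        x[..., ::2, ::2],    # even rows, even columns
        x[..., 1::2, ::2],   # odd rows,  even columns
        x[..., ::2, 1::2],   # even rows, odd columns
        x[..., 1::2, 1::2],  # odd rows,  odd columns
    ], dim=1)

# Example: a 640x640 RGB image becomes 12 channels at 320x320.
print(focus_slice(torch.randn(1, 3, 640, 640)).shape)  # torch.Size([1, 12, 320, 320])
```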
Step 4, the spliced image is input into the backbone network of the YOLOX detection framework, to which three branches are connected; the three branches correspond to different receptive fields, and together the three receptive fields can cover targets of different sizes. Preferably, the receptive fields corresponding to the three branches in step 4 correspond to 8-times, 16-times and 32-times downsampling respectively.
Step 5, each branch comprises 2 parts: a multi-spatial relationship perception module and a detection head. The multi-spatial relationship perception module effectively fuses global information and local information by combining the relationships between features in different spatial dimensions, obtaining a multi-spatial relationship perception feature map.
Fig. 1 is a flowchart of the multi-spatial relationship perception module; with reference to the figure, its workflow is as follows:
A feature map X input into the multi-spatial relationship perception module has dimensions H×W×C, where H is the height, W is the width, and C is the number of channels;
(1) Constructing a relation feature map over the H×W space;
Within the H×W spatial range, the feature map X is decomposed into H×W feature vectors of length C. The relation information that maps feature vector x_i to feature vector x_j is denoted r_{i,j} and is calculated as

r_{i,j} = f_{H×W}(x_i, x_j),

where the pairwise relation function f_{H×W} is built from two embedding functions ψ_{H×W} and φ_{H×W}, each consisting of a 1×1 convolutional layer, a Batch Normalization layer and a ReLU activation layer. Correspondingly, the relation information that maps feature vector x_j to feature vector x_i is r_{j,i} = f_{H×W}(x_j, x_i), so the pair (r_{i,j}, r_{j,i}) describes the bidirectional relationship between feature vectors x_i and x_j. For each direction, the relation information among all feature vectors is computed and stacked to obtain an affinity matrix with H×W channels; the bidirectional relationship therefore yields two different affinity matrices M_1 and M_2, which deeply mine the local information of the features.

In order to simultaneously exploit the global information of the features, the original global structure information needs to be retained: specifically, a 1×1 convolution is applied to the original feature map X, followed by a global average pooling operation along the channel direction, giving a global structure feature map F. The global structure feature map F and the two affinity matrices are concatenated to obtain a feature matrix Y:

Y = [pool(θ_{H×W}(X)), θ_{H×W}(M_1), θ_{H×W}(M_2)],

where pool denotes global average pooling and θ_{H×W}, like ψ_{H×W} and φ_{H×W}, consists of a 1×1 convolutional layer, a Batch Normalization layer and a ReLU activation layer, but with a different number of output activation nodes. The feature matrix Y is passed through a 1×1 convolution to fuse all of the global and local information it contains, yielding the relation feature map of the H×W space, as sketched below.
(2) Constructing a relation feature map over the channel space C;

Similarly, within the channel space range, the feature map X is decomposed into C feature vectors of length H×W. The relation information that maps feature vector x_a to feature vector x_b is

r_{a,b} = f_C(x_a, x_b),

where the embedding functions ψ_C and φ_C are consistent with ψ_{H×W} and φ_{H×W} and differ only in output dimensions. Using the same calculation as in step 5(1), the bidirectional relationship yields two different affinity matrices M'_1 and M'_2.

Unlike the structure feature map F obtained above, here a 1×1 convolution is applied to the original feature map X followed by global average pooling over the H×W dimensions, giving a structure feature map F'. The structure feature map and the two affinity matrices are concatenated to obtain a feature matrix Y':

Y' = [pool(θ_C(X)), θ_C(M'_1), θ_C(M'_2)],

where θ_C is consistent with θ_{H×W} and differs only in output dimensions. The feature matrix Y' is passed through a 1×1 convolution to fuse all of the global and local information it contains, yielding the relation feature map of the channel space C.

The relation feature maps of the H×W space and the channel space C are multiplied element-wise to obtain the multi-spatial relationship perception feature map. This map contains the global and local information of the features in different spatial dimensions, fully fused, which improves the effectiveness and discriminative power of the features; a combined sketch is given below.
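Under the same assumptions, the channel-space branch treats each channel as a vector of length H×W, and the two relation feature maps are then combined by element-wise multiplication; broadcasting a (B, 1, H, W) spatial map against a (B, C, 1, 1) channel map is one plausible reading of that product, not the only one. The sketch reuses conv_bn_relu from the previous listing, and Linear layers stand in for the θ_C embedding on the C×C affinity matrices.

```python
import torch
import torch.nn as nn

class ChannelRelationBranch(nn.Module):
    """Relation feature map over the channel space C (illustrative sketch).

    Each channel is treated as a feature vector of length HxW; the output is
    assumed to be one value per channel, shaped (B, C, 1, 1).
    """

    def __init__(self, channels: int, embed: int = 32):
        super().__init__()
        self.psi = conv_bn_relu(channels, channels)        # embedding for x_a
        self.phi = conv_bn_relu(channels, channels)        # embedding for x_b
        self.theta_x = conv_bn_relu(channels, channels)    # 1x1 conv before HxW pooling -> F'
        self.theta_m = nn.Sequential(nn.Linear(channels, embed),   # stands in for theta_C on M'
                                     nn.ReLU(inplace=True))
        self.fuse = nn.Linear(1 + 2 * embed, 1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, c, h, w = x.shape
        v1 = self.psi(x).flatten(2)                        # (B, C, HW): one vector per channel
        v2 = self.phi(x).flatten(2)                        # (B, C, HW)
        rel = torch.bmm(v1, v2.transpose(1, 2)) / (h * w)  # (B, C, C): r_{a,b} (assumed dot product)
        m1, m2 = rel, rel.transpose(1, 2)                  # affinity matrices M'_1 and M'_2
        f = self.theta_x(x).mean(dim=(2, 3)).unsqueeze(-1) # structure map F': (B, C, 1)
        y = torch.cat([f, self.theta_m(m1), self.theta_m(m2)], dim=-1)  # feature matrix Y'
        return self.fuse(y).view(b, c, 1, 1)               # channel relation feature map

# Multi-spatial relationship perception feature map: element-wise product of the
# two relation maps; (B, 1, H, W) * (B, C, 1, 1) broadcasts to (B, C, H, W).
def multi_spatial_relation_map(x, spatial_branch, channel_branch):
    return spatial_branch(x) * channel_branch(x)
```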
Step 6, the multi-spatial relationship perception feature map is fed into a detection head. Unlike the traditional YOLO-series detection head, which couples classification and coordinate positioning together during training, YOLOX decouples classification from coordinate positioning: a 1×1 convolution reduces the channel dimension, and two lightweight branches are then connected to perform classification and regression respectively, which effectively improves the convergence speed of the model. A hedged sketch of such a decoupled head follows.
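A minimal sketch of the decoupled head described above; the hidden width, the single-class setting and the branch depth are assumptions for illustration (the real YOLOX head uses a few more convolutions and a separate objectness output).

```python
import torch.nn as nn

class DecoupledHead(nn.Module):
    """1x1 conv reduces the channel dimension, then two lightweight branches
    perform classification and box regression separately."""

    def __init__(self, in_channels: int, num_classes: int = 1, hidden: int = 128):
        super().__init__()
        self.stem = nn.Sequential(nn.Conv2d(in_channels, hidden, kernel_size=1),
                                  nn.BatchNorm2d(hidden),
                                  nn.ReLU(inplace=True))
        self.cls_branch = nn.Conv2d(hidden, num_classes, kernel_size=1)  # per-location class scores
        self.reg_branch = nn.Conv2d(hidden, 4, kernel_size=1)            # per-location box offsets

    def forward(self, x):
        x = self.stem(x)
        return self.cls_branch(x), self.reg_branch(x)
```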
Preferably, in the training phase, the classification loss function adopts cross entropy, the regression loss function adopts the GIoU loss, and an L1-norm penalty is imposed on the predicted position information; an illustrative loss sketch follows.
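As a worked illustration of these losses (taking the GIoU loss as 1 − GIoU, where GIoU = IoU − |C \ (A ∪ B)| / |C| for the smallest enclosing box C), the following sketch assumes predictions have already been matched to ground-truth boxes; torchvision's generalized_box_iou provides the pairwise GIoU.

```python
import torch.nn.functional as F
from torchvision.ops import generalized_box_iou

def detection_loss(cls_logits, cls_targets, pred_boxes, gt_boxes, l1_weight=1.0):
    """Illustrative training loss: cross entropy for classification, GIoU loss
    for regression, and an L1-norm penalty on the predicted box coordinates.

    cls_logits: (N, num_classes); cls_targets: (N,) class indices;
    pred_boxes / gt_boxes: (N, 4) in (x1, y1, x2, y2) format, already matched.
    """
    cls_loss = F.cross_entropy(cls_logits, cls_targets)
    giou = generalized_box_iou(pred_boxes, gt_boxes).diagonal()  # matched pairs lie on the diagonal
    giou_loss = (1.0 - giou).mean()
    l1_loss = F.l1_loss(pred_boxes, gt_boxes)                    # L1 penalty on position information
    return cls_loss + giou_loss + l1_weight * l1_loss
```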
In a second aspect, the disclosed embodiments provide a pedestrian detection apparatus based on multi-spatial relationship perception, which may implement or execute a pedestrian detection method based on multi-spatial relationship perception according to any one of all possible implementation manners based on the same technical concept.
Preferably, the device comprises a data acquisition unit, a first data processing unit, a second data processing unit and a result acquisition unit;
the data acquisition unit is used for executing the step 1 of the pedestrian detection method based on multi-spatial relationship perception in any one of all possible implementation manners;
the first data processing unit is used for executing the steps of step 2 and step 3 of the pedestrian detection method based on multi-spatial relationship perception in any one of all possible implementation manners;
the second data processing unit is used for executing the steps of step 4 and step 5 of the pedestrian detection method based on multi-spatial relationship perception in any one of all possible implementation manners;
the result obtaining unit is configured to execute the step of step 6 of the pedestrian detection method based on multi-spatial relationship perception according to any one of all possible implementation manners.
It should be noted that, when the pedestrian detection apparatus based on multi-spatial relationship perception provided in the foregoing embodiment executes the pedestrian detection method based on multi-spatial relationship perception, only the division into the above functional modules is illustrated; in practical applications, the functions may be distributed among different functional modules as needed, that is, the internal structure of the device may be divided into different functional modules to complete all or part of the functions described above. In addition, the embodiment of the pedestrian detection apparatus based on multi-spatial relationship perception and the embodiment of the pedestrian detection method based on multi-spatial relationship perception belong to the same concept, and their specific implementation processes are detailed in the method embodiments and are not repeated herein.
In a third aspect, an embodiment of the present disclosure provides a terminal device, where the terminal device includes any one of all possible implementation manners of the pedestrian detection apparatus based on multi-spatial relationship perception.
The invention has been described above by way of example with reference to the accompanying drawings. It should be understood that the invention is not limited to the specific embodiments described above; numerous insubstantial modifications made in accordance with the principles and solutions of the invention, as well as direct applications of the conception and technical scheme of the invention to other occasions without improvement, all fall within the protection scope of the invention.

Claims (7)

1. A pedestrian detection method based on multi-spatial relationship perception is characterized by comprising the following steps:
step 1, acquiring a pedestrian image data set, adjusting the pedestrian image data set to a fixed size, and training a model;
step 2, inputting the image into the framework model by adopting the YOLOX detection framework, and first performing data enhancement on the image;
step 3, inputting the image after data enhancement into a Focus module, slicing the image by parity to obtain 4 sub-images, and then splicing them along the channel direction;
step 4, inputting the spliced image into the backbone network of the YOLOX detection framework, wherein three branches are connected with the backbone network, the three branches respectively correspond to different receptive fields, and the three receptive fields can cover targets of different sizes;
step 5, each branch comprises 2 parts, a multi-spatial relationship perception module and a detection head, wherein the multi-spatial relationship perception module effectively fuses global information and local information together by combining the relationship between the characteristics in different spatial dimensions to obtain a multi-spatial relationship perception characteristic diagram;
the workflow of the multi-spatial relationship perception module is as follows:
the feature map X input into the multi-spatial relationship perception module has dimensions H×W×C, where H is the height, W is the width, and C is the number of channels;
(1) constructing a relation feature map over the H×W space;
within the H×W spatial range, the feature map X is decomposed into H×W feature vectors of length C, and the relation information that maps feature vector x_i to feature vector x_j is denoted r_{i,j} and is calculated as

r_{i,j} = f_{H×W}(x_i, x_j),

wherein the pairwise relation function f_{H×W} is built from two embedding functions ψ_{H×W} and φ_{H×W}, each consisting of a 1×1 convolutional layer, a Batch Normalization layer and a ReLU activation layer; correspondingly, the relation information that maps feature vector x_j to feature vector x_i is r_{j,i} = f_{H×W}(x_j, x_i), so that (r_{i,j}, r_{j,i}) describes the bidirectional relationship between feature vectors x_i and x_j; for each direction, the relation information among all feature vectors is calculated and stacked to obtain an affinity matrix with H×W channels, so that the bidirectional relationship yields two different affinity matrices M_1 and M_2, which deeply mine the local information of the features;

the original global structure information is retained, specifically, a 1×1 convolution is applied to the original feature map X followed by a global average pooling operation along the channel direction to obtain a global structure feature map F; the global structure feature map F and the two affinity matrices are connected in series to obtain a feature matrix Y:

Y = [pool(θ_{H×W}(X)), θ_{H×W}(M_1), θ_{H×W}(M_2)],

wherein pool denotes global average pooling and θ_{H×W}, like ψ_{H×W} and φ_{H×W}, consists of a 1×1 convolutional layer, a Batch Normalization layer and a ReLU activation layer but with a different number of output activation nodes; the feature matrix Y is passed through a 1×1 convolution to fuse all of the global and local information it contains, thereby obtaining the relation feature map of the H×W space;
(2) constructing a relation feature map over the channel space C;

similarly, within the channel space range, the feature map X is decomposed into C feature vectors of length H×W, and the relation information that maps feature vector x_a to feature vector x_b is

r_{a,b} = f_C(x_a, x_b),

wherein the embedding functions ψ_C and φ_C are consistent with ψ_{H×W} and φ_{H×W} and differ only in output dimensions; using the same calculation as in step 5(1), the bidirectional relationship yields two different affinity matrices M'_1 and M'_2;

a 1×1 convolution is applied to the original feature map X followed by global average pooling over the H×W dimensions to obtain a structure feature map F'; the structure feature map and the two affinity matrices are connected in series to obtain a feature matrix Y':

Y' = [pool(θ_C(X)), θ_C(M'_1), θ_C(M'_2)],

wherein θ_C is consistent with θ_{H×W} and differs only in output dimensions; the feature matrix Y' is passed through a 1×1 convolution to fuse all of the global and local information it contains, thereby obtaining the relation feature map of the channel space C;

the relation feature maps of the H×W space and the channel space C are multiplied element-wise to obtain the multi-spatial relationship perception feature map;
and step 6, feeding the multi-spatial relationship perception feature map into a detection head, wherein YOLOX decouples classification from coordinate positioning, a 1×1 convolution reduces the channel dimension, and two lightweight branches are then connected to perform classification and regression respectively.
2. The pedestrian detection method based on multi-spatial relationship perception according to claim 1, wherein the step 2 data enhancement comprises random horizontal flipping of the image, color dithering, multi-scale enhancement and the mosaic enhancement method.
3. The pedestrian detection method based on multi-spatial relationship perception according to claim 1, wherein in step 4, the receptive fields corresponding to the three branches are respectively 8 times, 16 times and 32 times of downsampling.
4. The pedestrian detection method based on multi-spatial relationship perception according to any one of claims 1-3, wherein in the training stage, the classification loss function adopts cross entropy, the regression loss function adopts GIOU loss, and penalty is applied to the acquired position information by using L1 norm.
5. A pedestrian detection device based on multi-spatial relationship perception is characterized in that the device can realize a pedestrian detection method based on multi-spatial relationship perception according to any one of claims 1 to 4.
6. The pedestrian detection device based on multi-spatial relationship perception according to claim 5, wherein the device comprises a data acquisition unit, a first data processing unit, a second data processing unit and a result acquisition unit;
the data acquisition unit is used for executing the step 1 of the pedestrian detection method based on multi-spatial relationship perception according to any one of claims 1 to 4;
the first data processing unit is used for executing the steps of step 2 and step 3 of the pedestrian detection method based on multi-spatial relationship perception according to any one of claims 1 to 4;
the second data processing unit is used for executing the steps of step 4 and step 5 of the pedestrian detection method based on multi-spatial relationship perception according to any one of claims 1 to 4;
the result obtaining unit is used for executing the step of step 6 of the pedestrian detection method based on multi-spatial relationship perception according to any one of claims 1 to 4.
7. A terminal device, characterized in that the terminal device comprises a pedestrian detection apparatus based on multi-spatial relationship perception as claimed in any one of claims 5 or 6.
CN202111510823.5A 2021-12-11 2021-12-11 Pedestrian detection method and device based on multi-spatial relationship perception and terminal equipment Pending CN114332919A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111510823.5A CN114332919A (en) 2021-12-11 2021-12-11 Pedestrian detection method and device based on multi-spatial relationship perception and terminal equipment

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202111510823.5A CN114332919A (en) 2021-12-11 2021-12-11 Pedestrian detection method and device based on multi-spatial relationship perception and terminal equipment

Publications (1)

Publication Number Publication Date
CN114332919A true CN114332919A (en) 2022-04-12

Family

ID=81050935

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111510823.5A Pending CN114332919A (en) 2021-12-11 2021-12-11 Pedestrian detection method and device based on multi-spatial relationship perception and terminal equipment

Country Status (1)

Country Link
CN (1) CN114332919A (en)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114663861A (en) * 2022-05-17 2022-06-24 山东交通学院 Vehicle re-identification method based on dimension decoupling and non-local relation
CN115311690A (en) * 2022-10-08 2022-11-08 广州英码信息科技有限公司 End-to-end pedestrian structural information and dependency relationship detection method thereof
CN115311690B (en) * 2022-10-08 2022-12-23 广州英码信息科技有限公司 End-to-end pedestrian structural information and dependency relationship detection method thereof

Similar Documents

Publication Publication Date Title
CN101714262B (en) Method for reconstructing three-dimensional scene of single image
CN110147794A (en) A kind of unmanned vehicle outdoor scene real time method for segmenting based on deep learning
CN108803617A (en) Trajectory predictions method and device
CN110728209A (en) Gesture recognition method and device, electronic equipment and storage medium
CN114332919A (en) Pedestrian detection method and device based on multi-spatial relationship perception and terminal equipment
CN113344806A (en) Image defogging method and system based on global feature fusion attention network
CN111325165A (en) Urban remote sensing image scene classification method considering spatial relationship information
CN105100640A (en) Local registration parallel video stitching method and local registration parallel video stitching system
CN112489050A (en) Semi-supervised instance segmentation algorithm based on feature migration
CN106228121A (en) Gesture feature recognition methods and device
CN116052016A (en) Fine segmentation detection method for remote sensing image cloud and cloud shadow based on deep learning
CN104850857A (en) Trans-camera pedestrian target matching method based on visual space significant constraints
CN115035298A (en) City streetscape semantic segmentation enhancement method based on multi-dimensional attention mechanism
CN111191704B (en) Foundation cloud classification method based on task graph convolutional network
CN114943893A (en) Feature enhancement network for land coverage classification
CN109919832A (en) One kind being used for unpiloted traffic image joining method
CN113066089A (en) Real-time image semantic segmentation network based on attention guide mechanism
CN112149526A (en) Lane line detection method and system based on long-distance information fusion
CN114693966A (en) Target detection method based on deep learning
CN111680640B (en) Vehicle type identification method and system based on domain migration
CN116503602A (en) Unstructured environment three-dimensional point cloud semantic segmentation method based on multi-level edge enhancement
CN115546667A (en) Real-time lane line detection method for unmanned aerial vehicle scene
CN113870129B (en) Video rain removing method based on space perception and time difference learning
CN114037922A (en) Aerial image segmentation method based on hierarchical context network
Shashank et al. Behavior Cloning for Self Driving Cars using Attention Models

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination