CN112966546A - Embedded attitude estimation method based on unmanned aerial vehicle scout image

Embedded attitude estimation method based on unmanned aerial vehicle scout image

Info

Publication number
CN112966546A
Authority
CN
China
Prior art keywords
aerial vehicle
unmanned aerial
scout image
network
data set
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202110004413.7A
Other languages
Chinese (zh)
Inventor
姜梁
马祥森
吴国强
钱宇浛
孙浩惠
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
China Spaceflight Electronic Technology Research Institute
Aerospace Times Feihong Technology Co ltd
China Academy of Aerospace Electronics Technology Co Ltd
Original Assignee
China Spaceflight Electronic Technology Research Institute
Aerospace Times Feihong Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by China Spaceflight Electronic Technology Research Institute and Aerospace Times Feihong Technology Co ltd
Priority to CN202110004413.7A
Publication of CN112966546A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 20/00 Scenes; Scene-specific elements
    • G06V 20/40 Scenes; Scene-specific elements in video content
    • G06V 20/46 Extracting features or characteristics from the video content, e.g. video fingerprints, representative shots or key frames
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 18/00 Pattern recognition
    • G06F 18/20 Analysing
    • G06F 18/25 Fusion techniques
    • G06F 18/253 Fusion techniques of extracted features
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/04 Architecture, e.g. interconnection topology
    • G06N 3/045 Combinations of networks
    • G06N 3/08 Learning methods

Abstract

The invention discloses an embedded attitude estimation method based on unmanned aerial vehicle scout images, belonging to the fields of image processing and machine vision. The method specifically comprises: acquiring an original unmanned aerial vehicle scout image data set and performing data enhancement processing on it; labeling the data set to obtain a labeled training data set; constructing a lightweight multi-stage hourglass network and training it with the training data set; and inputting an unmanned aerial vehicle scout image to be processed, preprocessing it, feeding it into the trained lightweight attitude estimation network to obtain a portrait feature map, and estimating the portrait attitude from the feature map. The technical scheme balances algorithm performance with deployment adaptability and solves several attitude estimation problems of unmanned aerial vehicle video processing systems.

Description

Embedded attitude estimation method based on unmanned aerial vehicle scout image
Technical Field
The invention relates to the fields of image processing and machine vision, and in particular to an embedded hourglass network for estimating the posture of small ground targets in unmanned aerial vehicle aerial video.
Background
In recent years, unmanned aerial vehicles have played an irreplaceable role as a new combat force under intelligent combat conditions; vigorously developing unmanned aerial vehicle equipment technology is of great strategic significance for improving the combat capability of troops. Attitude estimation is one of the key technologies for unmanned aerial vehicles executing reconnaissance and strike tasks, providing strong support for quickly and accurately identifying a target's intention, route of advance, and the like. An efficient and accurate attitude estimation algorithm can effectively reduce the burden on ground operators and improve reconnaissance capability and rapid-response combat efficiency.
Traditional algorithms for estimating the posture of small ground targets scouted by unmanned aerial vehicles mainly obtain the coordinates of human body key points through image processing, from which a human skeleton model or contour model is built to express human posture and behavior intuitively. Before 2015, human pose estimation methods aimed to regress the exact coordinates of the body's key points; because of the flexibility of human motion, however, these methods generalize very poorly.
The advantage of algorithms based on human posture estimation is that the human body is converted into a posture skeleton diagram or a contour diagram, which is concise and intuitive and suppresses background interference to a great extent. The disadvantage is that pose estimation is itself a relatively complex problem; because it serves as the front-end input to the posture detector, the detector's results are strongly affected by the quality of the pose estimate.
In unmanned aerial vehicle video image processing systems, attitude estimation for small ground targets currently faces the following problems:
1) Complex human body images force the model to learn a highly nonlinear mapping, which is extremely difficult. The main reasons are: first, human body images are shot in different scenes, with different shooting angles and illumination conditions; second, interactions between people and objects, and among people themselves, cause random occlusion; finally, different clothing and body types further increase the complexity of the mapping. Although posture estimation methods based on handcrafted features can accurately locate unoccluded joints given a fixed scene, a fixed viewing angle and stable illumination, such ideal conditions do not exist in real scenes. Therefore, how to extract more robust features and learn complex mappings through representation learning is a problem that human posture estimation must address.
2) A highly nonlinear mapping requires a more complex model to learn, and a more complex model incurs large computational overhead. How to guarantee model accuracy while accelerating the running speed of the posture estimation model is the key to making such methods practical.
Disclosure of Invention
In order to solve the above problems, the technical scheme of the invention provides an embedded attitude estimation algorithm based on unmanned aerial vehicle scout images, designed around the characteristics of unmanned aerial vehicle video images and the shortcomings of the domestic prior art in estimating the posture of small ground targets scouted by unmanned aerial vehicles. It balances algorithm performance with deployment adaptability and addresses several attitude estimation problems of unmanned aerial vehicle video processing systems, mainly including: 1) traditional attitude estimation is greatly affected by the foreground; 2) traditional deep learning models are large and difficult to deploy on embedded equipment; 3) feature extraction is inefficient and feature fusion is ineffective; 4) the detection process must run in real time.
According to a first aspect of the present invention, an embedded pose estimation method based on a scout image of an unmanned aerial vehicle is provided, where the method specifically includes:
step 1, acquiring an original unmanned aerial vehicle scout image data set, and performing data enhancement processing on the original unmanned aerial vehicle scout image data set;
step 2, performing labeling processing on the original unmanned aerial vehicle reconnaissance image data set obtained in the step 1 to obtain a training data set with a label;
step 3, constructing a lightweight multi-stage hourglass network, and training the lightweight multi-stage hourglass network by using the training data set;
and 4, inputting an unmanned aerial vehicle scout image to be processed, preprocessing the unmanned aerial vehicle scout image, inputting the preprocessed unmanned aerial vehicle scout image into the trained lightweight attitude estimation network to obtain a portrait feature map, and estimating the portrait attitude according to the portrait feature map.
Further, in step 1, the data enhancement processing includes dilation, erosion, and bilateral filtering operations.
Further, in step 2, the labeling processing for adding labels is realized by the image annotation tool LabelImg.
Further, in step 3, the lightweight attitude estimation network includes a convolutional layer, a pooling layer, a channel separation module, a multi-stage hourglass network formed by a plurality of pyramid residual modules (PRMs), and a channel mixing module.
Furthermore, the multi-stage hourglass network is a two-stage hourglass network and is composed of two pyramid residual modules.
Further, the convolutional layers are depth separable convolutional layers, including depth convolution processing and point convolution processing.
Further, the depth separable convolutional layer specifically operates as follows:
an ordinary convolution with convolution kernel K, M input channels and O output channels is divided into deep convolution processing and point convolution processing,
deep convolution processing: performing a K × K convolution operation on each input channel;
point convolution processing: linearly fusing the M features, the number of point convolutions being O,
wherein K, M, O are all positive integers.
Further, the channel separation module includes a plurality of feature channels.
Further, the multi-stage hourglass network is an eight-stage hourglass network.
Further, the hourglass network is composed of a plurality of pyramid residual modules.
Further, the step 4 specifically includes:
step 41: inputting an unmanned aerial vehicle scout image to be processed, and cropping it to obtain a reduced-size unmanned aerial vehicle scout image;
step 42: inputting the unmanned aerial vehicle scout image obtained in step 41 into the lightweight attitude estimation network, and obtaining a first feature map after pooling and convolution;
step 43: inputting the first feature map into the multi-stage hourglass network through the channel separation module, and outputting a plurality of second feature maps;
step 44: and inputting the plurality of second feature maps into a channel mixing module, performing feature fusion on the plurality of second feature maps, and outputting a portrait feature map, thereby estimating the portrait posture according to the portrait feature map.
Further, the second feature maps are low-resolution feature maps.
According to a second aspect of the invention, there is provided a computer readable storage medium having a computer program stored thereon, characterized in that the program, when executed by a processor, implements the steps of the method according to any of the above aspects.
According to a third aspect of the present invention there is provided a computer device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, wherein the steps of the method according to any aspect are implemented when the program is executed by the processor.
Compared with the prior art, the invention has the following advantages:
1) The invention runs efficiently: using only a GTX 1050 graphics card, it processes 1920 × 1080 video images in real time within 20 ms.
2) The invention replaces ordinary convolution with a depthwise separable network, further lightening the network while preserving the estimation effect.
3) In application, channels can be transmitted downward in groups, features extracted per group, and the channels reordered when the features are finally fused. This reduces the channel count during transmission while still propagating the image features of all parts, improving feature correlation and further improving the attitude estimation effect.
4) The invention fuses features by concatenation, strengthening the fusion between features so that each group of output channels includes all input features; this enhances information correlation and improves the efficiency of attitude estimation for small ground targets.
Drawings
In order to illustrate the embodiments of the present invention or the technical solutions in the prior art more clearly, the drawings used in the description of the embodiments or the prior art are briefly introduced below. The drawings described below are obviously only some embodiments of the present invention; those skilled in the art can obtain other drawings from them without creative effort.
FIG. 1 is a flow chart of an embedded attitude estimation method based on an unmanned aerial vehicle scout image according to the technical scheme of the invention;
FIG. 2 is a schematic diagram of a network model built up from a plurality of hourglass networks according to an aspect of the present invention;
fig. 3 is a schematic view of an hourglass network of light-weight PRMs of identical construction according to an embodiment of the present invention;
FIG. 4 is a schematic diagram of a depth separable convolution according to aspects of the present invention;
FIG. 5 is a schematic diagram of channel separation and recombination according to the present invention;
FIG. 6 is a diagram illustrating an original pyramid residual block according to an embodiment of the present invention;
fig. 7a and 7b are schematic views of a light-weight PRM according to an embodiment of the present invention;
fig. 8 is a diagram illustrating the lightweight network's detection results for the 16 key points of a detected person on the MPII dataset.
Detailed Description
Reference will now be made in detail to the exemplary embodiments, examples of which are illustrated in the accompanying drawings. When the following description refers to the accompanying drawings, like numbers in different drawings represent the same or similar elements unless otherwise indicated. The implementations described in the exemplary embodiments below are not intended to represent all implementations consistent with the present disclosure. Rather, they are merely examples of apparatus and methods consistent with certain aspects of the present disclosure, as detailed in the appended claims.
The terms "first," "second," and the like in the description and in the claims of the present disclosure are used for distinguishing between similar elements and not necessarily for describing a particular sequential or chronological order. It is to be understood that the data so used is interchangeable under appropriate circumstances such that the embodiments of the disclosure described herein are, for example, capable of operation in sequences other than those illustrated or otherwise described herein.
Furthermore, the terms "comprises," "comprising," and "having," and any variations thereof, are intended to cover a non-exclusive inclusion, such that a process, method, system, article, or apparatus that comprises a list of steps or elements is not necessarily limited to those steps or elements expressly listed, but may include other steps or elements not expressly listed or inherent to such process, method, article, or apparatus.
"A plurality" means two or more.
The term "and/or" as used in this disclosure merely describes an association between objects and indicates that three relationships may exist. For example, "A and/or B" may mean: A exists alone, A and B exist simultaneously, or B exists alone.
The technical scheme of the invention provides an embedded attitude estimation method based on unmanned aerial vehicle scout images, centered on a lightweight network model designed from the pyramid residual module. Depthwise separable convolution replaces ordinary convolution to reduce the training parameters, and channel separation and channel mixing modules are added to change the channel dimension of the feature maps and strengthen feature fusion. To ensure that the network can still extract features completely, only the identity-mapping part undergoes channel separation, and a channel mixing module is added at the final feature fusion. By adding depthwise separable convolution to the pyramid residual network and combining it with the channel separation and mixing modules, the network effectively reduces computation and storage while maintaining accuracy.
Multi-scale features are added on the basis of the pyramid residual module: features are extracted by convolution and then upsampled to the previous resolution for feature fusion.
Specifically, as shown in fig. 1, the following steps are included.
Step 101: acquiring an original unmanned aerial vehicle scout image data set, and performing data enhancement processing on it;
Step 102: labeling the data set obtained in step 101 to obtain a labeled training data set;
Step 103: constructing a lightweight multi-stage hourglass network, and training it with the training data set;
Step 104: inputting an unmanned aerial vehicle scout image to be processed, preprocessing it, feeding it into the trained lightweight attitude estimation network to obtain a portrait feature map, and estimating the portrait attitude from the feature map.
The following describes key technologies related to the technical solutions of the present invention in detail with reference to the drawings.
Pyramid residual network
The hourglass network detects human posture well, and several hourglass networks are stacked to refine the detection result continuously. Each hourglass network combines features at multiple resolutions; it is a modular network that applies residual modules for feature extraction several times at each stage. In the pyramid residual network based on the hourglass network, shown in fig. 2, the image passes through a 7 × 7 convolutional layer, a pooling layer and a PRM, reducing the resolution to 64 × 64, and then passes through each hourglass network in turn, each followed by a relay supervisor to prevent vanishing gradients. The structure of each hourglass network is shown in fig. 3: pooling layers reduce the resolution step by step, down to a lowest resolution of 4 × 4, after which features are extracted by pyramid residual modules, and multi-resolution features are continuously combined for effective attitude estimation. Since every module in the network is a pyramid residual module and the network is modular, the module itself is redesigned to be lightweight, changing its channel count and convolution mode and thereby improving the whole network.
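As a structural reference for the description above, the following is a minimal PyTorch sketch of a stacked hourglass network with relay supervision. The channel count, recursion depth and the plain residual block standing in for the pyramid residual module are simplifying assumptions, not the exact configuration of the invention.

    # Minimal stacked-hourglass sketch (PyTorch). The plain residual block,
    # channel count and depth are illustrative stand-ins, not the claimed design.
    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class Residual(nn.Module):
        """Plain residual block standing in for the pyramid residual module."""
        def __init__(self, ch):
            super().__init__()
            self.body = nn.Sequential(
                nn.BatchNorm2d(ch), nn.ReLU(inplace=True),
                nn.Conv2d(ch, ch, 3, padding=1),
                nn.BatchNorm2d(ch), nn.ReLU(inplace=True),
                nn.Conv2d(ch, ch, 3, padding=1),
            )
        def forward(self, x):
            return x + self.body(x)

    class Hourglass(nn.Module):
        """Recursive hourglass: pool down to the lowest resolution (4 x 4 for a
        64 x 64 input at depth 4), then upsample and merge the skip features."""
        def __init__(self, depth, ch):
            super().__init__()
            self.skip = Residual(ch)
            self.down = Residual(ch)
            self.inner = Hourglass(depth - 1, ch) if depth > 1 else Residual(ch)
            self.up = Residual(ch)
        def forward(self, x):
            y = self.down(F.max_pool2d(x, 2))
            y = self.up(self.inner(y))
            return self.skip(x) + F.interpolate(y, scale_factor=2)

    class StackedHourglass(nn.Module):
        """Stack of hourglasses; every stage emits heatmaps for intermediate
        ("relay") supervision so gradients do not vanish."""
        def __init__(self, stages=2, ch=256, joints=16):
            super().__init__()
            self.stages = nn.ModuleList(Hourglass(4, ch) for _ in range(stages))
            self.heads = nn.ModuleList(nn.Conv2d(ch, joints, 1) for _ in range(stages))
            self.remaps = nn.ModuleList(nn.Conv2d(joints, ch, 1) for _ in range(stages))
        def forward(self, x):
            heatmaps = []
            for hg, head, remap in zip(self.stages, self.heads, self.remaps):
                feat = hg(x)
                hm = head(feat)
                heatmaps.append(hm)
                x = x + feat + remap(hm)  # corrected features feed the next stage
            return heatmaps               # supervise every stage during training

    # e.g. StackedHourglass(stages=2)(torch.randn(1, 256, 64, 64))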
Designing a lightweight network:
depth separable convolution
The depth separable convolution is divided into two parts, deep convolution and point convolution, as shown in fig. 4. With convolution kernel K, M input channels and O output channels, the deep convolution performs a K × K convolution on each channel, and the point convolution then linearly fuses the M features, the number of point convolutions equaling the number of output channels.
For an input image of size Y × Z × M, the amount of computation through a common convolution is:
Y×Z×M×O×K×K (1)
the amount of computation through the depth separable convolution is:
Y×Z×M×O+Y×Z×M×K×K (2)
By comparison, when the convolution kernel is 3 × 3, the computation of each convolution is reduced to about 1/9 of the original, while this convolution mode still effectively combines the features of each channel.
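A minimal PyTorch sketch of this decomposition follows, together with the multiply-accumulate comparison of formulas (1) and (2). The class name and example dimensions are illustrative assumptions.

    # Depthwise separable convolution: a K x K deep convolution per input
    # channel (groups=M) followed by a 1 x 1 point convolution that linearly
    # fuses the M channels into O output channels.
    import torch
    import torch.nn as nn

    class DepthwiseSeparableConv(nn.Module):
        def __init__(self, m, o, k=3):
            super().__init__()
            self.depth = nn.Conv2d(m, m, k, padding=k // 2, groups=m, bias=False)
            self.point = nn.Conv2d(m, o, 1, bias=False)
        def forward(self, x):
            return self.point(self.depth(x))

    # Multiply-accumulate counts for a Y x Z x M input, matching (1) and (2):
    Y, Z, M, O, K = 64, 64, 256, 256, 3
    ordinary  = Y * Z * M * O * K * K               # formula (1)
    separable = Y * Z * M * O + Y * Z * M * K * K   # formula (2)
    print(separable / ordinary)   # = 1/K**2 + 1/O, about 1/9 when K = 3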
Channel separation and recombination
Channel separation and recombination are shown in fig. 5: in application, channels are transmitted in groups, features are extracted per group, and the channels are reordered when the features are finally fused. This reduces the channel count during transmission while still propagating the image features of all parts and improving their correlation.
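A minimal sketch of this separation and recombination follows, using the standard reshape-and-transpose channel shuffle; the two-group split is an illustrative assumption.

    # Channel separation and recombination: split channels into groups, extract
    # features per group, then shuffle so every output group mixes all inputs.
    import torch

    def channel_shuffle(x: torch.Tensor, groups: int) -> torch.Tensor:
        n, c, h, w = x.shape
        # (N, G, C/G, H, W) -> swap the group and channel axes -> flatten back
        x = x.view(n, groups, c // groups, h, w).transpose(1, 2).contiguous()
        return x.view(n, c, h, w)

    x = torch.randn(1, 8, 4, 4)
    a, b = x.chunk(2, dim=1)            # channel separation into two groups
    y = torch.cat([a, b], dim=1)        # per-group features fused by concatenation
    y = channel_shuffle(y, groups=2)    # reorder so the groups exchange information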
Lightweight PRM
Each module in the hourglass network is a pyramid residual module. As shown in fig. 6, multi-scale features are added on the basis of the residual module, with a customizable number of scales; after convolutional feature extraction, the features are upsampled to the previous resolution for fusion.
Based on the above analysis, the invention designs a lightweight pyramid residual module. As shown in figs. 7a and 7b, ordinary convolution is replaced with depth separable convolution. Experiments showed that although depth separable convolution reduces the parameter count and computation, it contributes little to computation speed, so the invention replaces only the convolution of the original-resolution branch with depth separable convolution. Meanwhile, a channel separation module is added at the beginning of the module; to let the network extract more features, the channel count of the feature-extraction branch is not reduced; instead, half of the channels are selected for the identity-mapping part and the features are fused by concatenation. Because direct fusion would leave half of the channels carrying less extracted-feature information, a channel recombination module is added afterwards to reorder the channels. This strengthens the fusion between features, so that each group of output channels includes all input features and information correlation is enhanced.
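The following is a minimal sketch of such a lightweight module, assuming a half-channel identity branch, a single depthwise separable convolution in the feature-extraction branch, fusion by concatenation and a two-group channel shuffle; the real module additionally carries the pyramid's multi-scale branches.

    # Lightweight pyramid-residual-module sketch: full-width feature branch with
    # depthwise separable convolution, half-width channel-separated identity
    # branch, concatenation, then channel recombination.
    import torch
    import torch.nn as nn

    def channel_shuffle(x, groups):   # as in the earlier sketch
        n, c, h, w = x.shape
        return x.view(n, groups, c // groups, h, w).transpose(1, 2).reshape(n, c, h, w)

    class LightweightPRM(nn.Module):
        def __init__(self, ch):
            super().__init__()
            assert ch % 2 == 0
            # Feature branch keeps the full input width; its half-width output
            # restores `ch` channels after concatenation with the identity half.
            self.features = nn.Sequential(
                nn.Conv2d(ch, ch, 3, padding=1, groups=ch, bias=False),  # deep conv
                nn.Conv2d(ch, ch // 2, 1, bias=False),                   # point conv
                nn.BatchNorm2d(ch // 2), nn.ReLU(inplace=True),
            )
        def forward(self, x):
            identity = x[:, : x.shape[1] // 2]        # channel-separated identity half
            fused = torch.cat([identity, self.features(x)], dim=1)
            return channel_shuffle(fused, groups=2)   # reorder the fused channels

    # e.g. LightweightPRM(256)(torch.randn(1, 256, 64, 64)).shape == (1, 256, 64, 64)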
Examples
The proposed network was trained on the MPII human pose estimation dataset, with results shown in fig. 8. After annotation with LabelImg, the dataset contained about 25000 images and 40000 labeled samples, of which 28000 were used for training and 11000 for testing. The experiments ran under Ubuntu with 250 iterations and a batch size of 6, using the Torch7 framework and two NVIDIA 1080 Ti GPUs. With Percentage of Correct Keypoints (PCK) as the accuracy index, the evaluation results were good.
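For reference, PCK counts a predicted key point as correct when its distance to the ground truth falls within a threshold fraction of a per-sample reference length (on MPII, usually the head-segment length, i.e. PCKh@0.5). A minimal sketch with illustrative names:

    # Percentage of Correct Keypoints: a prediction is correct when its distance
    # to the ground truth is at most alpha times the per-sample reference length.
    import numpy as np

    def pck(pred, gt, ref_len, alpha=0.5):
        """pred, gt: (N, J, 2) key-point arrays; ref_len: (N,) reference lengths."""
        dist = np.linalg.norm(pred - gt, axis=-1)       # (N, J) pixel distances
        return (dist <= alpha * ref_len[:, None]).mean()

    # e.g. pck(pred_joints, gt_joints, head_sizes) over 16 MPII joints per person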
First, a 1080 × 1920 unmanned aerial vehicle scout image is acquired, cropped to 227 × 227 by windowing, and data-enhanced by dilation, erosion and bilateral filtering.
Here, dilation (dilate) is the operation of finding the local maximum: it expands the boundary of an object, the exact result depending on the image and the structuring element. Erosion (erode) is the opposite operation, finding the local minimum: it gradually shrinks the highlighted areas of the image.
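A minimal OpenCV sketch of this preprocessing, covering the window crop and the three enhancement operations; the kernel size, filter parameters and crop origin are illustrative assumptions.

    # Preprocessing sketch: window-crop the 1080 x 1920 frame to 227 x 227, then
    # generate dilated, eroded and bilaterally filtered variants for augmentation.
    import cv2
    import numpy as np

    def enhance(frame: np.ndarray, x: int = 0, y: int = 0) -> list:
        crop = frame[y : y + 227, x : x + 227]   # sliding-window crop
        kernel = np.ones((3, 3), np.uint8)
        dilated = cv2.dilate(crop, kernel)       # local maximum: expands boundaries
        eroded = cv2.erode(crop, kernel)         # local minimum: shrinks highlights
        smoothed = cv2.bilateralFilter(crop, 9, 75, 75)  # edge-preserving smoothing
        return [crop, dilated, eroded, smoothed]

    # e.g. enhance(cv2.imread("scout_frame.png"), x=400, y=300)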
Secondly, a first feature map is output after three rounds of pooling (Max Pool) and convolution, then fed through a channel separation module (Split) into the multi-stage hourglass network, which outputs a plurality of low-resolution second feature maps.
This embodiment uses a stack of eight hourglass networks as the overall network framework.
Finally, the second feature maps are feature-fused through a channel mixing module (Merge), and a portrait feature map is output, from which the portrait posture is estimated.
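Putting these steps together, the following shape-level sketch shows how the Split and Merge plumbing wraps the stage bodies. Each stage here is a placeholder convolution; in the full network the stage bodies are the stacked lightweight hourglasses sketched earlier, and all dimensions are illustrative.

    # Shape walkthrough of the inference pipeline: stem pooling/convolution ->
    # channel separation (Split) -> per-group stages -> channel mixing (Merge).
    import torch
    import torch.nn as nn

    class PipelineSketch(nn.Module):
        def __init__(self, ch=64, stages=8, joints=16):
            super().__init__()
            self.stem = nn.Sequential(            # convolution plus three poolings
                nn.Conv2d(3, ch, 7, stride=2, padding=3), nn.ReLU(inplace=True),
                nn.MaxPool2d(2), nn.MaxPool2d(2), nn.MaxPool2d(2),
            )
            g = ch // stages                      # channels per separated group
            self.stages = nn.ModuleList(          # hourglass stand-ins, one per group
                nn.Conv2d(g, g, 3, padding=1) for _ in range(stages))
            self.head = nn.Conv2d(ch, joints, 1)  # portrait feature map / heatmaps
        def forward(self, x):                     # x: (N, 3, 227, 227) window crop
            feat = self.stem(x)                   # first feature map
            groups = feat.chunk(len(self.stages), dim=1)        # Split
            outs = [s(g) for s, g in zip(self.stages, groups)]  # second feature maps
            return self.head(torch.cat(outs, dim=1))            # Merge + head

    # PipelineSketch()(torch.randn(1, 3, 227, 227)).shape -> (1, 16, 14, 14)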
It should be noted that, in this document, the terms "comprises", "comprising", or any other variation thereof are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements includes not only those elements but possibly also other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element introduced by the phrase "comprising a …" does not exclude the presence of other like elements in the process, method, article, or apparatus that comprises the element.
The above-mentioned serial numbers of the embodiments of the present invention are merely for description and do not represent the merits of the embodiments.
Through the above description of the embodiments, those skilled in the art will clearly understand that the method above can be implemented by software plus a necessary general hardware platform, and certainly also by hardware, although in many cases the former is the better implementation. Based on this understanding, the technical solutions of the present invention may be embodied as a software product stored in a storage medium (such as ROM/RAM, a magnetic disk, or an optical disk), including instructions for enabling a terminal (such as a mobile phone, computer, server, air conditioner, or network device) to execute the methods of the embodiments of the present invention.
While the present invention has been described with reference to the embodiments shown in the drawings, the present invention is not limited to the embodiments, which are illustrative and not restrictive, and it will be apparent to those skilled in the art that various changes and modifications can be made therein without departing from the spirit and scope of the invention as defined in the appended claims.

Claims (10)

1. An embedded attitude estimation method based on unmanned aerial vehicle scout images is characterized by specifically comprising the following steps:
step 1: acquiring an original unmanned aerial vehicle scout image data set, and performing data enhancement processing on the original unmanned aerial vehicle scout image data set;
step 2: performing labeling processing on the original unmanned aerial vehicle reconnaissance image data set obtained in the step 1 to obtain a training data set with a label;
and step 3: constructing a lightweight attitude estimation network, and training the lightweight attitude estimation network by using the training data set;
and 4, step 4: the method comprises the steps of obtaining an unmanned aerial vehicle scout image to be processed, preprocessing the unmanned aerial vehicle scout image, inputting the preprocessed unmanned aerial vehicle scout image into a trained lightweight attitude estimation network to obtain a portrait feature map, and estimating the portrait attitude according to the portrait feature map.
2. The embedded pose estimation method according to claim 1, wherein in step 1, the data enhancement process comprises dilation, erosion and bilateral filtering operations.
3. The embedded pose estimation method according to claim 1, wherein in the step 2, labeling processing for adding labels is realized by an image labeling tool.
4. The embedded pose estimation method of claim 1, wherein in step 3, the lightweight pose estimation network comprises a convolutional layer, a pooling layer, a channel separation module, a multi-level hourglass network composed of a plurality of pyramid residual modules, and a channel mixing module.
5. The embedded pose estimation method of claim 4, wherein the convolutional layers are depth separable convolutional layers, including depth convolution processing and point convolution processing.
6. The embedded pose estimation method of claim 5, wherein the depth separable convolutional layer specifically operates as follows:
an ordinary convolution with convolution kernel K, M input channels and O output channels is divided into deep convolution processing and point convolution processing,
deep convolution processing: performing a K × K convolution operation on each input channel;
point convolution processing: linearly fusing the M features, the number of point convolutions being O,
wherein K, M, O are all positive integers.
7. The embedded pose estimation method of claim 4, wherein the channel separation module comprises a plurality of independent feature channels.
8. The embedded pose estimation method according to claim 4, wherein the step 4 specifically comprises:
step 41: inputting an unmanned aerial vehicle scout image to be processed, and cropping it to obtain a reduced-size unmanned aerial vehicle scout image;
step 42: inputting the unmanned aerial vehicle scout image obtained in step 41 into the trained lightweight attitude estimation network, and performing pooling and convolution to obtain a first feature map;
step 43: inputting the first feature map into the multi-stage hourglass network through the channel separation module, and outputting a plurality of second feature maps;
step 44: and inputting the plurality of second feature maps into a channel mixing module, performing feature fusion on the plurality of second feature maps, and outputting a portrait feature map, thereby estimating the portrait posture according to the portrait feature map.
9. A computer-readable storage medium, on which a computer program is stored which, when being executed by a processor, carries out the steps of the method according to any one of claims 1 to 8.
10. A computer device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, characterized in that the steps of the method according to any of claims 1 to 8 are implemented when the program is executed by the processor.
CN202110004413.7A 2021-01-04 2021-01-04 Embedded attitude estimation method based on unmanned aerial vehicle scout image (Pending)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110004413.7A 2021-01-04 2021-01-04 Embedded attitude estimation method based on unmanned aerial vehicle scout image

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110004413.7A 2021-01-04 2021-01-04 Embedded attitude estimation method based on unmanned aerial vehicle scout image

Publications (1)

Publication Number Publication Date
CN112966546A 2021-06-15

Family

ID: 76271221

Family Applications (1)

Application Number Priority Date Filing Date Title
CN202110004413.7A 2021-01-04 2021-01-04 Embedded attitude estimation method based on unmanned aerial vehicle scout image

Country Status (1)

Country Link
CN (1) CN112966546A

Citations (17)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107239728A (en) * 2017-01-04 2017-10-10 北京深鉴智能科技有限公司 Unmanned plane interactive device and method based on deep learning Attitude estimation
US20180182109A1 (en) * 2016-12-22 2018-06-28 TCL Research America Inc. System and method for enhancing target tracking via detector and tracker fusion for unmanned aerial vehicles
CN108960211A (en) * 2018-08-10 2018-12-07 罗普特(厦门)科技集团有限公司 A kind of multiple target human body attitude detection method and system
WO2019000325A1 (en) * 2017-06-29 2019-01-03 深圳市大疆创新科技有限公司 Augmented reality method for aerial photography of unmanned aerial vehicle, processor, and unmanned aerial vehicle
CN109766887A (en) * 2019-01-16 2019-05-17 中国科学院光电技术研究所 A kind of multi-target detection method based on cascade hourglass neural network
CN110175524A (en) * 2019-04-26 2019-08-27 南京航空航天大学 A kind of quick vehicle checking method of accurately taking photo by plane based on lightweight depth convolutional network
CN110781765A (en) * 2019-09-30 2020-02-11 腾讯科技(深圳)有限公司 Human body posture recognition method, device, equipment and storage medium
CN111079556A (en) * 2019-11-25 2020-04-28 航天时代飞鸿技术有限公司 Multi-temporal unmanned aerial vehicle video image change area detection and classification method
CN111160085A (en) * 2019-11-19 2020-05-15 天津中科智能识别产业技术研究院有限公司 Human body image key point posture estimation method
CN111192267A (en) * 2019-12-31 2020-05-22 航天时代飞鸿技术有限公司 Multisource perception fusion remote sensing image segmentation method based on UNET network and application
CN111461008A (en) * 2020-03-31 2020-07-28 华南理工大学 Unmanned aerial vehicle aerial shooting target detection method combining scene perspective information
WO2020164270A1 (en) * 2019-02-15 2020-08-20 平安科技(深圳)有限公司 Deep-learning-based pedestrian detection method, system and apparatus, and storage medium
CN111680655A (en) * 2020-06-15 2020-09-18 深延科技(北京)有限公司 Video target detection method for aerial images of unmanned aerial vehicle
CN111696033A (en) * 2020-05-07 2020-09-22 中山大学 Real image super-resolution model and method for learning cascaded hourglass network structure based on angular point guide
CN111815577A (en) * 2020-06-23 2020-10-23 深圳供电局有限公司 Method, device, equipment and storage medium for processing safety helmet wearing detection model
CN111860175A (en) * 2020-06-22 2020-10-30 中国科学院空天信息创新研究院 Unmanned aerial vehicle image vehicle detection method and device based on lightweight network
CN112101259A (en) * 2020-09-21 2020-12-18 中国农业大学 Single pig body posture recognition system and method based on stacked hourglass network

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116434127A (en) * 2023-06-14 2023-07-14 季华实验室 Human body posture estimation method, device, equipment and storage medium
CN116434127B (en) * 2023-06-14 2023-11-07 季华实验室 Human body posture estimation method, device, equipment and storage medium

Similar Documents

Publication Publication Date Title
CN108647585B (en) Traffic identifier detection method based on multi-scale circulation attention network
CN110246181B (en) Anchor point-based attitude estimation model training method, attitude estimation method and system
CN112528976B (en) Text detection model generation method and text detection method
CN111862126A (en) Non-cooperative target relative pose estimation method combining deep learning and geometric algorithm
CN107329962B (en) Image retrieval database generation method, and method and device for enhancing reality
CN107564009B (en) Outdoor scene multi-target segmentation method based on deep convolutional neural network
CN111476806B (en) Image processing method, image processing device, computer equipment and storage medium
CN110619638A (en) Multi-mode fusion significance detection method based on convolution block attention module
CN112365511B (en) Point cloud segmentation method based on overlapped region retrieval and alignment
CN110705566B (en) Multi-mode fusion significance detection method based on spatial pyramid pool
CN109461177B (en) Monocular image depth prediction method based on neural network
CN111462140B (en) Real-time image instance segmentation method based on block stitching
CN110348531B (en) Deep convolution neural network construction method with resolution adaptability and application
CN112232134A (en) Human body posture estimation method based on hourglass network and attention mechanism
CN112163498A (en) Foreground guiding and texture focusing pedestrian re-identification model establishing method and application thereof
CN112163447B (en) Multi-task real-time gesture detection and recognition method based on Attention and Squeezenet
JP2021096850A (en) Parallax estimation system and method, electronic apparatus, and computer readable storage medium
CN110807379A (en) Semantic recognition method and device and computer storage medium
CN113850136A (en) Yolov5 and BCNN-based vehicle orientation identification method and system
CN112528858A (en) Training method, device, equipment, medium and product of human body posture estimation model
CN111914596B (en) Lane line detection method, device, system and storage medium
CN112669452B (en) Object positioning method based on convolutional neural network multi-branch structure
CN112966546A (en) Embedded attitude estimation method based on unmanned aerial vehicle scout image
CN113298922A (en) Human body posture estimation method and device and terminal equipment
CN111931793B (en) Method and system for extracting saliency target

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination