CN110378297A - A kind of Remote Sensing Target detection method based on deep learning - Google Patents
- Publication number: CN110378297A (application CN201910667981.8A)
- Authority
- CN
- China
- Prior art keywords
- image
- remote sensing
- target
- image scale
- deep learning
- Prior art date
- Legal status: Granted
Classifications
- G—PHYSICS; G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING; G06F18/00—Pattern recognition; G06F18/25—Fusion techniques; G06F18/253—Fusion techniques of extracted features
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING; G06V10/00—Arrangements for image or video recognition or understanding; G06V10/40—Extraction of image or video features
- G06V20/00—Scenes; Scene-specific elements; G06V20/10—Terrestrial scenes; G06V20/13—Satellite images
Abstract
The present invention is applicable to the technical field of image processing and provides a remote sensing image target detection method based on deep learning, comprising: obtaining a remote sensing image to be detected, and preprocessing the remote sensing image; using a preset multi-scale feature extraction network to perform multi-scale feature extraction on the preprocessed remote sensing image, obtaining a feature image corresponding to each image scale; using a preset multi-scale region prediction network to predict candidate regions on the feature image corresponding to each image scale, obtaining a candidate box label image corresponding to each image scale; and using a preset multi-scale information fusion network to fuse the candidate box label images corresponding to the respective image scales, obtaining a detection result image at a preset image scale. By the above method, the accuracy of remote sensing image target detection can be effectively improved.
Description
Technical field
The invention belongs to the technical field of image processing, and more particularly relates to a remote sensing image target detection method, detection apparatus and terminal device based on deep learning.
Background technique
With the development of computer technology and the wide application of vision theory, target detection using computer image processing technology has become increasingly popular. Target detection in remote sensing images is a new technology that has risen with the development of remote sensing technology; advantages such as long operating distance, wide coverage and high execution efficiency give it important military significance and civilian value.

However, under complex backgrounds (for example, under occlusion), the detection accuracy of existing remote sensing image target detection methods is low, which limits the range of application of remote sensing image target detection.
Summary of the invention
In view of this, embodiments of the present invention provide a remote sensing image target detection method, detection apparatus and terminal device based on deep learning, so as to solve the problem that the detection accuracy of existing remote sensing image target detection methods is low under complex backgrounds.
A first aspect of the embodiments of the present invention provides a remote sensing image target detection method based on deep learning, comprising:

obtaining a remote sensing image to be detected, and preprocessing the remote sensing image;

using a preset multi-scale feature extraction network to perform multi-scale feature extraction on the preprocessed remote sensing image, obtaining a feature image corresponding to each image scale;

using a preset multi-scale region prediction network to predict candidate regions on the feature image corresponding to each image scale, obtaining a candidate box label image corresponding to each image scale;

using a preset multi-scale information fusion network to fuse the candidate box label images corresponding to the respective image scales, obtaining a detection result image at a preset image scale.
A second aspect of the embodiments of the present invention provides a remote sensing image target detection apparatus based on deep learning, comprising:

a preprocessing unit, configured to obtain a remote sensing image to be detected and preprocess the remote sensing image;

a feature extraction unit, configured to use a preset multi-scale feature extraction network to perform multi-scale feature extraction on the preprocessed remote sensing image, obtaining a feature image corresponding to each image scale;

a region prediction unit, configured to use a preset multi-scale region prediction network to predict candidate regions on the feature image corresponding to each image scale, obtaining a candidate box label image corresponding to each image scale;

a result fusion unit, configured to use a preset multi-scale information fusion network to fuse the candidate box label images corresponding to the respective image scales, obtaining a detection result image whose image scale is identical to that of the remote sensing image.
A third aspect of the embodiments of the present application provides a terminal device, comprising a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein the processor, when executing the computer program, implements the steps of the method provided by the first aspect of the embodiments of the present application.

A fourth aspect of the embodiments of the present application provides a computer-readable storage medium storing a computer program which, when executed by one or more processors, implements the steps of the method provided by the first aspect of the embodiments of the present application.
Compared with the prior art, the embodiments of the present invention have the following beneficial effects:

An embodiment of the present invention obtains a remote sensing image to be detected and preprocesses it; uses a preset multi-scale feature extraction network to perform multi-scale feature extraction on the preprocessed remote sensing image, obtaining a feature image for each image scale, and thereby obtaining feature images at multiple scales, in particular at small scales, which enhances the characterization of small targets; uses a preset multi-scale region prediction network to predict candidate regions on the feature image of each image scale, obtaining a candidate box label image for each image scale; and then uses a preset multi-scale information fusion network to fuse the candidate box label images of the respective image scales, obtaining a detection result image at a preset image scale. By the above method, target detection is carried out with multi-scale feature information, and the detection results at the respective scales are fused; this both enhances the features of small targets and takes into account the association between target and background, thereby improving the detection accuracy of targets under complex backgrounds.
Detailed description of the invention
In order to describe the technical solutions in the embodiments of the present invention more clearly, the accompanying drawings needed in the embodiments or in the description of the prior art are briefly introduced below. Obviously, the drawings described below are only some embodiments of the present invention; for those of ordinary skill in the art, other drawings can be obtained from these drawings without any creative effort.
Fig. 1 is a schematic flowchart of the deep-learning-based remote sensing image target detection method provided by an embodiment of the present invention;

Fig. 2 is a schematic diagram of the deep-learning-based remote sensing image target detection apparatus provided by an embodiment of the present invention;

Fig. 3 is a schematic diagram of the terminal device provided by an embodiment of the present invention;

Fig. 4 is an example topology of the bidirectional image pyramid provided by an embodiment of the present invention;

Fig. 5 is a schematic diagram of a detection result image and an optimized result image provided by an embodiment of the present invention;

Fig. 6 is a schematic diagram of detection results obtained by performing target detection with the deep-learning-based remote sensing image target detection method of the present application.
Specific embodiment
In the following description, for the purpose of illustration rather than limitation, specific details such as particular system structures and techniques are set forth in order to provide a thorough understanding of the embodiments of the present invention. However, it will be clear to those skilled in the art that the present invention may also be implemented in other embodiments without these specific details. In other cases, detailed descriptions of well-known systems, apparatuses, circuits and methods are omitted so that unnecessary detail does not obscure the description of the present invention.
It should be understood that, when used in this specification and the appended claims, the term "comprising" indicates the presence of the described features, wholes, steps, operations, elements and/or components, but does not exclude the presence or addition of one or more other features, wholes, steps, operations, elements, components and/or sets thereof.

It should also be understood that the terminology used in this description of the invention is for the purpose of describing particular embodiments only and is not intended to limit the invention. As used in the description of the invention and the appended claims, the singular forms "a", "an" and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise.

It should further be understood that the term "and/or" used in the description of the invention and the appended claims refers to any combination and all possible combinations of one or more of the associated listed items, and includes these combinations.

As used in this specification and the appended claims, the term "if" may be interpreted, depending on the context, as "when" or "once" or "in response to determining" or "in response to detecting". Similarly, the phrase "if it is determined" or "if [the described condition or event] is detected" may be interpreted, depending on the context, as "once it is determined" or "in response to determining" or "once [the described condition or event] is detected" or "in response to detecting [the described condition or event]".
In order to illustrate the technical solutions of the present invention, specific embodiments are described below.
Fig. 1 is a schematic flowchart of the deep-learning-based remote sensing image target detection method provided by an embodiment of the present invention. As shown, the deep-learning-based remote sensing image target detection method includes the following steps:
Step S101: obtain a remote sensing image to be detected, and preprocess the remote sensing image.

In practice, preprocessing may include operations such as cropping and flipping the remote sensing image to be detected, so as to turn it into an image that meets the detection requirements.
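As a sketch, this preprocessing step might look as follows; the crop size of 512 and the horizontal flip are illustrative assumptions, since the method fixes neither:

```python
import numpy as np

def preprocess(image: np.ndarray, crop_size: int = 512, flip: bool = True) -> np.ndarray:
    """Center-crop the image to crop_size x crop_size and optionally flip it
    horizontally, so that every input meets the detector's size requirements."""
    h, w = image.shape[:2]
    top = max((h - crop_size) // 2, 0)
    left = max((w - crop_size) // 2, 0)
    out = image[top:top + crop_size, left:left + crop_size]
    if flip:
        out = out[:, ::-1]  # horizontal flip
    return out
```

In practice the crop position and flip would typically be randomized during training and fixed at inference.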
Step S102: use a preset multi-scale feature extraction network to perform multi-scale feature extraction on the preprocessed remote sensing image, obtaining a feature image corresponding to each image scale.

In one embodiment, performing multi-scale feature extraction on the preprocessed remote sensing image to obtain a feature image corresponding to each image scale comprises:

establishing a bidirectional image pyramid corresponding to the preprocessed remote sensing image, wherein images in different levels of the bidirectional image pyramid have different image scales, and images in the same level have the same image scale;

performing feature extraction on the images in each level of the bidirectional image pyramid respectively, obtaining the feature image corresponding to each image scale.
In one embodiment, the bidirectional image pyramid includes first subimages, second subimages and third subimages.

The image obtained after the i-th convolution of the preprocessed remote sensing image is the first subimage in level i+1 of the bidirectional image pyramid.

The image obtained after the j-th convolution of the first subimage in level i of the bidirectional image pyramid is a second subimage in level i+j of the bidirectional image pyramid, where 0 < j ≤ N−i and N is the number of levels of the bidirectional image pyramid.

The image obtained after the h-th deconvolution of the first subimage in level i of the bidirectional image pyramid is a third subimage in level i−h of the bidirectional image pyramid, where 0 < h < i.
Referring to Fig. 4, Fig. 4 is an example topology of the bidirectional image pyramid provided by an embodiment of the present invention. As shown, the direction B1 → B2 → B3 → B4 in the figure is the direction of the bidirectional image pyramid: B1 is located in the first level of the pyramid, B2 is obtained by convolving B1 and is located in the second level, and so on.

B1 additionally undergoes 3 convolutions. The 1st convolution of B1 yields E11, which is located in the second level of the pyramid; convolving E11 (i.e., the 2nd convolution of B1) yields E12, located in the third level; convolving E12 (the 3rd convolution of B1) yields E13, located in the fourth level.

Similarly, B2 undergoes 2 convolutions in total, yielding E21 and E22; B3 undergoes 1 convolution, yielding E31; and B4 undergoes no convolution.

Conversely, B4 undergoes 3 deconvolutions. The 1st deconvolution of B4 yields D41, located in the third level of the pyramid; deconvolving D41 (the 2nd deconvolution of B4) yields D42, located in the second level; deconvolving D42 (the 3rd deconvolution of B4) yields D43, located in the first level.

Similarly, B3 undergoes 2 deconvolutions in total, yielding D31 and D32; B2 undergoes 1 deconvolution, yielding D21; and B1 undergoes no deconvolution.
By the above method, it is ensured that the images in each level of the bidirectional image pyramid have the same image scale and the same resolution. In this way, not only are feature images obtained at different image scales, but the feature image at each image scale is also guaranteed to contain richer feature information.
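The construction above can be sketched as follows. Since the patent's convolutions and deconvolutions are learned layers, this illustration stands in 2x2 average pooling for a stride-2 convolution and nearest-neighbour upsampling for a deconvolution; the four-level setup mirrors Fig. 4:

```python
import numpy as np

def down2(x):
    """Stand-in for a stride-2 convolution: 2x2 average pooling."""
    h, w = x.shape
    return x[:h - h % 2, :w - w % 2].reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))

def up2(x):
    """Stand-in for a deconvolution: nearest-neighbour upsampling by 2."""
    return x.repeat(2, axis=0).repeat(2, axis=1)

def bidirectional_pyramid(image, n_levels=4):
    """Build the backbone images B1..BN, then add to each level the
    convolved (E) and deconvolved (D) images, so that all images within
    one level share the same scale."""
    b = [image]
    for _ in range(n_levels - 1):
        b.append(down2(b[-1]))
    levels = {i: [b[i - 1]] for i in range(1, n_levels + 1)}
    for i in range(1, n_levels + 1):
        x = b[i - 1]
        for j in range(1, n_levels - i + 1):  # E_ij lands in level i + j
            x = down2(x)
            levels[i + j].append(x)
        y = b[i - 1]
        for h in range(1, i):                 # D_ih lands in level i - h
            y = up2(y)
            levels[i - h].append(y)
    return levels
```

With 4 levels every level ends up holding 4 same-sized images (e.g. level 2 holds B2, E11, D31 and D42), matching the topology of Fig. 4.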
Step S103: use a preset multi-scale region prediction network to predict candidate regions on the feature image corresponding to each image scale, obtaining a candidate box label image corresponding to each image scale.

In practice, the result output by the multi-scale region prediction network is a candidate box label image. When a target is present, a candidate box is marked at the corresponding position in the candidate box label image, and a sample class label is attached to each candidate box. The sample classes include positive samples and negative samples, where a positive sample usually represents a foreground image, i.e., a target, and a negative sample usually represents a background image.
Of course, the multi-scale region prediction network needs to be trained in advance. The training process is shown in the following embodiment.
In one embodiment, before using the preset multi-scale region prediction network to predict candidate regions on the feature image corresponding to each image scale, the method further comprises:

Step S201: obtain a sample image, wherein the sample image contains at least one target, a label box corresponding to each target, and default boxes corresponding to the feature image pixels at each scale.

In practice, default boxes are manually set at the feature image pixels at each scale, and the aspect ratios and resolutions of the default boxes are set manually. Illustratively, default boxes with the three aspect ratios 1:1, 1:2 and 2:1 and the resolutions 32x32, 64x64 and 128x128 may be set. The specific parameters, number of default boxes, and so on are not limited.
The sample image also contains manually annotated label boxes, i.e., bounding boxes manually marked around the targets in the sample image.
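The example default boxes above might be generated as follows; the centering on a feature pixel and the area-preserving treatment of aspect ratios are illustrative assumptions, not details fixed by the text:

```python
def default_boxes(cx, cy, sizes=(32, 64, 128), ratios=((1, 1), (1, 2), (2, 1))):
    """Return (x1, y1, x2, y2) default boxes centred on one feature pixel,
    one per size/aspect-ratio pair; each box keeps the area size*size."""
    boxes = []
    for s in sizes:
        for rw, rh in ratios:
            scale = (rw * rh) ** 0.5
            w, h = s * rw / scale, s * rh / scale
            boxes.append((cx - w / 2, cy - h / 2, cx + w / 2, cy + h / 2))
    return boxes
```

With three sizes and three aspect ratios this places nine default boxes at every feature image pixel.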
Step S202: calculate the relative intersection-over-union corresponding to each target, set the sample class of the target corresponding to the relative intersection-over-union according to the relative intersection-over-union, and obtain the labeled sample image.

In one embodiment, calculating the relative intersection-over-union corresponding to each target comprises:

calculating the intersection-over-union corresponding to the current target;

calculating the relative intersection-over-union corresponding to the current target;

wherein A is the area of the label box corresponding to the current target, G is the area of the default box corresponding to the current target, IoU is the intersection-over-union corresponding to the current target, and RIoU is the relative intersection-over-union corresponding to the current target.
In one embodiment, setting the sample class of the target corresponding to the relative intersection-over-union according to the relative intersection-over-union comprises:

if the intersection-over-union is greater than a first threshold, judging whether the relative intersection-over-union corresponding to the intersection-over-union is greater than a second threshold;

if the relative intersection-over-union corresponding to the intersection-over-union is greater than the second threshold, setting the sample class of the corresponding target to positive sample;

if the relative intersection-over-union corresponding to the intersection-over-union is less than or equal to the second threshold, setting the sample class of the corresponding target to negative sample.
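The thresholding logic above can be sketched as follows. The axis-aligned IoU formula is standard; because the RIoU formula appears in the original only by reference, RIoU is taken here as an already-computed value, and the threshold defaults are placeholder assumptions:

```python
def iou(box_a, box_g):
    """Intersection-over-union of two (x1, y1, x2, y2) boxes."""
    ix1, iy1 = max(box_a[0], box_g[0]), max(box_a[1], box_g[1])
    ix2, iy2 = min(box_a[2], box_g[2]), min(box_a[3], box_g[3])
    inter = max(ix2 - ix1, 0) * max(iy2 - iy1, 0)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_g = (box_g[2] - box_g[0]) * (box_g[3] - box_g[1])
    return inter / (area_a + area_g - inter)

def sample_class(iou_val, riou_val, t1=0.5, t2=0.5):
    """Positive sample only if IoU exceeds t1 AND the relative IoU exceeds t2;
    boxes failing the first threshold are left unlabeled (the text does not
    specify their treatment)."""
    if iou_val > t1:
        return "positive" if riou_val > t2 else "negative"
    return None
```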
Step S203: train a neural network with the labeled sample image to obtain a trained neural network, and use the trained neural network as the preset multi-scale region prediction network.
It has been found experimentally that the loss function of the multi-scale region prediction network can be

L = (1/N) Σ_i [Lcls(pi, pi*) + pi*·Lreg(ti, ti*)]

where Lcls is the classification loss function, Lreg is the regression loss function, N is the number of samples, and pi* is the foreground/background label of the sample. The regression loss is not calculated when the sample is a negative sample.

The classification loss function is defined as follows:

Lcls(pi, pi*) = −log[pi*·pi + (1−pi*)·(1−pi)]

This is a classical cross-entropy loss, where pi* is the class label and pi is the predicted class score; in the multi-scale region prediction network there are two classes, foreground and background.
The regression loss function is defined as follows:

Lreg(ti, ti*) = smoothL1(x), with x = ti − ti*

Here ti is the four-dimensional vector {tx, ty, tw, th}: tx is the predicted offset of the top-left abscissa of the bounding box, i.e., the difference between the top-left abscissa of the default box and that of the label box; ty is the predicted offset of the top-left ordinate of the bounding box, i.e., the difference between the top-left ordinate of the default box and that of the label box; tw is the predicted offset of the width of the bounding box, i.e., the difference between the width of the default box and that of the label box; th is the predicted offset of the height of the bounding box, i.e., the difference between the height of the default box and that of the label box; and ti* is the true offset between the default box and the label box. The loss function here uses the smoothL1 function.
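The two losses can be sketched directly from the definitions above; note that the text names smoothL1 without defining it, so its common piecewise form is assumed here:

```python
import math

def lcls(p, p_star):
    """Cross-entropy loss from the text: -log[p*·p + (1-p*)·(1-p)]."""
    return -math.log(p_star * p + (1 - p_star) * (1 - p))

def smooth_l1(x):
    """Common smoothL1 form (assumed): quadratic near zero, linear beyond."""
    return 0.5 * x * x if abs(x) < 1 else abs(x) - 0.5

def lreg(t, t_star):
    """Regression loss summed over the four offsets (tx, ty, tw, th)."""
    return sum(smooth_l1(ti - ti_star) for ti, ti_star in zip(t, t_star))
```

The smoothL1 form keeps gradients bounded for large offset errors while remaining smooth near zero, which is why it is preferred over a plain L2 loss for box regression.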
Step S104: use a preset multi-scale information fusion network to fuse the candidate box label images corresponding to the respective image scales, obtaining a detection result image at a preset image scale.
In practice, the multi-scale region prediction network yields candidate box label images at different scales. Since the candidate boxes obtained at different scales differ in size, and the levels at large scales contain more candidate boxes for small targets while the levels at small scales contain more candidate boxes for large targets, a corresponding mapping level can be chosen for each candidate box according to a preset mapping rule, so as to exploit the image features as fully as possible.

Illustratively, when the resolution of a candidate box is smaller than 32x32, the features in the first level of the bidirectional image pyramid may be used; when the resolution of a candidate box is in the range 32x32 to 64x64, the features in the second level may be used; when the resolution of a candidate box is in the range 64x64 to 128x128, the features in the third level may be used; and when the resolution of a candidate box is in the range 128x128 to 256x256, the features in the fourth level of the bidirectional image pyramid may be used.
By the above method, it is ensured that small targets use the large-scale feature maps and larger targets use the smaller-scale feature maps, which guarantees that the image feature information is used optimally.
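The mapping rule above reduces to a simple lookup; the level numbering and the half-open ranges are read from the example thresholds in the text:

```python
def map_level(box_resolution: int) -> int:
    """Map a candidate box to a pyramid level following the example
    thresholds in the text: small boxes use the fine (large-scale) levels."""
    if box_resolution < 32:
        return 1
    if box_resolution < 64:
        return 2
    if box_resolution < 128:
        return 3
    return 4  # boxes in the 128x128 to 256x256 range
```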
Before fusion, the images at different scales need to be normalized, i.e., the images at different scales are adjusted to images of the same scale, and the detection result image at the preset image scale is finally obtained. Here, the preset image scale may be, for example, the image scale of the remote sensing image to be detected.
It has been found experimentally that the loss function of the multi-scale information fusion network can likewise combine a classification loss and a regression loss, where Lcls is a classification loss function identical in form to that of the multi-scale region prediction network, i.e., a cross-entropy loss, except that the classification is now over the number of target classes to be distinguished; the regression loss Lreg is also identical to that of the multi-scale region prediction network, i.e., a smoothL1 loss; ci is the offset between the fused bounding box and the label box, and ci* is the true offset between the default box and the label box.
In this application, the multi-scale feature extraction network, the multi-scale region prediction network and the multi-scale information fusion network are connected in series. The multi-scale feature extraction network outputs the multi-scale feature images, which are input into the multi-scale region prediction network; the multi-scale region prediction network outputs the candidate box label image for each scale (carrying the candidate boxes and the sample label corresponding to each candidate box), which are input into the multi-scale information fusion network; and the multi-scale information fusion network outputs the detection result image, which carries the candidate boxes, the class label corresponding to each candidate box and the class probability corresponding to each class label.
Here, the class label indicates the class of the target. Illustratively, the class label of target A is "aircraft" and the class label of target B is "tank". The class probability indicates the probability that the target belongs to that class; for example, the probability that target A is an aircraft is 80%, and the probability that target B is a tank is 30%.
In one embodiment, after the detection result image at the preset image scale is obtained, the method further comprises:

taking any one target in the detection result image as a target to be optimized, and obtaining the class probability of each candidate box corresponding to the target to be optimized;

calculating, according to the class probability of each candidate box corresponding to the target to be optimized, the detection score of each such candidate box by the non-maximum suppression method;

deleting, among all candidate boxes corresponding to the target to be optimized, the candidate boxes other than the result box, obtaining the optimized result image, wherein the result box is the candidate box with the highest detection score.
Referring to Fig. 5, Fig. 5 is a schematic diagram of a detection result image and an optimized result image provided by an embodiment of the present invention: Fig. 5(a) is the detection result image and Fig. 5(b) is the optimized result image. It can be seen that the same target position carries multiple candidate boxes in the detection result image; according to the class probability of each candidate box, the detection score of each candidate box can be calculated by the non-maximum suppression method (for example, the detection scores of the two candidate boxes around the man's head on the right in Fig. 5(a) are 0.81 and 0.67 respectively). The candidate box with the highest detection score is taken as the result box and retained, and the candidate boxes other than the result box are deleted; thus in Fig. 5(b) only one candidate box remains around the man's head on the right, namely the candidate box with detection score 0.81.
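The optimization step described above is essentially greedy non-maximum suppression, which might be sketched as follows; the overlap threshold of 0.5 is an illustrative assumption:

```python
def nms(boxes, scores, iou_thresh=0.5):
    """Greedy non-maximum suppression: repeatedly keep the highest-scoring
    box and delete every remaining box overlapping it beyond iou_thresh.
    Returns the indices of the retained result boxes."""
    def iou(a, g):
        ix1, iy1 = max(a[0], g[0]), max(a[1], g[1])
        ix2, iy2 = min(a[2], g[2]), min(a[3], g[3])
        inter = max(ix2 - ix1, 0) * max(iy2 - iy1, 0)
        union = ((a[2] - a[0]) * (a[3] - a[1])
                 + (g[2] - g[0]) * (g[3] - g[1]) - inter)
        return inter / union
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    while order:
        best = order.pop(0)
        keep.append(best)
        order = [i for i in order if iou(boxes[best], boxes[i]) <= iou_thresh]
    return keep
```

Applied to two heavily overlapping boxes with scores 0.81 and 0.67, as in Fig. 5, only the 0.81 box survives.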
Referring to Fig. 6, Fig. 6 is a schematic diagram of detection results obtained by performing target detection with the deep-learning-based remote sensing image target detection method of the present application. As shown, the deep-learning-based remote sensing image target detection method of the present application can accurately detect small targets under complex backgrounds.
An embodiment of the present invention obtains a remote sensing image to be detected and preprocesses it; uses a preset multi-scale feature extraction network to perform multi-scale feature extraction on the preprocessed remote sensing image, obtaining a feature image for each image scale, and thereby obtaining feature images at multiple scales, in particular at small scales, which enhances the characterization of small targets; uses a preset multi-scale region prediction network to predict candidate regions on the feature image of each image scale, obtaining a candidate box label image for each image scale; and then uses a preset multi-scale information fusion network to fuse the candidate box label images of the respective image scales, obtaining a detection result image at a preset image scale. By the above method, target detection is carried out with multi-scale feature information, and the detection results at the respective scales are fused; this both enhances the features of small targets and takes into account the association between target and background, thereby improving the detection accuracy of targets under complex backgrounds.
It should be understood that the sequence numbers of the steps in the above embodiments do not imply an order of execution; the execution order of the processes should be determined by their functions and internal logic, and should not constitute any limitation on the implementation of the embodiments of the present invention.
Fig. 2 is a schematic diagram of the deep-learning-based remote sensing image target detection apparatus provided by an embodiment of the present invention. For convenience of explanation, only the parts relevant to the embodiments of the present application are shown.
The deep-learning-based remote sensing image target detection apparatus shown in Fig. 2 may be a software unit, a hardware unit or a combined software/hardware unit built into an existing terminal device, may be integrated into the terminal device as an independent add-on, or may exist as an independent terminal device.
The deep-learning-based remote sensing image target detection apparatus 2 comprises:

a preprocessing unit 21, configured to obtain a remote sensing image to be detected and preprocess the remote sensing image;

a feature extraction unit 22, configured to use a preset multi-scale feature extraction network to perform multi-scale feature extraction on the preprocessed remote sensing image, obtaining a feature image corresponding to each image scale;

a region prediction unit 23, configured to use a preset multi-scale region prediction network to predict candidate regions on the feature image corresponding to each image scale, obtaining a candidate box label image corresponding to each image scale;

a result fusion unit 24, configured to use a preset multi-scale information fusion network to fuse the candidate box label images corresponding to the respective image scales, obtaining a detection result image whose image scale is identical to that of the remote sensing image.
Optionally, the feature extraction unit 22 comprises:

an establishing module, configured to establish a bidirectional image pyramid corresponding to the preprocessed remote sensing image, wherein images in different levels of the bidirectional image pyramid have different image scales and images in the same level have the same image scale;

an extraction module, configured to perform feature extraction on the images in each level of the bidirectional image pyramid respectively, obtaining the feature image corresponding to each image scale.
Optionally, the bidirectional picture pyramid includes the first subgraph, the second subgraph and third subgraph.
It is that the bidirectional picture is pyramidal that the pretreated remote sensing images, which carry out the image obtained after i-th convolution,
The first subgraph in i+1 level.
It is obtained after the first subgraph progress jth time process of convolution in pyramidal i-th of the level of bidirectional picture
Image is the second subgraph in the pyramidal i-th+j of the bidirectional picture levels, wherein 0 < j≤N-i, the N are described
The number of bidirectional picture pyramid level.
The first subgraph in pyramidal i-th of the level of bidirectional picture obtains after carrying out the h times deconvolution processing
Image be third subgraph in the pyramidal i-th-h of the bidirectional picture levels, wherein 0 < h < i.
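The relationship between the three kinds of sub-images can be sketched with plain stride-2 downsampling standing in for a convolution and 2x upsampling standing in for a deconvolution. This is an assumption for illustration only: the patent's pyramid uses learned convolution and deconvolution layers, and all function names here are hypothetical.

```python
import numpy as np

def downsample(img):
    """Stand-in for one stride-2 convolution: halve each dimension."""
    return img[::2, ::2]

def upsample(img):
    """Stand-in for one deconvolution: double each dimension."""
    return np.kron(img, np.ones((2, 2)))

def first_sub_images(image, n_levels):
    """first[i-1] is the first sub-image of level i (1-indexed as in the
    patent), i.e. the input image after i-1 stand-in convolutions."""
    first = [image]
    for _ in range(n_levels - 1):
        first.append(downsample(first[-1]))
    return first

def second_sub(first, i, j):
    """Second sub-image of level i+j: the first sub-image of level i
    convolved j more times (0 < j <= N - i)."""
    img = first[i - 1]
    for _ in range(j):
        img = downsample(img)
    return img

def third_sub(first, i, h):
    """Third sub-image of level i-h: the first sub-image of level i
    deconvolved h times (0 < h < i)."""
    img = first[i - 1]
    for _ in range(h):
        img = upsample(img)
    return img

# Example: a 16x16 image and an N = 4 level pyramid
img = np.arange(256.0).reshape(16, 16)
first = first_sub_images(img, 4)
```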
Optionally, the device 2 further includes:
An acquiring unit, configured to obtain a sample image before candidate region prediction is performed on the feature image corresponding to each image scale using the preset multi-scale region prediction network, wherein the sample image contains at least one target, each target has a corresponding label box, and each pixel point of the feature image at each scale has a corresponding default box.
A computing unit, configured to compute the relative intersection-over-union (RIoU) corresponding to each target, set a sample class for the target corresponding to each RIoU according to that RIoU, and obtain the labeled sample image.
A training unit, configured to train a neural network using the labeled sample image to obtain a trained neural network, and use the trained neural network as the preset multi-scale region prediction network.
Optionally, the computing unit includes:
A first computing module, configured to compute the intersection-over-union (IoU) corresponding to the current target.
A second computing module, configured to compute the relative intersection-over-union (RIoU) corresponding to the current target.
Here A is the area of the label box corresponding to the current target, G is the area of the default box corresponding to the current target, IoU is the intersection-over-union corresponding to the current target, and RIoU is the relative intersection-over-union corresponding to the current target.
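As a sketch for axis-aligned boxes given as (x1, y1, x2, y2): IoU is the standard intersection area over union area. The RIoU below, intersection area over the default box area G, is only an assumed reading of "relative" IoU, since the patent's formula images are not reproduced in this text; treat it as illustrative, not as the patent's exact definition.

```python
def box_area(b):
    """Area of an axis-aligned box (x1, y1, x2, y2)."""
    return max(0.0, b[2] - b[0]) * max(0.0, b[3] - b[1])

def intersection_area(a, g):
    """Overlap area of two axis-aligned boxes."""
    x1, y1 = max(a[0], g[0]), max(a[1], g[1])
    x2, y2 = min(a[2], g[2]), min(a[3], g[3])
    return max(0.0, x2 - x1) * max(0.0, y2 - y1)

def iou(a, g):
    """Standard intersection-over-union of label box a and default box g."""
    inter = intersection_area(a, g)
    return inter / (box_area(a) + box_area(g) - inter)

def riou(a, g):
    """Assumed 'relative' IoU: intersection over the default box area G."""
    return intersection_area(a, g) / box_area(g)
```

For example, for a = (0, 0, 2, 2) and g = (1, 1, 3, 3) the intersection area is 1 and the union area is 7, so iou(a, g) is 1/7 while riou(a, g) is 1/4.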
Optionally, the computing unit further includes:
A judging module, configured to judge, if the IoU is greater than a first threshold, whether the RIoU corresponding to the IoU is greater than a second threshold.
A first labeling module, configured to set the sample class of the target corresponding to the RIoU to positive sample if the RIoU corresponding to the IoU is greater than the second threshold.
A second labeling module, configured to set the sample class of the target corresponding to the RIoU to negative sample if the RIoU corresponding to the IoU is less than or equal to the second threshold.
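The two-threshold decision above can be sketched as follows; the threshold values are illustrative defaults, not values from the patent.

```python
def assign_sample_class(iou_val, riou_val, t1=0.5, t2=0.7):
    """Two-stage labeling per the device description: only targets whose
    IoU exceeds the first threshold are judged; among those, an RIoU above
    the second threshold yields a positive sample, otherwise a negative
    sample. Thresholds t1 and t2 are illustrative, not from the patent.

    Returns 'positive', 'negative', or None (target not labeled).
    """
    if iou_val <= t1:
        return None
    return 'positive' if riou_val > t2 else 'negative'
```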
Optionally, the device 2 further includes:
A probability acquiring unit, configured to take, after the detection result map of the preset image scale is obtained, any target in the detection result map as a target to be optimized, and obtain the class probability of each candidate box corresponding to the target to be optimized.
A score computing unit, configured to compute, according to the class probabilities of the candidate boxes corresponding to the target to be optimized, the detection score of each of those candidate boxes using non-maximum suppression.
An optimizing unit, configured to delete, from all candidate boxes corresponding to the target to be optimized, the candidate boxes other than the result box, to obtain an optimized result map, wherein the result box is the candidate box with the highest detection score.
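Keeping only the result box can be sketched as follows; the detection scores are taken as given (the patent computes them with non-maximum suppression), and the function name is hypothetical.

```python
def optimize_target(candidates):
    """candidates: list of (box, detection_score) pairs for one target.

    Deletes every candidate box except the result box, i.e. the one with
    the highest detection score. Returns the surviving (box, score) pair,
    or None if the target has no candidates.
    """
    if not candidates:
        return None
    return max(candidates, key=lambda c: c[1])

# Example with three candidate boxes for one target
cands = [((0, 0, 10, 10), 0.62),
         ((1, 1, 11, 11), 0.91),
         ((2, 0, 9, 9), 0.30)]
best = optimize_target(cands)  # only the highest-scoring box survives
```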
It will be apparent to those skilled in the art that, for convenience and brevity of description, the division into the functional units and modules described above is merely illustrative. In practical applications, the functions may be assigned to different functional units or modules as needed; that is, the internal structure of the device may be divided into different functional units or modules to complete all or part of the functions described above. The functional units in the embodiments may be integrated into one processing unit, each unit may exist alone physically, or two or more units may be integrated into one unit; the integrated unit may be implemented in the form of hardware or in the form of a software functional unit. In addition, the specific names of the functional units and modules are used only to distinguish them from one another and are not intended to limit the protection scope of this application. For the specific working process of the units and modules in the above system, reference may be made to the corresponding processes in the foregoing method embodiments, and details are not repeated here.
Fig. 3 is a schematic diagram of a terminal device provided by an embodiment of the present invention. As shown in Fig. 3, the terminal device 3 of this embodiment includes a processor 30, a memory 31, and a computer program 32 stored in the memory 31 and executable on the processor 30. When the processor 30 executes the computer program 32, the steps in each embodiment of the deep-learning-based remote sensing image target detection method described above are realized, for example steps S101 to S104 shown in Fig. 1. Alternatively, when the processor 30 executes the computer program 32, the functions of the modules/units in each device embodiment described above are realized, for example the functions of modules 21 to 24 shown in Fig. 2.
Illustratively, the computer program 32 may be divided into one or more modules/units, which are stored in the memory 31 and executed by the processor 30 to carry out the present invention. The one or more modules/units may be a series of computer program instruction segments capable of completing specific functions, the instruction segments being used to describe the execution process of the computer program 32 in the terminal device 3. For example, the computer program 32 may be divided into a preprocessing unit, a feature extraction unit, a region prediction unit, and a result fusion unit, each unit having the following specific functions:
A preprocessing unit, configured to obtain a remote sensing image to be detected and preprocess the remote sensing image.
A feature extraction unit, configured to perform multi-scale feature extraction on the preprocessed remote sensing image using a preset multi-scale feature extraction network, to obtain a feature image corresponding to each image scale.
A region prediction unit, configured to perform candidate region prediction on the feature image corresponding to each image scale using a preset multi-scale region prediction network, to obtain a candidate box label image corresponding to each image scale.
A result fusion unit, configured to fuse the candidate box label images corresponding to the image scales using a preset multi-scale information fusion network, to obtain a detection result map with the same image scale as the remote sensing image.
Optionally, the feature extraction unit includes:
An establishing module, configured to establish a bidirectional image pyramid corresponding to the preprocessed remote sensing image, wherein images in different levels of the bidirectional image pyramid have different image scales and images in the same level have the same image scale.
An extraction module, configured to perform feature extraction on the images in each level of the bidirectional image pyramid, to obtain a feature image corresponding to each image scale.
Optionally, the bidirectional image pyramid includes first sub-images, second sub-images, and third sub-images.
The image obtained after the i-th convolution of the preprocessed remote sensing image is the first sub-image in the (i+1)-th level of the bidirectional image pyramid.
The image obtained after j further convolutions of the first sub-image in the i-th level of the bidirectional image pyramid is the second sub-image in the (i+j)-th level, wherein 0 < j ≤ N-i and N is the number of levels of the bidirectional image pyramid.
The image obtained after h deconvolutions of the first sub-image in the i-th level of the bidirectional image pyramid is the third sub-image in the (i-h)-th level, wherein 0 < h < i.
Optionally, the computer program further includes:
An acquiring unit, configured to obtain a sample image before candidate region prediction is performed on the feature image corresponding to each image scale using the preset multi-scale region prediction network, wherein the sample image contains at least one target, each target has a corresponding label box, and each pixel point of the feature image at each scale has a corresponding default box.
A computing unit, configured to compute the relative intersection-over-union (RIoU) corresponding to each target, set a sample class for the target corresponding to each RIoU according to that RIoU, and obtain the labeled sample image.
A training unit, configured to train a neural network using the labeled sample image to obtain a trained neural network, and use the trained neural network as the preset multi-scale region prediction network.
Optionally, the computing unit includes:
A first computing module, configured to compute the intersection-over-union (IoU) corresponding to the current target.
A second computing module, configured to compute the relative intersection-over-union (RIoU) corresponding to the current target.
Here A is the area of the label box corresponding to the current target, G is the area of the default box corresponding to the current target, IoU is the intersection-over-union corresponding to the current target, and RIoU is the relative intersection-over-union corresponding to the current target.
Optionally, the computing unit further includes:
A judging module, configured to judge, if the IoU is greater than a first threshold, whether the RIoU corresponding to the IoU is greater than a second threshold.
A first labeling module, configured to set the sample class of the target corresponding to the RIoU to positive sample if the RIoU corresponding to the IoU is greater than the second threshold.
A second labeling module, configured to set the sample class of the target corresponding to the RIoU to negative sample if the RIoU corresponding to the IoU is less than or equal to the second threshold.
Optionally, the computer program further includes:
A probability acquiring unit, configured to take, after the detection result map of the preset image scale is obtained, any target in the detection result map as a target to be optimized, and obtain the class probability of each candidate box corresponding to the target to be optimized.
A score computing unit, configured to compute, according to the class probabilities of the candidate boxes corresponding to the target to be optimized, the detection score of each of those candidate boxes using non-maximum suppression.
An optimizing unit, configured to delete, from all candidate boxes corresponding to the target to be optimized, the candidate boxes other than the result box, to obtain an optimized result map, wherein the result box is the candidate box with the highest detection score.
The terminal device 3 may be a computing device such as a desktop computer, a notebook, a palmtop computer, or a cloud server. The terminal device may include, but is not limited to, the processor 30 and the memory 31. Those skilled in the art will understand that Fig. 3 is merely an example of the terminal device 3 and does not constitute a limitation on it; the terminal device may include more or fewer components than shown, combine certain components, or have different components. For example, the terminal device may further include input/output devices, network access devices, buses, and the like.
The processor 30 may be a central processing unit (CPU), or another general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, or the like. A general-purpose processor may be a microprocessor, or the processor may be any conventional processor.
The memory 31 may be an internal storage unit of the terminal device 3, such as a hard disk or memory of the terminal device 3. The memory 31 may also be an external storage device of the terminal device 3, such as a plug-in hard disk, a smart media card (SMC), a secure digital (SD) card, or a flash card equipped on the terminal device 3. Further, the memory 31 may include both an internal storage unit and an external storage device of the terminal device 3. The memory 31 is used to store the computer program and other programs and data required by the terminal device, and may also be used to temporarily store data that has been or will be output.
In the above embodiments, each embodiment is described with its own emphasis. For parts not detailed in one embodiment, reference may be made to the related descriptions of other embodiments.
Those of ordinary skill in the art will appreciate that the units and algorithm steps described in connection with the embodiments disclosed herein can be implemented in electronic hardware, or in a combination of computer software and electronic hardware. Whether these functions are implemented in hardware or software depends on the specific application and design constraints of the technical solution. Skilled artisans may use different methods to implement the described functions for each specific application, but such implementations should not be considered beyond the scope of the present invention.
In the embodiments provided by the present invention, it should be understood that the disclosed device/terminal device and method may be implemented in other ways. For example, the device/terminal device embodiments described above are merely illustrative; the division of the modules or units is only a logical function division, and there may be other division manners in actual implementation. For instance, multiple units or components may be combined or integrated into another system, or some features may be ignored or not executed. In addition, the mutual couplings, direct couplings, or communication connections shown or discussed may be indirect couplings or communication connections through some interfaces, devices, or units, and may be electrical, mechanical, or in other forms.
The units described as separate components may or may not be physically separate, and components shown as units may or may not be physical units; they may be located in one place or distributed over multiple network units. Some or all of the units may be selected according to actual needs to achieve the objectives of the solutions of the embodiments.
In addition, the functional units in the embodiments of the present invention may be integrated into one processing unit, or each unit may exist alone physically, or two or more units may be integrated into one unit. The integrated unit may be implemented in the form of hardware or in the form of a software functional unit.
If the integrated module/unit is implemented in the form of a software functional unit and sold or used as an independent product, it may be stored in a computer-readable storage medium. Based on this understanding, all or part of the processes in the methods of the above embodiments of the present invention may also be completed by instructing relevant hardware through a computer program. The computer program may be stored in a computer-readable storage medium, and when executed by a processor, the computer program can realize the steps of each method embodiment described above. The computer program includes computer program code, which may be in source code form, object code form, an executable file, or some intermediate form. The computer-readable medium may include any entity or device capable of carrying the computer program code, a recording medium, a USB flash drive, a removable hard disk, a magnetic disk, an optical disc, a computer memory, a read-only memory (ROM), a random access memory (RAM), an electric carrier signal, a telecommunication signal, a software distribution medium, and the like. It should be noted that the content contained in the computer-readable medium may be appropriately increased or decreased according to the requirements of legislation and patent practice in a jurisdiction; for example, in some jurisdictions, according to legislation and patent practice, the computer-readable medium does not include electric carrier signals and telecommunication signals.
The above embodiments are merely illustrative of the technical solutions of the present invention, not limiting thereof. Although the present invention has been described in detail with reference to the foregoing embodiments, those of ordinary skill in the art should understand that the technical solutions described in the foregoing embodiments may still be modified, or some of their technical features may be equivalently replaced; such modifications or replacements do not depart the essence of the corresponding technical solutions from the spirit and scope of the technical solutions of the embodiments of the present invention, and shall all be included within the protection scope of the present invention.
Claims (10)
1. A remote sensing image target detection method based on deep learning, characterized by comprising:
obtaining a remote sensing image to be detected, and preprocessing the remote sensing image;
performing multi-scale feature extraction on the preprocessed remote sensing image using a preset multi-scale feature extraction network, to obtain a feature image corresponding to each image scale;
performing candidate region prediction on the feature image corresponding to each image scale using a preset multi-scale region prediction network, to obtain a candidate box label image corresponding to each image scale;
fusing the candidate box label images corresponding to the image scales using a preset multi-scale information fusion network, to obtain a detection result map of a preset image scale.
2. The remote sensing image target detection method based on deep learning according to claim 1, characterized in that performing multi-scale feature extraction on the preprocessed remote sensing image to obtain a feature image corresponding to each image scale comprises:
establishing a bidirectional image pyramid corresponding to the preprocessed remote sensing image, wherein images in different levels of the bidirectional image pyramid have different image scales and images in the same level have the same image scale;
performing feature extraction on the images in each level of the bidirectional image pyramid, to obtain the feature image corresponding to each image scale.
3. The remote sensing image target detection method based on deep learning according to claim 2, characterized in that the bidirectional image pyramid includes first sub-images, second sub-images, and third sub-images;
the image obtained after the i-th convolution of the preprocessed remote sensing image is the first sub-image in the (i+1)-th level of the bidirectional image pyramid;
the image obtained after j further convolutions of the first sub-image in the i-th level of the bidirectional image pyramid is the second sub-image in the (i+j)-th level, wherein 0 < j ≤ N-i and N is the number of levels of the bidirectional image pyramid;
the image obtained after h deconvolutions of the first sub-image in the i-th level of the bidirectional image pyramid is the third sub-image in the (i-h)-th level, wherein 0 < h < i.
4. The remote sensing image target detection method based on deep learning according to claim 1, characterized in that, before candidate region prediction is performed on the feature image corresponding to each image scale using the preset multi-scale region prediction network, the method further comprises:
obtaining a sample image, wherein the sample image contains at least one target, each target has a corresponding label box, and each pixel point of the feature image at each scale has a corresponding default box;
computing the relative intersection-over-union (RIoU) corresponding to each target, and setting a sample class for the target corresponding to each RIoU according to that RIoU, to obtain the labeled sample image;
training a neural network using the labeled sample image to obtain a trained neural network, and using the trained neural network as the preset multi-scale region prediction network.
5. The remote sensing image target detection method based on deep learning according to claim 4, characterized in that computing the RIoU corresponding to each target comprises:
computing the intersection-over-union (IoU) corresponding to the current target;
computing the relative intersection-over-union (RIoU) corresponding to the current target;
wherein A is the area of the label box corresponding to the current target, G is the area of the default box corresponding to the current target, IoU is the intersection-over-union corresponding to the current target, and RIoU is the relative intersection-over-union corresponding to the current target.
6. The remote sensing image target detection method based on deep learning according to claim 5, characterized in that setting a sample class for the target corresponding to each RIoU according to that RIoU comprises:
if the IoU is greater than a first threshold, judging whether the RIoU corresponding to the IoU is greater than a second threshold;
if the RIoU corresponding to the IoU is greater than the second threshold, setting the sample class of the target corresponding to the RIoU to positive sample;
if the RIoU corresponding to the IoU is less than or equal to the second threshold, setting the sample class of the target corresponding to the RIoU to negative sample.
7. The remote sensing image target detection method based on deep learning according to claim 1, characterized in that, after the detection result map of the preset image scale is obtained, the method further comprises:
taking any target in the detection result map as a target to be optimized, and obtaining the class probability of each candidate box corresponding to the target to be optimized;
computing, according to the class probabilities of the candidate boxes corresponding to the target to be optimized, the detection score of each of those candidate boxes using non-maximum suppression;
deleting, from all candidate boxes corresponding to the target to be optimized, the candidate boxes other than the result box, to obtain an optimized result map, wherein the result box is the candidate box with the highest detection score.
8. A remote sensing image target detection device based on deep learning, characterized by comprising:
a preprocessing unit, configured to obtain a remote sensing image to be detected and preprocess the remote sensing image;
a feature extraction unit, configured to perform multi-scale feature extraction on the preprocessed remote sensing image using a preset multi-scale feature extraction network, to obtain a feature image corresponding to each image scale;
a region prediction unit, configured to perform candidate region prediction on the feature image corresponding to each image scale using a preset multi-scale region prediction network, to obtain a candidate box label image corresponding to each image scale;
a result fusion unit, configured to fuse the candidate box label images corresponding to the image scales using a preset multi-scale information fusion network, to obtain a detection result map with the same image scale as the remote sensing image.
9. A terminal device, comprising a memory, a processor, and a computer program stored in the memory and executable on the processor, characterized in that, when executing the computer program, the processor realizes the steps of the remote sensing image target detection method based on deep learning according to any one of claims 1 to 7.
10. A computer-readable storage medium storing a computer program, characterized in that, when the computer program is executed by a processor, the steps of the remote sensing image target detection method based on deep learning according to any one of claims 1 to 7 are realized.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910667981.8A CN110378297B (en) | 2019-07-23 | 2019-07-23 | Remote sensing image target detection method and device based on deep learning and storage medium |
Publications (2)
Publication Number | Publication Date |
---|---|
CN110378297A true CN110378297A (en) | 2019-10-25 |
CN110378297B CN110378297B (en) | 2022-02-11 |
Family
ID=68255235
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201910667981.8A Active CN110378297B (en) | 2019-07-23 | 2019-07-23 | Remote sensing image target detection method and device based on deep learning and storage medium |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN110378297B (en) |
Cited By (11)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110852242A (en) * | 2019-11-06 | 2020-02-28 | 北京字节跳动网络技术有限公司 | Watermark identification method, device, equipment and storage medium based on multi-scale network |
CN110909615A (en) * | 2019-10-28 | 2020-03-24 | 西安交通大学 | Target detection method based on multi-scale input mixed perception neural network |
CN111079560A (en) * | 2019-11-26 | 2020-04-28 | 深圳市中电数通智慧安全科技股份有限公司 | Tumble monitoring method and device and terminal equipment |
CN111325204A (en) * | 2020-01-21 | 2020-06-23 | 腾讯科技(深圳)有限公司 | Target detection method, target detection device, electronic equipment and storage medium |
CN111369513A (en) * | 2020-02-28 | 2020-07-03 | 广州视源电子科技股份有限公司 | Abnormity detection method, abnormity detection device, terminal equipment and storage medium |
CN111797737A (en) * | 2020-06-22 | 2020-10-20 | 重庆高新区飞马创新研究院 | Remote sensing target detection method and device |
CN111814852A (en) * | 2020-06-24 | 2020-10-23 | 理光软件研究所(北京)有限公司 | Image detection method, image detection device, electronic equipment and computer-readable storage medium |
CN111860398A (en) * | 2020-07-28 | 2020-10-30 | 河北师范大学 | Remote sensing image target detection method and system and terminal equipment |
CN112487900A (en) * | 2020-11-20 | 2021-03-12 | 中国人民解放军战略支援部队航天工程大学 | SAR image ship target detection method based on feature fusion |
CN112749599A (en) * | 2019-10-31 | 2021-05-04 | 北京金山云网络技术有限公司 | Image enhancement method and device and server |
CN116051548A (en) * | 2023-03-14 | 2023-05-02 | 中国铁塔股份有限公司 | Positioning method and device |
Citations (11)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20040066981A1 (en) * | 2001-04-09 | 2004-04-08 | Mingjing Li | Hierarchical scheme for blur detection in digital image using wavelet transform |
CN106778835A (en) * | 2016-11-29 | 2017-05-31 | 武汉大学 | The airport target by using remote sensing image recognition methods of fusion scene information and depth characteristic |
US20170172479A1 (en) * | 2015-12-21 | 2017-06-22 | Outerfacing Technology LLC | Acquiring and processing non-contact functional near-infrared spectroscopy data |
WO2018184195A1 (en) * | 2017-04-07 | 2018-10-11 | Intel Corporation | Joint training of neural networks using multi-scale hard example mining |
CN108764063A (en) * | 2018-05-07 | 2018-11-06 | 华中科技大学 | A kind of pyramidal remote sensing image time critical target identifying system of feature based and method |
CN109190636A (en) * | 2018-07-30 | 2019-01-11 | 北京航空航天大学 | A kind of remote sensing images Ship Target information extracting method |
CN109472298A (en) * | 2018-10-19 | 2019-03-15 | 天津大学 | Depth binary feature pyramid for the detection of small scaled target enhances network |
CN109583425A (en) * | 2018-12-21 | 2019-04-05 | 西安电子科技大学 | A kind of integrated recognition methods of the remote sensing images ship based on deep learning |
CN109740686A (en) * | 2019-01-09 | 2019-05-10 | 中南大学 | A kind of deep learning image multiple labeling classification method based on pool area and Fusion Features |
CN109961006A (en) * | 2019-01-30 | 2019-07-02 | 东华大学 | A kind of low pixel multiple target Face datection and crucial independent positioning method and alignment schemes |
CN110009010A (en) * | 2019-03-20 | 2019-07-12 | 西安电子科技大学 | Wide area optical remote sensing target detection method based on the re-detection of interest region |
2019-07-23: CN application CN201910667981.8A, patent CN110378297B/en, status: active (Active)
Patent Citations (11)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20040066981A1 (en) * | 2001-04-09 | 2004-04-08 | Mingjing Li | Hierarchical scheme for blur detection in digital image using wavelet transform |
US20170172479A1 (en) * | 2015-12-21 | 2017-06-22 | Outerfacing Technology LLC | Acquiring and processing non-contact functional near-infrared spectroscopy data |
CN106778835A (en) * | 2016-11-29 | 2017-05-31 | 武汉大学 | Airport target recognition method for remote sensing images fusing scene information and deep features |
WO2018184195A1 (en) * | 2017-04-07 | 2018-10-11 | Intel Corporation | Joint training of neural networks using multi-scale hard example mining |
CN108764063A (en) * | 2018-05-07 | 2018-11-06 | 华中科技大学 | Feature-pyramid-based time-critical target recognition system and method for remote sensing images |
CN109190636A (en) * | 2018-07-30 | 2019-01-11 | 北京航空航天大学 | Ship target information extraction method for remote sensing images |
CN109472298A (en) * | 2018-10-19 | 2019-03-15 | 天津大学 | Deep binary feature pyramid enhancement network for small-scale target detection |
CN109583425A (en) * | 2018-12-21 | 2019-04-05 | 西安电子科技大学 | Integrated recognition method for ships in remote sensing images based on deep learning |
CN109740686A (en) * | 2019-01-09 | 2019-05-10 | 中南大学 | Deep learning multi-label image classification method based on region pooling and feature fusion |
CN109961006A (en) * | 2019-01-30 | 2019-07-02 | 东华大学 | Low-pixel multi-target face detection, key point localization, and alignment method |
CN110009010A (en) * | 2019-03-20 | 2019-07-12 | 西安电子科技大学 | Wide-area optical remote sensing target detection method based on region-of-interest re-detection |
Non-Patent Citations (2)
Title |
---|
DENG-PING FAN et al.: "Salient Objects in Clutter: Bringing Salient Object Detection to the Foreground", ResearchGate * |
WANG Junqiang et al.: "Improved SSD Algorithm and Analysis of Its Small-Target Detection Performance on Remote Sensing Images", Acta Optica Sinica * |
Cited By (16)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110909615A (en) * | 2019-10-28 | 2020-03-24 | 西安交通大学 | Target detection method based on multi-scale input mixed perception neural network |
CN112749599A (en) * | 2019-10-31 | 2021-05-04 | 北京金山云网络技术有限公司 | Image enhancement method and device and server |
CN110852242A (en) * | 2019-11-06 | 2020-02-28 | 北京字节跳动网络技术有限公司 | Watermark identification method, device, equipment and storage medium based on multi-scale network |
CN111079560B (en) * | 2019-11-26 | 2023-09-01 | 深圳市中电数通智慧安全科技股份有限公司 | Fall monitoring method and device, and terminal equipment |
CN111079560A (en) * | 2019-11-26 | 2020-04-28 | 深圳市中电数通智慧安全科技股份有限公司 | Fall monitoring method and device, and terminal equipment |
CN111325204A (en) * | 2020-01-21 | 2020-06-23 | 腾讯科技(深圳)有限公司 | Target detection method, target detection device, electronic equipment and storage medium |
CN111325204B (en) * | 2020-01-21 | 2023-10-31 | 腾讯科技(深圳)有限公司 | Target detection method, target detection device, electronic equipment and storage medium |
CN111369513A (en) * | 2020-02-28 | 2020-07-03 | 广州视源电子科技股份有限公司 | Abnormality detection method, abnormality detection device, terminal equipment and storage medium |
CN111369513B (en) * | 2020-02-28 | 2023-10-20 | 广州视源电子科技股份有限公司 | Abnormality detection method, abnormality detection device, terminal equipment and storage medium |
CN111797737A (en) * | 2020-06-22 | 2020-10-20 | 重庆高新区飞马创新研究院 | Remote sensing target detection method and device |
CN111814852A (en) * | 2020-06-24 | 2020-10-23 | 理光软件研究所(北京)有限公司 | Image detection method, image detection device, electronic equipment and computer-readable storage medium |
CN111860398A (en) * | 2020-07-28 | 2020-10-30 | 河北师范大学 | Remote sensing image target detection method and system and terminal equipment |
CN112487900A (en) * | 2020-11-20 | 2021-03-12 | 中国人民解放军战略支援部队航天工程大学 | SAR image ship target detection method based on feature fusion |
CN112487900B (en) * | 2020-11-20 | 2022-11-15 | 中国人民解放军战略支援部队航天工程大学 | SAR image ship target detection method based on feature fusion |
CN116051548B (en) * | 2023-03-14 | 2023-08-11 | 中国铁塔股份有限公司 | Positioning method and device |
CN116051548A (en) * | 2023-03-14 | 2023-05-02 | 中国铁塔股份有限公司 | Positioning method and device |
Also Published As
Publication number | Publication date |
---|---|
CN110378297B (en) | 2022-02-11 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN110378297A (en) | Remote sensing image target detection method based on deep learning | |
CN111091105B (en) | Remote sensing image target detection method based on a new bounding-box regression loss function | |
CN110598029B (en) | Fine-grained image classification method based on attention transfer mechanism | |
CN109765462A (en) | Fault detection method and device for power transmission lines, and terminal equipment | |
CN110287932B (en) | Road blockage information extraction method based on deep learning image semantic segmentation | |
CN109101897A (en) | Object detection method, system and related equipment for underwater robots | |
CN108038846A (en) | Transmission line equipment image defect detection method and system based on multilayer convolutional neural networks | |
CN111444821A (en) | Automatic identification method for urban road signs | |
CN103258213B (en) | Dynamic vehicle model recognition method for intelligent transportation systems | |
CN110490177A (en) | Face detector training method and device | |
CN108399386A (en) | Method and device for extracting information from pie charts | |
CN107016357A (en) | Video pedestrian detection method based on temporal convolutional neural networks | |
CN109325504A (en) | Underwater sea cucumber recognition method and system | |
CN105574550A (en) | Vehicle identification method and device | |
CN113569667B (en) | Inland ship target identification method and system based on a lightweight neural network model | |
CN109932730A (en) | Laser radar object detection method based on a multi-scale monopole three-dimensional detection network | |
CN110633708A (en) | Deep network saliency detection method based on global model and local optimization | |
CN109117879A (en) | Image classification method, apparatus and system | |
Gong et al. | Object detection based on improved YOLOv3-tiny | |
CN111339935B (en) | Optical remote sensing image classification method based on an interpretable CNN image classification model | |
CN109446889A (en) | Object tracking method and device based on Siamese matching network | |
CN107358182A (en) | Pedestrian detection method and terminal device | |
CN110210493B (en) | Contour detection method and system based on non-classical receptive field modulation neural network | |
CN113326735B (en) | Multi-modal small target detection method based on YOLOv5 | |
CN106778910A (en) | Deep learning system and method based on local training |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication ||
SE01 | Entry into force of request for substantive examination ||
GR01 | Patent grant ||