CN109145769A - Target detection network design method fusing image segmentation features - Google Patents
- Publication number
- CN109145769A (application number CN201810860392.7A)
- Authority
- CN
- China
- Prior art keywords
- network
- target
- feature
- image segmentation
- segmentation
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Withdrawn
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V40/00—Recognition of biometric, human-related or animal-related patterns in image or video data
- G06V40/10—Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/10—Segmentation; Edge detection
- G06T7/13—Edge detection
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/10—Segmentation; Edge detection
- G06T7/194—Segmentation; Edge detection involving foreground-background segmentation
Abstract
A target detection network design method fusing image segmentation features; the method is especially effective for large targets. It combines the general-purpose detection framework Mask RCNN with fused image segmentation features: on top of the basic Mask RCNN algorithm, the target segmentation features are integrated with the features produced by the ResNet-101 convolutional network and fed jointly into the RPN, RoI Pooling, and RoI Align modules. Experiments show that the method is significantly effective for large targets; given an image segmentation algorithm that also performs well on small targets, the method can be improved across the board, enabling self-iteration of Mask RCNN.
Description
Technical field
The invention belongs to the field of pedestrian detection methods, and in particular relates to a target detection network design method fusing image segmentation features.
Background technique
With the development of science and technology and the progress of the times, our way of life keeps changing. Modes of travel are constantly updated, and the automobile is the most common vehicle in the contemporary environment. According to statistics from the Traffic Management Bureau of the Ministry of Public Security, by the end of June 2017 the number of registered motor vehicles in China had reached 304 million, of which 205 million were automobiles, making road traffic safety a particularly pressing problem. According to incomplete statistics, the annual number of traffic-accident deaths in low- and middle-income countries accounts for more than 90% of the global total, even though these countries own only 48% of the world's vehicles. Behind these alarming figures lies a deeper question about traffic safety: analysis of the causes of traffic accidents reveals many contributing factors, but insufficient attention to pedestrians is among the most important.
To address this problem, researchers at home and abroad have proposed many solutions, the most typical being the Advanced Driver Assistance System (ADAS), in which the most critical technology is pedestrian detection. The ultimate goal of pedestrian detection is to judge whether pedestrians are present in a video sequence or image and, on that basis, to accurately outline their positions. Although current research can identify pedestrians to a certain degree, many cases still cannot be identified and distinguished with full accuracy.
In the field of pedestrian detection, early feature extraction mainly used the HOG feature. But because HOG is hand-designed, its extraction procedure is fixed, and pedestrians are recognized well only in an upright standing pose. Many researchers at the time therefore proposed the idea of feature fusion, blending HOG with other image features such as image segmentation features, image depth features, and image edge features. Recently, convolutional neural networks have developed rapidly in computer vision and have gradually replaced hand-designed features, but their performance still leaves room for improvement, and the idea of feature fusion remains applicable.
In a pedestrian detection model designed with the Mask RCNN algorithm, recognizing smaller targets requires enlarging the image so that small targets fall within the range of the region proposal windows; however, enlargement also magnifies some non-pedestrian details, making their shapes approximately resemble pedestrian shapes.
Summary of the invention
The object of the present invention is to provide a target detection network design method fusing image segmentation features; the method is significantly effective for large targets.
The method combines the general-purpose detection framework Mask RCNN with fused image segmentation features: on top of the basic Mask RCNN algorithm, the target segmentation features and the features obtained by the ResNet-101 convolutional network are integrated and fed jointly into the RPN, RoI Pooling, and RoI Align modules. The method is tested on the MS COCO test set to verify its validity.
The target segmentation image in this multi-feature-fusion pedestrian detection method has a useful property: it does not generate excessive detail after enlargement. Based on this idea, image segmentation features are fused into the Mask RCNN algorithm to improve its performance.
The advantages are as follows:
The technical solution describes in detail the algorithm design for fusing image segmentation features into Mask RCNN. It first explains the motivation for introducing an image segmentation method and the procedure of the method. It then describes the DeepLabv3 image segmentation network and explains the effect of the newly introduced dilated (atrous) convolution. Next it details the feature fusion between the image segmentation network and the feature pyramid. Finally, experiments show that the method is significantly effective for large targets; given an image segmentation algorithm that also works well on small targets, it can be improved comprehensively, enabling self-iteration of Mask RCNN.
Detailed description of the invention
Fig. 1 is a schematic diagram of the Mask RCNN algorithm structure fused with image segmentation features.
Fig. 2 is a schematic diagram of dilated convolution: (a) a 3*3 convolution kernel.
Fig. 3 is a schematic diagram of dilated convolution: (b) a 3*3 dilated kernel equivalent to a 7*7 kernel.
Fig. 4 is a schematic diagram of dilated convolution: (c) a 3*3 dilated kernel equivalent to a 15*15 kernel.
Fig. 5 is a comparison before and after image segmentation network processing: (a) the original image.
Fig. 6 is a comparison before and after image segmentation network processing: (b) the result after processing.
Fig. 7 is a comparison before and after image segmentation network processing.
Fig. 8 is a schematic diagram of fusing image segmentation features.
Fig. 9 is a schematic diagram of the residual network unit structure.
Fig. 10 is a schematic diagram of part of the residual network structure.
Fig. 11 is the region proposal network algorithm flowchart.
Fig. 12 is the candidate window classification and segmentation processing flowchart.
Fig. 13 is the histogram of person-target height distribution in the MS COCO training set.
Fig. 14 shows Cityscapes test results at different resolutions: (a) histogram at 0.5x scaling.
Fig. 15 shows Cityscapes test results at different resolutions: (b) histogram at 1x scaling.
Fig. 16 shows Cityscapes test results at different resolutions: (c) histogram at 2x scaling.
Fig. 17 shows Cityscapes test results at different scaling ratios.
Fig. 18 shows false detection of in-vehicle occupants as pedestrians: (a) a passenger inside a vehicle.
Fig. 19 shows false detection of in-vehicle occupants as pedestrians: (b) a driver inside a vehicle.
Fig. 20 is a comparison before and after applying the optimized algorithm: (a) before the improvement.
Fig. 21 is a comparison before and after applying the optimized algorithm: (b) after the improvement.
Fig. 22 shows the optimized algorithm test results.
Specific embodiment
The target detection network design method fusing image segmentation features:
Network structure design:
The structure of the Mask RCNN algorithm fused with image segmentation features is shown in Fig. 1.
Image segmentation network introduction:
As can be seen from Fig. 1, unlike the target segmentation branch inside Mask RCNN, the target segmentation network added for this feature fusion is a module with independent processing capability. The DeepLabv3 semantic segmentation algorithm is chosen as the target segmentation network here. The DeepLabv3 method consists of two steps:
(1) Obtain a preliminary segmentation result with a fully convolutional network and interpolate it back to the original image size.
(2) Apply the fully connected CRF algorithm to the interpolated segmentation result for fine correction of details, iterating several times to obtain the optimal segmentation.
The fully convolutional design of DeepLabv3 makes it end-to-end and very convenient to train. It uses dilated convolution (also called atrous convolution), which can effectively replace pooling layers and reduce information loss. Image segmentation algorithms based on convolutional neural networks use an encoder-decoder model: the encoder progressively reduces the spatial dimensions of the input, and the decoder (e.g., a deconvolution network) progressively restores target details and their spatial positions.
The encoder usually enlarges the receptive field with pooling layers that reduce the input size, but pooling loses a great deal of information; enlarging the receptive field by increasing the convolution kernel size instead greatly increases computation, especially in the middle of the encoder, where the feature matrix typically has 512 or 1024 channels, so that replacing a 3*3 kernel with 5*5 or 7*7 there makes the computation explode. Dilated convolution is therefore introduced to replace pooling; schematic diagrams are shown in Figs. 2-4.
Fig. 2(a) is a normal 3*3 kernel; Fig. 3(b) shows a 3*3 dilated kernel replacing a 7*7 kernel; Fig. 4(c) shows a 3*3 dilated kernel replacing a 15*15 kernel. The blue region is the area covered by the kernel; apart from the red dots, the rest of the kernel is zero padding. As Figs. 2-4 show, dilated convolution can replace pooling, increasing the receptive field without losing information or increasing computation, so that each convolution output covers a larger area.
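The receptive-field arithmetic behind Figs. 2-4 can be sketched in a few lines. The dilation schedule 1, 2, 4 is the classic exponential setting and is assumed here for illustration; it is not stated explicitly in the text.

```python
# Receptive field of stacked 3x3 dilated convolutions (illustrative;
# the dilation rates 1, 2, 4 are an assumed exponential schedule).

def effective_kernel(k, d):
    """Effective size of a k x k kernel with dilation rate d."""
    return k + (k - 1) * (d - 1)

def receptive_field(kernel_sizes, dilations):
    """Receptive field of a stack of stride-1 convolutions."""
    rf = 1
    for k, d in zip(kernel_sizes, dilations):
        rf += effective_kernel(k, d) - 1
    return rf

# One 3x3 conv covers 3x3; adding a dilation-2 layer covers 7x7;
# adding a dilation-4 layer covers 15x15 -- without pooling and
# without increasing the number of weights per layer.
print(receptive_field([3], [1]))              # 3
print(receptive_field([3, 3], [1, 2]))        # 7
print(receptive_field([3, 3, 3], [1, 2, 4]))  # 15
```

This matches the equivalences drawn in Figs. 2-4: the 3*3 dilated kernels cover the same areas as dense 7*7 and 15*15 kernels.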
The before/after comparison of the image segmentation network is shown in Figs. 5-6.
Fig. 5 is the original image and Fig. 6 the processed result. After the image segmentation network, the contours of nearby pedestrians are very clear, while the contours of distant pedestrians are not. The result obtained after feature fusion, the region proposal network, and candidate window classification and segmentation processing is shown in Fig. 7.
The numbers beside the persons in Fig. 7 fall in the range 0.994-0.998: 0.998 appears most often (seven times), 0.994 twice, and 0.996 three times.
Comparing Figs. 5-6 with Fig. 7 shows that the image segmentation result cannot accurately locate every target in the figure, but it can roughly separate the complicated background from the targets, laying the foundation for subsequent, more accurate target detection and segmentation. This shows that fusing image segmentation features is highly important.
Because the Mask RCNN network in this design introduces a feature pyramid structure in feature extraction, the added target segmentation network must be adjusted accordingly. Earlier feature fusion work used Faster RCNN as the basic framework, with part of the Vgg16 convolutional network as the feature extractor, producing a single overall feature matrix; the feature pyramid introduced here instead produces feature matrices at 5 successively decreasing resolutions. The fusion with the DeepLabv3 image segmentation network is shown in Fig. 8. As the figure shows, to blend the feature matrix output by the segmentation network with the feature pyramid outputs, pooling layers repeatedly reduce its resolution. Several fusion methods are possible, such as element-wise addition or multiplication at equal resolution, or channel concatenation. Here the input segmentation feature has 3 channels, which does not match the 256 channels output by the feature pyramid, so channel concatenation is used, raising the channel count to 259; a convolutional layer then reduces it back to 256 channels so that it can connect directly to the following network.
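A minimal numpy sketch of this fusion step, with random placeholder tensors: the 32*32 spatial size and the weights are illustrative only, and just the 256 + 3 = 259 → 256 channel arithmetic follows the text.

```python
import numpy as np

# Channel-concatenation fusion followed by a 1x1 convolution
# (random placeholder tensors; shapes follow the channel counts
# described in the text).
rng = np.random.default_rng(0)

fpn_feat = rng.standard_normal((32, 32, 256))  # one FPN pyramid level
seg_feat = rng.standard_normal((32, 32, 3))    # segmentation output, 3 channels

# Stack along the channel axis: 256 + 3 = 259 channels.
fused = np.concatenate([fpn_feat, seg_feat], axis=-1)

# A 1x1 convolution is just a per-pixel linear map over channels,
# reducing 259 back to 256 so later layers plug in unchanged.
w = rng.standard_normal((259, 256)) * 0.01
reduced = fused @ w

print(fused.shape)    # (32, 32, 259)
print(reduced.shape)  # (32, 32, 256)
```

The 1*1 convolution here plays exactly the dimension-reduction role described above, so the rest of the network sees the same 256-channel input it was designed for.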
The feature extraction network of Mask RCNN is a single track; the newly added target segmentation network and the original feature extraction network form two branches. During training, when gradients back-propagate, the original network weights must be frozen and only the weights of the newly added target segmentation network trained. Without freezing, the error introduced by the new network would seriously pollute the original network and severely degrade model performance.
The weight training of the new network builds on the original Mask RCNN, again using MS COCO as the training set, and is divided into three steps: (1) first freeze all original weights and train only the newly added target segmentation network weights until the loss returns to its original value; (2) unfreeze and train the network weights between the input and the FPN together with the target segmentation network weights, keeping the remaining weights frozen; (3) after the network stabilizes, train all weights.
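The three-stage schedule can be sketched as toggling per-module trainable flags, in the spirit of Keras layer freezing. The module names and the exact stage-2 grouping are an interpretation for illustration, not taken verbatim from the text.

```python
# Staged freezing sketch: which module groups are trainable per stage.
# Module names ("backbone", "fpn", ...) are illustrative placeholders.

modules = ["backbone", "fpn", "rpn", "heads", "seg_branch"]

def set_trainable(stage):
    if stage == 1:    # train only the newly added segmentation network
        trainable = {"seg_branch"}
    elif stage == 2:  # also unfreeze everything between input and FPN
        trainable = {"backbone", "fpn", "seg_branch"}
    else:             # stage 3: fine-tune all weights
        trainable = set(modules)
    return {m: (m in trainable) for m in modules}

stage1 = set_trainable(1)
stage2 = set_trainable(2)
stage3 = set_trainable(3)
print(stage1)  # only seg_branch is trainable in stage 1
```

In a Keras reproduction this would correspond to setting `layer.trainable` per layer group and recompiling before each stage.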
Preliminary experimental analysis of the foregoing:
After fusing image segmentation features, Mask RCNN's recognition rate in pedestrian detection improves little for small and medium targets, but improves significantly for large targets. The underlying reason is that the fused image segmentation feature has poor clarity on small targets and can only act as seed points, whereas its effect on extracting larger targets is significant. As image segmentation algorithms continue to improve, the performance gain brought by feature fusion will keep growing. The specific experimental data are shown in Table 1. The test set for this experiment is the person targets in the MS COCO test set.
Table 1 Accuracy comparison
Feature extraction network design:
An open-source CNN with good current performance is chosen as the feature extraction network. The ResNet-101 network performs excellently in feature extraction; compared with other convolutional neural networks it adds a residual function, which allows a CNN to reach great depth without degradation. In computer vision, as the number of network layers increases, the extracted features become higher-level and closer to semantic information. Before residual networks appeared, overly deep networks suffered gradient vanishing or gradient explosion; with the degradation problem solved, performance keeps rising with depth, as in the steady improvement from ResNet-50 to ResNet-101 to ResNet-152.
The specific residual function structure is shown in Fig. 9. Let the input feature matrix be x and the intermediate weight network be F; then the feature matrix output to the next layer is H(x) = F(x) + x, and the function the unit network fits is F(x) = H(x) - x. The original purpose is to model identity learning: it is believed that learning the mapping F(x) = x is harder for a network than learning F(x) = 0. In addition, the residual structure makes the output feature matrix more sensitive to changes in the intermediate weight network F. The idea of the residual network is to remove the identical main part and thereby highlight changes in the feature matrix, which is quite similar to a differential amplifier in a circuit: as the differential amplifier suppresses line interference in long-distance signal transmission, the residual network suppresses gradient vanishing and gradient explosion in deep networks.
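A minimal numpy rendering of the unit H(x) = F(x) + x; the two-layer F with small random weights is an illustrative stand-in for the intermediate weight network, not the ResNet-101 block itself.

```python
import numpy as np

# H(x) = F(x) + x as a minimal residual unit (weights are random
# placeholders, illustrative only).
rng = np.random.default_rng(0)
w1 = rng.standard_normal((64, 64)) * 0.01
w2 = rng.standard_normal((64, 64)) * 0.01

def F(x):
    # weight layer -> ReLU -> weight layer
    return np.maximum(x @ w1, 0) @ w2

def residual_unit(x):
    # identity shortcut: the unit only has to fit H(x) - x
    return F(x) + x

x = rng.standard_normal((1, 64))
h = residual_unit(x)

# With near-zero weights F(x) is ~0, so the unit starts close to the
# identity map -- the "easier to learn" case described above.
print(h.shape)
```

With F initialized near zero the unit already implements the identity, which is exactly why fitting F(x) = 0 is easier than fitting H(x) = x from scratch.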
The structure of the feature extraction part is shown in Fig. 10.
In Fig. 10, the x branch after the max pooling layer adds a convolutional layer with a 1*1 kernel and a BN (Batch Normalization) layer, whose role is to change the matrix dimension; subsequent residual units omit this convolutional layer and BN layer, and the input feature matrix is added directly to F(x). Each F(x) structure uses three convolutions: the first and last use 1*1 kernels to change the matrix dimension, and the middle one uses a 3*3 kernel. The rest of the residual network is a repeated stack of this unit; the kernel stays 3*3 throughout while the channel count of the feature matrix keeps changing.
The Mask RCNN algorithm is reproduced with the Keras deep learning library on the TensorFlow backend, and the model is trained on the MS COCO data set. The current mainstream approaches in target detection are FPN and integrating context information. Integrating context information means combining low-level visual information with high-level semantic information, which can be done in several ways, such as element-wise addition or multiplication, or feature-map concatenation that increases the channel count; element-wise addition works best, so it is adopted here.
TensorFlow supports multi-GPU parallel training well. With stochastic gradient descent, the more sample images per batch, the better the model's generalization and the more stable the loss decline; but MS COCO image resolutions vary. To improve batching, input images are uniformly resized to 1024*1024 resolution while preserving the original aspect ratio, with the remaining area zero-padded.
Table 2 lists the specific dimensions of the ResNet-101 residual network when combined with the feature pyramid, and the resolutions after the feature pyramid reduces their dimensionality. The feature pyramid has 5 layers, while the residual network outputs only 4 feature matrices; the last-layer feature matrix is obtained by directly downsampling the second-to-last layer.
Integrating context information helps small target detection, and pedestrian detection involves many small targets. Therefore, after the FPN feature matrices are obtained, the high-level feature matrix is deconvolved to match the dimension of the previous layer and fused with it by element-wise matrix addition.
Table 2 Feature matrix resolution comparison
At this point, feature extraction is complete and each image has been converted into 5 feature matrices; the region proposal algorithm then finds foreground targets from these 5 feature matrices.
Region proposal network design:
The basic procedure of the RCNN family is: first extract features from the image, then obtain foreground targets from the feature matrix. Earlier methods selected foreground targets with a sliding window, which can be viewed as many small tasks accumulated and processed serially. Faster RCNN instead proposed the RPN region proposal structure using anchors, turning the serial sliding-window task into a parallel anchor task and greatly accelerating processing. Mask RCNN selects foreground targets almost the same way as Faster RCNN, except that because of the FPN, each anchor at each FPN layer does not use the 3 scales and 3 aspect ratios (9 combinations in total) of Faster RCNN but only 3 shapes at a single scale: vertical rectangle, horizontal rectangle, and square. For example, the first FPN layer's feature matrix has resolution 256*256*256 and generates 256*256*3 preselection windows in total; the second layer's feature matrix has resolution 128*128*256 and generates 128*128*3 preselection windows. From the original image resolution and each feature matrix resolution, the coordinates of the three preselection windows of each anchor can be calculated; a convolutional layer with a 3*3 kernel then produces a new matrix whose values are the target probability and the four coordinate offsets generated for each window. Four variables P_cx, P_cy, P_w, P_h denote the center abscissa, center ordinate, window width, and window height of each anchor preselection window. The four coordinate offsets d_x, d_y, d_w, d_h are the translation of the center abscissa, the translation of the center ordinate, the window width scaling factor, and the window height scaling factor. The refined window values P'_cx, P'_cy, P'_w, P'_h are generated as in formula (1):
P'_cx = P_cx + P_w * d_x, P'_cy = P_cy + P_h * d_y, P'_w = P_w * exp(d_w), P'_h = P_h * exp(d_h) (1).
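The refinement of formula (1) corresponds to the standard Faster RCNN box parameterization, assumed here and sketched below: center shifts are width/height-relative, and sizes are scaled in log space.

```python
import math

# Applying the four regression offsets to an anchor window
# (standard Faster RCNN parameterization, illustrative values).

def refine_window(pcx, pcy, pw, ph, dx, dy, dw, dh):
    cx = pcx + pw * dx     # shift center by a width-relative amount
    cy = pcy + ph * dy     # shift center by a height-relative amount
    w = pw * math.exp(dw)  # scale width in log space
    h = ph * math.exp(dh)  # scale height in log space
    return cx, cy, w, h

# Zero offsets leave the anchor unchanged; dw = ln 2 doubles the width.
print(refine_window(100.0, 100.0, 64.0, 128.0, 0.0, 0.0, 0.0, 0.0))
```

The log-space scaling keeps predicted widths and heights positive regardless of the raw network output.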
Faster RCNN generates all anchor windows on a single feature layer, while Mask RCNN with FPN generates anchor windows of different sizes on different feature layers: as the feature layers go up they become more abstract, and each anchor corresponds to a larger area of the original image. Table 3 gives the window size corresponding to each RPN layer's anchors.
Table 3 RPN anchor window settings
As Table 3 shows, the anchor window settings broadly cover targets of every size in the MS COCO data set.
The overall flowchart of the region proposal network is shown in Fig. 11.
The 5 feature matrices from the feature extraction network are processed by the flow shown in Fig. 11 to finally obtain the proposal window list. This design does not use an SVM classifier but a Softmax classifier: unlike SVM scores, Softmax scores have probabilistic meaning, since Softmax maps the scores into probability space, so the final score is exactly the probability that the target belongs to the category.
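A sketch of the softmax mapping that gives the scores their probability meaning (the max-subtraction is the usual numerical-stability trick):

```python
import math

# Softmax maps raw class scores to probabilities that sum to 1, which
# is why a score can be read directly as a class probability.

def softmax(scores):
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

probs = softmax([2.0, 1.0, 0.1])
print(round(sum(probs), 6))            # 1.0
print(probs[0] > probs[1] > probs[2])  # score ordering is preserved
```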
With input images adjusted to 1024*1024 resolution, the number of generated windows is huge; even after anchor windows that extend beyond the image border are removed, the quantity remains very large, and classifying, regressing, and segmenting each one would require enormous computation. The windows are therefore sorted by target probability and processed with non-maximum suppression, after which 2000 proposal windows are retained in the training stage and 1000 in the test stage.
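A greedy non-maximum suppression sketch: windows are assumed to be (x1, y1, x2, y2, score) tuples, and the overlap threshold and keep limit are illustrative parameters.

```python
# Greedy NMS over score-sorted windows (illustrative sketch).

def iou(a, b):
    x1, y1 = max(a[0], b[0]), max(a[1], b[1])
    x2, y2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, x2 - x1) * max(0.0, y2 - y1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)

def nms(boxes, thresh=0.7, keep_top=2000):
    boxes = sorted(boxes, key=lambda b: b[4], reverse=True)
    kept = []
    for b in boxes:
        # keep a window only if it does not overlap a kept one too much
        if all(iou(b, k) <= thresh for k in kept):
            kept.append(b)
        if len(kept) == keep_top:
            break
    return kept

windows = [(0, 0, 10, 10, 0.9), (1, 1, 11, 11, 0.8), (50, 50, 60, 60, 0.7)]
print(len(nms(windows, thresh=0.5)))  # 2: the overlapping pair collapses
```

With `keep_top=2000` in training and `keep_top=1000` at test time this matches the retention counts described above.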
Regarding matching anchors to sample target boxes: among all anchors, the anchor with the highest overlap with a sample target box, and any anchor with overlap greater than 0.7, are positive samples; anchors with overlap less than 0.3 are negative samples; the rest are neutral. To balance positive and negative samples, the number of positives must not exceed half of the total selected anchors, and any excess positive or negative samples are set to neutral. All anchors are the union of the anchors from the different FPN layers.
Candidate window classification and segmentation processing design:
As described above, the RPN region proposal network yields preliminary foreground targets; this subsection processes the resulting proposals.
Because of the fully connected layers in Faster RCNN, images of arbitrary resolution must be quantized to a uniform size before the fully connected layers. Faster RCNN does this with a RoI Pooling layer, but RoI Pooling cannot guarantee pixel-level one-to-one correspondence between input and output; this hardly affects classification, but it significantly affects pixel-level target segmentation. The RoI Align layer of Mask RCNN removes all the quantization in RoI Pooling. In this reproduction, the target detection branch keeps the 7*7 output size of Faster RCNN's RoI Pooling, while the target segmentation branch uses 14*14. The design uses bilinear interpolation to reduce the error introduced when extracting regions of interest from the feature map.
Both the target segmentation network and the target detection network receive candidate windows from the region proposal network and, combined with the 5 feature pyramid layers obtained earlier, extract the local feature matrices corresponding to the candidate windows, i.e., the feature matrices output by the RoI Pooling and RoI Align layers. The two operate on the same principle and differ only in resolution.
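The bilinear interpolation that RoI Align uses in place of quantization can be sketched at a single fractional coordinate; the 2*2 feature map is an illustrative toy.

```python
# Bilinear interpolation at a fractional feature-map coordinate -- the
# operation RoI Align uses instead of rounding to integer cells.

def bilinear(feat, y, x):
    """feat: 2D list of values; (y, x): fractional position."""
    y0, x0 = int(y), int(x)
    y1, x1 = y0 + 1, x0 + 1
    wy, wx = y - y0, x - x0  # fractional parts act as blend weights
    return ((1 - wy) * (1 - wx) * feat[y0][x0]
            + (1 - wy) * wx * feat[y0][x1]
            + wy * (1 - wx) * feat[y1][x0]
            + wy * wx * feat[y1][x1])

feat = [[0.0, 1.0], [2.0, 3.0]]
# Halfway between all four cells: the exact average, no rounding error.
print(bilinear(feat, 0.5, 0.5))  # 1.5
```

RoI Pooling would snap (0.5, 0.5) to a cell corner instead, which is exactly the pixel-level misalignment the text describes.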
The candidate window classification and segmentation flowchart is shown in Fig. 12. As the figure shows, in this reproduction the feature matrix from the RoI Pooling layer is processed by two convolutional layers with 1*1 kernels, which act as fully connected layers. Because the MS COCO data set is relatively complex, the dropout layers that Faster RCNN adds after the convolutional layers are not copied. Each candidate window yields an 81-dimensional vector, which after the sigmoid function gives 81 probability values corresponding to 80 object classes and the background; the class with the highest probability is the window's target class, and in the final model output an object is marked in the image only if its probability exceeds a set threshold. While the candidate window probability is obtained, the window's position coordinates are further refined according to the target's class. Because the outputs are numerous, the same object may be labeled repeatedly, so non-maximum suppression is applied to the final detection output to remove targets with high overlap.
While detection results are obtained, the target segmentation network also processes the proposals. The feature matrix of each candidate window obtained after the RoI Align layer undergoes several convolutions followed by a deconvolution operation, and a binary segmentation map is generated separately for each class, with binarization performed by the sigmoid function. Which class's binary map the final region of interest uses is decided by the target class output by the detection branch; this also avoids the inter-class competition that target segmentation usually faces.
Loss function design:
Regarding the loss function, Mask RCNN adds an L_mask term on top of Faster RCNN, computing a cross entropy over the predicted target segmentation binary map; this is a multi-task learning formulation. The overall loss function is shown in formula (2):
L = L_cls + L_box + L_mask + L_p + L_r (2).
where L_mask is the target segmentation loss, L_cls the target detection classification loss, L_box the target detection coordinate regression loss, L_r the weight regularization loss, and L_p the region proposal network loss.
(1) Detection classification loss:
During training, the detection network obtains 200 region proposal windows, with a positive-to-negative sample ratio of 1:2. Let p be the probability value of the correct class; L_cls denotes the classification loss over the 200 proposal windows, with cross entropy selected as the metric, computed as shown in formula (3):
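Formula (3) itself is not reproduced in this text; a standard cross-entropy average over the correct-class probabilities of the proposal windows, consistent with the description above, would be:

```python
import math

def classification_loss(correct_class_probs):
    """L_cls over the region proposal windows: correct_class_probs holds,
    for each of the (here 200) windows, the predicted probability p of
    that window's correct class; the loss is the mean of -ln p."""
    n = len(correct_class_probs)
    return -sum(math.log(p) for p in correct_class_probs) / n
```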
(2) Detection coordinate regression loss:
The coordinate regression loss uses a different metric from the detection classification loss; smooth L1 is chosen as its metric. It is computed as shown in formula (4):
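Formula (4) is likewise not reproduced here; the standard smooth-L1 function it refers to, applied to the four box regression offsets, can be sketched as:

```python
def smooth_l1(x):
    # quadratic near zero, linear further out: stable gradients for small
    # errors, reduced sensitivity to outliers for large ones
    return 0.5 * x * x if abs(x) < 1 else abs(x) - 0.5

def box_loss(pred, target):
    # sum of smooth-L1 over the 4 predicted box coordinate offsets
    return sum(smooth_l1(p - t) for p, t in zip(pred, target))
```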
(3) Target segmentation loss:
During training, the detection network obtains 200 region proposal windows, and target segmentation outputs 200 matrices of size 28*28, each element of which is a probability value between 0 and 1. The logarithmic loss function is selected to measure the segmentation result. Its definition on a single data point is given below, as shown in formula (5):
Cost(y, p(y|x)) = -y ln p(y|x) - (1-y) ln(1-p(y|x)) (5).
Since the segmentation image matrix of each window has dimension 28*28, L_mask is as shown in formula (6):
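Formula (6) is not reproduced in this text; applying formula (5) to every pixel and averaging over the 28*28 matrix, which is the usual reading, gives:

```python
import math

def mask_loss(pred, gt):
    """Per-window mask loss: pred is an HxW matrix of sigmoid
    probabilities in (0, 1), gt the matching binary ground-truth mask.
    Formula (5) is evaluated per pixel and averaged (the averaging is
    an assumption here, since formula (6) is not shown)."""
    h, w = len(pred), len(pred[0])
    total = 0.0
    for i in range(h):
        for j in range(w):
            y, p = gt[i][j], pred[i][j]
            total += -y * math.log(p) - (1 - y) * math.log(1 - p)
    return total / (h * w)
```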
(4) Region proposal network loss:
The RPN network only needs to distinguish whether a candidate window is foreground, so it is a binary classification problem, and its loss can be computed by analogy with L_cls and L_box.
(5) Weight regularization loss:
L_r is the sum of squares of all weight coefficients multiplied by the proportionality coefficient α, as shown in formula (7), where w denotes the trainable weight parameters of the network.
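As described, formula (7) is a plain L2 penalty; a direct sketch:

```python
def weight_regularization(weights, alpha):
    # L_r = alpha * sum of squares of all trainable weight parameters w
    return alpha * sum(w * w for w in weights)
```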
Network hyperparameter settings:
The Mask RCNN network is designed end to end, which makes training very convenient: it improves the efficiency of the overall workflow and effectively lowers the barrier for the operator. But this also has a drawback: when performance improvement stalls, it is hard to notice where the problem lies. With training executed stage by stage, the problem can be located through comparative experiments, so that the bottleneck can be broken and performance improved. During training, part of the weights can be deliberately frozen to meet the needs of staged training.
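The patent does not tie the weight freezing to any particular framework; a framework-free sketch of the three-stage schedule described later in claim 5 (the layer-name prefixes are illustrative):

```python
class Layer:
    def __init__(self, name):
        self.name = name
        self.trainable = True  # whether gradient updates touch this layer

def freeze_for_stage(layers, stage):
    """Stage 1: train only the newly added segmentation branch.
    Stage 2: freeze input-to-FPN and segmentation weights, train the rest.
    Stage 3: train all weights."""
    for layer in layers:
        if stage == 1:
            layer.trainable = layer.name.startswith("seg_")
        elif stage == 2:
            layer.trainable = not (layer.name.startswith("backbone_")
                                   or layer.name.startswith("seg_"))
        else:
            layer.trainable = True
    return layers
```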
When training a deep learning model, one must not only prepare a labeled dataset, the network structure, and initialized network weight parameters, but also set the hyperparameters that control the training process; the hyperparameters of the present network are listed in Table 4.
Table 4. Hyperparameter settings
Second experimental analysis of the foregoing:
The training set of the MS COCO dataset is selected as the training set for Mask RCNN. It contains 80 different target classes, and its target annotation, especially for small targets, is more careful and clear than that of other multi-class datasets. Before training, the training images are scaled to 1024*1024 resolution while the aspect ratio is preserved, and the sizes of the human targets are tallied; the result is shown in Figure 13. The histogram clearly shows that the distribution of target sizes is wide but uneven, with the heights of most human targets concentrated near 30 pixels.
In Mask RCNN, the region-of-interest areas corresponding to each anchor of the bottom feature map of the FPN are 16*64, 32*32, and 64*16. In training and testing, the required overlap ratio between a positive-sample region of interest and an anchor is 0.7; if the target area is too small, sufficient overlap cannot be obtained, so the recognition rate for smaller targets is poor. We scale the Cityscapes dataset by factors of 0.5, 1, and 2 and, for each of the three resolutions, test with the model trained on the MS COCO training set. The test result is shown in Figure 13: the blue histogram gives the heights of all human targets in the test set, and the red histogram gives the heights of the targets the model correctly recalled. The figure shows that after the input image is scaled by 0.5, small targets whose height is below 16 pixels cannot be recognized.
In general, when an image classification model is trained, the larger the training set, the more accurate the learned model. In the MS COCO dataset used here, the number of samples differs greatly across human-target scales, which produces different degrees of training at each scale: scales with sufficient samples train well, while scales with insufficient samples train relatively poorly. Comparing Figure 15(b) with Figure 16(c) in Figures 14-16 shows that recognition of the same target becomes relatively poor after it is enlarged.
In practice, the labor and time costs of producing a dataset are enormous, and such an uneven target-scale distribution is hard to avoid or change; but the model's performance can be improved without changing the dataset by certain methods:
(1) For the problem of fixed anchor sizes, in an actual query the test set can be enlarged by the size of its smallest samples until they fall within the recognizable range.
(2) While small targets are enlarged, normal targets are enlarged as well and their recognition rate drops. When the cost of GPU memory is not a concern, several scaled versions of the same image can be processed simultaneously, the resulting target statistics merged, and duplicate targets removed from the final result with a non-maximum suppression algorithm.
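Method (2) above can be sketched as follows, assuming a detector callback `detect(image, scale)` that returns (box, score) pairs in the scaled image's coordinates; the callback and the 0.5 NMS threshold are illustrative, not specified by the patent:

```python
def iou(a, b):
    # intersection-over-union of two boxes [x1, y1, x2, y2]
    x1, y1 = max(a[0], b[0]), max(a[1], b[1])
    x2, y2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, x2 - x1) * max(0.0, y2 - y1)
    union = ((a[2] - a[0]) * (a[3] - a[1])
             + (b[2] - b[0]) * (b[3] - b[1]) - inter)
    return inter / union

def multiscale_detect(detect, image, scales=(0.5, 1.0, 2.0), thresh=0.5):
    """Run the detector on several scaled copies of one image, map the
    boxes back to original-image coordinates, pool them, and delete
    duplicates by greedy non-maximum suppression."""
    boxes, scores = [], []
    for s in scales:
        for (x1, y1, x2, y2), score in detect(image, s):
            boxes.append([x1 / s, y1 / s, x2 / s, y2 / s])
            scores.append(score)
    order = sorted(range(len(boxes)), key=lambda i: -scores[i])
    keep = []
    while order:
        i = order.pop(0)
        keep.append(i)
        order = [j for j in order if iou(boxes[i], boxes[j]) < thresh]
    return [(boxes[i], scores[i]) for i in keep]
```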
For the pedestrian detection problem, using the general Mask RCNN detection framework, the way the region proposal network algorithm is formulated means that small-target recognition has a hard critical value: targets whose area is below a certain threshold cannot be recognized. Adjusting the resolution of the input image can improve small-target detection. In the field of pedestrian detection there are many evaluation criteria for model performance, of which the MR-FPPI (miss rate against false positives per image) curve is relatively reasonable. Here the Mask RCNN model is trained with the training set of the MS COCO dataset; as the test set we select the finely annotated training and validation sets of Cityscapes, scale them by 0.5, 1, and 2, test each, and plot the test results as MR-FPPI curves, as shown in Figure 17.
The horizontal axis is the FPPI index and the vertical axis the MR index; the red, blue, and green lines correspond to the test results after scaling by 0.5, 1, and 2, respectively. When the average number of false detections per image is 1, scalings of 1 and 2 obtain accuracies of 0.7 and 0.73 respectively. The smaller the area enclosed by a curve and the two axes, the better the true performance of the model, i.e., the lower the miss rate obtained while the number of false positives per image is kept as low as possible.
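The two indices of the curve can be computed per score threshold as follows (a sketch; sweeping the detector's score threshold traces the whole curve):

```python
def mr_fppi_point(num_missed, num_gt, num_false_pos, num_images):
    """One operating point of the MR-FPPI curve:
    MR   = missed ground-truth targets / all ground-truth targets,
    FPPI = false positive detections / number of test images."""
    return num_missed / num_gt, num_false_pos / num_images
```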
Figure 17 shows that when the FPPI index is low, the three curves interweave, i.e., the detection performance of the model is similar for the three resolutions. As the FPPI index keeps increasing, i.e., the number of false positives per image grows, the miss rate on the high-resolution images drops markedly, while the miss rate on the low-resolution images has a floor. The reason is that when a target is too small, its score is too low to pass the feature pyramid and anchor mechanism, so it is filtered out in the subsequent region proposal network, and the target is then simply lost to the subsequent detection and segmentation networks. After the image is enlarged, the corresponding targets are enlarged with it; and since the number of large targets in the high-resolution image is smaller than the number of small targets in the low-resolution image, the MR-FPPI curves behave as shown in Figure 17.
Optimization of the pedestrian detection algorithm:
Everything described above is a common optimization applicable to small-target detection in general. The specific pedestrian detection problem has its particularity: not every human in an image is a pedestrian; only people walking on the road are pedestrians, and people inside vehicles do not belong to that class. In practice, however, drivers and passengers inside vehicles are often misjudged as pedestrians, as shown in Figures 18-19.
In Mask RCNN, the edge of an object can be localized to pixel level rather than just a rectangular box. On this basis, whether a person is inside or outside a vehicle can be judged accurately. The judging process is as follows:
(1) Detect whether there is a motorcycle or bicycle below the person: a driver or passenger never has a bicycle or motorcycle appearing below them, so such a person is not inside a vehicle.
(2) Judge whether the person is inside or outside a vehicle by the pixel overlap ratio between the person and the vehicle.
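The two rules can be sketched on instance masks as follows (masks as sets of pixel coordinates; the 0.3 overlap threshold is an illustrative value, not specified by the patent):

```python
def outside_vehicle(person_mask, vehicle_masks, has_bike_below,
                    overlap_thresh=0.3):
    """person_mask / vehicle_masks: sets of (row, col) pixels from the
    instance masks. Rule (1): a bicycle or motorcycle below the person
    means a rider, never an in-vehicle driver or passenger. Rule (2):
    a high pixel overlap ratio with a vehicle means the person is
    inside it and should not be labeled a pedestrian."""
    if has_bike_below:
        return True
    for vehicle in vehicle_masks:
        overlap = len(person_mask & vehicle) / len(person_mask)
        if overlap > overlap_thresh:
            return False
    return True
```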
The detection results before and after applying the algorithm are shown in Figures 20-21: drivers and passengers inside vehicles are no longer labeled, while larger pedestrians overlapped by motorcycles and vehicles are not misjudged as in-vehicle passengers, which improves accuracy to a certain degree.
For the problem of in-vehicle drivers and passengers being misjudged in pedestrian detection, performance is again tested with the MR-FPPI curve, applied to the Cityscapes dataset at the better-performing 2x scaling; the test result is shown in Figure 22. In Figure 22, the blue line after optimization lies below the red line before optimization; with the input image scaled to 2x resolution, the accuracy after optimization is 0.75.
Claims (8)
1. A target detection network design method fusing image segmentation features, characterized in that it comprises the following steps:
the Mask RCNN algorithm fusing image segmentation features: the input image is passed separately through feature extraction and the image segmentation network, then through the region proposal network, candidate window classification, and the segmentation network.
2. The target detection network design method fusing image segmentation features according to claim 1, characterized in that it comprises the following steps: the image segmentation network is a module with independent processing capability; the target segmentation network selects the DeepLabv3 semantic segmentation algorithm, and the DeepLabv3 method is divided into two steps:
1) obtaining a preliminary segmentation result map with a fully convolutional network and interpolating it to the original image size;
2) finely correcting the details of the interpolated image segmentation result with the fully connected CRFs algorithm, iterating several times to obtain the optimal segmentation result;
the encoder first progressively reduces the spatial dimension of the input data, and the decoder then progressively restores the details of the target and the corresponding spatial positions.
3. The target detection network design method fusing image segmentation features according to claim 1, characterized in that it comprises the following steps: in order to fuse with the feature matrices output by the feature pyramid, the feature matrix output by the image segmentation network must first pass through a pooling layer to reduce its resolution.
4. The target detection network design method fusing image segmentation features according to claim 1, characterized in that it comprises the following steps: the feature extraction network changes from a single-track structure into two branches, target segmentation and image segmentation; during training, a gradient backpropagation problem is encountered, so the original network weight parameters must be frozen and only the weight parameters of the newly added target segmentation network trained.
5. The target detection network design method fusing image segmentation features according to claim 4, characterized in that it comprises the following steps: training the weight parameters of the newly added target segmentation network: the weight training of the new network is divided into three steps: 1) first freeze all original weight parameters and train only the newly added target segmentation network weights until the loss function returns to its original value; 2) freeze the network weights between the input and the FPN and the weights of the target segmentation network, and train the remaining weights; 3) after the network stabilizes, train all of its weights.
6. The target detection network design method fusing image segmentation features according to claim 1, characterized in that it comprises the following steps: feature extraction: after the Max pooling layer, the x branch adds a convolutional layer with a 1*1 kernel and a BN layer, whose role is to change the matrix dimension; the subsequent residual structures have no such convolutional layer and BN layer, and the input feature matrix is added directly to F(x); each F(x) structure uses three convolutions, the first and last with 1*1 kernels, which serve to change the matrix dimension, and a 3*3 kernel in the middle; the subsequent residual network is a repeated stacking of this structure;
when the residual network is combined with the feature pyramid, the specific dimension values of the feature matrices and the resolutions after the feature pyramid reduces their dimensionality are listed; the feature pyramid has one layer more than the number of feature matrices output by the residual network, and the feature matrix of its last layer is obtained by directly downsampling the second-to-last layer;
combining contextual information for small-target detection: the pedestrian detection task presents a large number of small targets that must be detected; after the FPN feature matrices are obtained, a deconvolution operation is applied to the high-level feature matrix to make its dimension consistent with the preceding-layer feature matrix, and the two are fused by element-wise matrix addition;
at this point the feature extraction of the image is complete: one image has been converted into multiple feature matrices, and the subsequent region proposal algorithm finds foreground targets from these feature matrices.
7. The target detection network design method fusing image segmentation features according to claim 1, characterized in that it comprises the following steps: design of the region proposal network:
each feature matrix obtained from the feature extraction network passes through two branches of convolutional layer, BN layer, ReLU layer, convolutional layer, BN layer, and ReLU layer; one branch then passes through a Sigmoid layer to produce the anchor-window probability list, and the other produces the anchor-window coordinate transforms and anchor-window position list; the two are then processed jointly by non-maximum suppression to obtain the final proposal window list.
8. The target detection network design method fusing image segmentation features according to claim 1, characterized in that it comprises the following steps: design of candidate window classification and segmentation processing:
the feature matrix obtained from the RoI Pooling layer is processed by two convolutional layers, and each candidate window yields a multidimensional vector; after a sigmoid function, multiple probability values are obtained, corresponding to the probabilities of the various object classes and the background; the class with the highest probability value is the class of the candidate window;
while the candidate-window probability values are obtained, the position coordinates are further refined according to the object class the target belongs to; because the outputs are numerous, the same object may be labeled repeatedly, so a non-maximum suppression algorithm is applied to the final detection output to delete targets with excessive overlap;
while the detection results are obtained, the target segmentation network also processes the data; after the RoI Align layer, the feature matrix of each candidate window is obtained, several convolution operations are applied to it followed by a deconvolution operation, a binary segmentation map is generated independently for each class, the binarization is completed by a sigmoid function, and which class's binary map the region of interest finally uses is determined by the target category output by the detection branch.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201810860392.7A CN109145769A (en) | 2018-08-01 | 2018-08-01 | The target detection network design method of blending image segmentation feature |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201810860392.7A CN109145769A (en) | 2018-08-01 | 2018-08-01 | The target detection network design method of blending image segmentation feature |
Publications (1)
Publication Number | Publication Date |
---|---|
CN109145769A true CN109145769A (en) | 2019-01-04 |
Family
ID=64799266
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201810860392.7A Withdrawn CN109145769A (en) | 2018-08-01 | 2018-08-01 | The target detection network design method of blending image segmentation feature |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN109145769A (en) |
Cited By (44)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN108764115A (en) * | 2018-05-24 | 2018-11-06 | 东北大学 | A kind of truck danger based reminding method |
CN109815931A (en) * | 2019-02-01 | 2019-05-28 | 广东工业大学 | A kind of method, apparatus, equipment and the storage medium of video object identification |
CN109871798A (en) * | 2019-02-01 | 2019-06-11 | 浙江大学 | A kind of remote sensing image building extracting method based on convolutional neural networks |
CN109886990A (en) * | 2019-01-29 | 2019-06-14 | 理光软件研究所(北京)有限公司 | A kind of image segmentation system based on deep learning |
CN109902677A (en) * | 2019-01-30 | 2019-06-18 | 深圳北斗通信科技有限公司 | A kind of vehicle checking method based on deep learning |
CN109934161A (en) * | 2019-03-12 | 2019-06-25 | 天津瑟威兰斯科技有限公司 | Vehicle identification and detection method and system based on convolutional neural network |
CN109949334A (en) * | 2019-01-25 | 2019-06-28 | 广西科技大学 | Profile testing method based on the connection of deeply network residual error |
CN109948444A (en) * | 2019-02-19 | 2019-06-28 | 重庆理工大学 | Method for synchronously recognizing, system and the robot of fruit and barrier based on CNN |
CN109978886A (en) * | 2019-04-01 | 2019-07-05 | 北京市商汤科技开发有限公司 | Image processing method and device, electronic equipment and storage medium |
CN110059772A (en) * | 2019-05-14 | 2019-07-26 | 温州大学 | Remote sensing images semantic segmentation method based on migration VGG network |
CN110070030A (en) * | 2019-04-18 | 2019-07-30 | 北京迈格威科技有限公司 | Image recognition and the training method of neural network model, device and system |
CN110110702A (en) * | 2019-05-20 | 2019-08-09 | 哈尔滨理工大学 | It is a kind of that algorithm is evaded based on the unmanned plane for improving ssd target detection network |
CN110136141A (en) * | 2019-04-24 | 2019-08-16 | 佛山科学技术学院 | A kind of image, semantic dividing method and device towards complex environment |
CN110148135A (en) * | 2019-04-03 | 2019-08-20 | 深兰科技(上海)有限公司 | A kind of road surface dividing method, device, equipment and medium |
CN110288082A (en) * | 2019-06-05 | 2019-09-27 | 北京字节跳动网络技术有限公司 | Convolutional neural networks model training method, device and computer readable storage medium |
CN110349138A (en) * | 2019-06-28 | 2019-10-18 | 歌尔股份有限公司 | The detection method and device of the target object of Case-based Reasoning segmentation framework |
CN110530875A (en) * | 2019-08-29 | 2019-12-03 | 珠海博达创意科技有限公司 | A kind of FPCB open defect automatic detection algorithm based on deep learning |
CN110568445A (en) * | 2019-08-30 | 2019-12-13 | 浙江大学 | Laser radar and vision fusion perception method of lightweight convolutional neural network |
CN110609320A (en) * | 2019-08-28 | 2019-12-24 | 电子科技大学 | Pre-stack seismic reflection pattern recognition method based on multi-scale feature fusion |
CN110689061A (en) * | 2019-09-19 | 2020-01-14 | 深动科技(北京)有限公司 | Image processing method, device and system based on alignment feature pyramid network |
CN110852176A (en) * | 2019-10-17 | 2020-02-28 | 陕西师范大学 | High-resolution three-number SAR image road detection method based on Mask-RCNN |
CN110851633A (en) * | 2019-11-18 | 2020-02-28 | 中山大学 | Fine-grained image retrieval method capable of realizing simultaneous positioning and Hash |
CN110941995A (en) * | 2019-11-01 | 2020-03-31 | 中山大学 | Real-time target detection and semantic segmentation multi-task learning method based on lightweight network |
CN111144484A (en) * | 2019-12-26 | 2020-05-12 | 深圳集智数字科技有限公司 | Image identification method and device |
CN111339882A (en) * | 2020-02-19 | 2020-06-26 | 山东大学 | Power transmission line hidden danger detection method based on example segmentation |
CN111415106A (en) * | 2020-04-29 | 2020-07-14 | 上海东普信息科技有限公司 | Truck loading rate identification method, device, equipment and storage medium |
CN111462128A (en) * | 2020-05-28 | 2020-07-28 | 南京大学 | Pixel-level image segmentation system and method based on multi-modal spectral image |
CN111461130A (en) * | 2020-04-10 | 2020-07-28 | 视研智能科技(广州)有限公司 | High-precision image semantic segmentation algorithm model and segmentation method |
CN111580151A (en) * | 2020-05-13 | 2020-08-25 | 浙江大学 | SSNet model-based earthquake event time-of-arrival identification method |
CN111640125A (en) * | 2020-05-29 | 2020-09-08 | 广西大学 | Mask R-CNN-based aerial photograph building detection and segmentation method and device |
CN111695380A (en) * | 2019-03-13 | 2020-09-22 | 杭州海康威视数字技术股份有限公司 | Target detection method and device |
CN111723829A (en) * | 2019-03-18 | 2020-09-29 | 四川大学 | Full-convolution target detection method based on attention mask fusion |
CN111753579A (en) * | 2019-03-27 | 2020-10-09 | 杭州海康威视数字技术股份有限公司 | Detection method and device for designated walk-substituting tool |
CN111932553A (en) * | 2020-07-27 | 2020-11-13 | 北京航空航天大学 | Remote sensing image semantic segmentation method based on area description self-attention mechanism |
CN111985473A (en) * | 2020-08-20 | 2020-11-24 | 中再云图技术有限公司 | Method for identifying private business of store |
CN112001225A (en) * | 2020-07-06 | 2020-11-27 | 西安电子科技大学 | Online multi-target tracking method, system and application |
CN112215128A (en) * | 2020-10-09 | 2021-01-12 | 武汉理工大学 | FCOS-fused R-CNN urban road environment identification method and device |
CN112396582A (en) * | 2020-11-16 | 2021-02-23 | 南京工程学院 | Mask RCNN-based equalizing ring skew detection method |
CN113298036A (en) * | 2021-06-17 | 2021-08-24 | 浙江大学 | Unsupervised video target segmentation method |
CN113435271A (en) * | 2021-06-10 | 2021-09-24 | 中国电子科技集团公司第三十八研究所 | Fusion method based on target detection and instance segmentation model |
CN113487622A (en) * | 2021-05-25 | 2021-10-08 | 中国科学院自动化研究所 | Head and neck organ image segmentation method and device, electronic equipment and storage medium |
CN115546483A (en) * | 2022-09-30 | 2022-12-30 | 哈尔滨市科佳通用机电股份有限公司 | Method for measuring residual using amount of carbon slide plate of subway pantograph based on deep learning |
CN116452600A (en) * | 2023-06-15 | 2023-07-18 | 上海蜜度信息技术有限公司 | Instance segmentation method, system, model training method, medium and electronic equipment |
CN117252790A (en) * | 2023-08-23 | 2023-12-19 | 成都理工大学 | Multi-image fusion method based on NSCT-RCNN |
Citations (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN103984915A (en) * | 2014-02-28 | 2014-08-13 | 中国计量学院 | Pedestrian re-recognition method in monitoring video |
CN106874894A (en) * | 2017-03-28 | 2017-06-20 | 电子科技大学 | A kind of human body target detection method based on the full convolutional neural networks in region |
CN107507126A (en) * | 2017-07-27 | 2017-12-22 | 大连和创懒人科技有限公司 | A kind of method that 3D scenes are reduced using RGB image |
CN107527351A (en) * | 2017-08-31 | 2017-12-29 | 华南农业大学 | A kind of fusion FCN and Threshold segmentation milking sow image partition method |
CN108062756A (en) * | 2018-01-29 | 2018-05-22 | 重庆理工大学 | Image, semantic dividing method based on the full convolutional network of depth and condition random field |
CN108346154A (en) * | 2018-01-30 | 2018-07-31 | 浙江大学 | The method for building up of Lung neoplasm segmenting device based on Mask-RCNN neural networks |
Non-Patent Citations (2)
Title |
---|
KAIMING HE et al.: "Mask R-CNN", arXiv *
TSUNG-YI LIN et al.: "Feature Pyramid Networks for Object Detection", 2017 IEEE Conference on Computer Vision and Pattern Recognition *
Cited By (67)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN108764115B (en) * | 2018-05-24 | 2021-12-14 | 东北大学 | Truck danger reminding method |
CN108764115A (en) * | 2018-05-24 | 2018-11-06 | 东北大学 | A kind of truck danger based reminding method |
CN109949334B (en) * | 2019-01-25 | 2022-10-04 | 广西科技大学 | Contour detection method based on deep reinforced network residual error connection |
CN109949334A (en) * | 2019-01-25 | 2019-06-28 | 广西科技大学 | Profile testing method based on the connection of deeply network residual error |
CN109886990A (en) * | 2019-01-29 | 2019-06-14 | 理光软件研究所(北京)有限公司 | A kind of image segmentation system based on deep learning |
CN109902677A (en) * | 2019-01-30 | 2019-06-18 | 深圳北斗通信科技有限公司 | A kind of vehicle checking method based on deep learning |
CN109902677B (en) * | 2019-01-30 | 2021-11-12 | 深圳北斗通信科技有限公司 | Vehicle detection method based on deep learning |
CN109871798A (en) * | 2019-02-01 | 2019-06-11 | 浙江大学 | A kind of remote sensing image building extracting method based on convolutional neural networks |
CN109815931A (en) * | 2019-02-01 | 2019-05-28 | 广东工业大学 | A kind of method, apparatus, equipment and the storage medium of video object identification |
CN109948444A (en) * | 2019-02-19 | 2019-06-28 | 重庆理工大学 | Method for synchronously recognizing, system and the robot of fruit and barrier based on CNN |
CN109934161A (en) * | 2019-03-12 | 2019-06-25 | 天津瑟威兰斯科技有限公司 | Vehicle identification and detection method and system based on convolutional neural network |
CN111695380B (en) * | 2019-03-13 | 2023-09-26 | 杭州海康威视数字技术股份有限公司 | Target detection method and device |
CN111695380A (en) * | 2019-03-13 | 2020-09-22 | 杭州海康威视数字技术股份有限公司 | Target detection method and device |
CN111723829B (en) * | 2019-03-18 | 2022-05-06 | 四川大学 | Full-convolution target detection method based on attention mask fusion |
CN111723829A (en) * | 2019-03-18 | 2020-09-29 | 四川大学 | Full-convolution target detection method based on attention mask fusion |
CN111753579A (en) * | 2019-03-27 | 2020-10-09 | 杭州海康威视数字技术股份有限公司 | Detection method and device for designated walk-substituting tool |
CN109978886A (en) * | 2019-04-01 | 2019-07-05 | 北京市商汤科技开发有限公司 | Image processing method and device, electronic equipment and storage medium |
CN110148135A (en) * | 2019-04-03 | 2019-08-20 | 深兰科技(上海)有限公司 | A kind of road surface dividing method, device, equipment and medium |
CN110070030B (en) * | 2019-04-18 | 2021-10-15 | 北京迈格威科技有限公司 | Image recognition and neural network model training method, device and system |
CN110070030A (en) * | 2019-04-18 | 2019-07-30 | 北京迈格威科技有限公司 | Image recognition and the training method of neural network model, device and system |
CN110136141B (en) * | 2019-04-24 | 2023-07-11 | 佛山科学技术学院 | Image semantic segmentation method and device oriented to complex environment |
CN110136141A (en) * | 2019-04-24 | 2019-08-16 | 佛山科学技术学院 | A kind of image, semantic dividing method and device towards complex environment |
CN110059772B (en) * | 2019-05-14 | 2021-04-30 | 温州大学 | Remote sensing image semantic segmentation method based on multi-scale decoding network |
CN110059772A (en) * | 2019-05-14 | 2019-07-26 | 温州大学 | Remote sensing images semantic segmentation method based on migration VGG network |
CN110110702A (en) * | 2019-05-20 | 2019-08-09 | 哈尔滨理工大学 | It is a kind of that algorithm is evaded based on the unmanned plane for improving ssd target detection network |
CN110288082A (en) * | 2019-06-05 | 2019-09-27 | 北京字节跳动网络技术有限公司 | Convolutional neural networks model training method, device and computer readable storage medium |
CN110349138B (en) * | 2019-06-28 | 2021-07-27 | 歌尔股份有限公司 | Target object detection method and device based on example segmentation framework |
CN110349138A (en) * | 2019-06-28 | 2019-10-18 | 歌尔股份有限公司 | The detection method and device of the target object of Case-based Reasoning segmentation framework |
CN110609320A (en) * | 2019-08-28 | 2019-12-24 | 电子科技大学 | Pre-stack seismic reflection pattern recognition method based on multi-scale feature fusion |
CN110530875A (en) * | 2019-08-29 | 2019-12-03 | 珠海博达创意科技有限公司 | A kind of FPCB open defect automatic detection algorithm based on deep learning |
CN110568445A (en) * | 2019-08-30 | 2019-12-13 | 浙江大学 | Laser radar and vision fusion perception method of lightweight convolutional neural network |
CN110689061A (en) * | 2019-09-19 | 2020-01-14 | 深动科技(北京)有限公司 | Image processing method, device and system based on alignment feature pyramid network |
CN110689061B (en) * | 2019-09-19 | 2023-04-28 | 小米汽车科技有限公司 | Image processing method, device and system based on alignment feature pyramid network |
CN110852176A (en) * | 2019-10-17 | 2020-02-28 | 陕西师范大学 | High-resolution three-number SAR image road detection method based on Mask-RCNN |
CN110941995A (en) * | 2019-11-01 | 2020-03-31 | 中山大学 | Real-time target detection and semantic segmentation multi-task learning method based on lightweight network |
CN110851633B (en) * | 2019-11-18 | 2022-04-22 | 中山大学 | Fine-grained image retrieval method capable of realizing simultaneous positioning and Hash |
CN110851633A (en) * | 2019-11-18 | 2020-02-28 | 中山大学 | Fine-grained image retrieval method capable of realizing simultaneous positioning and Hash |
CN111144484A (en) * | 2019-12-26 | 2020-05-12 | 深圳集智数字科技有限公司 | Image identification method and device |
CN111339882B (en) * | 2020-02-19 | 2022-05-31 | 山东大学 | Power transmission line hidden danger detection method based on example segmentation |
CN111339882A (en) * | 2020-02-19 | 2020-06-26 | 山东大学 | Power transmission line hidden danger detection method based on example segmentation |
CN111461130A (en) * | 2020-04-10 | 2020-07-28 | 视研智能科技(广州)有限公司 | High-precision image semantic segmentation algorithm model and segmentation method |
CN111461130B (en) * | 2020-04-10 | 2021-02-09 | 视研智能科技(广州)有限公司 | High-precision image semantic segmentation algorithm model and segmentation method |
CN111415106A (en) * | 2020-04-29 | 2020-07-14 | 上海东普信息科技有限公司 | Truck loading rate identification method, device, equipment and storage medium |
CN111580151A (en) * | 2020-05-13 | 2020-08-25 | 浙江大学 | SSNet model-based earthquake event time-of-arrival identification method |
CN111580151B (en) * | 2020-05-13 | 2021-04-20 | 浙江大学 | SSNet model-based earthquake event time-of-arrival identification method |
CN111462128B (en) * | 2020-05-28 | 2023-12-12 | 南京大学 | Pixel-level image segmentation system and method based on multi-mode spectrum image |
CN111462128A (en) * | 2020-05-28 | 2020-07-28 | 南京大学 | Pixel-level image segmentation system and method based on multi-modal spectral image |
CN111640125A (en) * | 2020-05-29 | 2020-09-08 | 广西大学 | Mask R-CNN-based aerial photograph building detection and segmentation method and device |
CN111640125B (en) * | 2020-05-29 | 2022-11-18 | 广西大学 | Aerial photograph building detection and segmentation method and device based on Mask R-CNN |
CN112001225B (en) * | 2020-07-06 | 2023-06-23 | 西安电子科技大学 | Online multi-target tracking method, system and application |
CN112001225A (en) * | 2020-07-06 | 2020-11-27 | 西安电子科技大学 | Online multi-target tracking method, system and application |
CN111932553A (en) * | 2020-07-27 | 2020-11-13 | 北京航空航天大学 | Remote sensing image semantic segmentation method based on area description self-attention mechanism |
CN111985473A (en) * | 2020-08-20 | 2020-11-24 | 中再云图技术有限公司 | Method for identifying private business of store |
CN112215128A (en) * | 2020-10-09 | 2021-01-12 | 武汉理工大学 | FCOS-fused R-CNN urban road environment identification method and device |
CN112215128B (en) * | 2020-10-09 | 2024-04-05 | 武汉理工大学 | FCOS-fused R-CNN urban road environment recognition method and device |
CN112396582B (en) * | 2020-11-16 | 2024-04-26 | 南京工程学院 | Mask RCNN-based equalizing ring skew detection method |
CN112396582A (en) * | 2020-11-16 | 2021-02-23 | 南京工程学院 | Mask RCNN-based equalizing ring skew detection method |
CN113487622B (en) * | 2021-05-25 | 2023-10-31 | 中国科学院自动化研究所 | Head-neck organ image segmentation method, device, electronic equipment and storage medium |
CN113487622A (en) * | 2021-05-25 | 2021-10-08 | 中国科学院自动化研究所 | Head and neck organ image segmentation method and device, electronic equipment and storage medium |
CN113435271A (en) * | 2021-06-10 | 2021-09-24 | 中国电子科技集团公司第三十八研究所 | Fusion method based on target detection and instance segmentation model |
CN113298036A (en) * | 2021-06-17 | 2021-08-24 | 浙江大学 | Unsupervised video target segmentation method |
CN113298036B (en) * | 2021-06-17 | 2023-06-02 | 浙江大学 | Unsupervised video target segmentation method |
CN115546483A (en) * | 2022-09-30 | 2022-12-30 | 哈尔滨市科佳通用机电股份有限公司 | Method for measuring residual usage amount of subway pantograph carbon slide plate based on deep learning |
CN115546483B (en) * | 2022-09-30 | 2023-05-12 | 哈尔滨市科佳通用机电股份有限公司 | Deep learning-based method for measuring residual usage amount of carbon slide plate of subway pantograph |
CN116452600B (en) * | 2023-06-15 | 2023-10-03 | 上海蜜度信息技术有限公司 | Instance segmentation method, system, model training method, medium and electronic equipment |
CN116452600A (en) * | 2023-06-15 | 2023-07-18 | 上海蜜度信息技术有限公司 | Instance segmentation method, system, model training method, medium and electronic equipment |
CN117252790A (en) * | 2023-08-23 | 2023-12-19 | 成都理工大学 | Multi-image fusion method based on NSCT-RCNN |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN109145769A (en) | The target detection network design method of blending image segmentation feature | |
CN109284669A (en) | Pedestrian detection method based on Mask RCNN | |
CN110188705B (en) | Remote traffic sign detection and identification method suitable for vehicle-mounted system | |
CN110363215B (en) | Method for converting SAR images into optical images based on generative adversarial network |
CN109977793A (en) | Roadside image pedestrian segmentation method based on variable-scale multi-feature fusion convolutional network |
CN109359684A (en) | Fine-grained model recognition method based on weakly supervised localization and subclass similarity measurement |
CN108509978A (en) | Multi-class target detection method and model based on CNN multi-stage feature fusion |
CN111553201B (en) | Traffic light detection method based on YOLOv3 optimization algorithm | |
CN111126202A (en) | Optical remote sensing image target detection method based on dilated feature pyramid network |
CN107871119A (en) | Object detection method based on object spatial knowledge and two-stage prediction learning |
CN108985269A (en) | Fusion network driving environment perception model based on convolution and dilated convolution structures |
CN110119728A (en) | Optical remote sensing image cloud detection method based on multi-scale fusion semantic segmentation network |
CN108009518A (en) | Hierarchical traffic sign recognition method based on fast binary convolutional neural networks |
CN107122776A (en) | Road traffic sign detection and recognition method based on convolutional neural networks |
CN110532946B (en) | Method for identifying axle type of green-traffic vehicle based on convolutional neural network | |
CN109785344A (en) | Dual-channel residual network remote sensing image segmentation method based on feature recalibration |
CN110197152A (en) | Road target recognition method for automated driving systems |
CN113160062B (en) | Infrared image target detection method, device, equipment and storage medium | |
CN107092884A (en) | Rapid coarse-fine cascade pedestrian detection method | |
CN104657980A (en) | Improved multi-channel image partitioning algorithm based on Meanshift | |
CN110060273A (en) | Remote sensing image landslide plotting method based on deep neural network | |
CN111882620A (en) | Road drivable area segmentation method based on multi-scale information | |
CN105894030A (en) | High-resolution remote sensing image scene classification method based on hierarchical multi-feature fusion |
CN110633727A (en) | Deep neural network ship target fine-grained identification method based on selective search | |
CN109635726A (en) | Landslide identification method based on deep network fusion of symmetric multi-scale pooling |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication ||
SE01 | Entry into force of request for substantive examination ||
WW01 | Invention patent application withdrawn after publication | Application publication date: 20190104 |