CN109145769A - Target detection network design method fusing image segmentation features - Google Patents

Target detection network design method fusing image segmentation features

Info

Publication number
CN109145769A
CN109145769A (application CN201810860392.7A)
Authority
CN
China
Prior art keywords
network
target
feature
image segmentation
segmentation
Prior art date
Legal status
Withdrawn
Application number
CN201810860392.7A
Other languages
Chinese (zh)
Inventor
孙福明
蔡希彪
贾旭
Current Assignee
Liaoning University of Technology
Original Assignee
Liaoning University of Technology
Priority date
Filing date
Publication date
Application filed by Liaoning University of Technology
Priority to CN201810860392.7A
Publication of CN109145769A
Status: Withdrawn

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06V: IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00: Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10: Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00: Computing arrangements based on biological models
    • G06N3/02: Neural networks
    • G06N3/04: Architecture, e.g. interconnection topology
    • G06N3/045: Combinations of networks
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00: Computing arrangements based on biological models
    • G06N3/02: Neural networks
    • G06N3/08: Learning methods
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06T: IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00: Image analysis
    • G06T7/10: Segmentation; Edge detection
    • G06T7/13: Edge detection
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06T: IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00: Image analysis
    • G06T7/10: Segmentation; Edge detection
    • G06T7/194: Segmentation; Edge detection involving foreground-background segmentation

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • General Health & Medical Sciences (AREA)
  • General Engineering & Computer Science (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Biomedical Technology (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Multimedia (AREA)
  • Human Computer Interaction (AREA)
  • Image Analysis (AREA)

Abstract

A target detection network design method fusing image segmentation features; the method is particularly effective for large targets. Building on the general-purpose detection framework Mask RCNN while fusing image segmentation features, the target segmentation features and the features obtained by the ResNet-101 convolutional network are integrated, on the basis of the basic Mask RCNN algorithm, and fed together into the RPN module, the RoI Pooling module, and the RoI Align module. Experiments show that the method is particularly effective for large targets; given an image segmentation algorithm that also performs well on small targets, the method could be improved across the board, enabling self-iteration of Mask RCNN.

Description

Target detection network design method fusing image segmentation features
Technical field
The invention belongs to the field of pedestrian detection methods, and in particular relates to a target detection network design method fusing image segmentation features.
Background art
With the development of science and technology and the progress of the times, our way of life keeps changing. People's modes of travel are constantly updated, and the automobile is the most common vehicle today. According to the Traffic Management Bureau of the Ministry of Public Security, by the end of June 2017 the number of registered motor vehicles nationwide reached 304 million, of which 205 million were automobiles, making road traffic safety particularly pressing. According to incomplete statistics, deaths from traffic accidents in low- and middle-income countries each year account for more than 90% of the global total, yet the vehicles owned by these countries account for only 48% of the world's vehicles. Behind these shocking figures lies a deep problem of traffic safety. Analysis of the causes of traffic accidents reveals many contributing factors, but insufficient attention to pedestrians is one of the most important.
To address this problem, researchers at home and abroad have proposed many solutions, the most typical being the Advanced Driver Assistance System (ADAS), in which the key technology is pedestrian detection.
The ultimate purpose of pedestrian detection technology is to judge whether pedestrians are present in a video sequence or image and, on that basis, to accurately outline their positions. Although current research can identify pedestrian images to a certain degree, many cases still cannot be identified with complete accuracy.
In pedestrian detection, early feature extraction mainly used the HOG feature. Because HOG is hand-designed, its extraction pipeline is fixed and it recognizes pedestrians well only when they keep a standing pose. Many researchers at the time therefore proposed the idea of feature fusion, combining HOG with other image features such as image segmentation features, image depth features, and image edge features. Recently, convolutional neural networks have developed rapidly in computer vision and have gradually replaced hand-designed features, but their performance still leaves room for improvement, and the idea of feature fusion remains applicable here.
In pedestrian detection models designed with the Mask RCNN algorithm, recognizing smaller targets requires enlarging the image so that small targets fall within the range of the region proposal windows; however, this also magnifies some non-human details, whose shapes then approximate pedestrian shapes.
Summary of the invention
The object of the present invention is to provide a target detection network design method fusing image segmentation features; the method is particularly effective for large targets.
Building on the general-purpose detection framework Mask RCNN while fusing image segmentation features, the target segmentation features and the features extracted by the ResNet-101 convolutional network are integrated, on the basis of the basic Mask RCNN algorithm, and fed together into the RPN module, the RoI Pooling module, and the RoI Align module. Experiments on the MS COCO test set verify its effectiveness.
Target segmentation images used in multi-feature-fusion pedestrian detection have the property that enlargement does not produce excessive detail. Based on this idea, image segmentation features are fused into the Mask RCNN algorithm to improve its performance.
The advantages are:
The technical solution describes in detail the algorithm design for fusing image segmentation features into Mask RCNN. It first explains the motivation for introducing an image segmentation method and the procedure of this method. It then describes the DeepLabv3 image segmentation network and explains the effect of the newly introduced dilated convolution. Next, it details the feature fusion between the image segmentation network and the feature pyramid. Finally, experiments show that the method is particularly effective for large targets; given an image segmentation algorithm that performs well on small targets, the method could be improved comprehensively, enabling self-iteration of Mask RCNN.
Brief description of the drawings
Fig. 1 is a schematic diagram of the structure of the Mask RCNN algorithm fusing image segmentation features.
Fig. 2 is a schematic diagram of dilated convolution: (a) 3*3 convolution kernel.
Fig. 3 is a schematic diagram of dilated convolution: (b) 7*7 convolution kernel.
Fig. 4 is a schematic diagram of dilated convolution: (c) 15*15 convolution kernel.
Fig. 5 is a before/after comparison of image segmentation network processing: (a) original image.
Fig. 6 is a before/after comparison of image segmentation network processing: (b) result after processing by the image segmentation network.
Fig. 7 is a before/after comparison of image segmentation network processing.
Fig. 8 is a schematic diagram of fusing image segmentation features.
Fig. 9 is a schematic diagram of the residual network unit structure.
Fig. 10 is a schematic diagram of part of the residual network structure.
Fig. 11 is the flow chart of the region proposal network algorithm.
Fig. 12 is the flow chart of candidate window classification and segmentation.
Fig. 13 is a histogram of human target height distribution in the MS COCO training set.
Fig. 14 shows Cityscapes test results at different resolutions: (a) histogram at 0.5x scaling.
Fig. 15 shows Cityscapes test results at different resolutions: (b) histogram at 1x scaling.
Fig. 16 shows Cityscapes test results at different resolutions: (c) histogram at 2x scaling.
Fig. 17 shows Cityscapes test results at different scaling ratios.
Fig. 18 shows in-vehicle drivers and passengers falsely detected as pedestrians: (a) a passenger inside the vehicle.
Fig. 19 shows in-vehicle drivers and passengers falsely detected as pedestrians: (b) a driver inside the vehicle.
Fig. 20 is a before/after comparison of applying the optimized algorithm: (a) before the improvement.
Fig. 21 is a before/after comparison of applying the optimized algorithm: (b) after the improvement.
Fig. 22 shows the test results of the optimized algorithm.
Specific embodiment
Target detection network design method fusing image segmentation features:
Network structure design:
The structure of the Mask RCNN algorithm fusing image segmentation features is shown in Fig. 1.
Introduction to the image segmentation network:
As seen from Fig. 1, unlike the target segmentation branch inside Mask RCNN, the target segmentation network added for this feature fusion is a module with independent processing capability. Here the DeepLabv3 semantic segmentation algorithm is selected as the target segmentation network. The DeepLabv3 method consists of two steps:
(1) Obtain a preliminary segmentation map with a fully convolutional network and interpolate it to the original image size.
(2) Apply the fully connected CRFs algorithm to the interpolated segmentation result to refine the details, iterating several times to obtain the best segmentation.
DeepLabv3's fully convolutional network gives it an end-to-end structure that is very convenient to train. Its dilated convolution design (also called atrous convolution) can effectively replace pooling layers and reduce information loss. Image segmentation algorithms based on convolutional neural networks use an encoder-decoder model: the encoder progressively reduces the spatial dimensions of the input data, and the decoder (e.g. a deconvolution network) progressively restores target details and their spatial positions.
The encoder usually reduces the input data with pooling layers to enlarge the receptive field, but pooling loses a great deal of information, while enlarging the receptive field by increasing the kernel size would greatly increase computation. This is especially true in the middle of the encoder, where the feature matrices typically have 512 or 1024 channels; replacing a 3*3 kernel there with 5*5 or 7*7 would make the computation explode. Dilated convolution is therefore introduced to replace the pooling operation; schematic diagrams are shown in Figs. 2-4.
Fig. 2(a) shows a normal 3*3 kernel; in Fig. 3(b) a 3*3 dilated kernel replaces a 7*7 kernel, and in Fig. 4(c) a 3*3 dilated kernel replaces a 15*15 kernel. The blue region is the area covered by the kernel; apart from the red dots, the remaining kernel entries are zero. As Figs. 2-4 show, dilated convolution can replace pooling, enlarging the receptive field without losing information or increasing computation, so that each convolution output covers a larger area.
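As a small check on the receptive-field claim above: a dilated kernel with k taps per axis and dilation rate r covers k + (k-1)(r-1) positions per axis. The rates below are illustrative values chosen to reproduce the 3*3 → 7*7 and 3*3 → 15*15 replacements of Figs. 3-4; the patent's figures may instead stack several dilated layers.

```python
def effective_kernel(k: int, rate: int) -> int:
    """Effective receptive field of a k x k kernel dilated by `rate`:
    rate - 1 zeros are inserted between adjacent kernel taps."""
    return k + (k - 1) * (rate - 1)

# A 3*3 kernel at dilation rates 1, 3 and 7 covers the same area as
# dense 3*3, 7*7 and 15*15 kernels, with no extra parameters.
print([effective_kernel(3, r) for r in (1, 3, 7)])  # [3, 7, 15]
```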
The comparison before and after the image segmentation network is shown in Figs. 5-6.
Fig. 5 is the original image and Fig. 6 the processed result. After processing by the image segmentation network, the contours of nearby pedestrians are very clear, while those of distant pedestrians are not. The result obtained after feature fusion, the region proposal network, and candidate window classification and segmentation is shown in Fig. 7.
In Fig. 7, the scores beside the persons fall in the range 0.994-0.998: 0.998 appears most often (seven times), 0.994 twice, and 0.996 three times.
Comparing Figs. 5-6 with Fig. 7 shows that the output of the image segmentation network cannot accurately locate every target in the image, but it can roughly separate a complex background from the targets, laying a foundation for more accurate detection and target segmentation later. This shows how important fusing image segmentation features is.
Because the Mask RCNN network in this design introduces a feature pyramid structure in feature extraction, the added target segmentation network must be adjusted accordingly. The feature fusion takes Faster RCNN as the basic framework, with part of the Vgg16 convolutional network as the feature extractor, which finally yields a single overall feature matrix, whereas the introduced feature pyramid produces 5 feature matrices of successively decreasing resolution. The fusion with the DeepLabv3 image segmentation network is shown in Fig. 8. As the figure shows, to fuse the feature matrix output by the segmentation network with the feature matrices output by the feature pyramid, pooling layers must repeatedly reduce the matrix resolution. Several fusion methods are possible, such as element-wise addition or multiplication, or channel concatenation at equal resolution. Here the input image segmentation feature has 3 channels, which does not match the 256 channels of the feature pyramid output, so the channel concatenation method is used, raising the channel count to 259; a convolutional layer then reduces the channel count back to 256 so that it can be connected directly to the following network.
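A minimal sketch of the channel-concatenation fusion described above (256 + 3 = 259 channels, then a 1*1 convolution back to 256). The 1*1 convolution is modelled as a per-pixel matrix multiply, and the weight matrix is a stand-in for learned parameters.

```python
import numpy as np

def fuse_segmentation_features(fpn_feat, seg_feat, w_reduce):
    """Concatenate a 3-channel segmentation feature with a 256-channel FPN
    feature along the channel axis, then reduce 259 -> 256 channels with a
    1*1 convolution (equivalent to a matrix multiply at each pixel)."""
    fused = np.concatenate([fpn_feat, seg_feat], axis=-1)  # H x W x 259
    return fused @ w_reduce                                # H x W x 256

fpn = np.zeros((8, 8, 256))
seg = np.zeros((8, 8, 3))
w = np.random.randn(259, 256)  # hypothetical 1*1-conv weights
print(fuse_segmentation_features(fpn, seg, w).shape)  # (8, 8, 256)
```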
The feature extraction network of Mask RCNN is a single track; the newly added target segmentation network and the original feature extraction network form two branches. During training, when gradients back-propagate, the original network weights must be frozen and only the weights of the newly added target segmentation network trained. If the original network weights are not frozen, the error introduced by the newly added network will severely pollute the original network and cause serious degradation of model performance.
The weight training of the new network is built on the original Mask RCNN, again using MS COCO as the training set, and is divided into three steps: (1) first freeze all original weights and train only the newly added target segmentation network until the loss function returns to its original value; (2) freeze the weights from the input to the FPN together with the target segmentation network weights, and train the remaining weights; (3) once the network has stabilized, train all of its weights.
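The three-stage freezing schedule above can be sketched as a predicate over layer names. The name prefixes ("seg_", "resnet_", "fpn_") are hypothetical labels for this illustration, not identifiers from the original implementation.

```python
def stage_trainable(layer_name: str, stage: int) -> bool:
    """Whether a layer is trained in a given stage of the schedule:
    stage 1 trains only the new segmentation network, stage 2 freezes
    the input-to-FPN backbone plus the segmentation network and trains
    the rest, stage 3 fine-tunes everything."""
    is_seg = layer_name.startswith("seg_")
    is_backbone = layer_name.startswith(("resnet_", "fpn_"))
    if stage == 1:
        return is_seg
    if stage == 2:
        return not (is_seg or is_backbone)
    return True  # stage 3

print(stage_trainable("seg_conv1", 1))      # True
print(stage_trainable("rpn_conv", 1))       # False
print(stage_trainable("rpn_conv", 2))       # True
print(stage_trainable("resnet_block3", 3))  # True
```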
Initial experimental analysis of the foregoing:
After fusing image segmentation features, Mask RCNN shows little improvement in recognition rate for small and medium targets in pedestrian detection, but a marked improvement for large targets. The root cause of this phenomenon is that the fused image segmentation features lack clarity on small targets and can act only as seed points, whereas their effect on extracting larger targets is significant. As image segmentation algorithms keep improving, the performance gains brought by feature fusion will keep growing. The experimental data are shown in Table 1. The test set is the human targets of the MS COCO test set.
Table 1. Accuracy comparison
Feature extraction network design:
A currently well-performing open-source CNN is selected as the feature extraction network. ResNet-101 performs excellently at feature extraction; compared with other convolutional neural networks it adds a residual function, which allows a CNN to reach great depth without degrading. In computer vision, as the number of layers increases, the extracted features become higher-level and closer to semantic information. Before residual networks appeared, too many layers caused gradient vanishing or gradient explosion; with the degradation problem solved, performance keeps rising as depth increases: ResNet-50, ResNet-101, and ResNet-152 improve steadily.
The residual function structure is shown in Fig. 9. Let the input feature matrix be x and the intermediate weight network be F; then the feature matrix output to the next layer is H(x) = F(x) + x, and the function the unit fits is F(x) = H(x) - x. Its original purpose was to simulate identity learning: it is believed that learning the mapping F(x) = x is harder for a network than learning F(x) = 0. In addition, another practical effect of the residual structure is that changes in the output feature matrix have a larger influence on the intermediate weight network F, making it more sensitive. The idea of the residual network is to remove the identical main component and thereby highlight changes in the feature matrix. This closely resembles a differential amplifier in a circuit: a differential amplifier suppresses line interference in long-distance signal transmission, and a residual network resolves gradient vanishing or gradient explosion in deep networks.
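The identity-shortcut behaviour described above can be checked numerically, assuming only H(x) = F(x) + x:

```python
import numpy as np

def residual_unit(x, F):
    """Residual unit: the layer outputs H(x) = F(x) + x, so the weight
    network only has to fit the residual F(x) = H(x) - x."""
    return F(x) + x

x = np.array([1.0, 2.0, 3.0])
# If the residual branch learns F(x) = 0, the unit is an identity mapping,
# which is easier than making plain weight layers learn F(x) = x directly.
out = residual_unit(x, lambda v: np.zeros_like(v))
print(out)  # [1. 2. 3.]
```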
The structure of the feature extraction part is shown in Fig. 10.
In Fig. 10, the x branch after the Max pooling layer adds a convolutional layer with a 1*1 kernel and a BN (Batch Normalization) layer, which change the matrix dimension; in the subsequent residual units this shortcut has no convolution or BN layer, and the input feature matrix is added directly to F(x). Each F(x) structure uses three convolutions: the first and last are 1*1 kernels, which change the matrix dimension, and the middle one is a 3*3 kernel. The rest of the residual network is a repeated stacking of this unit, with the middle kernel always 3*3 and the channel count of the feature matrix changing continually.
The Mask RCNN algorithm is reproduced with the Keras deep learning framework on the TensorFlow backend, and the model is trained on the MS COCO data set. The current mainstream in target detection combines FPN with the fusion of context information. Fusing context information means combining low-level visual information with high-level semantic information; this can be done in several ways, such as element-wise addition or multiplication, or channel-increasing concatenation of feature maps. Element-wise addition works well, so it is adopted here.
The TensorFlow framework supports multi-GPU parallel training well. With stochastic gradient descent, the more sample images per batch, the better the model generalizes and the more stably the loss declines; but MS COCO image resolutions vary. To improve batching, input images are uniformly resized to 1024*1024 resolution while preserving the sample image's original aspect ratio, and the remainder is zero-padded.
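A sketch of the aspect-ratio-preserving resize described above; the rounding rule is an assumption, and the original implementation may differ in detail.

```python
def resize_and_pad(h: int, w: int, target: int = 1024):
    """Scale so the longer side equals `target` (aspect ratio preserved),
    then zero-pad the shorter side up to target x target."""
    scale = target / max(h, w)
    new_h, new_w = round(h * scale), round(w * scale)
    pad_h, pad_w = target - new_h, target - new_w
    return (new_h, new_w), (pad_h, pad_w)

# A 480 x 640 MS COCO image becomes 768 x 1024 plus 256 rows of zeros.
print(resize_and_pad(480, 640))  # ((768, 1024), (256, 0))
```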
Table 2 lists the specific dimensions of the ResNet-101 residual network feature matrices when combined with the feature pyramid, together with the resolutions after the feature pyramid's dimensionality reduction. The feature pyramid has 5 levels while the residual network outputs only 4 feature matrices; the last-level feature matrix is obtained by directly downsampling the second-to-last level.
Combining context information helps small target detection, and pedestrian detection involves many small targets that must be detected. After the FPN feature matrices are obtained, the higher-level feature matrices are deconvolved so that their dimensions match those of the preceding level, and the two are fused by element-wise matrix addition.
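A toy version of the top-down fusion step just described, with nearest-neighbour upsampling standing in for the deconvolution mentioned in the text:

```python
import numpy as np

def top_down_merge(coarse, fine):
    """Upsample the coarser FPN level 2x (nearest-neighbour here stands in
    for a learned deconvolution) and fuse with the finer level by
    element-wise addition."""
    up = coarse.repeat(2, axis=0).repeat(2, axis=1)
    assert up.shape == fine.shape
    return up + fine

coarse = np.ones((2, 2, 256))
fine = np.full((4, 4, 256), 0.5)
merged = top_down_merge(coarse, fine)
print(merged.shape, float(merged[0, 0, 0]))  # (4, 4, 256) 1.5
```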
Table 2. Feature matrix resolution comparison
At this point the feature extraction of the image is complete: each image has been converted into 5 feature matrices, from which the region proposal algorithm next finds foreground targets.
Region proposal network design:
The basic procedure of the RCNN family of algorithms is: first extract features from the image, then obtain foreground targets from the feature matrix. Earlier methods selected foreground targets with sliding windows, which can be thought of as many small tasks accumulated and processed serially. Faster RCNN proposed the RPN region proposal network structure, which uses anchors, turning the serially processed sliding-window task into a parallel anchor task and greatly accelerating processing. Mask RCNN selects foreground targets almost exactly as Faster RCNN does, but because of the FPN layers, each anchor on each FPN level does not use the 3 scales and 3 shapes (9 combinations in total) of Faster RCNN; instead it uses a single scale with 3 shapes: vertical rectangle, horizontal rectangle, and square. For example, the feature matrix of the first FPN level has resolution 256*256*256 and generates 256*256*3 pre-selection windows in total, and the feature matrix of the second level has resolution 128*128*256 and generates 128*128*3 windows. From the original image resolution and each feature matrix resolution, the coordinates of each anchor's three pre-selection windows can be computed. A convolutional layer with a 3*3 kernel then yields a new matrix whose values are the target probability and the four coordinate offsets generated for each window. Four variables Pcx, Pcy, Pw, Ph denote the centre abscissa, centre ordinate, window width, and window height of each anchor's pre-selection window. The four coordinate offsets dx, dy, dw, dh are the translation of the window centre's abscissa, the translation of its ordinate, the window width scaling factor, and the window height scaling factor. The refined window values P'cx, P'cy, P'w, P'h are produced as in formula (1):
P'cx = Pw*dx + Pcx, P'cy = Ph*dy + Pcy, P'w = Pw*exp(dw), P'h = Ph*exp(dh) (1)
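The window refinement of formula (1) can be written directly as code. The exponential parameterisation of the scale factors follows the standard Faster R-CNN convention, which the text's "scaling factor" wording suggests but does not spell out.

```python
import math

def refine_window(Pcx, Pcy, Pw, Ph, dx, dy, dw, dh):
    """Apply the four predicted offsets to an anchor pre-selection window:
    centre translation scaled by window size, log-space width/height
    scaling (standard Faster R-CNN parameterisation)."""
    return (Pw * dx + Pcx,      # new centre abscissa P'cx
            Ph * dy + Pcy,      # new centre ordinate P'cy
            Pw * math.exp(dw),  # new window width    P'w
            Ph * math.exp(dh))  # new window height   P'h

# Zero offsets leave the window unchanged.
print(refine_window(100, 50, 32, 64, 0, 0, 0, 0))  # (100, 50, 32.0, 64.0)
```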
Faster RCNN generates all anchor windows on a single feature layer, while Mask RCNN with FPN generates anchor windows of different sizes on different feature layers: as the feature layers deepen they become more abstract, and each anchor corresponds to a larger area of the original image. Table 3 gives the window size corresponding to each RPN layer's anchors.
Table 3. RPN anchor window settings
Table 3 shows that the anchor window settings broadly cover targets of every size in the MS COCO data set.
The overall flow of the region proposal network is shown in Fig. 11.
The 5 feature matrices obtained from the feature extraction network are processed as in the flow chart of Fig. 11 to obtain the final list of recommended windows. This design uses a Softmax classifier rather than an SVM: unlike SVM scores, the Softmax classifier's scores have a probability meaning, because Softmax maps the scores into probability space, so the final score is the probability that the target belongs to that category.
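A minimal Softmax, showing the probability meaning claimed above (outputs are positive and sum to 1):

```python
import math

def softmax(scores):
    """Map raw window scores to a probability distribution: unlike SVM
    margins, each output is the probability of that class."""
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

probs = softmax([3.2, 0.5, -1.1])
print(round(sum(probs), 10))    # 1.0
print(probs.index(max(probs)))  # 0  (highest score -> highest probability)
```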
With input images adjusted to 1024*1024 resolution, the number of generated windows is enormous; even after anchor windows extending beyond the image border are removed at a later stage, the count is still very large, and classifying, regressing, and segmenting every window would require great computation. The windows are therefore ranked by their target probability value and, after non-maximum suppression, 2000 recommended windows are kept in the training stage and 1000 in the test stage.
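A sketch of the greedy non-maximum suppression step described above. The text states the kept-window budgets (2000 training, 1000 testing) but not the NMS IoU threshold, so the 0.7 used here is illustrative.

```python
def iou(a, b):
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    ix = max(0, min(a[2], b[2]) - max(a[0], b[0]))
    iy = max(0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = ix * iy
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)

def nms(boxes, scores, iou_thresh=0.7, keep=2000):
    """Greedy NMS: visit windows by descending score, drop any window that
    overlaps an already-kept window above `iou_thresh`, stop at `keep`."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    kept = []
    for i in order:
        if all(iou(boxes[i], boxes[j]) <= iou_thresh for j in kept):
            kept.append(i)
        if len(kept) == keep:
            break
    return kept

boxes = [(0, 0, 10, 10), (0, 1, 10, 11), (20, 20, 30, 30)]
scores = [0.9, 0.8, 0.7]
print(nms(boxes, scores))  # [0, 2] -- the near-duplicate window 1 is suppressed
```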
For matching anchors with ground-truth boxes: among all anchors, the one with the highest overlap with a ground-truth box, together with any anchor whose overlap exceeds 0.7, is a positive sample; anchors with overlap below 0.3 are negative samples; the rest are neutral. To keep positive and negative sample counts balanced, the number of positives must not exceed half of the total anchors selected, and any excess positives or negatives are set to neutral. "All anchors" here means the union of the anchors from the different FPN levels.
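The anchor labelling rule above can be written as a small function (+1 positive, -1 negative, 0 neutral):

```python
def label_anchor(max_overlap: float, is_best_for_some_gt: bool) -> int:
    """Label an anchor from its highest overlap with any ground-truth box.
    The best-matching anchor for a ground-truth box is positive even if
    its overlap is below 0.7."""
    if is_best_for_some_gt or max_overlap > 0.7:
        return 1   # positive sample
    if max_overlap < 0.3:
        return -1  # negative sample
    return 0       # neutral sample

print(label_anchor(0.75, False))  # 1
print(label_anchor(0.5, True))    # 1  (best match for some ground-truth box)
print(label_anchor(0.1, False))   # -1
print(label_anchor(0.5, False))   # 0
```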
Candidate window classification and segmentation design:
As noted above, preliminary foreground targets have been obtained from the RPN region proposal network; this subsection processes the obtained proposals.
Because of the fully connected layers in Faster RCNN, images of arbitrary resolution must undergo a unified quantization before those layers. Faster RCNN performs this with the RoI Pooling layer, but that layer cannot guarantee pixel-level one-to-one correspondence between input and output; this process hardly affects classification, but it strongly affects pixel-level target segmentation. The RoI Align layer of Mask RCNN removes all the quantization in RoI Pooling. In this reproduction, the target detection branch keeps the 7*7 size of Faster RCNN's RoI Pooling, while the target segmentation branch uses 14*14. By using bilinear interpolation, this design reduces the error introduced when regions of interest are extracted from the feature map.
Both the target segmentation network and the target detection network receive candidate windows from the region proposal network and, combined with the 5 levels of feature matrices from the feature pyramid, extract the local feature matrices corresponding to the candidate windows, that is, the feature matrices output by the RoI Pooling and RoI Align layers. The two work on the same principle and differ only in resolution.
The flow of candidate window classification and segmentation is shown in Fig. 12. As the figure shows, in the reproduction the feature matrix from the RoI Pooling layer is processed by two convolutional layers with 1*1 kernels, which act as fully connected layers. Since the MS COCO data set is rather complex, dropout layers are not added after the convolutional layers as in Faster RCNN. Each candidate window yields an 81-dimensional vector which, after the sigmoid function, gives 81 probability values corresponding to 80 object classes and the background; the class with the highest probability is the window's category, and at final model output the object is marked in the image only when its probability exceeds the set threshold. While the probability values are obtained, the window's position coordinates are also further refined according to the class the target belongs to. Because the output contains many results and the same object may be labelled repeatedly, a non-maximum suppression algorithm is applied at the final detection output to remove highly overlapping targets.
While detection results are obtained, the target segmentation network also processes the proposals. The feature matrix of each candidate window obtained from the RoI Align layer passes through several convolutions followed by a deconvolution, and a binary segmentation map is generated separately for each class; binarization is completed by the sigmoid function. Which class's binary map a region of interest finally uses is decided by the target category output by the detection branch, which also avoids the inter-class competition usually faced in target segmentation.
Loss function design:
On top of the Faster RCNN loss, Mask RCNN adds an Lmask term that applies a cross-entropy operation to the predicted target segmentation binary map, giving a multi-task learning scheme. The overall loss function is shown in formula (2):
L = Lcls + Lbox + Lmask + Lp + Lr (2).
Here Lmask is the target segmentation loss, Lcls the target detection classification loss, Lbox the target detection coordinate regression loss, Lr the weight regularization loss, and Lp the region proposal network loss.
(1) Target detection classification loss:
During training, the detection network receives 200 region proposal windows, with a positive-to-negative sample ratio of 1:2. Let pi be the probability the network assigns to the correct class of window i; Lcls is the classification loss over the 200 proposal windows, with cross entropy selected as the metric, calculated as in formula (3):
Lcls = -(1/200) Σi ln pi (3).
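As an illustration, the mean cross-entropy over the proposal windows can be computed as below; `correct_class_probs` is a hypothetical stand-in for the probabilities the network assigns to the correct class of each of the 200 proposals:

```python
import math

def classification_loss(correct_class_probs):
    """Mean cross-entropy over the region proposals: the average of
    -ln(p_i), where p_i is the probability assigned to the correct
    class of proposal i (N = 200 in the patent's setting)."""
    n = len(correct_class_probs)
    return -sum(math.log(p) for p in correct_class_probs) / n
```

A perfectly confident, always-correct classifier yields a loss of zero; a uniform 0.5 probability on a single proposal yields ln 2.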
(2) Target detection coordinate regression loss:
The coordinate regression loss uses a different metric from the classification loss: smooth L1 is selected as its measurement standard, calculated as in formula (4):
smoothL1(x) = 0.5x^2 if |x| < 1, |x| - 0.5 otherwise (4).
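A minimal sketch of the smooth L1 metric used for coordinate regression (the standard Faster RCNN definition: quadratic below 1, linear above, so large coordinate errors do not dominate the gradient):

```python
def smooth_l1(x):
    """Smooth L1: 0.5*x^2 for |x| < 1, |x| - 0.5 otherwise."""
    ax = abs(x)
    return 0.5 * x * x if ax < 1.0 else ax - 0.5

def box_regression_loss(predicted, target):
    """Sum of smooth L1 over the four box offsets of one window."""
    return sum(smooth_l1(p - t) for p, t in zip(predicted, target))
```
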
(3) Target segmentation loss:
During training, the detection network receives 200 region proposal windows, and the target segmentation branch outputs 200 matrices of size 28*28 whose elements are probability values between 0 and 1. The logarithmic loss function is selected to measure the segmentation result; its definition on a single data point is given in formula (5):
Cost(y, p(y|x)) = -y ln p(y|x) - (1-y) ln(1-p(y|x)) (5).
The segmentation image matrix of each window has dimension 28*28, so Lmask is the per-pixel loss of formula (5) averaged over the matrix, as in formula (6):
Lmask = (1/28^2) Σi,j Cost(yij, p(yij|x)) (6).
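The mask loss can be illustrated as the per-pixel log loss of formula (5) averaged over the 28*28 matrix; the helpers below are a sketch, not the patent's implementation, and the epsilon clamp is an added numerical safeguard:

```python
import math

def binary_cross_entropy(y, p, eps=1e-7):
    """Log loss for one mask pixel: -y ln p - (1-y) ln(1-p)."""
    p = min(max(p, eps), 1.0 - eps)  # avoid log(0)
    return -y * math.log(p) - (1.0 - y) * math.log(1.0 - p)

def mask_loss(gt_mask, pred_mask):
    """Per-pixel log loss averaged over the mask of the ground-truth
    class only; binary maps of the other classes do not contribute."""
    total, count = 0.0, 0
    for gt_row, pr_row in zip(gt_mask, pred_mask):
        for y, p in zip(gt_row, pr_row):
            total += binary_cross_entropy(y, p)
            count += 1
    return total / count
```
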
(4) Region proposal network loss:
The RPN only needs to decide whether a candidate window is foreground, so it is a binary classification problem; its loss Lp can be calculated by analogy with Lcls and Lbox.
(5) Weight regularization loss:
Lr is the sum of squares of all weight coefficients multiplied by the proportionality coefficient α, as in formula (7), where w denotes the trainable weight parameters of the network:
Lr = α Σ w^2 (7).
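A one-line sketch of this regularization term, with alpha as the proportionality coefficient and the weight list flattened for illustration:

```python
def weight_regularization(weights, alpha=1e-4):
    """L_r = alpha * sum of squared trainable weights."""
    return alpha * sum(w * w for w in weights)
```
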
Network hyperparameter settings:
The Mask RCNN network is designed end to end, which brings great convenience to training: it improves the efficiency of the whole procedure and effectively lowers the barrier for the operator. The drawback is that when performance needs improving it is hard to notice where the problem lies; with step-by-step training, problems can be located through controlled comparisons, bottlenecks broken through, and performance improved. During training, part of the weights can therefore be deliberately frozen to meet the needs of step-by-step training.
When training a deep learning model, one must prepare not only the labeled dataset, the network structure and the initial network weight parameters, but also set the hyperparameters that control the training process; the hyperparameters of this network are listed in Table 4.
Table 4 Hyperparameter settings
Second experimental analysis of the foregoing:
The training set of the MS COCO dataset is selected as the Mask RCNN training set. It contains 80 different target classes, and its target annotations, especially for small targets, are more careful and complete than those of other multi-class datasets. Before training, the training images are scaled to 1024*1024 resolution while preserving the aspect ratio, and the sizes of the human targets in them are counted statistically; the result is shown in Figure 13. The histogram makes clear that the target sizes are widely but unevenly distributed, with most human target heights concentrated around 30 pixels.
In Mask RCNN, the regions of interest corresponding to each anchor of the bottom feature map in the FPN have areas of 16*64, 32*32 and 64*16. During training and testing, a positive sample must overlap an anchor's region of interest with a ratio of 0.7; if the target area is too small, sufficient overlap cannot be obtained, so the recognition rate for smaller targets is very poor. We scale the Cityscapes dataset by factors of 0.5, 1 and 2, and test the three resulting resolutions with the model trained on the MS COCO training set. The test result is shown in Figure 13: the blue histogram shows the heights of all human targets in the test set, and the red histogram shows the heights of the targets the model correctly recalls. The figure shows that after the input image is scaled by 0.5, small targets whose height is below 16 pixels can no longer be recognized.
In general, the more training samples there are when training an image classification model, the more accurate the learned model. The numbers of human-target samples at different scales in the MS COCO dataset differ considerably, which leads to different degrees of training at each scale: scales with sufficient samples train well, while scales with insufficient samples perform relatively poorly. Comparing Figure 15(b) with Figure 16(c) in Figures 14-16 shows that the same target is recognized noticeably worse after being enlarged.
In practice, the labor and time costs of building a dataset are enormous, so this non-uniform distribution of target scales is hard to avoid or change. However, model performance can be improved without modifying the dataset by the following methods:
(1) For the problem of fixed anchor sizes, the smallest samples in the actual query/test set can be enlarged until they fall within the recognizable range.
(2) While small targets are enlarged, normal-sized targets are enlarged as well and their recognition rate drops. When GPU memory usage is not a cost concern, several scaled versions of one image can be processed simultaneously, the resulting target statistics merged, and duplicate targets removed from the final result with a non-maximum suppression algorithm.
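Strategy (2) above can be sketched as follows; `detect_fn` is a hypothetical hook standing in for the trained detector, returning boxes and scores in the scaled image's coordinates, and the IoU threshold is illustrative:

```python
def _iou(a, b):
    """Intersection-over-union of two (x1, y1, x2, y2) boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    if inter == 0.0:
        return 0.0
    area = lambda r: (r[2] - r[0]) * (r[3] - r[1])
    return inter / (area(a) + area(b) - inter)

def multi_scale_detect(image, scales, detect_fn, iou_threshold=0.5):
    """Run detection at several zoom factors, map the boxes back to the
    original image coordinates, then de-duplicate with greedy NMS."""
    boxes, scores = [], []
    for s in scales:
        b, sc = detect_fn(image, s)
        boxes.extend(tuple(c / s for c in bb) for bb in b)  # undo the zoom
        scores.extend(sc)
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        if all(_iou(boxes[i], boxes[j]) < iou_threshold for j in keep):
            keep.append(i)
    return [boxes[i] for i in keep], [scores[i] for i in keep]
```
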
For the pedestrian detection problem, the way Mask RCNN's general detection framework and its region proposal network algorithm are formulated imposes a hard cutoff on small-target recognition: targets whose area is below a certain threshold cannot be recognized. Adjusting the resolution of the input image can improve small-target detection capability. In the pedestrian detection field there are many evaluation criteria for model performance; a comparatively reasonable one is the MR-FPPI (miss rate against false positives per image) curve. Here the Mask RCNN model is trained on the MS COCO training set; for testing we select the finely annotated training and validation sets of Cityscapes, scale them by 0.5, 1 and 2, and test each. The test results are plotted as MR-FPPI curves, shown in Figure 17.
The horizontal axis is the FPPI index and the vertical axis the MR index; the red, blue and green lines correspond to scaling factors of 0.5, 1 and 2 respectively. When the average number of false detections per image is 1, scalings of 1 and 2 achieve accuracies of 0.7 and 0.73 respectively. The smaller the area enclosed by the curve and the two axes, the better the model's true performance: with as few false positives per image as possible, the miss rate obtained is comparatively low.
Figure 17 shows that when the FPPI index is low the three curves are interwoven, i.e. the model's detection performance is similar for the three resolutions. As FPPI increases, i.e. as more false positives per image are tolerated, the miss rate drops markedly for the high-resolution images, while for the low-resolution images the miss rate has a lower limit. The reason is that the feature pyramid and its corresponding anchors have a smallest window: when a target is too small, its score is too low to pass the feature-pyramid and anchor mechanism, it is filtered out in the subsequent region proposal network, and the target is thus lost outright by the detection and segmentation networks that follow. After the image is enlarged, the targets are enlarged with it; and since high-resolution images contain fewer large targets than low-resolution images contain small ones, the MR-FPPI curves take the form seen in Figure 17.
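The MR and FPPI indices described above can be illustrated from simple per-image counts; the sketch below computes one operating point, and sweeping the detector's score threshold traces the full curve:

```python
def mr_fppi_point(num_missed, num_gt, num_false_positives, num_images):
    """One operating point of the MR-FPPI curve.

    MR (miss rate) = missed ground-truth targets / all ground-truth targets,
    FPPI = false positives / number of test images.
    """
    miss_rate = num_missed / num_gt
    fppi = num_false_positives / num_images
    return miss_rate, fppi
```
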
Optimization of the pedestrian detection algorithm:
What has been described so far is a common optimization applicable to all small-target detection. For the specific pedestrian detection problem, pedestrians have their own particularity: not every person in an image is a pedestrian. Only people walking on the road are pedestrians; people inside vehicles are not. In practice, however, drivers and passengers inside vehicles are often misjudged as pedestrians, as shown in Figures 18-19.
In Mask RCNN, the edge of an object is accurate to the pixel level rather than just a rectangular box. On this basis, whether a person is inside or outside a vehicle can be judged accurately. The judgment process is as follows:
(1) Determine whether the person is a pedestrian by checking whether a motorcycle or bicycle is detected below them: for a driver or passenger inside a vehicle, a bicycle or motorcycle never appears beneath them.
(2) Judge whether the person is inside or outside the vehicle from the pixel overlap ratio between the person's mask and the vehicle's mask.
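Step (2) can be sketched with the pixel-level masks represented as 0/1 matrices; the 0.8 threshold below is illustrative, not a value from the patent:

```python
def mask_overlap_ratio(person_mask, vehicle_mask):
    """Fraction of the person's mask pixels that also lie inside the
    vehicle's mask; masks are 0/1 matrices at pixel resolution."""
    person = inter = 0
    for p_row, v_row in zip(person_mask, vehicle_mask):
        for p, v in zip(p_row, v_row):
            person += p
            inter += p and v
    return inter / person if person else 0.0

def is_inside_vehicle(person_mask, vehicle_mask, threshold=0.8):
    """Treat the person as driver/passenger when most of their
    pixel-level silhouette overlaps the vehicle."""
    return mask_overlap_ratio(person_mask, vehicle_mask) >= threshold
```
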
Detection results before and after applying this algorithm are shown in Figures 20-21: drivers and passengers inside vehicles are no longer labeled, while pedestrians whose masks overlap a vehicle heavily because they are on a motorcycle are not judged as vehicle occupants. This improves accuracy to a certain degree.
For the problem of in-vehicle drivers and passengers being misjudged in pedestrian detection, the MR-FPPI curve is again used to test performance, applied to the Cityscapes dataset at the 2x scaling that gave the best results; the test result is shown in Figure 22.
As Figure 22 shows, the blue line after optimization lies below the red line before optimization; after optimization, with the input image scaled to 2x resolution, the accuracy is 0.75.

Claims (8)

1. A target detection network design method fusing image segmentation features, characterized by comprising the following steps:
the Mask RCNN algorithm fusing image segmentation features: the input image is passed through the feature extraction network and the image segmentation network respectively, and then through the region proposal network, candidate window classification, and the segmentation network.
2. The target detection network design method fusing image segmentation features according to claim 1, characterized by comprising the following steps: the image segmentation network is a module with independent processing ability; the target segmentation network selects the DeepLabv3 semantic segmentation algorithm, and the DeepLabv3 method is divided into two steps:
1) obtaining a preliminary segmentation result map with a fully convolutional network and interpolating it to the original image size;
2) using the fully connected CRFs algorithm to finely revise the details of the interpolated image segmentation result, iterating several times to obtain the optimal segmentation result;
the encoder first successively reduces the spatial dimensions of the input data, and the decoder then successively restores the details of the target and the corresponding spatial positions.
3. The target detection network design method fusing image segmentation features according to claim 1, characterized by comprising the following steps: in order to fuse with the feature matrix output by the feature pyramid, the feature matrix output by the image segmentation network must pass through a pooling layer to reduce its resolution.
4. The target detection network design method fusing image segmentation features according to claim 1, characterized by comprising the following steps: the feature extraction network is composed of two branches, the single-track target segmentation network and image segmentation; during training, when gradient backpropagation is performed, the original network weight parameters must be frozen and only the weight parameters of the newly added target segmentation network are trained.
5. The target detection network design method fusing image segmentation features according to claim 4, characterized by comprising the following steps: the weight parameters of the newly added target segmentation network are trained as follows, the weight training of the new network being divided into three steps: 1) first freeze all original weight parameters and train only the newly added target segmentation network weights until the loss function returns to its original value; 2) freeze the weights of the network from the frame input to the FPN together with the target segmentation network weights, and train the remaining weights; 3) after stabilization, train all the weights of the network.
6. The target detection network design method fusing image segmentation features according to claim 1, characterized by comprising the following steps: feature extraction: after the max pooling layer, the x branch adds a convolutional layer with a 1*1 kernel and a BN layer, whose role is to change the matrix dimension; the subsequent residual structures omit this convolutional layer and BN layer, and the input feature matrix is added directly to F(x); each F(x) structure uses three convolutions, in which the first and last have 1*1 kernels serving to change the matrix dimension and the middle one has a 3*3 kernel; the rest of the residual network is a repeated stacking of this structure;
the specific dimensions of the feature matrices when the residual network and the feature pyramid are combined, and the resolutions after the feature pyramid reduces their dimensions, are listed; the feature pyramid outputs one more layer than the residual network has feature matrices, and the feature matrix of its last layer is obtained by direct dimensionality reduction of the second-to-last layer;
contextual information is combined for small target detection: the pedestrian detection task involves a large number of small targets to be detected, so after the FPN feature matrices are obtained, a deconvolution operation is applied to the high-level feature matrix to make its dimension consistent with the preceding layer's feature matrix, and the two are fused by element-wise matrix addition;
at this point the feature extraction of the image is complete: the image has been converted into feature matrices, and the subsequent region proposal algorithm finds foreground targets from the multiple feature matrices.
7. The target detection network design method fusing image segmentation features according to claim 1, characterized by comprising the following steps: design of the region proposal network:
each feature matrix obtained by the feature extraction network passes through two parallel paths, each consisting of a convolutional layer, BN layer, ReLU layer, convolutional layer, BN layer and ReLU layer; one path feeds a sigmoid layer to produce the anchor window probability list, and the other performs the anchor window coordinate transformation to produce the anchor window position list; the two are then processed jointly by non-maximum suppression, finally yielding the recommended window list.
8. The target detection network design method fusing image segmentation features according to claim 1, characterized by comprising the following steps: design of candidate window classification and segmentation processing:
the feature matrix obtained by the RoI Pooling layer is processed by two convolutional layers, and each candidate window yields a multidimensional vector; after processing by the sigmoid function, multiple probability values corresponding to the object classes and the background are obtained, and the class with the highest probability is taken as the target class of the candidate window;
while the candidate window probabilities are obtained, the position coordinates are further refined according to the class the target belongs to; since the output contains many results, the same object may be labeled repeatedly, so a non-maximum suppression algorithm is applied to the final detection output to remove targets with high overlap;
while the detection results are obtained, the target segmentation network also processes each window: the RoI Align layer yields a feature matrix for each candidate window, several convolution operations are applied followed by a deconvolution operation, and a separate binary segmentation map is generated for each class; binarization is completed by the sigmoid function, and which class's binary map a region of interest finally uses is decided by the target category output by the detection branch.
CN201810860392.7A 2018-08-01 2018-08-01 The target detection network design method of blending image segmentation feature Withdrawn CN109145769A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201810860392.7A CN109145769A (en) 2018-08-01 2018-08-01 The target detection network design method of blending image segmentation feature


Publications (1)

Publication Number Publication Date
CN109145769A true CN109145769A (en) 2019-01-04

Family

ID=64799266

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201810860392.7A Withdrawn CN109145769A (en) 2018-08-01 2018-08-01 The target detection network design method of blending image segmentation feature

Country Status (1)

Country Link
CN (1) CN109145769A (en)

Cited By (44)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108764115A (en) * 2018-05-24 2018-11-06 东北大学 A kind of truck danger based reminding method
CN109815931A (en) * 2019-02-01 2019-05-28 广东工业大学 A kind of method, apparatus, equipment and the storage medium of video object identification
CN109871798A (en) * 2019-02-01 2019-06-11 浙江大学 A kind of remote sensing image building extracting method based on convolutional neural networks
CN109886990A (en) * 2019-01-29 2019-06-14 理光软件研究所(北京)有限公司 A kind of image segmentation system based on deep learning
CN109902677A (en) * 2019-01-30 2019-06-18 深圳北斗通信科技有限公司 A kind of vehicle checking method based on deep learning
CN109934161A (en) * 2019-03-12 2019-06-25 天津瑟威兰斯科技有限公司 Vehicle identification and detection method and system based on convolutional neural network
CN109949334A (en) * 2019-01-25 2019-06-28 广西科技大学 Profile testing method based on the connection of deeply network residual error
CN109948444A (en) * 2019-02-19 2019-06-28 重庆理工大学 Method for synchronously recognizing, system and the robot of fruit and barrier based on CNN
CN109978886A (en) * 2019-04-01 2019-07-05 北京市商汤科技开发有限公司 Image processing method and device, electronic equipment and storage medium
CN110059772A (en) * 2019-05-14 2019-07-26 温州大学 Remote sensing images semantic segmentation method based on migration VGG network
CN110070030A (en) * 2019-04-18 2019-07-30 北京迈格威科技有限公司 Image recognition and the training method of neural network model, device and system
CN110110702A (en) * 2019-05-20 2019-08-09 哈尔滨理工大学 It is a kind of that algorithm is evaded based on the unmanned plane for improving ssd target detection network
CN110136141A (en) * 2019-04-24 2019-08-16 佛山科学技术学院 A kind of image, semantic dividing method and device towards complex environment
CN110148135A (en) * 2019-04-03 2019-08-20 深兰科技(上海)有限公司 A kind of road surface dividing method, device, equipment and medium
CN110288082A (en) * 2019-06-05 2019-09-27 北京字节跳动网络技术有限公司 Convolutional neural networks model training method, device and computer readable storage medium
CN110349138A (en) * 2019-06-28 2019-10-18 歌尔股份有限公司 The detection method and device of the target object of Case-based Reasoning segmentation framework
CN110530875A (en) * 2019-08-29 2019-12-03 珠海博达创意科技有限公司 A kind of FPCB open defect automatic detection algorithm based on deep learning
CN110568445A (en) * 2019-08-30 2019-12-13 浙江大学 Laser radar and vision fusion perception method of lightweight convolutional neural network
CN110609320A (en) * 2019-08-28 2019-12-24 电子科技大学 Pre-stack seismic reflection pattern recognition method based on multi-scale feature fusion
CN110689061A (en) * 2019-09-19 2020-01-14 深动科技(北京)有限公司 Image processing method, device and system based on alignment feature pyramid network
CN110852176A (en) * 2019-10-17 2020-02-28 陕西师范大学 High-resolution three-number SAR image road detection method based on Mask-RCNN
CN110851633A (en) * 2019-11-18 2020-02-28 中山大学 Fine-grained image retrieval method capable of realizing simultaneous positioning and Hash
CN110941995A (en) * 2019-11-01 2020-03-31 中山大学 Real-time target detection and semantic segmentation multi-task learning method based on lightweight network
CN111144484A (en) * 2019-12-26 2020-05-12 深圳集智数字科技有限公司 Image identification method and device
CN111339882A (en) * 2020-02-19 2020-06-26 山东大学 Power transmission line hidden danger detection method based on example segmentation
CN111415106A (en) * 2020-04-29 2020-07-14 上海东普信息科技有限公司 Truck loading rate identification method, device, equipment and storage medium
CN111462128A (en) * 2020-05-28 2020-07-28 南京大学 Pixel-level image segmentation system and method based on multi-modal spectral image
CN111461130A (en) * 2020-04-10 2020-07-28 视研智能科技(广州)有限公司 High-precision image semantic segmentation algorithm model and segmentation method
CN111580151A (en) * 2020-05-13 2020-08-25 浙江大学 SSNet model-based earthquake event time-of-arrival identification method
CN111640125A (en) * 2020-05-29 2020-09-08 广西大学 Mask R-CNN-based aerial photograph building detection and segmentation method and device
CN111695380A (en) * 2019-03-13 2020-09-22 杭州海康威视数字技术股份有限公司 Target detection method and device
CN111723829A (en) * 2019-03-18 2020-09-29 四川大学 Full-convolution target detection method based on attention mask fusion
CN111753579A (en) * 2019-03-27 2020-10-09 杭州海康威视数字技术股份有限公司 Detection method and device for designated walk-substituting tool
CN111932553A (en) * 2020-07-27 2020-11-13 北京航空航天大学 Remote sensing image semantic segmentation method based on area description self-attention mechanism
CN111985473A (en) * 2020-08-20 2020-11-24 中再云图技术有限公司 Method for identifying private business of store
CN112001225A (en) * 2020-07-06 2020-11-27 西安电子科技大学 Online multi-target tracking method, system and application
CN112215128A (en) * 2020-10-09 2021-01-12 武汉理工大学 FCOS-fused R-CNN urban road environment identification method and device
CN112396582A (en) * 2020-11-16 2021-02-23 南京工程学院 Mask RCNN-based equalizing ring skew detection method
CN113298036A (en) * 2021-06-17 2021-08-24 浙江大学 Unsupervised video target segmentation method
CN113435271A (en) * 2021-06-10 2021-09-24 中国电子科技集团公司第三十八研究所 Fusion method based on target detection and instance segmentation model
CN113487622A (en) * 2021-05-25 2021-10-08 中国科学院自动化研究所 Head and neck organ image segmentation method and device, electronic equipment and storage medium
CN115546483A (en) * 2022-09-30 2022-12-30 哈尔滨市科佳通用机电股份有限公司 Method for measuring residual using amount of carbon slide plate of subway pantograph based on deep learning
CN116452600A (en) * 2023-06-15 2023-07-18 上海蜜度信息技术有限公司 Instance segmentation method, system, model training method, medium and electronic equipment
CN117252790A (en) * 2023-08-23 2023-12-19 成都理工大学 Multi-image fusion method based on NSCT-RCNN

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103984915A (en) * 2014-02-28 2014-08-13 中国计量学院 Pedestrian re-recognition method in monitoring video
CN106874894A (en) * 2017-03-28 2017-06-20 电子科技大学 A kind of human body target detection method based on the full convolutional neural networks in region
CN107507126A (en) * 2017-07-27 2017-12-22 大连和创懒人科技有限公司 A kind of method that 3D scenes are reduced using RGB image
CN107527351A (en) * 2017-08-31 2017-12-29 华南农业大学 A kind of fusion FCN and Threshold segmentation milking sow image partition method
CN108062756A (en) * 2018-01-29 2018-05-22 重庆理工大学 Image, semantic dividing method based on the full convolutional network of depth and condition random field
CN108346154A (en) * 2018-01-30 2018-07-31 浙江大学 The method for building up of Lung neoplasm segmenting device based on Mask-RCNN neural networks


Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
KAIMING HE等: "《Mask R-CNN》", 《ARXIV》 *
TSUNG-YI LIN等: "《Feature Pyramid Networks for Object Detection》", 《2017 IEEE CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION》 *

Cited By (67)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108764115B (en) * 2018-05-24 2021-12-14 东北大学 Truck danger reminding method
CN108764115A (en) * 2018-05-24 2018-11-06 东北大学 A kind of truck danger based reminding method
CN109949334B (en) * 2019-01-25 2022-10-04 广西科技大学 Contour detection method based on deep reinforced network residual error connection
CN109949334A (en) * 2019-01-25 2019-06-28 广西科技大学 Profile testing method based on the connection of deeply network residual error
CN109886990A (en) * 2019-01-29 2019-06-14 理光软件研究所(北京)有限公司 A kind of image segmentation system based on deep learning
CN109902677A (en) * 2019-01-30 2019-06-18 深圳北斗通信科技有限公司 A kind of vehicle checking method based on deep learning
CN109902677B (en) * 2019-01-30 2021-11-12 深圳北斗通信科技有限公司 Vehicle detection method based on deep learning
CN109871798A (en) * 2019-02-01 2019-06-11 浙江大学 A kind of remote sensing image building extracting method based on convolutional neural networks
CN109815931A (en) * 2019-02-01 2019-05-28 广东工业大学 A kind of method, apparatus, equipment and the storage medium of video object identification
CN109948444A (en) * 2019-02-19 2019-06-28 重庆理工大学 Method for synchronously recognizing, system and the robot of fruit and barrier based on CNN
CN109934161A (en) * 2019-03-12 2019-06-25 天津瑟威兰斯科技有限公司 Vehicle identification and detection method and system based on convolutional neural network
CN111695380B (en) * 2019-03-13 2023-09-26 杭州海康威视数字技术股份有限公司 Target detection method and device
CN111695380A (en) * 2019-03-13 2020-09-22 杭州海康威视数字技术股份有限公司 Target detection method and device
CN111723829B (en) * 2019-03-18 2022-05-06 四川大学 Full-convolution target detection method based on attention mask fusion
CN111723829A (en) * 2019-03-18 2020-09-29 四川大学 Full-convolution target detection method based on attention mask fusion
CN111753579A (en) * 2019-03-27 2020-10-09 杭州海康威视数字技术股份有限公司 Detection method and device for designated walk-substituting tool
CN109978886A (en) * 2019-04-01 2019-07-05 北京市商汤科技开发有限公司 Image processing method and device, electronic equipment and storage medium
CN110148135A (en) * 2019-04-03 2019-08-20 深兰科技(上海)有限公司 A kind of road surface dividing method, device, equipment and medium
CN110070030B (en) * 2019-04-18 2021-10-15 北京迈格威科技有限公司 Image recognition and neural network model training method, device and system
CN110070030A (en) * 2019-04-18 2019-07-30 北京迈格威科技有限公司 Image recognition and the training method of neural network model, device and system
CN110136141B (en) * 2019-04-24 2023-07-11 佛山科学技术学院 Image semantic segmentation method and device oriented to complex environment
CN110136141A (en) * 2019-04-24 2019-08-16 佛山科学技术学院 A kind of image, semantic dividing method and device towards complex environment
CN110059772B (en) * 2019-05-14 2021-04-30 温州大学 Remote sensing image semantic segmentation method based on multi-scale decoding network
CN110059772A (en) * 2019-05-14 2019-07-26 温州大学 Remote sensing images semantic segmentation method based on migration VGG network
CN110110702A (en) * 2019-05-20 2019-08-09 哈尔滨理工大学 It is a kind of that algorithm is evaded based on the unmanned plane for improving ssd target detection network
CN110288082A (en) * 2019-06-05 2019-09-27 北京字节跳动网络技术有限公司 Convolutional neural networks model training method, device and computer readable storage medium
CN110349138B (en) * 2019-06-28 2021-07-27 歌尔股份有限公司 Target object detection method and device based on example segmentation framework
CN110349138A (en) * 2019-06-28 2019-10-18 歌尔股份有限公司 The detection method and device of the target object of Case-based Reasoning segmentation framework
CN110609320A (en) * 2019-08-28 2019-12-24 电子科技大学 Pre-stack seismic reflection pattern recognition method based on multi-scale feature fusion
CN110530875A (en) * 2019-08-29 2019-12-03 珠海博达创意科技有限公司 A kind of FPCB open defect automatic detection algorithm based on deep learning
CN110568445A (en) * 2019-08-30 2019-12-13 浙江大学 Laser radar and vision fusion perception method of lightweight convolutional neural network
CN110689061A (en) * 2019-09-19 2020-01-14 深动科技(北京)有限公司 Image processing method, device and system based on alignment feature pyramid network
CN110689061B (en) * 2019-09-19 2023-04-28 小米汽车科技有限公司 Image processing method, device and system based on alignment feature pyramid network
CN110852176A (en) * 2019-10-17 2020-02-28 陕西师范大学 High-resolution three-number SAR image road detection method based on Mask-RCNN
CN110941995A (en) * 2019-11-01 2020-03-31 中山大学 Real-time target detection and semantic segmentation multi-task learning method based on lightweight network
CN110851633B (en) * 2019-11-18 2022-04-22 中山大学 Fine-grained image retrieval method with simultaneous localization and hashing
CN110851633A (en) * 2019-11-18 2020-02-28 中山大学 Fine-grained image retrieval method with simultaneous localization and hashing
CN111144484A (en) * 2019-12-26 2020-05-12 深圳集智数字科技有限公司 Image identification method and device
CN111339882B (en) * 2020-02-19 2022-05-31 山东大学 Power transmission line hidden danger detection method based on example segmentation
CN111339882A (en) * 2020-02-19 2020-06-26 山东大学 Power transmission line hidden danger detection method based on example segmentation
CN111461130A (en) * 2020-04-10 2020-07-28 视研智能科技(广州)有限公司 High-precision image semantic segmentation algorithm model and segmentation method
CN111461130B (en) * 2020-04-10 2021-02-09 视研智能科技(广州)有限公司 High-precision image semantic segmentation algorithm model and segmentation method
CN111415106A (en) * 2020-04-29 2020-07-14 上海东普信息科技有限公司 Truck loading rate identification method, device, equipment and storage medium
CN111580151A (en) * 2020-05-13 2020-08-25 浙江大学 SSNet model-based earthquake event time-of-arrival identification method
CN111580151B (en) * 2020-05-13 2021-04-20 浙江大学 SSNet model-based earthquake event time-of-arrival identification method
CN111462128B (en) * 2020-05-28 2023-12-12 南京大学 Pixel-level image segmentation system and method based on multi-mode spectrum image
CN111462128A (en) * 2020-05-28 2020-07-28 南京大学 Pixel-level image segmentation system and method based on multi-modal spectral image
CN111640125A (en) * 2020-05-29 2020-09-08 广西大学 Mask R-CNN-based aerial photograph building detection and segmentation method and device
CN111640125B (en) * 2020-05-29 2022-11-18 广西大学 Aerial image building detection and segmentation method and device based on Mask R-CNN
CN112001225B (en) * 2020-07-06 2023-06-23 西安电子科技大学 Online multi-target tracking method, system and application
CN112001225A (en) * 2020-07-06 2020-11-27 西安电子科技大学 Online multi-target tracking method, system and application
CN111932553A (en) * 2020-07-27 2020-11-13 北京航空航天大学 Remote sensing image semantic segmentation method based on area description self-attention mechanism
CN111985473A (en) * 2020-08-20 2020-11-24 中再云图技术有限公司 Method for identifying private business of store
CN112215128A (en) * 2020-10-09 2021-01-12 武汉理工大学 FCOS-fused R-CNN urban road environment identification method and device
CN112215128B (en) * 2020-10-09 2024-04-05 武汉理工大学 FCOS-fused R-CNN urban road environment recognition method and device
CN112396582B (en) * 2020-11-16 2024-04-26 南京工程学院 Mask RCNN-based equalizing ring skew detection method
CN112396582A (en) * 2020-11-16 2021-02-23 南京工程学院 Mask RCNN-based equalizing ring skew detection method
CN113487622B (en) * 2021-05-25 2023-10-31 中国科学院自动化研究所 Head-neck organ image segmentation method, device, electronic equipment and storage medium
CN113487622A (en) * 2021-05-25 2021-10-08 中国科学院自动化研究所 Head and neck organ image segmentation method and device, electronic equipment and storage medium
CN113435271A (en) * 2021-06-10 2021-09-24 中国电子科技集团公司第三十八研究所 Fusion method based on target detection and instance segmentation model
CN113298036A (en) * 2021-06-17 2021-08-24 浙江大学 Unsupervised video target segmentation method
CN113298036B (en) * 2021-06-17 2023-06-02 浙江大学 Unsupervised video object segmentation method
CN115546483A (en) * 2022-09-30 2022-12-30 哈尔滨市科佳通用机电股份有限公司 Method for measuring residual using amount of carbon slide plate of subway pantograph based on deep learning
CN115546483B (en) * 2022-09-30 2023-05-12 哈尔滨市科佳通用机电股份有限公司 Deep learning-based method for measuring residual usage amount of carbon slide plate of subway pantograph
CN116452600B (en) * 2023-06-15 2023-10-03 上海蜜度信息技术有限公司 Instance segmentation method, system, model training method, medium and electronic equipment
CN116452600A (en) * 2023-06-15 2023-07-18 上海蜜度信息技术有限公司 Instance segmentation method, system, model training method, medium and electronic equipment
CN117252790A (en) * 2023-08-23 2023-12-19 成都理工大学 Multi-image fusion method based on NSCT-RCNN

Similar Documents

Publication Publication Date Title
CN109145769A (en) Target detection network design method fusing image segmentation features
CN109284669A (en) Pedestrian detection method based on Mask RCNN
CN110188705B (en) Remote traffic sign detection and identification method suitable for vehicle-mounted system
CN110363215B (en) Method for converting SAR image into optical image based on generating type countermeasure network
CN109977793A (en) Roadside image pedestrian segmentation method based on a variable-scale multi-feature fusion convolutional network
CN109359684A (en) Fine-grained vehicle model recognition method based on weakly supervised localization and subclass similarity measurement
CN108509978A (en) Multi-class target detection method and model based on multi-stage CNN feature fusion
CN111553201B (en) Traffic light detection method based on YOLOv3 optimization algorithm
CN111126202A (en) Optical remote sensing image target detection method based on void feature pyramid network
CN107871119A (en) Object detection method based on object spatial knowledge and two-stage prediction learning
CN108985269A (en) Fusion-network driving environment perception model based on a convolution and dilated-convolution encoding structure
CN110119728A (en) Optical remote sensing image cloud detection method based on a multi-scale fusion semantic segmentation network
CN108009518A (en) Hierarchical traffic sign recognition method based on fast binary convolutional neural networks
CN107122776A (en) Road traffic sign detection and recognition method based on convolutional neural networks
CN110532946B (en) Method for identifying axle type of green-traffic vehicle based on convolutional neural network
CN109785344A (en) Remote sensing image segmentation method using a dual-channel residual network based on feature recalibration
CN110197152A (en) Road target recognition method for automated driving systems
CN113160062B (en) Infrared image target detection method, device, equipment and storage medium
CN107092884A (en) Rapid coarse-fine cascade pedestrian detection method
CN104657980A (en) Improved multi-channel image partitioning algorithm based on Meanshift
CN110060273A (en) Remote sensing image landslide plotting method based on deep neural network
CN111882620A (en) Road drivable area segmentation method based on multi-scale information
CN105894030A (en) High-resolution remote sensing image scene classification method based on hierarchical multi-feature fusion
CN110633727A (en) Deep neural network ship target fine-grained identification method based on selective search
CN109635726A (en) Landslide identification method based on deep network fusion and symmetric multi-scale pooling

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
WW01 Invention patent application withdrawn after publication (Application publication date: 20190104)