CN109426825A - Method and device for detecting the closed contour of an object - Google Patents

Method and device for detecting the closed contour of an object

Info

Publication number
CN109426825A
Authority
CN
China
Prior art keywords
feature map
convolution
size
semantic segmentation
sampling
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201810722257.6A
Other languages
Chinese (zh)
Inventor
王泮渠
陈鹏飞
黄泽铧
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Tusimple Future Technology Co Ltd
Original Assignee
Beijing Tusimple Future Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Priority claimed from US 15/693,446, granted as US10067509B1
Application filed by Beijing Tusimple Future Technology Co Ltd
Priority to CN202210425480.0A, published as CN114782705A
Publication of CN109426825A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks

Abstract

The present invention discloses a method and device for detecting the closed contour of an object. The method comprises: in the decoding stage of semantic segmentation, an object closed-contour detection device applies dense upsampling convolution to the feature map output by the encoding stage, obtaining an output image with the same size as the input image that contains the closed contour lines of object instances; according to pixel class, the closed contour lines of the object instances are then identified and extracted from the output image. The method recovers information about small regions or small objects in the image data and compensates for the loss of small-object information caused by downsampling during encoding, which otherwise cannot be recovered.

Description

Method and device for detecting the closed contour of an object
Technical field
The present invention relates to the field of computer vision, and in particular to a method and device for detecting the closed contour of an object.
Background art
Processing of image data plays an important role in fields such as autonomous driving. Semantic segmentation is a technique for recognizing objects from image data: it assigns a class to every pixel of the image.
Object contour detection is a fundamental problem in many visual tasks, including image segmentation, object detection, instance-level semantic segmentation, and closed-contour inference. For an autonomous driving system to operate correctly, it is essential to detect all objects in the traffic environment, such as cars, buses, pedestrians, and bicycles. Failing to detect a single object (for example a car or a person) may cause the motion planning system of an autonomous vehicle to fail and lead to accidents.
Semantic segmentation frameworks provide pixel-level class annotations but cannot label individual object instances. Existing object detection frameworks cannot recover the shapes of objects or handle the detection of occluded objects. This is mainly due to the limitations introduced by bounding-box fusion in conventional frameworks: when nearby bounding boxes belonging to objects of different classes are merged in order to reduce the false positive rate, occluded objects may no longer be detected.
In other words, the prior art cannot accurately and effectively detect the closed contours of objects in image data.
Summary of the invention
In view of this, embodiments of the present invention provide a method and device for detecting the closed contour of an object, to solve the problem that the prior art cannot accurately and effectively detect object contours in image data.
In one aspect, an embodiment of the present application provides a method for detecting the closed contour of an object, comprising:
in the decoding stage of semantic segmentation, an object closed-contour detection device applying dense upsampling convolution to the feature map output by the encoding stage, obtaining an output image with the same size as the input image that contains the closed contour lines of object instances;
according to pixel class, identifying and extracting the closed contour lines of the object instances from the output image.
In another aspect, an embodiment of the present application provides a device for detecting the closed contour of an object, comprising:
a dense upsampling convolution module, configured to apply dense upsampling convolution, in the decoding stage of semantic segmentation, to the feature map output by the encoding stage, obtaining an output image with the same size as the input image that contains the closed contour lines of object instances;
a contour extraction module, configured to identify and extract the closed contour lines of the object instances from the output image according to pixel class.
In another aspect, an embodiment of the present application provides a device for detecting the closed contour of an object, comprising a processor and at least one memory storing at least one machine-executable instruction, the processor executing the at least one machine-executable instruction to perform:
in the decoding stage of semantic segmentation, applying dense upsampling convolution to the feature map output by the encoding stage, obtaining an output image with the same size as the input image that contains the closed contour lines of object instances;
according to pixel class, identifying and extracting the closed contour lines of the object instances from the output image.
According to the technical solutions provided by the embodiments of the present application, a feature map whose size was compressed by encoding can be restored to image data with the same size as the input image. The number of channels of the feature map is converted according to the downsampling factor and the number of predetermined object classes, producing a denser set of feature maps from which the classes of more pixels can be predicted. This recovers information about small regions or small objects in the image data and compensates for the small-object information that is lost by downsampling during encoding and cannot be recovered by bilinear interpolation, thereby solving the problem that the prior art cannot accurately and effectively detect object contours in image data.
Brief description of the drawings
The accompanying drawings provide a further understanding of the present invention and form part of the specification; together with the embodiments they serve to explain the invention and are not to be construed as limiting it.
Fig. 1 is a process flow diagram of the object closed-contour detection method provided by an embodiment of the present application;
Fig. 2 is a process flow diagram of step 101 in Fig. 1;
Fig. 3 is an example of image data obtained by processing an input image with the method shown in Fig. 1;
Fig. 4 is another process flow diagram of the object closed-contour detection method provided by an embodiment of the present application;
Fig. 5 is another process flow diagram of the object closed-contour detection method provided by an embodiment of the present application;
Fig. 6 is a schematic diagram of dilated convolution kernels;
Fig. 7 is a schematic diagram of the network architecture of a semantic segmentation model implementing the method shown in Fig. 5 in a specific application;
Fig. 8 is an example of image data produced by the method shown in Fig. 1;
Fig. 9 is an example of image data produced by the method shown in Fig. 5;
Fig. 10 is an original input image in a specific application scenario;
Fig. 11 is the feature map obtained after extracting features from the input image shown in Fig. 10;
Fig. 12 is a schematic diagram of labeling objects in the input image shown in Fig. 10 with bounding boxes using a prior-art object detection technique;
Fig. 13a is the object instance contour map extracted after applying the object closed-contour detection method provided by an embodiment of the present application to the input image shown in Fig. 10;
Fig. 13b is a visualization obtained by superimposing Fig. 13a on Fig. 10;
Fig. 14 is an original input image in another specific application scenario;
Fig. 15 is the feature map obtained after extracting features from the input image shown in Fig. 14;
Fig. 16 is a schematic diagram of labeling objects in the input image shown in Fig. 14 with bounding boxes using a prior-art object detection technique;
Fig. 17 is the object instance contour map extracted after applying the object closed-contour detection method provided by an embodiment of the present application to the input image shown in Fig. 14;
Fig. 18 is a visualization obtained by superimposing Fig. 17 on Fig. 14;
Fig. 19 is a structural block diagram of the object closed-contour detection device provided by an embodiment of the present application;
Fig. 20 is another structural block diagram of the object closed-contour detection device provided by an embodiment of the present application;
Fig. 21 is another structural block diagram of the object closed-contour detection device provided by an embodiment of the present application;
Fig. 22 is another structural block diagram of the object closed-contour detection device provided by an embodiment of the present application;
Fig. 23 is another structural block diagram of the object closed-contour detection device provided by an embodiment of the present application;
Fig. 24 is another structural block diagram of the object closed-contour detection device provided by an embodiment of the present application.
Detailed description of the embodiments
To enable those skilled in the art to better understand the technical solutions of the present invention, the technical solutions in the embodiments of the present invention are described below clearly and completely with reference to the accompanying drawings. The described embodiments are only some, not all, of the embodiments of the present invention. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments of the present invention without creative effort shall fall within the scope of protection of the present invention.
In the current prior art, semantic segmentation processing generally comprises two parts: decoding of the feature representation and dilated convolution.
Pixel-level semantic segmentation information is obtained by decoding the feature representation, so that the output feature map has the same size as the input image. Because of max pooling and strided convolution in convolutional neural networks, the feature maps of the final network layers are inevitably reduced in size, and several kinds of schemes exist for decoding the low-resolution feature maps back into accurate information. Bilinear interpolation is common because it saves memory and is fast. Deconvolution methods use the pooling location information recorded during pooling to recover the information required for image reconstruction and feature visualization. In some examples, a separate deconvolution layer is added in the decoding stage and prediction results are generated from stacked intermediate-layer feature maps. In other examples, multiple deconvolution layers are used to generate target objects, such as chairs, tables, or cars, from multiple features. Some studies use the stored pooling positions in an unpooling step and build the deconvolution layers as a mirror structure of the convolution layers. Other studies show that, during propagation through the deconvolution layers, coarse-to-fine detection of object structures can be achieved, and these structures are crucial for reconstructing fine details. Still other studies use a similar mirror structure and combine the information of deconvolution layers that perform upsampling to produce the final prediction. Some systems predict label images with high statistical efficiency by using pixel-level classifiers. Among these methods, bilinear interpolation is the most widely used. However, bilinear upsampling obtains an output with the same resolution as the input by filling in zeros; it easily loses detail, drops small-object information, reduces the precision of the image data, and has no learning capability.
Dilated convolution (also called atrous convolution) was originally developed for wavelet decomposition. Its core idea is to insert zeros between the pixels of the convolution kernel so as to enlarge the receptive field on the image, enabling dense feature extraction in deep neural networks. In semantic segmentation frameworks, dilated convolution is also used to enlarge the kernel size. Some studies use serialized layers with increasing dilation rates to achieve context aggregation, and design an Atrous Spatial Pyramid Pooling (ASPP) module that captures multi-scale objects and context information by arranging several dilated convolution layers in parallel. Recently, dilated convolution has been applied to a wide range of tasks, such as object detection, optical flow, visual question answering, and audio generation. However, these convolutional frameworks suffer from a "gridding effect" caused by standard dilated convolution, which makes it impossible to identify the shape or contour of large objects.
The above problems of semantic segmentation techniques, namely the loss of small-object information and the inability to identify the shape or contour of large objects, make it impossible to effectively and accurately extract the closed contours of objects.
To address the above problems in the prior art, embodiments of the present application provide a method and device for detecting the closed contour of an object. In the technical solutions provided by some embodiments, dense upsampling convolution is applied in the decoding stage to the feature map output by the encoding stage, which raises the resolution of the predicted image and recovers more detail, so that more small-object information is retained and the contour information of small objects can further be detected. In the technical solutions provided by other embodiments, multiple hybrid dilated convolutions are applied to the extracted feature map in the encoding stage, which preserves more of the local and long-range information of the convolutions; this overcomes the gridding effect and yields continuous contour information for large objects, so that the contours of large objects can be detected. The technical solutions provided by the embodiments of the present application can therefore accurately and effectively detect object contour information and solve the above problems of the prior art.
On the other hand, with the method provided by the embodiments of the present application, the closed contour information of an object can be extracted directly, without labeling the object with a bounding box. In the prior art, a bounding box cannot capture the actual shape of the object it labels, so the size, area, and other properties of the object cannot be inferred accurately; yet in technical fields such as autonomous driving, this information is key to many decisions. Moreover, some objects are discarded during bounding-box fusion, so their information is lost. With the method provided by the embodiments of the present application, the closed contour of an object can be extracted directly, so that the true shape, size, area, and other properties of the object can be further identified, providing accurate and effective information for other inference or decision processes.
The above is the core idea of the present invention. To enable those skilled in the art to better understand the technical solutions in the embodiments of the present invention, and to make the above objects, features, and advantages of the embodiments more apparent and easier to understand, the technical solutions in the embodiments of the present invention are described in further detail below with reference to the accompanying drawings.
Fig. 1 shows a process flow diagram of the object closed-contour detection method provided by an embodiment of the present application, comprising:
Step 101: in the decoding stage of semantic segmentation, the object closed-contour detection device applies dense upsampling convolution to the feature map output by the encoding stage, obtaining an output image with the same size as the input image that contains the contour lines of object instances;
Step 102: according to pixel class, identify and extract the contour lines of the object instances from the output image.
In step 101, applying dense upsampling convolution to the feature map, as shown in Fig. 2, comprises:
Step 1011: converting the number of channels c of the feature map into the product of the square of the downsampling factor d used in the encoding stage, d², and the number of predetermined object classes L;
For example, the input image size of the model is (H, W, C), where H is the height of the image data, W is the width, and C is the number of channels. The feature map passed from the encoding stage to the decoding stage has size Fout = (h, w, c), where H/d = h, W/d = w, and d is the downsampling factor. In the prior art, bilinear interpolation is used to upsample the feature map. If d = 16, i.e. the output is downsampled 16 times relative to the input, then any object whose length or width is smaller than 16 pixels, such as a distant utility pole, traffic light, traffic sign, or person, will not be sampled, and bilinear upsampling cannot recover this information, so the object is lost in the output image.
In step 1011 provided by the embodiments of the present application, dense upsampling convolution converts the number of channels c in the feature map size Fout into d²*L channels, where d is the downsampling factor and L is the number of predetermined object classes, obtaining the feature map Fout = (h, w, d²*L).
Specifically, according to the numerical ratio d²*L/c between the number of channels c of the original feature map and the number of channels d²*L after conversion, the feature map on each channel is learned, yielding a feature map of size h*w on each of the d²*L channels, so that each dense upsampling convolution layer learns a prediction for each pixel. Learning the feature map on each channel is performed by a learned function obtained from prior neural network training. For example, when the number of channels c of the original feature map equals the number of predetermined object classes L, the feature map on each channel can be learned d² times, yielding feature maps of size h*w on d²*L channels.
Step 1012: combining the feature maps after the channel-number conversion and normalizing the combined feature map, to obtain an output image with the same size as the input image.
That is, the feature maps Fout = (h, w, d²*L) after channel-number conversion are combined into a feature map of size (h*d, w*d, L). Since H/d = h and W/d = w as described above, the combined feature map has size (H, W, L); in other words, the feature map has been upsampled to the same size as the input image.
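For reference, the shape bookkeeping of steps 1011 and 1012 can be written compactly as follows; this only restates the sizes already given above and introduces no new assumptions:

$$(H, W, C)\;\xrightarrow{\ \text{encoding, factor } d\ }\;(h, w, c),\qquad h = \tfrac{H}{d},\quad w = \tfrac{W}{d},$$
$$(h, w, c)\;\xrightarrow{\ \text{step 1011}\ }\;(h, w, d^{2}L)\;\xrightarrow{\ \text{step 1012}\ }\;(h\,d,\ w\,d,\ L) = (H, W, L).$$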
The combination of the feature maps after channel-number conversion may be performed according to the order in which the feature maps were obtained during feature extraction and the channel order. For example, if among the d²*L channels the feature maps on channels n to m were extracted from the x-th row of the input image data, then when combining, the feature maps on channels n to m are placed, in pixel order, into the x-th row of the output data, and so on for the remaining feature maps. The combination of the feature maps can be implemented according to the actual algorithm used in a specific application scenario, and the present application does not specifically limit it here.
Compared with bilinear interpolation in the prior art, which has no learning capability and involves no deconvolution, the dense upsampling provided by the embodiments of the present application is learnable. Before the processing shown in Fig. 1, a semantic segmentation model can be obtained in advance by training a neural network on ground-truth data, and the model contains the dense upsampling convolution layers of the decoding stage. The dense upsampling convolution layers may comprise multiple convolution layers. Specifically, a series of upsampling filters can be learned by training, and these filters upsample a feature map of size Fout = (h, w, c) into image data of size Fout = (h, w, d²*L).
The upsampled output image data is then normalized by a softmax layer to obtain the final output image.
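As a concrete illustration, below is a minimal sketch of the dense upsampling convolution decoder described in steps 1011 and 1012. The patent does not name a framework, so the use of PyTorch, the layer names, and the use of pixel_shuffle as one possible realization of the channel-to-space combination are assumptions made for this example.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class DenseUpsamplingConv(nn.Module):
    """Decode an encoder feature map (N, c, h, w) into per-pixel class scores
    (N, L, H, W) with H = h*d and W = w*d, following steps 1011-1012."""
    def __init__(self, in_channels: int, num_classes: int, downsample_factor: int):
        super().__init__()
        self.d = downsample_factor
        # Step 1011: learn a convolution that converts c channels into d^2 * L channels.
        self.conv = nn.Conv2d(in_channels, downsample_factor ** 2 * num_classes,
                              kernel_size=3, padding=1)

    def forward(self, feature_map: torch.Tensor) -> torch.Tensor:
        x = self.conv(feature_map)          # (N, d^2*L, h, w)
        # Step 1012: rearrange the d^2*L channels into an (H, W) grid with L channels.
        x = F.pixel_shuffle(x, self.d)      # (N, L, h*d, w*d)
        # Softmax normalization so each pixel holds a class distribution.
        return F.softmax(x, dim=1)

# Usage sketch (the channel count, class count, and input size are assumed values):
duc = DenseUpsamplingConv(in_channels=2048, num_classes=19, downsample_factor=16)
scores = duc(torch.randn(1, 2048, 32, 64))   # -> (1, 19, 512, 1024)
```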
Fig. 3 shows an example of processing an input image with the method shown in Fig. 1. The left side of Fig. 3 is the input image, and the remaining parts, from left to right, are the output images for different downsampling factors. It can be seen that some small objects in the input image, such as utility poles and signal lights, are recognized well.
In a specific implementation, the above semantic segmentation model can be obtained by training within a fully convolutional network.
Through the dense upsampling convolution described above, a feature map whose size was compressed can be restored to image data with the same size as the input image data. By converting the number of channels of the feature map according to the downsampling factor and the number of predetermined object classes, a denser set of feature maps is obtained, from which the classes of more pixels can be predicted. This recovers the information of small regions or small objects in the image data and compensates for the small-object information that is lost by downsampling during encoding and cannot be recovered by bilinear interpolation. The method provided by the embodiments of the present application can therefore extract from the output image both the instance information of small objects and their closed contour information.
Moreover, dense upsampling convolution maps the feature map directly to the output label image, without first applying bilinear interpolation to the feature map and then upsampling the interpolated image to obtain the output label image, as in the prior art. On the other hand, dense upsampling convolution operates directly on the feature map at its original resolution, which enables pixel-level decoding.
Further, as shown in Fig. 4, on the basis of the processing shown in Fig. 1, the method provided by the embodiments of the present application also comprises:
Step 103: determining the shape, size, and/or area of an object instance according to its extracted closed contour.
With the method provided by the embodiments of the present application, the closed contour information of an object can be extracted directly, without labeling the object with a bounding box. In the prior art, the true shape of an object cannot be identified from its bounding box, so its size, area, and other properties cannot be inferred accurately; yet in technical fields such as autonomous driving, this information is key to many decisions. With the method provided by the embodiments of the present application, the closed contour of an object can be extracted directly, so that the true shape, size, area, and other properties of the object can be further identified, providing accurate and effective information for other inference or decision processes.
Based on the same inventive concept, on the basis of the method shown in Fig. 1, an embodiment of the present application further provides a method for detecting the closed contour of an object.
Fig. 5 shows the object closed-contour detection method provided by an embodiment of the present application, comprising:
Step 100: in the encoding stage of semantic segmentation, the object closed-contour detection device applies multiple hybrid dilated convolutions to the extracted feature map, obtaining a feature map with an enlarged receptive field;
Step 101: in the decoding stage, dense upsampling convolution is applied to the feature map output by the encoding stage, obtaining an output image with the same size as the input image that contains the closed contour lines of object instances;
Step 102: according to pixel class, identify and extract the closed contour lines of the object instances from the output image.
The processing of step 100 above means that the feature map is convolved on multiple convolution layers using a series of dilation rates.
In the prior art, dilated convolution usually convolves the feature map with a dilated convolution kernel to enlarge its receptive field; the dilated kernel is constructed by inserting zeros between the pixels of the kernel. For a two-dimensional signal with a kernel of size K*K, the dilated kernel has size Kd*Kd, where Kd = K + (K-1)(r-1) and r is the dilation rate. Dilated convolution enlarges the receptive field (or field of view) of the feature map and can replace the pooling layers in a fully convolutional network architecture. For example, if a convolution layer in ResNet-101 has stride s = 2, the stride can be reset to 1 to replace the downsampling operation, and the dilation rate of the subsequent network layers is set to 2. Applying this processing alternately to all network layers that perform downsampling enlarges the receptive field of the output feature map. In practice, dilated convolution is usually applied to the downsampled feature map to achieve a reasonable balance of efficiency and cost. However, dilated convolution causes a gridding effect.
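The dilated-kernel size relation used above can be written out with a small worked example; the numbers K = 3 and r = 2 are taken from the Fig. 6 discussion below:

$$K_d = K + (K - 1)(r - 1), \qquad K = 3,\ r = 2 \;\Rightarrow\; K_d = 3 + 2 \cdot 1 = 5,$$

so only $K \times K = 9$ of the $K_d \times K_d = 25$ positions in the dilated kernel carry non-zero weights.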
In the embodiments of the present application, step 100 above can be implemented as follows: on each of multiple dilated convolution layers, the feature map is convolved with a kernel of size K*K and dilation rate ri, where 1 ≤ i ≤ n and n is the number of convolution layers. This can be implemented in at least one of the following four ways:
Way one: the multiple dilated convolution layers are divided into several groups, and within each group the dilation rate of the dilated convolution layers increases monotonically.
For example, when there are N dilated convolution layers, the N layers can be divided into s groups, each containing at least two convolution layers. The kernel size used within each group is K*K, and within each group the dilation rates increase continuously, i.e. within group s the rates satisfy r_{s,i-2} < r_{s,i-1} < r_{s,i}. In this way the dilation rate increases within each group and, across the first and last dilated convolution layers of successive groups, varies like a sawtooth wave: kernels with smaller dilation rates extract local information, while kernels with larger dilation rates extract long-range information.
Way two: the kernels of the dilated convolution layers have arbitrary dilation rates.
Setting arbitrary dilation rates for the kernels enlarges their receptive fields, so that larger objects can be identified.
Way three: on the basis of way one or way two, the increment of the dilation rate differs from step to step.
For example, with dilation rates r = (1, 2, 5), the increments of the dilation rate are 1 and 3, i.e. the increment is different each time.
Using dilation rates with different increments allows a group of dilated kernels to cover more pixels. If, on the contrary, the increments were identical, for example r = (2, 4, 6, 8) where the dilation rate increases by 2 each time, the gridding effect would be suppressed less effectively.
Way four: on the basis of any of the above three ways, the receptive field of the dilated kernel of the last dilated convolution layer is smaller than or equal to the size of the feature map.
That is, the dilation rates are preset so that the receptive field of the dilated kernel of the last dilated convolution layer is smaller than or equal to the size of the feature map. This enlarges the receptive field of the last convolution layer; in particular, when the receptive field of the dilated kernel equals the size of the feature map, the receptive field covers the whole feature map, so no holes or edge information are lost, which guarantees the consistency and completeness of long-range information.
Way one above is compared with the prior art below.
Fig. 6 shows examples of dilated convolution kernels. In Fig. 6, the surrounding gray pixels are those that contribute to the computation of the central black pixel. Fig. 6a is a schematic diagram of a prior-art dilated convolution kernel. Fig. 6b is a schematic diagram of the kernels of the hybrid dilated convolution provided by an embodiment of the present application.
In Fig. 6(a), the kernel size is 3*3 and the dilation rate from left to right is r = 2. For a pixel p in a dilated convolution layer, the pixels that contribute to it lie in a Kd*Kd neighborhood of the previous layer centered on p; because dilated convolution introduces zero values, only K*K pixels in this Kd*Kd region are actually used, with a spacing of r-1 between non-zero pixels. For example, in a dilated convolution with K = 3 and r = 2, as shown on the left of Fig. 6a, only 9 of the 25 pixels contribute. Since all layers have the same dilation rate r, for a point p in the top dilated convolution layer the maximum possible number of pixels contributing to its computation is (w'*h')/r², where w' and h' are the width and height of the feature map of the bottom dilated convolution layer. Thus, in the top-layer feature map, p can only view its information in a checkerboard pattern, which leads to a large loss of information (when r = 2, about 75% of the information is lost). As r grows in higher convolution layers, the data sampled from the input becomes increasingly sparse, which is harmful to convolution learning because: 1) local information is completely lost; 2) information that is too far apart is uncorrelated. Another consequence is that a point receives information from r*r regions belonging to entirely different "grids", which damages the consistency of the local information.
With the kernels shown in Fig. 6(b), using way one above, several convolution layers form a group and within each group the dilation rate increases, for example K = 3 and r = (1, 2, 3), so that the dilation rate varies like a sawtooth wave. In this way, local information is obtained in the bottom layer on the left, while information from a broader region is obtained in the top layer on the right. Combinations of different dilation rates accommodate the segmentation of both small and large objects: smaller dilation rates extract local information, larger dilation rates extract long-range information.
In the above embodiment of hybrid dilated convolution, setting a series of dilation rates allows the dilated kernels to cover as many pixels as possible during convolution, balancing the extraction of local and long-range information. Moreover, the larger the region covered by the receptive field of the dilated kernels, the fewer holes and the less edge information are lost, which guarantees the consistency and completeness of long-range information. This effectively overcomes the gridding effect, so that the complete, closed shapes and contours of large objects can be obtained.
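As an illustration of way one, below is a minimal sketch of one hybrid dilated convolution group with the sawtooth dilation rates r = (1, 2, 3). It is again written in PyTorch; the layer arrangement, channel counts, activation, and padding choices are assumptions made for the example rather than the patent's exact configuration.

```python
import torch
import torch.nn as nn

class HybridDilatedGroup(nn.Module):
    """One group of dilated 3x3 convolutions with increasing dilation rates,
    e.g. r = (1, 2, 3), so the group covers both local and long-range context."""
    def __init__(self, channels: int, dilation_rates=(1, 2, 3)):
        super().__init__()
        layers = []
        for r in dilation_rates:
            # padding = r keeps the spatial size unchanged for a 3x3 kernel with dilation r.
            layers += [nn.Conv2d(channels, channels, kernel_size=3,
                                 padding=r, dilation=r),
                       nn.ReLU(inplace=True)]
        self.block = nn.Sequential(*layers)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.block(x)

# Stacking several groups repeats the sawtooth pattern 1,2,3, 1,2,3, ...
encoder_tail = nn.Sequential(HybridDilatedGroup(512), HybridDilatedGroup(512))
features = encoder_tail(torch.randn(1, 512, 64, 128))   # spatial size is preserved
```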
On the other hand, since hybrid dilated convolution is learnable, a semantic segmentation model can be obtained in advance by training a neural network on ground-truth data before the processing shown in Fig. 5, and the model contains the hybrid dilated convolution layers of the encoding stage.
In a specific implementation, the above semantic segmentation model can be obtained by training within a fully convolutional network.
Fig. 7 shows the network architecture of a semantic segmentation model that implements the method shown in Fig. 5 and is trained on a ResNet-101 architecture in a specific application. In the encoding stage in Fig. 7, multiple hybrid dilated convolution layers apply hybrid dilated convolution to the extracted feature map; in the decoding stage, multiple dense upsampling convolution layers process the feature map output by the encoding stage to obtain the output label image.
The method shown in Fig. 5 combines the hybrid dilated convolution of the encoding stage with the dense upsampling convolution of the decoding stage. Hybrid dilated convolution effectively enlarges the receptive field of the feature map and identifies the shapes and contours of large objects, while dense upsampling convolution recovers the information of small objects; their combination makes it possible to comprehensively, accurately, and efficiently identify and extract the object instances in the image data and their contours.
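Putting the two sketches above together gives a rough outline of the Fig. 7 pipeline. This is not the patent's exact architecture: the torchvision ResNet-101 backbone, the channel and class counts, and the decision to keep the backbone's native strides are all assumptions, and the classes HybridDilatedGroup and DenseUpsamplingConv are the sketches defined earlier in this description.

```python
import torch
import torch.nn as nn
from torchvision.models import resnet101

class ContourSegmentationNet(nn.Module):
    """Hybrid-dilated encoder followed by a dense upsampling decoder (sketch only)."""
    def __init__(self, num_classes: int = 19):
        super().__init__()
        backbone = resnet101(weights=None)
        # Convolutional trunk of ResNet-101; in this plain form it downsamples by 32.
        # The description above instead resets some strides to 1 and compensates with
        # dilation; that modification is omitted here for brevity.
        self.encoder = nn.Sequential(*list(backbone.children())[:-2])
        self.hdc = HybridDilatedGroup(2048)                    # sketch defined above
        self.duc = DenseUpsamplingConv(2048, num_classes, 32)  # sketch defined above

    def forward(self, image: torch.Tensor) -> torch.Tensor:
        return self.duc(self.hdc(self.encoder(image)))          # (N, L, H, W) scores

# scores = ContourSegmentationNet()(torch.randn(1, 3, 512, 1024))  # -> (1, 19, 512, 1024)
```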
Further, similarly to Fig. 4, the method shown in Fig. 5 may also include step 103, which is not repeated here.
Fig. 8 and Fig. 9 compare the output images of the methods shown in Fig. 1 and Fig. 5. In Fig. 8, from left to right, are the input image, the ground truth, the output image of the method shown in Fig. 1, and the output image of the method shown in Fig. 5. It can be seen in Fig. 8 that, compared with the output of the method shown in Fig. 1, the output of the method shown in Fig. 5 is closer to the ground truth in recognizing small objects. In Fig. 9, the first row is the ground truth, the second row corresponds to the method shown in Fig. 1, and the third row is the output image of the method shown in Fig. 5. It can be seen in Fig. 9 that, compared with the output of the method shown in Fig. 1, the output of the method shown in Fig. 5 overcomes the gridding effect more effectively in recognizing the contours of large objects and is closer to the ground truth.
Another group of images, Figs. 10 to 13b, also shows an example of applying the method provided by the embodiments of the present application. Fig. 10 is the original input image; Fig. 11 is the feature map obtained after extracting features from the input image shown in Fig. 10; Fig. 12 is a schematic diagram of labeling objects with bounding boxes using a prior-art object detection technique; Fig. 13a is the object instance contour map extracted after applying the object closed-contour detection method provided by the embodiments of the present application.
In Fig. 11, objects of several classes are marked with different colors, but the information at the level of individual object instances is lost: for example, all cars are marked with the same color, blue, and labeled with the class "car". However, identifying every object instance in the traffic environment, that is, each car, bus, pedestrian, and bicycle, is crucial for a safe and effective autonomous driving system. A failure to detect a single object instance may cause the motion planning module of the autonomous vehicle to fail or to classify incorrectly, leading to accidents. Semantic segmentation frameworks provide pixel-level object labels, but semantic segmentation alone cannot identify individual object instances.
Fig. 12 is a schematic diagram of labeling objects with bounding boxes using a traditional object detection framework. Although a traditional object detection framework can label objects with bounding boxes, it cannot recover the shapes of objects or handle the detection of their closed contours. In particular, because of the limitations of bounding-box fusion in traditional object detection frameworks, bounding boxes that are close together but label different object instances may be merged in order to reduce the false positive rate, so that the closed contours of objects or object instances cannot be detected, especially when the occluded object is large. As shown in Fig. 12, a traditional object detection framework uses rectangular bounding boxes to recover the shapes or contours of different objects or object instances; in the process of merging the bounding boxes of an object and its neighbors, occluded objects or occluded object instances may be lost during detection.
Fig. 13a shows the output image of the object closed-contour detection method provided by the embodiments of the present application. The method is based on the assumption that objects of a particular class have similar global shapes, so that the detected contours and boundary lines of objects of the same class exhibit a consistent structure. As shown in Fig. 13a, the closed boundary lines of the vehicles parked along the roadside have similar widths and directions. If an algorithmic model can learn this structural information, object contours and closed boundary lines can be recovered and occluded objects can be detected. In the embodiments of the present application, the task of object contour detection is treated as a semantic segmentation task in which both the original input image and the output label image are image data, so that object contour detection can be implemented on a pixel-level semantic segmentation framework. In particular, the embodiments of the present application propose the method shown in Fig. 1. The dense upsampling convolution shown in Fig. 1 is suitable for object contour detection because: 1) dense upsampling is suited to recovering the shape of an object; 2) dense upsampling reaches a higher accuracy than decoding methods such as bilinear upsampling, which easily lose objects narrower than 8 pixels; 3) the recovered object contour must not be too thick, otherwise the object is blurred. Dense upsampling can decode contours of any width, whereas other methods such as bilinear upsampling can only recover contours at least 8 pixels wide. As shown in Fig. 13a, the method provided by the embodiments of the present application can accurately detect instance-level object segmentation from the input image. Fig. 13b shows the visualization obtained by superimposing the extracted closed object contours on the input image.
Another group of images, Figs. 14 to 18, also shows an example of applying the method provided by the embodiments of the present application. Fig. 14 is the original input image; Fig. 15 is the feature map obtained after extracting features from the input image shown in Fig. 14; Fig. 16 is a schematic diagram of labeling objects with bounding boxes using a prior-art object detection technique; Fig. 17 is the object instance contour map extracted after applying the object closed-contour detection method provided by the embodiments of the present application; Fig. 18 is a visualization obtained by superimposing the object contour information of Fig. 17 on the input image shown in Fig. 14.
As can be seen from Fig. 17, with the method provided by the embodiments of the present application, the shape of each individual object instance can be detected accurately, and adjacent occluded objects are not lost. Once the shape and contour of each individual object have been detected, the object contours can be superimposed on the input image shown in Fig. 14 to form a visual representation and to provide accurate and effective object information to the control system of the autonomous vehicle.
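A minimal sketch of step 102 and of the overlay described above is given below, assuming the decoder output is a per-pixel class-score array. The use of NumPy and OpenCV for contour tracing and drawing, and the existence of a dedicated contour class index, are assumptions made for the example.

```python
import cv2
import numpy as np

def extract_closed_contours(class_scores: np.ndarray, contour_class: int):
    """class_scores: (L, H, W) per-pixel class scores from the decoder.
    Returns the closed contour polylines of pixels labeled as contour_class."""
    label_map = class_scores.argmax(axis=0).astype(np.uint8)       # pixel-wise class
    contour_mask = (label_map == contour_class).astype(np.uint8) * 255
    contours, _ = cv2.findContours(contour_mask, cv2.RETR_EXTERNAL,
                                   cv2.CHAIN_APPROX_SIMPLE)
    return contours

def overlay_contours(image: np.ndarray, contours) -> np.ndarray:
    """Draw the extracted contours on the input image for visualization (cf. Fig. 18)."""
    vis = image.copy()
    cv2.drawContours(vis, contours, -1, color=(0, 255, 0), thickness=2)
    return vis

# From a closed contour, shape, size, and area can be estimated as in step 103,
# for example: area = cv2.contourArea(contours[0])
```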
Based on the same inventive concept, an embodiment of the present application also provides a device for detecting the closed contour of an object.
Fig. 19 shows a structural block diagram of the object closed-contour detection device provided by an embodiment of the present application, comprising:
a dense upsampling convolution module 91, configured to apply dense upsampling convolution, in the decoding stage of semantic segmentation, to the feature map output by the encoding stage, obtaining an output image with the same size as the input image that contains the closed contour lines of object instances;
a contour extraction module 92, configured to identify and extract the closed contour lines of the object instances from the output image according to pixel class.
The dense upsampling convolution module 91 is specifically configured to: convert the number of channels (c) of the feature map into the product of the square of the downsampling factor used in the encoding stage (d²) and the number of predetermined object classes (L); combine the feature maps after the channel-number conversion and normalize the combined feature map, obtaining an output image with the same size as the input image.
The dense upsampling convolution module 91 converting the number of channels (c) of the feature map into the product of the downsampling factor (d²) of the encoding stage and the number of predetermined object classes (L) comprises: learning the feature map on each channel according to the numerical ratio between the number of channels (c) of the feature map and the product (d²*L) of the downsampling factor and the number of predetermined object classes, to obtain the number of channels (d²*L) of the converted feature map.
The dense upsampling convolution module 91 combining the feature maps after the channel-number conversion comprises: combining the feature maps after the channel-number conversion according to the order in which the feature maps were obtained during feature extraction and the channel order.
On the basis of the device shown in Fig. 19, as shown in Fig. 20, the device provided by the embodiments of the present application may further comprise:
a determining module 93, configured to determine the shape, size, and/or area of an object instance according to its extracted closed contour.
Based on the same inventive concept, on the basis of the device shown in Fig. 19, as shown in Fig. 21, the device provided by the embodiments of the present application may further comprise:
a hybrid dilated convolution module 90, configured to apply multiple hybrid dilated convolutions, in the encoding stage of semantic segmentation, to the extracted feature map, obtaining a feature map with an enlarged receptive field.
The hybrid dilated convolution module 90 is specifically configured to: on each of multiple dilated convolution layers, convolve the feature map with a kernel of size K*K and dilation rate ri, where 1 ≤ i ≤ n and n is the number of convolution layers.
In some embodiments, the hybrid dilated convolution module 90 is further configured to divide the multiple dilated convolution layers into several groups, the dilation rate of the dilated convolution layers in each group increasing monotonically.
In some embodiments, the kernels of the dilated convolution layers have arbitrary dilation rates.
In some embodiments, the increment of the dilation rate differs from step to step.
In some embodiments, the receptive field of the dilated kernel of the last dilated convolution layer is smaller than or equal to the size of the feature map.
As shown in Fig. 22, on the basis of the device shown in Fig. 21, the device provided by the embodiments of the present application may further comprise:
a first pre-training module 94, configured to obtain a semantic segmentation model in advance by training a neural network on ground-truth data, the semantic segmentation model containing the hybrid dilated convolution layers of the encoding stage.
In some embodiments, the first pre-training module 94 obtains the semantic segmentation model in advance by training a fully convolutional network end-to-end on ground-truth data.
As shown in Fig. 23, on the basis of the device shown in Fig. 19, the device provided by the embodiments of the present application may further comprise:
a second pre-training module 95, configured to obtain a semantic segmentation model in advance by training a neural network on ground-truth data, the semantic segmentation model containing the dense upsampling convolution layers of the decoding stage.
In some embodiments, the second pre-training module 95 obtains the semantic segmentation model in advance by training a fully convolutional network end-to-end on ground-truth data.
According to the above device provided by the embodiments of the present application, the hybrid dilated convolution module effectively enlarges the receptive field of the feature map and identifies the shapes and contours of large objects, while the dense upsampling convolution module recovers the information of small objects; together they comprehensively, accurately, and efficiently identify and extract the object instances in the image data and their contours.
Based on the same inventive concept, an embodiment of the present application also provides a device for detecting the closed contour of an object.
Fig. 24 shows the object closed-contour detection device provided by an embodiment of the present application, comprising a processor 2401 and at least one memory 2402, the at least one memory 2402 storing at least one machine-executable instruction, the processor 2401 executing the at least one machine-executable instruction to perform:
in the decoding stage of semantic segmentation, applying dense upsampling convolution to the feature map output by the encoding stage, obtaining an output image with the same size as the input image that contains the closed contour lines of object instances;
according to pixel class, identifying and extracting the closed contour lines of the object instances from the output image.
The processor 2401 executing the at least one machine-executable instruction to apply dense upsampling convolution to the feature map output by the encoding stage and obtain an output image with the same size as the input image comprises: converting the number of channels (c) of the feature map into the product of the square of the downsampling factor used in the encoding stage (d²) and the number of predetermined object classes (L); combining the feature maps after the channel-number conversion and normalizing the combined feature map, obtaining an output image with the same size as the input image.
The processor 2401 executing the at least one machine-executable instruction to convert the number of channels (c) of the feature map into the product of the downsampling factor (d²) of the encoding stage and the number of predetermined object classes (L) comprises: learning the feature map on each channel according to the numerical ratio between the number of channels (c) of the feature map and the product (d²*L) of the downsampling factor and the number of predetermined object classes, to obtain the number of channels (d²*L) of the converted feature map.
The processor 2401 executing the at least one machine-executable instruction to combine the feature maps after the channel-number conversion comprises: combining the feature maps after the channel-number conversion according to the order in which the feature maps were obtained during feature extraction and the channel order.
In some embodiments, the processor 2401 executing the at least one machine-executable instruction further performs: obtaining a semantic segmentation model in advance by training a neural network on ground-truth data, the semantic segmentation model containing the dense upsampling convolution layers of the decoding stage.
The processor 2401 executing the at least one machine-executable instruction further performs obtaining the semantic segmentation model in advance by training a fully convolutional network end-to-end on ground-truth data.
In other embodiments, the processor 2401 executing the at least one machine-executable instruction further performs: in the encoding stage of semantic segmentation, applying multiple hybrid dilated convolutions to the extracted feature map, obtaining a feature map with an enlarged receptive field.
The processor 2401 executing the at least one machine-executable instruction to apply multiple hybrid dilated convolutions to the extracted feature map and obtain a feature map with an enlarged receptive field comprises: on each of multiple dilated convolution layers, convolving the feature map with a kernel of size K*K and dilation rate ri, where 1 ≤ i ≤ n and n is the number of convolution layers.
In some embodiments, the processor 2401 executing the at least one machine-executable instruction further performs dividing the multiple dilated convolution layers into several groups, the dilation rate of the dilated convolution layers in each group increasing monotonically.
In some embodiments, the kernels of the dilated convolution layers have arbitrary dilation rates.
In some embodiments, the increment of the dilation rate differs from step to step.
In some embodiments, the receptive field of the dilated kernel of the last dilated convolution layer is smaller than or equal to the size of the feature map.
In some embodiments, the processor 2401 executing the at least one machine-executable instruction further performs: obtaining a semantic segmentation model in advance by training a neural network on ground-truth data, the semantic segmentation model containing the hybrid dilated convolution layers of the encoding stage.
The processor 2401 executing the at least one machine-executable instruction further performs obtaining the semantic segmentation model in advance by training a fully convolutional network end-to-end on ground-truth data.
According to the above device provided by the embodiments of the present application, hybrid dilated convolution effectively enlarges the receptive field of the feature map and identifies the shapes and contours of large objects, while dense upsampling convolution recovers the information of small objects; together they comprehensively, accurately, and efficiently identify and extract the object instances in the image data and their contours.
Obviously, those skilled in the art can make various changes and modifications to the present invention without departing from the spirit and scope of the invention. Thus, if these modifications and variations of the present invention fall within the scope of the claims of the present invention and their equivalents, the present invention is also intended to include them.

Claims (42)

1. A method for detecting a closed contour of an object, characterized by comprising:
performing, by an object closed contour detection apparatus in a decoding process of semantic segmentation processing, dense upsampling convolution processing on a feature map output by an encoding process, to obtain an output image of the same size as an input image, the output image containing closed contours of object instances;
identifying and extracting the closed contours of the object instances from the output image according to pixel classes.
2. The method according to claim 1, characterized in that performing dense upsampling convolution processing on the feature map output by the encoding process to obtain the output image of the same size as the input image comprises:
converting the number of channels (c) of the feature map into the product of the downsampling factor (d²) of the encoding process and the number of predetermined object categories (L);
combining the feature maps after the channel-number conversion, and normalizing the combined feature map to obtain the output image of the same size as the input image.
3. The method according to claim 2, characterized in that converting the number of channels (c) of the feature map into the product of the downsampling factor (d²) of the encoding process and the number of predetermined object categories (L) comprises:
learning the feature map on each channel according to the numerical proportion between the number of channels (c) of the feature map and the product (d²*L) of the downsampling factor and the number of predetermined object categories, to obtain the converted feature map with (d²*L) channels.
4. The method according to claim 2, characterized in that combining the feature maps after the channel-number conversion comprises:
combining the feature maps after the channel-number conversion according to the order in which the feature maps were acquired and the channel order.
5. The method according to claim 1, characterized in that the method further comprises:
in an encoding process of the semantic segmentation processing, performing hybrid dilated convolution processing on an extracted feature map a plurality of times to obtain a feature map with an enlarged receptive field.
6. The method according to claim 5, characterized in that performing hybrid dilated convolution processing on the extracted feature map a plurality of times to obtain the feature map with the enlarged receptive field comprises:
on each of a plurality of dilated convolutional layers, performing dilated convolution processing on the feature map using a convolution kernel of size K*K with a dilation rate r_i, where 1 ≤ i ≤ n and n is the number of convolutional layers.
7. The method according to claim 6, characterized in that the plurality of dilated convolutional layers are divided into several groups, and the dilation rate of the dilated convolutional layers within each group increases progressively.
8. The method according to claim 6, characterized in that each dilated convolutional layer has a convolution kernel with an arbitrary dilation rate.
9. The method according to claim 6, characterized in that the transformation factor by which the dilation rate is incremented is different each time.
10. The method according to claim 6, characterized in that the size of the receptive field of the dilation convolution kernel of the last dilated convolutional layer is less than or equal to the size of the feature map.
11. The method according to claim 5, characterized in that the method further comprises:
training a neural network in advance according to ground-truth data to obtain a semantic segmentation model, the semantic segmentation model including the hybrid dilated convolutional layers of the encoding stage.
12. The method according to claim 1, characterized in that the method further comprises:
training a neural network in advance according to ground-truth data to obtain a semantic segmentation model, the semantic segmentation model including the dense upsampling convolutional layer of the decoding stage.
13. The method according to claim 11 or 12, characterized in that a fully convolutional network is trained end to end in advance according to the ground-truth data to obtain the semantic segmentation model.
14. The method according to claim 1 or 5, characterized by further comprising:
determining the shape, size and/or area of an object instance according to the extracted closed contour of the object instance.
15. An apparatus for detecting a closed contour of an object, characterized by comprising:
a dense upsampling convolution module, configured to perform, in a decoding process of semantic segmentation processing, dense upsampling convolution processing on a feature map output by an encoding process, to obtain an output image of the same size as an input image, the output image containing closed contours of object instances;
a contour extraction module, configured to identify and extract the closed contours of the object instances from the output image according to pixel classes.
16. The apparatus according to claim 15, characterized in that the dense upsampling convolution module is specifically configured to:
convert the number of channels (c) of the feature map into the product of the downsampling factor (d²) of the encoding process and the number of predetermined object categories (L);
combine the feature maps after the channel-number conversion, and normalize the combined feature map to obtain the output image of the same size as the input image.
17. The apparatus according to claim 16, characterized in that the dense upsampling convolution module converting the number of channels (c) of the feature map into the product of the downsampling factor (d²) of the encoding process and the number of predetermined object categories (L) comprises:
learning the feature map on each channel according to the numerical proportion between the number of channels (c) of the feature map and the product (d²*L) of the downsampling factor and the number of predetermined object categories, to obtain the converted feature map with (d²*L) channels.
18. The apparatus according to claim 16, characterized in that the dense upsampling convolution module combining the feature maps after the channel-number conversion comprises:
combining the feature maps after the channel-number conversion according to the order in which the feature maps were acquired and the channel order.
19. The apparatus according to claim 15, characterized in that the apparatus further comprises:
a hybrid dilated convolution module, configured to perform, in an encoding process of the semantic segmentation processing, hybrid dilated convolution processing on an extracted feature map a plurality of times to obtain a feature map with an enlarged receptive field.
20. The apparatus according to claim 19, characterized in that the hybrid dilated convolution module is specifically configured to:
on each of a plurality of dilated convolutional layers, perform dilated convolution processing on the feature map using a convolution kernel of size K*K with a dilation rate r_i, where 1 ≤ i ≤ n and n is the number of convolutional layers.
21. The apparatus according to claim 20, characterized in that the hybrid dilated convolution module is further configured to divide the plurality of dilated convolutional layers into several groups, the dilation rate of the dilated convolutional layers within each group increasing progressively.
22. The apparatus according to claim 20, characterized in that each dilated convolutional layer has a convolution kernel with an arbitrary dilation rate.
23. The apparatus according to claim 20, characterized in that the transformation factor by which the dilation rate is incremented is different each time.
24. The apparatus according to claim 20, characterized in that the size of the receptive field of the dilation convolution kernel of the last dilated convolutional layer is less than or equal to the size of the feature map.
25. The apparatus according to claim 19, characterized in that the apparatus further comprises:
a first pre-training module, configured to train a neural network in advance according to ground-truth data to obtain a semantic segmentation model, the semantic segmentation model including the hybrid dilated convolutional layers of the encoding stage.
26. The apparatus according to claim 15, characterized in that the apparatus further comprises:
a second pre-training module, configured to train a neural network in advance according to ground-truth data to obtain a semantic segmentation model, the semantic segmentation model including the dense upsampling convolutional layer of the decoding stage.
27. The apparatus according to claim 25 or 26, characterized in that the first pre-training module trains a fully convolutional network end to end in advance according to the ground-truth data to obtain the semantic segmentation model;
and the second pre-training module trains a fully convolutional network end to end in advance according to the ground-truth data to obtain the semantic segmentation model.
28. The apparatus according to claim 15 or 19, characterized by further comprising:
a determining module, configured to determine the shape, size and/or area of an object instance according to the extracted closed contour of the object instance.
29. An apparatus for detecting a closed contour of an object, characterized by comprising a processor and at least one memory, the at least one memory storing at least one machine-executable instruction, and the processor executing the at least one machine-executable instruction to perform:
in a decoding process of semantic segmentation processing, performing dense upsampling convolution processing on a feature map output by an encoding process, to obtain an output image of the same size as an input image, the output image containing closed contours of object instances;
identifying and extracting the closed contours of the object instances from the output image according to pixel classes.
30. The apparatus according to claim 29, characterized in that the processor executes the at least one machine-executable instruction to perform the dense upsampling convolution processing on the feature map output by the encoding process to obtain the output image of the same size as the input image by:
converting the number of channels (c) of the feature map into the product of the downsampling factor (d²) of the encoding process and the number of predetermined object categories (L);
combining the feature maps after the channel-number conversion, and normalizing the combined feature map to obtain the output image of the same size as the input image.
31. The apparatus according to claim 30, characterized in that the processor executes the at least one machine-executable instruction to perform the converting of the number of channels (c) of the feature map into the product of the downsampling factor (d²) of the encoding process and the number of predetermined object categories (L) by:
learning the feature map on each channel according to the numerical proportion between the number of channels (c) of the feature map and the product (d²*L) of the downsampling factor and the number of predetermined object categories, to obtain the converted feature map with (d²*L) channels.
32. The apparatus according to claim 30, characterized in that the processor executes the at least one machine-executable instruction to perform the combining of the feature maps after the channel-number conversion by:
combining the feature maps after the channel-number conversion according to the order in which the feature maps were acquired and the channel order.
33. The apparatus according to claim 29, characterized in that the processor executes the at least one machine-executable instruction to further perform:
in an encoding process of the semantic segmentation processing, performing hybrid dilated convolution processing on an extracted feature map a plurality of times to obtain a feature map with an enlarged receptive field.
34. The apparatus according to claim 33, characterized in that the processor executes the at least one machine-executable instruction to perform the hybrid dilated convolution processing on the extracted feature map a plurality of times to obtain the feature map with the enlarged receptive field by:
on each of a plurality of dilated convolutional layers, performing dilated convolution processing on the feature map using a convolution kernel of size K*K with a dilation rate r_i, where 1 ≤ i ≤ n and n is the number of convolutional layers.
35. The apparatus according to claim 34, characterized in that the processor executes the at least one machine-executable instruction to further perform dividing the plurality of dilated convolutional layers into several groups, the dilation rate of the dilated convolutional layers within each group increasing progressively.
36. The apparatus according to claim 34, characterized in that each dilated convolutional layer has a convolution kernel with an arbitrary dilation rate.
37. The apparatus according to claim 34, characterized in that the transformation factor by which the dilation rate is incremented is different each time.
38. The apparatus according to claim 34, characterized in that the size of the receptive field of the dilation convolution kernel of the last dilated convolutional layer is less than or equal to the size of the feature map.
39. The apparatus according to claim 33, characterized in that the processor executes the at least one machine-executable instruction to further perform:
training a neural network in advance according to ground-truth data to obtain a semantic segmentation model, the semantic segmentation model including the hybrid dilated convolutional layers of the encoding stage.
40. The apparatus according to claim 29, characterized in that the processor executes the at least one machine-executable instruction to further perform:
training a neural network in advance according to ground-truth data to obtain a semantic segmentation model, the semantic segmentation model including the dense upsampling convolutional layer of the decoding stage.
41. The apparatus according to claim 39 or 40, characterized in that the processor executes the at least one machine-executable instruction to further perform training a fully convolutional network end to end in advance according to the ground-truth data to obtain the semantic segmentation model.
42. The apparatus according to claim 29 or 33, characterized in that the processor executes the at least one machine-executable instruction to further perform:
determining the shape, size and/or area of an object instance according to the extracted closed contour of the object instance.
CN201810722257.6A 2017-08-31 2018-06-29 A kind of detection method and device of object closed outline Pending CN109426825A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210425480.0A CN114782705A (en) 2017-08-31 2018-06-29 Method and device for detecting closed contour of object

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US US15/693,446 2017-08-31
US15/693,446 US10067509B1 (en) 2017-03-10 2017-08-31 System and method for occluding contour detection

Related Child Applications (1)

Application Number Title Priority Date Filing Date
CN202210425480.0A Division CN114782705A (en) 2017-08-31 2018-06-29 Method and device for detecting closed contour of object

Publications (1)

Publication Number Publication Date
CN109426825A true CN109426825A (en) 2019-03-05

Family

ID=65513699

Family Applications (2)

Application Number Title Priority Date Filing Date
CN201810722257.6A Pending CN109426825A (en) 2017-08-31 2018-06-29 A kind of detection method and device of object closed outline
CN202210425480.0A Pending CN114782705A (en) 2017-08-31 2018-06-29 Method and device for detecting closed contour of object

Family Applications After (1)

Application Number Title Priority Date Filing Date
CN202210425480.0A Pending CN114782705A (en) 2017-08-31 2018-06-29 Method and device for detecting closed contour of object

Country Status (1)

Country Link
CN (2) CN109426825A (en)

Patent Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2009013636A2 (en) * 2007-05-22 2009-01-29 The University Of Western Ontario A method for automatic boundary segmentation of object in 2d and/or 3d image
CN102043950A (en) * 2010-12-30 2011-05-04 南京信息工程大学 Vehicle outline recognition method based on canny operator and marginal point statistic
CN107038693A (en) * 2015-10-27 2017-08-11 富士通天株式会社 Image processing equipment and image processing method
CN107105130A (en) * 2016-02-19 2017-08-29 三星电子株式会社 Electronic equipment and its operating method
CN106780536A (en) * 2017-01-13 2017-05-31 深圳市唯特视科技有限公司 A kind of shape based on object mask network perceives example dividing method
CN107092870A (en) * 2017-04-05 2017-08-25 武汉大学 A kind of high resolution image semantics information extracting method and system
CN106886801A (en) * 2017-04-14 2017-06-23 北京图森未来科技有限公司 A kind of image, semantic dividing method and device

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
PANQU WANG et al.: "Understanding Convolution for Semantic Segmentation", 《COMPUTER VISION AND PATTERN RECOGNITION》 *
WENZHE SHI et al.: "Real-Time Single Image and Video Super-Resolution Using an Efficient Sub-Pixel Convolutional Neural Network", 《2016 IEEE CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION》 *
有些代码不应该被忘记 (blog): "Deep learning methods for semantic segmentation explained: from FCN and SegNet to the DeepLab variants" (语义分割中的深度学习方法全解：从FCN、SegNet到各版本DeepLab), 《HTTPS://BLOG.CSDN.NET/SCUTJY2015/ARTICLE/DETAILS/74971060》 *

Cited By (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110189264A (en) * 2019-05-05 2019-08-30 深圳市华星光电技术有限公司 Image processing method
CN110189264B (en) * 2019-05-05 2021-04-23 Tcl华星光电技术有限公司 Image processing method
WO2020228279A1 (en) * 2019-05-10 2020-11-19 平安科技(深圳)有限公司 Image palm region extraction method and apparatus
CN110544261A (en) * 2019-09-04 2019-12-06 东北大学 Blast furnace tuyere coal injection state detection method based on image processing
CN110544261B (en) * 2019-09-04 2023-08-29 东北大学 Method for detecting coal injection state of blast furnace tuyere based on image processing
CN113408531A (en) * 2021-07-19 2021-09-17 北博(厦门)智能科技有限公司 Target object shape framing method based on image recognition and terminal
CN113408531B (en) * 2021-07-19 2023-07-14 北博(厦门)智能科技有限公司 Target object shape frame selection method and terminal based on image recognition
CN113888567A (en) * 2021-10-21 2022-01-04 中国科学院上海微系统与信息技术研究所 Training method of image segmentation model, image segmentation method and device

Also Published As

Publication number Publication date
CN114782705A (en) 2022-07-22

Similar Documents

Publication Publication Date Title
CN109934163B (en) Aerial image vehicle detection method based on scene prior and feature re-fusion
CN109426825A (en) A kind of detection method and device of object closed outline
CN108416377B (en) Information extraction method and device in histogram
CN104978580B (en) A kind of insulator recognition methods for unmanned plane inspection transmission line of electricity
CN106056155B (en) Superpixel segmentation method based on boundary information fusion
CN110119780A (en) Based on the hyperspectral image super-resolution reconstruction method for generating confrontation network
CN109447994A (en) In conjunction with the remote sensing image segmentation method of complete residual error and Fusion Features
CN106599869A (en) Vehicle attribute identification method based on multi-task convolutional neural network
CN107644426A (en) Image, semantic dividing method based on pyramid pond encoding and decoding structure
CN106920243A (en) The ceramic material part method for sequence image segmentation of improved full convolutional neural networks
CN106097444A (en) High-precision map generates method and apparatus
CN107944443A (en) One kind carries out object consistency detection method based on end-to-end deep learning
CN106599805A (en) Supervised data driving-based monocular video depth estimating method
CN101299235A (en) Method for reconstructing human face super resolution based on core principle component analysis
CN113379771B (en) Hierarchical human body analysis semantic segmentation method with edge constraint
CN110335199A (en) A kind of image processing method, device, electronic equipment and storage medium
CN111144418B (en) Railway track area segmentation and extraction method
CN110070091A (en) The semantic segmentation method and system rebuild based on dynamic interpolation understood for streetscape
CN109002752A (en) A kind of complicated common scene rapid pedestrian detection method based on deep learning
CN103984963B (en) Method for classifying high-resolution remote sensing image scenes
CN107564009A (en) Outdoor scene Segmentation of Multi-target method based on depth convolutional neural networks
CN112288776B (en) Target tracking method based on multi-time step pyramid codec
CN110399760A (en) A kind of batch two dimensional code localization method, device, electronic equipment and storage medium
CN107045722A (en) Merge the video signal process method of static information and multidate information
CN110705366A (en) Real-time human head detection method based on stair scene

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication (Application publication date: 20190305)