CN109426825A - Method and device for detecting the closed contour of an object - Google Patents
Method and device for detecting the closed contour of an object
- Publication number
- CN109426825A (application CN201810722257.6A)
- Authority
- CN
- China
- Prior art keywords
- feature map
- convolution
- size
- semantic segmentation
- sampling
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
Abstract
The present invention discloses a method and device for detecting the closed contour of an object. The method comprises: during the decoding stage of semantic segmentation, an object closed-contour detection device applies dense upsampling convolution to the feature map output by the encoding stage, obtaining an output image of the same size as the input image, the output image containing the closed contours of object instances; and, according to pixel class, the closed contours of the object instances are identified and extracted from the output image. This method can recover information about fine structures and small objects in the image data, compensating for the loss of small-object information caused by downsampling in the encoding stage.
Description
Technical field
The present invention relates to the field of computer vision, and in particular to a method and device for detecting the closed contour of an object.
Background technique
The processing of image data plays an important role in fields such as autonomous driving. Semantic segmentation is a technique for recognizing objects from image data by assigning a class to every pixel. Object contour detection is a fundamental problem in many vision tasks, including image segmentation, object detection, instance-level semantic segmentation and closed-contour inference. For an autonomous driving system to operate correctly, it is essential to detect all objects in the traffic environment; these objects may be cars, buses, pedestrians and bicycles. A failure to detect an object (such as a car or a person) may cause the motion-planning system of an autonomous vehicle to fail, and thereby cause a series of accidents.
Semantic segmentation frameworks provide pixel-level class annotations, but they do not annotate individual object instances. Current object detection frameworks either cannot recover the shape of an object or cannot handle detection under occlusion. This is mainly due to the limitations introduced by bounding-box fusion in conventional frameworks. In particular, when nearby bounding boxes belonging to objects of different classes are fused together to reduce the false-positive rate, occluded objects become difficult to detect.
That is, the prior art cannot accurately and effectively detect the closed contours of objects in image data.
Summary of the invention
In view of this, embodiments of the present invention provide a method and device for detecting the closed contour of an object, to solve the problem that the prior art cannot accurately and effectively detect object contours in image data.
In one aspect, an embodiment of the present application provides a method for detecting the closed contour of an object, comprising:
during the decoding stage of semantic segmentation, applying, by an object closed-contour detection device, dense upsampling convolution to the feature map output by the encoding stage, to obtain an output image of the same size as the input image, the output image containing the closed contours of object instances;
according to pixel class, identifying and extracting the closed contours of the object instances from the output image.
In another aspect, an embodiment of the present application provides a device for detecting the closed contour of an object, comprising:
a dense upsampling convolution module, configured to apply dense upsampling convolution, during the decoding stage of semantic segmentation, to the feature map output by the encoding stage, to obtain an output image of the same size as the input image, the output image containing the closed contours of object instances;
a contour extraction module, configured to identify and extract, according to pixel class, the closed contours of the object instances from the output image.
In another aspect, an embodiment of the present application provides a device for detecting the closed contour of an object, comprising a processor and at least one memory, the at least one memory storing at least one machine-executable instruction, and the processor executing the at least one machine-executable instruction to:
during the decoding stage of semantic segmentation, apply dense upsampling convolution to the feature map output by the encoding stage, to obtain an output image of the same size as the input image, the output image containing the closed contours of object instances;
according to pixel class, identify and extract the closed contours of the object instances from the output image.
According to the technical solutions provided by the embodiments of the present application, a size-compressed feature map can be restored to image data of the same size as the input image data, and the number of channels of the feature map is converted according to the downsampling factor and the number of predetermined object classes, yielding a denser feature map. With this denser feature map, the classes of more pixels can be predicted, so that information about fine structures and small objects in the image data can be recovered, compensating for the loss of small-object information caused by downsampling in the encoding stage, information which cannot be recovered by bilinear interpolation. This solves the problem that the prior art cannot accurately and effectively detect object contours in image data.
Detailed description of the invention
Attached drawing is used to provide further understanding of the present invention, and constitutes part of specification, with reality of the invention
It applies example to be used to explain the present invention together, not be construed as limiting the invention.
Fig. 1 is a flow chart of the method for detecting the closed contour of an object provided by an embodiment of the present application;
Fig. 2 is a flow chart of step 101 in Fig. 1;
Fig. 3 is an example of image data obtained by processing an input image with the method shown in Fig. 1;
Fig. 4 is another flow chart of the method for detecting the closed contour of an object provided by an embodiment of the present application;
Fig. 5 is another flow chart of the method for detecting the closed contour of an object provided by an embodiment of the present application;
Fig. 6 is a schematic diagram of dilated convolution kernels;
Fig. 7 is a schematic diagram of the network architecture of a semantic segmentation model that implements the method shown in Fig. 5 in a specific application;
Fig. 8 is an example of image data processed with the method shown in Fig. 1;
Fig. 9 is an example of image data processed with the method shown in Fig. 5;
Fig. 10 is an original input image in a specific application scenario;
Fig. 11 is a feature map obtained after feature extraction from the input image shown in Fig. 10;
Fig. 12 is a schematic diagram of objects annotated with bounding boxes on the input image shown in Fig. 10 using a prior-art object detection technique;
Fig. 13a is an object instance contour map obtained by applying the object closed-contour detection method provided by an embodiment of the present application to the input image shown in Fig. 10;
Fig. 13b is a visualization obtained by overlaying Fig. 13a on Fig. 10;
Fig. 14 is an original input image in another specific application scenario;
Fig. 15 is an example of features obtained after feature extraction from the input image shown in Fig. 14;
Fig. 16 is a schematic diagram of objects annotated with bounding boxes on the input image shown in Fig. 14 using a prior-art object detection technique;
Fig. 17 is an object instance contour map extracted by applying the object closed-contour detection method provided by an embodiment of the present application to the input image shown in Fig. 14;
Fig. 18 is a visualization obtained by overlaying Fig. 17 on Fig. 14;
Fig. 19 is a structural block diagram of the device for detecting the closed contour of an object provided by an embodiment of the present application;
Fig. 20 is another structural block diagram of the device for detecting the closed contour of an object provided by an embodiment of the present application;
Fig. 21 is another structural block diagram of the device for detecting the closed contour of an object provided by an embodiment of the present application;
Fig. 22 is another structural block diagram of the device for detecting the closed contour of an object provided by an embodiment of the present application;
Fig. 23 is another structural block diagram of the device for detecting the closed contour of an object provided by an embodiment of the present application;
Fig. 24 is another structural block diagram of the device for detecting the closed contour of an object provided by an embodiment of the present application.
Specific embodiment
In order to enable those skilled in the art to better understand the technical solutions of the present invention, the technical solutions in the embodiments of the present invention will be described clearly and completely below with reference to the accompanying drawings. Obviously, the described embodiments are only some, not all, of the embodiments of the present invention. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments of the present invention without creative effort shall fall within the protection scope of the present invention.
In the current prior art, semantic segmentation processing generally comprises two parts: decoding of the feature representation and dilated convolution.
Pixel-level semantic segmentation information can be obtained by decoding the feature representation; the output feature map has the same size as the input image. Because of max pooling and strided convolution operations in convolutional neural networks, the feature maps of the last network layers are inevitably reduced in size, and various schemes exist for decoding accurate information from these low-resolution feature maps. The commonly used bilinear interpolation saves memory and is fast. Deconvolution methods use the pooling locations recorded during pooling to recover the information needed for image reconstruction and feature visualization. In some examples, a separate deconvolution layer is added in the decoding stage, and predictions are generated from stacked intermediate feature maps. In other examples, multiple deconvolution layers are used to generate target objects, such as chairs, tables or cars, from multiple features. Some studies use the pooling positions stored in the unpooling step, treating the deconvolution layers as a mirror image of the convolution layers. Other studies show that, as information propagates through the deconvolution layers, coarse-to-fine detection of object structures can be achieved; these structures are crucial for reconstructing fine details. Still other studies use a similar mirror structure and combine the information of the deconvolution layers to perform upsampling and produce the final prediction. Some systems use pixel-level classifiers to predict label images with high statistical efficiency. Among these approaches, bilinear interpolation is the most widely used. However, bilinear interpolation obtains an output of the same resolution as the input by filling in zeros; it easily loses detail, loses small-object information, degrades the precision of the image data, and has no learning ability.
Dilated convolution (also called atrous convolution) was originally developed for wavelet decomposition. Its core idea is to insert zeros between the pixels of the convolution kernel to enlarge the receptive field on the image, enabling dense feature extraction in deep neural networks. In semantic segmentation frameworks, dilated convolution is also used to enlarge the size of the convolution kernel. Some studies use serialized layers with increasing dilation rates to achieve context aggregation, and design an Atrous Spatial Pyramid Pooling (ASPP) module that captures multi-scale objects and context information by arranging multiple dilated convolution layers in parallel. Recently, dilated convolution has been applied to a wide range of tasks, such as object detection, optical flow, visual question answering and audio generation. However, these convolutional frameworks suffer from a "gridding effect" caused by standard dilated convolution, which makes it impossible to recognize the shape or contour of large objects.
The above problems present in semantic segmentation technology, namely the loss of small-object information and the inability to recognize the shape or contour of large objects, make it impossible to effectively and accurately extract the closed contours of objects.
In view of the above problems in the prior art, embodiments of the present application provide a method and device for detecting the closed contour of an object. In the technical solutions provided by some embodiments of the present application, dense upsampling convolution is applied in the decoding stage to the feature map output by the encoding stage, which increases the resolution of the predicted image and recovers more detail, so that more small-object information is retained and the contours of small objects can be detected. In the technical solutions provided by other embodiments, multiple hybrid dilated convolutions are applied in the encoding stage to the extracted feature map, which retains more of the local and long-range information of the convolutions; this overcomes the gridding effect and yields continuous contour information for large objects, so that the contours of large objects can be detected. Thus the technical solutions provided by the embodiments of the present application can accurately and effectively detect object contours, solving the above problems in the prior art.
On the other hand, with the methods provided by the embodiments of the present application, the closed-contour information of an object can be extracted directly, without annotating the object with a bounding box. In the prior art, when objects are annotated with bounding boxes, the actual shape of an object cannot be recognized, and information such as its size and area cannot be accurately inferred; in technical fields such as autonomous driving, such information is key to many decisions. Moreover, some objects are discarded during bounding-box fusion, causing object information to be lost. With the methods provided by the embodiments of the present application, the closed contour of an object can be extracted directly, so that information such as the true shape, size and area of the object can be further identified, providing accurate and effective information for other inference or decision-making.
The above is the core idea of the present invention. In order to enable those skilled in the art to better understand the technical solutions in the embodiments of the present invention, and to make the above objects, features and advantages of the embodiments more apparent and comprehensible, the technical solutions in the embodiments of the present invention are described in further detail below with reference to the accompanying drawings.
Fig. 1 shows a flow chart of the method for detecting the closed contour of an object provided by an embodiment of the present application, comprising:
Step 101: during the decoding stage of semantic segmentation, the object closed-contour detection device applies dense upsampling convolution to the feature map output by the encoding stage, obtaining an output image of the same size as the input image, the output image containing the contour lines of object instances;
Step 102: according to pixel class, identifying and extracting the contour lines of the object instances from the output image.
In step 101, applying dense upsampling convolution to the feature map, as shown in Fig. 2, comprises:
Step 1011: converting the number of channels c of the feature map to the product of the square of the downsampling factor d used in the encoding stage, d², and the number L of predetermined object classes;
For example, let the input image size of the model be (H, W, C), where H is the height of the image data, W is its width, and C is its number of channels. The feature map passed from the encoding stage to the decoding stage has size F_out = (h, w, c), where h = H/d, w = W/d, and d is the downsampling factor. In the prior art the feature map is upsampled by bilinear interpolation. If d = 16, i.e. the output is downsampled 16× relative to the input, then any object whose length or width is smaller than 16 pixels, such as a utility pole, a traffic light, a traffic sign or a person in the distance, will not be sampled, and bilinear upsampling cannot recover this information, so the object is lost in the output image.
In step 1011 provided by the embodiments of the present application, the dense upsampling convolution converts the number of channels c in the size of the feature map F_out, obtaining d²·L channels, where d is the downsampling factor and L is the number of predetermined object classes, yielding a feature map F_out = (h, w, d²·L).
Specifically, the feature map on each channel can be learned according to the ratio d²·L/c between the number of channels after conversion, d²·L, and the number of channels c of the original feature map, producing after learning a feature map of size h×w on each of the d²·L channels, so that each dense upsampling convolution layer learns a prediction for each pixel. Here, the learning of the feature map on each channel is realized by a learned function obtained in advance by training a neural network. For example, when the number of channels c of the original feature map equals the number L of predetermined object classes, the feature map on each channel can be learned d² times, producing after learning a feature map of size h×w on each of the d²·L channels.
Step 1012: combining the feature maps after the channel-number conversion, and normalizing the combined feature map, to obtain an output image of the same size as the input image.
That is, the feature map F_out = (h, w, d²·L) after channel conversion is combined into a feature map of size (h·d, w·d, L); since h = H/d and w = W/d as above, the size of the combined feature map is (H, W, L), i.e. the feature map is upsampled to the same size as the input image.
Here, the combination of the feature maps after channel conversion may be performed according to the order in which the features were extracted and the channel order. For example, if, among the d²·L channels, the feature maps on channels n to m were extracted from the x-th row of the input image data, then when combining, the feature maps on channels n to m are combined, in pixel order, into the x-th row of the output data, and so on for the subsequent feature maps. The combination of the feature maps can be implemented according to the actual algorithm of the specific application scenario, which is not specifically limited here.
Compared with bilinear interpolation in the prior art, which has no learning ability and involves no deconvolution, the dense upsampling provided by the embodiments of the present application is learnable. Before the processing shown in Fig. 1, a neural network can be trained on real data to obtain a semantic segmentation model that includes the dense upsampling convolution layers of the decoding stage. The dense upsampling convolution layer may comprise multiple convolution layers. Specifically, a series of upsampling filters can be learned by training; these filters upsample the feature map of size F_out = (h, w, c) into image data of size F_out = (h, w, d²·L).
The upsampled output image data is then normalized by a softmax layer to obtain the final output image.
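The per-pixel softmax normalization mentioned above can be illustrated with a few lines of Python (a minimal sketch; the function name softmax and the example scores are assumptions, not taken from the patent):

```python
import math

def softmax(scores):
    """Normalize a list of class scores into probabilities summing to 1."""
    m = max(scores)                      # subtract max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

# One pixel with L = 3 class scores: the class with the highest score
# receives the highest probability, and argmax gives the predicted class
# used when extracting contour pixels by class.
probs = softmax([2.0, 1.0, 0.1])
print(probs.index(max(probs)))  # -> 0
```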
Fig. 3 shows an example of processing an input image with the method shown in Fig. 1. The left side of Fig. 3 is the input image, and the other parts, from left to right, are output images under different downsampling factors; it can be seen that some small objects in the input image, such as utility poles and signal lights, are well recognized.
In a specific implementation, the above semantic segmentation model can be obtained by training in a fully convolutional network.
Through the dense upsampling convolution described above, a size-compressed feature map can be restored to image data of the same size as the input image data, and the number of channels of the feature map is converted according to the downsampling factor and the number of predetermined object classes, yielding a denser feature map. With this denser feature map, the classes of more pixels can be predicted, so that information about fine structures and small objects in the image data can be recovered, compensating for the loss of small-object information caused by downsampling in the encoding stage, information which cannot be recovered by bilinear interpolation. Thus the method provided by the embodiments of the present application can extract from the output image the instance information and the closed-contour information of small objects.
Moreover, dense upsampling convolution maps the feature map directly to the output label image, without first applying bilinear interpolation to the feature map and then upsampling the interpolated image to obtain the output label image, as in the prior art. Since dense upsampling convolution operates directly on the feature map at its original resolution, it achieves pixel-level decoding.
Further, as shown in Fig. 4, on the basis of the processing shown in Fig. 1, the method provided by the embodiments of the present application also comprises:
Step 103: determining, according to the extracted closed contour of an object instance, the shape, size and/or area of the object instance.
With the method provided by the embodiments of the present application, the closed-contour information of an object can be extracted directly, without annotating the object with a bounding box. In the prior art, when objects are annotated with bounding boxes, the true shape of an object cannot be recognized, and information such as its size and area cannot be accurately inferred; in technical fields such as autonomous driving, such information is key to many decisions. With the method provided by the embodiments of the present application, the closed contour of an object can be extracted directly, so that information such as the true shape, size and area of the object can be further identified, providing accurate and effective information for other inference or decision-making.
Based on the same inventive concept, on the basis of the method shown in Fig. 1, an embodiment of the present application further provides a method for detecting the closed contour of an object.
Fig. 5 shows the method for detecting the closed contour of an object provided by an embodiment of the present application, comprising:
Step 100: during the encoding stage of semantic segmentation, the object closed-contour detection device applies multiple hybrid dilated convolutions to the extracted feature map, obtaining a feature map with an enlarged receptive field;
Step 101: during the decoding stage, applying dense upsampling convolution to the feature map output by the encoding stage, to obtain an output image of the same size as the input image, the output image containing the closed contours of object instances;
Step 102: according to pixel class, identifying and extracting the closed contours of the object instances from the output image.
The processing of step 100 above means convolving the feature map on multiple convolution layers with a series of dilation rates.
In the prior art, dilated convolution usually convolves the feature map with a dilated convolution kernel to enlarge the receptive field of the feature map; the dilated kernel is constructed by inserting zeros between the pixels of the kernel. For a two-dimensional signal with a kernel of size K×K, the dilated kernel has size K_d×K_d, where K_d = K + (K-1)(r-1) and r is the dilation rate. Dilated convolution enlarges the receptive field (or field of view) of the feature map and can replace the pooling layers in a fully convolutional network architecture. For example, a convolution layer in ResNet-101 has stride s = 2; the stride can be reset to 1 to remove the downsampling, and the dilation rate of the subsequent network layers set to 2. Applying this treatment in turn to all network layers that perform downsampling enlarges the receptive field of the output feature map. In practical applications, dilated convolution is usually applied to downsampled feature maps, to achieve a reasonable trade-off between efficiency and cost. However, dilated convolution causes a gridding effect.
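The effective kernel size K_d = K + (K-1)(r-1) stated above can be checked with a few lines (a small illustration; the function name is an assumption):

```python
def dilated_kernel_size(k, r):
    """Effective size of a k x k kernel dilated with rate r:
    r - 1 zeros are inserted between adjacent kernel pixels."""
    return k + (k - 1) * (r - 1)

# A 3x3 kernel at rates 1, 2 and 4: rate 1 is ordinary convolution,
# while higher rates enlarge the receptive field without adding weights.
for r in (1, 2, 4):
    print(r, dilated_kernel_size(3, r))  # -> 1 3 / 2 5 / 4 9
```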
In the embodiments of the present application, step 100 above can be realized as: on each of multiple dilated convolution layers, convolving the feature map with a kernel of size K×K and dilation rate r_i, where 1 ≤ i ≤ n and n is the number of convolution layers. This processing can be implemented in at least one of the following four ways.
Way one: the multiple dilated convolution layers are divided into several groups, and the dilation rates of the layers within each group increase monotonically.
For example, when there are N dilated convolution layers, they can be divided into s groups, each group containing at least two convolution layers; the kernel size used in each group is K×K, and within each group the dilation rate increases monotonically, i.e. r_{si-2} < r_{si-1} < r_{si}. In this way the dilation rate within each group keeps increasing and, across the groups from the first to the last dilated convolution layer, the dilation rate varies like a sawtooth wave: kernels with smaller dilation rates extract local information, and kernels with larger dilation rates extract long-range information.
Way two: the kernels of the dilated convolution layers have arbitrary dilation rates.
Setting arbitrary dilation rates for the kernels enlarges their receptive fields, so that larger objects can be recognized.
Way three: on the basis of way one or way two, the increment by which the dilation rate increases is different each time.
For example, with dilation rates r = (1, 2, 5), the increments are 1 and 3, i.e. the increment differs at each step.
Setting multiple dilation rates with different increments allows one group of dilated kernels to cover more pixels. By contrast, with a constant increment, e.g. r = (2, 4, 6, 8) where the dilation rate always increases by 2, the gridding effect is suppressed less effectively.
Way four: on the basis of any of the above three ways, the receptive field of the dilated kernel of the last dilated convolution layer is smaller than or equal to the size of the feature map.
That is, the dilation rates are preset so that the size of the receptive field of the dilated kernel of the last dilated convolution layer is smaller than or equal to the size of the feature map; this enlarges the receptive field of the last convolution layer. In particular, when the size of the receptive field of the dilated kernel equals the size of the feature map, the receptive field covers the whole feature map, so that no holes or edges are lost and the consistency and completeness of long-range information are guaranteed.
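The effect of the sawtooth rate schedule of way one, e.g. r = (1, 2, 5), versus a stack of layers sharing one rate can be illustrated by counting which offsets a stack of dilated 3-tap convolutions can reach in one dimension (a simplified 1-D sketch under the stated assumptions; the function name is illustrative):

```python
def reachable_offsets(rates, k=3):
    """1-D input offsets reachable from one output pixel through a stack
    of dilated convolutions with kernel size k and the given rates."""
    offsets = {0}
    for r in rates:
        taps = [(i - (k - 1) // 2) * r for i in range(k)]  # e.g. (-r, 0, r)
        offsets = {o + t for o in offsets for t in taps}
    return offsets

# With rates (1, 2, 5) every offset in [-8, 8] is reachable (no holes),
# while three layers all at rate 2 reach only even offsets,
# leaving the grid of gaps described above.
print(sorted(reachable_offsets((1, 2, 5))) == list(range(-8, 9)))  # -> True
print(all(o % 2 == 0 for o in reachable_offsets((2, 2, 2))))       # -> True
```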
The following compares way one above with the prior art.
Fig. 6 shows examples of dilated convolution kernels. In Fig. 6, the grey pixels around the central black pixel are the pixels that contribute to the computation of the central pixel. Fig. 6a is a schematic diagram of dilated convolution kernels in the prior art. Fig. 6b is a schematic diagram of the kernels of the hybrid dilated convolution provided by the embodiments of the present application.
In Fig. 6(a) the kernel size is 3*3 and the dilation rate is r=2 throughout. For a pixel p in a dilated convolution layer, the pixels contributing to it lie in a K_d*K_d region centered on p in the layer above, where K_d = K + (K-1)(r-1). Because dilated convolution inserts zero values, only K*K pixels in that K_d*K_d region are actually used, with a gap of r-1 between adjacent non-zero pixels. For example, in a dilated convolution with k=3 and r=2, as shown on the left of Fig. 6a, only 9 of the 25 pixels contribute. Since all layers share the same dilation rate r, for a point p in the topmost dilated convolution layer, the maximum possible number of pixels contributing to the computation of p is (w'*h')/r², where w' and h' are respectively the width and height of the feature map of the bottom dilated convolution layer. Thus p can only view its information in the top-layer feature map in a checkerboard pattern, which loses a large amount of information (when r=2, about 75% of the information is lost). As r grows in the higher convolution layers, the data sampled from the input become increasingly sparse, which harms convolution learning because: 1) local information is completely lost; and 2) the information is too far apart to be correlated. Moreover, because each result is collected from an entirely different "grid" of r*r regions, the consistency of local information is damaged.
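The counts quoted above (9 of 25 contributing pixels for k=3, r=2, and about 75% loss at the top layer) follow from small formulas, which the sketch below reproduces:

```python
def dilated_kernel_stats(k, r):
    """For one 2-D dilated kernel: the footprint K_d = k + (k-1)(r-1)
    gives a K_d*K_d region, of which only k*k taps are non-zero."""
    side = k + (k - 1) * (r - 1)
    return side * side, k * k

region, taps = dilated_kernel_stats(3, 2)
assert (region, taps) == (25, 9)   # only 9 of 25 pixels contribute

r = 2
assert 1 - 1 / r**2 == 0.75        # checkerboard view: ~75% of the
                                   # top-layer information is lost
```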
In the kernels shown in Fig. 6(b), mode one above is used: the dilated convolution layers are divided into groups, and the dilation rate within each group increases steadily, e.g. K=3 and r=(1,2,3), so that the dilation rate varies like a sawtooth wave. In this way the bottom layers on the left capture local information, while the top layers on the right capture information from a wider region. The combination of different dilation rates serves the segmentation of both small and large objects: smaller dilation rates extract local information, and larger dilation rates extract long-range information.
In the above embodiments of hybrid dilated convolution, arranging a series of dilation rates lets the dilated convolution kernels cover as many pixels as possible during convolution, so that both local and long-range information are extracted. Moreover, the larger the region covered by the receptive field of the dilated kernels, the fewer holes and edge details are lost; the consistency and completeness of long-range information are guaranteed and the gridding effect is effectively overcome, so that the complete, closed shapes and contours of large objects can be obtained.
On the other hand, since hybrid dilated convolution is learnable, before the processing shown in Fig. 5 a semantic segmentation model may be obtained in advance by training a neural network on ground-truth data; the semantic segmentation model includes the hybrid dilated convolution layers of the encoding stage.
In a specific implementation, the above semantic segmentation model may be obtained by training a fully convolutional network.
Fig. 7 shows, for a particular application, the network architecture of a semantic segmentation model trained on a ResNet-101 backbone to implement the method shown in Fig. 5. In the encoding stage of Fig. 7, multiple hybrid dilated convolution layers apply hybrid dilated convolution to the extracted feature maps; in the decoding stage, multiple dense upsampling convolution layers process the feature maps output by the encoding stage to produce the annotated output image.
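As a rough illustration of the tensor sizes flowing through such an encoder-decoder, the sketch below uses hypothetical numbers (a Cityscapes-like input, downsampling factor d=8, L=19 classes, 2048 encoder channels as in a ResNet-101 final stage; the patent does not fix these values). The encoder downsamples by d, HDC preserves that size, and DUC converts the channels to d²·L before expanding back to full resolution:

```python
# hypothetical sizes, chosen only to make the bookkeeping concrete
H, W, d, L = 512, 1024, 8, 19

encoded = (H // d, W // d, 2048)        # encoder output (HDC keeps this size)
duc_in  = (H // d, W // d, d * d * L)   # channels converted to d^2 * L
decoded = (H, W, L)                     # DUC output: one score map per class

# DUC is a pure rearrangement: element counts before and after must match
assert duc_in[0] * duc_in[1] * duc_in[2] == decoded[0] * decoded[1] * decoded[2]
```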
The method shown in Fig. 5 combines hybrid dilated convolution in the encoding process with dense upsampling convolution in the decoding process. Hybrid dilated convolution effectively enlarges the receptive field of the feature maps and recognizes the shapes and contours of large objects, while dense upsampling convolution recovers the information of small objects. Their combination makes it possible to identify and extract object instances and their contours from image data comprehensively, accurately and efficiently.
Further, similarly to Fig. 4, the method shown in Fig. 5 may also include step 103, which is not repeated here.
Fig. 8 and Fig. 9 compare the output images of the methods shown in Fig. 1 and Fig. 5. In Fig. 8, from left to right, are the input image, the ground truth, the output image of the method of Fig. 1, and the output image of the method of Fig. 5. It can be seen in Fig. 8 that the output of the method of Fig. 5 is closer to the ground truth than that of the method of Fig. 1 in the recognition of small objects. In Fig. 9, the first row is the ground truth, the second row is the output of the method of Fig. 1, and the third row is the output of the method of Fig. 5. It can be seen in Fig. 9 that, in the recognition of large-object contours, the output of the method of Fig. 5 overcomes the gridding effect more effectively than that of the method of Fig. 1 and is closer to the ground truth.
Another group of image data, Figs. 10 to 13b, also illustrates the application of the method provided by the embodiments of the present application. Fig. 10 is the original input image; Fig. 11 is the feature map obtained after feature extraction from the input image of Fig. 10; Fig. 12 is a schematic diagram of a prior-art object detection technique marking objects with bounding boxes; and Fig. 13a is the object instance contour map obtained after applying the object closed-contour detection method provided by the embodiments of the present application.
In Fig. 11, objects of multiple classes are marked in different colors. However, the information at the level of individual object instances is lost: for example, all the automobiles are marked in the same color, namely blue, and labelled with the "car" class. Yet identifying all object instances in a traffic environment, that is, each car, bus, pedestrian and bicycle, is crucial for a safe and effective autonomous driving system. A failure to detect a single object instance may cause the motion planning module of an autonomous vehicle to malfunction or misclassify, leading to a series of accidents. A semantic segmentation framework provides pixel-level object annotation, but semantic segmentation alone cannot identify individual instance-level objects.
Fig. 12 is a schematic diagram of a traditional object detection framework marking objects with bounding boxes. Although a traditional object detection framework can mark objects with bounding boxes, it cannot recover the shape of an object or handle the problem of detecting its closed contour. In particular, because of the limitations of bounding-box fusion in traditional object detection frameworks, bounding boxes that are close together but mark different object instances may be fused in order to reduce the false-positive rate, so the closed contour of an object or object instance cannot be detected, especially when the occluded object is large. As shown in Fig. 12, a traditional object detection framework uses rectangular bounding boxes to recover the shapes or contours of different objects or object instances; as a result, when the bounding boxes of an object and its neighbours are fused, an occluded object or occluded object instance may be lost during detection.
Fig. 13a shows the output image of the object closed-contour detection method provided by the embodiments of the present application. The method is based on an assumption: objects of a particular class share a similar global shape, so the detected contours and boundary lines of objects of the same class have a consistent structural form. As shown in Fig. 13a, the closed boundary lines of the vehicles parked along the roadside have similar widths and directions. If an algorithmic model can learn this structural information, object contours and closed boundary lines can be recovered and occluded objects can be detected. In the embodiments of the present application, the object contour detection task can be treated as a semantic segmentation task in which both the original input image and the output annotated image are image data, so that object contour detection can be implemented on a pixel-level semantic segmentation framework. In particular, the embodiments of the present application propose the method shown in Fig. 1. The dense upsampling convolution of Fig. 1 is well suited to object contour detection because: 1) dense upsampling is suited to recovering the shape of an object; 2) dense upsampling achieves higher accuracy than decoding methods such as bilinear upsampling, which easily lose objects narrower than 8 pixels; and 3) a recovered object contour must not be too thick, otherwise the object may be blurred. Dense upsampling can decode a contour of any width, whereas other methods such as bilinear upsampling can only recover contours at least 8 pixels wide. As shown in Fig. 13a, the method provided by the embodiments of the present application accurately detects instance-level object segmentation from the input image. Fig. 13b shows the visualization effect of superimposing the extracted closed contours of the objects on the input image.
Another group of image data, Figs. 14 to 18, also illustrates the application of the method provided by the embodiments of the present application. Fig. 14 is the original input image; Fig. 15 is the feature map obtained after feature extraction from the input image of Fig. 14; Fig. 16 is a schematic diagram of a prior-art object detection technique marking objects with bounding boxes; Fig. 17 is the object instance contour map extracted after applying the object closed-contour detection method provided by the embodiments of the present application; and Fig. 18 is a schematic diagram of the visualization effect of superimposing the object contour information of Fig. 17 on the input image of Fig. 14.
As can be seen from Fig. 17, the method provided by the embodiments of the present application accurately detects the shape of each individual object instance, and adjacent occluded objects are not lost. Once the shape and contour of each individual object have been detected, the object contours can be superimposed on the input image shown in Fig. 14 to form a visual representation and to provide accurate and effective object information to the control system of an autonomous vehicle.
Based on the same inventive concept, the embodiments of the present application also provide a detection device for object closed contours. Fig. 19 shows the structural block diagram of the object closed-contour detection device provided by the embodiments of the present application, comprising:
a dense upsampling convolution module 91, configured to, in the decoding process of semantic segmentation, apply dense upsampling convolution to the feature map output by the encoding process to obtain an output image of the same size as the input image, the output image including the closed contours of object instances; and
a contour extraction module 92, configured to identify and extract the closed contour lines of object instances from the output image according to pixel class.
The dense upsampling convolution module 91 is specifically configured to: convert the channel number (c) of the feature map into the product of the downsampling factor (d²) of the encoding process and the number (L) of predetermined object classes; combine the feature maps after the channel-number conversion; and normalize the combined feature map to obtain an output image of the same size as the input image.
The dense upsampling convolution module 91 converts the channel number (c) of the feature map into the product of the downsampling factor (d²) of the encoding process and the number (L) of predetermined object classes by: learning the feature map on each channel according to the numerical ratio between the channel number (c) of the feature map and the product (d²*L) of the downsampling factor and the number of predetermined object classes, to obtain a feature map with channel number (d²*L).
The dense upsampling convolution module 91 combines the feature maps after the channel-number conversion by: combining the converted feature maps according to the feature-map order and the channel order obtained during feature acquisition.
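The channel conversion and recombination performed by module 91 can be pictured as a "pixel shuffle": each low-resolution pixel carries d*d*L channels, which are laid out as a d x d block of L-channel output pixels. The following pure-Python sketch uses toy sizes and one illustrative channel layout (the actual ordering in a trained model is an implementation detail, assumed here for demonstration):

```python
def duc_decode(feat, d, L):
    """Rearrange an h x w x (d*d*L) feature map into an (h*d) x (w*d) x L
    output: channel (di*d + dj)*L + l of low-res pixel (i, j) becomes
    channel l of output pixel (i*d + di, j*d + dj)."""
    h, w = len(feat), len(feat[0])
    out = [[[0] * L for _ in range(w * d)] for _ in range(h * d)]
    for i in range(h):
        for j in range(w):
            for di in range(d):
                for dj in range(d):
                    for l in range(L):
                        out[i * d + di][j * d + dj][l] = \
                            feat[i][j][(di * d + dj) * L + l]
    return out

# toy example: h = w = 2, d = 2, L = 3 -> 12 channels per low-res pixel
h, w, d, L = 2, 2, 2, 3
feat = [[list(range(p * d * d * L, (p + 1) * d * d * L))
         for p in range(r * w, r * w + w)] for r in range(h)]
out = duc_decode(feat, d, L)
assert len(out) == h * d and len(out[0]) == w * d and len(out[0][0]) == L
assert out[0][0] == [0, 1, 2]      # (di, dj) = (0, 0) of low-res pixel (0, 0)
assert out[0][1] == [3, 4, 5]      # (di, dj) = (0, 1)
assert out[1][0] == [6, 7, 8]      # (di, dj) = (1, 0)
```

Because the rearrangement is lossless, no pixel of the output is interpolated, which is why DUC can recover contours of any width.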
On the basis of the device shown in Fig. 19, as shown in Fig. 20, the device provided by the embodiments of the present application may further include:
a determining module 93, configured to determine the shape, size and/or area of an object instance according to its extracted closed contour.
Based on the same inventive concept, on the basis of the device shown in Fig. 19, as shown in Fig. 21, the device provided by the embodiments of the present application may further include:
a hybrid dilated convolution module 90, configured to, in the encoding process of semantic segmentation, apply hybrid dilated convolution to the extracted feature map multiple times to obtain a feature map with an enlarged receptive field.
The hybrid dilated convolution module 90 is specifically configured to: on each of multiple dilated convolution layers, apply dilated convolution to the feature map using a kernel of size K*K with dilation rate r_i, where 1≤i≤n and n is the number of convolution layers.
In some embodiments, the hybrid dilated convolution module 90 is further configured to divide the multiple dilated convolution layers into several groups, the dilation rates of the dilated convolution layers within each group increasing steadily.
In some embodiments, each dilated convolution layer has a kernel with an arbitrary dilation rate.
In some embodiments, the increment of the dilation rate differs each time.
In some embodiments, the size of the receptive field of the dilated convolution kernel in the last dilated convolution layer is less than or equal to the size of the feature map.
As shown in Fig. 22, on the basis of the device shown in Fig. 21, the device provided by the embodiments of the present application may further include:
a first pre-training module 94, configured to train a neural network in advance on ground-truth data to obtain a semantic segmentation model, the semantic segmentation model including the hybrid dilated convolution layers of the encoding stage.
In some embodiments, the first pre-training module 94 trains a fully convolutional network end-to-end in advance on ground-truth data to obtain the semantic segmentation model.
As shown in Fig. 23, on the basis of the device shown in Fig. 19, the device provided by the embodiments of the present application may further include:
a second pre-training module 95, configured to train a neural network in advance on ground-truth data to obtain a semantic segmentation model, the semantic segmentation model including the dense upsampling convolution layers of the decoding stage.
In some embodiments, the second pre-training module 95 trains a fully convolutional network end-to-end in advance on ground-truth data to obtain the semantic segmentation model.
According to the above device provided by the embodiments of the present application, the hybrid dilated convolution module effectively enlarges the receptive field of the feature maps and recognizes the shapes and contours of large objects, while the dense upsampling convolution module recovers the information of small objects; their combination makes it possible to identify and extract object instances and their contours from image data comprehensively, accurately and efficiently.
Based on the same inventive concept, the embodiments of the present application also provide a detection device for object closed contours. Fig. 24 shows the object closed-contour detection device provided by the embodiments of the present application, including a processor 2401 and at least one memory 2402; at least one machine-executable instruction is stored in the at least one memory 2402, and the processor 2401 executes the at least one machine-executable instruction to:
in the decoding process of semantic segmentation, apply dense upsampling convolution to the feature map output by the encoding process to obtain an output image of the same size as the input image, the output image including the closed contours of object instances; and
identify and extract the closed contours of object instances from the output image according to pixel class.
The processor 2401 executes the at least one machine-executable instruction to apply dense upsampling convolution to the feature map output by the encoding process and obtain an output image of the same size as the input image by: converting the channel number (c) of the feature map into the product of the downsampling factor (d²) of the encoding process and the number (L) of predetermined object classes; combining the feature maps after the channel-number conversion; and normalizing the combined feature map to obtain an output image of the same size as the input image.
The processor 2401 executes the at least one machine-executable instruction to convert the channel number (c) of the feature map into the product of the downsampling factor (d²) of the encoding process and the number (L) of predetermined object classes by: learning the feature map on each channel according to the numerical ratio between the channel number (c) of the feature map and the product (d²*L) of the downsampling factor and the number of predetermined object classes, to obtain a feature map with channel number (d²*L).
The processor 2401 executes the at least one machine-executable instruction to combine the feature maps after the channel-number conversion by: combining the converted feature maps according to the feature-map order and the channel order obtained during feature acquisition.
In some embodiments, the processor 2401 executes the at least one machine-executable instruction to further: train a neural network in advance on ground-truth data to obtain a semantic segmentation model, the semantic segmentation model including the dense upsampling convolution layers of the decoding stage.
The processor 2401 executes the at least one machine-executable instruction to further train a fully convolutional network end-to-end in advance on ground-truth data to obtain the semantic segmentation model.
In other embodiments, the processor 2401 executes the at least one machine-executable instruction to further: in the encoding process of semantic segmentation, apply hybrid dilated convolution to the extracted feature map multiple times to obtain a feature map with an enlarged receptive field.
The processor 2401 executes the at least one machine-executable instruction to apply hybrid dilated convolution to the extracted feature map multiple times and obtain a feature map with an enlarged receptive field by: on each of multiple dilated convolution layers, applying dilated convolution to the feature map using a kernel of size K*K with dilation rate r_i, where 1≤i≤n and n is the number of convolution layers.
In some embodiments, the processor 2401 executes the at least one machine-executable instruction to further divide the multiple dilated convolution layers into several groups, the dilation rates of the dilated convolution layers within each group increasing steadily.
In some embodiments, each dilated convolution layer has a kernel with an arbitrary dilation rate.
In some embodiments, the increment of the dilation rate differs each time.
In some embodiments, the size of the receptive field of the dilated convolution kernel in the last dilated convolution layer is less than or equal to the size of the feature map.
In some embodiments, the processor 2401 executes the at least one machine-executable instruction to further: train a neural network in advance on ground-truth data to obtain a semantic segmentation model, the semantic segmentation model including the hybrid dilated convolution layers of the encoding stage.
The processor 2401 executes the at least one machine-executable instruction to further train a fully convolutional network end-to-end in advance on ground-truth data to obtain the semantic segmentation model.
According to the above device provided by the embodiments of the present application, hybrid dilated convolution effectively enlarges the receptive field of the feature maps and recognizes the shapes and contours of large objects, while dense upsampling convolution recovers the information of small objects; their combination makes it possible to identify and extract object instances and their contours from image data comprehensively, accurately and efficiently.
Obviously, those skilled in the art can make various changes and modifications to the present invention without departing from its spirit and scope. Thus, if these modifications and variations of the present invention fall within the scope of the claims of the present invention and their technical equivalents, the present invention is also intended to include them.
Claims (42)
1. A detection method for object closed contours, characterized by comprising:
an object closed-contour detection device, in the decoding process of semantic segmentation, applying dense upsampling convolution to the feature map output by the encoding process to obtain an output image of the same size as the input image, the output image including the closed contours of object instances; and
identifying and extracting the closed contours of object instances from the output image according to pixel class.
2. The method according to claim 1, characterized in that applying dense upsampling convolution to the feature map output by the encoding process to obtain an output image of the same size as the input image comprises:
converting the channel number (c) of the feature map into the product of the downsampling factor (d²) of the encoding process and the number (L) of predetermined object classes; and
combining the feature maps after the channel-number conversion, and normalizing the combined feature map to obtain an output image of the same size as the input image.
3. The method according to claim 2, characterized in that converting the channel number (c) of the feature map into the product of the downsampling factor (d²) of the encoding process and the number (L) of predetermined object classes comprises:
learning the feature map on each channel according to the numerical ratio between the channel number (c) of the feature map and the product (d²*L) of the downsampling factor and the number of predetermined object classes, to obtain a feature map with channel number (d²*L).
4. The method according to claim 2, characterized in that combining the feature maps after the channel-number conversion comprises:
combining the converted feature maps according to the feature-map order and the channel order obtained during feature acquisition.
5. The method according to claim 1, characterized in that the method further comprises:
in the encoding process of semantic segmentation, applying hybrid dilated convolution to the extracted feature map multiple times to obtain a feature map with an enlarged receptive field.
6. The method according to claim 5, characterized in that applying hybrid dilated convolution to the extracted feature map multiple times to obtain a feature map with an enlarged receptive field comprises:
on each of multiple dilated convolution layers, applying dilated convolution to the feature map using a kernel of size K*K with dilation rate r_i, wherein 1≤i≤n and n is the number of convolution layers.
7. The method according to claim 6, characterized in that the multiple dilated convolution layers are divided into several groups, the dilation rates of the dilated convolution layers within each group increasing steadily.
8. The method according to claim 6, characterized in that each dilated convolution layer has a kernel with an arbitrary dilation rate.
9. The method according to claim 6, characterized in that the increment of the dilation rate differs each time.
10. The method according to claim 6, characterized in that the size of the receptive field of the dilated convolution kernel in the last dilated convolution layer is less than or equal to the size of the feature map.
11. The method according to claim 5, characterized in that the method further comprises:
training a neural network in advance on ground-truth data to obtain a semantic segmentation model, the semantic segmentation model including the hybrid dilated convolution layers of the encoding stage.
12. The method according to claim 1, characterized in that the method further comprises:
training a neural network in advance on ground-truth data to obtain a semantic segmentation model, the semantic segmentation model including the dense upsampling convolution layers of the decoding stage.
13. The method according to claim 11 or 12, characterized in that a fully convolutional network is trained end-to-end in advance on ground-truth data to obtain the semantic segmentation model.
14. The method according to claim 1 or 5, characterized by further comprising:
determining the shape, size and/or area of an object instance according to its extracted closed contour.
15. A detection device for object closed contours, characterized by comprising:
a dense upsampling convolution module, configured to, in the decoding process of semantic segmentation, apply dense upsampling convolution to the feature map output by the encoding process to obtain an output image of the same size as the input image, the output image including the closed contours of object instances; and
a contour extraction module, configured to identify and extract the closed contour lines of object instances from the output image according to pixel class.
16. The device according to claim 15, characterized in that the dense upsampling convolution module is specifically configured to:
convert the channel number (c) of the feature map into the product of the downsampling factor (d²) of the encoding process and the number (L) of predetermined object classes; and
combine the feature maps after the channel-number conversion, and normalize the combined feature map to obtain an output image of the same size as the input image.
17. The device according to claim 16, characterized in that the dense upsampling convolution module converts the channel number (c) of the feature map into the product of the downsampling factor (d²) of the encoding process and the number (L) of predetermined object classes by:
learning the feature map on each channel according to the numerical ratio between the channel number (c) of the feature map and the product (d²*L) of the downsampling factor and the number of predetermined object classes, to obtain a feature map with channel number (d²*L).
18. The device according to claim 16, characterized in that the dense upsampling convolution module combines the feature maps after the channel-number conversion by:
combining the converted feature maps according to the feature-map order and the channel order obtained during feature acquisition.
19. The device according to claim 15, characterized in that the device further comprises:
a hybrid dilated convolution module, configured to, in the encoding process of semantic segmentation, apply hybrid dilated convolution to the extracted feature map multiple times to obtain a feature map with an enlarged receptive field.
20. The device according to claim 19, characterized in that the hybrid dilated convolution module is specifically configured to:
on each of multiple dilated convolution layers, apply dilated convolution to the feature map using a kernel of size K*K with dilation rate r_i, wherein 1≤i≤n and n is the number of convolution layers.
21. The device according to claim 20, characterized in that the hybrid dilated convolution module is further configured to divide the multiple dilated convolution layers into several groups, the dilation rates of the dilated convolution layers within each group increasing steadily.
22. The device according to claim 20, characterized in that each dilated convolution layer has a kernel with an arbitrary dilation rate.
23. The device according to claim 20, characterized in that the increment of the dilation rate differs each time.
24. The device according to claim 20, wherein the receptive field of the dilated convolution kernel of the last dilated convolutional layer is smaller than or equal to the size of the feature map.
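Claims 19–24 stack K×K dilated convolutions whose dilation rates increase within each group, and bound the last layer's receptive field by the feature-map size. The receptive field of such a stride-1 stack can be accumulated layer by layer; the helper below and the specific dilation rates are illustrative assumptions, not values fixed by the claims:

```python
def receptive_field(kernel, rates):
    """Receptive field of a stack of stride-1 dilated convolutions.

    Each layer with kernel size `kernel` and dilation rate r widens the
    receptive field of the stack so far by (kernel - 1) * r.
    """
    rf = 1
    for r in rates:
        rf += (kernel - 1) * r
    return rf

# HDC-style grouping: rates increase within each group, then the pattern
# repeats (two groups of 3x3 dilated convolutions here).
rates = [1, 2, 5, 1, 2, 5]
rf = receptive_field(3, rates)
print(rf)  # 1 + 2*(1+2+5)*2 = 33

feature_map_size = 64
assert rf <= feature_map_size  # the constraint of claim 24
```

Choosing rates without a common factor inside each group (e.g. 1, 2, 5 rather than 2, 4, 8) is what avoids the gridding artifact that hybrid dilated convolution is designed to prevent.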
25. The device according to claim 19, wherein the device further comprises:
a first pre-training module, configured to train a neural network in advance on ground-truth data to obtain a semantic segmentation model, the semantic segmentation model including the hybrid dilated convolutional layers of the encoding stage.
26. The device according to claim 15, wherein the device further comprises:
a second pre-training module, configured to train a neural network in advance on ground-truth data to obtain a semantic segmentation model, the semantic segmentation model including the dense upsampling convolutional layer of the decoding stage.
27. The device according to claim 25 or 26, wherein the first pre-training module trains a fully convolutional network end-to-end in advance on ground-truth data to obtain the semantic segmentation model; and
the second pre-training module trains a fully convolutional network end-to-end in advance on ground-truth data to obtain the semantic segmentation model.
28. The device according to claim 15 or 19, further comprising:
a determining module, configured to determine the shape, size and/or area of the object instance according to the extracted occluding contour of the object instance.
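Claim 28's determining module derives shape, size and/or area from the extracted occluding contour. One standard way to compute the enclosed area, assuming the contour is available as an ordered list of vertex coordinates (a representation chosen here for illustration, not specified by the claim), is the shoelace formula:

```python
def contour_area(points):
    """Area enclosed by a closed polygonal contour via the shoelace formula.

    `points` is an ordered list of (x, y) vertices; winding direction does
    not matter because the absolute value is taken at the end.
    """
    n = len(points)
    twice_area = 0.0
    for i in range(n):
        x1, y1 = points[i]
        x2, y2 = points[(i + 1) % n]  # wrap around to close the contour
        twice_area += x1 * y2 - x2 * y1
    return abs(twice_area) / 2.0

square = [(0, 0), (4, 0), (4, 4), (0, 4)]
print(contour_area(square))  # 16.0
```

Size can be read off the contour's bounding box in the same representation, and shape descriptors (e.g. aspect ratio, compactness) follow from area and perimeter.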
29. A detection device for closed contours of objects, comprising a processor and at least one memory, the at least one memory storing at least one machine-executable instruction, the processor executing the at least one machine-executable instruction to perform:
in the decoding process of semantic segmentation processing, performing dense upsampling convolution on the feature map output by the encoding process to obtain an output image of the same size as the input image, the output image including the occluding contours of object instances; and
identifying and extracting the occluding contour of an object instance from the output image according to pixel class.
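Claim 29 extracts occluding contours from the per-pixel class map that the segmentation network outputs. A minimal NumPy sketch of that last step, under the assumption (for illustration only) that contour pixels carry a dedicated class id so extraction reduces to selecting pixels of that class:

```python
import numpy as np

def extract_contour_pixels(label_map, contour_class):
    """Return (row, col) coordinates of pixels labelled with the contour class."""
    rows, cols = np.nonzero(label_map == contour_class)
    return list(zip(rows.tolist(), cols.tolist()))

# Toy 5x5 class map: class-1 pixels form a small closed outline.
label_map = np.zeros((5, 5), dtype=int)
label_map[1, 1:4] = 1   # top edge
label_map[3, 1:4] = 1   # bottom edge
label_map[1:4, 1] = 1   # left edge
label_map[1:4, 3] = 1   # right edge
print(len(extract_contour_pixels(label_map, 1)))  # 8
```

In a multi-instance setting the selected pixels would additionally be grouped into connected components, one closed contour per object instance.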
30. The device according to claim 29, wherein the processor executes the at least one machine-executable instruction to perform the dense upsampling convolution on the feature map output by the encoding process and obtain an output image of the same size as the input image by:
converting the number of channels (c) of the feature map into the product of the square (d²) of the downsampling factor used in the encoding process and the number of predetermined object categories (L); and
combining the channel-converted feature maps, and normalizing the combined feature map to obtain an output image of the same size as the input image.
31. The device according to claim 30, wherein the processor executes the at least one machine-executable instruction to perform the conversion of the number of channels (c) of the feature map into the product of the square (d²) of the downsampling factor used in the encoding process and the number of predetermined object categories (L) by:
learning the feature map on each channel according to the numerical proportion between the number of channels (c) of the feature map and the product (d²*L) of the square of the downsampling factor and the number of predetermined object categories, so that the number of channels of the converted feature maps is (d²*L).
32. The device according to claim 30, wherein the processor executes the at least one machine-executable instruction to combine the channel-converted feature maps by:
combining the channel-converted feature maps according to the acquisition order and channel order of the feature maps.
33. The device according to claim 29, wherein the processor executes the at least one machine-executable instruction to further perform:
in the encoding process of the semantic segmentation processing, performing multiple hybrid dilated convolutions on the extracted feature maps to obtain feature maps with enlarged receptive fields.
34. The device according to claim 33, wherein the processor executes the at least one machine-executable instruction to perform the multiple hybrid dilated convolutions on the extracted feature maps and obtain the feature maps with enlarged receptive fields by:
on each of a plurality of dilated convolutional layers, performing dilated convolution on the feature map using a convolution kernel of size K×K with dilation rate rᵢ, where 1≤i≤n and n is the number of convolutional layers.
35. The device according to claim 34, wherein the processor executes the at least one machine-executable instruction to further divide the plurality of dilated convolutional layers into several groups, the dilation rates of the dilated convolutional layers within each group increasing monotonically.
36. The device according to claim 34, wherein each dilated convolutional layer has a convolution kernel with an arbitrary dilation rate.
37. The device according to claim 34, wherein the factor by which the dilation rate increases differs between increments.
38. The device according to claim 34, wherein the receptive field of the dilated convolution kernel of the last dilated convolutional layer is smaller than or equal to the size of the feature map.
39. The device according to claim 33, wherein the processor executes the at least one machine-executable instruction to further perform:
training a neural network in advance on ground-truth data to obtain a semantic segmentation model, the semantic segmentation model including the hybrid dilated convolutional layers of the encoding stage.
40. The device according to claim 29, wherein the processor executes the at least one machine-executable instruction to further perform:
training a neural network in advance on ground-truth data to obtain a semantic segmentation model, the semantic segmentation model including the dense upsampling convolutional layer of the decoding stage.
41. The device according to claim 39 or 40, wherein the processor executes the at least one machine-executable instruction to further perform: training a fully convolutional network end-to-end in advance on ground-truth data to obtain the semantic segmentation model.
42. The device according to claim 29 or 33, wherein the processor executes the at least one machine-executable instruction to further perform:
determining the shape, size and/or area of the object instance according to the extracted occluding contour of the object instance.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202210425480.0A CN114782705A (en) | 2017-08-31 | 2018-06-29 | Method and device for detecting closed contour of object |
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US 15/693,446 | 2017-08-31 | ||
US15/693,446 US10067509B1 (en) | 2017-03-10 | 2017-08-31 | System and method for occluding contour detection |
Related Child Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202210425480.0A Division CN114782705A (en) | 2017-08-31 | 2018-06-29 | Method and device for detecting closed contour of object |
Publications (1)
Publication Number | Publication Date |
---|---|
CN109426825A true CN109426825A (en) | 2019-03-05 |
Family
ID=65513699
Family Applications (2)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201810722257.6A Pending CN109426825A (en) | 2017-08-31 | 2018-06-29 | A kind of detection method and device of object closed outline |
CN202210425480.0A Pending CN114782705A (en) | 2017-08-31 | 2018-06-29 | Method and device for detecting closed contour of object |
Family Applications After (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202210425480.0A Pending CN114782705A (en) | 2017-08-31 | 2018-06-29 | Method and device for detecting closed contour of object |
Country Status (1)
Country | Link |
---|---|
CN (2) | CN109426825A (en) |
Citations (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2009013636A2 (en) * | 2007-05-22 | 2009-01-29 | The University Of Western Ontario | A method for automatic boundary segmentation of object in 2d and/or 3d image |
CN102043950A (en) * | 2010-12-30 | 2011-05-04 | 南京信息工程大学 | Vehicle outline recognition method based on the Canny operator and edge-point statistics |
CN106780536A (en) * | 2017-01-13 | 2017-05-31 | 深圳市唯特视科技有限公司 | Shape-aware instance segmentation method based on an object mask network |
CN106886801A (en) * | 2017-04-14 | 2017-06-23 | 北京图森未来科技有限公司 | Image semantic segmentation method and device |
CN107038693A (en) * | 2015-10-27 | 2017-08-11 | 富士通天株式会社 | Image processing equipment and image processing method |
CN107092870A (en) * | 2017-04-05 | 2017-08-25 | 武汉大学 | High-resolution image semantic information extraction method and system |
CN107105130A (en) * | 2016-02-19 | 2017-08-29 | 三星电子株式会社 | Electronic equipment and its operating method |
2018
- 2018-06-29 CN 201810722257.6A filed (published as CN109426825A, status: pending)
- 2018-06-29 CN 202210425480.0A filed (published as CN114782705A, status: pending)
Non-Patent Citations (3)
Title |
---|
PANQU WANG ET AL.: "Understanding Convolution for Semantic Segmentation", 《COMPUTER VISION AND PATTERN RECOGNITION》 * |
WENZHE SHI ET AL.: "Real-Time Single Image and Video Super-Resolution Using an Efficient Sub-Pixel Convolutional Neural Network", 《2016 IEEE CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION》 * |
有些代码不应该被忘记: "A Complete Overview of Deep Learning Methods for Semantic Segmentation: From FCN and SegNet to the DeepLab Variants", 《https://blog.csdn.net/scutjy2015/article/details/74971060》 * |
Cited By (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110189264A (en) * | 2019-05-05 | 2019-08-30 | 深圳市华星光电技术有限公司 | Image processing method |
CN110189264B (en) * | 2019-05-05 | 2021-04-23 | Tcl华星光电技术有限公司 | Image processing method |
WO2020228279A1 (en) * | 2019-05-10 | 2020-11-19 | 平安科技(深圳)有限公司 | Image palm region extraction method and apparatus |
CN110544261A (en) * | 2019-09-04 | 2019-12-06 | 东北大学 | Blast furnace tuyere coal injection state detection method based on image processing |
CN110544261B (en) * | 2019-09-04 | 2023-08-29 | 东北大学 | Method for detecting coal injection state of blast furnace tuyere based on image processing |
CN113408531A (en) * | 2021-07-19 | 2021-09-17 | 北博(厦门)智能科技有限公司 | Target object shape framing method based on image recognition and terminal |
CN113408531B (en) * | 2021-07-19 | 2023-07-14 | 北博(厦门)智能科技有限公司 | Target object shape frame selection method and terminal based on image recognition |
CN113888567A (en) * | 2021-10-21 | 2022-01-04 | 中国科学院上海微系统与信息技术研究所 | Training method of image segmentation model, image segmentation method and device |
Also Published As
Publication number | Publication date |
---|---|
CN114782705A (en) | 2022-07-22 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN109934163B (en) | Aerial image vehicle detection method based on scene prior and feature re-fusion | |
CN109426825A (en) | Method and device for detecting closed contours of objects | |
CN108416377B (en) | Method and device for extracting information from histograms | |
CN104978580B (en) | Insulator recognition method for UAV-based power transmission line inspection | |
CN106056155B (en) | Superpixel segmentation method based on boundary information fusion | |
CN110119780A (en) | Hyperspectral image super-resolution reconstruction method based on generative adversarial networks | |
CN109447994A (en) | Remote sensing image segmentation method combining full residual connections and feature fusion | |
CN106599869A (en) | Vehicle attribute identification method based on a multi-task convolutional neural network | |
CN107644426A (en) | Image semantic segmentation method based on a pyramid-pooling encoder-decoder structure | |
CN106920243A (en) | Sequence image segmentation method for ceramic material parts using an improved fully convolutional neural network | |
CN106097444A (en) | High-precision map generation method and apparatus | |
CN107944443A (en) | Object consistency detection method based on end-to-end deep learning | |
CN106599805A (en) | Supervised data-driven monocular video depth estimation method | |
CN101299235A (en) | Face super-resolution reconstruction method based on kernel principal component analysis | |
CN113379771B (en) | Hierarchical human parsing semantic segmentation method with edge constraint | |
CN110335199A (en) | Image processing method, device, electronic equipment and storage medium | |
CN111144418B (en) | Railway track area segmentation and extraction method | |
CN110070091A (en) | Semantic segmentation method and system based on dynamic interpolation reconstruction for street-scene understanding | |
CN109002752A (en) | Rapid pedestrian detection method for complex common scenes based on deep learning | |
CN103984963B (en) | Method for classifying high-resolution remote sensing image scenes | |
CN107564009A (en) | Multi-target segmentation method for outdoor scenes based on deep convolutional neural networks | |
CN112288776B (en) | Target tracking method based on a multi-time-step pyramid codec | |
CN110399760A (en) | Batch two-dimensional-code localization method, device, electronic equipment and storage medium | |
CN107045722A (en) | Video signal processing method fusing static and dynamic information | |
CN110705366A (en) | Real-time human head detection method for staircase scenes |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | |
SE01 | Entry into force of request for substantive examination | |
RJ01 | Rejection of invention patent application after publication | Application publication date: 2019-03-05 |