CN108876805A - An end-to-end unsupervised method for cognition and understanding of drivable areas in scenes - Google Patents
- Publication number: CN108876805A (application number CN201810636311.5A)
- Authority: CN (China)
- Prior art keywords: network, drivable area, probability distribution, FCN, road
- Legal status: Granted (the legal status is an assumption and is not a legal conclusion; Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed)
Classifications
- G06T7/194 — Image analysis: segmentation; edge detection involving foreground-background segmentation
- G06N3/045 — Neural networks: architecture; combinations of networks
- G06N3/048 — Neural networks: architecture; activation functions
- G06N3/088 — Neural networks: learning methods; non-supervised learning, e.g. competitive learning
- G06T5/50 — Image enhancement or restoration using two or more images, e.g. averaging or subtraction
- G06T7/13 — Image analysis: segmentation; edge detection
Abstract
The invention discloses an end-to-end unsupervised method for determining the drivable road area of a scene. A road-position prior probability distribution map is constructed and attached directly to a convolutional layer as a feature map of the detection network, yielding a convolutional network framework that fuses position prior features. A deep network architecture, the UC-FCN network, is then built by combining a fully convolutional network (FCN) with U-Net, and the constructed drivable-area position prior probability distribution map is added to it as an additional feature map, producing the UC-FCN-L network. The drivable area is detected with a vanishing-point detection method, and the resulting detections serve as the ground truth of the training dataset used to train the UC-FCN-L network, yielding a deep network model for drivable-area extraction. The method solves the difficulty of labeling drivable areas, has strong applicability, operates stably in a variety of road environments with good real-time performance, achieves high detection accuracy with good adaptability and robustness, and is simple and effective.
Description
Technical field
The invention belongs to the technical field of traffic control, and relates in particular to an end-to-end self-supervised method, based on video datasets, for cognition and understanding of drivable areas in scenes.
Background art
With the development of society, the automobile has become an irreplaceable means of transport in daily life. However, the safety problems it brings have become increasingly prominent. The Global Status Report on Road Safety points out that traffic accidents cause up to 1.24 million deaths every year, and that the main causes of these accidents are driver carelessness and fatigue. To alleviate this situation, the development of intelligent automobile technology is particularly important. In research on autonomous driving and advanced driver assistance based on computer vision, real-time cognition and understanding of the drivable region in front of the vehicle is an essential link. The drivable regions of a vehicle include structured, semi-structured, and unstructured road surfaces. A structured road surface usually has road edge lines and a simple pavement structure, such as major urban arterial roads, expressways, national highways, and provincial highways. A semi-structured road surface refers to a generally non-standardized surface whose top layer varies greatly in color and material, such as parking lots, squares, and some branch roads. An unstructured road surface has no structural layer and corresponds to natural road scenes. At present, intelligent vehicles mainly combine radar and cameras for cognition and understanding of the drivable region; however, radar (lidar, millimeter-wave radar, ultrasonic radar) is typically expensive, consumes considerable power, and is prone to mutual interference.
Vision-based methods for cognition and understanding of drivable regions mainly use road color, road models, road texture, and similar basic structural features of the road surface, from which potential information such as the vanishing point, road edge lines, and the basic road direction (straight, left turn, right turn, sharp left, sharp right) is further obtained. These features are then used for the final extraction of the drivable region with traditional segmentation methods. However, the results of conventional segmentation are often unsatisfactory: traffic participants such as vehicles and pedestrians may be included in the extracted drivable region, which adversely affects the subsequent steps of intelligent-vehicle driving.
Summary of the invention
The purpose of the present invention is to provide an end-to-end unsupervised method for cognition and understanding of drivable areas in scenes, so as to overcome the deficiencies of the prior art.
In order to achieve the above objectives, the present invention adopts the following technical scheme:
An end-to-end unsupervised method for determining the drivable road area of a scene comprises the following steps:
Step 1): construct a road-position prior probability distribution map and attach it directly to a convolutional layer as a feature map of the detection network, thereby constructing a drivable-area position prior probability distribution map whose position prior information can be applied flexibly in real road traffic environments;
Step 2): combine a fully convolutional network (FCN) with U-Net to construct a deep network architecture, the UC-FCN network, as the main network model for detection;
Step 3): use the constructed drivable-area position prior probability distribution map as an additional feature map of the UC-FCN network, determine the best attachment position, and attach the map directly at that position in the fully convolutional layers, generating the UC-FCN-L network;
Step 4): detect the drivable area with a vanishing-point detection method, use the detection results as the ground truth of the training dataset to train the UC-FCN-L network, and obtain the deep network model for drivable-area extraction.
Further, in step 1), the drivable-area position prior probability distribution map is constructed statistically, using the distribution regularity of road areas in space and in the image.
Further, in step 1), based on real images and ground-truth images of urban roads with and without lane markings from the KITTI dataset, the drivable areas are counted statistically, yielding a drivable-area position prior probability distribution map for each of the two road conditions; the two maps are then fused to obtain the drivable-area position prior probability distribution map.
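The counting, averaging, and fusion steps of this claim can be sketched as follows. The binary masks, mask counts, and image size are illustrative assumptions, and since the text does not spell out the fusion rule, a pixel-wise average of the two priors is assumed.

```python
import numpy as np

# Illustrative stand-ins for KITTI ground-truth masks: each mask is a binary
# H x W array, 1 where the pixel was judged drivable, 0 elsewhere.
rng = np.random.default_rng(0)
H, W = 40, 120
masks_marked = rng.integers(0, 2, size=(50, H, W))    # roads with lane markings
masks_unmarked = rng.integers(0, 2, size=(50, H, W))  # roads without markings

def position_prior(masks):
    # Count, per coordinate, how often it was judged drivable, and average.
    return masks.mean(axis=0)  # values in [0, 1]; brightness = P(drivable)

prior_marked = position_prior(masks_marked)
prior_unmarked = position_prior(masks_unmarked)

# Fuse the two road-condition priors; a pixel-wise average is assumed here.
prior_fused = (prior_marked + prior_unmarked) / 2.0
assert prior_fused.shape == (H, W)
assert prior_fused.min() >= 0.0 and prior_fused.max() <= 1.0
```

Each pixel of `prior_fused` then plays the role described below: its value (brightness) is the prior probability that the position belongs to the drivable area.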
Further, based on real images and ground-truth images of urban roads with and without lane markings from the KITTI dataset, the drivable areas are counted statistically: for each coordinate position, the number of times it is judged to be drivable is counted and averaged, yielding a drivable-area position prior probability distribution map for each of the two road conditions. In the probability distribution map, the brightness of each pixel indicates the probability that the pixel belongs to the target: the higher the brightness, the greater the probability of belonging to the target; conversely, the lower the brightness, the smaller the probability. Through the probability distribution image, the drivable area is separated from the scene; the two prior probability distribution maps are then fused to obtain the drivable-area position prior probability distribution map.
Further, in step 2), the UC-FCN network comprises a contraction structure and an expansion structure. The contraction structure performs convolution and pooling operations, gradually reducing the spatial dimensions so that the resulting images become smaller and smaller and their resolution lower and lower. The expansion structure then replaces the pooling operation that follows each convolutional layer of the contraction structure with an up-sampling operation after the convolutional layer; the high-resolution features produced in the contraction structure are connected to the results after convolution in the expansion structure, increasing the output resolution and gradually restoring the details and spatial dimensions of objects.
Further, the expansion structure uses a repeated up-sampling-plus-convolution architecture. The up-sampling in each repeated block is specifically an up-sampling-plus-ReLU structure: the input is up-sampled by a factor of 2 using bilinear interpolation, after which ReLU is used to counter the vanishing-gradient problem. After up-sampling, a convolution operation changes the number of channels of the feature map; the convolution kernel size in the convolutional layer is 3*3. The convolved result is fused with the feature map of the corresponding step in the contraction structure, and finally passes through a softmax layer to obtain a high-accuracy recognition result.
Further, in step 3), the drivable-area position prior probability distribution map is scaled proportionally to the same size as the last feature map to which it is connected; the adjusted map is then attached at the corresponding position as an additional feature map of the UC-FCN network, generating the UC-FCN-L network.
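A minimal sketch of this attachment step, assuming nearest-neighbour rescaling (the interpolation method is not specified here) and a hypothetical 256-channel 33*33 feature map as the connection point:

```python
import numpy as np

def resize_nearest(img, out_h, out_w):
    # Proportionally rescale a 2-D map. Nearest-neighbour sampling is used
    # for simplicity; the interpolation method is not specified in the text.
    h, w = img.shape
    rows = np.arange(out_h) * h // out_h
    cols = np.arange(out_w) * w // out_w
    return img[rows][:, cols]

# Hypothetical prior map at full image resolution, and a last feature map of
# the 33*33 size named in the description (256 channels assumed).
prior = np.random.default_rng(1).random((375, 1242))
features = np.zeros((256, 33, 33))

prior_small = resize_nearest(prior, 33, 33)
# Attach the adjusted prior map as one extra feature-map channel.
features_with_prior = np.concatenate([features, prior_small[None]], axis=0)
print(features_with_prior.shape)  # (257, 33, 33)
```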
Further, the drivable areas of the collected training images are detected with a vanishing-point method, and the detection results serve as the ground truth (GT) of the training data. During training, the network parameters are continuously refined so as to reduce the difference between the detections of the proposed network model and the detections obtained from the vanishing point, yielding a network architecture that can ultimately be used for drivable-area detection.
Further, in step 4) the UC-FCN-L network is trained in an unsupervised manner, yielding the deep network model for drivable-area extraction.
Further, regarding the unsupervised training mode: the samples are divided into labeled and unlabeled samples. The labeled samples are the l samples in the training set Dl = {(x1,y1),(x2,y2),...,(xl,yl)} whose class labels are known; the unlabeled samples are the u samples in the training set Du = {xl+1,xl+2,...,xl+u} whose class labels are unknown, with u much larger than l. Training a model on the labeled samples Dl while leaving the information contained in the unlabeled samples Du unused is called supervised learning; if the labeled samples Dl are lacking, learning of the model must be achieved from the unlabeled samples Du.
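The sample split described above can be illustrated with a small sketch; the placeholder feature vectors and the counts l = 5, u = 500 are hypothetical, chosen only to satisfy u >> l:

```python
# Labeled set D_l = {(x_1, y_1), ..., (x_l, y_l)}: l samples with known labels.
# Unlabeled set D_u = {x_{l+1}, ..., x_{l+u}}: u samples with unknown labels.
# The x_i here are placeholder feature vectors; in the patent they would be
# traffic-scene images, and the pseudo-labels come from vanishing-point detection.
l, u = 5, 500  # u is much larger than l

D_l = [([0.0] * 4, 1) for i in range(l)]    # (sample, known label) pairs
D_u = [[0.0] * 4 for i in range(l, l + u)]  # samples only, labels unknown

# Supervised learning builds the model from D_l alone; when D_l is lacking,
# learning must instead be achieved from the unlabeled samples D_u.
assert len(D_l) == l and len(D_u) == u and u > len(D_l)
```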
Compared with prior art, the invention has the following beneficial technical effects:
The end-to-end unsupervised method of the invention for determining the drivable road area of a scene constructs a road-position prior probability distribution map and attaches it directly to a convolutional layer as a feature map of the detection network, thereby constructing a drivable-area position prior probability distribution map whose position prior information can be applied flexibly in real road traffic environments, and building a convolutional network framework that fuses position prior features. A deep network architecture, the UC-FCN network, is then built by combining a fully convolutional network with U-Net as the main network model for detection. The constructed drivable-area position prior probability distribution map is used as an additional feature map of the UC-FCN network; the best attachment position is determined, and the map is attached directly at that position in the fully convolutional layers, generating the UC-FCN-L network. The drivable area is detected with a vanishing-point detection method, and the detection results serve as the ground truth of the training dataset to train the UC-FCN-L network, yielding a deep network model for drivable-area extraction. The self-supervised learning mode solves the difficulty of labeling drivable areas; the method has strong applicability, operates stably in a variety of road environments with good real-time performance, and can be widely used in intelligent automobiles and driver assistance systems. Compared with existing methods for cognition and understanding of drivable regions, it achieves high detection accuracy with good adaptability, real-time performance, and robustness, and is simple and effective.
Further, based on real images and ground-truth images of urban roads with and without lane markings from the KITTI dataset, the drivable areas are counted statistically, yielding a drivable-area position prior probability distribution map for each of the two road conditions; the two maps are then fused to obtain the drivable-area position prior probability distribution map. This eliminates the erroneous detections caused by traditional convolutional neural networks being insensitive to position priors and therefore confusing foreground and background regions that have similar appearance features.
Further, the contraction structure performs convolution and pooling operations, gradually reducing the spatial dimensions so that the resulting images become smaller and smaller and their resolution lower and lower; the expansion structure then replaces the pooling operation after each convolutional layer of the contraction structure with an up-sampling operation after the convolutional layer, and the high-resolution features produced in the contraction structure are connected to the results after convolution in the expansion structure, increasing the output resolution and gradually restoring the details and spatial dimensions of objects, thereby achieving both faster detection and higher detection accuracy.
Brief description of the drawings
Fig. 1 is the overall framework diagram of the method for cognition and understanding of drivable areas in scenes.
Fig. 2 is a schematic of the position prior and the position prior features: (a) the spatial distribution of objects in an actual traffic scene; (b) real images and ground-truth images of urban roads with and without lane markings from the KITTI dataset.
Fig. 3 is a schematic of the UC-FCN network architecture.
Fig. 4 is a comparison of drivable-area position prior probability distribution maps under different connections.
Fig. 5 is an overall schematic of the UC-FCN-L network.
Specific embodiments
The invention will be described in further detail below with reference to the accompanying drawings.
As shown in Fig. 1, an end-to-end unsupervised method for determining the drivable road area of a scene specifically comprises the following steps:
1) Using the distribution regularity of road areas in space and in the image, statistically construct a road-position prior probability distribution map and attach it directly to a convolutional layer as a feature map of the detection network, constructing a drivable-area position prior probability distribution map whose position prior information can be applied flexibly in real road traffic environments.
2) For the cognition and understanding of drivable areas, that is, the road-surface detection and segmentation problem, combine a fully convolutional network (FCN) with U-Net to construct a new deep network architecture, the UC-FCN network, as the main network model for detection.
3) Use the constructed drivable-area position prior probability distribution map as an additional feature map of the UC-FCN network; its best attachment position is determined experimentally, and the map is attached directly at that position in the fully convolutional layers, generating the UC-FCN-L network.
4) To address the difficulty of obtaining pixel-level semantic labels for training data from self-recorded traffic-scene video datasets, an unsupervised training method is proposed: the drivable area is roughly detected with a traditional vanishing-point detection method, and the detection results serve as the ground truth of the training dataset to train the UC-FCN-L network, yielding the deep network model for drivable-area extraction.
In step 1), to eliminate the erroneous detections caused by traditional convolutional neural networks being insensitive to position priors and therefore confusing foreground and background regions with similar appearance features, real images and ground-truth images of urban roads with and without lane markings from the KITTI dataset are used: the drivable areas are counted statistically, yielding a drivable-area position prior probability distribution map for each of the two road conditions, and the two prior probability distribution maps are then fused to obtain the drivable-area position prior probability distribution map. As shown in Fig. 2(a), the spatial distribution of objects in an actual traffic scene follows specific rules: for example, the sky is distributed at the top of the image, buildings on both sides, and the road area at the bottom. Traditional convolutional neural networks are sensitive only to the local appearance features of the target and cannot exploit position prior information, so a building region may mistakenly be taken for a road region with similar appearance features; reasonable use of the position prior can effectively eliminate such erroneous detections. To apply the position prior information flexibly in real road traffic environments, different input images should share the same position feature representation, so the position prior is attached directly to a convolutional layer as a feature map of the detection network. As shown in Fig. 2(b), based on the real images and ground-truth images of urban roads with and without lane markings from the KITTI dataset, the drivable areas are counted: for each coordinate position, the number of times it is judged to be drivable is counted and averaged, yielding a drivable-area position prior probability distribution map for each of the two road conditions. In the probability distribution map, the brightness of each pixel indicates the probability that the pixel belongs to the target: the higher the brightness, the greater that probability; conversely, the lower the brightness, the smaller the probability. Through the probability distribution image, the drivable area is separated from the scene; the two prior probability distribution maps are then fused to obtain the drivable-area position prior probability distribution map.
In step 2), a new deep network architecture, the UC-FCN network, is proposed based on fully convolutional neural networks. Since 2012, convolutional neural networks (CNNs) have achieved great success and wide application in image classification and related tasks. Traditional CNN methods use pixel blocks as the sensing region and can extract only some local features, which limits classification performance. To address this, Jonathan Long et al. of UC Berkeley proposed Fully Convolutional Networks (FCN) for image segmentation, aiming to recover the class of each pixel from abstract features. The FCN converts the fully connected layers of a traditional CNN into convolutional layers one by one; since all layers are convolutional, it is called a fully convolutional network.
Building the network on the FCN method, our improved construction is as follows:
The UC-FCN network comprises a contraction structure and an expansion structure. The contraction structure performs convolution and pooling operations, gradually reducing the spatial dimensions so that the resulting images become smaller and smaller and their resolution lower and lower. To recover the resolution of the original image from these low-resolution coarse images, the expansion structure replaces the pooling operation after each convolutional layer of the contraction structure with an up-sampling operation after the convolutional layer, increasing the output resolution and gradually restoring the details and spatial dimensions of objects. To exploit local information, connections are set between the two modules to help the expansion structure better restore the details of the target: specifically, the high-resolution features produced in the contraction structure are connected to the results after convolution in the expansion structure.
Built on the FCN, the UC-FCN network is thus composed of a contraction structure, which gradually reduces the spatial dimensions through convolution and pooling operations so that the image becomes smaller and smaller, and an expansion structure, which increases the output resolution through up-sampling and convolution operations and gradually restores object details and spatial dimensions. To exploit local information, a connection is set between the two structures to help the expansion structure better restore the details of the target: specifically, the high-resolution features produced in the contraction structure are connected to the results after convolution in the expansion structure. Through these improvements to the two-part structure, both faster detection and higher detection accuracy are achieved.
Specifically, as shown in Fig. 3, since the height and width of the feature maps are smaller than those of the input, the drivable-area position prior probability distribution map is scaled proportionally to the same size as the last feature map to which it is connected. The drivable-area position prior probability distribution map may be connected after the 33*33 or the 15*15 feature map; the former carries more accurate position prior information than the latter, can describe more diverse and more irregular shapes, better captures details such as distant roads and small turns, and yields more accurate detection results. The final drivable-area position prior probability distribution map is attached at the corresponding position as an additional feature map of the UC-FCN network, generating the UC-FCN-L network.
The contraction structure is a typical convolutional network architecture with a repeated structure: each repetition contains two convolutional layers and one pooling layer. The convolution kernel size in the convolutional layers is 3*3, the activation function is ReLU, and the two convolutional layers are followed by a 2*2 max-pooling layer with stride 2; the number of feature channels doubles after each down-sampling. After five convolution-pooling repetitions comes a fully convolutional structure with two convolutional layers; the improvement of the FCN is precisely to replace the fully connected layers of the CNN here with convolutional layers. The FCN uses VGG16 for the feature-extraction stage (the contraction structure); this network has 4096 filters of size 7*7 in the fully convolutional structure, and the large number of large filters makes the computation heavy. We reduce the number of filters in the fully convolutional structure from 4096 to 1024 and the filter size from 7*7 to 3*3, so the network parameters and the computation are reduced accordingly; the accuracy also declines somewhat, and to maintain the recognition accuracy of the network, corresponding improvements are made in the expansion structure.
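Under stated assumptions, the spatial sizes implied by this contraction structure can be checked arithmetically. Taking 259 as the output of the first convolution-pooling block, and assuming 'same'-padded 3*3 convolutions inside the blocks, ceil-rounded 2*2/stride-2 pooling, and one unpadded 3*3 convolution in the fully convolutional part (none of which the text states explicitly), the sequence matches the sizes 259, 130, 65, 33, 17 and 15 listed later in the description:

```python
import math

def contraction_sizes(first_size=259, n_pools=4, valid_conv_k=3):
    # Sketch of the spatial sizes along the contraction structure.
    sizes = [first_size]
    for _ in range(n_pools):                      # 2*2 max pool, stride 2
        sizes.append(math.ceil(sizes[-1] / 2))    # ceil rounding assumed
    sizes.append(sizes[-1] - (valid_conv_k - 1))  # one valid 3*3 convolution
    return sizes

print(contraction_sizes())  # [259, 130, 65, 33, 17, 15]
```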
Specifically, the expansion structure uses a repeated up-sampling-plus-convolution architecture. The up-sampling in each repeated block is an up-sampling-plus-ReLU structure: the input is up-sampled by a factor of 2 using bilinear interpolation, after which ReLU is used to counter the vanishing-gradient problem. Each up-sampling doubles the feature-map size; after up-sampling, a convolution operation changes the number of channels of the feature map, with a 3*3 convolution kernel in the convolutional layer. The convolved result is fused with the feature map of the corresponding step in the contraction structure, and finally passes through a softmax layer to obtain the recognition result.
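The repeated expansion step can be sketched with NumPy. Nearest-neighbour repetition stands in for the bilinear interpolation named above, the 3*3 convolution is elided (a channel slice stands in for its class-score output), and the tensor shapes are hypothetical:

```python
import numpy as np

def upsample2x(x):
    # Double the spatial size of a (C, H, W) feature map. Nearest-neighbour
    # repetition stands in for the bilinear interpolation named in the text.
    return x.repeat(2, axis=1).repeat(2, axis=2)

def relu(x):
    return np.maximum(x, 0.0)

def softmax_channels(x):
    # Pixel-wise softmax over the channel axis, as in the final layer.
    e = np.exp(x - x.max(axis=0, keepdims=True))
    return e / e.sum(axis=0, keepdims=True)

rng = np.random.default_rng(2)
x = rng.standard_normal((8, 15, 15))     # expansion-structure input (hypothetical)
skip = rng.standard_normal((8, 30, 30))  # corresponding contraction feature map

up = relu(upsample2x(x))                    # 2x up-sampling + ReLU
fused = np.concatenate([up, skip], axis=0)  # fuse with the corresponding step
scores = fused[:2]                          # stand-in for the 3*3 convolution
probs = softmax_channels(scores)            # softmax layer: per-pixel classes
assert up.shape == (8, 30, 30)
assert np.allclose(probs.sum(axis=0), 1.0)
```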
To keep the reduction in the number of contraction-structure filters from affecting recognition accuracy, the expansion structure is specifically improved as follows:
1) A conv-Ncl layer is added between the contraction structure and the expansion structure. Its convolution kernel size is 1*1, and it converts the number of feature-map channels from 1024 to a specific number; to simplify the subsequent classification computation, the converted channel number is set directly to the number of classes.
2) To match the expansion-structure convolution results with the channel numbers of the contraction-structure feature maps, all architecture layers of the expansion structure use a scalar value C as a coefficient on the number of convolution kernels, avoiding the substantial increase in network parameters that multiple convolution kernels would cause. The expansion part of the new network has C*Ncl convolution kernels; C is adjusted according to the position of the corresponding feature map so that it matches the number of convolution kernels of the corresponding contraction structure.
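The conv-Ncl layer of improvement 1) is a 1*1 convolution, i.e. a per-pixel linear map over channels. A minimal sketch, with random weights and the channel number set to a hypothetical class count of 2:

```python
import numpy as np

def conv1x1(x, weights):
    # A 1*1 convolution is a per-pixel linear map over channels:
    # (C_in, H, W) combined with (C_out, C_in) gives (C_out, H, W).
    return np.einsum('oc,chw->ohw', weights, x)

rng = np.random.default_rng(3)
n_classes = 2                            # channel count set to the class number
x = rng.standard_normal((1024, 15, 15))  # output of the reduced full-conv part
w = rng.standard_normal((n_classes, 1024))

y = conv1x1(x, w)
assert y.shape == (n_classes, 15, 15)    # spatial size unchanged by 1*1 kernels
```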
In step 3), the constructed traversable-area location-prior probability distribution map is used as a feature map of the UC-FCN network: it is attached directly to a fully convolutional layer, position features are extracted from it, and the UC-FCN-L network is generated. As stated in step 1), the location prior can effectively avoid some unreasonable false detections. Because the feature maps are smaller than the input, the traversable-area location-prior probability distribution map must be scaled proportionally to the size of the last feature map to which it is connected. The convolution that generates the feature maps is a repetitive structure, repeated 7 times in the UC-FCN network; the output widths and heights are 259*259, 130*130, 65*65, 33*33, 17*17, and 15*15 respectively (the final two fully convolutional feature maps keep their size unchanged). The effect of attaching the traversable-area location-prior probability distribution map differs with the number of convolution layers used for feature extraction: the more convolution layers, the more specific and detailed the extracted feature information; the fewer layers, the coarser the features and the better they cover global information. As auxiliary information for traversable-area detection, the location prior corrects the detection result to some extent, so the extraction of its features should retain contour information while still including detail. The traversable-area location-prior probability distribution map is therefore connected behind the 33*33 feature map. At this resolution the prior can describe more diverse and more irregular shapes: the extracted features capture contour information such as the general shape and position of the road while still representing detail such as distant road surface and small turns, yielding a more accurate detection result. Placing the final traversable-area location-prior probability distribution map at this corresponding position gives the depth network model for traversable-area extraction, as shown in Figure 5.
In step 4), the UC-FCN-L network is trained in an unsupervised manner, yielding the depth network model for traversable-area extraction. Samples are indispensable in deep-learning training and divide broadly into labeled and unlabeled samples. Labeled samples are the l samples in the training set Dl = {(x1, y1), (x2, y2), ..., (xl, yl)} whose class labels are known; unlabeled samples are the u samples in Du = {xl+1, xl+2, ..., xl+u} whose class labels are unknown, with u much larger than l. Training a model only on the labeled set Dl, without using the information contained in the unlabeled set Du, is called supervised learning; if labeled samples Dl are lacking, learning of the model must be achieved from the unlabeled samples Du alone, and this training with only unlabeled samples is called unsupervised learning.
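The split between the labeled set Dl and the unlabeled set Du can be made concrete with a toy sketch (sizes and shapes are invented for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)
l, u = 5, 500                               # u is much larger than l
images = rng.random((l + u, 8, 8))          # all collected frames
labels = rng.integers(0, 2, (l, 8, 8))      # pixel labels known only for D_l

D_l = list(zip(images[:l], labels))         # labeled samples (x_i, y_i)
D_u = list(images[l:])                      # unlabeled samples x_{l+1..l+u}
```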
The network architecture proposed by the present invention for traversable-area cognition and understanding of a scene is based on a self-collected traffic-scene video dataset, shown in Figure 4, which contains real image data acquired in urban, rural, highway, and other scenes; part of this image data is selected for training and testing. Traversable-area cognition and understanding is essentially pixel-level segmentation of the image, so obtaining segmentation ground truth requires a pixel-level semantic label for every training image. However, pixel-level labeling of a large volume of collected real-scene data is extremely laborious, so the network must be trained with an unsupervised method.
Specifically, traversable-area detection is first performed on the collected training images with a conventional vanishing-point method. The vanishing point is the unique intersection, on the image plane, of the images of a group of parallel lines in space. Vanishing-point-based region detection proceeds in the following steps: perform texture analysis at multiple scales with Gabor wavelets and discard points whose texture is not significant; examine the relation between each point and the texture information, computing a score for each point by texture voting; and search for the road edges from the vanishing point to obtain the road-surface region. The detection result obtained from the vanishing point is then used as the ground-truth value GT of the training data. During network training, the network parameters are refined continuously to reduce the difference between the detection result of the proposed network model and the vanishing-point detection result, ultimately yielding a network architecture usable for traversable-area detection.
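The training just described, where the vanishing-point result serves as pseudo ground truth GT, reduces to minimizing the disagreement between the network output and those pseudo labels. A minimal per-pixel logistic model stands in for the full UC-FCN-L network in this sketch; everything here is a simplifying assumption, not the patented procedure:

```python
import numpy as np

def train_with_pseudo_labels(feats, pseudo_gt, lr=0.5, steps=300):
    """Gradient descent on binary cross-entropy between a per-pixel
    logistic model and vanishing-point pseudo labels (0/1).
    feats: (N, C) pixel features; pseudo_gt: (N,) pseudo ground truth."""
    w = np.zeros(feats.shape[1])
    b = 0.0
    for _ in range(steps):
        p = 1.0 / (1.0 + np.exp(-(feats @ w + b)))   # predicted road probability
        grad = p - pseudo_gt                         # dBCE/dlogit
        w -= lr * feats.T @ grad / len(feats)
        b -= lr * grad.mean()
    return w, b
```

The same objective, applied to the network's softmax output over all pixels, is what "reducing the difference between the network's detection result and the vanishing-point detection result" amounts to.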
Claims (10)
1. An end-to-end unsupervised scene road-surface area determination method, characterized in that it comprises the following steps:
Step 1), constructing a road-location prior probability distribution map and attaching it directly into a convolutional layer as a feature mapping of the detection network, thereby constructing a traversable-area location-prior probability distribution map whose location-prior information can be applied flexibly in real road-traffic environments;
Step 2), constructing a depth network architecture, the UC-FCN network, by combining a fully convolutional network with U-NET, as the main network model for realizing detection;
Step 3), using the constructed traversable-area location-prior probability distribution map as a feature map of the depth network architecture UC-FCN, determining the best attachment position, and attaching the map directly at that position in the fully convolutional layers to generate the UC-FCN-L network;
Step 4), detecting the traversable area with a vanishing-point detection method, using the obtained detection result as the ground truth of the training dataset to train the UC-FCN-L network, and obtaining the depth network model for traversable-area extraction.
2. The end-to-end unsupervised scene road-surface area determination method according to claim 1, characterized in that in step 1), the traversable-area location-prior probability distribution map is constructed statistically from the distribution regularity of the road area in space and in the image.
3. The end-to-end unsupervised scene road-surface area determination method according to claim 1 or 2, characterized in that in step 1), real images and ground-truth maps of urban roads with lanes and urban roads without lanes in the KITTI dataset are used; their traversable areas are counted statistically to obtain a traversable-area location-prior probability distribution map for each of the two road conditions, and the two maps are then fused to obtain the traversable-area location-prior probability distribution map.
4. The end-to-end unsupervised scene road-surface area determination method according to claim 3, characterized in that, based on the real images and ground-truth maps of urban roads with and without lanes in the KITTI dataset, the traversable area is counted statistically: the number of times each coordinate position is judged traversable is counted and averaged, giving the traversable-area location-prior probability distribution map for each of the two road conditions. In a probability distribution map, the brightness of each pixel indicates the probability that the pixel belongs to the target: the higher the brightness, the larger the probability of belonging to the target; conversely, the lower the brightness, the smaller that probability. Through the probability distribution image, the traversable area can be separated from the scene; the two prior probability distribution maps are then fused to obtain the traversable-area location-prior probability distribution map.
5. The end-to-end unsupervised scene road-surface area determination method according to claim 1, characterized in that in step 2), the UC-FCN network comprises a contraction structure and an expansion structure. The contraction structure performs convolution and pooling operations, gradually reducing the spatial dimension so that the resulting images become smaller and smaller and their resolution lower and lower. The expansion structure then replaces the post-convolution pooling operations of the contraction structure with up-sampling operations after the convolutional layers, and the high-resolution features generated in the contraction structure are concatenated to the results of the expansion-structure convolutions, increasing the resolution of the output and gradually restoring the details and spatial dimension of objects.
6. The end-to-end unsupervised scene road-surface area determination method according to claim 5, characterized in that the expansion structure uses a repeated up-sampling-and-convolution framework; the up-sampling in the repeated framework consists of up-sampling followed by a ReLU activation function: the input is up-sampled by a factor of 2 with bilinear interpolation, and ReLU is then applied to mitigate the vanishing-gradient problem. After each up-sampling, a convolution operation adjusts the number of channels of the feature map; the convolution kernels in the convolutional layers are 3*3. The convolution result is fused with the feature map of the corresponding step in the contraction structure, and a final softmax layer yields a high-precision recognition result.
7. The end-to-end unsupervised scene road-surface area determination method according to claim 1, characterized in that in step 3), the traversable-area location-prior probability distribution map is scaled proportionally to the size of the last feature map to which it is connected, and the adjusted map is attached at its corresponding position as a feature map of the UC-FCN network, generating the UC-FCN-L network.
8. The end-to-end unsupervised scene road-surface area determination method according to claim 1, characterized in that traversable-area detection is performed on the collected training images with a vanishing-point method and its detection result is used as the ground-truth value GT of the training data; during network training, the network parameters are refined continuously to reduce the difference between the detection result of the proposed network model and the vanishing-point detection result, obtaining a network architecture that can ultimately be used for traversable-area detection.
9. The end-to-end unsupervised scene road-surface area determination method according to claim 1, characterized in that in step 4), the UC-FCN-L network is trained in an unsupervised manner, obtaining the depth network model for traversable-area extraction.
10. The end-to-end unsupervised scene road-surface area determination method according to claim 9, characterized in that, for unsupervised training, samples are divided into labeled and unlabeled samples: labeled samples are the l samples in the training set Dl = {(x1, y1), (x2, y2), ..., (xl, yl)} whose class labels are known, and unlabeled samples are the u samples in Du = {xl+1, xl+2, ..., xl+u} whose class labels are unknown, with u much larger than l. Training a model only on the labeled set Dl without using the information contained in the unlabeled set Du is called supervised learning; if labeled samples Dl are lacking, learning of the model must be achieved from the unlabeled samples Du.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201810636311.5A CN108876805B (en) | 2018-06-20 | 2018-06-20 | End-to-end unsupervised scene passable area cognition and understanding method |
Publications (2)
Publication Number | Publication Date |
---|---|
CN108876805A true CN108876805A (en) | 2018-11-23 |
CN108876805B CN108876805B (en) | 2021-07-27 |
Family
ID=64340750
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201810636311.5A Active CN108876805B (en) | 2018-06-20 | 2018-06-20 | End-to-end unsupervised scene passable area cognition and understanding method |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN108876805B (en) |
Patent Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN103034862A (en) * | 2012-12-14 | 2013-04-10 | 北京诚达交通科技有限公司 | Road snow and rain state automatic identification method based on feature information classification |
CN106650690A (en) * | 2016-12-30 | 2017-05-10 | 东华大学 | Night vision image scene identification method based on deep convolution-deconvolution neural network |
CN107492071A (en) * | 2017-08-17 | 2017-12-19 | 京东方科技集团股份有限公司 | Medical image processing method and equipment |
CN107808140A (en) * | 2017-11-07 | 2018-03-16 | 浙江大学 | A kind of monocular vision Road Recognition Algorithm based on image co-registration |
CN108062753A (en) * | 2017-12-29 | 2018-05-22 | 重庆理工大学 | The adaptive brain tumor semantic segmentation method in unsupervised domain based on depth confrontation study |
Non-Patent Citations (1)
Title |
---|
STIAAN WIEHMAN et al.: "Unsupervised Pre-training for Fully Convolutional Neural Networks", 2016 Pattern Recognition Association of South Africa and Robotics and Mechatronics International Conference |
Cited By (12)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111369566A (en) * | 2018-12-25 | 2020-07-03 | 杭州海康威视数字技术股份有限公司 | Method, device and equipment for determining position of pavement blanking point and storage medium |
CN111369566B (en) * | 2018-12-25 | 2023-12-05 | 杭州海康威视数字技术股份有限公司 | Method, device, equipment and storage medium for determining position of pavement blanking point |
CN113392809A (en) * | 2019-02-21 | 2021-09-14 | 百度在线网络技术(北京)有限公司 | Automatic driving information processing method and device and storage medium |
CN113392809B (en) * | 2019-02-21 | 2023-08-15 | 百度在线网络技术(北京)有限公司 | Automatic driving information processing method, device and storage medium |
CN110415187A (en) * | 2019-07-04 | 2019-11-05 | 深圳市华星光电技术有限公司 | Image processing method and image processing system |
CN110415187B (en) * | 2019-07-04 | 2021-07-23 | Tcl华星光电技术有限公司 | Image processing method and image processing system |
US11473927B2 (en) * | 2020-02-05 | 2022-10-18 | Electronic Arts Inc. | Generating positions of map items for placement on a virtual map |
US20220412765A1 (en) * | 2020-02-05 | 2022-12-29 | Electronic Arts Inc. | Generating Positions of Map Items for Placement on a Virtual Map |
US11668581B2 (en) * | 2020-02-05 | 2023-06-06 | Electronic Arts Inc. | Generating positions of map items for placement on a virtual map |
WO2022087853A1 (en) * | 2020-10-27 | 2022-05-05 | 深圳市深光粟科技有限公司 | Image segmentation method and apparatus, and computer-readable storage medium |
CN113221826A (en) * | 2021-05-31 | 2021-08-06 | 浙江工商大学 | Road detection method based on self-supervision learning significance estimation pixel embedding |
CN113221826B (en) * | 2021-05-31 | 2023-05-02 | 浙江工商大学 | Road detection method based on self-supervision learning significance estimation pixel embedding |
Also Published As
Publication number | Publication date |
---|---|
CN108876805B (en) | 2021-07-27 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN108876805A (en) | The end-to-end unsupervised scene of one kind can traffic areas cognition and understanding method | |
CN109934163A (en) | A kind of aerial image vehicle checking method merged again based on scene priori and feature | |
CN113936139B (en) | Scene aerial view reconstruction method and system combining visual depth information and semantic segmentation | |
CN106920243A (en) | The ceramic material part method for sequence image segmentation of improved full convolutional neural networks | |
CN106408015A (en) | Road fork identification and depth estimation method based on convolutional neural network | |
CN110263833A (en) | Based on coding-decoding structure image, semantic dividing method | |
CN110853057B (en) | Aerial image segmentation method based on global and multi-scale full-convolution network | |
CN104318569A (en) | Space salient region extraction method based on depth variation model | |
CN108256464A (en) | High-resolution remote sensing image urban road extracting method based on deep learning | |
CN113052106B (en) | Airplane take-off and landing runway identification method based on PSPNet network | |
CN106355643A (en) | Method for generating three-dimensional real scene road model of highway | |
CN113505842A (en) | Automatic extraction method suitable for large-scale regional remote sensing image urban building | |
Cao et al. | MCS-YOLO: A multiscale object detection method for autonomous driving road environment recognition | |
CN115292913A (en) | Vehicle-road-cooperation-oriented drive test perception simulation system | |
CN111599007B (en) | Smart city CIM road mapping method based on unmanned aerial vehicle aerial photography | |
CN106295491A (en) | Track line detection method and device | |
CN102254162B (en) | Method for detecting airport runway in synthetic aperture radar (SAR) image based on minimum linear ratio | |
CN115661032A (en) | Intelligent pavement disease detection method suitable for complex background | |
Tian et al. | Road marking detection based on mask R-CNN instance segmentation model | |
CN114943902A (en) | Urban vegetation unmanned aerial vehicle remote sensing classification method based on multi-scale feature perception network | |
CN110472508A (en) | Lane line distance measuring method based on deep learning and binocular vision | |
CN113361528A (en) | Multi-scale target detection method and system | |
CN114708560B (en) | YOLOX algorithm-based illegal parking detection method and system | |
CN103886289A (en) | Direction self-adaptive method and system for identifying on-water bridge targets | |
CN116385716A (en) | Three-dimensional map ground object data automatic production method based on remote sensing map |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||
GR01 | Patent grant | ||
TR01 | Transfer of patent right |
Effective date of registration: 20240207 Address after: 710200 Jingwei Industrial Park, economic development zone, Xi'an City, Shaanxi Province Patentee after: SHAANXI HEAVY DUTY AUTOMOBILE Co.,Ltd. Country or region after: China Address before: 710064 No. 33, South Second Ring Road, Shaanxi, Xi'an Patentee before: CHANG'AN University Country or region before: China |
TR01 | Transfer of patent right |