CN108876805A - End-to-end unsupervised scene drivable-area cognition and understanding method - Google Patents

Info

Publication number
CN108876805A
CN108876805A (application CN201810636311.5A); granted as CN108876805B
Authority
CN
China
Prior art keywords
network
traffic areas
probability distribution
fcn
road
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201810636311.5A
Other languages
Chinese (zh)
Other versions
CN108876805B (en)
Inventor
赵祥模
刘占文
樊星
高涛
董鸣
沈超
王润民
连心雨
徐江
张凡
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shaanxi Heavy Duty Automobile Co Ltd
Original Assignee
Changan University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Changan University
Priority to CN201810636311.5A
Publication of CN108876805A
Application granted
Publication of CN108876805B
Legal status: Active
Anticipated expiration

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/10Segmentation; Edge detection
    • G06T7/194Segmentation; Edge detection involving foreground-background segmentation
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/048Activation functions
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • G06N3/088Non-supervised learning, e.g. competitive learning
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T5/00Image enhancement or restoration
    • G06T5/50Image enhancement or restoration using two or more images, e.g. averaging or subtraction
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/10Segmentation; Edge detection
    • G06T7/13Edge detection

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • Biomedical Technology (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Image Analysis (AREA)

Abstract

The invention discloses an end-to-end unsupervised method for determining the drivable road area in a scene. A road location prior probability distribution map is constructed and attached directly to a convolutional layer as a feature map of the detection network, yielding a convolutional network framework that fuses location prior features. A deep network architecture, the UC-FCN network, is then built by combining a fully convolutional network (FCN) with U-Net, and the constructed drivable-area location prior probability map is attached to it as an additional feature map, producing the UC-FCN-L network. The drivable area is detected with a vanishing-point-based method, and the detection results serve as the ground truth of the training data set with which the UC-FCN-L network is trained, giving a deep network model for drivable-area extraction. The method avoids the difficulty of labelling drivable areas, is widely applicable, runs stably in various road environments with good real-time performance, and achieves high detection accuracy with good adaptability and robustness; the method is simple and effective.

Description

An end-to-end unsupervised method for scene drivable-area cognition and understanding
Technical field
The invention belongs to the technical field of traffic control, and in particular relates to an end-to-end self-supervised method, based on a video data set, for the cognition and understanding of the drivable area in a scene.
Background technique
With the development of society, the automobile has become an irreplaceable means of transport in daily life. However, the safety problems it brings have also become increasingly prominent. According to the Global Status Report on Road Safety, traffic accidents cause up to 1.24 million deaths every year, and the main causes of accidents are driver carelessness and fatigue. To alleviate this situation, the development of intelligent automobile technology is particularly important. In research on autonomous driving and advanced driver assistance based on computer vision, real-time cognition and understanding of the drivable region in front of the vehicle is an essential link. The drivable region of a vehicle includes structured, semi-structured and unstructured road surfaces. A structured road surface usually has road edge lines and a simple pavement structure, such as major urban arterial roads, expressways, national highways and provincial highways. A semi-structured road surface refers to a generally non-standardized surface whose colour and material vary greatly, such as parking lots, squares and some branch roads. An unstructured road surface has no structural layer, as in natural road scenes. At present, intelligent vehicles mainly combine radar and cameras to perceive the drivable area; however, radar (lidar, millimetre-wave radar, ultrasonic radar) is typically expensive, power-hungry and prone to mutual interference.
Vision-based methods for drivable-area cognition and understanding mainly derive basic structural features of the road surface from road colour, road models and road texture, and from these features further obtain latent information such as the vanishing point, road edge lines and the basic road direction (straight ahead, left turn, right turn, sharp left, sharp right). Extracting the drivable region from these features with traditional segmentation methods, however, often gives unsatisfactory results: traffic participants such as vehicles and pedestrians may be extracted as part of the drivable region, which adversely affects the subsequent driving decisions of the intelligent vehicle.
Summary of the invention
The purpose of the present invention is to provide an end-to-end unsupervised method for scene drivable-area cognition and understanding that overcomes the deficiencies of the prior art.
To achieve the above purpose, the present invention adopts the following technical scheme:
A method for end-to-end unsupervised determination of the drivable road area in a scene comprises the following steps:
Step 1): constructing a road location prior probability distribution map and attaching it directly to a convolutional layer as a feature map of the detection network, thereby obtaining a drivable-area location prior probability map whose location prior information can be applied flexibly in real road traffic environments;
Step 2): combining a fully convolutional network (FCN) with U-Net to construct a deep network architecture, the UC-FCN network, as the main network model for detection;
Step 3): attaching the constructed drivable-area location prior probability map to the UC-FCN network as an additional feature map, determining the best attachment position, and attaching the map directly at that position in the fully convolutional part to generate the UC-FCN-L network;
Step 4): detecting the drivable area with a vanishing-point detection method and using the detection results as the ground truth of the training data set to train the UC-FCN-L network, obtaining a deep network model for drivable-area extraction.
Further, in step 1), the drivable-area location prior probability map is constructed on the basis of statistics, using the distribution regularity of the road area in space and in the image.
Further, in step 1), based on the real-scene images and ground-truth maps of urban roads with lane markings and urban roads without lane markings in the KITTI data set, the drivable areas are counted to obtain a drivable-area location prior probability map for each of the two road conditions; the two maps are then merged into a single drivable-area location prior probability map.
Further, based on the real-scene images and ground-truth maps of urban roads with and without lane markings in the KITTI data set, the number of times each coordinate position is judged to be drivable is counted and averaged, yielding a drivable-area location prior probability map for each of the two road conditions. In a probability map, the brightness of each pixel indicates the probability that the pixel belongs to the target: the brighter the pixel, the higher the probability; conversely, the darker the pixel, the lower the probability. Through the probability map the drivable area can be separated from the scene. The two prior probability maps are then merged into the final drivable-area location prior probability map.
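The counting-and-averaging step can be sketched in a few lines of numpy. The text does not specify how the two road-condition priors are merged, so the simple average below is an assumption, and the 4x4 masks are toy data, not KITTI ground truth.

```python
import numpy as np

def location_prior(masks):
    """Average a stack of binary drivable-area masks (N, H, W) into a
    per-pixel prior probability map (H, W)."""
    masks = np.asarray(masks, dtype=np.float64)
    return masks.mean(axis=0)

# Toy 4x4 masks: bottom rows are drivable in most frames, mimicking the
# road-at-image-bottom regularity described above.
m1 = np.array([[0,0,0,0],[0,0,0,0],[1,1,1,1],[1,1,1,1]])
m2 = np.array([[0,0,0,0],[0,0,0,0],[0,1,1,0],[1,1,1,1]])
prior_marked = location_prior([m1, m2])    # roads with lane markings
prior_unmarked = location_prior([m2, m2])  # roads without lane markings
# Merging the two priors; an unweighted average is one plausible choice.
prior = 0.5 * (prior_marked + prior_unmarked)
print(prior[3, 0], prior[0, 0])  # 1.0 0.0 (bottom-left drivable, top-left not)
```

Brighter (higher) values mark positions that were drivable in more training frames, exactly the brightness-as-probability reading given above.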
Further, in step 2), the UC-FCN network comprises a contraction structure and an expansion structure. The contraction structure performs convolution and pooling operations, gradually reducing the spatial dimension so that the feature maps become smaller and their resolution lower. The expansion structure then replaces the pooling after each convolutional layer with an upsampling operation; the high-resolution features produced in the contraction structure are concatenated with the results after the expansion-structure convolutions, increasing the output resolution and gradually recovering the details and spatial dimensions of the object.
Further, the expansion structure uses a repeated upsample-then-convolve architecture. The upsampling in each repetition is a bilinear 2x upsampling followed by a ReLU activation, the ReLU being used to alleviate the vanishing-gradient problem. After upsampling, a convolution changes the number of channels of the feature map; the convolution kernels are 3x3. The convolved result is fused with the feature map of the corresponding step in the contraction structure, and finally a softmax layer yields a high-accuracy recognition result.
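A minimal numpy sketch of one expansion repetition, under the description above: bilinear 2x upsampling plus ReLU, then fusion with the matching contraction feature map by channel concatenation. The subsequent 3x3 convolution is omitted for brevity, and all shapes are illustrative, not taken from the patent's figures.

```python
import numpy as np

def upsample2x_bilinear(x):
    """Bilinear 2x upsampling of a (C, H, W) feature map, done by linear
    interpolation along rows and then columns."""
    c, h, w = x.shape
    rows = np.linspace(0, h - 1, 2 * h)
    cols = np.linspace(0, w - 1, 2 * w)
    r0 = np.floor(rows).astype(int); r1 = np.minimum(r0 + 1, h - 1); fr = rows - r0
    c0 = np.floor(cols).astype(int); c1 = np.minimum(c0 + 1, w - 1); fc = cols - c0
    y = (1 - fr)[None, :, None] * x[:, r0, :] + fr[None, :, None] * x[:, r1, :]
    y = (1 - fc)[None, None, :] * y[:, :, c0] + fc[None, None, :] * y[:, :, c1]
    return y

def expansion_step(x, skip):
    """One expansion repetition: 2x bilinear upsample + ReLU, then fuse with
    the corresponding contraction feature map by channel concatenation."""
    up = np.maximum(upsample2x_bilinear(x), 0.0)  # ReLU
    return np.concatenate([up, skip], axis=0)     # fuse along channels

x = np.random.randn(8, 4, 4)     # low-resolution expansion-path features
skip = np.random.randn(8, 8, 8)  # matching contraction-path features
out = expansion_step(x, skip)
print(out.shape)                 # (16, 8, 8)
```

Each repetition doubles the spatial size and, through the concatenation, makes the contraction structure's high-resolution detail available to the expansion side.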
Further, in step 3), the drivable-area location prior probability map is rescaled so that its size equals that of the last feature map to which it is attached; the resized map is then attached at the corresponding position as an additional feature map of the UC-FCN network, generating the UC-FCN-L network.
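The rescale-and-attach operation can be sketched as appending one extra channel to a feature tensor. The 33x33 attachment size comes from the description below; the 64-channel feature map and the 375x1242 KITTI-like prior size are assumptions for illustration.

```python
import numpy as np

def resize_bilinear(img, out_h, out_w):
    """Resize a 2-D map with bilinear interpolation."""
    h, w = img.shape
    rows = np.linspace(0, h - 1, out_h)
    cols = np.linspace(0, w - 1, out_w)
    r0 = np.floor(rows).astype(int); r1 = np.minimum(r0 + 1, h - 1); fr = rows - r0
    c0 = np.floor(cols).astype(int); c1 = np.minimum(c0 + 1, w - 1); fc = cols - c0
    top = (1 - fr)[:, None] * img[r0, :] + fr[:, None] * img[r1, :]
    return (1 - fc)[None, :] * top[:, c0] + fc[None, :] * top[:, c1]

def attach_prior(features, prior):
    """Append the rescaled prior map as one extra channel of a (C, H, W)
    feature tensor, as step 3) describes."""
    _, h, w = features.shape
    p = resize_bilinear(prior, h, w)
    return np.concatenate([features, p[None]], axis=0)

features = np.random.randn(64, 33, 33)  # the 33x33 attachment point
prior = np.random.rand(375, 1242)       # full-resolution prior map (assumed size)
fused = attach_prior(features, prior)
print(fused.shape)                      # (65, 33, 33)
```

Because the prior is attached as a feature channel rather than learned, every input image shares the same position feature expression, which is the point of the construction.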
Further, the drivable area of the collected training images is detected with the vanishing-point method, and the detection results are used as the ground truth (GT) of the training data. During training, the network parameters are improved continuously to reduce the difference between the results of the proposed network model and the vanishing-point-based detection results, ultimately yielding a network architecture usable for drivable-area detection.
Further, in step 4), the UC-FCN-L network is trained in an unsupervised manner to obtain the deep network model for drivable-area extraction.
Further, regarding unsupervised training: samples are divided into labelled and unlabelled samples. Labelled samples are the l samples of the training set Dl = {(x1, y1), (x2, y2), ..., (xl, yl)} whose class labels are known; unlabelled samples are the u samples of the training set Du = {xl+1, xl+2, ..., xl+u} whose class labels are unknown, with u much larger than l. Training a model on the labelled samples Dl without using the information contained in the unlabelled samples Du is called supervised learning; if labelled samples Dl are lacking, learning of the model must be achieved from the unlabelled samples Du alone.
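A toy illustration of the Dl/Du split described above, with hypothetical sample names; the sizes are arbitrary, chosen only so that u is much larger than l, as in the patent's setting where pixel-level labels are scarce.

```python
# Dl: pairs (x_i, y_i) with known labels; Du: inputs x_j with unknown labels.
Dl = [(f"img_{i}", f"label_{i}") for i in range(5)]   # labelled pool (small)
Du = [f"img_{i}" for i in range(5, 105)]              # unlabelled pool (large)
l, u = len(Dl), len(Du)
print(l, u, u > l)  # 5 100 True
```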
Compared with the prior art, the invention has the following beneficial technical effects:
In the end-to-end unsupervised scene road-area determination method of the invention, a road location prior probability distribution map is constructed and attached directly to a convolutional layer as a feature map of the detection network, producing a drivable-area location prior probability map whose prior information can be applied flexibly in real road traffic environments, and a convolutional network framework that fuses location prior features. A deep network architecture, the UC-FCN network, is then built by combining a fully convolutional network with U-Net as the main network model for detection. The constructed drivable-area location prior probability map is attached as an additional feature map of the UC-FCN network at the best attachment position in the fully convolutional part, producing the UC-FCN-L network. The drivable area is detected with a vanishing-point-based method, and the detection results are used as the ground truth of the training data set to train the UC-FCN-L network, yielding a deep network model for drivable-area extraction. Thanks to the self-supervised learning scheme, the difficulty of labelling drivable areas is avoided; the method is widely applicable, runs stably in various road environments with good real-time performance, and can be used broadly in intelligent automobiles and driver-assistance systems. Compared with existing drivable-area cognition and understanding methods, it achieves higher detection accuracy with good adaptability, real-time performance and robustness, and is simple and effective.
Further, based on the real-scene images and ground-truth maps of urban roads with and without lane markings in the KITTI data set, the drivable areas are counted to obtain a drivable-area location prior probability map for each of the two road conditions; the two maps are then merged into the final drivable-area location prior probability map. This eliminates the mis-detections caused by traditional convolutional neural networks being insensitive to location priors, in which foreground and background with similar appearance features are confused.
Further, the contraction structure performs convolution and pooling, gradually reducing the spatial dimension so that the feature maps become smaller and their resolution lower; the expansion structure then replaces the pooling after each convolutional layer with upsampling, and the high-resolution features produced in the contraction structure are concatenated with the results after the expansion-structure convolutions, increasing the output resolution and gradually recovering the details and spatial dimensions of the object. This raises detection speed and detection accuracy at the same time.
Description of the drawings
Fig. 1 is the overall framework of the scene drivable-area cognition and understanding method.
Fig. 2 shows the location prior and location prior features: (a) the spatial distribution of objects in an actual traffic scene; (b) real-scene images and ground-truth maps of urban roads with and without lane markings from the KITTI data set.
Fig. 3 is a schematic diagram of the UC-FCN network architecture.
Fig. 4 compares drivable-area location prior probability maps attached at different positions.
Fig. 5 is an overall schematic diagram of the UC-FCN-L network.
Specific embodiments
The invention will be described in further detail with reference to the accompanying drawings:
As shown in Fig. 1, an end-to-end unsupervised scene road-area determination method specifically comprises the following steps:
1) Using the distribution regularity of the road area in space and in the image, construct a road location prior probability distribution map statistically and attach it directly to a convolutional layer as a feature map of the detection network, obtaining a drivable-area location prior probability map whose prior information can be applied flexibly in real road traffic environments.
2) For drivable-area cognition and understanding, i.e. the road detection and segmentation problem, combine a fully convolutional network (FCN) with U-Net to construct a new deep network architecture, the UC-FCN network, as the main network model for detection.
3) Attach the constructed drivable-area location prior probability map to the UC-FCN network as an additional feature map; verify its best attachment position experimentally and attach the map directly there in the fully convolutional part, generating the UC-FCN-L network.
4) For a self-collected traffic-scene video data set, obtaining the corresponding pixel-level semantic labels of the training data is very laborious; an unsupervised training method is therefore proposed. The drivable area is roughly detected with a traditional vanishing-point detection method, and the detection results are used as the ground truth of the training data set to train the UC-FCN-L network, obtaining a deep network model for drivable-area extraction.
In step 1), to eliminate the mis-detections caused by traditional convolutional neural networks being insensitive to location priors, in which foreground and background with similar appearance features are confused, the real-scene images and ground-truth maps of urban roads with and without lane markings in the KITTI data set are used: the drivable areas are counted to obtain a drivable-area location prior probability map for each of the two road conditions, and the two maps are then merged into one drivable-area location prior probability map. As shown in Fig. 2(a), the spatial distribution of objects in an actual traffic scene follows specific rules: for example, the sky lies at the top of the image, buildings on both sides, and the road area at the bottom. Traditional convolutional neural networks are sensitive only to local appearance features of the target and cannot exploit location prior information, so a building region may mistakenly be taken for a road region with similar appearance features; reasonable use of the location prior effectively eliminates such mis-detections. To make the location prior information applicable flexibly in real road traffic environments, different input images should share the same position feature expression, so the location prior is attached directly to a convolutional layer as a feature map of the detection network. As shown in Fig. 2(b), based on the KITTI real-scene images and ground-truth maps of urban roads with and without lane markings, the number of times each coordinate position is judged to be drivable is counted and averaged, yielding a drivable-area location prior probability map for each of the two road conditions. In a probability map, the brightness of each pixel indicates the probability that the pixel belongs to the target: the brighter the pixel, the higher the probability; conversely, the darker the pixel, the lower the probability. Through the probability map the drivable area can be separated from the scene. The two prior probability maps are then merged into the final drivable-area location prior probability map.
In step 2), a new deep network architecture, the UC-FCN network, is proposed on the basis of fully convolutional neural networks. Since 2012, convolutional neural networks (CNNs) have achieved great success and wide application in image classification and object detection. Traditional CNN methods can only extract local features with pixel blocks as the perception region, which limits classification performance. To address this, Jonathan Long et al. of UC Berkeley proposed Fully Convolutional Networks (FCN) for image segmentation, aiming to recover from abstract features the class to which each pixel belongs. FCN converts the fully connected layers of a traditional CNN into convolutional layers one by one; since all layers are convolutional, it is called a fully convolutional network.
Building the network on the FCN method, our improved construction is as follows:
The UC-FCN network comprises a contraction structure and an expansion structure. The contraction structure performs convolution and pooling, gradually reducing the spatial dimension so that the feature maps become smaller and their resolution lower. To recover the resolution of the original image from this coarse, low-resolution representation, the expansion structure replaces the pooling after each convolutional layer with an upsampling operation, increasing the output resolution and gradually recovering the details and spatial dimensions of the object. To exploit local information, connections are set between the two modules to help the expansion structure recover the target details better: specifically, the high-resolution features produced in the contraction structure are concatenated with the results after the expansion-structure convolutions. Through this improvement of the two-part structure, detection speed and detection accuracy are raised at the same time.
Specifically, as shown in Fig. 3, since the height-to-width ratio of the feature maps differs from that of the input, the drivable-area location prior probability map must be rescaled so that its size equals that of the last feature map to which it is connected. The map can be attached after either the 33x33 or the 15x15 feature maps; the former carries more precise location prior information than the latter, can describe more varied and more irregular shapes, better captures detail such as distant road and small turns, and therefore gives more accurate detection results. The final drivable-area location prior probability map is attached at the corresponding position as an additional feature map of the UC-FCN network, generating the UC-FCN-L network.
The contraction structure is a typical convolutional network architecture consisting of a repeated block of 2 convolutional layers followed by one pooling layer. The convolution kernels are 3x3 and the activation function is ReLU; after the two convolutional layers comes a 2x2 max-pooling layer with stride 2, and the number of feature channels doubles after each downsampling. After 5 such convolution-pooling blocks comes a fully convolutional structure with 2 convolutional layers; the improvement of FCN is precisely to turn the fully connected layers of a CNN into convolutional layers here. FCN uses VGG16 for the feature-extraction stage (the contraction structure); this network has 4096 filters of size 7x7 in the fully convolutional structure, and the large number of large filters makes the computation heavy. We reduce the number of filters of the fully convolutional structure from 4096 to 1024 and the filter size from 7x7 to 3x3, which reduces the parameters and computation of the network; the accuracy also drops somewhat, so corresponding improvements are made in the expansion structure to preserve the recognition accuracy of the network.
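The size of this saving can be checked with the standard parameter-count formula for a convolutional layer. The 512 input channels below are an assumption taken from VGG16's final contraction block, not a figure stated in the text.

```python
def conv_params(in_ch, out_ch, k):
    """Parameters of a conv layer: out_ch filters of in_ch*k*k weights + bias."""
    return out_ch * (in_ch * k * k + 1)

in_ch = 512                               # VGG16's last block outputs 512 channels (assumed)
fcn_layer = conv_params(in_ch, 4096, 7)   # FCN: 4096 filters of 7x7
ours = conv_params(in_ch, 1024, 3)        # UC-FCN: 1024 filters of 3x3
print(fcn_layer, ours)                    # 102764544 4719616 (~22x fewer)
```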
Specifically, the expansion structure uses a repeated upsample-then-convolve architecture. The upsampling in each repetition is a bilinear 2x upsampling followed by a ReLU activation, the ReLU alleviating the vanishing-gradient problem; every upsampling doubles the feature-map size. After upsampling, a convolution with 3x3 kernels changes the number of channels of the feature map; the convolved result is fused with the feature map of the corresponding step in the contraction structure, and finally a softmax layer yields the recognition result.
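The final softmax layer turns the per-pixel class scores into a per-pixel probability distribution. A minimal sketch with a hypothetical 2-class, 1x2-pixel score map:

```python
import numpy as np

def pixelwise_softmax(logits):
    """Softmax over the channel axis of a (C, H, W) score map, giving a
    per-pixel class distribution as the final UC-FCN layer does."""
    z = logits - logits.max(axis=0, keepdims=True)  # numerical stability
    e = np.exp(z)
    return e / e.sum(axis=0, keepdims=True)

logits = np.array([[[2.0, -1.0]],   # class-0 scores for a 1x2 image
                   [[0.0,  3.0]]])  # class-1 scores
probs = pixelwise_softmax(logits)
pred = probs.argmax(axis=0)         # per-pixel class decision
print(pred)                         # [[0 1]]
```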
To keep the reduction of contraction-structure filters from hurting recognition accuracy, the expansion structure is improved as follows:
1) A conv-Ncl layer with 1x1 convolution kernels is added between the contraction structure and the expansion structure. It converts the number of feature-map channels from 1024 to a specific quantity; to simplify the subsequent classification computation, the target channel number is set directly to the number of classes.
2) To match the expansion-structure convolution results with the channel numbers of the contraction-structure feature maps, all architecture layers of the expansion structure use a scalar value C as the coefficient of the convolution-kernel quantity, avoiding the substantial growth of network parameters that many kernels would bring; the dilation part of the new network has C*Ncl convolution kernels, and C is adjusted according to the position of the corresponding feature map so that the kernel quantity equals that of the corresponding contraction-structure convolution.
In step 3), the constructed drivable-area location prior probability map is attached directly to the fully convolutional part as an additional feature map of the UC-FCN network, position features are extracted, and the UC-FCN-L network is generated. As stated for step 1), reasonable use of the location prior effectively avoids certain mis-detections. Since the height-to-width ratio of the feature maps differs from that of the input, the drivable-area location prior probability map must be rescaled so that its size equals that of the last feature map to which it is connected. As can be seen from the UC-FCN network, the convolutions that generate the feature maps form a repeated structure which repeats 7 times, with output sizes of 259x259, 130x130, 65x65, 33x33, 17x17 and 15x15 (the last two fully convolutional layers leave the feature-map size unchanged). Attaching the drivable-area location prior probability map after different numbers of feature-extraction convolution layers gives different results: the more convolution layers, the more specific and detailed the extracted feature information; the fewer convolution layers, the more the extracted features reflect the outline and cover global information. As auxiliary information for drivable-area detection, the prior map corrects the detection results to a certain extent; the features extracted from it should retain outline information while also containing detail, so the drivable-area location prior probability map is attached after the 33x33 feature map. At this position the prior map can describe more varied and more irregular shapes; the extracted features capture outline information such as the general shape and position of the road as well as detail such as distant road and small turns, giving more accurate detection results. The final drivable-area location prior probability map is placed at this position, yielding the deep network model for drivable-area extraction, as shown in Fig. 5.
In step 4), the UC-FCN-L network is trained in an unsupervised manner, obtaining the depth network model for drivable-region extraction. Samples are indispensable in the training process of deep learning and are broadly divided into labeled samples and unlabeled samples. Labeled samples are the l samples with known labels in the training set D_l = {(x1, y1), (x2, y2), …, (xl, yl)}; unlabeled samples are the u samples with unknown labels in the training set D_u = {x_{l+1}, x_{l+2}, …, x_{l+u}}, where u is much larger than l. Building a model by training on the labeled samples D_l while leaving the information contained in the unlabeled samples D_u unused is called supervised learning. If labeled samples D_l are lacking, learning of the model must be realized from the unlabeled samples D_u alone; training that uses only unlabeled samples is called unsupervised learning.
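The labeled/unlabeled split above can be written down directly. This is a schematic sketch only: the string tuples stand in for real image/label pairs, and the sizes l and u are arbitrary:

```python
# Labeled set D_l: l samples with known labels; unlabeled set D_u: u samples, u >> l.
l, u = 3, 12
D_l = [(f"x{i}", f"y{i}") for i in range(1, l + 1)]   # (image, label) pairs
D_u = [f"x{i}" for i in range(l + 1, l + u + 1)]      # images only, labels unknown

# Supervised learning would fit a model on D_l alone, leaving D_u unused;
# the unsupervised setting must draw its training signal from D_u instead
# (here, via the vanishing-point pseudo ground truth described below in the text).
print(len(D_l), len(D_u))
```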
The network architecture proposed by the present invention for scene passable-area cognition and understanding is based on a self-collected traffic scene video dataset, as shown in Figure 4, containing real image data captured in urban, rural, highway and other scenes, part of which is selected for training and testing. Passable-area cognition and understanding is essentially pixel-level segmentation of an image; to obtain segmentation ground truth, the training data would need corresponding pixel-level semantic labels. However, labeling a large volume of collected real-scene data at the pixel level is very difficult, so the network must be trained with an unsupervised method.
Specifically, passable-area detection based on the vanishing-point method is first performed on the collected training images with a conventional approach. The vanishing point is the unique intersection point, on the image plane, of the images of a set of parallel lines in space. Vanishing-point-based region detection has the following main steps: perform texture analysis at multiple scales using Gabor wavelets and discard points whose texture is not significant; examine the relationship between each point and the texture information, computing each point's score by the texture voting method; and find the road edges from the vanishing point to obtain the road-surface region. The detection result obtained from the vanishing point is used as the ground truth GT of the training data. During network training, the network parameters are continuously refined to reduce the difference between the detection result of the proposed network model and the detection result obtained from the vanishing point, so that the trained network architecture can ultimately be used for passable-area detection.
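The texture-voting step can be sketched as follows. This is a simplified illustration: the multi-scale Gabor filter bank is abstracted away, so the per-pixel dominant texture orientations and strengths are assumed given; weak-texture pixels are discarded and each remaining pixel votes for candidate vanishing points lying above it along its orientation ray:

```python
import numpy as np

def vote_vanishing_point(orient, strength, thresh=0.3):
    """Accumulate vanishing-point votes from per-pixel texture orientations
    (radians) and texture strengths (both HxW arrays). Weak-texture pixels
    are discarded, mirroring the Gabor-based scheme in the text."""
    h, w = orient.shape
    votes = np.zeros((h, w))
    ys, xs = np.nonzero(strength > thresh)          # keep textured pixels only
    for y, x in zip(ys, xs):
        dy, dx = -np.sin(orient[y, x]), np.cos(orient[y, x])
        seen = set()                                # one vote per cell per voter
        for t in range(1, max(h, w)):
            cy, cx = int(round(y + t * dy)), int(round(x + t * dx))
            if not (0 <= cy < h and 0 <= cx < w):
                break
            if cy < y and (cy, cx) not in seen:     # vanishing point lies above
                seen.add((cy, cx))
                votes[cy, cx] += strength[y, x]
    vy, vx = np.unravel_index(np.argmax(votes), votes.shape)
    return int(vy), int(vx)

# Toy scene: two "road edges" converging toward (row 5, col 15) on a 20x30 grid.
h, w = 20, 30
orient = np.zeros((h, w))
strength = np.zeros((h, w))
for y in range(6, h):
    for x_edge in (15 - (y - 5), 15 + (y - 5)):     # left and right edge pixels
        if 0 <= x_edge < w:
            orient[y, x_edge] = np.arctan2(y - 5, 15 - x_edge)
            strength[y, x_edge] = 1.0

vp = vote_vanishing_point(orient, strength)
print(vp)
```

The cell collecting the most votes is taken as the vanishing point; the road edges and the road-surface region are then traced from it.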

Claims (10)

1. An end-to-end unsupervised scene road-surface area determination method, characterized by comprising the following steps:
Step 1), constructing a road location prior probability distribution map and attaching it directly into the convolutional layers as a feature map mapping of the detection network, thereby constructing a passable-area location prior probability distribution map whose location prior information can be applied flexibly in real road traffic environments;
Step 2), constructing the deep network architecture UC-FCN by combining a fully convolutional network with U-NET, as the main network model for realizing detection;
Step 3), taking the constructed passable-area location prior probability distribution map as a feature map mapping of the deep network architecture UC-FCN, obtaining the best attachment position and attaching the map directly at that position in the fully convolutional layers, generating the UC-FCN-L network;
Step 4), detecting passable areas based on a vanishing-point detection method, using the obtained detection results as the ground truth of the training dataset to train the UC-FCN-L network, and obtaining the depth network model for drivable-region extraction.
2. The end-to-end unsupervised scene road-surface area determination method according to claim 1, characterized in that, in step 1), the passable-area location prior probability distribution map is constructed by statistics, using the distribution regularity of road areas in space and in the image.
3. The end-to-end unsupervised scene road-surface area determination method according to claim 1 or 2, characterized in that, in step 1), based on the real images and ground-truth maps of urban roads with lane markings and urban roads without lane markings in the KITTI dataset, statistics are taken of their passable areas, obtaining the passable-area location prior probability distribution maps for the two road conditions respectively; the maps obtained for the two road conditions are then fused to obtain the passable-area location prior probability distribution map.
4. The end-to-end unsupervised scene road-surface area determination method according to claim 3, characterized in that, based on the real images and ground-truth maps of urban roads with and without lane markings in the KITTI dataset, statistics are taken of their passable areas: for each coordinate position, the number of times it is judged passable is counted and averaged, yielding the passable-area location prior probability distribution maps for the two road conditions respectively; in a probability distribution map, the brightness of each pixel indicates the probability that the pixel belongs to the target, a higher brightness meaning a larger probability of belonging to the target and, conversely, a lower brightness a smaller probability; through the probability distribution map, the passable area can be separated from the scene; the two prior probability distribution maps are then fused to obtain the passable-area location prior probability distribution map.
5. The end-to-end unsupervised scene road-surface area determination method according to claim 1, characterized in that, in step 2), the UC-FCN network comprises a contraction structure and an expansion structure; the contraction structure performs convolution and pooling operations, gradually reducing the spatial dimensions so that the resulting feature maps become smaller and smaller and their resolution lower and lower; the expansion structure then replaces the pooling operation after the convolutional layers in the contraction structure with an upsampling operation after the convolutional layers, and the high-resolution features generated in the contraction structure are connected to the results after the expansion-structure convolutions, increasing the resolution of the output and gradually restoring the details and spatial dimensions of objects.
6. The end-to-end unsupervised scene road-surface area determination method according to claim 5, characterized in that the expansion structure uses a repeated upsampling-convolution framework; the upsampling in the repeated framework is specifically an upsampling-plus-ReLU-activation structure, using bilinear interpolation to upsample the input by a factor of 2, with ReLU applied afterwards to alleviate the vanishing-gradient problem; after upsampling, a convolution operation changes the number of channels of the feature map, the convolution kernel size in the convolutional layer being 3*3; the result after convolution is fused with the feature map of the corresponding step in the contraction structure, and finally a softmax layer is applied, obtaining a high-precision recognition result.
7. The end-to-end unsupervised scene road-surface area determination method according to claim 1, characterized in that, in step 3), the passable-area location prior probability distribution map is scaled proportionally to the same size as the last feature map it connects to, and the adjusted passable-area location prior probability distribution map is attached at its corresponding position as a feature map mapping of the UC-FCN network, generating the UC-FCN-L network.
8. The end-to-end unsupervised scene road-surface area determination method according to claim 1, characterized in that passable-area detection based on the vanishing-point method is performed on the collected training images and its detection results are used as the ground truth GT of the training data; during network training, the network parameters are continuously refined to reduce the difference between the detection result of the proposed network model and the detection result obtained from the vanishing point, so that the trained network architecture can ultimately be used for passable-area detection.
9. The end-to-end unsupervised scene road-surface area determination method according to claim 1, characterized in that, in step 4), the UC-FCN-L network is trained in an unsupervised manner, obtaining the depth network model for drivable-region extraction.
10. The end-to-end unsupervised scene road-surface area determination method according to claim 9, characterized in that, in unsupervised training, samples are divided into labeled samples and unlabeled samples; labeled samples are the l samples with known labels in the training set D_l = {(x1, y1), (x2, y2), …, (xl, yl)}; unlabeled samples are the u samples with unknown labels in the training set D_u = {x_{l+1}, x_{l+2}, …, x_{l+u}}, where u is much larger than l; building a model by training on the labeled samples D_l while leaving the information contained in the unlabeled samples D_u unused is called supervised learning; if labeled samples D_l are lacking, learning of the model must be realized from the unlabeled samples D_u.
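The training objective of claims 8 and 9 — shrink the gap between the network's detection result and the vanishing-point detection result used as pseudo ground truth — can be sketched as a per-pixel cross-entropy. This is a numpy sketch with illustrative arrays; in practice the network would produce `pred` and the vanishing-point detector would produce `gt_vp`:

```python
import numpy as np

def pseudo_label_loss(pred, gt_vp, eps=1e-7):
    """Per-pixel binary cross-entropy between the network's passable-area
    probability map `pred` and the vanishing-point detection result `gt_vp`
    (both HxW, values in [0, 1]); this is the difference the training loop
    drives down by continuously refining the network parameters."""
    p = np.clip(pred, eps, 1.0 - eps)               # avoid log(0)
    return float(-np.mean(gt_vp * np.log(p) + (1.0 - gt_vp) * np.log(1.0 - p)))

# Illustrative pseudo ground truth: lower half of the image is road.
gt_vp = np.zeros((8, 8))
gt_vp[4:, :] = 1.0
good = np.where(gt_vp > 0, 0.9, 0.1)   # prediction that agrees with the VP result
bad = np.full((8, 8), 0.5)             # uninformative prediction

print(pseudo_label_loss(good, gt_vp), pseudo_label_loss(bad, gt_vp))
```

A gradient-based optimizer minimizing this quantity over the training images yields the unsupervised training of the UC-FCN-L network described in claim 9.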
CN201810636311.5A 2018-06-20 2018-06-20 End-to-end unsupervised scene passable area cognition and understanding method Active CN108876805B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201810636311.5A CN108876805B (en) 2018-06-20 2018-06-20 End-to-end unsupervised scene passable area cognition and understanding method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201810636311.5A CN108876805B (en) 2018-06-20 2018-06-20 End-to-end unsupervised scene passable area cognition and understanding method

Publications (2)

Publication Number Publication Date
CN108876805A true CN108876805A (en) 2018-11-23
CN108876805B CN108876805B (en) 2021-07-27

Family

ID=64340750

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201810636311.5A Active CN108876805B (en) 2018-06-20 2018-06-20 End-to-end unsupervised scene passable area cognition and understanding method

Country Status (1)

Country Link
CN (1) CN108876805B (en)

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110415187A (en) * 2019-07-04 2019-11-05 深圳市华星光电技术有限公司 Image processing method and image processing system
CN111369566A (en) * 2018-12-25 2020-07-03 杭州海康威视数字技术股份有限公司 Method, device and equipment for determining position of pavement blanking point and storage medium
CN113221826A (en) * 2021-05-31 2021-08-06 浙江工商大学 Road detection method based on self-supervision learning significance estimation pixel embedding
CN113392809A (en) * 2019-02-21 2021-09-14 百度在线网络技术(北京)有限公司 Automatic driving information processing method and device and storage medium
WO2022087853A1 (en) * 2020-10-27 2022-05-05 深圳市深光粟科技有限公司 Image segmentation method and apparatus, and computer-readable storage medium
US11473927B2 (en) * 2020-02-05 2022-10-18 Electronic Arts Inc. Generating positions of map items for placement on a virtual map

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103034862A (en) * 2012-12-14 2013-04-10 北京诚达交通科技有限公司 Road snow and rain state automatic identification method based on feature information classification
CN106650690A (en) * 2016-12-30 2017-05-10 东华大学 Night vision image scene identification method based on deep convolution-deconvolution neural network
CN107492071A (en) * 2017-08-17 2017-12-19 京东方科技集团股份有限公司 Medical image processing method and equipment
CN107808140A (en) * 2017-11-07 2018-03-16 浙江大学 A kind of monocular vision Road Recognition Algorithm based on image co-registration
CN108062753A (en) * 2017-12-29 2018-05-22 重庆理工大学 The adaptive brain tumor semantic segmentation method in unsupervised domain based on depth confrontation study


Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
STIAAN WIEHMAN.ETC: ""Unsupervised Pre-training for Fully Convolutional Neural Networks"", 《2016 PATTERN RECOGNITION ASSOCIATION OF SOUTH AFRICA AND ROBOTICS AND MECHATRONICS INTERNATIONAL CONFERENCE》 *

Cited By (12)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111369566A (en) * 2018-12-25 2020-07-03 杭州海康威视数字技术股份有限公司 Method, device and equipment for determining position of pavement blanking point and storage medium
CN111369566B (en) * 2018-12-25 2023-12-05 杭州海康威视数字技术股份有限公司 Method, device, equipment and storage medium for determining position of pavement blanking point
CN113392809A (en) * 2019-02-21 2021-09-14 百度在线网络技术(北京)有限公司 Automatic driving information processing method and device and storage medium
CN113392809B (en) * 2019-02-21 2023-08-15 百度在线网络技术(北京)有限公司 Automatic driving information processing method, device and storage medium
CN110415187A (en) * 2019-07-04 2019-11-05 深圳市华星光电技术有限公司 Image processing method and image processing system
CN110415187B (en) * 2019-07-04 2021-07-23 Tcl华星光电技术有限公司 Image processing method and image processing system
US11473927B2 (en) * 2020-02-05 2022-10-18 Electronic Arts Inc. Generating positions of map items for placement on a virtual map
US20220412765A1 (en) * 2020-02-05 2022-12-29 Electronic Arts Inc. Generating Positions of Map Items for Placement on a Virtual Map
US11668581B2 (en) * 2020-02-05 2023-06-06 Electronic Arts Inc. Generating positions of map items for placement on a virtual map
WO2022087853A1 (en) * 2020-10-27 2022-05-05 深圳市深光粟科技有限公司 Image segmentation method and apparatus, and computer-readable storage medium
CN113221826A (en) * 2021-05-31 2021-08-06 浙江工商大学 Road detection method based on self-supervision learning significance estimation pixel embedding
CN113221826B (en) * 2021-05-31 2023-05-02 浙江工商大学 Road detection method based on self-supervision learning significance estimation pixel embedding

Also Published As

Publication number Publication date
CN108876805B (en) 2021-07-27

Similar Documents

Publication Publication Date Title
CN108876805A (en) The end-to-end unsupervised scene of one kind can traffic areas cognition and understanding method
CN109934163A (en) A kind of aerial image vehicle checking method merged again based on scene priori and feature
CN113936139B (en) Scene aerial view reconstruction method and system combining visual depth information and semantic segmentation
CN106920243A (en) The ceramic material part method for sequence image segmentation of improved full convolutional neural networks
CN106408015A (en) Road fork identification and depth estimation method based on convolutional neural network
CN110263833A (en) Based on coding-decoding structure image, semantic dividing method
CN110853057B (en) Aerial image segmentation method based on global and multi-scale full-convolution network
CN104318569A (en) Space salient region extraction method based on depth variation model
CN108256464A (en) High-resolution remote sensing image urban road extracting method based on deep learning
CN113052106B (en) Airplane take-off and landing runway identification method based on PSPNet network
CN106355643A (en) Method for generating three-dimensional real scene road model of highway
CN113505842A (en) Automatic extraction method suitable for large-scale regional remote sensing image urban building
Cao et al. MCS-YOLO: A multiscale object detection method for autonomous driving road environment recognition
CN115292913A (en) Vehicle-road-cooperation-oriented drive test perception simulation system
CN111599007B (en) Smart city CIM road mapping method based on unmanned aerial vehicle aerial photography
CN106295491A (en) Track line detection method and device
CN102254162B (en) Method for detecting airport runway in synthetic aperture radar (SAR) image based on minimum linear ratio
CN115661032A (en) Intelligent pavement disease detection method suitable for complex background
Tian et al. Road marking detection based on mask R-CNN instance segmentation model
CN114943902A (en) Urban vegetation unmanned aerial vehicle remote sensing classification method based on multi-scale feature perception network
CN110472508A (en) Lane line distance measuring method based on deep learning and binocular vision
CN113361528A (en) Multi-scale target detection method and system
CN114708560B (en) YOLOX algorithm-based illegal parking detection method and system
CN103886289A (en) Direction self-adaptive method and system for identifying on-water bridge targets
CN116385716A (en) Three-dimensional map ground object data automatic production method based on remote sensing map

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant
TR01 Transfer of patent right

Effective date of registration: 20240207

Address after: 710200 Jingwei Industrial Park, economic development zone, Xi'an City, Shaanxi Province

Patentee after: SHAANXI HEAVY DUTY AUTOMOBILE Co.,Ltd.

Country or region after: China

Address before: 710064 No. 33, South Second Ring Road, Shaanxi, Xi'an

Patentee before: CHANG'AN University

Country or region before: China
