CN108133484A - Automatic driving processing method and apparatus based on scene segmentation, and computing device - Google Patents
Automatic driving processing method and apparatus based on scene segmentation, and computing device
- Publication number
- CN108133484A (application CN201711405705.1A / CN201711405705A)
- Authority
- CN
- China
- Prior art keywords
- frame image
- layer
- current frame
- scene segmentation
- result
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/10—Segmentation; Edge detection
- G06T7/174—Segmentation; Edge detection involving the use of two or more images
-
- G—PHYSICS
- G05—CONTROLLING; REGULATING
- G05D—SYSTEMS FOR CONTROLLING OR REGULATING NON-ELECTRIC VARIABLES
- G05D1/00—Control of position, course, altitude or attitude of land, water, air or space vehicles, e.g. using automatic pilots
- G05D1/02—Control of position or course in two dimensions
- G05D1/021—Control of position or course in two dimensions specially adapted to land vehicles
- G05D1/0231—Control of position or course in two dimensions specially adapted to land vehicles using optical position detecting means
- G05D1/0246—Control of position or course in two dimensions specially adapted to land vehicles using optical position detecting means using a video camera in combination with image processing means
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/70—Determining position or orientation of objects or cameras
- G06T7/73—Determining position or orientation of objects or cameras using feature-based methods
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/10—Image acquisition modality
- G06T2207/10016—Video; Image sequence
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/20—Special algorithmic details
- G06T2207/20081—Training; Learning
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/20—Special algorithmic details
- G06T2207/20084—Artificial neural networks [ANN]
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/30—Subject of image; Context of image processing
- G06T2207/30248—Vehicle exterior or interior
- G06T2207/30252—Vehicle exterior; Vicinity of vehicle
Abstract
The invention discloses an automatic driving processing method and apparatus based on scene segmentation, and a computing device. The method groups the frame images of captured and/or recorded video, and includes: acquiring, in real time, a current frame image from video captured and/or recorded by an image acquisition device during vehicle driving; inputting the current frame image into a trained neural network and, according to the frame position of the current frame image within its group, performing scene segmentation on the current frame image to obtain its scene segmentation result; determining a travel route and/or driving instruction according to the scene segmentation result; and performing automatic driving control of the vehicle according to the determined travel route and/or driving instruction. The invention performs scene segmentation on frame images differently depending on the frame position of the current frame image within its group. Using the scene segmentation result to accurately determine the travel route and/or driving instruction helps improve the safety of automatic driving.
Description
Technical field
The present invention relates to the field of image processing, and in particular to an automatic driving processing method and apparatus based on scene segmentation, and a computing device.
Background technology
Image scene segmentation processing is mainly based upon the full convolutional neural networks in deep learning, these processing methods utilize
The thought of transfer learning will pass through the obtained network migration of pre-training to image partitioned data set on extensive categorized data set
On be trained, so as to obtain the segmentation network for scene cut, scene point then is carried out to image using the segmentation network
It cuts.Higher requirement is had to the timeliness and accuracy of scene cut based on the automatic Pilot of scene cut, it is automatic to ensure
The safety of driving.
The prior art is when being split scene, often using each frame image in video data as individual frame
Image carries out scene cut, obtains the scene cut result of each frame image.But this processing mode carries out each frame image
Identical processing does not account for the relevance between each frame image in video data.So that the speed of processing is slower, need to spend
Take the more time.
Summary of the invention
In view of the above problems, the present invention is proposed to provide an automatic driving processing method and apparatus based on scene segmentation, and a computing device, which overcome the above problems or at least partially solve them.
According to one aspect of the invention, there is provided an automatic driving processing method based on scene segmentation. The method groups the frame images contained in captured and/or recorded video, and includes:
acquiring, in real time, a current frame image from video captured and/or recorded by an image acquisition device during vehicle driving;
inputting the current frame image into a trained neural network and, according to the frame position of the current frame image within its group, performing scene segmentation on the current frame image to obtain its scene segmentation result;
determining a travel route and/or driving instruction according to the scene segmentation result; and
performing automatic driving control of the vehicle according to the determined travel route and/or driving instruction.
Optionally, determining a travel route and/or driving instruction according to the scene segmentation result further comprises:
determining contour information of a specific object according to the scene segmentation result;
calculating the relative positional relationship between the own vehicle and the specific object according to the contour information of the specific object; and
determining the travel route and/or driving instruction according to the calculated relative positional relationship.
Optionally, the relative positional relationship between the own vehicle and the specific object includes distance information and/or angle information between the own vehicle and the specific object.
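The patent does not specify how distance and angle are derived from the contour; a minimal sketch of one plausible approach is shown below, assuming a pinhole camera with a known focal length and a known real-world object height (all numeric parameters here are hypothetical illustration values, not from the source).

```python
def relative_position(contour, image_w=1280, hfov_deg=90.0,
                      real_height_m=1.5, focal_px=640.0):
    """Estimate (distance, bearing angle) of a segmented object.

    Illustrative only: contour is a list of (x, y) pixel points from the
    segmentation result; focal_px, real_height_m and hfov_deg are assumed
    calibration values, none of which appear in the patent text.
    """
    xs = [p[0] for p in contour]
    ys = [p[1] for p in contour]
    cx = sum(xs) / len(xs)                    # contour centroid, x only
    pixel_height = max(ys) - min(ys)          # apparent height in pixels
    # Similar triangles: distance = f * H_real / h_pixels
    distance_m = focal_px * real_height_m / pixel_height
    # Horizontal bearing of the centroid relative to the optical axis
    angle_deg = (cx - image_w / 2) / (image_w / 2) * (hfov_deg / 2)
    return distance_m, angle_deg
```

A planner could then, for example, slow the vehicle when `distance_m` falls below a threshold in the current lane's bearing range.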
Optionally, determining a travel route and/or driving instruction according to the scene segmentation result further comprises:
determining the travel route and/or driving instruction of the own vehicle according to road sign information contained in the scene segmentation result.
Optionally, determining a travel route and/or driving instruction according to the scene segmentation result further comprises:
determining the travel route and/or driving instruction according to traffic light information contained in the scene segmentation result.
Optionally, inputting the current frame image into a trained neural network and performing scene segmentation on it according to its frame position within its group to obtain its scene segmentation result further comprises:
judging whether the current frame image is the 1st frame image of any group;
if so, inputting the current frame image into the trained neural network, and obtaining its scene segmentation result after the operations of all convolutional layers and deconvolution layers of the network;
if not, inputting the current frame image into the trained neural network and, after computing up to the i-th convolutional layer of the network to obtain the output of the i-th convolutional layer, acquiring the output of the j-th deconvolution layer that was obtained when the 1st frame image of the group was input into the network, and directly fusing the output of the i-th convolutional layer with the output of the j-th deconvolution layer to obtain the scene segmentation result of the current frame image; where i and j are natural numbers.
Optionally, after judging that the current frame image is not the 1st frame image of any group, the method further includes:
calculating the frame distance between the current frame image and the 1st frame image of its group; and
determining the values of i and j according to the frame distance; wherein the layer distance between the i-th convolutional layer and the last convolutional layer is inversely proportional to the frame distance, and the layer distance between the j-th deconvolution layer and the output layer is directly proportional to the frame distance.
Optionally, the method further includes: presetting the correspondence between frame distances and the values of i and j.
Optionally, after directly fusing the output of the i-th convolutional layer with the output of the j-th deconvolution layer, the method further includes:
if the j-th deconvolution layer is the last deconvolution layer of the neural network, inputting the image fusion result into the output layer to obtain the scene segmentation result of the current frame image;
if the j-th deconvolution layer is not the last deconvolution layer of the neural network, inputting the image fusion result into the (j+1)-th deconvolution layer and obtaining the scene segmentation result of the current frame image through the operations of the subsequent deconvolution layers and the output layer.
Optionally, inputting the current frame image into the trained neural network and obtaining its scene segmentation result after the operations of all convolutional layers and deconvolution layers further comprises: after the operation of each convolutional layer before the last convolutional layer of the neural network, down-sampling the output of that convolutional layer.
Optionally, before computing up to the i-th convolutional layer of the neural network to obtain the output of the i-th convolutional layer, the method further includes: after the operation of each convolutional layer before the i-th convolutional layer of the neural network, down-sampling the output of that convolutional layer.
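The patent does not fix the down-sampling operator; a minimal stand-in, assuming 2x2 max-pooling (the pooling type and stride are assumptions, not stated in the source), could look like:

```python
import numpy as np

def downsample(feature_map, factor=2):
    """Max-pool a (H, W, C) feature map by `factor` in each spatial axis.

    Sketch of the per-convolutional-layer down-sampling step; the patent
    only says outputs are down-sampled, so max-pooling is an assumption.
    """
    h, w, c = feature_map.shape
    h, w = h - h % factor, w - w % factor          # crop to a multiple of factor
    x = feature_map[:h, :w].reshape(h // factor, factor,
                                    w // factor, factor, c)
    return x.max(axis=(1, 3))                      # max over each pooling window
```

Down-sampling between convolutional layers shrinks the feature maps, which is what later lets a shallow conv output match the spatial size of a deep deconvolution output for fusion.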
Optionally, each group of video includes n frame images, where n is a fixed preset value.
According to another aspect of the present invention, there is provided an automatic driving processing apparatus based on scene segmentation. The apparatus groups the frame images contained in captured and/or recorded video, and includes:
an acquisition module, adapted to acquire, in real time, a current frame image from video captured and/or recorded by an image acquisition device during vehicle driving;
a segmentation module, adapted to input the current frame image into a trained neural network and, according to the frame position of the current frame image within its group, perform scene segmentation on the current frame image to obtain its scene segmentation result;
a determining module, adapted to determine a travel route and/or driving instruction according to the scene segmentation result; and
a control module, adapted to perform automatic driving control of the vehicle according to the determined travel route and/or driving instruction.
Optionally, the determining module is further adapted to:
determine contour information of a specific object according to the scene segmentation result; calculate the relative positional relationship between the own vehicle and the specific object according to the contour information of the specific object; and determine the travel route and/or driving instruction according to the calculated relative positional relationship.
Optionally, the relative positional relationship between the own vehicle and the specific object includes distance information and/or angle information between the own vehicle and the specific object.
Optionally, the determining module is further adapted to:
determine the travel route and/or driving instruction of the own vehicle according to road sign information contained in the scene segmentation result.
Optionally, the determining module is further adapted to:
determine the travel route and/or driving instruction according to traffic light information contained in the scene segmentation result.
Optionally, the segmentation module further comprises:
a judging unit, adapted to judge whether the current frame image is the 1st frame image of any group, and if so, to execute a first segmentation unit, otherwise to execute a second segmentation unit;
the first segmentation unit, adapted to input the current frame image into the trained neural network and obtain its scene segmentation result after the operations of all convolutional layers and deconvolution layers of the network;
the second segmentation unit, adapted to input the current frame image into the trained neural network and, after computing up to the i-th convolutional layer of the network to obtain the output of the i-th convolutional layer, acquire the output of the j-th deconvolution layer obtained when the 1st frame image of the group was input into the network, and directly fuse the output of the i-th convolutional layer with the output of the j-th deconvolution layer to obtain the scene segmentation result of the current frame image; where i and j are natural numbers.
Optionally, the segmentation module further includes:
a frame distance calculating unit, adapted to calculate the frame distance between the current frame image and the 1st frame image of its group; and
a determination unit, adapted to determine the values of i and j according to the frame distance; wherein the layer distance between the i-th convolutional layer and the last convolutional layer is inversely proportional to the frame distance, and the layer distance between the j-th deconvolution layer and the output layer is directly proportional to the frame distance.
Optionally, the segmentation module further includes:
a presetting unit, adapted to preset the correspondence between frame distances and the values of i and j.
Optionally, the second segmentation unit is further adapted to:
if the j-th deconvolution layer is the last deconvolution layer of the neural network, input the image fusion result into the output layer to obtain the scene segmentation result of the current frame image;
if the j-th deconvolution layer is not the last deconvolution layer of the neural network, input the image fusion result into the (j+1)-th deconvolution layer and obtain the scene segmentation result of the current frame image through the operations of the subsequent deconvolution layers and the output layer.
Optionally, the first segmentation unit is further adapted to:
after the operation of each convolutional layer before the last convolutional layer of the neural network, down-sample the output of that convolutional layer.
Optionally, the second segmentation unit is further adapted to:
after the operation of each convolutional layer before the i-th convolutional layer of the neural network, down-sample the output of that convolutional layer.
Optionally, each group of video includes n frame images, where n is a fixed preset value.
According to yet another aspect of the invention, there is provided a computing device, including: a processor, a memory, a communication interface and a communication bus, the processor, the memory and the communication interface completing mutual communication through the communication bus;
the memory being configured to store at least one executable instruction that causes the processor to perform operations corresponding to the above automatic driving processing method based on scene segmentation.
According to a further aspect of the invention, there is provided a computer storage medium having at least one executable instruction stored therein, the executable instruction causing a processor to perform operations corresponding to the above automatic driving processing method based on scene segmentation.
According to the automatic driving processing method and apparatus based on scene segmentation and the computing device provided by the invention, a current frame image is acquired in real time from video captured and/or recorded by an image acquisition device during vehicle driving; the current frame image is input into a trained neural network and, according to its frame position within its group, scene segmentation is performed on it to obtain its scene segmentation result; a travel route and/or driving instruction is determined according to the scene segmentation result; and automatic driving control is performed on the vehicle according to the determined travel route and/or driving instruction. In scene segmentation, the invention exploits the continuity and correlation between the frame images of a video by processing the video in groups: frame images are segmented differently according to their frame position within their group. Further, for the 1st frame image of each group, the operations of all convolutional layers and deconvolution layers are completed by the neural network, while for the other frame images only the layers up to the i-th convolutional layer are computed and the already obtained output of the j-th deconvolution layer of the 1st frame image is reused for image fusion. This greatly reduces the amount of computation of the neural network and increases the speed of scene segmentation. Using the scene segmentation result to accurately determine the travel route and/or driving instruction helps improve the safety of automatic driving.
The above description is only an overview of the technical solution of the present invention. In order that the technical means of the invention may be understood more clearly and implemented in accordance with the contents of the specification, and in order that the above and other objects, features and advantages of the invention may be more readily apparent, specific embodiments of the invention are set forth below.
Description of the drawings
Various other advantages and benefits will become clear to those of ordinary skill in the art upon reading the following detailed description of the preferred embodiments. The drawings are only for the purpose of showing the preferred embodiments and are not to be considered a limitation of the invention. Throughout the drawings, the same reference numerals denote the same parts. In the drawings:
Fig. 1 shows a flowchart of an automatic driving processing method based on scene segmentation according to an embodiment of the invention;
Fig. 2 shows a flowchart of an automatic driving processing method based on scene segmentation according to another embodiment of the invention;
Fig. 3 shows a functional block diagram of an automatic driving processing apparatus based on scene segmentation according to an embodiment of the invention;
Fig. 4 shows a structural diagram of a computing device according to an embodiment of the invention.
Specific embodiments
Exemplary embodiments of the present disclosure are described more fully below with reference to the accompanying drawings. Although exemplary embodiments of the disclosure are shown in the drawings, it should be understood that the disclosure may be embodied in various forms and should not be limited by the embodiments set forth herein. Rather, these embodiments are provided so that the present invention will be understood more thoroughly and its scope fully conveyed to those skilled in the art.
Fig. 1 shows a flowchart of an automatic driving processing method based on scene segmentation according to an embodiment of the invention. As shown in Fig. 1, the method specifically comprises the following steps:
Step S101: acquiring, in real time, a current frame image from video captured and/or recorded by an image acquisition device during vehicle driving.
In this embodiment the image acquisition device is illustrated by a camera mounted on an automatic driving vehicle. To realize automatic driving, road condition information around the vehicle can be collected by the camera mounted on the automatic driving vehicle; in step S101, the current frame image of the video being recorded by the camera, or the current frame image while shooting video, is acquired in real time.
This embodiment exploits the continuity and correlation between the frame images of the video captured and/or recorded during vehicle driving: before scene segmentation is performed on each frame image, the frame images of the video are grouped. When grouping, the association between the frame images is considered, and closely associated frame images are divided into one group. The number of frames contained in different groups may be the same or different; assuming each group contains n frame images, n may be a fixed or non-fixed value, set according to performance requirements. When a current frame image is acquired in real time, it is immediately assigned to a group, determining whether it is a frame image within the current group or the 1st frame image of a new group. Specifically, grouping is performed according to the association between the current frame image and the previous frame image or several previous frame images. For example, a tracking algorithm may be used: if the tracking algorithm yields a valid tracking result for the current frame image, the current frame image is determined to be a frame image of the current group; if the tracking result is invalid, the current frame image is determined to be the 1st frame image of a new group. Alternatively, by frame order, every two or three adjacent frame images are divided into one group. Taking three frame images per group as an example, the 1st frame image of the video is the 1st frame image of the first group, the 2nd frame image is the 2nd frame image of the first group, the 3rd frame image is the 3rd frame image of the first group, the 4th frame image is the 1st frame image of the second group, the 5th frame image is the 2nd frame image of the second group, the 6th frame image is the 3rd frame image of the second group, and so on. The specific grouping mode is determined by the performance requirements of the implementation and is not limited here.
Step S102: inputting the current frame image into the trained neural network and, according to the frame position of the current frame image within its group, performing scene segmentation on the current frame image to obtain its scene segmentation result.
After the current frame image is input into the trained neural network, scene segmentation is performed on it according to its frame position within its group; the segmentation processing applied to it differs according to that frame position.
Specifically, it is judged whether the current frame image is the 1st frame image of any group. If the current frame image is judged to be the 1st frame image of some group, it is input into the trained neural network, the operations of all convolutional layers and all deconvolution layers of the network are performed on it in turn, and the scene segmentation result of the current frame image is finally obtained. For example, if the network comprises the operations of 4 convolutional layers and 3 deconvolution layers, a current frame image input into the network passes through all 4 convolutional layer operations and all 3 deconvolution layer operations.
If the current frame image is judged not to be the 1st frame image of any group, it is input into the trained neural network, but the operations of all convolutional layers and deconvolution layers need not be performed on it. After computing only up to the i-th convolutional layer of the network to obtain the output of the i-th convolutional layer, the output of the j-th deconvolution layer obtained when the 1st frame image of the group was input into the network is directly acquired, and the output of the i-th convolutional layer is fused with the output of the j-th deconvolution layer to obtain the scene segmentation result of the current frame image. There is a correspondence between the i-th convolutional layer and the j-th deconvolution layer, namely that the output dimensions of the two layers' results are identical. i and j are natural numbers, the value of i not exceeding the number of the last convolutional layer of the network and the value of j not exceeding the number of the last deconvolution layer. For example, the current frame image is input into the network and computed up to the 1st convolutional layer to obtain its output; the output of the 3rd deconvolution layer obtained when the 1st frame image of the group was input into the network is directly acquired; and the output of the 1st convolutional layer is fused with the output of the 3rd deconvolution layer of the 1st frame image, the two outputs having identical dimensions. By reusing the j-th deconvolution layer output already computed for the 1st frame image of the group, the computation the network performs on the current frame image is reduced, greatly accelerating the network and improving its computational efficiency. Further, if the j-th deconvolution layer is the last deconvolution layer of the network, the image fusion result is input into the output layer to obtain the scene segmentation result of the current frame image. If the j-th deconvolution layer is not the last deconvolution layer of the network, the image fusion result is input into the (j+1)-th deconvolution layer, and the scene segmentation result of the current frame image is obtained through the operations of the subsequent deconvolution layers and the output layer.
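The full-pass / partial-pass-with-reuse flow above can be sketched with toy layers. Assumptions: `conv` and `deconv` here are trivial stand-ins that halve and double spatial resolution (the real trained weights are not given in the patent), there are 4 conv and 3 deconv layers as in the example, and fusion is a simple average of the two equal-sized maps.

```python
import numpy as np

# Toy stand-ins for trained layers: average-pool halves resolution,
# repeat doubles it.  None of this is the patent's actual network.
def conv(x):   return 0.5 * (x[::2, ::2] + x[1::2, 1::2])
def deconv(x): return np.repeat(np.repeat(x, 2, axis=0), 2, axis=1)

class GroupedSegmenter:
    """Full pass on a group's 1st frame; partial pass plus reuse after."""
    N_CONV, N_DECONV = 4, 3

    def __init__(self):
        self.cached_deconv = {}   # deconv outputs of the group's 1st frame

    def segment(self, frame, is_first, i=1, j=3):
        if is_first:
            x = frame
            for _ in range(self.N_CONV):          # all conv layers
                x = conv(x)
            for layer in range(1, self.N_DECONV + 1):
                x = deconv(x)
                self.cached_deconv[layer] = x     # cache for later frames
            return x                              # output layer omitted
        # Non-first frame: run conv layers 1..i only, then fuse with the
        # cached output of deconv layer j (same spatial dimensions).
        x = frame
        for _ in range(i):
            x = conv(x)
        fused = 0.5 * (x + self.cached_deconv[j])
        for _ in range(j + 1, self.N_DECONV + 1): # remaining deconv layers
            fused = deconv(fused)
        return fused
```

Note how the i ↔ j correspondence falls out of the resolutions: after i halvings the map matches the cached j-th deconv output only for matching (i, j) pairs, which is the "identical output dimensions" condition in the text.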
It is not the 1st frame image in any grouping for current frame image, it is thus necessary to determine that the value of i and j.Judging to work as
Prior image frame is not calculating current frame image and the frame of the 1st frame image being grouped belonging to it after the 1st frame image of any grouping
Spacing.Such as the 3rd frame image that current frame image is any grouping, the interframe of its 1st frame image with affiliated grouping is calculated
Away from being 2.According to obtained frame pitch, it may be determined that the value of the i of i-th layer of convolutional layer and the 1st frame image jth in neural network
The value of the j of layer warp lamination.
When determining i and j, the layer spacing between the i-th convolution layer and the last convolution layer (the bottleneck layer of the convolution layers) can be regarded as inversely related to the frame spacing, and the layer spacing between the j-th deconvolution layer and the output layer as proportional to the frame spacing. The larger the frame spacing, the smaller the layer spacing between the i-th convolution layer and the last convolution layer, the larger the value of i, and the more convolution layers need to be run; correspondingly, the larger the layer spacing between the j-th deconvolution layer and the output layer, the smaller the value of j, and the operation result of a lower-numbered deconvolution layer is obtained. Take a neural network containing convolution layers 1-4, where the 4th convolution layer is the last convolution layer, plus deconvolution layers 1-3 and an output layer. When the frame spacing is 1, the layer spacing between the i-th convolution layer and the last convolution layer is determined to be 3 and i to be 1, i.e. the operation runs to the 1st convolution layer; the layer spacing between the j-th deconvolution layer and the output layer is determined to be 1 and j to be 3, so the operation result of the 3rd deconvolution layer is obtained. When the frame spacing is 2, the layer spacing between the i-th convolution layer and the last convolution layer is determined to be 2 and i to be 2, i.e. the operation runs to the 2nd convolution layer; the layer spacing between the j-th deconvolution layer and the output layer is 2 and j is 2, so the operation result of the 2nd deconvolution layer is obtained. The specific layer spacings depend on the numbers of convolution and deconvolution layers the neural network contains and on the effect the actual implementation is intended to achieve; the above is only an illustration.
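The inverse/proportional rule above can be sketched as a small helper. This is an illustrative sketch only; the function name and the clamping behaviour for large frame spacings are assumptions, not part of the patent, which fixes only the example values for a network with 4 convolution and 3 deconvolution layers.

```python
def select_layers(frame_spacing, num_conv=4, num_deconv=3):
    """Map the frame spacing to (i, j) for a hypothetical network with
    num_conv convolution layers and num_deconv deconvolution layers.

    Layer spacing between conv layer i and the last conv layer is
    inversely related to the frame spacing; layer spacing between
    deconv layer j and the output layer is proportional to it.
    """
    # Larger frame spacing -> smaller gap to the last conv layer -> larger i.
    i = min(frame_spacing, num_conv - 1)        # run conv layers 1..i
    # Larger frame spacing -> larger gap to the output layer -> smaller j.
    j = max(num_deconv - frame_spacing + 1, 1)  # reuse deconv layer j's result
    return i, j
```

With the patent's example network this reproduces the stated pairs: a frame spacing of 1 yields (i, j) = (1, 3) and a frame spacing of 2 yields (2, 2).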
Alternatively, when determining i and j, a correspondence between frame spacings and the values of i and j can be preset, and the values looked up directly from the frame spacing. Specifically, different values of i and j may be preset for different frame spacings: for example, when the frame spacing is 1, i is set to 1 and j to 3; when the frame spacing is 2, i is set to 2 and j to 2. Identical values of i and j may also be set for different frame spacings, e.g. regardless of the frame spacing, i is set to 2 and j to 2. Or identical values of i and j may be set for a subset of the frame spacings, e.g. for frame spacings 1 and 2, i is set to 1 and j to 3, while for frame spacings 3 and 4, i is set to 2 and j to 2. The correspondence is configured according to the actual implementation and is not limited here.
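The preset-correspondence variant amounts to a lookup table. The concrete table below is only a sketch combining the examples given in the text; the dictionary name, its entries, and the fallback default are assumptions left open by the patent.

```python
# Hypothetical preset table mapping frame spacing to (i, j); the patent
# leaves the concrete values to the implementation.
PRESET_IJ = {1: (1, 3), 2: (2, 2), 3: (2, 2), 4: (2, 2)}

def lookup_layers(frame_spacing, default=(2, 2)):
    """Look up (i, j) directly from the frame spacing."""
    # Fall back to a shared default when the spacing has no dedicated entry.
    return PRESET_IJ.get(frame_spacing, default)
```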
Further, to raise the operation speed of the neural network, if the current frame image is judged to be the 1st frame image of a grouping, down-sampling is applied to the operation result of each convolution layer before the last convolution layer of the neural network; if the current frame image is judged not to be the 1st frame image of any grouping, down-sampling is applied to the operation result of each convolution layer before the i-th convolution layer. After the current frame image is input into the neural network and the 1st convolution layer has run, its operation result is down-sampled to reduce its resolution; the down-sampled result is then passed to the 2nd convolution layer, whose operation result is down-sampled in turn, and so on, up to the last convolution layer of the neural network (the bottleneck layer of the convolution layers) or the i-th convolution layer. Taking the case where the last convolution layer or the i-th convolution layer is the 4th convolution layer as an example, no down-sampling is performed on the operation result of the 4th convolution layer. Down-sampling the operation result of each convolution layer before the 4th one reduces the resolution of the frame image each convolution layer receives, which raises the operation speed of the neural network. Note that the input to the first convolution layer operation is the current frame image acquired in real time, which is not down-sampled, so that good detail of the current frame image is preserved. Down-sampling only the subsequent operation results does not disturb the details of the current frame image, yet still raises the operation speed of the neural network.
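The down-sampling schedule above can be traced as a shape-only sketch: each layer before the stop layer halves the resolution of its result, while the raw input and the stop layer's result are left untouched. The function name, the factor-of-2 down-sampling, and the resolution bookkeeping are illustrative assumptions; the patent does not fix a down-sampling ratio.

```python
def forward_with_downsampling(resolution, stop_layer):
    """Trace the feature-map resolution seen by each conv layer.

    The raw frame enters layer 1 at full resolution; the result of every
    layer *before* stop_layer (the last conv layer, or layer i) is
    down-sampled by 2, while stop_layer's own result is not.
    """
    h, w = resolution
    trace = []
    for layer in range(1, stop_layer + 1):
        trace.append((layer, (h, w)))  # resolution this layer receives
        if layer < stop_layer:         # down-sample all but the last result
            h, w = h // 2, w // 2
    return trace
```

For a 256x256 frame and a 4-layer stack, layer 1 sees the full frame and layer 4 sees a 32x32 map whose result is not further down-sampled.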
Step S103: determine a travel route and/or a driving instruction according to the scene segmentation result.
The scene segmentation result contains various objects. From the relationships between these objects and the own vehicle, the prompt messages the objects convey to the vehicle, and so on, the travel route of the vehicle within a preset time interval and/or its driving instruction can be determined. Specifically, the driving instruction may include instructions such as starting to travel, stopping, travelling at a certain speed, or accelerating or decelerating at a certain acceleration. Those skilled in the art may set the preset time interval according to actual needs, which is not limited here.
Step S104: perform automatic driving control of the own vehicle according to the determined travel route and/or driving instruction.
Once the travel route and/or driving instruction has been determined, automatic driving control of the own vehicle can be carried out accordingly. Suppose the determined driving instruction is to decelerate at 6 m/s²; then in step S104 the automatic driving control applied to the vehicle controls its braking system so that the vehicle decelerates at 6 m/s².
According to the scene-segmentation-based automatic driving processing method provided by the present invention, the current frame image of the video being captured and/or recorded by an image capture device during vehicle travel is acquired in real time; the current frame image is input into a trained neural network, and scene segmentation is performed on it according to its frame position within the grouping to which it belongs, yielding the scene segmentation result of the current frame image; a travel route and/or driving instruction is determined from the scene segmentation result; and automatic driving control of the vehicle is carried out according to the determined travel route and/or driving instruction. The present invention exploits the continuity and correlation between the frame images of a video: during scene segmentation the video is processed in groups, and scene segmentation is applied to each frame image according to its frame position within its grouping. Further, for the 1st frame image of each group the neural network completes the operations of all convolution and deconvolution layers, while the other frame images are operated on only up to the i-th convolution layer and fused with the already obtained operation result of the j-th deconvolution layer of the 1st frame image. This greatly reduces the computation of the neural network and raises the speed of scene segmentation. Using the scene segmentation result to accurately determine the travel route and/or driving instruction helps improve the safety of automatic driving.
Fig. 2 shows a flow chart of a scene-segmentation-based automatic driving processing method according to another embodiment of the present invention. As shown in Fig. 2, the method specifically comprises the following steps:
Step S201: acquire in real time the current frame image of the video being captured and/or recorded by an image capture device during vehicle travel.
Step S202: input the current frame image into a trained neural network, perform scene segmentation on the current frame image according to its frame position within the grouping to which it belongs, and obtain the scene segmentation result of the current frame image.
The above steps are the same as steps S101-S102 of the embodiment of Fig. 1 and are not described again here.
Step S203: determine the contour information of specific objects according to the scene segmentation result.
Specifically, the specific objects may include vehicles, pedestrians, roads, obstacles and the like. Those skilled in the art may set the specific objects according to actual needs, which is not limited here. After the scene segmentation result corresponding to the current frame image has been obtained, the contour information of specific objects such as vehicles, pedestrians and roads can be determined from it, to be used subsequently for calculating the relative positional relationship between the own vehicle and those objects.
Step S204: calculate the relative positional relationship between the own vehicle and the specific objects according to their contour information.
Suppose the contour information of vehicle 1 and the contour information of vehicle 2 has been obtained in step S203; then in step S204 the relative positional relationship between the own vehicle and vehicle 1, and between the own vehicle and vehicle 2, can be calculated from that contour information.
The relative positional relationship between the own vehicle and a specific object includes distance information, e.g. the straight-line distance between the own vehicle and vehicle 1 is 200 meters. It further includes angle information between the own vehicle and the specific object, e.g. the own vehicle is in the direction 10 degrees to the right rear of vehicle 1.
Step S205: determine a travel route and/or driving instruction according to the calculated relative positional relationship.
From the calculated relative positional relationships between the own vehicle and the specific objects, the travel route of the own vehicle within a preset time interval and/or its driving instruction can be determined. Specifically, the driving instruction may include instructions such as starting to travel, stopping, travelling at a certain speed, or accelerating or decelerating at a certain acceleration. Those skilled in the art may set the preset time interval according to actual needs, which is not limited here.
For example, if the calculated relative positional relationships show a pedestrian 10 meters directly ahead of the own vehicle, the determined driving instruction may be to decelerate at 6 m/s²; or, if they show vehicle 1 at a distance of 200 meters directly ahead of the own vehicle and vehicle 2 at a distance of 2 meters in the direction 45 degrees to the left of the own vehicle, the determined travel route may be to continue along the route ahead.
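A decision of this kind can be sketched as a simple rule over (object kind, distance, bearing) tuples. The thresholds and instruction names below are illustrative assumptions, not values fixed by the patent.

```python
def decide_from_positions(objects, safe_distance=15.0):
    """Pick a driving instruction from (kind, distance_m, bearing_deg)
    tuples describing objects relative to the ego vehicle (illustrative)."""
    for kind, distance, bearing in objects:
        # A pedestrian close to straight ahead forces deceleration.
        if kind == "pedestrian" and distance < safe_distance and abs(bearing) < 20:
            return "decelerate"
    return "keep_route"  # e.g. continue along the route ahead
```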
Step S206: determine the travel route and/or driving instruction of the own vehicle according to the road sign information contained in the scene segmentation result.
The scene segmentation result contains various kinds of road sign information, such as warning signs: roundabout, sharp left turn, consecutive curves, tunnel ahead, etc.; prohibition signs: no straight travel, no entry; instruction signs: speed limit, separate travel lanes, U-turn permitted; road construction safety signs: road work ahead, left lane closed, etc.; and also guide signs, tourism signs, auxiliary signs and so on. From this specific road sign information, the travel route and/or driving instruction of the own vehicle can be determined.
For example, if the current vehicle speed is 100 km/h and the scene segmentation result contains road sign information indicating a speed limit of 80 km/h 500 m ahead, a deceleration instruction is determined for the own vehicle; or, if the scene segmentation result contains road sign information indicating that the left lane is closed 200 m ahead, the own vehicle is determined to drive in the right lane.
Step S207: determine a travel route and/or driving instruction according to the traffic light information contained in the scene segmentation result.
The scene segmentation result contains traffic light information. From the traffic light information, travel routes and/or driving instructions such as whether to continue along the current route or to decelerate and stop can be determined.
For example, from red light information 10 m ahead in the scene segmentation result, the own vehicle is determined to decelerate and stop; or, from green light information 10 m ahead in the scene segmentation result, the own vehicle is determined to continue travelling along the present road.
Further, the above steps S205, S206 and S207 may be executed in parallel, and the travel route and/or driving instruction is determined by comprehensively considering the relative positional relationships calculated from the scene segmentation result together with the road sign information and/or traffic light information it contains.
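One way to merge the parallel outputs of S205-S207 is a conservative priority rule. The patent only says the results are "comprehensively considered", so the stop-beats-decelerate ordering and the command names here are assumptions for illustration.

```python
def combine_decisions(position_cmd, sign_cmd, light_cmd):
    """Merge the commands from steps S205-S207 (illustrative): the most
    conservative command wins (stop > decelerate > anything else)."""
    priority = {"stop": 2, "decelerate": 1}
    cmds = [position_cmd, sign_cmd, light_cmd]
    return max(cmds, key=lambda c: priority.get(c, 0))
```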
Step S208: perform automatic driving control of the own vehicle according to the determined travel route and/or driving instruction.
Once the travel route and/or driving instruction has been determined, automatic driving control of the own vehicle can be carried out accordingly.
According to the scene-segmentation-based automatic driving processing method provided by the present invention, the continuity and correlation between the frame images of a video are exploited: during scene segmentation the frame images of the video are processed in groups, and scene segmentation is applied to each frame image according to its frame position within the grouping to which it belongs, yielding the scene segmentation result of the current frame image. Further, based on the obtained scene segmentation result, the relative positional relationships between the own vehicle and specific objects such as other vehicles, pedestrians and roads can be calculated more accurately, and from these relationships the travel route and/or driving instruction can be determined more accurately. Based on the road sign information and traffic light information contained in the scene segmentation result, the own vehicle can better observe traffic laws and regulations and drive automatically in a safe, accurate and law-abiding manner, which improves the safety of automatic driving and optimizes the automatic driving processing scheme.
Fig. 3 shows a functional block diagram of a scene-segmentation-based automatic driving processing device according to an embodiment of the present invention. As shown in Fig. 3, the device includes the following modules:
An acquisition module 310, adapted to acquire in real time the current frame image of the video being captured and/or recorded by an image capture device during vehicle travel.
In this embodiment the image capture device is illustrated as a camera mounted on an automatic driving vehicle. To realize automatic driving, the camera mounted on the vehicle collects the road information around it, and the acquisition module 310 acquires in real time the current frame image of the video the camera is recording, or the current frame image at the moment the video is being shot. This embodiment exploits the continuity and correlation between the frame images of the video captured and/or recorded during vehicle travel: before scene segmentation is performed on the frame images of the video, the frame images are first grouped. During grouping, the association relationships between the frame images are considered, and frame images that are closely associated are placed into the same group. The number of frame images in different groups may be the same or different; assuming each group contains n frame images, n may be a fixed or non-fixed value, and its value is set according to the implementation. As each current frame image is acquired in real time it is assigned to a group, determining whether it is a frame image within the current group or the 1st frame image of a new group. Specifically, grouping is performed according to the association relationship between the current frame image and the preceding frame image or preceding several frame images. For example, a tracking algorithm may be used: if the tracking algorithm yields a valid tracking result for the current frame image, the current frame image is determined to be a frame image within the current group; if it yields an invalid tracking result, the current frame image is determined to be the 1st frame image of a new group. Or, following the order of the frame images, every two or three adjacent frame images may be placed into one group. Taking three frame images per group as an example, the 1st frame image of the video data is the 1st frame image of the first group, the 2nd frame image is the 2nd frame image of the first group, the 3rd frame image is the 3rd frame image of the first group, the 4th frame image is the 1st frame image of the second group, the 5th frame image is the 2nd frame image of the second group, the 6th frame image is the 3rd frame image of the second group, and so on. The specific grouping scheme is set according to the implementation and is not limited here.
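The fixed-size variant of this grouping (every n adjacent frames form one group) reduces to simple index arithmetic. The helper below is a sketch of that scheme only; the tracking-based variant depends on a tracker the patent does not specify.

```python
def group_index(frame_idx, n=3):
    """For fixed-size grouping of consecutive frames, return the
    (group number, position in group) of a 0-based frame index,
    both 1-based as in the text."""
    group = frame_idx // n + 1     # which group the frame falls into
    position = frame_idx % n + 1   # 1 means 'first frame of its group'
    return group, position
```

With n = 3 this reproduces the example: the 4th frame of the video (index 3) is the 1st frame image of the second group.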
A segmentation module 320, adapted to input the current frame image into a trained neural network and perform scene segmentation on the current frame image according to its frame position within the grouping to which it belongs, obtaining the scene segmentation result of the current frame image.
After the segmentation module 320 inputs the current frame image into the trained neural network, it performs scene segmentation on the current frame image according to the current frame image's frame position within the grouping to which it belongs. Depending on that frame position, the segmentation processing the module 320 applies also differs.
The segmentation module 320 includes a judging unit 321, a first segmentation unit 322 and a second segmentation unit 323.
Specifically, the judging unit 321 judges whether the current frame image is the 1st frame image of a grouping. If the judging unit 321 judges that it is, the first segmentation unit 322 inputs the current frame image into the trained neural network, which successively performs the operations of all its convolution layers and deconvolution layers on it, finally obtaining the scene segmentation result of the current frame image. For example, if the neural network contains the operations of 4 convolution layers and 3 deconvolution layers, the first segmentation unit 322 inputs the current frame image into the neural network and completes the operations of all 4 convolution layers and all 3 deconvolution layers.
If the judging unit 321 judges that the current frame image is not the 1st frame image of any grouping, the second segmentation unit 323 inputs the current frame image into the trained neural network, but the neural network need not perform the operations of all its convolution and deconvolution layers on it. The second segmentation unit 323 runs the operation only up to the i-th convolution layer of the neural network and obtains the operation result of the i-th convolution layer; it then directly acquires the operation result of the j-th deconvolution layer obtained when the 1st frame image of the grouping to which the current frame image belongs was input into the neural network, and fuses the operation result of the i-th convolution layer with the operation result of the j-th deconvolution layer to obtain the scene segmentation result of the current frame image. Here there is a correspondence between the i-th convolution layer and the j-th deconvolution layer: the output dimensions of the operation result of the i-th convolution layer and of the operation result of the j-th deconvolution layer are identical. Both i and j are natural numbers; the value of i does not exceed the layer number of the last convolution layer contained in the neural network, and the value of j does not exceed the layer number of the last deconvolution layer contained in the neural network. For example, the second segmentation unit 323 inputs the current frame image into the neural network and runs the operation up to the 1st convolution layer, obtaining the operation result of the 1st convolution layer; it then directly acquires the operation result of the 3rd deconvolution layer obtained when the 1st frame image of the grouping was input into the neural network, and fuses the operation result of the 1st convolution layer with the operation result of the 3rd deconvolution layer of the 1st frame image, the output dimensions of the two operation results being identical. By multiplexing the operation result of the j-th deconvolution layer obtained from the operation on the 1st frame image of the grouping, the second segmentation unit 323 reduces the computation the neural network performs on the current frame image, greatly speeding up the processing of the neural network and thus improving its computational efficiency. Further, if the j-th deconvolution layer is the last deconvolution layer of the neural network, the second segmentation unit 323 inputs the image fusion result to the output layer to obtain the scene segmentation result of the current frame image; if the j-th deconvolution layer is not the last deconvolution layer of the neural network, the second segmentation unit 323 inputs the image fusion result to the (j+1)-th deconvolution layer, and the scene segmentation result of the current frame image is obtained through the operations of each subsequent deconvolution layer and the output layer.
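The two inference paths (full network for the group's 1st frame, truncated network plus fusion for the rest) can be sketched as follows. Everything here is an assumption for illustration: `net` is a hypothetical container of conv/deconv/fuse/output callables, and the cache dictionary stands in for however an implementation stores the 1st frame's deconvolution result.

```python
def segment_frame(frame, is_first_in_group, net, cache, i=1, j=3):
    """Sketch of the split-path inference: the group's 1st frame runs the
    full network and caches deconv layer j's output; later frames run only
    conv layers 1..i and fuse with the cached result (illustrative)."""
    if is_first_in_group:
        x = frame
        for conv in net["convs"]:                  # all conv layers
            x = conv(x)
        for k, deconv in enumerate(net["deconvs"], start=1):
            x = deconv(x)
            if k == j:
                cache["deconv_j"] = x              # reuse for the group's other frames
    else:
        x = frame
        for conv in net["convs"][:i]:              # only conv layers 1..i
            x = conv(x)
        x = net["fuse"](x, cache["deconv_j"])      # output dimensions must match
        for deconv in net["deconvs"][j:]:          # deconv layers j+1 onward, if any
            x = deconv(x)
    return net["output"](x)
```

In a real network the layers would be convolution/deconvolution operators and the fusion a feature-map merge; the toy arithmetic in the test below only exercises the control flow.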
The segmentation module 320 further comprises a frame spacing calculation unit 324, a determination unit 325 and/or a preset unit 326.
When the current frame image is not the 1st frame image of any grouping, the segmentation module 320 needs to determine the values of i and j. After the judging unit 321 judges that the current frame image is not the 1st frame image of any grouping, the frame spacing calculation unit 324 calculates the frame spacing between the current frame image and the 1st frame image of the grouping to which it belongs. For example, if the current frame image is the 3rd frame image of a grouping, the frame spacing calculation unit 324 calculates its frame spacing from the 1st frame image of that grouping as 2. From the obtained frame spacing, the determination unit 325 can determine the value of i for the i-th convolution layer and the value of j for the j-th deconvolution layer of the 1st frame image.
When determining i and j, the determination unit 325 can regard the layer spacing between the i-th convolution layer and the last convolution layer (the bottleneck layer of the convolution layers) as inversely related to the frame spacing, and the layer spacing between the j-th deconvolution layer and the output layer as proportional to the frame spacing. The larger the frame spacing, the smaller the layer spacing between the i-th convolution layer and the last convolution layer, the larger the value of i, and the more convolution layers need to be run; correspondingly, the larger the layer spacing between the j-th deconvolution layer and the output layer, the smaller the value of j, and the operation result of a lower-numbered deconvolution layer is obtained. Take a neural network containing convolution layers 1-4, where the 4th convolution layer is the last convolution layer, plus deconvolution layers 1-3 and an output layer. When the frame spacing calculation unit 324 calculates a frame spacing of 1, the determination unit 325 determines the layer spacing between the i-th convolution layer and the last convolution layer to be 3 and i to be 1, i.e. the second segmentation unit 323 runs the operation up to the 1st convolution layer, and determines the layer spacing between the j-th deconvolution layer and the output layer to be 1 and j to be 3, so the second segmentation unit 323 obtains the operation result of the 3rd deconvolution layer. When the frame spacing calculation unit 324 calculates a frame spacing of 2, the determination unit 325 determines the layer spacing between the i-th convolution layer and the last convolution layer to be 2 and i to be 2, i.e. the second segmentation unit 323 runs the operation up to the 2nd convolution layer, and determines the layer spacing between the j-th deconvolution layer and the output layer to be 2 and j to be 2, so the second segmentation unit 323 obtains the operation result of the 2nd deconvolution layer. The specific layer spacings depend on the numbers of convolution and deconvolution layers the neural network contains and on the effect the actual implementation is intended to achieve; the above is only an illustration.
Alternatively, when determining i and j, the preset unit 326 can preset a correspondence between frame spacings and the values of i and j, and look up the values directly from the frame spacing. Specifically, the preset unit 326 may preset different values of i and j for different frame spacings: for example, when the frame spacing calculation unit 324 calculates a frame spacing of 1, the preset unit 326 sets i to 1 and j to 3; when it calculates a frame spacing of 2, the preset unit 326 sets i to 2 and j to 2. The preset unit 326 may also set identical values of i and j for different frame spacings, e.g. regardless of the frame spacing, i is set to 2 and j to 2. Or the preset unit 326 may set identical values of i and j for a subset of the frame spacings, e.g. when the frame spacing calculation unit 324 calculates a frame spacing of 1 or 2, the preset unit 326 sets i to 1 and j to 3, while for frame spacings of 3 and 4 it sets i to 2 and j to 2. The correspondence is configured according to the implementation and is not limited here.
Further, to raise the operation speed of the neural network, if the judging unit 321 judges that the current frame image is the 1st frame image of a grouping, the first segmentation unit 322 down-samples the operation result of each convolution layer before the last convolution layer of the neural network; if the judging unit judges that the current frame image is not the 1st frame image of any grouping, the second segmentation unit 323 down-samples the operation result of each convolution layer before the i-th convolution layer. That is, after the first segmentation unit 322 or the second segmentation unit 323 inputs the current frame image into the neural network and the 1st convolution layer has run, the operation result is down-sampled to reduce its resolution; the down-sampled result is then passed to the 2nd convolution layer, whose operation result is down-sampled in turn, and so on, up to the last convolution layer of the neural network (i.e. the bottleneck layer of the convolution layers) or the i-th convolution layer. Taking the case where the last convolution layer or the i-th convolution layer is the 4th convolution layer as an example, the first segmentation unit 322 or the second segmentation unit 323 performs no down-sampling on the operation result of the 4th convolution layer. By down-sampling the operation result of each convolution layer before the 4th one, the first segmentation unit 322 or the second segmentation unit 323 reduces the resolution of the frame image each convolution layer receives, which raises the operation speed of the neural network. Note that the input to the first convolution layer operation is the current frame image acquired in real time, which is not down-sampled, so that good detail of the current frame image is preserved. Down-sampling only the subsequent operation results does not disturb the details of the current frame image, yet still raises the operation speed of the neural network.
A determining module 330, adapted to determine a travel route and/or driving instruction according to the scene segmentation result.
Specifically, the specific objects may include vehicles, pedestrians, roads, obstacles and the like. Those skilled in the art may set the specific objects according to actual needs, which is not limited here. After the scene segmentation result corresponding to the current frame image has been obtained, the determining module 330 can determine from it the contour information of specific objects such as vehicles, pedestrians and roads. For example, having determined the contour information of vehicle 1 and of vehicle 2, the determining module 330 calculates from that contour information the relative positional relationship between the own vehicle and vehicle 1, and between the own vehicle and vehicle 2.
The relative positional relationship between the own vehicle and a specific object includes distance information, e.g. the determining module 330 determines that the straight-line distance between the own vehicle and vehicle 1 is 200 meters. It further includes angle information between the own vehicle and the specific object, e.g. the determining module 330 determines that the own vehicle is in the direction 10 degrees to the right rear of vehicle 1.
From the calculated relative positional relationships between the own vehicle and the specific objects, the determining module 330 can determine the travel route of the own vehicle within a preset time interval and/or its driving instruction. For example, if the calculated relative positional relationships show a pedestrian 10 meters directly ahead of the own vehicle, the determining module 330 may determine the driving instruction to be deceleration at 6 m/s²; or, if they show vehicle 1 at a distance of 200 meters directly ahead of the own vehicle and vehicle 2 at a distance of 2 meters in the direction 45 degrees to the left of the own vehicle, the determining module 330 may determine the travel route to be continuing along the route ahead.
The determining module 330 is further adapted to determine the travel route and/or driving instruction of the own vehicle according to the road sign information contained in the scene segmentation result.
Various road signs informations, such as caution sign are contained in scene cut result:Traffic circle, to the left racing
Curved, consecutive curve, Tunnel ahead etc.;Prohibitory sign:Forbid straight trip, No entry;Warning Mark:Speed limit is divided to Travel vehicle
Road allows to turn around;Road construction safety sign:Men working, the closing of left road etc.;Also fingerpost, tourism distinctive emblem, auxiliary
Help mark etc..Determining module 330 is according to these specific road signs informations, it may be determined that vehicle travel route and/or
Driving instruction.
For example, when the current vehicle speed is 100 km/h, the determining module 330 determines a deceleration instruction for the present vehicle according to road sign information in the scene segmentation result indicating a speed limit of 80 km/h 500 m ahead; or the determining module 330 determines that the present vehicle should move to the road on the right according to road sign information in the scene segmentation result indicating that the left road is closed 200 m ahead.
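A minimal sketch of the sign-based rule in these two examples. The sign tuple format and its labels are assumptions about what the segmentation stage might emit; they are not the patent's actual interface.

```python
def sign_instruction(speed_kmh, sign):
    """Map a recognised road sign to an instruction.
    `sign` is a hypothetical (kind, value, distance_m) tuple."""
    kind, value, distance_m = sign
    if kind == "speed_limit" and speed_kmh > value:
        return f"decelerate to {value} km/h within {distance_m} m"
    if kind == "lane_closed" and value == "left":
        return "move to the right-hand lane"
    return "no change"

# 100 km/h with an 80 km/h limit posted 500 m ahead -> decelerate.
print(sign_instruction(100, ("speed_limit", 80, 500)))
```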
The determining module 330 is further adapted to determine the travel route and/or the driving instruction according to the traffic light information contained in the scene segmentation result.
The scene segmentation result may contain traffic light information; according to it, the determining module 330 can determine travel routes and/or driving instructions such as continuing to travel along the current route or decelerating to a stop. For example, the determining module 330 determines that the present vehicle should decelerate and stop according to red light information 10 m ahead in the scene segmentation result; alternatively, the determining module 330 determines that the present vehicle should continue along the current road according to green light information 10 m ahead in the scene segmentation result.
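A sketch of such a traffic-light rule. The 6 m/s² deceleration is taken from the braking example earlier in this description; the stopping-distance check, labels and function name are illustrative assumptions, not part of the disclosed apparatus.

```python
def light_instruction(light, speed_mps, dist_m, decel=6.0):
    """Stop for a red light only if a constant deceleration of `decel`
    can stop the vehicle within `dist_m`: stopping distance = v^2 / (2a)."""
    if light == "green":
        return "continue along current road"
    stopping_m = speed_mps ** 2 / (2.0 * decel)
    if stopping_m <= dist_m:
        return "decelerate to a stop"
    return "proceed (cannot stop safely before the line)"

# At 8 m/s the vehicle needs about 5.3 m to stop at 6 m/s^2, so it can
# stop before a red light 10 m ahead.
print(light_instruction("red", 8.0, 10.0))
```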
The control module 340 is adapted to perform automatic driving control on the present vehicle according to the determined travel route and/or driving instruction.
After the determining module 330 determines the travel route and/or the driving instruction, the control module 340 can perform automatic driving control on the present vehicle according to them. Assuming the driving instruction determined by the determining module 330 is to decelerate at 6 m/s², the control module 340 performs automatic driving control on the present vehicle by controlling the brake system of the vehicle so that the vehicle decelerates at 6 m/s².
According to the automatic driving processing apparatus based on scene segmentation provided by the present invention, the current frame image in the video captured by an image capture device and/or in the recorded video of the vehicle's journey is acquired in real time; the current frame image is input into the trained neural network, and scene segmentation is performed on the current frame image according to its frame position in the group to which it belongs, obtaining the scene segmentation result of the current frame image; the travel route and/or the driving instruction is determined according to the scene segmentation result; and automatic driving control is performed on the vehicle according to the determined travel route and/or driving instruction. The present invention exploits the continuity and correlation between the frame images in a video: for scene segmentation the video is processed in groups, and each current frame image undergoes scene segmentation according to its frame position within its group. Specifically, within each group the neural network completes the operations of all convolution layers and all deconvolution layers only for the 1st frame image; for the other frame images, only the convolution layers up to the i-th layer are operated, and the operation result of the j-th deconvolution layer already obtained for the 1st frame image is reused for image fusion. This greatly reduces the amount of computation of the neural network and increases the speed of scene segmentation. Based on the obtained scene segmentation result, the relative position relationship between the present vehicle and specific objects such as other vehicles, pedestrians and roads can be calculated more accurately, and the travel route and/or the driving instruction can be determined more accurately from the calculated relative position relationship. Based on the traffic sign information and traffic light information contained in the obtained scene segmentation result, the vehicle can better observe traffic laws and regulations and drive automatically in a safe, accurate and law-abiding manner, which improves the safety of automatic driving and optimizes the automatic driving processing mode.
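The grouped-inference scheme just summarised — a full forward pass for the 1st frame of a group, then a truncated convolution pass plus reuse of the cached j-th deconvolution output for later frames, with i and j chosen from the frame distance in the spirit of claim 7 — can be sketched with toy scalar "layers". Everything below is an illustrative stand-in for the trained network; the layer functions, the fusion rule and the i/j mapping are assumptions, not the patented network.

```python
import math

N_CONV, N_DECONV = 4, 4
# Toy stand-ins for the trained network's layers: each layer is just a
# squashing function here; in the patent they are real conv/deconv layers.
conv = [lambda x, k=k: math.tanh(x + 0.1 * k) for k in range(N_CONV)]
deconv = [lambda x, k=k: math.tanh(x - 0.1 * k) for k in range(N_DECONV)]

def pick_i_j(frame_gap):
    """Illustrative reading of claim 7: as the frame distance grows, the
    i-th conv layer moves closer to the last conv layer (i grows) and the
    j-th deconv layer moves farther from the output layer (j shrinks)."""
    i = min(N_CONV - 1, frame_gap)
    j = max(0, N_DECONV - 1 - frame_gap)
    return i, j

def segment_group(frames):
    """1st frame of a group: full conv + deconv pass, caching every deconv
    output. Later frames: conv layers 0..i only, then fusion with the
    cached j-th deconv result of the 1st frame (fusion rule is a stand-in)."""
    cache, results = {}, []
    for gap, x in enumerate(frames):
        if gap == 0:
            for layer in conv:
                x = layer(x)
            for k, layer in enumerate(deconv):
                x = layer(x)
                cache[k] = x            # keep every deconv output for reuse
        else:
            i, j = pick_i_j(gap)
            for layer in conv[: i + 1]:  # truncated pass: conv layers 0..i
                x = layer(x)
            x = 0.5 * (x + cache[j])     # image-fusion stand-in
        results.append(x)
    return results

print(len(segment_group([0.0, 0.0, 0.0])))
```

The saving is that later frames skip conv layers i+1..N and all deconv layers up to j, which is where the reduced computation claimed in the description comes from.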
The present invention also provides a non-volatile computer storage medium. The computer storage medium stores at least one executable instruction, and the computer executable instruction can perform the automatic driving processing method based on scene segmentation in any of the above method embodiments.
Fig. 4 shows a structural diagram of a computing device according to an embodiment of the present invention; the specific embodiments of the present invention do not limit the specific implementation of the computing device.
As shown in Fig. 4, the computing device may include: a processor 402, a communications interface 404, a memory 406 and a communication bus 408.
Wherein:
The processor 402, the communications interface 404 and the memory 406 communicate with each other via the communication bus 408.
The communications interface 404 is for communicating with network elements of other devices such as clients or other servers.
The processor 402 is for executing a program 410, and may specifically perform the relevant steps in the above embodiments of the automatic driving processing method based on scene segmentation.
Specifically, the program 410 may include program code, and the program code includes computer operation instructions.
The processor 402 may be a central processing unit (CPU), an application-specific integrated circuit (ASIC), or one or more integrated circuits configured to implement the embodiments of the present invention. The one or more processors included in the computing device may be processors of the same type, such as one or more CPUs, or may be processors of different types, such as one or more CPUs and one or more ASICs.
The memory 406 is for storing the program 410. The memory 406 may include a high-speed RAM memory, and may further include a non-volatile memory, for example at least one disk memory.
The program 410 may specifically be used to cause the processor 402 to perform the automatic driving processing method based on scene segmentation in any of the above method embodiments. For the specific implementation of each step in the program 410, reference may be made to the corresponding description of the corresponding steps and units in the above embodiments of automatic driving processing based on scene segmentation, which will not be repeated here. Those skilled in the art can clearly understand that, for convenience and brevity of description, for the specific working process of the devices and modules described above, reference may be made to the description of the corresponding processes in the foregoing method embodiments, and details are not described here again.
The algorithms and displays provided herein are not inherently related to any particular computer, virtual system or other device. Various general-purpose systems may also be used with the teaching herein. The structure required to construct such a system is obvious from the description above. Moreover, the present invention is not directed to any particular programming language. It should be understood that the content of the invention described herein can be realized with various programming languages, and the above description of a specific language is intended to disclose the best mode of carrying out the invention.
In the specification provided here, numerous specific details are set forth. It is to be understood, however, that the embodiments of the present invention can be practiced without these specific details. In some instances, well-known methods, structures and techniques have not been shown in detail so as not to obscure the understanding of this description.
Similarly, it should be understood that, in order to streamline the disclosure and to aid the understanding of one or more of the various inventive aspects, in the above description of exemplary embodiments of the present invention the individual features of the invention are sometimes grouped together into a single embodiment, figure or description thereof. However, this method of disclosure is not to be interpreted as reflecting an intention that the claimed invention requires more features than are expressly recited in each claim. Rather, as the following claims reflect, inventive aspects lie in less than all features of a single foregoing disclosed embodiment. Thus, the claims following the detailed description are hereby expressly incorporated into this detailed description, with each claim standing on its own as a separate embodiment of the present invention.
Those skilled in the art will appreciate that the modules in the device in an embodiment can be adaptively changed and arranged in one or more devices different from that embodiment. The modules or units or components in an embodiment can be combined into one module or unit or component, and furthermore they can be divided into a plurality of sub-modules or sub-units or sub-components. Except where at least some of such features and/or processes or units are mutually exclusive, all features disclosed in this specification (including the accompanying claims, abstract and drawings) and all processes or units of any method or device so disclosed may be combined in any combination. Unless expressly stated otherwise, each feature disclosed in this specification (including the accompanying claims, abstract and drawings) can be replaced by an alternative feature serving the same, an equivalent or a similar purpose.
Furthermore, those skilled in the art will appreciate that, although some embodiments described herein include certain features included in other embodiments rather than other features, combinations of features of different embodiments are meant to be within the scope of the present invention and to form different embodiments. For example, in the following claims, any one of the claimed embodiments can be used in any combination.
The various component embodiments of the present invention can be realized in hardware, in software modules running on one or more processors, or in a combination thereof. Those skilled in the art will understand that a microprocessor or a digital signal processor (DSP) can be used in practice to realize some or all of the functions of some or all of the components of the apparatus for automatic driving processing based on scene segmentation according to the embodiments of the present invention. The present invention can also be implemented as a device or apparatus program (for example, a computer program and a computer program product) for performing part or all of the method described herein. Such a program realizing the present invention may be stored on a computer-readable medium, or may be in the form of one or more signals. Such signals may be downloaded from an Internet website, provided on a carrier signal, or provided in any other form.
It should be noted that the above embodiments illustrate rather than limit the present invention, and those skilled in the art can design alternative embodiments without departing from the scope of the appended claims. In the claims, any reference signs placed between parentheses shall not be construed as limiting the claims. The word "comprising" does not exclude the presence of elements or steps not listed in a claim. The word "a" or "an" preceding an element does not exclude the presence of a plurality of such elements. The present invention can be realized by means of hardware comprising several different elements and by means of a suitably programmed computer. In a unit claim enumerating several devices, several of these devices can be embodied by one and the same item of hardware. The use of the words first, second and third does not indicate any order; these words can be interpreted as names.
Claims (10)
1. An automatic driving processing method based on scene segmentation, wherein the method performs grouping processing on the frame images contained in a captured and/or recorded video, the method comprising:
acquiring in real time the current frame image in the video captured by an image capture device and/or in the recorded video of the vehicle's journey;
inputting the current frame image into a trained neural network, and performing scene segmentation on the current frame image according to its frame position in the group to which it belongs, to obtain the scene segmentation result of the current frame image;
determining a travel route and/or a driving instruction according to the scene segmentation result;
performing automatic driving control on the vehicle according to the determined travel route and/or driving instruction.
2. The method according to claim 1, wherein determining a travel route and/or a driving instruction according to the scene segmentation result further comprises:
determining profile information of a specific object according to the scene segmentation result;
calculating the relative position relationship between the present vehicle and the specific object according to the profile information of the specific object;
determining the travel route and/or the driving instruction according to the calculated relative position relationship.
3. The method according to claim 2, wherein the relative position relationship between the present vehicle and the specific object comprises distance information and/or angle information between the present vehicle and the specific object.
4. The method according to claim 1, wherein determining a travel route and/or a driving instruction according to the scene segmentation result further comprises:
determining the travel route of the present vehicle and/or the driving instruction according to the road sign information contained in the scene segmentation result.
5. The method according to claim 1, wherein determining a travel route and/or a driving instruction according to the scene segmentation result further comprises:
determining the travel route and/or the driving instruction according to the traffic light information contained in the scene segmentation result.
6. The method according to any one of claims 1-5, wherein inputting the current frame image into a trained neural network and performing scene segmentation on the current frame image according to its frame position in the group to which it belongs, to obtain the scene segmentation result of the current frame image, further comprises:
judging whether the current frame image is the 1st frame image of any group;
if so, inputting the current frame image into the trained neural network, and obtaining the scene segmentation result of the current frame image after the operations of all the convolution layers and deconvolution layers of the neural network;
if not, inputting the current frame image into the trained neural network; after operating up to the i-th convolution layer of the neural network to obtain the operation result of the i-th convolution layer, acquiring the operation result of the j-th deconvolution layer that was obtained when the 1st frame image of the group to which the current frame image belongs was input into the neural network, and directly performing image fusion on the operation result of the i-th convolution layer and the operation result of the j-th deconvolution layer, to obtain the scene segmentation result of the current frame image; wherein i and j are natural numbers.
7. The method according to claim 6, wherein, after judging that the current frame image is not the 1st frame image of any group, the method further comprises:
calculating the frame distance between the current frame image and the 1st frame image of the group to which it belongs;
determining the values of i and j according to the frame distance; wherein the layer distance between the i-th convolution layer and the last convolution layer is inversely proportional to the frame distance, and the layer distance between the j-th deconvolution layer and the output layer is proportional to the frame distance.
8. An automatic driving processing apparatus based on scene segmentation, wherein the apparatus performs grouping processing on the frame images contained in a captured and/or recorded video, the apparatus comprising:
an acquiring module, adapted to acquire in real time the current frame image in the video captured by an image capture device and/or in the recorded video of the vehicle's journey;
a segmenting module, adapted to input the current frame image into a trained neural network, and perform scene segmentation on the current frame image according to its frame position in the group to which it belongs, to obtain the scene segmentation result of the current frame image;
a determining module, adapted to determine a travel route and/or a driving instruction according to the scene segmentation result;
a control module, adapted to perform automatic driving control on the vehicle according to the determined travel route and/or driving instruction.
9. A computing device, comprising: a processor, a memory, a communications interface and a communication bus, wherein the processor, the memory and the communications interface communicate with each other via the communication bus;
the memory is for storing at least one executable instruction, and the executable instruction causes the processor to perform operations corresponding to the automatic driving processing method based on scene segmentation according to any one of claims 1-7.
10. A computer storage medium, wherein at least one executable instruction is stored in the storage medium, and the executable instruction causes a processor to perform operations corresponding to the automatic driving processing method based on scene segmentation according to any one of claims 1-7.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201711405705.1A CN108133484B (en) | 2017-12-22 | 2017-12-22 | Automatic driving processing method and device based on scene segmentation and computing equipment |
Publications (2)
Publication Number | Publication Date |
---|---|
CN108133484A true CN108133484A (en) | 2018-06-08 |
CN108133484B CN108133484B (en) | 2022-01-28 |
Family
ID=62391516
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201711405705.1A Active CN108133484B (en) | 2017-12-22 | 2017-12-22 | Automatic driving processing method and device based on scene segmentation and computing equipment |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN108133484B (en) |
Cited By (9)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN108985194A (en) * | 2018-06-29 | 2018-12-11 | 华南理工大学 | A kind of intelligent vehicle based on image, semantic segmentation can travel the recognition methods in region |
CN109165562A (en) * | 2018-07-27 | 2019-01-08 | 深圳市商汤科技有限公司 | Training method, crosswise joint method, apparatus, equipment and the medium of neural network |
CN109388135A (en) * | 2017-08-14 | 2019-02-26 | 通用汽车环球科技运作有限责任公司 | The autonomous operation learnt using depth space-time |
CN109919067A (en) * | 2019-02-27 | 2019-06-21 | 智慧海派科技有限公司 | A kind of image recognition method and safe automatic Pilot method |
CN111507158A (en) * | 2019-01-31 | 2020-08-07 | 斯特拉德视觉公司 | Method and device for detecting parking area by semantic segmentation |
CN111752265A (en) * | 2019-03-26 | 2020-10-09 | 通用汽车环球科技运作有限责任公司 | Hyper-correlation in a context memory |
CN112669335A (en) * | 2021-01-27 | 2021-04-16 | 东软睿驰汽车技术(沈阳)有限公司 | Vehicle sensing method and device, electronic equipment and machine-readable storage medium |
JP2022539697A (en) * | 2020-04-24 | 2022-09-13 | 株式会社ストラドビジョン | On-vehicle active learning method and apparatus for learning the perception network of an autonomous driving vehicle |
CN117111624A (en) * | 2023-10-23 | 2023-11-24 | 江苏苏启智能科技有限公司 | Anti-unmanned aerial vehicle method and system based on electromagnetic anti-control technology |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN105718870A (en) * | 2016-01-15 | 2016-06-29 | 武汉光庭科技有限公司 | Road marking line extracting method based on forward camera head in automatic driving |
CN107169468A (en) * | 2017-05-31 | 2017-09-15 | 北京京东尚科信息技术有限公司 | Method for controlling a vehicle and device |
CN107226087A (en) * | 2017-05-26 | 2017-10-03 | 西安电子科技大学 | A kind of structured road automatic Pilot transport vehicle and control method |
US20170316127A1 (en) * | 2016-04-29 | 2017-11-02 | Baidu Online Network Technology (Beijing) Co., Ltd. | Method and apparatus for constructing testing scenario for driverless vehicle |
Non-Patent Citations (1)
Title |
---|
Xizhou Zhu et al., "Deep Feature Flow for Video Recognition", 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) * |
Cited By (15)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109388135A (en) * | 2017-08-14 | 2019-02-26 | 通用汽车环球科技运作有限责任公司 | The autonomous operation learnt using depth space-time |
CN108985194B (en) * | 2018-06-29 | 2022-06-10 | 华南理工大学 | Intelligent vehicle travelable area identification method based on image semantic segmentation |
CN108985194A (en) * | 2018-06-29 | 2018-12-11 | 华南理工大学 | A kind of intelligent vehicle based on image, semantic segmentation can travel the recognition methods in region |
CN109165562A (en) * | 2018-07-27 | 2019-01-08 | 深圳市商汤科技有限公司 | Training method, crosswise joint method, apparatus, equipment and the medium of neural network |
CN109165562B (en) * | 2018-07-27 | 2021-06-04 | 深圳市商汤科技有限公司 | Neural network training method, lateral control method, device, equipment and medium |
CN111507158B (en) * | 2019-01-31 | 2023-11-24 | 斯特拉德视觉公司 | Method and device for detecting parking area by using semantic segmentation |
CN111507158A (en) * | 2019-01-31 | 2020-08-07 | 斯特拉德视觉公司 | Method and device for detecting parking area by semantic segmentation |
CN109919067A (en) * | 2019-02-27 | 2019-06-21 | 智慧海派科技有限公司 | A kind of image recognition method and safe automatic Pilot method |
CN111752265A (en) * | 2019-03-26 | 2020-10-09 | 通用汽车环球科技运作有限责任公司 | Hyper-correlation in a context memory |
CN111752265B (en) * | 2019-03-26 | 2024-04-05 | 通用汽车环球科技运作有限责任公司 | Super-association in context memory |
JP2022539697A (en) * | 2020-04-24 | 2022-09-13 | 株式会社ストラドビジョン | On-vehicle active learning method and apparatus for learning the perception network of an autonomous driving vehicle |
JP7181654B2 (en) | 2020-04-24 | 2022-12-01 | 株式会社ストラドビジョン | On-vehicle active learning method and apparatus for learning the perception network of an autonomous driving vehicle |
CN112669335A (en) * | 2021-01-27 | 2021-04-16 | 东软睿驰汽车技术(沈阳)有限公司 | Vehicle sensing method and device, electronic equipment and machine-readable storage medium |
CN117111624A (en) * | 2023-10-23 | 2023-11-24 | 江苏苏启智能科技有限公司 | Anti-unmanned aerial vehicle method and system based on electromagnetic anti-control technology |
CN117111624B (en) * | 2023-10-23 | 2024-02-02 | 江苏苏启智能科技有限公司 | Anti-unmanned aerial vehicle method and system based on electromagnetic anti-control technology |
Also Published As
Publication number | Publication date |
---|---|
CN108133484B (en) | 2022-01-28 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN108133484A (en) | Automatic Pilot processing method and processing device based on scene cut, computing device | |
CN107944375A (en) | Automatic Pilot processing method and processing device based on scene cut, computing device | |
US10901416B2 (en) | Scene creation system for autonomous vehicles and methods thereof | |
CN108931927B (en) | The creation method and device of unmanned simulating scenes | |
CN110377025A (en) | Sensor aggregation framework for automatic driving vehicle | |
CN107679489A (en) | Automatic Pilot processing method, device and computing device based on scene cut | |
CN108196535A (en) | Automated driving system based on enhancing study and Multi-sensor Fusion | |
EP2372605A2 (en) | Image processing system and position measurement system | |
US20140058652A1 (en) | Traffic information processing | |
CN107389080A (en) | A kind of vehicle route air navigation aid and electronic equipment | |
CN104615897B (en) | Road section travel time estimation method based on low-frequency GPS data | |
CN104613974B (en) | Navigation time acquisition methods, system, air navigation aid and device | |
CN110349416A (en) | The traffic light control system based on density for automatic driving vehicle (ADV) | |
CN109387210A (en) | Automobile navigation method and its device | |
CN112710317A (en) | Automatic driving map generation method, automatic driving method and related product | |
CN112885130B (en) | Method and device for presenting road information | |
CN106340190A (en) | Method, device and system for determining traffic light timing | |
CN111222522B (en) | Neural network training, road surface detection and intelligent driving control method and device | |
CN110276378A (en) | The improved method that example is divided based on unmanned technology | |
CN111161545B (en) | Intersection region traffic parameter statistical method based on video | |
CN113335310A (en) | Decision-based exercise planning method and device, electronic equipment and storage medium | |
CN114495060B (en) | Road traffic marking recognition method and device | |
CN112801012B (en) | Traffic element processing method and device, electronic equipment and storage medium | |
CN108269400A (en) | A kind of trunk roads signalized crossing delay evaluation method and device | |
JP2000193470A (en) | Route searching device and method and medium storing program for route searching |
Legal Events
Date | Code | Title | Description
---|---|---|---
| PB01 | Publication | |
| SE01 | Entry into force of request for substantive examination | |
| GR01 | Patent grant | |