CN110084182A - A distracted driving recognition method based on 3D convolutional neural networks - Google Patents
- Publication number
- CN110084182A
- Authority
- CN
- China
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/24—Classification techniques
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
- G06N3/084—Backpropagation, e.g. using gradient descent
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V20/00—Scenes; Scene-specific elements
- G06V20/50—Context or environment of the image
- G06V20/52—Surveillance or monitoring of activities, e.g. for recognising suspicious objects
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V20/00—Scenes; Scene-specific elements
- G06V20/50—Context or environment of the image
- G06V20/59—Context or environment of the image inside of a vehicle, e.g. relating to seat occupancy, driver state or inner lighting conditions
- G06V20/597—Recognising the driver's state or behaviour, e.g. attention or drowsiness
Abstract
The invention discloses a distracted driving recognition method based on 3D convolutional neural networks. It belongs to the fields of deep learning and pattern recognition, and relates in particular to a distracted driving recognition method based on deep convolutional neural networks. Driving-posture pictures are stacked to build the input layer. A convolution is first applied to the picture cube, forming convolutional layer C1. Convolutions with two branches of different kernels, each followed by max pooling, are then applied four times in succession, forming convolutional layers C2, C3, C4 and C5. Finally, the two groups of feature maps output by C5 are merged and passed in turn through two fully connected layers, F1 and F2, and softmax is computed to output the corresponding distracted driving class. The invention stacks 2D pictures to build a 3D input and extracts features with two branches of different convolution kernels; the network has the advantages of strong generalization ability and high recognition accuracy.
Description
Technical field
The invention belongs to the fields of deep learning and pattern recognition, and relates in particular to a distracted driving recognition method based on deep convolutional neural networks.
Background technique
According to the definition of the International Organization for Standardization, distracted driving is the phenomenon in which attention is directed, while driving, to activities unrelated to normal driving, reducing the driver's ability to operate the vehicle. Common distracted driving behaviors include making phone calls, playing with a mobile phone, drinking water, and chatting with passengers while driving. Take mobile phone use as an example: when a notification arrives, the driver usually shifts their gaze automatically from the road to the phone screen. A glance at the phone typically takes 3 seconds. Assuming the vehicle travels at 60 km/h, it covers 50 meters in those 3 seconds with the driver effectively blind, which is very dangerous if an emergency arises. China's Regulations on the Implementation of the Road Traffic Safety Law stipulate that a person driving a motor vehicle must not make or answer calls on a hand-held phone, watch television, or engage in other behavior that interferes with safe driving.
At present, intelligent analysis of driving behavior is mostly based on traditional image processing, with image classifiers built using support vector machines. Research in recent years shows that deep learning methods can greatly improve the accuracy of image classification and prediction. The present invention uses a 3D deep neural network to predict distracted driving behavior, which can help regulate driving behavior and improve road traffic safety.
Summary of the invention
The purpose of the invention is to propose a distracted driving recognition method based on 3D convolutional neural networks.
Technical solution of the invention: the method stacks driving-posture pictures to build the input layer. A convolution is first applied to the picture cube, forming convolutional layer C1. Convolutions with two branches of different kernels, each followed by max pooling, are then applied four times in succession, forming convolutional layers C2, C3, C4 and C5. Finally, the two groups of feature maps output by C5 are merged and passed in turn through two fully connected layers, F1 and F2, and softmax is computed to output the corresponding distracted driving class.
The specific steps are as follows:
Step 1: Define n classes of distracted driving behavior. Scale all pictures uniformly to 300*200, then stack distracted driving pictures of the same class, converting the 2D picture input into a 3D input.
Step 2: Train the 3D convolutional neural network with the training samples obtained in step 1.
Step 2.1: Apply a convolution to the input cube; this is convolutional layer C1.
Step 2.2: Apply convolutions with two branches of different kernels to the feature map cube output by step 2.1, then apply max pooling. Repeat four times in succession; these are convolutional layers C2, C3, C4 and C5.
Step 2.3: Merge the two feature map cubes output by step 2.2 into one feature map cube.
Step 2.4: Apply two successive fully connected computations to the feature map cube output by step 2.3; these are fully connected layers F1 and F2.
Step 2.5: Compute softmax and the loss from the output of step 2.4, and update the network parameters by backpropagating the loss. Repeat steps 2.1 to 2.5 until the loss converges.
Step 3: Replicate and stack the test picture to build a 3D cube. Classify it with the 3D convolutional neural network obtained in step 2.
Step 2.2 above uses two branches of different convolution kernels: the C2 kernels are 64@8*8*3 and 64@6*6*2; the C3 kernels are 128@5*3*2 and 128@7*3*3; the C4 kernels are 256@6*3*3 and 256@5*3*1; the C5 kernels are 512@3*3*3 and 512@6*5*3.
Beneficial effects of the invention:
1. Deep network structure: features are extracted with two branches of differently sized convolution kernels, which improves the network's generalization ability.
2. Model adaptability: the invention builds the input layer by stacking pictures, so it is equally compatible with video input; it is only necessary to select several frames from the video to build the data cube.
3. Practical deployment: the method does not interfere with normal driving; distracted driving behavior can be recognized using traffic authority surveillance cameras or in-vehicle cameras.
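The video case in item 2 can be sketched as follows. The text only says that several frames are selected; evenly spaced sampling, the function name `sample_frame_indices`, and the default depth of 30 (taken from the embodiment) are illustrative assumptions.

```python
def sample_frame_indices(num_frames, depth=30):
    """Pick `depth` evenly spaced frame indices from a clip of `num_frames`.

    Even spacing is an assumption; the source does not say how frames
    are chosen.
    """
    if num_frames < depth:
        raise ValueError("clip is shorter than the required depth")
    step = num_frames / depth
    return [int(i * step) for i in range(depth)]

# Hypothetical 300-frame clip: every 10th frame is selected.
indices = sample_frame_indices(300)
print(indices[:3])  # [0, 10, 20]
```

The selected frames would then be scaled and stacked exactly like still pictures.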
Brief description of the drawings
Fig. 1 is a flow chart of the invention.
Fig. 2 is a structure diagram of the 3D convolutional neural network of the invention.
Detailed description of the embodiments
A distracted driving recognition method based on 3D convolutional neural networks is implemented through the following steps:
Step 1: Suppose there are n classes of distracted driving behavior. Scale all pictures uniformly to 300*200 and stack 30 distracted driving pictures of the same class. The resulting cube has size 3@300*200*30, where 30 is the number of stacked pictures, 300*200 is the spatial size, and 3 is the number of channels.
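A minimal sketch of the stacking in step 1, assuming NumPy and pictures already decoded and scaled into arrays of shape (200, 300, 3), that is (height, width, channels). The function name `stack_pictures` and the axis order are illustrative assumptions, chosen only to mirror the 3@300*200*30 notation.

```python
import numpy as np

def stack_pictures(pictures):
    """Stack same-class 2D pictures into one channels-first 3D cube.

    Each picture is assumed to have shape (height, width, channels).
    The result has shape (channels, width, height, depth).
    """
    cube = np.stack(pictures, axis=-1)   # (H, W, C, D)
    return cube.transpose(2, 1, 0, 3)    # (C, W, H, D)

# Hypothetical input: 30 blank pictures already scaled to 300*200.
pictures = [np.zeros((200, 300, 3), dtype=np.float32) for _ in range(30)]
cube = stack_pictures(pictures)
print(cube.shape)  # (3, 300, 200, 30)
```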
Step 2: Train the 3D convolutional neural network with the training samples obtained in step 1.
Step 2.1: C1 is the first layer of the network; it applies convolution and max pooling to the input cube. The convolution kernel size is 32@11*7*2, where 2 is the temporal size, 11*7 is the spatial size, and 32 is the number of kernels; the stride is (1,1,1). The feature map output by the convolution has size 32@(300-11+1)*(200-7+1)*(30-2+1)=32@290*194*29. Max pooling follows the convolution, with a sampling window of 2*2*1, where 1 is the temporal length and 2*2 is the spatial size. The pooled feature map has size 32@(290/2)*(194/2)*(29/1)=32@145*97*29, so the final output of layer C1 is 32@145*97*29. The convolution formula is as follows:
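The formula itself is not reproduced in this text. A common form of the 3D convolution, given here as an assumption rather than as the patent's own expression, is:

```latex
v_{ij}^{xyz} = f\Big( b_{ij} + \sum_{m} \sum_{p=0}^{P_i-1} \sum_{q=0}^{Q_i-1} \sum_{r=0}^{R_i-1} w_{ijm}^{pqr} \, v_{(i-1)m}^{(x+p)(y+q)(z+r)} \Big)
```

Here v_{ij}^{xyz} is the value at position (x, y, z) of the j-th feature map in layer i, m indexes the feature maps of layer i-1, P_i*Q_i*R_i is the kernel size (two spatial dimensions and one temporal dimension), w_{ijm}^{pqr} is a kernel weight, b_{ij} is a bias, and f is an activation function. The bias and the activation are assumptions; the source does not name them.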
Step 2.2:
C2 is the second layer of the network; it applies convolutions with two branches of different kernels to the feature map cube. The upper-branch kernel size is 64@8*8*3 with stride (1,1,1). The feature map output by the convolution has size 64@(145-8+1)*(97-8+1)*(29-3+1)=64@138*90*27. The C2 upper-branch pooling window is 3*3*1, and the pooled feature map has size 64@(138/3)*(90/3)*(27/1)=64@46*30*27. The lower-branch kernel size is 64@6*6*2. The feature map output by the convolution has size 64@(145-6+1)*(97-6+1)*(29-2+1)=64@140*92*28. The C2 lower-branch pooling window is 2*2*2, and the pooled feature map has size 64@(140/2)*(92/2)*(28/2)=64@70*46*14.
C3 is the third layer of the network; it applies convolution and pooling separately to each of the feature maps output by C2. The upper-branch kernel size is 128@5*3*2. The feature map output by the convolution has size 128@(46-5+1)*(30-3+1)*(27-2+1)=128@42*28*26. The C3 upper-branch pooling window is 2*2*1, and the pooled feature map has size 128@(42/2)*(28/2)*(26/1)=128@21*14*26. The lower-branch kernel size is 128@7*3*3. The feature map output by the convolution has size 128@(70-7+1)*(46-3+1)*(14-3+1)=128@64*44*12. The C3 lower-branch pooling window is 2*2*1, and the pooled feature map has size 128@(64/2)*(44/2)*(12/1)=128@32*22*12.
C4 is the fourth layer of the network; it again applies independent convolution and pooling to the feature maps of the upper and lower branches. The upper-branch kernel size is 256@6*3*3. The feature map output by the convolution has size 256@(21-6+1)*(14-3+1)*(26-3+1)=256@16*12*24. The C4 upper-branch pooling window is 2*2*2, and the pooled feature map has size 256@(16/2)*(12/2)*(24/2)=256@8*6*12. The lower-branch kernel size is 256@5*3*1. The feature map output by the convolution has size 256@(32-5+1)*(22-3+1)*(12-1+1)=256@28*20*12. The C4 lower-branch pooling window is 2*2*1, and the pooled feature map has size 256@(28/2)*(20/2)*(12/1)=256@14*10*12.
C5 is the fifth layer of the network; it again processes the feature maps of the upper and lower branches separately. The upper-branch kernel size is 512@3*3*3, that is, 512 kernels. The feature map output by the convolution has size 512@(8-3+1)*(6-3+1)*(12-3+1)=512@6*4*10. The C5 upper-branch sampling window is 2*2*2, and the pooled feature map has size 512@(6/2)*(4/2)*(10/2)=512@3*2*5. The lower-branch kernel size is 512@6*5*3. The feature map output by the convolution has size 512@(14-6+1)*(10-5+1)*(12-3+1)=512@9*6*10. The C5 lower-branch pooling window is 3*3*2, and the pooled feature map has size 512@(9/3)*(6/3)*(10/2)=512@3*2*5.
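The size arithmetic for C1 to C5 can be checked with a short sketch. It assumes stride-1 convolutions without padding and non-overlapping pooling, and it uses the kernel sizes implied by the worked arithmetic above (temporal size 2 for C1, and 6*3*3 for the C4 upper branch). The helper names are illustrative, not part of the patent.

```python
def conv_out(size, kernel):
    """Per-axis output size of a stride-1 convolution without padding."""
    return tuple(s - k + 1 for s, k in zip(size, kernel))

def pool_out(size, window):
    """Per-axis output size of non-overlapping pooling; must divide evenly."""
    assert all(s % w == 0 for s, w in zip(size, window)), (size, window)
    return tuple(s // w for s, w in zip(size, window))

def run_branch(size, layers):
    """Apply (kernel, pooling window) pairs in order and return the final size."""
    for kernel, window in layers:
        size = pool_out(conv_out(size, kernel), window)
    return size

# (kernel, pooling window) per layer, as used in the worked arithmetic.
C1 = [((11, 7, 2), (2, 2, 1))]
UPPER = [((8, 8, 3), (3, 3, 1)), ((5, 3, 2), (2, 2, 1)),
         ((6, 3, 3), (2, 2, 2)), ((3, 3, 3), (2, 2, 2))]
LOWER = [((6, 6, 2), (2, 2, 2)), ((7, 3, 3), (2, 2, 1)),
         ((5, 3, 1), (2, 2, 1)), ((6, 5, 3), (3, 3, 2))]

c1 = run_branch((300, 200, 30), C1)
print(c1)                     # (145, 97, 29)
print(run_branch(c1, UPPER))  # (3, 2, 5)
print(run_branch(c1, LOWER))  # (3, 2, 5)
```

Both branches end at 3*2*5, which is what allows their outputs to be merged along the channel dimension.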
Step 2.3: The two equally sized feature maps from the upper and lower branches of C5 are merged along the channel dimension into a single feature map. Only the number of channels changes; the other dimensions are unchanged, and the merged feature map has size 1024@3*2*5.
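A minimal sketch of the merge in step 2.3, assuming NumPy and a channels-first layout. Reading "merge" as channel-wise concatenation is an assumption, though it is consistent with 512 + 512 = 1024 channels; the placeholder arrays stand in for the real C5 outputs.

```python
import numpy as np

# Placeholder C5 outputs of the two branches, shape (channels, 3, 2, 5).
upper = np.zeros((512, 3, 2, 5), dtype=np.float32)
lower = np.ones((512, 3, 2, 5), dtype=np.float32)

# Concatenate along the channel axis; the other dimensions are unchanged.
merged = np.concatenate([upper, lower], axis=0)
print(merged.shape)  # (1024, 3, 2, 5)
```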
Step 2.4: F1 is the first fully connected layer, with 4096 neurons, each of dimension 1024*3*2*5; its output has size 1*1*1*4096. F2 is the second fully connected layer and the output layer of the network. F2 uses n neurons of dimension 1*1*1*4096 to apply a fully connected computation to the output of F1; its output has size 1*1*1*n.
The fully connected formula is as follows:
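The formula itself is not reproduced in this text. A standard fully connected computation, given here as an assumption, is:

```latex
y_j = f\Big( \sum_{i=1}^{N} w_{ji} \, x_i + b_j \Big), \qquad j = 1, \dots, M
```

Here x is the flattened input, w_{ji} and b_j are the weights and bias of neuron j, and f is an activation function that the source does not specify. For F1, N = 1024*3*2*5 = 30720 and M = 4096; for F2, N = 4096 and M = n.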
Step 2.5: Compute softmax and the loss from the output of step 2.4, and update the network parameters by backpropagating the loss. Repeat steps 2.1 to 2.5 until the loss converges. The formulas for softmax and the loss are as follows:
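The formulas are not reproduced in this text. The sketch below shows the usual softmax together with a cross-entropy loss; the choice of cross-entropy is an assumption, since the source only says "loss", and the function names and example scores are illustrative.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of n class scores."""
    m = max(logits)                      # subtract the max to avoid overflow
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def cross_entropy(probs, label):
    """Negative log-probability assigned to the true class index."""
    return -math.log(probs[label])

# Hypothetical F2 output for n = 3 classes, with true class 0.
probs = softmax([2.0, 1.0, 0.1])
loss = cross_entropy(probs, 0)
print(probs, loss)
```

During training, this loss would be backpropagated to update the network parameters, as step 2.5 describes.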
Step 3: Make 30 copies of the test picture and stack them to build the 3D cube input. Classify it with the 3D convolutional neural network obtained in step 2.
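The replication in step 3 can be sketched as follows, assuming NumPy and a test picture already scaled to shape (200, 300, 3), that is (height, width, channels). The function name `replicate_picture` and the axis order are illustrative assumptions.

```python
import numpy as np

def replicate_picture(picture, depth=30):
    """Copy one picture `depth` times along a new last axis.

    The input is assumed to have shape (height, width, channels); the
    result has shape (channels, width, height, depth).
    """
    cube = np.repeat(picture[..., np.newaxis], depth, axis=-1)  # (H, W, C, D)
    return cube.transpose(2, 1, 0, 3)                           # (C, W, H, D)

# Hypothetical test picture with random pixel values.
picture = np.random.rand(200, 300, 3).astype(np.float32)
test_cube = replicate_picture(picture)
print(test_cube.shape)  # (3, 300, 200, 30)
```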
Claims (3)
1. A distracted driving recognition method based on 3D convolutional neural networks, characterized in that the method comprises:
Step 1: defining n classes of distracted driving behavior; preprocessing the original driving pictures to convert the 2D picture input into a 3D input;
Step 2: training the 3D convolutional neural network with the training samples obtained in step 1;
Step 2.1: applying a convolution to the input cube, this being convolutional layer C1,
Step 2.2: applying convolutions with two branches of different kernels to the feature map cube output by step 2.1, then applying max pooling, and repeating four times in succession, these being convolutional layers C2, C3, C4 and C5,
Step 2.3: merging the two feature map cubes output by step 2.2 into one feature map cube,
Step 2.4: applying two successive fully connected computations to the feature map cube output by step 2.3, these being fully connected layers F1 and F2,
Step 2.5: computing softmax and the loss from the output of step 2.4, and updating the network parameters by backpropagating the loss; repeating steps 2.1 to 2.5 until the loss converges;
Step 3: replicating and stacking the test picture to build a 3D cube; classifying it with the 3D convolutional neural network obtained in step 2.
2. The distracted driving recognition method based on 3D convolutional neural networks as claimed in claim 1, characterized in that converting the 2D picture input into a 3D input in step 1 comprises: first scaling the pictures to a uniform size, then stacking distracted driving pictures of the same class and feeding them to the 3D network.
3. The distracted driving recognition method based on 3D convolutional neural networks as claimed in claim 1, wherein step 2.2 uses two branches of different convolution kernels, characterized in that: the C2 kernels are 64@8*8*3 and 64@6*6*2; the C3 kernels are 128@5*3*2 and 128@7*3*3; the C4 kernels are 256@6*3*3 and 256@5*3*1; the C5 kernels are 512@3*3*3 and 512@6*5*3.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910335667.XA CN110084182A (en) | 2019-04-24 | 2019-04-24 | A distracted driving recognition method based on 3D convolutional neural networks |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910335667.XA CN110084182A (en) | 2019-04-24 | 2019-04-24 | A distracted driving recognition method based on 3D convolutional neural networks |
Publications (1)
Publication Number | Publication Date |
---|---|
CN110084182A true CN110084182A (en) | 2019-08-02 |
Family
ID=67416509
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201910335667.XA Pending CN110084182A (en) | 2019-04-24 | 2019-04-24 | A distracted driving recognition method based on 3D convolutional neural networks |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN110084182A (en) |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN108171176A (en) * | 2017-12-29 | 2018-06-15 | 中车工业研究院有限公司 | A kind of subway driver's emotion identification method and device based on deep learning |
CN108182441A (en) * | 2017-12-29 | 2018-06-19 | 华中科技大学 | Parallel multichannel convolutive neural network, construction method and image characteristic extracting method |
CN108875674A (en) * | 2018-06-29 | 2018-11-23 | 东南大学 | A kind of driving behavior recognition methods based on multiple row fusion convolutional neural networks |
CN109376634A (en) * | 2018-10-15 | 2019-02-22 | 北京航天控制仪器研究所 | A kind of Bus driver unlawful practice detection system neural network based |
-
2019
- 2019-04-24 CN CN201910335667.XA patent/CN110084182A/en active Pending
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111860427A (en) * | 2020-07-30 | 2020-10-30 | 重庆邮电大学 | Driving distraction identification method based on lightweight class eight-dimensional convolutional neural network |
CN111860427B (en) * | 2020-07-30 | 2022-07-01 | 重庆邮电大学 | Driving distraction identification method based on lightweight class eight-dimensional convolutional neural network |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN106407931B (en) | A kind of depth convolutional neural networks moving vehicle detection method | |
WO2020181685A1 (en) | Vehicle-mounted video target detection method based on deep learning | |
DE102019115707A1 (en) | SPATIAL AND TIMELINE ATTENTION-BASED DEPTH LEARNING LEARNING OF HIERARCHICAL Lane-changing Strategies for Controlling an Autonomous Vehicle | |
CN107368890A (en) | A kind of road condition analyzing method and system based on deep learning centered on vision | |
DE102019115809A1 (en) | METHOD AND SYSTEM FOR THE CONTINUOUS LEARNING OF CONTROL COMMANDS FOR AUTONOMOUS VEHICLES | |
CN109977793A (en) | Trackside image pedestrian's dividing method based on mutative scale multiple features fusion convolutional network | |
CN110298262A (en) | Object identification method and device | |
CN109740463A (en) | A kind of object detection method under vehicle environment | |
CN106651913A (en) | Target tracking method based on correlation filtering and color histogram statistics and ADAS (Advanced Driving Assistance System) | |
CN109147368A (en) | Intelligent driving control method device and electronic equipment based on lane line | |
CN108985269A (en) | Converged network driving environment sensor model based on convolution sum cavity convolutional coding structure | |
CN109584507A (en) | Driver behavior modeling method, apparatus, system, the vehicles and storage medium | |
CN107886073A (en) | A kind of more attribute recognition approaches of fine granularity vehicle based on convolutional neural networks | |
CN106297297A (en) | Traffic jam judging method based on degree of depth study | |
CN106407903A (en) | Multiple dimensioned convolution neural network-based real time human body abnormal behavior identification method | |
CN107274445A (en) | A kind of image depth estimation method and system | |
CN107133974A (en) | The vehicle type classification method that Gaussian Background modeling is combined with Recognition with Recurrent Neural Network | |
Ou et al. | Enhancing driver distraction recognition using generative adversarial networks | |
CN110210474A (en) | Object detection method and device, equipment and storage medium | |
CN111275638B (en) | Face repairing method for generating confrontation network based on multichannel attention selection | |
CN110414421B (en) | Behavior identification method based on continuous frame images | |
CN110197152A (en) | A kind of road target recognition methods for automated driving system | |
CN106570444A (en) | On-board smart prompting method and system based on behavior identification | |
CN108205649A (en) | Driver drives to take the state identification method and device of phone | |
CN110363093A (en) | A kind of driver's action identification method and device |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
RJ01 | Rejection of invention patent application after publication | Application publication date: 20190802 |