CN107679462A - A wavelet-based deep multi-feature fusion classification method - Google Patents

A wavelet-based deep multi-feature fusion classification method

Info

Publication number
CN107679462A
CN107679462A (application CN201710823051.8A)
Authority
CN
China
Prior art keywords
multiple features
layer
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201710823051.8A
Other languages
Chinese (zh)
Other versions
CN107679462B (en)
Inventor
于刚 (Yu Gang)
李艇 (Li Ting)
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shenzhen Graduate School Harbin Institute of Technology
Original Assignee
Shenzhen Graduate School Harbin Institute of Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shenzhen Graduate School Harbin Institute of Technology
Priority to CN201710823051.8A
Publication of CN107679462A
Application granted
Publication of CN107679462B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 20/00 Scenes; Scene-specific elements
    • G06V 20/40 Scenes; Scene-specific elements in video content
    • G06V 20/41 Higher-level, semantic clustering, classification or understanding of video scenes, e.g. detection, labelling or Markovian modelling of sport events or news items
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 18/00 Pattern recognition
    • G06F 18/20 Analysing
    • G06F 18/25 Fusion techniques
    • G06F 18/253 Fusion techniques of extracted features
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/04 Architecture, e.g. interconnection topology
    • G06N 3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/08 Learning methods
    • G06N 3/084 Backpropagation, e.g. using gradient descent


Abstract

The invention provides a wavelet-based deep multi-feature fusion classification method comprising an offline training stage and an online recognition stage. In the offline training stage, a convolutional neural network is built and trained on samples of n labelled classes; a discrete wavelet transform is applied at the final convolutional layer and the fully connected layer to decompose the deep multi-feature maps, and the resulting high- and low-frequency components are linearly fused to obtain the optimal weights. In the online recognition stage, the convolutional neural network, combined with a support vector machine, recognizes and classifies the actions in images and videos. The beneficial effect of the invention is improved classification and recognition accuracy for images and videos.

Description

A wavelet-based deep multi-feature fusion classification method
Technical field
The present invention relates to robot vision and image processing, and more particularly to a wavelet-based deep multi-feature fusion classification method.
Background technology
In recent years, deep learning has become one of the hottest topics in science and technology. It has gradually transformed algorithm design in fields such as speech recognition, image classification and text understanding, giving rise to a new end-to-end paradigm in which a model is trained directly from data and outputs the final result. With the arrival of the big-data era and the development of ever more powerful computing devices such as GPUs, deep learning has been further strengthened: it can fully exploit massive data and automatically distill raw data into abstract knowledge representations. Among deep learning architectures, the convolutional neural network is the most widely used.
As convolutional neural network architectures keep expanding, networks grow ever deeper and the feature maps extracted by each module multiply. Simply flattening the convolutional layers into a vector and then fully connecting it is not only computationally expensive but also blurs the features, which harms the classification and recognition accuracy for images and videos.
Summary of the invention
To solve these problems in the prior art, the invention provides a wavelet-based deep multi-feature fusion classification method that improves the classification and recognition accuracy for images and videos.
The invention provides a wavelet-based deep multi-feature fusion classification method comprising an offline training stage and an online recognition stage. In the offline training stage, a convolutional neural network is built and trained on samples of n labelled classes; a discrete wavelet transform is applied at the final convolutional layer and the fully connected layer to decompose the deep multi-feature maps, and the resulting high- and low-frequency components are linearly fused to obtain the optimal weights. In the online recognition stage, the convolutional neural network, combined with a support vector machine, recognizes and classifies the actions in images and videos.
As a further improvement of the present invention, the offline training stage comprises the following steps:
Step 1: build a convolutional neural network for training;
Step 2: set up 3 channels in the first layer: 1 grayscale channel and 2 optical-flow channels, where the grayscale channel contains the grayscale image group of the video clip and the optical-flow channels contain the motion information between two adjacent frames of the video clip;
Step 3: build the multi-module convolutional neural network;
Step 4: using the discrete wavelet transform, extract high- and low-frequency components from the feature maps of the fully connected layer of each module, and fuse the high- and low-frequency components across the three modules;
Step 5: concatenate the fused high- and low-frequency components through a merge layer and fully connect them to the next layer, obtaining one group of 128-dimensional feature maps;
Step 6: set n output nodes corresponding to the n behavior classes, each node fully connected to all feature maps of the previous layer;
Step 7: adjust the parameters between the layers by back-propagation so that the error between each sample's output and its label decreases; once the error meets the requirement, training is finished, and each output vector is then labelled with the behavior name of its corresponding sample video.
As a further improvement of the present invention, the online recognition stage comprises the following steps:
Step 8: input the video stream to be recognized, apply the preprocessing of step 1 to the video, load the weights of the optimal model obtained in offline training, pass the video stream through the network layers of steps 2 to 7, and extract the feature vector;
Step 9: classify the feature vector from step 8 with a support vector machine, find the best-matching label, and obtain the optimal accuracy.
As a further improvement of the present invention, the method comprises the following steps:
S1: obtain training sample images;
S2: preprocess the images;
S3: build the grayscale and optical-flow multichannel network channels;
S4: build the grayscale, optical-flow-x and optical-flow-y channel networks respectively;
S5: apply the discrete wavelet transform to the feature maps of the final fully connected layer of each channel;
S6: extract the high- and low-frequency components and fuse features across channels;
S7: concatenate the fused features through a merge layer;
S8: train and extract the optimal weights;
S9: feed the video into the trained optimal model for feature extraction;
S10: perform online recognition with a support vector machine.
As a further improvement of the present invention, in step S1 the training samples and sample labels are obtained from a data set; in step S2 the resolution of the video streams in the training set is unified using Lanczos interpolation, in which the eight neighboring points along the x and y directions are interpolated, i.e. a weighted sum is computed; the window function of the Lanczos method is:
L(x) = sinc(x) sinc(x/a), if -a < x < a; 0 otherwise
Its two-dimensional form is then: L(x, y) = L(x) L(y).
As a further improvement of the present invention, in step S3 a grayscale channel is established by converting the video stream to grayscale, the grayscale images retaining the most basic information of the original images; optical-flow channels in the x and y directions are established by extracting inter-frame motion information from the video stream; the inter-frame optical flow is extracted with an improved Lucas-Kanade (L-K) method in which convolution kernels replace pyramid down-sampling; first the partial derivatives fx, fy, ft are obtained from f(x, y, t), with Prewitt filters as the convolution kernels, i.e.:
Ix = I*Dx, Iy = I*Dy, It = I*Dt
The velocity is then estimated by least squares:
E(u1, u2) = Σ_{x,y} g(x, y) [u1 fx(x, y, t) + u2 fy(x, y, t) + ft(x, y, t)]^2
As a further improvement of the present invention, in step S4 each channel is down-sampled so that the picture size becomes 150*100; five convolutional layers and three pooling layers are built, followed by one fully connected layer; the first convolutional layer uses 5*5*5 kernels and the later convolutional layers use 3*3*3 kernels, with the stride set to 1; the pooling layers use 3D max-pooling with kernels of 2*2*2 and 2*2*1, and the activation function is ReLU.
As a further improvement of the present invention, in step S5 the high- and low-frequency components are extracted from the feature maps of the final fully connected layer of each channel with the discrete wavelet transform; the continuous wavelet function ψ_{a,b}(t) can be written as the discrete wavelet function:
ψ_{m,n}(t) = a0^{-m/2} ψ(a0^{-m} t - n b0)
from which the discrete wavelet transform is obtained:
Wf_{m,n} = a0^{-m/2} ∫_{-∞}^{+∞} f(t) ψ*(a0^{-m} t - n b0) dt = ⟨f(t), ψ_{m,n}(t)⟩, m, n ∈ Z
As a further improvement of the present invention, in step S6 the 512-dimensional feature maps of the fully connected layers of the grayscale, optical-flow-x and optical-flow-y channels are decomposed into 3 pairs of 128-dimensional feature maps containing the high- and low-frequency components, and the 128-dimensional feature maps of the channels are then combined by vector products, giving two groups of 128-dimensional feature maps; in step S7 a merge layer is set up with its mode set to concat, the fused high-frequency and low-frequency components are concatenated, and n output nodes corresponding to the n behavior classes are fully connected to all feature maps of the previous layer.
As a further improvement of the present invention, in step S8 the training sample set is fed into the network for training, a callback keeps the model with the minimum loss value, and the optimal weights are saved; in step S10 the 128-dimensional feature maps of the input video stream are extracted by the convolutional neural network, a linear kernel function is selected, and a support vector machine is built to perform classification and recognition.
The beneficial effects of the invention are as follows: with the above scheme, the classical convolutional neural network training process is improved by adding a discrete wavelet transform that decomposes the deep features during training and extracts multi-resolution features; the corresponding multi-resolution features within each deep feature are then fused, which enhances both low-level and high-level information, reduces the computational complexity of the network, strengthens the robustness of network training, and improves the classification and recognition accuracy for images and videos.
Brief description of the drawings
Fig. 1 is a flowchart of the wavelet-based deep multi-feature fusion classification method of the present invention.
Fig. 2 shows the single-channel network.
Fig. 3 shows the overall structure of the wavelet-improved convolutional neural network.
Embodiments
The invention will be further described below with reference to the accompanying drawings and embodiments.
A wavelet-based deep multi-feature fusion classification method proceeds in two stages: an offline training stage and an online recognition stage. A convolutional neural network is built and trained on samples of n labelled classes; a discrete wavelet transform is applied at the final convolutional layer and the fully connected layer to decompose the deep multi-feature maps, and the resulting high- and low-frequency components are linearly fused to obtain the optimal weights; the neural network, combined with a support vector machine, then recognizes and classifies the actions in images and videos.
(1) Offline training stage
Step 1: build a convolutional neural network for training; taking action recognition as an example, use the activity-recognition data set HMDB51 as the training set, preprocess the video clips, and unify the video resolution;
Step 2: set up 3 channels in the first layer: 1 grayscale channel and 2 optical-flow channels, where the grayscale channel contains the grayscale image group of the video clip and the optical-flow channels contain the motion information between two adjacent frames of the video clip;
Step 3: build the multi-module convolutional neural network;
Step 4: using the discrete wavelet transform, extract high- and low-frequency components from the feature maps of the fully connected layer of each module, and fuse the high- and low-frequency components across the three modules;
Step 5: concatenate the fused high- and low-frequency components through a merge layer and fully connect them to the next layer, obtaining one group of 128-dimensional feature maps;
Step 6: set n output nodes corresponding to the n behavior classes (labels), each node fully connected to all feature maps of the previous layer;
Step 7: adjust the parameters between the layers by back-propagation so that the error between each sample's output and its label decreases; once the error meets the requirement, training is finished, and each output vector is then labelled with the behavior name of its corresponding sample video;
(2) Online recognition
Step 8: input the video stream to be recognized, apply the preprocessing of step 1 to the video, load the weights of the optimal model obtained in offline training, pass the video stream through the network layers of steps 2 to 7, and extract the feature vector;
Step 9: classify the feature vector from step 8 with a support vector machine, find the best-matching label, and obtain the optimal accuracy.
In the wavelet-based deep multi-feature fusion classification method provided by the invention, the classical convolutional neural network training process is improved: a discrete wavelet transform is added to decompose the deep features during training and extract multi-resolution features, and the corresponding multi-resolution features within each deep feature are then fused. This enhances both low-level and high-level information, reduces the computational complexity of the network, and strengthens the robustness of network training.
As shown in Fig. 1, the wavelet-based deep multi-feature fusion classification method specifically includes the following steps:
S1: Obtain training sample images:
The training samples and sample labels are obtained from the HMDB51 data set.
S2: Image preprocessing:
The resolution of the video streams in the training set is unified. Unifying the resolution blurs the image edges and always loses some information, so Lanczos interpolation is used here: during interpolation, the eight neighboring points along the x and y directions are interpolated, i.e. a weighted sum is computed, giving an 8*8 kernel support. Although Lanczos interpolation is computationally heavier than other interpolation methods, it runs on the GPU, so the impact on overall performance is small, while its results are noticeably better. Its window function is:
L(x) = sinc(x) sinc(x/a), if -a < x < a; 0 otherwise
Its two-dimensional form is then: L(x, y) = L(x) L(y).
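As a rough illustration of the interpolation just described, the Lanczos window with a = 4 (matching the eight-point neighborhood per axis) can be sketched in Python; the function names and the choice a = 4 are our assumptions, not part of the patent:

```python
import math

def lanczos_window(x, a=4):
    """Lanczos window: L(x) = sinc(x) * sinc(x/a) for -a < x < a, else 0.

    a = 4 gives the 8-point support per axis (an 8x8 neighborhood in 2-D);
    sinc here is the normalized sinc, sin(pi*x) / (pi*x).
    """
    if x == 0.0:
        return 1.0
    if -a < x < a:
        return (math.sin(math.pi * x) / (math.pi * x)) * \
               (math.sin(math.pi * x / a) / (math.pi * x / a))
    return 0.0

def lanczos_2d(x, y, a=4):
    """Separable two-dimensional form: L(x, y) = L(x) * L(y)."""
    return lanczos_window(x, a) * lanczos_window(y, a)
```

Resampling a pixel would then weight its 8x8 neighborhood by `lanczos_2d` evaluated at the offsets from the target position and normalize the weighted sum.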
S3: Build the grayscale and optical-flow multichannel network channels:
A grayscale channel is established by converting the video stream to grayscale; since grayscale images retain the most basic information of the original images, the grayscale channel is indispensable. Optical-flow channels in the x and y directions are established by extracting inter-frame motion information from the video stream. Optical flow is the instantaneous velocity of the pixels of a moving object on the observed imaging plane; it uses the temporal changes of pixels in an image sequence and the correlation between adjacent frames to find the correspondence between the previous frame and the current frame, and thereby computes the motion of objects between adjacent frames. For action recognition, the optical-flow channels are likewise indispensable. Here the inter-frame optical flow is extracted with an improved Lucas-Kanade (L-K) method; replacing pyramid down-sampling with convolution kernels reduces the computation while giving better results. First the partial derivatives fx, fy, ft are obtained from f(x, y, t), with Prewitt filters as the convolution kernels, i.e.:
Ix = I*Dx, Iy = I*Dy, It = I*Dt
The velocity is then estimated by least squares:
E(u1, u2) = Σ_{x,y} g(x, y) [u1 fx(x, y, t) + u2 fy(x, y, t) + ft(x, y, t)]^2
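A minimal single-scale sketch of this least-squares estimate in Python (uniform weights g(x, y) = 1, a Prewitt-style derivative kernel, and a synthetic test pattern; all names and simplifications are our assumptions rather than the patent's implementation):

```python
import numpy as np

def conv2_valid(img, kernel):
    """2-D 'valid' cross-correlation with a small kernel (naive loop sketch)."""
    kh, kw = kernel.shape
    h, w = img.shape
    out = np.empty((h - kh + 1, w - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(img[i:i + kh, j:j + kw] * kernel)
    return out

def lk_velocity(f1, f2):
    """Estimate one global velocity (u1, u2) between frames f1 and f2 by
    minimizing sum [u1*fx + u2*fy + ft]^2 (i.e. g(x, y) = 1 everywhere)."""
    Dx = np.array([[-1, 0, 1]] * 3, dtype=float) / 6.0   # Prewitt-style d/dx
    Dy = Dx.T                                            # Prewitt-style d/dy
    fx = conv2_valid(f1, Dx)
    fy = conv2_valid(f1, Dy)
    ft = (f2 - f1)[1:-1, 1:-1]        # temporal difference, cropped to match
    A = np.stack([fx.ravel(), fy.ravel()], axis=1)
    u, *_ = np.linalg.lstsq(A, -ft.ravel(), rcond=None)
    return u
```

On a synthetic ramp pattern shifted by exactly one pixel in x, this recovers the velocity (1, 0).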
S4: Build the grayscale, optical-flow-x and optical-flow-y channel networks respectively:
Fig. 2 shows the single-channel network structure. Each channel is down-sampled so that the picture size becomes 150*100; five convolutional layers and three pooling layers are built, followed by one fully connected layer. The first convolutional layer uses 5*5*5 kernels and the later convolutional layers use 3*3*3 kernels, with the stride set to 1. The pooling layers use 3D max-pooling with kernels of 2*2*2 and 2*2*1, which keeps the dimensionality from shrinking too fast in the later stages. The activation function is ReLU, which models more accurately how a brain neuron is activated by incoming signals; compared with the sigmoid function it offers one-sided suppression, a relatively broad excitation boundary, and sparse activation.
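The paragraph above fixes the kernel sizes but not the exact interleaving of the five convolutional and three pooling layers, so the following Python helper only traces how 'valid' 3-D convolutions and non-overlapping poolings of the stated sizes shrink a hypothetical input; the clip length of 32 frames and the chosen layer order are our assumptions:

```python
def conv3d_shape(shape, kernel, stride=1):
    """Output shape of a 'valid' 3-D convolution over (height, width, frames)."""
    return tuple((s - k) // stride + 1 for s, k in zip(shape, kernel))

def maxpool3d_shape(shape, kernel):
    """Output shape of non-overlapping 3-D max-pooling."""
    return tuple(s // k for s, k in zip(shape, kernel))

def trace_single_channel(frames=32):
    """Trace one plausible ordering of the 5 conv and 3 pooling layers."""
    s = (150, 100, frames)             # down-sampled pictures of 150*100
    s = conv3d_shape(s, (5, 5, 5))     # first conv layer, 5*5*5 kernels
    s = maxpool3d_shape(s, (2, 2, 2))  # 3D max-pooling, 2*2*2
    s = conv3d_shape(s, (3, 3, 3))     # later conv layers, 3*3*3 kernels
    s = maxpool3d_shape(s, (2, 2, 1))  # 2*2*1 pooling: spatial axes only
    s = conv3d_shape(s, (3, 3, 3))
    s = conv3d_shape(s, (3, 3, 3))
    s = maxpool3d_shape(s, (2, 2, 1))
    s = conv3d_shape(s, (3, 3, 3))
    return s                           # fed into the fully connected layer
```

The 2*2*1 pools illustrate the remark about keeping the dimensionality from shrinking too fast: they halve only the spatial axes and leave the temporal axis intact.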
S5: Apply the discrete wavelet transform to the feature maps of the final fully connected layer of each channel:
The high- and low-frequency components are extracted from the feature maps of the final fully connected layer of each channel with the discrete wavelet transform; the continuous wavelet function ψ_{a,b}(t) can be written as the discrete wavelet function:
ψ_{m,n}(t) = a0^{-m/2} ψ(a0^{-m} t - n b0)
from which the discrete wavelet transform is obtained:
Wf_{m,n} = a0^{-m/2} ∫_{-∞}^{+∞} f(t) ψ*(a0^{-m} t - n b0) dt = ⟨f(t), ψ_{m,n}(t)⟩, m, n ∈ Z
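For intuition, a one-level discrete wavelet decomposition of a 1-D feature vector can be sketched with the Haar wavelet; the patent does not name its mother wavelet, so Haar and the two-level 512-to-128 split below are our assumptions:

```python
import numpy as np

def haar_dwt(v):
    """One-level Haar DWT: pairwise sums give the low-frequency band,
    pairwise differences the high-frequency band (each half the length)."""
    v = np.asarray(v, dtype=float)
    low = (v[0::2] + v[1::2]) / np.sqrt(2.0)
    high = (v[0::2] - v[1::2]) / np.sqrt(2.0)
    return low, high

def dwt_split(v, levels=2):
    """Repeatedly decompose the low band; two levels take 512 dims to 128."""
    high = np.zeros(0)
    for _ in range(levels):
        v, high = haar_dwt(v)
    return v, high     # low- and high-frequency components at the last level
```

On a constant vector the high band is identically zero, confirming that it captures only the variation (the detail) of the features.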
S6: Extract the high- and low-frequency components and fuse features across channels:
The Dwt operation in Fig. 3 decomposes the 512-dimensional feature maps of the fully connected layers of the grayscale, optical-flow-x and optical-flow-y channels into 3 pairs of 128-dimensional feature maps containing the high- and low-frequency components; the 128-dimensional feature maps of the channels are then combined by vector products, giving two groups of 128-dimensional feature maps.
S7: Concatenate the fused features through a merge layer:
A merge layer is set up with its mode set to concat; the fused high-frequency and low-frequency components are concatenated, and n output nodes corresponding to the n behavior classes (labels) are fully connected to all feature maps of the previous layer.
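One plausible reading of steps S6 and S7 in Python: fuse the low bands of the three channels by element-wise product, likewise the high bands, then concatenate the two fused groups. The element-wise interpretation of "vector product" and all names are our assumptions:

```python
import numpy as np

def fuse_and_merge(channel_bands):
    """channel_bands: one (low, high) pair of 128-dim vectors per channel.

    Returns the concatenation of the fused low-frequency and fused
    high-frequency components (the 'concat' merge mode)."""
    low_fused = np.prod(np.stack([low for low, _ in channel_bands]), axis=0)
    high_fused = np.prod(np.stack([high for _, high in channel_bands]), axis=0)
    return np.concatenate([low_fused, high_fused])
```

The merged vector would then be fully connected to the n output nodes.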
S8: Train and extract the optimal weights:
The training sample set is fed into the network for training; a callback keeps the model with the minimum loss value, and its optimal weights are saved.
S9: Feed the video into the trained optimal model for feature extraction.
S10: Perform online recognition with a support vector machine:
The 128-dimensional feature maps of the input video stream are extracted by the convolutional neural network; a linear kernel function is selected, and a support vector machine is built to perform classification and recognition.
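To keep the sketch free of external libraries, the linear-kernel SVM of S10 can be illustrated with a Pegasos-style sub-gradient trainer for the binary case; the training scheme, names, and toy data are our assumptions, not the patent's classifier:

```python
import numpy as np

def train_linear_svm(X, y, lam=0.01, epochs=500, seed=0):
    """Pegasos-style sub-gradient descent for a linear SVM; labels in {-1, +1}.

    A constant feature is appended so the bias is learned with the weights."""
    Xb = np.hstack([X, np.ones((len(X), 1))])
    rng = np.random.default_rng(seed)
    w = np.zeros(Xb.shape[1])
    t = 0
    for _ in range(epochs):
        for i in rng.permutation(len(Xb)):
            t += 1
            eta = 1.0 / (lam * t)              # decaying step size
            if y[i] * (Xb[i] @ w) < 1.0:       # hinge loss is active
                w = (1.0 - eta * lam) * w + eta * y[i] * Xb[i]
            else:
                w = (1.0 - eta * lam) * w
    return w

def svm_predict(w, X):
    Xb = np.hstack([X, np.ones((len(X), 1))])
    return np.sign(Xb @ w)
```

A multi-class version would train one such classifier per behavior label (one-vs-rest) on the 128-dimensional feature vectors and pick the best-matching label.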
Compared with a convolutional neural network model without wavelet-based deep feature fusion, the method of the invention achieves better results, and tests on common data sets also reach higher accuracy. Moreover, the invention is not limited to the action recognition of the specific embodiment; it can be widely applied to the classification and recognition of images and videos.
In the wavelet-based deep multi-feature fusion classification method provided by the invention, the discrete wavelet transform extracts the low- and high-frequency components from the feature maps, and the high- and low-frequency components are fused separately, enhancing both low-level and high-level information and thereby improving the accuracy and robustness of network recognition.
The wavelet-based deep multi-feature fusion classification method provided by the invention is suitable for the technical field of robot vision and image processing, and is particularly suitable for deep learning, feature extraction and computer vision.
The above further describes the present invention with reference to specific preferred embodiments, but the specific implementation of the present invention shall not be regarded as limited to these descriptions. Those of ordinary skill in the art may make simple deductions or substitutions without departing from the concept of the present invention, and all such variations shall be regarded as falling within the protection scope of the present invention.

Claims (10)

  1. A wavelet-based deep multi-feature fusion classification method, characterized by comprising an offline training stage and an online recognition stage, wherein in the offline training stage a convolutional neural network is built and trained on samples of n labelled classes, a discrete wavelet transform is applied at the final convolutional layer and the fully connected layer to decompose the deep multi-feature maps, and the resulting high- and low-frequency components are linearly fused to obtain the optimal weights; and in the online recognition stage the convolutional neural network, combined with a support vector machine, recognizes and classifies the actions in images and videos.
  2. The wavelet-based deep multi-feature fusion classification method according to claim 1, characterized in that the offline training stage comprises the following steps:
    Step 1: build a convolutional neural network for training;
    Step 2: set up 3 channels in the first layer: 1 grayscale channel and 2 optical-flow channels, where the grayscale channel contains the grayscale image group of the video clip and the optical-flow channels contain the motion information between two adjacent frames of the video clip;
    Step 3: build the multi-module convolutional neural network;
    Step 4: using the discrete wavelet transform, extract high- and low-frequency components from the feature maps of the fully connected layer of each module, and fuse the high- and low-frequency components across the three modules;
    Step 5: concatenate the fused high- and low-frequency components through a merge layer and fully connect them to the next layer, obtaining one group of 128-dimensional feature maps;
    Step 6: set n output nodes corresponding to the n behavior classes, each node fully connected to all feature maps of the previous layer;
    Step 7: adjust the parameters between the layers by back-propagation so that the error between each sample's output and its label decreases; once the error meets the requirement, training is finished, and each output vector is then labelled with the behavior name of its corresponding sample video.
  3. The wavelet-based deep multi-feature fusion classification method according to claim 2, characterized in that the online recognition stage comprises the following steps:
    Step 8: input the video stream to be recognized, apply the preprocessing of step 1 to the video, load the weights of the optimal model obtained in offline training, pass the video stream through the network layers of steps 2 to 7, and extract the feature vector;
    Step 9: classify the feature vector from step 8 with a support vector machine, find the best-matching label, and obtain the optimal accuracy.
  4. The wavelet-based deep multi-feature fusion classification method according to claim 1, characterized by comprising the following steps:
    S1: obtain training sample images;
    S2: preprocess the images;
    S3: build the grayscale and optical-flow multichannel network channels;
    S4: build the grayscale, optical-flow-x and optical-flow-y channel networks respectively;
    S5: apply the discrete wavelet transform to the feature maps of the final fully connected layer of each channel;
    S6: extract the high- and low-frequency components and fuse features across channels;
    S7: concatenate the fused features through a merge layer;
    S8: train and extract the optimal weights;
    S9: feed the video into the trained optimal model for feature extraction;
    S10: perform online recognition with a support vector machine.
  5. The wavelet-based deep multi-feature fusion classification method according to claim 4, characterized in that in step S1 the training samples and sample labels are obtained from a data set; and in step S2 the resolution of the video streams in the training set is unified using Lanczos interpolation, in which the eight neighboring points along the x and y directions are interpolated, i.e. a weighted sum is computed; the window function of the Lanczos method is:
    L(x) = sinc(x) sinc(x/a), if -a < x < a; 0 otherwise
    Its two-dimensional form is then: L(x, y) = L(x) L(y).
  6. The wavelet-based deep multi-feature fusion classification method according to claim 5, characterized in that in step S3 a grayscale channel is established by converting the video stream to grayscale, the grayscale images retaining the most basic information of the original images; optical-flow channels in the x and y directions are established by extracting inter-frame motion information from the video stream; the inter-frame optical flow is extracted with an improved Lucas-Kanade (L-K) method in which convolution kernels replace pyramid down-sampling; first the partial derivatives fx, fy, ft are obtained from f(x, y, t), with Prewitt filters as the convolution kernels, i.e.:
    Ix=I*Dx, Iy=I*Dy, It=I*Dt
    The velocity is then estimated by least squares:
    E(u1, u2) = Σ_{x,y} g(x, y) [u1 fx(x, y, t) + u2 fy(x, y, t) + ft(x, y, t)]^2.
  7. The wavelet-based deep multi-feature fusion classification method according to claim 6, characterized in that in step S4 each channel is down-sampled so that the picture size becomes 150*100; five convolutional layers and three pooling layers are built, followed by one fully connected layer; the first convolutional layer uses 5*5*5 kernels and the later convolutional layers use 3*3*3 kernels, with the stride set to 1; the pooling layers use 3D max-pooling with kernels of 2*2*2 and 2*2*1, and the activation function is ReLU.
  8. The wavelet-based deep multi-feature fusion classification method according to claim 7, characterized in that in step S5 the high- and low-frequency components are extracted from the feature maps of the final fully connected layer of each channel with the discrete wavelet transform; the continuous wavelet function ψ_{a,b}(t) can be written as the discrete wavelet function:
    $$\psi_{m,n}(t)=a_0^{-m/2}\,\psi\!\left(a_0^{-m}t-b_0 n\right)$$
    from which the discrete wavelet transform is obtained:
    $$Wf_{m,n}=a_0^{-m/2}\int_{-\infty}^{+\infty}f(t)\,\psi^{*}\!\left(a_0^{-m}t-nb_0\right)dt=\left\langle f(t),\psi_{m,n}(t)\right\rangle,\quad m,n\in\mathbb{Z}.$$
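With a0 = 2 and b0 = 1 the family above becomes the dyadic wavelet basis, of which the Haar wavelet is the simplest instance. A sketch of one decomposition level splitting a feature vector into a low-frequency (approximation) and a high-frequency (detail) half; the 512-dimensional input echoes the feature size in claim 9, everything else is illustrative:

```python
import numpy as np

def haar_dwt(x):
    """One level of the discrete Haar wavelet transform.

    For an even-length 1-D signal, neighbouring sample pairs are
    summed (low band) and differenced (high band), scaled by 1/sqrt(2)
    so the transform is orthonormal and preserves energy.
    """
    x = np.asarray(x, dtype=float)
    even, odd = x[0::2], x[1::2]
    low = (even + odd) / np.sqrt(2.0)   # approximation coefficients
    high = (even - odd) / np.sqrt(2.0)  # detail coefficients
    return low, high

# A 512-d feature vector splits into two 256-d bands; splitting again
# would reach the 128-d maps referred to in claim 9.
feat = np.arange(512, dtype=float)
low, high = haar_dwt(feat)
```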
  9. The depth multi-feature fusion classification method based on wavelets according to claim 8, wherein in step S6 the 512-dimensional feature maps of the fully connected layers of the grayscale channel and of the optical-flow x and y channels are each decomposed into three pairs of 128-dimensional feature maps carrying high- and low-frequency weights; the 128-dimensional feature maps of the channels are then combined by vector product, yielding two groups of 128-dimensional feature maps; and in step S7 a merge layer is set up with its mode set to concat, the fused high-frequency and low-frequency components are concatenated, and n output nodes, corresponding to n behavior classes, are fully connected to all feature maps of the preceding layer.
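Steps S6 and S7 can be sketched as below. The claim's "vector product" is read here as an element-wise (Hadamard) product, which keeps the 128-dimensional shape; that reading, and all array contents, are assumptions, and the merge-with-concat layer is emulated with np.concatenate:

```python
import numpy as np

def fuse_channels(gray, flow_x, flow_y):
    """Element-wise product of the per-channel 128-d feature maps,
    yielding one fused 128-d vector (one reading of "vector product")."""
    return gray * flow_x * flow_y

rng = np.random.default_rng(1)
# One 128-d high-frequency and one 128-d low-frequency map per channel
# (grayscale, optical-flow x, optical-flow y).
high = [rng.normal(size=128) for _ in range(3)]
low = [rng.normal(size=128) for _ in range(3)]
# merge(mode=concat): join the fused high- and low-frequency groups.
merged = np.concatenate([fuse_channels(*high), fuse_channels(*low)])
```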
  10. The depth multi-feature fusion classification method based on wavelets according to claim 9, wherein in step S8 the training sample set is fed into the network for training, the model with the minimum loss is retained by a callback, and the optimal weights are saved; and in step S10 the input video stream is passed through the convolutional neural network to extract 128-dimensional feature maps, a linear kernel function is selected, and a support vector machine is built to perform classification and recognition.
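Step S10's linear-kernel support vector machine can be sketched with a minimal hinge-loss subgradient solver; this stands in for whichever SVM implementation the patent relies on, and the toy 2-D data substitutes for the 128-dimensional CNN features:

```python
import numpy as np

def train_linear_svm(X, y, lam=0.01, lr=0.1, epochs=200):
    """Linear SVM trained by subgradient descent on the regularised
    hinge loss  lam/2 * ||w||^2 + mean(max(0, 1 - y*(Xw + b))).
    Labels y must be +1/-1."""
    w, b = np.zeros(X.shape[1]), 0.0
    for _ in range(epochs):
        margins = y * (X @ w + b)
        mask = margins < 1.0  # samples violating the margin
        if mask.any():
            w -= lr * (lam * w - (y[mask, None] * X[mask]).mean(axis=0))
            b -= lr * (-y[mask].mean())
        else:
            w -= lr * lam * w  # only the regulariser is active
    return w, b

def predict(w, b, X):
    return np.sign(X @ w + b)

# Linearly separable toy features with labels +1/-1.
X = np.array([[2.0, 2.0], [3.0, 1.5], [-2.0, -1.0], [-3.0, -2.5]])
y = np.array([1.0, 1.0, -1.0, -1.0])
w, b = train_linear_svm(X, y)
```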
CN201710823051.8A 2017-09-13 2017-09-13 Depth multi-feature fusion classification method based on wavelets Active CN107679462B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201710823051.8A CN107679462B (en) 2017-09-13 2017-09-13 Depth multi-feature fusion classification method based on wavelets


Publications (2)

Publication Number Publication Date
CN107679462A true CN107679462A (en) 2018-02-09
CN107679462B CN107679462B (en) 2021-10-19

Family

ID=61136412

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201710823051.8A Active CN107679462B (en) 2017-09-13 2017-09-13 Depth multi-feature fusion classification method based on wavelets

Country Status (1)

Country Link
CN (1) CN107679462B (en)


Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104281853A (en) * 2014-09-02 2015-01-14 电子科技大学 Behavior identification method based on 3D convolution neural network
CN104866831A (en) * 2015-05-29 2015-08-26 福建省智慧物联网研究院有限责任公司 Feature weighted face identification algorithm
CN106228137A (en) * 2016-07-26 2016-12-14 广州市维安科技股份有限公司 ATM abnormal face detection method based on key-point localization
CN106251375A (en) * 2016-08-03 2016-12-21 广东技术师范学院 Stacked deep-learning autoencoder method for universal steganalysis
CN106529467A (en) * 2016-11-07 2017-03-22 南京邮电大学 Group behavior identification method based on multi-feature fusion


Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
SHUIWANG JI et al.: "3D Convolutional Neural Networks for Human Action Recognition", IEEE Transactions on Pattern Analysis and Machine Intelligence *
杨丽召: "Research on Behavior Recognition Algorithms Based on Multi-Feature Fusion", China Master's Theses Full-text Database, Information Science and Technology *

Cited By (22)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108564326A (en) * 2018-04-19 2018-09-21 安吉汽车物流股份有限公司 Order prediction method and device, computer-readable medium, and logistics system
CN108564326B (en) * 2018-04-19 2021-12-21 安吉汽车物流股份有限公司 Order prediction method and device, computer readable medium and logistics system
CN108830296B (en) * 2018-05-18 2021-08-10 河海大学 Improved high-resolution remote sensing image classification method based on deep learning
CN108830296A (en) * 2018-05-18 2018-11-16 河海大学 Improved high-resolution remote sensing image classification method based on deep learning
CN108830308A (en) * 2018-05-31 2018-11-16 西安电子科技大学 Modulation identification method fusing traditional signal features with deep features
CN108957173A (en) * 2018-06-08 2018-12-07 山东超越数控电子股份有限公司 Detection method for avionics system state
CN109117711A (en) * 2018-06-26 2019-01-01 西安交通大学 Attention detection device and method based on hierarchical feature extraction and fusion of eye-movement data
CN109214440A (en) * 2018-08-23 2019-01-15 华北电力大学(保定) Multi-feature data classification and recognition method based on a clustering algorithm
CN109620244A (en) * 2018-12-07 2019-04-16 吉林大学 Infant abnormal behavior detection method based on conditional generative adversarial networks and SVM
CN109741348A (en) * 2019-01-07 2019-05-10 哈尔滨理工大学 Diabetic retina image segmentation method
CN110236518A (en) * 2019-04-02 2019-09-17 武汉大学 Neural-network-based method and device for classifying combined ECG and seismocardiogram signals
CN112288345A (en) * 2019-07-25 2021-01-29 顺丰科技有限公司 Method and device for detecting loading and unloading port state, server and storage medium
CN110633735B (en) * 2019-08-23 2021-07-30 深圳大学 Progressive depth convolution network image identification method and device based on wavelet transformation
CN110633735A (en) * 2019-08-23 2019-12-31 深圳大学 Progressive depth convolution network image identification method and device based on wavelet transformation
CN110852195A (en) * 2019-10-24 2020-02-28 杭州趣维科技有限公司 Video slice-based video type classification method
CN113658230A (en) * 2020-05-12 2021-11-16 武汉Tcl集团工业研究院有限公司 Optical flow estimation method, terminal and storage medium
CN113658230B (en) * 2020-05-12 2024-05-28 武汉Tcl集团工业研究院有限公司 Optical flow estimation method, terminal and storage medium
CN112330650A (en) * 2020-11-12 2021-02-05 李庆春 Retrieval video quality evaluation method
CN112330650B (en) * 2020-11-12 2024-06-28 李庆春 Retrieval video quality evaluation method
CN112418168A (en) * 2020-12-10 2021-02-26 深圳云天励飞技术股份有限公司 Vehicle identification method, device, system, electronic equipment and storage medium
CN112418168B (en) * 2020-12-10 2024-04-02 深圳云天励飞技术股份有限公司 Vehicle identification method, device, system, electronic equipment and storage medium
CN113408815A (en) * 2021-07-02 2021-09-17 湘潭大学 Deep learning-based traction load ultra-short-term prediction method

Also Published As

Publication number Publication date
CN107679462B (en) 2021-10-19

Similar Documents

Publication Publication Date Title
CN107679462A (en) 2018-02-09 Depth multi-feature fusion classification method based on wavelets
CN110210551B (en) Visual target tracking method based on adaptive subject sensitivity
Iglovikov et al. Ternausnet: U-net with vgg11 encoder pre-trained on imagenet for image segmentation
CN110163299B (en) Visual question-answering method based on bottom-up attention mechanism and memory network
CN104850845B (en) A kind of traffic sign recognition method based on asymmetric convolutional neural networks
CN105701508B Global-local optimization model and saliency detection algorithm based on multi-level convolutional neural networks
CN105956560B Vehicle model recognition method based on pooled multi-scale deep convolution features
CN109543502A Semantic segmentation method based on deep multi-scale neural networks
CN104217214B RGB-D person behavior recognition method based on configurable convolutional neural networks
CN110188817A (en) A kind of real-time high-performance street view image semantic segmentation method based on deep learning
CN113469094A (en) Multi-mode remote sensing data depth fusion-based earth surface coverage classification method
CN106920243A Sequence image segmentation method for ceramic material parts using improved fully convolutional neural networks
CN109543667A (en) A kind of text recognition method based on attention mechanism
CN107609638A Method for optimizing convolutional neural networks based on linear decoders and interpolation sampling
CN107229904A (en) A kind of object detection and recognition method based on deep learning
CN104281853A (en) Behavior identification method based on 3D convolution neural network
CN111681178B (en) Knowledge distillation-based image defogging method
CN110046671A Text classification method based on capsule networks
CN110263833A Image semantic segmentation method based on encoder-decoder structure
CN112926396A (en) Action identification method based on double-current convolution attention
CN107945153A (en) A kind of road surface crack detection method based on deep learning
CN110378208B (en) Behavior identification method based on deep residual error network
CN105701507A (en) Image classification method based on dynamic random pooling convolution neural network
CN106682569A (en) Fast traffic signboard recognition method based on convolution neural network
CN112488025B (en) Double-temporal remote sensing image semantic change detection method based on multi-modal feature fusion

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant