CN107679462B - Depth multi-feature fusion classification method based on wavelets - Google Patents

Depth multi-feature fusion classification method based on wavelets

Info

Publication number
CN107679462B
Authority
CN
China
Prior art keywords
feature
channel
frequency components
layer
training
Prior art date
Legal status
Active
Application number
CN201710823051.8A
Other languages
Chinese (zh)
Other versions
CN107679462A (en)
Inventor
于刚
李艇
Current Assignee
Shenzhen Graduate School Harbin Institute of Technology
Original Assignee
Shenzhen Graduate School Harbin Institute of Technology
Priority date
Filing date
Publication date
Application filed by Shenzhen Graduate School Harbin Institute of Technology filed Critical Shenzhen Graduate School Harbin Institute of Technology
Priority to CN201710823051.8A
Publication of CN107679462A
Application granted
Publication of CN107679462B

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06V - IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00 - Scenes; Scene-specific elements
    • G06V20/40 - Scenes; Scene-specific elements in video content
    • G06V20/41 - Higher-level, semantic clustering, classification or understanding of video scenes, e.g. detection, labelling or Markovian modelling of sport events or news items
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 - Pattern recognition
    • G06F18/20 - Analysing
    • G06F18/25 - Fusion techniques
    • G06F18/253 - Fusion techniques of extracted features
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 - Computing arrangements based on biological models
    • G06N3/02 - Neural networks
    • G06N3/04 - Architecture, e.g. interconnection topology
    • G06N3/045 - Combinations of networks
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 - Computing arrangements based on biological models
    • G06N3/02 - Neural networks
    • G06N3/08 - Learning methods
    • G06N3/084 - Backpropagation, e.g. using gradient descent

Abstract

The invention provides a depth multi-feature fusion classification method based on wavelets, which comprises an offline training stage and an online identification stage. In the offline training stage, a convolutional neural network is constructed and trained on samples of n label classes; a discrete wavelet transform is added to the convolutional layer and the fully connected layer at the end of the model to decompose the deep multi-feature maps, and the resulting high- and low-frequency components are fused linearly to obtain the optimal weights. In the online identification stage, the convolutional neural network works together with a support vector machine to identify and classify the actions in images and videos. The beneficial effect of the invention is that the accuracy of image and video classification and identification is improved.

Description

Depth multi-feature fusion classification method based on wavelets
Technical Field
The invention relates to robot visual image processing, in particular to a depth multi-feature fusion classification method based on wavelets.
Background
In recent years, deep learning has become one of the hottest terms in the technology world. It has gradually overturned algorithm design in numerous fields such as speech recognition, image classification and text understanding, and has gradually formed a new paradigm that starts from the training data, passes through an end-to-end model, and directly outputs the final result. With the arrival of the big-data era and the development of ever more powerful computing devices such as GPUs, deep learning has been given wings: it can make full use of massive data, learn abstract knowledge representations completely automatically, and condense raw data into knowledge. Among deep learning frameworks, the convolutional neural network is again the most commonly used.
As the convolutional neural network framework keeps expanding and the number of network layers keeps increasing, the feature maps extracted by each module grow steadily. Simply flattening the convolutional layers into a vector and then fully connecting it leads to a huge amount of computation and to feature blurring, which harms the accuracy of image and video classification and identification.
Disclosure of Invention
In order to solve the problems in the prior art, the invention provides a depth multi-feature fusion classification method based on wavelets, which improves the accuracy of image and video classification and identification.
The invention provides a depth multi-feature fusion classification method based on wavelets, which comprises an offline training stage and an online identification stage. In the offline training stage, a convolutional neural network is constructed and trained on samples of n label classes; a discrete wavelet transform is added to the convolutional layer and the fully connected layer at the end of the model to decompose the deep multi-feature maps, and the resulting high- and low-frequency components are fused linearly to obtain the optimal weights. In the online identification stage, the convolutional neural network works together with a support vector machine to identify and classify the actions in images and videos.
As a further improvement of the present invention, the offline training phase comprises the steps of:
the method comprises the following steps: firstly, constructing a convolutional neural network for training;
step two: setting up 3 channels in the first layer, namely 1 grayscale channel and 2 optical flow channels, wherein the grayscale channel contains the grayscale image group of a video clip, and the optical flow channels contain the motion relation information between two frames of the video clip;
step three: constructing a multi-module convolutional neural network;
step four: extracting high-frequency and low-frequency components from the feature maps of the fully connected layer of each module by discrete wavelet transform, and fusing the high-frequency components across the three modules and the low-frequency components across the three modules respectively;
step five: connecting the fused high-frequency and low-frequency components in series through the merge layer and fully connecting the fused high-frequency and low-frequency components with the next layer to obtain a group of 128-dimensional feature maps;
step six: setting n output nodes corresponding to the n classification behaviors, wherein each node is fully connected with all feature maps of the previous layer;
step seven: adjusting the calculation parameters among the layers through a back-propagation algorithm to reduce the error between the output of each sample and its label; after the error meets the requirement and the training is finished, a label is set for each output vector according to the corresponding sample video behavior name.
As a further improvement of the invention, the on-line identification phase comprises the following steps:
step eight: inputting a video stream to be identified, preprocessing the video as in step one, loading the weights of the optimal model obtained in offline training, and extracting a feature vector from the video stream to be identified through the network layers of steps two to seven;
step nine: classifying the feature vectors obtained in step eight by a support vector machine, and finding out the label that best matches the feature vectors to obtain the optimal accuracy.
As a further improvement of the invention, the method comprises the following steps:
s1: acquiring a training sample image;
s2: preprocessing an image;
s3: constructing a gray scale and optical flow multi-channel network channel;
s4: respectively constructing a gray level, optical flow x and y channel network;
s5: performing discrete wavelet transform on the feature mapping of the full connection layer at the tail end of each channel;
s6: extracting high-frequency and low-frequency components, and carrying out feature fusion between channels;
s7: connecting the fused features in series through a merge layer;
s8: training and extracting the optimal weight;
s9: sending the video to a trained optimal model for feature extraction;
s10: online identification is performed using a support vector machine.
As a further improvement of the present invention, in step S1 a training sample and a sample label are obtained from the data set; in step S2 the resolutions of the video streams in the training sample set are unified by the Lanczos interpolation method, in which eight adjacent points along each of the x and y directions are interpolated, that is, a weighted sum is calculated; the window function of the Lanczos interpolation method is:
$$
L(x)=\begin{cases}\operatorname{sinc}(x)\,\operatorname{sinc}(x/4), & |x|<4\\ 0, & \text{otherwise}\end{cases},\qquad \operatorname{sinc}(x)=\frac{\sin(\pi x)}{\pi x}
$$
The two-dimensional form is then: L(x, y) = L(x)·L(y).
As a further improvement of the invention, in step S3 a grayscale channel is established by graying the video stream; the grayscale image retains the most basic information of the original image. Optical flow channels in the x and y directions are established for extracting the inter-frame motion information in the video stream; the optical flow information between frames is extracted by an improved L-K optical flow method in which a convolution kernel replaces the pyramid downsampling. First the partial derivatives f_x, f_y, f_t are obtained from f(x, y, t), with Prewitt filters as the convolution kernels, namely:
I_x = I * D_x,  I_y = I * D_y,  I_t = I * D_t
velocity estimation using the least squares method:
$$
\begin{bmatrix}u\\ v\end{bmatrix}
=\begin{bmatrix}\sum_i I_{x_i}^{2} & \sum_i I_{x_i}I_{y_i}\\ \sum_i I_{x_i}I_{y_i} & \sum_i I_{y_i}^{2}\end{bmatrix}^{-1}
\begin{bmatrix}-\sum_i I_{x_i}I_{t_i}\\ -\sum_i I_{y_i}I_{t_i}\end{bmatrix}
$$
as a further improvement of the present invention, in step S4, each channel is sampled, the picture size is changed to 150 × 100, 5 convolutional layers are constructed, 3 pooling layers are connected, and then one full-connected layer is connected, the convolutional kernel size of the first convolutional layer is 5 × 5, the convolutional kernel sizes of the subsequent convolutional layers are all 3 × 3, the step size is set to 1, 3D maxpoling is used for the pooling layers, the kernel selection of the pooling layers is two, 2 × 2 and 2 × 1, and the activation function selects relu.
As a further improvement of the invention, in step S5 high- and low-frequency components are extracted from the feature map of the fully connected layer at the end of each channel by discrete wavelet transform; the continuous wavelet function ψ_{a,b}(t) can be written as a discrete wavelet function:
$$
\psi_{j,k}(t)=a_0^{-j/2}\,\psi\!\left(a_0^{-j}t-k b_0\right),\qquad j,k\in\mathbb{Z}
$$
the discrete wavelet transform is obtained in the form:
$$
W_f(j,k)=\langle f,\psi_{j,k}\rangle=a_0^{-j/2}\int_{-\infty}^{+\infty} f(t)\,\psi^{*}\!\left(a_0^{-j}t-k b_0\right)\mathrm{d}t
$$
As a further improvement of the present invention, in step S6 the 512-dimensional feature maps of the fully connected layers of the grayscale channel and of the optical flow x and y channels are decomposed into 3 pairs of 128-dimensional feature maps containing high- and low-frequency components, and a vector product operation is then performed on the 128-dimensional feature maps of the channels to obtain two groups of 128-dimensional feature maps; in step S7, a merge layer with its mode set to concat is added to connect the fused high-frequency component and low-frequency component in series, and n output nodes corresponding to the n classification behaviors are set and fully connected with all feature maps of the previous layer.
As a further improvement of the present invention, in step S8, a training sample set is put into the network for training, a model with the minimum loss value is recalled, and the optimal weight is saved; in step S10, the input video stream is passed through a convolutional neural network to extract a 128-dimensional feature map, a kernel function is selected as a linear function, and a support vector machine is constructed for classification and identification.
The beneficial effects of the invention are as follows: through the above scheme, the classical convolutional neural network training process is improved; a discrete wavelet transform is added to decompose the deep features during training and extract multi-resolution features, and the corresponding multi-resolution features within the deep features are fused. This enhances the low-level information and strengthens the high-level information, reduces the complexity of the network computation, strengthens the robustness of network training, and improves the accuracy of image and video classification and identification.
Drawings
FIG. 1 is a flow chart of a depth multi-feature fusion classification method based on wavelets according to the present invention.
Fig. 2 is a diagram of a single channel network.
Fig. 3 is a general structure diagram of a convolutional neural network based on wavelet improvement.
Detailed Description
The invention is further described with reference to the following description and embodiments in conjunction with the accompanying drawings.
A depth multi-feature fusion classification method based on wavelets is divided into two stages: an offline training stage and an online recognition stage. In the offline training stage, a convolutional neural network is constructed and trained on samples of n label classes; a discrete wavelet transform is added to the convolutional layer and the fully connected layer at the end of the model to decompose the deep multi-feature maps, and the resulting high- and low-frequency components are fused linearly to obtain the optimal weights. The neural network then works together with a support vector machine to identify and classify the actions in images and videos.
(I) Off-line training phase
Step one: first a convolutional neural network is constructed for training; taking action recognition as an example, the behavior recognition data set HMDB51 is used as the training set, the video segments are preprocessed, and the video resolution is unified;
step two: setting up 3 channels in the first layer, namely 1 grayscale channel and 2 optical flow channels, wherein the grayscale channel contains the grayscale image group of a video clip, and the optical flow channels contain the motion relation information between two frames of the video clip;
step three: constructing a multi-module convolutional neural network;
Step four: extracting high-frequency and low-frequency components from feature maps of all the module full-connection layers by adopting discrete wavelet transform, and fusing the high-frequency and low-frequency components in the three modules respectively;
step five: connecting the fused high-frequency and low-frequency components in series through the merge layer and fully connecting the fused high-frequency and low-frequency components with the next layer to obtain a group of 128-dimensional feature maps;
step six: setting n output nodes corresponding to n classification behaviors (labels), wherein each node is fully connected with all feature maps on the previous layer;
step seven: adjusting the calculation parameters among all layers through a back propagation algorithm to reduce the error between the output of each sample and the label, and setting the label for each output vector according to the corresponding sample video behavior name after the error meets the requirement and the training is finished;
(II) On-line identification phase
Step eight: inputting a video stream to be identified, preprocessing the video in the first step, loading a weight through an optimal model obtained in offline training, and extracting a feature vector from the video stream to be identified through the network layers in the second step to the eighth step;
step nine: and (4) classifying the feature vectors in the step ten by adopting a support vector machine, and finding out the label which is most matched with the feature vectors to obtain the optimal accuracy.
The invention provides a depth multi-feature fusion classification method based on wavelets that improves the classical convolutional neural network training process: a discrete wavelet transform is added to decompose the deep features during training and extract multi-resolution features, and the corresponding multi-resolution features within the deep features are fused, which enhances the low-level information, strengthens the high-level information, reduces the complexity of the network computation, and strengthens the robustness of network training.
As shown in fig. 1, a depth multi-feature fusion classification method based on wavelets specifically includes the following steps:
s1: acquiring a training sample image:
training samples and sample labels are obtained from the HMDB51 dataset.
S2: image preprocessing:
The resolutions of the video streams in the training sample set are unified. During the resolution unification the image edges become blurred, which causes information loss, so the Lanczos interpolation method is used to unify the resolution: eight adjacent points along each of the x and y directions are interpolated, that is, a weighted sum is calculated, making it an 8 × 8 descriptor. Although the Lanczos interpolation method requires more computation than other interpolation methods, it runs on the GPU and therefore has little influence on the overall performance, while its effect is noticeably better than that of other interpolation methods. The window function is:
$$
L(x)=\begin{cases}\operatorname{sinc}(x)\,\operatorname{sinc}(x/4), & |x|<4\\ 0, & \text{otherwise}\end{cases},\qquad \operatorname{sinc}(x)=\frac{\sin(\pi x)}{\pi x}
$$
The two-dimensional form is then: L(x, y) = L(x)·L(y).
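For illustration, a minimal numerical sketch of this window follows. The half-width a = 4 (giving the eight taps per direction mentioned above) and the target resolution are assumptions, and in practice the frame resampling itself can be done with OpenCV's built-in Lanczos mode:

```python
import numpy as np
import cv2

def lanczos_window(x, a=4):
    """Lanczos window L(x) = sinc(x) * sinc(x / a) for |x| < a, else 0.
    a = 4 gives eight neighbouring taps per direction (the 8 x 8 descriptor)."""
    x = np.asarray(x, dtype=np.float64)
    w = np.sinc(x) * np.sinc(x / a)          # np.sinc is sin(pi*x) / (pi*x)
    return np.where(np.abs(x) < a, w, 0.0)

def lanczos_weight_2d(dx, dy, a=4):
    """Separable two-dimensional weight L(x, y) = L(x) * L(y)."""
    return lanczos_window(dx, a) * lanczos_window(dy, a)

# Resolution unification of one frame with OpenCV's Lanczos-4 interpolation
# (the 160 x 120 target size is only an illustrative assumption):
frame = np.zeros((240, 320), dtype=np.uint8)                      # stand-in grayscale frame
unified = cv2.resize(frame, (160, 120), interpolation=cv2.INTER_LANCZOS4)
```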
S3: constructing a gray scale and optical flow multi-channel network channel:
By establishing a grayscale channel from the grayed video stream, the grayscale map retains the most basic information of the original image, so the grayscale channel is essential. For the extraction of inter-frame motion information in the video stream, optical flow channels in the x and y directions are established. Optical flow is the instantaneous velocity of the pixel motion of a moving object in space on the observation imaging plane; it uses the temporal change of pixels in an image sequence and the correlation between adjacent frames to find the correspondence between the previous frame and the current frame and thereby compute the motion information of objects between adjacent frames. Optical flow channels are therefore also essential for action recognition. An improved L-K optical flow method is adopted to extract the optical flow information between frames; a convolution kernel is used to replace the pyramid downsampling, which reduces the amount of computation and gives a better result. First the partial derivatives f_x, f_y, f_t are obtained from f(x, y, t), with Prewitt filters as the convolution kernels, namely:
I_x = I * D_x,  I_y = I * D_y,  I_t = I * D_t
velocity estimation using the least squares method:
$$
\begin{bmatrix}u\\ v\end{bmatrix}
=\begin{bmatrix}\sum_i I_{x_i}^{2} & \sum_i I_{x_i}I_{y_i}\\ \sum_i I_{x_i}I_{y_i} & \sum_i I_{y_i}^{2}\end{bmatrix}^{-1}
\begin{bmatrix}-\sum_i I_{x_i}I_{t_i}\\ -\sum_i I_{y_i}I_{t_i}\end{bmatrix}
$$
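A rough sketch of this derivative-plus-least-squares computation is given below; it assumes Prewitt-style derivative kernels, a simple frame difference for the temporal derivative and a 5 × 5 summation window, all of which are illustrative choices rather than the exact kernels of the embodiment:

```python
import numpy as np
from scipy.signal import convolve2d

def lk_flow_prewitt(frame1, frame2, win=5):
    """Lucas-Kanade-style flow: I_x = I*D_x, I_y = I*D_y, I_t from a frame
    difference, then a per-pixel least-squares solve over a win x win window."""
    Dx = np.array([[-1, 0, 1], [-1, 0, 1], [-1, 0, 1]], dtype=np.float64) / 6.0  # Prewitt-style kernel
    Dy = Dx.T
    I1, I2 = frame1.astype(np.float64), frame2.astype(np.float64)
    Ix = convolve2d(I1, Dx, mode='same', boundary='symm')
    Iy = convolve2d(I1, Dy, mode='same', boundary='symm')
    It = I2 - I1

    ones = np.ones((win, win))
    Sxx = convolve2d(Ix * Ix, ones, mode='same')      # window sums of the structure terms
    Sxy = convolve2d(Ix * Iy, ones, mode='same')
    Syy = convolve2d(Iy * Iy, ones, mode='same')
    Sxt = convolve2d(Ix * It, ones, mode='same')
    Syt = convolve2d(Iy * It, ones, mode='same')

    det = Sxx * Syy - Sxy ** 2
    det = np.where(np.abs(det) < 1e-6, np.inf, det)   # guard against singular windows
    u = (-Syy * Sxt + Sxy * Syt) / det                # closed-form 2x2 least-squares solution
    v = ( Sxy * Sxt - Sxx * Syt) / det
    return u, v
```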
s4: respectively constructing a gray level, optical flow x and y channel network:
Fig. 2 shows the single-channel network structure. Each channel is downsampled so that the picture size becomes 150 × 100; 5 convolutional layers and 3 pooling layers are constructed, followed by a fully connected layer. The convolution kernels of the first layer have a size of 5 x 5, the subsequent convolution kernels have a size of 3 x 3, and the stride is set to 1. 3D max pooling is adopted in the pooling layers, and the pooling kernels are chosen from the two sizes 2 x 2 and 2 x 1, so that the time dimension is prevented from being reduced too quickly in the later layers. ReLU is selected as the activation function; it simulates the activation model of a brain neuron receiving a signal more accurately and, compared with the sigmoid function, has the characteristics of unilateral inhibition, a relatively wide excitation boundary and sparse activation.
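A minimal Keras sketch of one such channel is given below. The clip length (16 frames), the filter counts and the exact placement of the pooling layers are assumptions; only the layer counts (5 convolutions, 3 poolings, 1 fully connected layer), the 5 × 5 then 3 × 3 kernels, stride 1, 3D max pooling and the ReLU activation follow the description:

```python
from tensorflow.keras import layers, models

def build_channel(frames=16, height=150, width=100, name='gray'):
    """One channel (grayscale or one optical-flow direction): 5 conv layers,
    3 pooling layers, then a fully connected layer producing the feature map
    that is later decomposed by the discrete wavelet transform."""
    inp = layers.Input(shape=(frames, height, width, 1), name=f'{name}_input')
    x = layers.Conv3D(32, (3, 5, 5), strides=1, padding='same', activation='relu')(inp)  # first kernel: 5x5 spatially
    x = layers.MaxPooling3D(pool_size=(2, 2, 2))(x)
    x = layers.Conv3D(64, (3, 3, 3), strides=1, padding='same', activation='relu')(x)
    x = layers.Conv3D(64, (3, 3, 3), strides=1, padding='same', activation='relu')(x)
    x = layers.MaxPooling3D(pool_size=(2, 2, 2))(x)
    x = layers.Conv3D(128, (3, 3, 3), strides=1, padding='same', activation='relu')(x)
    x = layers.Conv3D(128, (3, 3, 3), strides=1, padding='same', activation='relu')(x)
    x = layers.MaxPooling3D(pool_size=(1, 2, 2))(x)   # keep the time axis here so it does not shrink too fast
    x = layers.Flatten()(x)
    x = layers.Dense(512, activation='relu', name=f'{name}_fc')(x)  # 512-d feature map fed to the DWT step
    return models.Model(inp, x, name=f'{name}_channel')
```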
S5: performing discrete wavelet transform on the feature mapping of the full connection layer at the tail end of each channel:
High- and low-frequency components are extracted from the feature map of the fully connected layer at the end of each channel by discrete wavelet transform; the continuous wavelet function ψ_{a,b}(t) can be written as a discrete wavelet function:
$$
\psi_{j,k}(t)=a_0^{-j/2}\,\psi\!\left(a_0^{-j}t-k b_0\right),\qquad j,k\in\mathbb{Z}
$$
the discrete wavelet transform is obtained in the form:
$$
W_f(j,k)=\langle f,\psi_{j,k}\rangle=a_0^{-j/2}\int_{-\infty}^{+\infty} f(t)\,\psi^{*}\!\left(a_0^{-j}t-k b_0\right)\mathrm{d}t
$$
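A minimal PyWavelets sketch of this decomposition, applied to one channel's fully connected feature vector, is shown below. The Haar wavelet and the decomposition depth are assumptions: a single-level DWT halves a 512-dimensional vector into 256-dimensional low- and high-frequency bands, so one further level is assumed here to reach the 128-dimensional pairs used in step S6:

```python
import numpy as np
import pywt

fc_features = np.random.randn(512)        # stand-in for one channel's 512-d fully connected feature map

# Level 1: approximation (low-frequency) and detail (high-frequency) coefficients, 256-d each.
low, high = pywt.dwt(fc_features, 'haar')

# Level 2 (assumed): halve each band again to obtain the 128-d low/high pair per channel.
low_128, _ = pywt.dwt(low, 'haar')
_, high_128 = pywt.dwt(high, 'haar')
```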
s6, extracting high-frequency and low-frequency components, and carrying out feature fusion between channels:
In fig. 3, the DWT operation decomposes the 512-dimensional feature maps of the fully connected layers of the grayscale channel and of the optical flow x and y channels into 3 pairs of 128-dimensional feature maps containing high- and low-frequency components; a vector product operation is then performed on the 128-dimensional feature maps of the channels to obtain two groups of 128-dimensional feature maps.
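The "vector product" between channels is interpreted below as an element-wise (Hadamard) product, which keeps the fused vectors 128-dimensional; this interpretation, like the random stand-in data, is an assumption:

```python
import numpy as np

def fuse_channels(gray, flow_x, flow_y):
    """Each argument is a (low, high) pair of 128-d vectors from one channel.
    The three low-frequency bands are multiplied element-wise into one fused low
    band, and the three high-frequency bands into one fused high band."""
    low_fused = gray[0] * flow_x[0] * flow_y[0]
    high_fused = gray[1] * flow_x[1] * flow_y[1]
    return low_fused, high_fused

# Example with stand-in 128-d bands for the three channels:
bands = [(np.random.randn(128), np.random.randn(128)) for _ in range(3)]
low_fused, high_fused = fuse_channels(*bands)
```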
S7: connecting the fused features in series through the merge layer:
A merge layer is added with its mode set to concat, and the fused high-frequency component and low-frequency component are connected in series; n output nodes corresponding to the n classification behaviors (labels) are set and fully connected with all feature maps of the previous layer.
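The merge layer with mode concat corresponds to a plain concatenation; a minimal Keras sketch of the series connection and the n output nodes follows (layers.Concatenate is the current equivalent of the older merge layer, and n = 51 is only an example matching the HMDB51 labels):

```python
from tensorflow.keras import layers, models

high_fused = layers.Input(shape=(128,), name='high_fused')   # fused high-frequency band
low_fused = layers.Input(shape=(128,), name='low_fused')     # fused low-frequency band

n_classes = 51                                                # e.g. the 51 HMDB51 behaviour labels
merged = layers.Concatenate(name='merge')([high_fused, low_fused])              # series connection -> 256-d
outputs = layers.Dense(n_classes, activation='softmax', name='labels')(merged)  # n fully connected output nodes

head = models.Model([high_fused, low_fused], outputs, name='classification_head')
```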
S8: training and extracting optimal weight:
The training sample set is put into the network for training; the model with the minimum loss value is kept through a callback, and the optimal weights are stored.
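A sketch of this training step is shown below, using a Keras ModelCheckpoint callback to keep only the lowest-loss weights; the optimiser, epoch count and validation split are assumptions, and model, x_train and y_train stand for the assembled network and the preprocessed HMDB51 tensors:

```python
from tensorflow.keras.callbacks import ModelCheckpoint

def train_and_keep_best(model, x_train, y_train, weights_path='best_weights.h5'):
    """Train the assembled network and retain only the weights with the lowest
    validation loss, i.e. the 'optimal weights' of step S8."""
    checkpoint = ModelCheckpoint(weights_path, monitor='val_loss',
                                 save_best_only=True, save_weights_only=True)
    model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])
    model.fit(x_train, y_train, epochs=50, batch_size=16,
              validation_split=0.1, callbacks=[checkpoint])
    model.load_weights(weights_path)       # reload the best (minimum-loss) weights
    return model
```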
S9: and sending the video to a trained optimal model for feature extraction.
S10: online identification using a support vector machine:
A 128-dimensional feature map is extracted from the input video stream through the convolutional neural network, the kernel function is chosen as a linear function, and a support vector machine is constructed for classification and identification.
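A minimal scikit-learn sketch of this classification step follows; the feature matrices are random stand-ins for the 128-dimensional feature maps produced by the trained network:

```python
import numpy as np
from sklearn.svm import SVC

# Stand-ins: 128-d features extracted by the trained network for labelled training
# clips, and for one clip to be recognised online.
train_features = np.random.randn(200, 128)
train_labels = np.random.randint(0, 51, size=200)
query_feature = np.random.randn(1, 128)

svm = SVC(kernel='linear')                 # kernel function chosen as a linear function
svm.fit(train_features, train_labels)
predicted_label = svm.predict(query_feature)[0]   # the best-matching behaviour label
```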
Compared with a convolutional neural network model without the wavelet-based deep feature fusion, the method of the invention achieves a better effect and reaches a higher accuracy in tests on public data sets. Moreover, the invention is not limited to the action recognition of the specific embodiment and can be widely used for image and video classification and identification.
In the depth multi-feature fusion classification method based on wavelets provided by the invention, the discrete wavelet transform is used to extract low-frequency components and high-frequency components from the feature maps, and the high-frequency components and the low-frequency components are fused separately, so that the low-level information is enhanced and the high-level information is strengthened, and the accuracy and robustness of network identification are improved.
The depth multi-feature fusion classification method based on wavelets is suitable for the technical field of robot vision image processing, and is particularly suitable for deep learning, feature extraction and video image processing.
The foregoing is a further detailed description of the invention in connection with specific preferred embodiments, and it is not intended that the invention be limited to these specific details. For those skilled in the art to which the invention pertains, several simple deductions or substitutions can be made without departing from the spirit of the invention, and all of them shall be considered as belonging to the protection scope of the invention.

Claims (7)

1. A depth multi-feature fusion classification method based on wavelets, characterized in that: the method comprises an offline training stage and an online identification stage, wherein in the offline training stage a convolutional neural network is constructed and trained on samples of n label classes, a discrete wavelet transform is added to a fully connected layer to decompose the deep multi-feature maps, and the obtained high- and low-frequency components are linearly fused to obtain the optimal weights; in the online identification stage, the convolutional neural network works together with a support vector machine to identify and classify the actions in images and videos;
the offline training phase comprises the following steps:
the method comprises the following steps: firstly, constructing a convolutional neural network for training;
step two: setting up 3 channels in the first layer, namely 1 grayscale channel and 2 optical flow channels, wherein the grayscale channel contains the grayscale image group of a video clip, and the optical flow channels contain the motion relation information between two frames of the video clip;
step three: constructing a multi-module convolutional neural network, wherein each module corresponds to one channel;
step four: extracting high-frequency and low-frequency components from the feature maps of the fully connected layers of all channels by discrete wavelet transform, fusing the high-frequency components across the channels, and fusing the low-frequency components across the channels;
step five: connecting the fused high-frequency and low-frequency components in series through the merge layer and fully connecting the fused high-frequency and low-frequency components with the next layer to obtain a group of 128-dimensional feature maps;
step six: setting n output nodes corresponding to the n classification behaviors, wherein each node is fully connected with all feature maps of the previous layer;
step seven: adjusting the calculation parameters among all layers through a back propagation algorithm to reduce the error between the output of each sample and the label, and setting the label for each output vector according to the corresponding sample video behavior name after the error meets the requirement and the training is finished;
the on-line identification phase comprises the following steps:
step eight: inputting a video stream to be identified, preprocessing the video, loading a weight through an optimal model obtained in offline training, and extracting a feature vector from the video stream to be identified through the network layers from the second step to the seventh step;
step nine: classifying the feature vectors in the step eight by adopting a support vector machine, and finding out the label which is most matched with the feature vectors to obtain the optimal accuracy;
the depth multi-feature fusion classification method based on the wavelet comprises the following steps:
s1: acquiring a training sample image;
s2: preprocessing an image;
s3: constructing a gray scale and optical flow multi-channel network channel;
s4: respectively constructing a gray level, optical flow x and y channel network;
s5: performing discrete wavelet transform on the feature mapping of the full connection layer at the tail end of each channel;
s6: extracting high-frequency and low-frequency components, fusing the high-frequency components of each channel, and fusing the low-frequency components of each channel;
s7: connecting the fused features in series through a merge layer;
s8: training and extracting the optimal weight;
s9: sending the video to a trained optimal model for feature extraction;
s10: online identification is performed using a support vector machine.
2. The wavelet-based depth multi-feature fusion classification method of claim 1, wherein in step S1, training samples and sample labels are obtained from a dataset; in step S2, unifying the resolutions of the video streams in the training sample set, unifying the resolutions by using a Lanczos interpolation method, and interpolating eight adjacent points in the interpolation process along the x and y directions, that is, calculating a weighted sum, where a window function of the Lanczos interpolation method is:
$$
L(x)=\begin{cases}\operatorname{sinc}(x)\,\operatorname{sinc}(x/4), & |x|<4\\ 0, & \text{otherwise}\end{cases},\qquad \operatorname{sinc}(x)=\frac{\sin(\pi x)}{\pi x}
$$
The two-dimensional form is then: L(x, y) = L(x)·L(y).
3. The wavelet-based depth multi-feature fusion classification method of claim 2, characterized in that in step S3 a grayscale channel is established by graying the video stream, the grayscale map retaining the most basic information of the original image; optical flow channels in the x and y directions are established for the extraction of the inter-frame motion information in the video stream; the improved L-K optical flow method is used to extract the inter-frame optical flow information, with a convolution kernel replacing the pyramid downsampling; first the partial derivatives f_x, f_y, f_t are obtained from f(x, y, t), with Prewitt filters as the convolution kernels, namely:
I_x = I * D_x,  I_y = I * D_y,  I_t = I * D_t
velocity estimation using the least squares method:
$$
\begin{bmatrix}u\\ v\end{bmatrix}
=\begin{bmatrix}\sum_i I_{x_i}^{2} & \sum_i I_{x_i}I_{y_i}\\ \sum_i I_{x_i}I_{y_i} & \sum_i I_{y_i}^{2}\end{bmatrix}^{-1}
\begin{bmatrix}-\sum_i I_{x_i}I_{t_i}\\ -\sum_i I_{y_i}I_{t_i}\end{bmatrix}
$$
4. The wavelet-based depth multi-feature fusion classification method of claim 3, characterized in that in step S4 each channel is downsampled so that the picture size becomes 150 × 100; 5 convolutional layers are constructed, connected with 3 pooling layers and then one fully connected layer; the convolution kernel size of the first convolutional layer is 5 × 5, the convolution kernel sizes of the subsequent convolutional layers are all 3 × 3, and the stride is set to 1; 3D max pooling is adopted for the pooling layers, the pooling kernels are chosen from the two sizes 2 × 2 and 2 × 1, and ReLU is selected as the activation function.
5. The wavelet-based depth multi-feature fusion classification method of claim 4, characterized in that in step S5 high- and low-frequency components are extracted from the feature map of the fully connected layer at each channel end by discrete wavelet transform; the continuous wavelet function ψ_{a,b}(t) can be written as a discrete wavelet function:
$$
\psi_{j,k}(t)=a_0^{-j/2}\,\psi\!\left(a_0^{-j}t-k b_0\right),\qquad j,k\in\mathbb{Z}
$$
the discrete wavelet transform is obtained in the form:
$$
W_f(j,k)=\langle f,\psi_{j,k}\rangle=a_0^{-j/2}\int_{-\infty}^{+\infty} f(t)\,\psi^{*}\!\left(a_0^{-j}t-k b_0\right)\mathrm{d}t
$$
6. The wavelet-based depth multi-feature fusion classification method of claim 5, characterized in that in step S6 the 512-dimensional feature maps of the fully connected layers of the grayscale channel and of the optical flow x and y channels are decomposed into 3 pairs of 128-dimensional feature maps containing high- and low-frequency components, and a vector product operation is then performed on the 128-dimensional feature maps of the channels to obtain two groups of 128-dimensional feature maps; in step S7, a merge layer with its mode set to concat is added to connect the fused high-frequency component and low-frequency component in series, and n output nodes corresponding to the n classification behaviors are set and connected with all feature maps of the upper layer.
7. The wavelet-based depth multi-feature fusion classification method of claim 6, wherein in step S8, a training sample set is put into a network for training, a model with the minimum loss value is recalled, and an optimal weight is saved; in step S10, the input video stream is passed through a convolutional neural network to extract a 256-dimensional feature map, a kernel function is selected as a linear function, and a support vector machine is constructed for classification and identification.
CN201710823051.8A 2017-09-13 2017-09-13 Depth multi-feature fusion classification method based on wavelets Active CN107679462B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201710823051.8A CN107679462B (en) 2017-09-13 2017-09-13 Depth multi-feature fusion classification method based on wavelets


Publications (2)

Publication Number Publication Date
CN107679462A CN107679462A (en) 2018-02-09
CN107679462B 2021-10-19

Family

ID=61136412

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201710823051.8A Active CN107679462B (en) 2017-09-13 2017-09-13 Depth multi-feature fusion classification method based on wavelets

Country Status (1)

Country Link
CN (1) CN107679462B (en)

Families Citing this family (16)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108564326B (en) * 2018-04-19 2021-12-21 安吉汽车物流股份有限公司 Order prediction method and device, computer readable medium and logistics system
CN108830296B (en) * 2018-05-18 2021-08-10 河海大学 Improved high-resolution remote sensing image classification method based on deep learning
CN108830308B (en) * 2018-05-31 2021-12-14 西安电子科技大学 Signal-based traditional feature and depth feature fusion modulation identification method
CN108957173A (en) * 2018-06-08 2018-12-07 山东超越数控电子股份有限公司 A kind of detection method for avionics system state
CN109117711B (en) * 2018-06-26 2021-02-19 西安交通大学 Eye movement data-based concentration degree detection device and method based on hierarchical feature fusion
CN109214440A (en) * 2018-08-23 2019-01-15 华北电力大学(保定) A kind of multiple features data classification recognition methods based on clustering algorithm
CN109620244B (en) * 2018-12-07 2021-07-30 吉林大学 Infant abnormal behavior detection method based on condition generation countermeasure network and SVM
CN109741348A (en) * 2019-01-07 2019-05-10 哈尔滨理工大学 A kind of diabetic retina image partition method
CN110236518B (en) * 2019-04-02 2020-12-11 武汉大学 Electrocardio and heart-shock signal combined classification method and device based on neural network
CN112288345A (en) * 2019-07-25 2021-01-29 顺丰科技有限公司 Method and device for detecting loading and unloading port state, server and storage medium
CN110633735B (en) * 2019-08-23 2021-07-30 深圳大学 Progressive depth convolution network image identification method and device based on wavelet transformation
CN110852195A (en) * 2019-10-24 2020-02-28 杭州趣维科技有限公司 Video slice-based video type classification method
CN113658230A (en) * 2020-05-12 2021-11-16 武汉Tcl集团工业研究院有限公司 Optical flow estimation method, terminal and storage medium
CN112330650A (en) * 2020-11-12 2021-02-05 李庆春 Retrieval video quality evaluation method
CN112418168B (en) * 2020-12-10 2024-04-02 深圳云天励飞技术股份有限公司 Vehicle identification method, device, system, electronic equipment and storage medium
CN113408815A (en) * 2021-07-02 2021-09-17 湘潭大学 Deep learning-based traction load ultra-short-term prediction method

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104281853A (en) * 2014-09-02 2015-01-14 电子科技大学 Behavior identification method based on 3D convolution neural network
CN104866831A (en) * 2015-05-29 2015-08-26 福建省智慧物联网研究院有限责任公司 Feature weighted face identification algorithm
CN106228137A (en) * 2016-07-26 2016-12-14 广州市维安科技股份有限公司 A kind of ATM abnormal human face detection based on key point location
CN106251375A (en) * 2016-08-03 2016-12-21 广东技术师范学院 A kind of degree of depth study stacking-type automatic coding of general steganalysis
CN106529467A (en) * 2016-11-07 2017-03-22 南京邮电大学 Group behavior identification method based on multi-feature fusion


Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
3D Convolutional Neural Networks for Human Action Recognition; Shuiwang Ji et al.; IEEE Transactions on Pattern Analysis and Machine Intelligence; 2013-01-31; abstract, sections 2 and 4 *
Research on behavior recognition algorithms based on multi-feature fusion; 杨丽召; China Masters' Theses Full-text Database, Information Science and Technology; 2014-01-15; section 2.1 *

Also Published As

Publication number Publication date
CN107679462A (en) 2018-02-09

Similar Documents

Publication Publication Date Title
CN107679462B (en) Depth multi-feature fusion classification method based on wavelets
Xiao et al. Satellite video super-resolution via multiscale deformable convolution alignment and temporal grouping projection
CN108665496B (en) End-to-end semantic instant positioning and mapping method based on deep learning
CN110111366B (en) End-to-end optical flow estimation method based on multistage loss
CN109299274B (en) Natural scene text detection method based on full convolution neural network
CN111275713B (en) Cross-domain semantic segmentation method based on countermeasure self-integration network
CN111639692A (en) Shadow detection method based on attention mechanism
Yan et al. Combining the best of convolutional layers and recurrent layers: A hybrid network for semantic segmentation
CN112396607A (en) Streetscape image semantic segmentation method for deformable convolution fusion enhancement
CN113642634A (en) Shadow detection method based on mixed attention
CN114898284B (en) Crowd counting method based on feature pyramid local difference attention mechanism
CN112101262B (en) Multi-feature fusion sign language recognition method and network model
CN110532959B (en) Real-time violent behavior detection system based on two-channel three-dimensional convolutional neural network
McIntosh et al. Recurrent segmentation for variable computational budgets
Ma et al. Fusioncount: Efficient crowd counting via multiscale feature fusion
CN116129291A (en) Unmanned aerial vehicle animal husbandry-oriented image target recognition method and device
CN109871790B (en) Video decoloring method based on hybrid neural network model
CN109919215B (en) Target detection method for improving characteristic pyramid network based on clustering algorithm
Zeng et al. Self-attention learning network for face super-resolution
CN115049945A (en) Method and device for extracting lodging area of wheat based on unmanned aerial vehicle image
CN111027472A (en) Video identification method based on fusion of video optical flow and image space feature weight
CN110751271A (en) Image traceability feature characterization method based on deep neural network
CN111242003A (en) Video salient object detection method based on multi-scale constrained self-attention mechanism
CN115393950A (en) Gesture segmentation network device and method based on multi-branch cascade Transformer
CN110853040B (en) Image collaborative segmentation method based on super-resolution reconstruction

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant