CN108154120A - video classification model training method, device, storage medium and electronic equipment - Google Patents
- Publication number: CN108154120A
- Application number: CN201711420935.5A
- Authority: CN (China)
- Prior art keywords: vector, sub-feature, video, target, feature vector
- Legal status: Pending (the legal status is an assumption and is not a legal conclusion; Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed)
Classifications
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V20/00—Scenes; Scene-specific elements
- G06V20/40—Scenes; Scene-specific elements in video content
Abstract
The present invention discloses a video classification model training method, device, storage medium and electronic equipment. The method includes: inputting a video file into a video classification model for learning to obtain a feature vector; dividing the feature vector into multiple sub-feature vectors; selecting one sub-feature vector from the multiple sub-feature vectors as a target sub-feature vector; and inputting the target sub-feature vector into the video classification model for training to obtain a final video classification model. Because only an intercepted portion of the feature vector is input into the video classification model as the target sub-feature vector for training, the size of the input data and of the data derived from it is reduced, which reduces the number of training parameters and improves training efficiency.
Description
Technical field
The present invention relates to the video field, and more specifically to a video classification model training method, device, storage medium and electronic equipment.
Background
When classifying video files, a video classification model must first be trained in advance to obtain an optimized video classification model. Training a video classification model involves many parameters, so training directly with traditional algorithms is extremely inefficient and the training time is too long.
Summary of the invention
The technical problem to be solved by the invention is to provide a video classification model training method, device, storage medium and electronic equipment that can improve training efficiency and reduce training time.
The purpose of the present invention is achieved through the following technical solutions:
In a first aspect, an embodiment of the present application provides a video classification model training method, including:
inputting a video file into a video classification model for learning to obtain a feature vector;
dividing the feature vector into multiple sub-feature vectors;
selecting one sub-feature vector from the multiple sub-feature vectors as a target sub-feature vector; and
inputting the target sub-feature vector into the video classification model for training to obtain a final video classification model.
In a second aspect, an embodiment of the present application provides a video classification model training device, including:
a first acquisition unit for inputting a video file into a video classification model for learning to obtain a feature vector;
a division unit for dividing the feature vector into multiple sub-feature vectors;
a selection unit for selecting one sub-feature vector from the multiple sub-feature vectors as a target sub-feature vector; and
a training unit for inputting the target sub-feature vector into the video classification model for training to obtain a final video classification model.
In a third aspect, an embodiment of the present application provides a storage medium on which a computer program is stored; when the computer program runs on a computer, the computer performs the above video classification model training method.
In a fourth aspect, an embodiment of the present application provides an electronic device including a processor and a memory; the memory stores a computer program, and the processor, by calling the computer program, performs the above video classification model training method.
With the video classification model training method, device, storage medium and electronic equipment provided by the embodiments of the present application, a video file is input into a video classification model for learning to obtain a feature vector; the feature vector is divided into multiple sub-feature vectors; one sub-feature vector is selected from the multiple sub-feature vectors as a target sub-feature vector; and the target sub-feature vector is input into the video classification model for training to obtain a final video classification model. Because only an intercepted portion of the feature vector is input into the video classification model as the target sub-feature vector for training, the size of the input data and of the data derived from it is reduced, which reduces the number of training parameters and improves training efficiency.
Description of the drawings
To describe the technical solutions of the embodiments more clearly, the accompanying drawings needed in the description are briefly introduced below. Obviously, the drawings in the following description show only some embodiments of the present application; those skilled in the art can obtain other drawings from these drawings without creative effort.
Fig. 1 is a first flow diagram of the video classification model training method provided by an embodiment of the present application;
Fig. 2 is a block diagram of the video classification model training method provided by an embodiment of the present application;
Fig. 3 is a second flow diagram of the video classification model training method provided by an embodiment of the present application;
Fig. 4 is a third flow diagram of the video classification model training method provided by an embodiment of the present application;
Fig. 5 is a fourth flow diagram of the video classification model training method provided by an embodiment of the present application;
Fig. 6 is a structural diagram of the video classification model training device provided by an embodiment of the present application.
Specific embodiments
Referring to the drawings, in which identical reference numbers denote identical components, the principles of the present application are illustrated as implemented in a suitable computing environment. The following description is based on the illustrated specific embodiments of the present application and should not be considered as limiting other specific embodiments not detailed herein.
In the following description, specific embodiments of the present application are described with reference to steps and symbolic representations of operations performed by one or more computers, unless otherwise stated. These steps and operations, which are at times referred to as being computer-executed, include the manipulation by a computer processing unit of electrical signals representing data in a structured form. This manipulation transforms the data or maintains it at locations in the computer's memory system, which reconfigures or otherwise alters the operation of the computer in a manner well understood by those skilled in the art. The data structures in which the data is maintained are physical locations in memory that have particular properties defined by the format of the data. However, while the principles of the application are described in the foregoing text, this is not meant as a limitation; those skilled in the art will appreciate that the various steps and operations described below may also be implemented in hardware.
The term "unit" as used herein may be regarded as a software object executed on a computing system. The different components, units, engines and services described herein may be regarded as implementation objects on the computing system. The devices and methods described herein may be implemented in software, and may of course also be implemented in hardware, both within the protection scope of the present application.
The terms "comprising" and "having" and any variations thereof in the present application are intended to cover non-exclusive inclusion. For example, a process, method, system, product or device that comprises a series of steps or modules is not limited to the listed steps or modules; some embodiments further include steps or modules that are not listed, or other steps or modules inherent to the process, method, product or device.
Reference herein to an "embodiment" means that a particular feature, structure or characteristic described in connection with the embodiment can be included in at least one embodiment of the present application. The appearances of this phrase in various places in the specification do not necessarily all refer to the same embodiment, nor to separate or alternative embodiments mutually exclusive of other embodiments. Those skilled in the art will understand, explicitly and implicitly, that the embodiments described herein can be combined with other embodiments.
An embodiment of the present application provides a video classification model training method. The execution body of the method may be the video classification model training device provided by the embodiments of the present application, or an electronic device integrating the video classification model training device; the video classification model training device may be implemented in hardware or in software.
The embodiments of the present application are described from the perspective of the video classification model training device, which may specifically be integrated in an electronic device. The video classification model training method includes: inputting a video file into a video classification model for learning to obtain a feature vector; dividing the feature vector into multiple sub-feature vectors; selecting one sub-feature vector from the multiple sub-feature vectors as a target sub-feature vector; and inputting the target sub-feature vector into the video classification model for training to obtain a final video classification model.
The electronic device includes devices such as a smartphone, tablet computer, palmtop computer, personal computer, server and cloud server.
Referring to Fig. 1 and Fig. 2, Fig. 1 is a first flow diagram of the video classification model training method provided by an embodiment of the present application, and Fig. 2 is a block diagram of the same method. The specific flow of the video classification model training method can be as follows:
Step 101: input the video file into the video classification model for learning to obtain a feature vector.
The video file may be in a format such as MJPEG, AVI, RMVB or 3GP; the format of the video file is not limited here.
The video classification model may be a convolutional neural network model, a recurrent neural network model, or the like. It may also be a SENet (Squeeze-and-Excitation Networks) model.
When the video file is input into the video classification model, the model obtains from the video file a feature vector corresponding to the classification information of the video file, such as its scene features, character features, object features and temporal features.
Step 102: divide the feature vector into multiple sub-feature vectors.
Along one dimension of the feature vector, the feature vector is divided into multiple sub-feature vectors. For example, a 2048*200 feature vector is divided into four 512*200 sub-feature vectors.
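As an illustration only (not part of the patent's disclosure), the split in step 102 can be sketched in a few lines of numpy, with the array shapes taken from the example above:

```python
import numpy as np

# Illustrative sketch of step 102: split a 2048*200 feature vector
# into four equal 512*200 sub-feature vectors along the feature axis.
features = np.random.rand(2048, 200)          # feature vector from the model
sub_features = np.split(features, 4, axis=0)  # four 512*200 slices

print(len(sub_features), sub_features[0].shape)  # 4 (512, 200)
```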
Step 103: select one sub-feature vector from the multiple sub-feature vectors as a target sub-feature vector.
Any one of the multiple sub-feature vectors may be selected as the target sub-feature vector: the first, the last, or any intermediate sub-feature vector. The selection may also be determined from the data within each sub-feature vector; for example, the sum of the data of each sub-feature vector is computed, and the sub-feature vector whose sum is the largest, the smallest or the median is taken as the target. Alternatively, the variance of each sub-feature vector may be calculated, and the sub-feature vector with the smallest variance is taken as the target.
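The selection strategies above can be sketched as follows; this is an illustrative reading of the passage, not code from the patent, and the random arrays stand in for real sub-feature vectors:

```python
import numpy as np

# Illustrative sketch of step 103: pick the target sub-feature vector
# either by the sum of its entries (here: the maximum) or by the
# smallest variance.
rng = np.random.default_rng(0)
sub_features = [rng.random((512, 200)) for _ in range(4)]

sums = [s.sum() for s in sub_features]
target_by_max = sub_features[int(np.argmax(sums))]       # largest-sum strategy

variances = [s.var() for s in sub_features]
target_by_var = sub_features[int(np.argmin(variances))]  # smallest-variance strategy
```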
Step 104: input the target sub-feature vector into the video classification model for training to obtain a final video classification model.
After the target sub-feature vector is obtained, it is input into the video classification model for training, yielding an optimized video classification model in which the parameter values have been optimized. Specifically, the video classification model may include a NetVLAD layer, a processing layer added to a convolutional neural network to make VLAD (Vector of Locally Aggregated Descriptors) aggregation differentiable, so that image encodings can be learned through back-propagation.
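As background, a NetVLAD layer soft-assigns each frame descriptor to learned cluster centers and aggregates the residuals. The minimal numpy sketch below illustrates the forward pass only; the patent gives no implementation, and all shapes and parameter names here are assumptions:

```python
import numpy as np

def netvlad(features, centers, weights, biases):
    """Forward pass of a minimal NetVLAD sketch.
    features: (N, D) frame descriptors; centers: (K, D) cluster centers;
    weights: (D, K) and biases: (K,) of the soft-assignment layer."""
    logits = features @ weights + biases               # (N, K)
    logits -= logits.max(axis=1, keepdims=True)        # numerical stability
    a = np.exp(logits)
    a /= a.sum(axis=1, keepdims=True)                  # soft assignment
    residuals = features[:, None, :] - centers[None, :, :]       # (N, K, D)
    vlad = (a[:, :, None] * residuals).sum(axis=0)     # (K, D)
    vlad /= np.linalg.norm(vlad, axis=1, keepdims=True) + 1e-12  # intra-norm
    flat = vlad.ravel()
    return flat / (np.linalg.norm(flat) + 1e-12)       # final L2 norm

rng = np.random.default_rng(1)
N, D, K = 200, 512, 8
out = netvlad(rng.random((N, D)), rng.random((K, D)),
              rng.random((D, K)), rng.random(K))
print(out.shape)  # (4096,)
```

In a real layer the centers, weights and biases are parameters trained by back-propagation; here they are random placeholders.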
Referring to Fig. 3, Fig. 3 is a second flow diagram of the video classification model training method provided by an embodiment of the present application. The specific flow can be as follows:
Step 201: input the video file into the video classification model for learning to obtain a feature vector.
The video classification model may be a convolutional neural network model, a recurrent neural network model, or the like. It may also be a SENet (Squeeze-and-Excitation Networks) model.
When the video file is input into the video classification model, the model obtains from the video file a feature vector corresponding to the classification information of the video file, such as its scene features, character features, object features and temporal features.
Step 202: divide the feature vector, by vector length, into multiple sub-feature vectors of equal vector length connected in sequence.
The division may be performed according to the vector length of the feature vector, yielding multiple sub-feature vectors of equal vector length connected in sequence. For example, a fully connected layer may compress the data into a one-dimensional feature vector of length 2048, with each one-dimensional feature vector representing one frame image. Thus, if 200 frame images are extracted from the video file at a rate of one frame per second, the video file yields one 2048*200 feature vector. Alternatively, the number of images to extract may be preset; the total play time of the video file is then obtained and divided by the number of images to get the extraction frequency. For example, if 300 frame images are to be extracted and the total play time of the video file is 30 minutes, dividing 30 minutes by 300 gives one frame image every 6 seconds; each frame image is a one-dimensional feature vector of length 2048, so the feature vector is 2048*300.
The feature vector is then divided along the vector length, i.e. the length 2048 of the one-dimensional feature vector, into multiple sub-feature vectors of equal vector length connected in sequence, for example four 512*200 sub-feature vectors.
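The sampling-rate arithmetic in the example above reduces to a single division; the sketch below is illustrative only, with the numbers taken from the passage:

```python
# Illustrative sketch of the frame-extraction computation in step 202:
# a preset frame count and the video's total play time determine the
# extraction interval and the final feature-vector shape.
total_seconds = 30 * 60                  # 30-minute video
frame_count = 300                        # preset number of frames to extract
interval = total_seconds / frame_count   # one frame every 6 seconds
feature_length = 2048                    # length of each one-dimensional frame feature

print(interval, (feature_length, frame_count))  # 6.0 (2048, 300)
```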
Step 203: from the multiple sub-feature vectors, select the first or the last sub-feature vector as the target sub-feature vector.
Some video files introduce clips of the video at the beginning or the end. Therefore, from the multiple sub-feature vectors, the first sub-feature vector, corresponding to the beginning of the video file, or the last sub-feature vector, corresponding to the end of the video file, is selected as the target sub-feature vector.
Step 204: input the target sub-feature vector into the video classification model for training to obtain a final video classification model.
After the target sub-feature vector is obtained, it is input into the video classification model for training, yielding an optimized video classification model in which the parameter values have been optimized.
In some embodiments, any one of the multiple sub-feature vectors may be selected as the target sub-feature vector.
Referring to Fig. 4, Fig. 4 is a third flow diagram of the video classification model training method provided by an embodiment of the present application. The specific flow can be as follows:
Step 301: input the video file into the video classification model for learning to obtain a feature vector.
The video classification model may be a convolutional neural network model, a recurrent neural network model, or the like. It may also be a SENet (Squeeze-and-Excitation Networks) model.
When the video file is input into the video classification model, the model obtains from the video file a feature vector corresponding to the classification information of the video file, such as its scene features, character features, object features and temporal features.
Step 3021: divide the feature vector, by vector length, into multiple continuous feature vector segments of equal vector length.
The division may be performed according to the vector length of the feature vector, yielding multiple feature vector segments of equal vector length connected in sequence. For example, a fully connected layer may compress the data into a one-dimensional feature vector of length 2048, with each one-dimensional feature vector representing one frame image. Thus, if 200 frame images are extracted from the video file at a rate of one frame per second, the video file yields one 2048*200 feature vector. Alternatively, the number of images to extract may be preset; the total play time of the video file is then obtained and divided by the number of images to get the extraction frequency. For example, if 300 frame images are to be extracted and the total play time of the video file is 30 minutes, dividing 30 minutes by 300 gives one frame image every 6 seconds; each frame image is a one-dimensional feature vector of length 2048, so the feature vector is 2048*300.
The feature vector is then divided along the vector length, i.e. the length 2048 of the one-dimensional feature vector, into multiple feature vector segments of equal vector length connected in sequence, for example sixteen 128*200 feature vector segments.
Step 3022: combine at least two of the multiple feature vector segments into one sub-feature vector to obtain multiple sub-feature vectors, one of which includes the first feature vector segment and the last feature vector segment.
At least two of the feature vector segments are combined into one sub-feature vector, for example two segments per sub-feature vector, thereby obtaining multiple sub-feature vectors. Some video files introduce clips of the video at the beginning or the end. Therefore, from the multiple feature vector segments, the first segment, corresponding to the beginning of the video file, and the last segment, corresponding to the end of the video file, are merged into one sub-feature vector. This sub-feature vector may include only the first and last segments, or it may additionally include one or more segments from other positions.
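Under the same shape assumptions as the example above, steps 3021-3022 can be sketched as follows (illustrative only):

```python
import numpy as np

# Illustrative sketch of steps 3021-3022: split a 2048*200 feature
# vector into sixteen 128*200 segments, then stitch the first and
# last segments into one sub-feature vector.
features = np.random.rand(2048, 200)
segments = np.split(features, 16, axis=0)   # sixteen 128*200 segments

target = np.concatenate([segments[0], segments[-1]], axis=0)
print(target.shape)  # (256, 200)
```

Note that the resulting 256*200 sub-feature vector also satisfies the one-eighth lower bound stated below.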
Step 303: select the sub-feature vector that includes the first feature vector segment and the last feature vector segment as the target sub-feature vector.
Step 304: input the target sub-feature vector into the video classification model for training to obtain a final video classification model.
After the target sub-feature vector is obtained, it is input into the video classification model for training, yielding an optimized video classification model in which the parameter values have been optimized.
It should be noted that in the above embodiments, the vector length of the target sub-feature vector is between one eighth and one half of the vector length of the feature vector. For example, if the feature vector is 2048*200, the target sub-feature vector is between 256*200 and 1024*200.
Referring to Fig. 5, Fig. 5 is a fourth flow diagram of the video classification model training method provided by an embodiment of the present application. The specific flow can be as follows:
Step 401: input the video file into the front section of the video classification model for learning to obtain a feature vector.
Step 402: divide the feature vector into multiple sub-feature vectors.
Step 403: select one sub-feature vector from the multiple sub-feature vectors as the target sub-feature vector.
Step 404: input the target sub-feature vector into the rear section of the video classification model for training to obtain a final video classification model.
The feature vector (2048*200) that the video file yields from a classification model such as a SENet model is cut evenly into four segments (512*200), each segment serving as an independent sub-feature vector; one of these segments is then arbitrarily chosen as the feature vector of the entire video file for the next stage of training. This reduces the size of each feature, thereby reducing the number of training parameters and improving training efficiency: only an intercepted portion of the feature vector, treated as a whole feature, enters the next layer for training, which reduces the size of each subsequent feature vector.
In some embodiments, a feature vector may be extracted from the video file and input into the algorithm model for learning to obtain a weight value for each feature in the feature vector. The weight values are divided into several intervals, and the features in the feature vector are divided into multiple sub-feature vectors according to their weight values, so that each sub-feature vector includes features of different weight values and the number of features falling in the same weight interval is equal across the different sub-feature vectors.
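One way to read this weight-based division is: bin the per-feature weights into intervals, then deal each bin's features evenly across the sub-feature vectors. The sketch below is an illustrative interpretation, not the patent's algorithm; the bin edges and counts are assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)
weights = rng.random(2048)                 # one learned weight per feature
num_bins, num_subs = 4, 4
# Assign each weight to one of four equal intervals over [0, 1).
bins = np.digitize(weights, np.linspace(0, 1, num_bins + 1)[1:-1])

sub_vectors = [[] for _ in range(num_subs)]
for b in range(num_bins):
    idx = np.flatnonzero(bins == b)
    for i, feat in enumerate(idx):
        sub_vectors[i % num_subs].append(feat)   # round-robin per interval

counts = [len(s) for s in sub_vectors]     # near-equal per sub-vector
```

Each sub-vector then receives, from every weight interval, a share of features that differs by at most one, matching the requirement that counts within the same weight interval be equal across sub-vectors.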
In some embodiments, consecutive frame images may be extracted from the video file and each frame image classified in the algorithm model to form a first group of features representing object categories and a second group of features representing scene categories. The first group and the second group of features are fused into a third one-dimensional vector, which serves as the initial feature vector of the above embodiments; training then proceeds according to this initial feature vector. For example, the feature vector obtained in step 101 is this third one-dimensional feature vector; it is divided into multiple sub-feature vectors according to scene category and object category, so that each sub-feature vector includes equal quantities of scene-category and object-category features.
As can be seen from the above, in the video classification model training method provided by the embodiments of the present application, a video file is input into a video classification model for learning to obtain a feature vector; the feature vector is divided into multiple sub-feature vectors; one sub-feature vector is selected from the multiple sub-feature vectors as a target sub-feature vector; and the target sub-feature vector is input into the video classification model for training to obtain a final video classification model. Because only an intercepted portion of the feature vector is input into the video classification model as the target sub-feature vector for training, the size of the input data and of the data derived from it is reduced, which reduces the number of training parameters and improves training efficiency.
Referring to Fig. 6, Fig. 6 is a structural diagram of the video classification model training device provided by an embodiment of the present application. The video classification model training device 500 includes a first acquisition unit 501, a division unit 502, a selection unit 503 and a training unit 504. Specifically:
The first acquisition unit 501 is configured to input a video file into a video classification model for learning to obtain a feature vector.
The video file may be in a format such as MJPEG, AVI, RMVB or 3GP; the format of the video file is not limited here.
The video classification model may be a convolutional neural network model, a recurrent neural network model, or the like. It may also be a SENet (Squeeze-and-Excitation Networks) model.
When the video file is input into the video classification model, the model obtains from the video file a feature vector corresponding to the classification information of the video file, such as its scene features, character features, object features and temporal features.
The division unit 502 is configured to divide the feature vector into multiple sub-feature vectors.
Along one dimension of the feature vector, the feature vector is divided into multiple sub-feature vectors. For example, a 2048*200 feature vector is divided into four 512*200 sub-feature vectors.
The selection unit 503 is configured to select one sub-feature vector from the multiple sub-feature vectors as the target sub-feature vector.
Any one of the multiple sub-feature vectors may be selected as the target sub-feature vector: the first, the last, or any intermediate sub-feature vector. The selection may also be determined from the data within each sub-feature vector; for example, the sum of the data of each sub-feature vector is computed, and the sub-feature vector whose sum is the largest, the smallest or the median is taken as the target. Alternatively, the variance of each sub-feature vector may be calculated, and the sub-feature vector with the smallest variance is taken as the target.
The training unit 504 is configured to input the target sub-feature vector into the video classification model for training to obtain the final video classification model.
After the target sub-feature vector is obtained, it is input into the video classification model for training, yielding an optimized video classification model in which the parameter values have been optimized.
In some embodiments, the division unit 502 is further configured to divide the feature vector, by vector length, into multiple sub-feature vectors of equal vector length connected in sequence.
The division may be performed according to the vector length of the feature vector, yielding multiple sub-feature vectors of equal vector length connected in sequence. For example, a fully connected layer may compress the data into a one-dimensional feature vector of length 2048, with each one-dimensional feature vector representing one frame image. Thus, if 200 frame images are extracted from the video file at a rate of one frame per second, the video file yields one 2048*200 feature vector. Alternatively, the number of images to extract may be preset; the total play time of the video file is then obtained and divided by the number of images to get the extraction frequency. For example, if 300 frame images are to be extracted and the total play time of the video file is 30 minutes, dividing 30 minutes by 300 gives one frame image every 6 seconds; each frame image is a one-dimensional feature vector of length 2048, so the feature vector is 2048*300.
The feature vector is then divided along the vector length, i.e. the length 2048 of the one-dimensional feature vector, into multiple sub-feature vectors of equal vector length connected in sequence, for example four 512*200 sub-feature vectors.
The selection unit 503 is further configured to select, from the multiple sub-feature vectors, the first or the last sub-feature vector as the target sub-feature vector.
Some video files introduce clips of the video at the beginning or the end. Therefore, from the multiple sub-feature vectors, the first sub-feature vector, corresponding to the beginning of the video file, or the last sub-feature vector, corresponding to the end of the video file, is selected as the target sub-feature vector.
In some embodiments, the selection unit 503 is further configured to select any one of the multiple sub-feature vectors as the target sub-feature vector.
In some embodiments, division unit 502 are additionally operable to feature vector being divided into multiple vectors by vector length
Equal length and continuous multiple feature vector sections;At least two feature vector sections in multiple feature vector sections are formed one
A sub- feature vector obtains multiple subcharacter vectors, and one of subcharacter vector is including first feature vector section and finally
One feature vector section.
It can be divided according to the vector length of eigen vector, it is equal that division obtains multiple vector lengths, and is sequentially connected
Multiple feature vector sections.It is, for example, possible to use data are carried out the one-dimensional characteristic that compression formation length is 2048 by full articulamentum
Vector, one-dimensional characteristic vector represent a frame image.Therefore, it is such as extracted with the frequency of one frame image of extraction per second from video file
200 frame images, then video file can extract the feature vector of one group of 2048*200.The figure for needing to extract can also be preset
As quantity, total reproduction time of video file is then obtained, then total reproduction time divided by amount of images, obtain obtaining a frame figure
The frequency of picture, for example, preset need extract 300 frame images, the total reproduction time of video file be 30 minutes, then remove within 30 minutes
With 300, obtain obtaining the frequency of a frame image to obtain within every 6 seconds a frame image, each frame image be length be 2048 it is one-dimensional
Feature vector, and then feature vector is 2048*300.
The feature vector is then divided by vector length (the length 2048 of the one-dimensional feature vector) into multiple feature vector segments of equal length connected in sequence, for example into 16 segments of 128*200 each.
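The division step above can be sketched as follows, assuming (as in the example) a 2048*200 feature matrix with one 2048-dimensional column per frame; this is an illustrative sketch, not code from the patent:

```python
import numpy as np

# Placeholder feature matrix: 2048-dim feature per frame, 200 frames.
features = np.zeros((2048, 200), dtype=np.float32)

# Divide along the vector-length axis into 16 equal, consecutive segments,
# each of shape (128, 200), as in the 16 x 128*200 example above.
segments = np.split(features, 16, axis=0)

print(len(segments), segments[0].shape)
```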
At least two of the multiple feature vector segments are combined into one sub-feature vector, for example two feature vector segments per sub-feature vector, thereby obtaining multiple sub-feature vectors. Some video files introduce the content of the video in their opening or closing part; therefore, from the multiple sub-feature vectors, the first feature vector segment (corresponding to the beginning of the video file) and the last feature vector segment (corresponding to its end) are combined into one sub-feature vector. In addition, this sub-feature vector may include only the first and the last feature vector segments, or may also include feature vector segments from one or more other positions.
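A hedged sketch of combining the first and last segments into one sub-feature vector, under the same hypothetical 2048*200 example (variable names are illustrative):

```python
import numpy as np

# Placeholder feature matrix and its 16 consecutive segments of (128, 200).
features = np.arange(2048 * 200, dtype=np.float32).reshape(2048, 200)
segments = np.split(features, 16, axis=0)

# Combine the first segment (beginning of the video) and the last segment
# (end of the video) into a single sub-feature vector of shape (256, 200).
head_tail = np.concatenate([segments[0], segments[-1]], axis=0)

print(head_tail.shape)
```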
The selection unit 503 is further configured to select the sub-feature vector that includes the first feature vector segment and the last feature vector segment as the target sub-feature vector.
In some embodiments, the vector length of the target sub-feature vector is between one eighth and one half of the vector length of the feature vector. For example, if the feature vector is 2048*200, the target sub-feature vector is between 256*200 and 1024*200.
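The length bound above can be expressed as a simple check (an illustrative helper, not from the patent):

```python
def target_length_ok(target_len: int, feature_len: int) -> bool:
    """Check the bound stated above: the target sub-feature vector's length
    lies between 1/8 and 1/2 of the feature vector's length."""
    return feature_len / 8 <= target_len <= feature_len / 2

print(target_length_ok(512, 2048))   # within [256, 1024]
print(target_length_ok(128, 2048))   # below the 1/8 bound
```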
In some embodiments, the classification model includes a front part and a back part. The first acquisition unit 501 is further configured to input the video file into the front part of the video classification model for learning to obtain the feature vector. The training unit 504 is further configured to input the target sub-feature vector into the back part of the video classification model for training.
For example, the feature vector (2048*200) that the video file learns from a classification model such as a SENet model is evenly cut into four segments (512*200), each segment serving as an independent sub-feature vector; one of these segments is then chosen arbitrarily to act as the feature vector of the entire video file for the next stage of training. This reduces the size of each feature, thereby reducing the number of training parameters and improving training efficiency. A portion of the feature vector is intercepted and, as a whole, fed into the next layer for training, which reduces the size of each subsequent feature vector, thereby reducing training parameters and improving training efficiency.
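The interception step in this example can be sketched as follows; the random feature matrix stands in for the output of a model such as SENet, and all names are illustrative:

```python
import numpy as np

rng = np.random.default_rng(0)

# Placeholder for the (2048, 200) feature vector learned by the model's
# front part from the video file.
features = rng.standard_normal((2048, 200)).astype(np.float32)

# Evenly cut into four segments of (512, 200) and pick one at random to
# represent the entire video in the next training stage.
segments = np.split(features, 4, axis=0)
target = segments[rng.integers(len(segments))]

print(target.shape)
```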
As can be seen from the above, the video classification model training apparatus provided by the embodiments of the present application inputs a video file into a video classification model for learning to obtain a feature vector; divides the feature vector into multiple sub-feature vectors; selects one sub-feature vector from the multiple sub-feature vectors as the target sub-feature vector; and inputs the target sub-feature vector into the video classification model for training to obtain the final video classification model. A portion of the feature vector is intercepted as the target sub-feature vector and input into the video classification model for training, which reduces the size of the input data and of the data derived from it, thereby reducing training parameters and improving training efficiency.
In specific implementations, each of the above units may be implemented as an independent entity, or the units may be combined arbitrarily and implemented as one or several entities. For the specific implementation of each of the above units, reference may be made to the foregoing method embodiments, which are not repeated here.
In the embodiments of the present application, the video classification model training apparatus belongs to the same concept as the video classification model training method in the foregoing embodiments. Any method provided in the video classification model training method embodiments can run on the video classification model training apparatus; for the specific implementation process, refer to the embodiments of the video classification model training method, which are not repeated here.
The embodiments of the present application also provide an electronic device. The electronic device includes a processor and a memory, the processor being electrically connected to the memory. The processor is the control center of the electronic device: it connects the various parts of the entire electronic device through various interfaces and lines, and performs the various functions of the electronic device and processes data by running or loading the computer program stored in the memory and invoking the data stored in the memory, thereby monitoring the electronic device as a whole.
The memory may be used to store software programs and units, and the processor executes various functional applications and performs data processing by running the computer programs and units stored in the memory. The memory may mainly include a program storage area and a data storage area, where the program storage area may store an operating system and the computer program required by at least one function (such as a sound playing function or an image playing function), and the data storage area may store data created according to the use of the electronic device. In addition, the memory may include high-speed random access memory, and may also include non-volatile memory, such as at least one magnetic disk storage device, flash memory device, or other non-volatile solid-state storage component. Accordingly, the memory may also include a memory controller to provide the processor with access to the memory.
In the embodiments of the present application, the processor in the electronic device loads the instructions corresponding to the processes of one or more computer programs into the memory according to the following steps, and runs the computer programs stored in the memory, thereby implementing various functions, as follows:
inputting a video file into a video classification model for learning to obtain a feature vector;
dividing the feature vector into multiple sub-feature vectors;
selecting one sub-feature vector from the multiple sub-feature vectors as a target sub-feature vector;
inputting the target sub-feature vector into the video classification model for training to obtain a final video classification model.
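The four steps above can be sketched end to end as follows; `extract_features` and `train_on` are placeholders standing in for the front and back parts of the classification model, and all names and shapes are illustrative assumptions, not the patent's implementation:

```python
import numpy as np

def extract_features(video_frames: np.ndarray) -> np.ndarray:
    # Placeholder for the model's front part: one 2048-dim feature per frame.
    return np.zeros((2048, video_frames.shape[0]), dtype=np.float32)

def train_on(target: np.ndarray) -> str:
    # Placeholder for training the model's back part on the target sub-feature vector.
    return "trained on shape %s" % (target.shape,)

frames = np.zeros((200, 224, 224, 3), dtype=np.uint8)  # 200 sampled frames
features = extract_features(frames)                    # step 1: learn feature vector
segments = np.split(features, 4, axis=0)               # step 2: divide into sub-feature vectors
target = segments[0]                                   # step 3: choose target sub-feature vector
result = train_on(target)                              # step 4: train on the target
print(result)
```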
The embodiments of the present application also provide a storage medium storing a computer program that, when run on a computer, causes the computer to perform the video classification model training method in any of the above embodiments, such as: inputting a video file into a video classification model for learning to obtain a feature vector; dividing the feature vector into multiple sub-feature vectors; selecting one sub-feature vector from the multiple sub-feature vectors as a target sub-feature vector; inputting the target sub-feature vector into the video classification model for training to obtain a final video classification model.
In the embodiments of the present application, the storage medium may be a magnetic disk, an optical disc, a read-only memory (Read Only Memory, ROM), a random access memory (Random Access Memory, RAM), or the like.
In the above embodiments, the description of each embodiment has its own emphasis. For parts not described in detail in one embodiment, reference may be made to the related descriptions of other embodiments.
It should be noted that, for the video classification model training method of the embodiments of the present application, a person of ordinary skill in the art can understand that all or part of the flow of the video classification model training method of the embodiments of the present application may be completed by a computer program controlling the relevant hardware. The computer program may be stored in a computer-readable storage medium, for example in the memory of the electronic device, and executed by at least one processor in the electronic device; the execution process may include the flow of the embodiments of the video classification model training method. The storage medium may be a magnetic disk, an optical disc, a read-only memory, a random access memory, or the like.
The above is a further detailed description of the present invention in combination with specific preferred embodiments, and it cannot be concluded that the specific implementation of the present invention is limited to these descriptions. For a person of ordinary skill in the art to which the present invention belongs, several simple deductions or substitutions may also be made without departing from the concept of the present invention, all of which shall be regarded as falling within the protection scope of the present invention.
Claims (10)
1. A video classification model training method, comprising:
inputting a video file into a video classification model for learning to obtain a feature vector;
dividing the feature vector into a plurality of sub-feature vectors;
selecting one sub-feature vector from the plurality of sub-feature vectors as a target sub-feature vector;
inputting the target sub-feature vector into the video classification model for training to obtain a final video classification model.
2. The video classification model training method according to claim 1, wherein the classification model includes a front part and a back part;
the step of inputting the video file into the video classification model for learning includes:
inputting the video file into the front part of the video classification model for learning to obtain the feature vector;
the step of inputting the target sub-feature vector into the video classification model for training includes:
inputting the target sub-feature vector into the back part of the video classification model for training.
3. The video classification model training method according to claim 1, wherein the step of dividing the feature vector into a plurality of sub-feature vectors includes:
dividing the feature vector, by vector length, into a plurality of sub-feature vectors of equal vector length that are connected in sequence;
the step of selecting one sub-feature vector from the plurality of sub-feature vectors as the target sub-feature vector includes:
selecting the first or the last sub-feature vector from the plurality of sub-feature vectors as the target sub-feature vector.
4. The video classification model training method according to claim 1, wherein the step of dividing the feature vector into a plurality of sub-feature vectors includes:
dividing the feature vector, by vector length, into a plurality of feature vector segments of equal vector length that are consecutive;
combining at least two of the plurality of feature vector segments into one sub-feature vector to obtain a plurality of sub-feature vectors, one of which includes the first feature vector segment and the last feature vector segment;
the step of selecting one sub-feature vector from the plurality of sub-feature vectors as the target sub-feature vector includes:
selecting the sub-feature vector that includes the first feature vector segment and the last feature vector segment as the target sub-feature vector.
5. The video classification model training method according to any one of claims 1 to 4, wherein the vector length of the target sub-feature vector is between one eighth and one half of the vector length of the feature vector.
6. A video classification model training apparatus, comprising:
a first acquisition unit, configured to input a video file into a video classification model for learning to obtain a feature vector;
a division unit, configured to divide the feature vector into a plurality of sub-feature vectors;
a selection unit, configured to select one sub-feature vector from the plurality of sub-feature vectors as a target sub-feature vector;
a training unit, configured to input the target sub-feature vector into the video classification model for training to obtain a final video classification model.
7. The video classification model training apparatus according to claim 6, wherein the classification model includes a front part and a back part;
the first acquisition unit is further configured to input the video file into the front part of the video classification model for learning to obtain the feature vector;
the training unit is further configured to input the target sub-feature vector into the back part of the video classification model for training.
8. The video classification model training apparatus according to claim 6, wherein
the division unit is further configured to divide the feature vector, by vector length, into a plurality of sub-feature vectors of equal vector length that are connected in sequence;
the selection unit is further configured to select the first or the last sub-feature vector from the plurality of sub-feature vectors as the target sub-feature vector.
9. A storage medium on which a computer program is stored, wherein, when the computer program runs on a computer, the computer is caused to perform the video classification model training method according to any one of claims 1 to 5.
10. An electronic device, comprising a processor and a memory, the memory storing a computer program, wherein the processor is configured to perform the video classification model training method according to any one of claims 1 to 5 by invoking the computer program.
Priority Applications (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201711420935.5A CN108154120A (en) | 2017-12-25 | 2017-12-25 | video classification model training method, device, storage medium and electronic equipment |
PCT/CN2018/079907 WO2019127940A1 (en) | 2017-12-25 | 2018-03-21 | Video classification model training method, device, storage medium, and electronic device |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201711420935.5A CN108154120A (en) | 2017-12-25 | 2017-12-25 | video classification model training method, device, storage medium and electronic equipment |
Publications (1)
Publication Number | Publication Date |
---|---|
CN108154120A true CN108154120A (en) | 2018-06-12 |
Family
ID=62465816
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201711420935.5A Pending CN108154120A (en) | 2017-12-25 | 2017-12-25 | video classification model training method, device, storage medium and electronic equipment |
Country Status (2)
Country | Link |
---|---|
CN (1) | CN108154120A (en) |
WO (1) | WO2019127940A1 (en) |
Cited By (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109165722A (en) * | 2018-07-09 | 2019-01-08 | 北京市商汤科技开发有限公司 | Model expansion method and device, electronic equipment and storage medium |
CN109214399A (en) * | 2018-10-12 | 2019-01-15 | 清华大学深圳研究生院 | A kind of improvement YOLOV3 Target Recognition Algorithms being embedded in SENet structure |
CN109614517A (en) * | 2018-12-04 | 2019-04-12 | 广州市百果园信息技术有限公司 | Classification method, device, equipment and the storage medium of video |
CN110175266A (en) * | 2019-05-28 | 2019-08-27 | 复旦大学 | A method of it is retrieved for multistage video cross-module state |
Citations (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN102360434A (en) * | 2011-10-09 | 2012-02-22 | 江苏大学 | Target classification method of vehicle and pedestrian in intelligent traffic monitoring |
US20120076401A1 (en) * | 2010-09-27 | 2012-03-29 | Xerox Corporation | Image classification employing image vectors compressed using vector quantization |
CN105912611A (en) * | 2016-04-05 | 2016-08-31 | 中国科学技术大学 | CNN based quick image search method |
CN106101831A (en) * | 2016-07-15 | 2016-11-09 | 合网络技术(北京)有限公司 | video vectorization method and device |
CN106650617A (en) * | 2016-11-10 | 2017-05-10 | 江苏新通达电子科技股份有限公司 | Pedestrian abnormity identification method based on probabilistic latent semantic analysis |
CN107341452A (en) * | 2017-06-20 | 2017-11-10 | 东北电力大学 | Human bodys' response method based on quaternary number space-time convolutional neural networks |
Family Cites Families (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US6915009B2 (en) * | 2001-09-07 | 2005-07-05 | Fuji Xerox Co., Ltd. | Systems and methods for the automatic segmentation and clustering of ordered information |
CN102930294A (en) * | 2012-10-18 | 2013-02-13 | 上海交通大学 | Chaotic characteristic parameter-based motion mode video segmentation and traffic condition identification method |
CN103218608B (en) * | 2013-04-19 | 2017-05-10 | 中国科学院自动化研究所 | Network violent video identification method |
CN105512631B (en) * | 2015-12-07 | 2019-01-25 | 上海交通大学 | Video detecting method is feared cruelly based on MoSIFT and CSD feature |
- 2017-12-25: CN application CN201711420935.5A filed; publication CN108154120A, status Pending
- 2018-03-21: WO application PCT/CN2018/079907 filed; publication WO2019127940A1, Application Filing
Patent Citations (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20120076401A1 (en) * | 2010-09-27 | 2012-03-29 | Xerox Corporation | Image classification employing image vectors compressed using vector quantization |
CN102360434A (en) * | 2011-10-09 | 2012-02-22 | 江苏大学 | Target classification method of vehicle and pedestrian in intelligent traffic monitoring |
CN105912611A (en) * | 2016-04-05 | 2016-08-31 | 中国科学技术大学 | CNN based quick image search method |
CN106101831A (en) * | 2016-07-15 | 2016-11-09 | 合网络技术(北京)有限公司 | video vectorization method and device |
CN106650617A (en) * | 2016-11-10 | 2017-05-10 | 江苏新通达电子科技股份有限公司 | Pedestrian abnormity identification method based on probabilistic latent semantic analysis |
CN107341452A (en) * | 2017-06-20 | 2017-11-10 | 东北电力大学 | Human bodys' response method based on quaternary number space-time convolutional neural networks |
Non-Patent Citations (1)
Title |
---|
Zhuo Liu: "Research on an automatic verification system for standard reference planes of fetal facial three-dimensional ultrasound", China Master's Theses Full-text Database, Medicine and Health Sciences * |
Cited By (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109165722A (en) * | 2018-07-09 | 2019-01-08 | 北京市商汤科技开发有限公司 | Model expansion method and device, electronic equipment and storage medium |
CN109165722B (en) * | 2018-07-09 | 2021-07-23 | 北京市商汤科技开发有限公司 | Model expansion method and device, electronic equipment and storage medium |
CN109214399A (en) * | 2018-10-12 | 2019-01-15 | 清华大学深圳研究生院 | A kind of improvement YOLOV3 Target Recognition Algorithms being embedded in SENet structure |
CN109614517A (en) * | 2018-12-04 | 2019-04-12 | 广州市百果园信息技术有限公司 | Classification method, device, equipment and the storage medium of video |
CN109614517B (en) * | 2018-12-04 | 2023-08-01 | 广州市百果园信息技术有限公司 | Video classification method, device, equipment and storage medium |
CN110175266A (en) * | 2019-05-28 | 2019-08-27 | 复旦大学 | A method of it is retrieved for multistage video cross-module state |
CN110175266B (en) * | 2019-05-28 | 2020-10-30 | 复旦大学 | Cross-modal retrieval method for multi-segment video |
Also Published As
Publication number | Publication date |
---|---|
WO2019127940A1 (en) | 2019-07-04 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
Rochan et al. | Video summarization using fully convolutional sequence networks | |
Sun et al. | Lattice long short-term memory for human action recognition | |
CN111209440B (en) | Video playing method, device and storage medium | |
CN110147711A (en) | Video scene recognition methods, device, storage medium and electronic device | |
CN108090203A (en) | Video classification methods, device, storage medium and electronic equipment | |
CN110990631A (en) | Video screening method and device, electronic equipment and storage medium | |
Wang et al. | Dynamic attention guided multi-trajectory analysis for single object tracking | |
Hou et al. | Content-attention representation by factorized action-scene network for action recognition | |
CN108154120A (en) | video classification model training method, device, storage medium and electronic equipment | |
Hii et al. | Multigap: Multi-pooled inception network with text augmentation for aesthetic prediction of photographs | |
CN112200041B (en) | Video motion recognition method and device, storage medium and electronic equipment | |
CN110688524A (en) | Video retrieval method and device, electronic equipment and storage medium | |
CN111539290A (en) | Video motion recognition method and device, electronic equipment and storage medium | |
CN109086697A (en) | A kind of human face data processing method, device and storage medium | |
Wang et al. | Multiscale deep alternative neural network for large-scale video classification | |
CN111783712A (en) | Video processing method, device, equipment and medium | |
CN113761359B (en) | Data packet recommendation method, device, electronic equipment and storage medium | |
Su et al. | Transfer learning for video recognition with scarce training data for deep convolutional neural network | |
CN108133020A (en) | Video classification methods, device, storage medium and electronic equipment | |
CN113779303A (en) | Video set indexing method and device, storage medium and electronic equipment | |
CN114637923A (en) | Data information recommendation method and device based on hierarchical attention-graph neural network | |
CN112784929A (en) | Small sample image classification method and device based on double-element group expansion | |
CN111310041A (en) | Image-text publishing method, model training method and device and storage medium | |
WO2022183805A1 (en) | Video classification method, apparatus, and device | |
Chiang et al. | A multi-embedding neural model for incident video retrieval |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
WD01 | Invention patent application deemed withdrawn after publication | ||
Application publication date: 20180612 |