CN109992679A - A classification method and device for multimedia data - Google Patents
A classification method and device for multimedia data
- Publication number: CN109992679A (application number CN201910218914.8A)
- Authority: CN (China)
- Prior art keywords: multimedia data, multi-frame, data, characteristic information, group
- Legal status: Pending (the legal status is an assumption and is not a legal conclusion; Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed)
- Landscapes: Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
The embodiments of the present invention disclose a classification method and device for multimedia data, applied in the field of information processing technology. In the method of the embodiments, the classification device divides multi-frame multimedia data to be processed into multiple groups of multimedia data in chronological order and extracts the combined feature information corresponding to each group; finally, according to the combined feature information corresponding to each group, it determines the global feature information of the multi-frame multimedia data, so as to classify the multi-frame multimedia data. In this way, when describing the features of the multi-frame multimedia data, the classification device takes the temporal characteristics between the frames into account, so that the resulting global feature information better reflects the multi-frame multimedia data and the classification of the multi-frame multimedia data is more accurate.
Description
Technical field
The present invention relates to the field of information processing technology, and in particular to a classification method and device for multimedia data.
Background art
When classifying a video to be processed, existing methods usually first extract the feature information of the video, and then determine the probability that the video belongs to each type according to the extracted feature information and a video classification model.
Normally, the extracted feature information of the video to be processed is a feature-vector description, which may specifically include: Histogram of Gradient, Histogram of Optical Flow, Bag of Visual Words, Fisher Vector, Vector of Locally Aggregated Descriptors (VLAD), Vector of Network Locally Aggregated Descriptors (NetVLAD), and so on. Different feature-vector description methods yield different classification performance for video or picture classification models with different characteristics.
The feature information currently extracted from a video to be processed mainly accounts for single-frame-level features in the video, which is not comprehensive, so the result finally obtained through the video classification model is not very accurate.
Summary of the invention
The embodiments of the present invention provide a classification method and device for multimedia data, which determine the global feature information of multi-frame multimedia data according to the combined feature information corresponding to each of multiple groups of multimedia data.
A first aspect of the embodiments of the present invention provides a classification method for multimedia data, comprising:
obtaining multi-frame multimedia data to be processed;
dividing the multi-frame multimedia data into multiple groups of multimedia data in chronological order, each group including at least one frame of temporally continuous multimedia data;
extracting the combined feature information corresponding to each group of multimedia data;
determining, according to the combined feature information corresponding to each group of multimedia data, the global feature information of the multi-frame multimedia data, so as to classify the multi-frame multimedia data.
A second aspect of the embodiments of the present invention provides a classification device for multimedia data, comprising:
a data acquisition unit, configured to obtain multi-frame multimedia data to be processed;
a division unit, configured to divide the multi-frame multimedia data into multiple groups of multimedia data in chronological order, each group including at least one frame of temporally continuous multimedia data;
a feature extraction unit, configured to extract the combined feature information corresponding to each group of multimedia data;
a feature determination unit, configured to determine, according to the combined feature information corresponding to each group of multimedia data, the global feature information of the multi-frame multimedia data, so as to classify the multi-frame multimedia data.
A third aspect of the embodiments of the present invention provides a storage medium storing a plurality of instructions, the instructions being suitable for being loaded by a processor to execute the classification method for multimedia data described in the first aspect.
A fourth aspect of the embodiments of the present invention provides a server, including a processor and a storage medium, the processor being configured to implement each instruction, and the storage medium being configured to store a plurality of instructions to be loaded by the processor to execute the classification method for multimedia data described in the first aspect.
It can be seen that, in the method of this embodiment, the classification device divides multi-frame multimedia data to be processed into multiple groups of multimedia data in chronological order and extracts the combined feature information corresponding to each group; finally, according to the combined feature information corresponding to each group, it determines the global feature information of the multi-frame multimedia data, so as to classify the multi-frame multimedia data. In this way, when describing the features of the multi-frame multimedia data, the classification device takes the temporal characteristics between the frames into account, so that the resulting global feature information better reflects the multi-frame multimedia data and the classification of the multi-frame multimedia data is more accurate.
Brief description of the drawings
In order to explain the technical solutions in the embodiments of the present invention or in the prior art more clearly, the drawings needed in the description of the embodiments are briefly introduced below. Obviously, the drawings in the following description are only some embodiments of the present invention; those of ordinary skill in the art may obtain other drawings from them without creative effort.
Fig. 1 is a schematic diagram of a classification method for multimedia data provided in an embodiment of the present invention;
Fig. 2 is a flow chart of a classification method for multimedia data provided in an embodiment of the present invention;
Fig. 3 is a schematic diagram of a classification method for multimedia data provided in an application embodiment of the present invention;
Fig. 4 shows the recommendation information, sent by the application server according to a video file uploaded by the user, displayed by the application terminal in an application embodiment of the present invention;
Fig. 5 is a schematic diagram of obtaining the first global feature information in an application embodiment of the present invention;
Fig. 6 is a structural schematic diagram of a classification device for multimedia data provided in an embodiment of the present invention;
Fig. 7 is a structural schematic diagram of a server provided in an embodiment of the present invention.
Specific embodiment
The technical solutions in the embodiments of the present invention will be described clearly and completely below with reference to the drawings in the embodiments. Obviously, the described embodiments are only some, not all, of the embodiments of the present invention. All other embodiments obtained by those of ordinary skill in the art on the basis of the embodiments of the present invention without creative effort fall within the protection scope of the present invention.
The terms "first", "second", "third", "fourth", etc. (if present) in the description, the claims, and the above drawings are used to distinguish similar objects, not to describe a particular order or sequence. It should be understood that data used in this way are interchangeable where appropriate, so that the embodiments of the present invention described herein can, for example, be implemented in orders other than those illustrated or described herein. In addition, the terms "comprising" and "having" and any variants thereof are intended to cover non-exclusive inclusion; for example, a process, method, system, product, or device that contains a series of steps or units is not necessarily limited to the steps or units explicitly listed, but may include other steps or units that are not explicitly listed or that are inherent to the process, method, product, or device.
An embodiment of the present invention provides a classification method for multimedia data, as shown in Fig. 1, mainly implemented by a classification device through the following steps:
obtaining multi-frame multimedia data to be processed; dividing the multi-frame multimedia data into multiple groups of multimedia data in chronological order, each group including at least one frame of temporally continuous multimedia data; extracting the combined feature information corresponding to each group of multimedia data; and determining, according to the combined feature information corresponding to each group, the global feature information of the multi-frame multimedia data, so as to classify the multi-frame multimedia data.
The method in this embodiment can be applied in a multimedia data recommendation system, and can also be applied in services such as filtering and classifying videos uploaded by users.
In this way, when describing the features of the multi-frame multimedia data, the classification device takes the temporal characteristics between the frames into account, so that the resulting global feature information better reflects the multi-frame multimedia data and the classification of the multi-frame multimedia data is more accurate.
An embodiment of the present invention provides a classification method for multimedia data, mainly performed by the above classification device, with a flow chart as shown in Fig. 2, comprising:
Step 101: obtain multi-frame multimedia data to be processed.
It can be understood that, after a user uploads multi-frame multimedia data to an application server through a certain application terminal, the classification device can initiate the flow of this embodiment for the multi-frame multimedia data uploaded by the application terminal. The classification device may be the application server itself, or a device independent of the application server. The multi-frame multimedia data is specifically multimedia data, such as image data and audio data, corresponding to multiple moments, for example video data.
Step 102: divide the multi-frame multimedia data into multiple groups of multimedia data in chronological order, each group including at least one frame of temporally continuous multimedia data.
Specifically, the classification device can divide the multi-frame multimedia data into mutually disjoint groups in chronological order. The numbers of frames included in different groups may be the same or different; for example, if a certain group includes n frames of multimedia data and another group includes m frames, m may or may not equal n.
For example, if the multi-frame multimedia data consists of the multimedia data at moment T1, the multimedia data at moment T2, ..., and the multimedia data at moment Tn, the classification device can, in chronological order, divide the multimedia data at moments T1 and T2 into one group, the multimedia data at moments T3, T4, and T5 into another group, ..., and the multimedia data at moments Tn-1 and Tn into a final group. In this way, multimedia data in different time periods is obtained.
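The chronological division in step 102 can be sketched as follows. This is an illustrative sketch only, with a hypothetical helper name and a fixed group size, whereas the method above also allows groups of unequal size:

```python
# Hypothetical sketch of step 102: partition time-ordered frames into
# consecutive, mutually disjoint groups. Group sizes need not be equal
# in general; a fixed size is used here for brevity.
def split_into_groups(frames, group_size):
    """Partition a time-ordered frame list into consecutive groups."""
    return [frames[i:i + group_size] for i in range(0, len(frames), group_size)]

frames = [f"t{i}" for i in range(1, 8)]   # 7 frames at moments T1..T7
groups = split_into_groups(frames, 3)
print(groups)  # [['t1', 't2', 't3'], ['t4', 't5', 't6'], ['t7']]
```

Each group contains temporally continuous frames, and no frame appears in two groups, matching the "mutually disjoint" requirement above.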
Step 103: extract the combined feature information corresponding to each group of multimedia data.
In one case, when extracting the combined feature information of any group of multimedia data, the classification device can first extract the feature information corresponding to every frame of multimedia data included in the group, then splice the per-frame feature information of the group together in chronological order, and take the spliced feature information as the combined feature information of the group.
Specifically, when extracting the feature information of a certain frame of multimedia data: if the frame is image data, the classification device can use a convolutional neural network picture classification model, such as Inception-V4, to extract the feature information of the frame of image data; if the frame is audio data, the classification device can use a convolutional neural network audio classification model, such as VGGish, to extract the feature information of the frame of audio data.
In other cases, when extracting the combined feature information of any group of multimedia data, the classification device does not have to obtain it by the above splicing; for example, all frames of multimedia data included in a group can be input directly into a convolutional neural network or a recurrent neural network, which outputs the combined feature information of the group. Other ways may also be used and are not limited here.
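The splicing variant of step 103 can be sketched as follows, under assumed shapes: each frame yields a D-dimensional feature vector, and a group's combined feature is the chronological concatenation of its per-frame vectors (the per-frame extraction itself, e.g. by Inception-V4 or VGGish, is not reproduced here):

```python
import numpy as np

# Sketch of the splicing in step 103 (assumed shapes): each frame has a
# D-dimensional feature; the group's combined feature is the time-ordered
# concatenation of the per-frame features.
def group_feature(frame_features):
    # frame_features: list of (D,) arrays in temporal order
    return np.concatenate(frame_features)

D = 4
group = [np.full(D, t, dtype=float) for t in range(3)]  # 3 frames, tau = 3
combined = group_feature(group)
print(combined.shape)  # (12,), i.e. tau * D
```

The combined vector preserves the temporal order of the frames, which is what lets the later descriptor capture temporal relations within a group.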
Step 104: determine, according to the combined feature information corresponding to each group of multimedia data, the global feature information of the multi-frame multimedia data, so as to classify the multi-frame multimedia data.
Specifically, the classification device can determine the global feature information from the combined feature information of the groups in a manner such as NetVLAD, NeXtVLAD, or NL-NetVLAD. For example, the global feature information can be represented by the differences between the combined feature information of each group and its nearest cluster centre, e.g. by adding these differences and taking the resulting sum as the global feature information.
Further, after the global feature information of the multi-frame multimedia data has been determined, the multi-frame multimedia data can be classified according to the determined global feature information and a preset classification model, obtaining the type information of the multi-frame multimedia data: the determined global feature information is input directly into the preset classification model, which outputs the type information of the multi-frame multimedia data.
It should be noted that, if the above multi-frame multimedia data is video data including multiple frames of image data and multiple frames of audio data, then when executing step 103 the classification device can extract the combined feature information of each group of image data and the combined feature information of each group of audio data; when executing step 104, it can determine the first global feature information of the image frames from the combined feature information of the image groups, and the second global feature information of the audio frames from the combined feature information of the audio groups. In this way, when executing the classification step, the video data can be classified according to the obtained first global feature information and/or second global feature information and the preset classification model.
In addition, it should be noted that the global feature information in steps 101 to 104 above is obtained by dividing the multi-frame multimedia data once. In other embodiments, the classification device can execute steps 102 to 104 multiple times, that is, perform the steps of dividing into groups, extracting combined feature information, and determining global feature information several times; each execution of steps 102 to 104 yields one global feature information, so multiple candidate global feature information can be obtained. The final global feature information of the multi-frame multimedia data is then determined from the multiple candidates, for example as the weighted sum of the candidate global feature information. The groups of multimedia data obtained by each division of the multi-frame multimedia data differ from one another.
In this case, after determining the final global feature information of the multi-frame multimedia data, the classification device can classify the multi-frame multimedia data according to the final global feature information and the preset classification model.
It can thus be seen that, in the method of this embodiment, the classification device divides multi-frame multimedia data to be processed into multiple groups of multimedia data in chronological order and extracts the combined feature information corresponding to each group; finally, according to the combined feature information corresponding to each group, it determines the global feature information of the multi-frame multimedia data, so as to classify the multi-frame multimedia data. In this way, when describing the features of the multi-frame multimedia data, the classification device takes the temporal characteristics between the frames into account, so that the resulting global feature information better reflects the multi-frame multimedia data and the classification of the multi-frame multimedia data is more accurate.
The classification method for multimedia data of the present invention is illustrated below with a specific application example. In this embodiment, the method is applied in an application system that includes an application terminal and an application server, the application server being the classification device described above. In this embodiment the multimedia data is specifically video data, and the global feature information finally obtained is a Temporal Relation based NetVLAD (TR-NetVLAD) descriptor vector. The classification method of this embodiment can then be realised by the following steps, with a schematic diagram as shown in Fig. 3:
Step 201: the user operates the application terminal so that it uploads a video file to the application server.
Step 202: after receiving the video file, the application server can sample it by a video decoding method at a certain sampling frequency (for example 1 frame per second) to obtain the multiple frames of image data and the multiple frames of audio data included in the video file.
Step 203: the application server performs feature representation on the multiple frames of image data and the multiple frames of audio data obtained in step 202. For example, for a T-frame video, the feature information of the T frames of image data and the feature information of the T frames of audio data can be obtained.
Step 204: according to the feature information of the multiple frames of image data, the application server determines the first global feature information of the multi-frame image data, i.e. a TR-NetVLAD descriptor vector, denoted V_video; according to the feature information of the multiple frames of audio data, it determines the second global feature information of the multi-frame audio data, denoted V_audio.
Step 205: the application server can input the first global feature information V_video and the second global feature information V_audio, or either one of them, into a preset classification model to obtain a C-dimensional video class vector expressed as probabilities, where C is the number of preset video classes. The value at each position represents the probability that the video is of the corresponding class; finally, the class vector is converted into classes, obtaining the classification result of the above video file.
Taking C = 3 as an example, a video class vector [0.1, 0.9, 0.7] indicates that the probability that the video file is of the first type is 0.1, of the second type 0.9, and of the third type 0.7. Here each type can occur independently, and the values in the video class vector are not required to sum to 1; however, this scheme does not exclude single-class classification, i.e. the case where the values in the video class vector sum to 1.
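The independent per-class probabilities in the C = 3 example above are the kind produced by a per-class sigmoid rather than a softmax, which is why they need not sum to 1. A sketch under that assumption, with hypothetical logit values chosen to reproduce the example vector:

```python
import math

# Illustrative only: independent per-class probabilities via a per-class
# sigmoid (an assumption; the patent does not name the output activation).
def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

logits = [-2.2, 2.2, 0.85]                 # hypothetical model outputs
probs = [round(sigmoid(z), 2) for z in logits]
labels = [p >= 0.5 for p in probs]         # threshold to class decisions
print(probs, labels)  # [0.1, 0.9, 0.7] [False, True, True]
```

Because each class is scored independently, a video can belong to several classes at once (here the second and third), which matches the multi-label reading of the class vector above.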
Step 206: from the classification result of the video file obtained in step 205, which includes at least one type, the application server can send the information of this at least one type to the application terminal as recommendation information for display.
Fig. 4 shows a recommendation interface in which the application server recommends information to the user of the application terminal according to the multiple types in the classification result. If the classification result includes an entertainment type, a sports type, and a current-affairs type, the recommendation interface displayed by the application terminal will include entertainment news, sports news, current-affairs news, and so on.
It should be noted that, given T frames of D-dimensional feature information, the corresponding TR-NetVLAD descriptor vector v, i.e. the global feature information, is obtained; the length of the vector v is D × K, where K is the number of preset cluster centres. The method of obtaining the global feature information here is an improvement on the NetVLAD feature description method. Therefore, the VLAD feature description method is introduced first, then the NetVLAD feature description method, and finally the TR-NetVLAD feature description method of this embodiment:
(1) VLAD feature description method
For N D-dimensional feature vectors x_1, ..., x_N, the VLAD feature description method finds K D-dimensional cluster centres c_1, ..., c_K among these features and describes each feature by its difference from the nearest cluster centre. The D × K-dimensional VLAD descriptor vector V_VLAD can be expressed by the following formula 1:

V_VLAD(k) = Σ_{n=1..N} a_k(x_n) (x_n − c_k)        (formula 1)

where a_k(x_n) is an indicator function: when c_k is the cluster centre nearest to x_n, a_k(x_n) is 1, otherwise it is 0. Specifically, when computing the VLAD descriptor vector, the D-dimensional residual vector between each feature x_n and its nearest cluster centre c_k is computed, and the resulting residual vector is added to the corresponding position of the matrix V_VLAD.
In this process, L2 regularisation can be applied per cluster centre; L2 regularisation can also be applied to V_VLAD as a whole, specifically by first flattening V_VLAD into a D × K-dimensional vector and then regularising it. In this way the feature values are optimised, making subsequent classification more accurate.
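Formula 1 can be sketched as follows, under assumed array shapes: hard-assign each feature to its nearest centre, accumulate residuals per centre, then flatten and L2-normalise as described above.

```python
import numpy as np

# Minimal VLAD sketch of formula 1 (shapes assumed for illustration):
# each feature is hard-assigned to its nearest cluster centre, its
# residual x_n - c_k is accumulated in row k, and the K x D matrix is
# flattened into the D*K descriptor and L2-normalised as a whole.
def vlad(X, C):
    # X: (N, D) feature vectors; C: (K, D) cluster centres
    K, D = C.shape
    V = np.zeros((K, D))
    for x in X:
        k = int(np.argmin(np.linalg.norm(C - x, axis=1)))  # nearest centre
        V[k] += x - C[k]                                   # residual
    v = V.flatten()                                        # D*K vector
    norm = np.linalg.norm(v)
    return v / norm if norm > 0 else v                     # global L2 step

X = np.array([[0.0, 0.0], [1.0, 1.0], [4.0, 4.0]])
C = np.array([[0.0, 0.0], [4.0, 4.0]])
print(vlad(X, C))  # [0.7071.. 0.7071.. 0. 0.]
```

Only the feature [1, 1] produces a nonzero residual here (the other two coincide with their nearest centres), so the descriptor is the normalised residual [1, 1] in the first centre's slot.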
(2) NetVLAD feature description method
NetVLAD is similar to the VLAD feature description method, except that the indicator function a_k(x_n) and the cluster centres c_1, ..., c_K are defined differently. In NetVLAD, the cluster centres are parameterised and can be updated by training, and the indicator function a_k(x_n) is "softened" into a value ā_k(x_n) between 0 and 1: a score relating each feature x_n to each cluster centre c_k is computed and finally normalised into a relative weight. ā_k(x_n) can therefore be understood as the relative importance of cluster centre c_k for feature x_n, and can be expressed by the following normalisation, formula 2 (written here with the trainable per-cluster parameters w_k, b_k used in the NetVLAD literature):

ā_k(x_n) = exp(w_k^T x_n + b_k) / Σ_{k'} exp(w_{k'}^T x_n + b_{k'})        (formula 2)

Here the value of ā_k(x_n) ranges between 0 and 1 and can be regarded as a relative weight.
Therefore, the NetVLAD descriptor vector V_NetVLAD can be expressed by the following formula 3:

V_NetVLAD(k) = Σ_{n=1..N} ā_k(x_n) (x_n − c_k)        (formula 3)
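Formulas 2 and 3 together can be sketched as follows. The parameterisation (trainable W, b, and centres C) is an assumption drawn from the NetVLAD literature; here the parameters are simply random or fixed rather than trained:

```python
import numpy as np

# NetVLAD sketch of formulas 2-3 (parameterisation assumed): the hard
# indicator is replaced by a softmax over per-cluster scores w_k^T x + b_k;
# the centres C, weights W, and biases b would all be trainable.
def netvlad(X, C, W, b):
    # X: (N, D); C: (K, D); W: (K, D); b: (K,)
    scores = X @ W.T + b                           # (N, K) per-cluster scores
    scores -= scores.max(axis=1, keepdims=True)    # numerical stability
    a = np.exp(scores)
    a /= a.sum(axis=1, keepdims=True)              # soft assignments in [0, 1]
    residuals = X[:, None, :] - C[None, :, :]      # (N, K, D): x_n - c_k
    V = np.einsum('nk,nkd->kd', a, residuals)      # weighted accumulation
    return V.flatten()                             # D*K descriptor

rng = np.random.default_rng(0)
N, D, K = 5, 3, 2
X = rng.normal(size=(N, D))
C = rng.normal(size=(K, D))
V = netvlad(X, C, C, np.zeros(K))  # taking W = C as a simple untrained choice
print(V.shape)  # (6,)
```

Unlike the hard VLAD assignment, every feature contributes a fraction of its residual to every cluster, which is what makes the whole descriptor differentiable and trainable end to end.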
(3) TR-NetVLAD feature description method
In the TR-NetVLAD feature description method of this embodiment, a temporal-relation representation is first applied to the feature information. Specifically, the N features x_1, ..., x_N can be divided, in chronological order, into N/τ mutually disjoint groups, each containing τ consecutive D-dimensional features; the features in each group can be spliced in chronological order into a τ × D-dimensional feature, i.e. the combined feature information of the group. From the combined feature information of each group and formula 3 above, the global feature information, specifically the TR-NetVLAD descriptor vector, can be obtained; it is denoted v^(τ), the global feature information determined at a time scale (temporal scale) τ.
In the same manner, the global feature information at different time scales τ can be obtained; the final global feature information is then determined from the per-scale global feature information, specifically as the weighted sum of the global feature information at the different time scales, which can be expressed by the following formula 4:

v = Σ_τ w_τ v^(τ)        (formula 4)

where w_τ is a tunable parameter; for example, with J time scales, w_τ can be set to 1/J.
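The multi-scale combination of formula 4 can be sketched as follows, under assumed inputs (per-scale descriptors of equal length, equal weights w_τ = 1/J):

```python
import numpy as np

# Sketch of formula 4: TR-NetVLAD descriptors v^(tau) computed at J
# different time scales are combined by a weighted sum, here with the
# equal weights w_tau = 1/J mentioned above.
def combine_scales(per_scale_descriptors):
    J = len(per_scale_descriptors)
    w = 1.0 / J                            # w_tau = 1/J for J scales
    return sum(w * v for v in per_scale_descriptors)

v_scale1 = np.array([1.0, 2.0, 3.0])      # descriptor at time scale tau = 1
v_scale2 = np.array([3.0, 2.0, 1.0])      # descriptor at time scale tau = 2
v_final = combine_scales([v_scale1, v_scale2])
print(v_final)  # [2. 2. 2.]
```

Equal weights are only the default suggested above; since w_τ is tunable, the weights could also be learned or set to favour particular time scales.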
Therefore, as shown in Fig. 5, when the application server determines the first global feature information V_video in the above step 204, it can proceed by the following steps:
Step 301: at a time scale τ, the application server divides the feature information of the T frames of image data into multiple groups of feature information.
Step 302: the application server splices the feature information in each group, in chronological order, into the combined feature information of the group.
Step 303: from the combined feature information of each group and formula 3 above, the application server determines the global feature information of the T frames of image data at time scale τ.
Step 304: by executing the above steps 301 to 303 in a loop, the application server can obtain the global feature information at different time scales τ and, according to formula 4 above, the final first global feature information of the T frames of image data, namely the above V_video.
According to the method of steps 301 to 304, the application server can likewise obtain the second global feature information V_audio of the T frames of audio data.
As it can be seen that the feature of video data is described, most in the present embodiment by TR-NetVLAD character description method
The classification results obtained eventually according to TR-NetVLAD Feature Descriptor vector, than being described under similarity condition according to NetVLAD feature
The performance for the classification results that subvector is assigned to is higher by about 1.7%GAP 20, and first class hit rate is higher by about 2%.Wherein GAP@20 is
Multi-class visual classification performance indicator, first class hit refer to the highest classification hit true classification of video of classification confidence.
The embodiment of the present invention also provides a kind of sorter of multi-medium data, and structural schematic diagram is as shown in fig. 6, specific
May include:
Data capture unit 10, for obtaining multiframe multi-medium data to be processed.
Division unit 11, the multiframe multi-medium data for obtaining the data capture unit 10 are drawn sequentially in time
It is divided into multiple groups multi-medium data, includes an at least frame multi-medium data for Time Continuous in every group of multi-medium data.Wherein, described
The frame number for the multi-medium data for mutually disjointing between multiple groups multi-medium data, and including in different group multi-medium data is identical or not
It is identical.
Feature extraction unit 12 is respectively corresponded for extracting each group multi-medium data that the division of division unit 11 obtains
Assemblage characteristic information.
The feature extraction unit 12 is specifically configured to: extract the feature information corresponding to every frame of multimedia data contained in one group of multimedia data; and concatenate the feature information corresponding to each frame of multimedia data in the group, taking the concatenated feature information as the combination feature information corresponding to that group of multimedia data.
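As a sketch, the division into temporally continuous groups and the concatenation of per-frame feature information could look like the following. For illustration it assumes equal-sized groups and per-frame feature vectors that have already been extracted (the embodiment also allows groups of unequal size); the names are hypothetical:

```python
import numpy as np

def group_and_concat(frame_features, group_size):
    """Divide per-frame feature vectors into disjoint, temporally
    continuous groups and concatenate the features within each group.

    frame_features: array of shape (T, D) - one D-dim feature per frame
    group_size:     number of consecutive frames per group
    returns:        array of shape (T // group_size, group_size * D),
                    one combination feature per group
    """
    T, D = frame_features.shape
    n_groups = T // group_size              # drop an incomplete tail group
    trimmed = frame_features[: n_groups * group_size]
    return trimmed.reshape(n_groups, group_size * D)
```

Because the frames are kept in chronological order before reshaping, each group's combination feature preserves the temporal ordering of the frames it contains.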
Feature determination unit 13, configured to determine the overall feature information of the multiple frames of multimedia data according to the combination feature information respectively corresponding to each group of multimedia data extracted by the feature extraction unit 12, so as to classify the multiple frames of multimedia data.
Specifically, the feature determination unit 13 is configured to represent the overall feature information by the differences between the combination feature information of each group of multimedia data and its respective nearest cluster center.
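A minimal sketch of this residual-to-nearest-cluster-center representation (hard-assignment, VLAD-style aggregation) follows. It assumes the cluster centers have been learned beforehand, e.g. by k-means over combination features; the function and variable names are hypothetical:

```python
import numpy as np

def vlad_over_groups(group_features, centers):
    """Represent the overall feature by accumulating, per cluster, the
    residuals between each group's combination feature and its nearest
    cluster center.

    group_features: (G, D) combination features, one per group
    centers:        (K, D) pre-learned cluster centers
    returns:        flattened (K * D,) overall feature vector
    """
    G, D = group_features.shape
    K = centers.shape[0]
    overall = np.zeros((K, D))
    # squared distance from every group feature to every center
    d2 = ((group_features[:, None, :] - centers[None, :, :]) ** 2).sum(-1)
    nearest = d2.argmin(axis=1)     # index of nearest center per group
    for g in range(G):
        overall[nearest[g]] += group_features[g] - centers[nearest[g]]
    return overall.reshape(-1)
```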
It should be noted that the division unit 11, the feature extraction unit 12 and the feature determination unit 13 may perform the steps of dividing into multiple groups of multimedia data, extracting combination feature information and determining overall feature information multiple times, obtaining multiple candidate overall feature information. The feature determination unit 13 is then further configured to determine the final overall feature information of the multiple frames of multimedia data according to the multiple candidate overall feature information.
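One concrete way to fuse the candidates, matching the weighted-sum option of claim 6, is sketched below. It assumes the candidate overall feature vectors (e.g. from repeated divisions with different group sizes) have already been computed, and that the weights are fixed or learned elsewhere; the names are illustrative:

```python
import numpy as np

def fuse_candidates(candidates, weights=None):
    """Fuse multiple candidate overall feature vectors into one final
    overall feature via a weighted sum.

    candidates: list of equal-length 1-D candidate feature vectors
    weights:    one weight per candidate; defaults to a uniform average
    """
    candidates = np.asarray(candidates, dtype=float)
    if weights is None:
        weights = np.full(len(candidates), 1.0 / len(candidates))
    weights = np.asarray(weights, dtype=float)
    return (weights[:, None] * candidates).sum(axis=0)
```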
Further, the classification device for multimedia data may also include a classification unit 14, configured to classify the multiple frames of multimedia data according to the overall feature information determined by the feature determination unit 13 and a preset classification model, obtaining the type information of the multiple frames of multimedia data.
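The embodiment does not fix the form of the preset classification model. As one possibility, a linear layer with per-class sigmoid confidences (a common choice for multi-label video classification) could be applied to the overall feature; the parameters W and b here are hypothetical and would come from prior training:

```python
import numpy as np

def classify(overall_feature, W, b):
    """Apply a preset linear-sigmoid classification model to the overall
    feature, producing an independent confidence per class.

    overall_feature: (D,) overall feature vector
    W:               (C, D) class weight matrix
    b:               (C,) class biases
    returns:         (C,) confidence per class, each in (0, 1)
    """
    logits = W @ overall_feature + b
    return 1.0 / (1.0 + np.exp(-logits))
```

The class (or classes) with the highest confidence would then be taken as the type information of the multiple frames of multimedia data.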
It can be seen that, in the classification device for multimedia data of this embodiment, the division unit 11 divides the multiple frames of multimedia data to be processed into multiple groups of multimedia data in chronological order, the feature extraction unit 12 extracts the combination feature information respectively corresponding to each group of multimedia data, and finally the feature determination unit 13 determines the overall feature information of the multiple frames of multimedia data according to the combination feature information of each group, so as to classify the multiple frames of multimedia data. In this way, when performing feature description of the multiple frames of multimedia data, the classification device takes the temporal characteristics between the frames into account, so that the resulting overall feature information better reflects the multiple frames of multimedia data and the classification of those frames is more accurate.
An embodiment of the present invention also provides a server, whose structural schematic diagram is shown in Fig. 7. The server may vary considerably in configuration or performance, and may include one or more central processing units (CPUs) 20 (for example, one or more processors), a memory 21, and one or more storage media 22 (such as one or more mass storage devices) storing application programs 221 or data 222. The memory 21 and the storage medium 22 may provide temporary or persistent storage. The program stored in the storage medium 22 may include one or more modules (not shown in the figure), each of which may include a series of instruction operations on the server. Further, the central processing unit 20 may be configured to communicate with the storage medium 22 and execute, on the server, the series of instruction operations in the storage medium 22.
Specifically, the application programs 221 stored in the storage medium 22 include an application program for multimedia data classification, and this program may include the data acquisition unit 10, division unit 11, feature extraction unit 12, feature determination unit 13 and classification unit 14 of the above classification device for multimedia data, which will not be repeated here. Further, the central processing unit 20 may be configured to communicate with the storage medium 22 and execute, on the server, the series of operations corresponding to the application program for multimedia data classification stored in the storage medium 22.
The server may also include one or more power supplies 23, one or more wired or wireless network interfaces 24, and/or one or more operating systems 223, such as Windows Server™, Mac OS X™, Unix™, Linux™, FreeBSD™, etc.
The steps performed by the classification device for multimedia data described in the above method embodiments may be based on the server structure shown in Fig. 7.
An embodiment of the present invention also provides a storage medium storing a plurality of instructions, the instructions being adapted to be loaded by a processor to execute the classification method for multimedia data performed by the above classification device for multimedia data.
An embodiment of the present invention also provides a terminal device, including a processor and a storage medium, the processor being configured to implement each instruction, and the storage medium being configured to store a plurality of instructions, the instructions being configured to be loaded by the processor to execute the classification method for multimedia data performed by the above classification device for multimedia data.
Those of ordinary skill in the art will appreciate that all or part of the steps in the various methods of the above embodiments may be completed by a program instructing the relevant hardware; the program may be stored in a computer-readable storage medium, which may include: read-only memory (ROM), random access memory (RAM), magnetic disk, optical disc, etc.
The classification method and device for multimedia data provided by the embodiments of the present invention have been described in detail above. Specific examples are used herein to illustrate the principles and implementations of the invention, and the description of the above embodiments is only intended to help understand the method of the invention and its core idea. Meanwhile, for those skilled in the art, there will be changes in the specific implementation and application scope according to the idea of the invention. In conclusion, the content of this specification should not be construed as a limitation of the invention.
Claims (10)
1. A classification method for multimedia data, characterized by comprising:
acquiring multiple frames of multimedia data to be processed;
dividing the multiple frames of multimedia data into multiple groups of multimedia data in chronological order, each group of multimedia data containing at least one frame of temporally continuous multimedia data;
extracting combination feature information respectively corresponding to each group of multimedia data; and
determining, according to the combination feature information respectively corresponding to each group of multimedia data, the overall feature information of the multiple frames of multimedia data, so as to classify the multiple frames of multimedia data.
2. The method according to claim 1, characterized in that the multiple groups of multimedia data are mutually disjoint, and the numbers of frames of multimedia data contained in different groups of multimedia data are the same or different.
3. The method according to claim 1, characterized in that extracting the combination feature information corresponding to one group of multimedia data specifically comprises:
extracting the feature information corresponding to every frame of multimedia data contained in the group of multimedia data; and
concatenating the feature information corresponding to each frame of multimedia data in the group, and taking the concatenated feature information as the combination feature information corresponding to the group of multimedia data.
4. The method according to any one of claims 1 to 3, characterized in that the method further comprises:
classifying the multiple frames of multimedia data according to the determined overall feature information and a preset classification model, to obtain the type information of the multiple frames of multimedia data.
5. The method according to any one of claims 1 to 3, characterized in that the method further comprises:
performing the steps of dividing into multiple groups of multimedia data, extracting combination feature information and determining overall feature information multiple times, to obtain multiple candidate overall feature information; and
determining the final overall feature information of the multiple frames of multimedia data according to the multiple candidate overall feature information.
6. The method according to claim 5, characterized in that determining the final overall feature information of the multiple frames of multimedia data according to the multiple candidate overall feature information specifically comprises:
taking a weighted sum of the multiple candidate overall feature information as the final overall feature information of the multiple frames of multimedia data.
7. The method according to any one of claims 1 to 3, characterized in that determining the overall feature information of the multiple frames of multimedia data according to the combination feature information respectively corresponding to each group of multimedia data specifically comprises:
representing the overall feature information by the differences between the combination feature information of each group of multimedia data and its respective nearest cluster center.
8. The method according to any one of claims 1 to 3, characterized in that the multiple frames of multimedia data are video data comprising multiple frames of image data and multiple frames of audio data, and the multiple groups of multimedia data comprise multiple groups of image data and multiple groups of audio data;
extracting the combination feature information respectively corresponding to each group of multimedia data then specifically comprises: extracting the combination feature information of each group of image data and the combination feature information of each group of audio data; and
determining the overall feature information of the multiple frames of multimedia data according to the combination feature information respectively corresponding to each group of multimedia data specifically comprises: determining first overall feature information of the multiple frames of image data according to the combination feature information of each group of image data, and determining second overall feature information of the multiple frames of audio data according to the combination feature information of each group of audio data.
9. A classification device for multimedia data, characterized by comprising:
a data acquisition unit, configured to acquire multiple frames of multimedia data to be processed;
a division unit, configured to divide the multiple frames of multimedia data into multiple groups of multimedia data in chronological order, each group of multimedia data containing at least one frame of temporally continuous multimedia data;
a feature extraction unit, configured to extract combination feature information respectively corresponding to each group of multimedia data; and
a feature determination unit, configured to determine, according to the combination feature information respectively corresponding to each group of multimedia data, the overall feature information of the multiple frames of multimedia data, so as to classify the multiple frames of multimedia data.
10. A server, characterized by comprising a processor and a storage medium, the processor being configured to implement each instruction;
the storage medium being configured to store a plurality of instructions, the instructions being configured to be loaded by the processor to execute the classification method for multimedia data according to any one of claims 1 to 8.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910218914.8A CN109992679A (en) | 2019-03-21 | 2019-03-21 | A kind of classification method and device of multi-medium data |
Publications (1)
Publication Number | Publication Date |
---|---|
CN109992679A (en) | 2019-07-09 |
Cited By (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110489574A (en) * | 2019-08-05 | 2019-11-22 | 东软集团股份有限公司 | A kind of multimedia messages recommended method, device and relevant device |
CN110751030A (en) * | 2019-09-12 | 2020-02-04 | 厦门网宿有限公司 | Video classification method, device and system |
CN111813996A (en) * | 2020-07-22 | 2020-10-23 | 四川长虹电器股份有限公司 | Video searching method based on sampling parallelism of single frame and continuous multi-frame |
Citations (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN103093183A (en) * | 2011-10-27 | 2013-05-08 | 索尼公司 | Classifier generating device and method thereof, video detecting device and method thereof and video monitoring system |
CN103336795A (en) * | 2013-06-09 | 2013-10-02 | 华中科技大学 | Video indexing method based on multiple features |
CN106570466A (en) * | 2016-11-01 | 2017-04-19 | 金鹏电子信息机器有限公司 | Video classification method and system |
CN107341462A (en) * | 2017-06-28 | 2017-11-10 | 电子科技大学 | A kind of video classification methods based on notice mechanism |
CN109189950A (en) * | 2018-09-03 | 2019-01-11 | 腾讯科技(深圳)有限公司 | Multimedia resource classification method, device, computer equipment and storage medium |
CN109271876A (en) * | 2018-08-24 | 2019-01-25 | 南京理工大学 | Video actions detection method based on temporal evolution modeling and multi-instance learning |
CN109376696A (en) * | 2018-11-28 | 2019-02-22 | 北京达佳互联信息技术有限公司 | Method, apparatus, computer equipment and the storage medium of video actions classification |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
| PB01 | Publication | |
| SE01 | Entry into force of request for substantive examination | |
| RJ01 | Rejection of invention patent application after publication | |

Application publication date: 20190709