CN106202128A - Classification method and classification system for time-series files
- Publication number: CN106202128A
- Application number: CN201510232775.6A
- Authority: CN (China)
- Legal status: Pending (the legal status is an assumption and is not a legal conclusion; Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed)
Abstract
The invention discloses a classification method and a classification system for time-series files. According to one aspect of the invention, the classification method includes: extracting temporal features from a plurality of time-series files; calculating a state statistics vector for each time-series file from the extracted temporal features, where the elements of the state statistics vector reflect the state statistics results of the corresponding time-series file; constructing a feature matrix from the state statistics vectors of the plurality of time-series files; and classifying the plurality of time-series files according to the feature matrix. Time-series files can thus be classified using state statistics information, which both ensures the reliability of the classification results and reduces computational complexity, enabling fast and accurate classification.
Description
Technical field
The present invention relates to the field of computer file classification, and in particular to a classification method and a classification system for time-series files.
Background
With the development of multimedia applications, large numbers of time-series file resources, typified by music files, have emerged. In recent years, how to organize and manage time-series files effectively has attracted growing attention.
Taking music files as an example, labeling music files with categories is an important means of organizing and managing time-series files effectively. For example, labeling music files by genre (such as jazz, blues, classical, country, or rock) is an important aspect of category labeling. Because the number of music files is very large, manual labeling is time-consuming and laborious, and its accuracy is not high: errors arise easily from human carelessness or limited knowledge.
To solve these problems, the prior art has proposed a variety of methods for classifying time-series files automatically. Correct automatic classification depends on two main points. One is how to design and select a suitable classifier for the time-series files; the other is which features to extract from the original time-series files so that they are suitable for classification.
Many solutions have been proposed in the prior art for designing and selecting a suitable classifier. Which features to extract from the original time-series files, however, has remained a difficult research problem, and the prior art still lacks a satisfactory solution.
Summary of the invention
In view of this, the present invention proposes a classification method and a classification system for time-series files that classify time-series files using state statistics information.
According to one aspect of the invention, a classification method for time-series files is provided, including: extracting temporal features from a plurality of time-series files; calculating a state statistics vector for each time-series file from the extracted temporal features, where the elements of the state statistics vector reflect the state statistics results of the corresponding time-series file; constructing a feature matrix from the state statistics vectors of the plurality of time-series files; and classifying the plurality of time-series files according to the feature matrix.
According to another aspect of the invention, a classification system for time-series files is provided, including: a feature extraction device that extracts temporal features from a plurality of time-series files; a calculation device that calculates a state statistics vector for each time-series file from the temporal features extracted by the feature extraction device, where the elements of the state statistics vector reflect the state statistics results of the corresponding time-series file; a matrix construction device that constructs a feature matrix from the state statistics vectors calculated by the calculation device; and a classifier that classifies the plurality of time-series files according to the feature matrix constructed by the matrix construction device.
With the technical solution provided by the invention, time-series files can be classified using state statistics information. This is less complex than computing with all of the extracted temporal features, yet it retains enough statistical feature information for the classification computation. The reliability of the classification results is thus ensured while computational complexity is reduced, enabling fast and accurate classification.
Brief description of the drawings
Other features and advantages of the invention will be better understood by reading the following embodiments with reference to the accompanying drawings. The drawings described here are intended only to illustrate embodiments of the invention schematically, not all possible implementations, and are not intended to limit the scope of the invention. In the drawings:
Fig. 1 shows a flowchart of a classification method for time-series files according to one embodiment of the invention;
Fig. 2 shows a schematic diagram of extracting temporal features from an original time-series file;
Fig. 3 shows a flowchart of calculating the state statistics vector of each time-series file from the extracted temporal features according to one embodiment of the invention;
Fig. 4 shows a schematic diagram of clustering the temporal features of N time-series files according to one embodiment of the invention;
Fig. 5 shows a flowchart of calculating the state statistics vector of each time-series file according to one embodiment of the invention;
Fig. 6 shows a schematic diagram of generating a cluster state matrix from the clustering result of the temporal features of the time-series file shown in Fig. 4;
Fig. 7 shows an example of a row combination chosen from the cluster state matrix shown in Fig. 6;
Fig. 8 shows a flowchart of performing double sliding window state statistics in each of a plurality of row combinations to generate the combined state statistics matrix of a time-series file according to one embodiment of the invention;
Fig. 9 shows an example of setting an outer window and an inner window in the row combination shown in Fig. 7;
Fig. 10 shows an example of a combined state statistics matrix according to this embodiment;
Fig. 11 shows a flowchart of constructing a feature matrix from the state statistics vectors of a plurality of time-series files according to one embodiment of the invention;
Fig. 12 shows a block diagram of a classification system for time-series files according to one embodiment of the invention;
Fig. 13 shows a block diagram of the calculation device according to one embodiment of the invention;
Fig. 14 shows a block diagram of the calculation unit according to one embodiment of the invention;
Fig. 15 shows a block diagram of the statistics subunit according to one embodiment of the invention;
Fig. 16 shows a block diagram of the matrix construction device according to one embodiment of the invention; and
Fig. 17 shows a schematic block diagram of a computer that can be used to implement the method and system according to embodiments of the invention.
Detailed description
Embodiments of the invention are now described in detail with reference to the accompanying drawings. The following description is merely exemplary and is not intended to limit the invention. In the following description, the same reference numerals denote the same or similar parts in different drawings. Different features of the embodiments described below may be combined with one another to form other embodiments within the scope of the invention.
Fig. 1 shows a flowchart of a classification method for time-series files according to one embodiment of the invention. As shown in Fig. 1, the classification method 1000 may include steps S1100 to S1400.
When time-series files need to be classified, temporal features are extracted from a plurality of time-series files in step S1100. Any of the various methods known in the art may be used to extract temporal features from the original time-series files. The extracted temporal features may be any known features, such as MFCC (Mel-frequency cepstral coefficient) features or FFT (fast Fourier transform) features. The following description uses MFCC features as an example, but the application is not limited to them; other known temporal features, such as FFT features, are equally applicable.
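
As a rough illustration of the frame-wise extraction in step S1100, the following Python sketch splits a sample sequence into frames and computes one magnitude spectrum per frame with a naive discrete Fourier transform. The function names, the frame length, and the hop size are assumptions made for this illustration and are not taken from the described embodiment; a practical implementation would more likely call an FFT or MFCC routine from a signal-processing library.

```python
import cmath
import math

def frame_signal(samples, frame_len, hop):
    """Split a 1-D sample sequence into (possibly overlapping) frames."""
    return [samples[s:s + frame_len]
            for s in range(0, len(samples) - frame_len + 1, hop)]

def dft_magnitudes(frame):
    """Magnitude spectrum of one frame (naive DFT, non-negative frequencies)."""
    n = len(frame)
    return [abs(sum(x * cmath.exp(-2j * cmath.pi * k * t / n)
                    for t, x in enumerate(frame)))
            for k in range(n // 2 + 1)]

def extract_features(samples, frame_len=16, hop=8):
    """Return one feature vector (magnitude spectrum) per frame."""
    return [dft_magnitudes(f) for f in frame_signal(samples, frame_len, hop)]

# Toy signal (hypothetical): a sinusoid with 2 cycles per 16-sample frame.
signal = [math.sin(2 * math.pi * 2 * t / 16) for t in range(64)]
features = extract_features(signal)
```

Each inner list of `features` plays the role of one frame of the temporal features; for this toy signal the largest magnitude in every frame falls in frequency bin 2.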
Fig. 2 shows a schematic diagram of extracting temporal features from an original time-series file. As shown in Fig. 2, the features extracted from the original time-series file (for example, a music file) are temporal features; each column shown on the right side of Fig. 2 represents one frame of the temporal features extracted from the time-series file.
Returning to Fig. 1, in step S1200 the state statistics vector of each time-series file is calculated from the extracted temporal features. The state statistics vector characterizes the state statistics of a time-series file, and its elements reflect the state statistics results of that file. The calculation process and an example of a state statistics vector are described below.
In step S1300, a feature matrix is constructed from the calculated state statistics vectors of the time-series files. Then, in step S1400, the plurality of time-series files are classified according to this feature matrix. Any suitable known classifier may be used. Because the feature matrix is built from the state statistics vectors of the time-series files, it contains the state statistics information of each file. Classifying with this information is less complex than computing with all of the extracted temporal features, yet it retains enough statistical feature information for the classification computation. The reliability of the classification results is thus ensured while computational complexity is reduced, enabling fast and accurate classification.
Fig. 3 shows a flowchart of calculating the state statistics vector of each time-series file from the extracted temporal features according to one embodiment of the invention. As shown in Fig. 3, step S1200 may include sub-steps S1210 and S1220. In sub-step S1210, the temporal features extracted from each time-series file are clustered. Because the probability that two frames of the temporal features (e.g., MFCC features) are exactly identical is very small, the application treats similar frames as belonging to the same cluster in order to simplify the classification computation, and clusters each frame of the temporal features accordingly for the subsequent statistical calculation.
Fig. 4 shows a schematic diagram of clustering the temporal features of N time-series files according to one embodiment of the invention. As shown in Fig. 4, the temporal features of each of the time-series files 1 to N include multiple frames, each represented by one column on the left of Fig. 4. By clustering the temporal features, each frame is assigned to one of the preset classes. Any method known in the prior art may be used for the clustering, such as mean-shift clustering or mini-batch K-means clustering (MiniBatchKMeans). In the example shown in Fig. 4, 10 classes are assumed to be preset, so each frame of each time-series file is assigned to one of classes 1 to 10, as shown on the right of Fig. 4. Cluster labels thus replace the original features extracted from the time-series files for the subsequent statistical calculation. Those skilled in the art will appreciate that the number of preset clusters can be chosen according to actual needs and is roughly comparable to the number of time-series file categories. For convenience, 10 preset clusters are assumed throughout the rest of this description.
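
The replacement of frames by cluster labels in sub-step S1210 can be sketched as follows. The embodiment names mean-shift and mini-batch K-means as possible methods; the sketch below instead uses plain Lloyd's k-means with a deterministic initialization, chosen only because it fits in a few lines. The function name `kmeans_labels`, the toy frames, and the choice of 3 clusters are assumptions for illustration.

```python
def kmeans_labels(frames, k, iters=20):
    """Plain Lloyd's k-means; returns one cluster label per frame.

    Assumes len(frames) >= k. Initial centers are k evenly spaced frames.
    """
    step = max(1, len(frames) // k)
    centers = [list(frames[i * step]) for i in range(k)]
    labels = [0] * len(frames)
    for _ in range(iters):
        # Assignment step: nearest center by squared Euclidean distance.
        for i, frame in enumerate(frames):
            labels[i] = min(
                range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(frame, centers[c])),
            )
        # Update step: each center becomes the mean of its members.
        for c in range(k):
            members = [f for f, lab in zip(frames, labels) if lab == c]
            if members:
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels

# Toy 2-D "frames" (hypothetical) forming three well-separated groups.
frames = [[0.0, 0.1], [0.1, 0.0], [5.0, 5.1], [5.1, 5.0], [9.9, 10.0], [10.0, 9.9]]
labels = kmeans_labels(frames, k=3)
```

After this step only `labels` is needed; the original frame vectors are no longer used by the later statistics.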
Returning to Fig. 3, in sub-step S1220 the state statistics vector of each time-series file is calculated from the clustering result of its temporal features. According to one embodiment of the application, the state statistics vector of each time-series file is calculated in sub-step S1220 based on a double sliding window statistics model.
Fig. 5 shows a flowchart of calculating the state statistics vector of each time-series file according to one embodiment of the invention. As shown in Fig. 5, sub-step S1220 may include sub-steps S1221 to S1224. In sub-step S1221, the cluster state matrix of each time-series file is generated from the clustering result of its temporal features.
Fig. 6 shows a schematic diagram of generating a cluster state matrix from the clustering result of the temporal features of time-series file 1 shown in Fig. 4. As shown in Fig. 6, the cluster state matrix of time-series file 1 (for example, a music file) is generated from the clustering result of its temporal features. The number of rows of this matrix equals the number of clusters preset when the temporal features were clustered; in the example of Fig. 6, 10 classes were preset, so the cluster state matrix has 10 rows. Each column of the cluster state matrix in Fig. 6 represents the clustering result of one frame of the temporal features of time-series file 1. For example, the first column of the matrix is $(1,0,0,0,0,0,0,0,0,0)^T$: only the element in the first row is 1 and the elements in all other rows are 0, which indicates that the first frame was assigned to the first class. The remaining columns are marked in the same way. For example, the third column of the matrix is $(0,1,0,0,0,0,0,0,0,0)^T$: only the element in the second row is 1 and all other elements are 0, which indicates that the third frame was assigned to the second class. The clustering information of the temporal features of the time-series file is thus expressed in matrix form. The cluster state matrix may also be designed in other ways according to actual needs, as long as it can express the clustering information of the temporal features of the time-series file.
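
The construction of the cluster state matrix in sub-step S1221 amounts to a one-hot encoding of the per-frame cluster labels. A minimal Python sketch follows; the function name and the label sequence are hypothetical, and labels are zero-based here, so label 0 corresponds to the first class in the text.

```python
def cluster_state_matrix(labels, num_clusters):
    """Build a num_clusters x num_frames 0/1 matrix, one-hot per column."""
    matrix = [[0] * len(labels) for _ in range(num_clusters)]
    for frame_index, label in enumerate(labels):
        matrix[label][frame_index] = 1
    return matrix

# Hypothetical labels: frame 1 in class 1, frame 3 in class 2 (zero-based 0 and 1).
labels = [0, 3, 1, 1, 2]
matrix = cluster_state_matrix(labels, num_clusters=10)

# Read back the first and third columns.
first_column = [row[0] for row in matrix]
third_column = [row[2] for row in matrix]
```

Here `first_column` is `[1, 0, 0, 0, 0, 0, 0, 0, 0, 0]` and `third_column` is `[0, 1, 0, 0, 0, 0, 0, 0, 0, 0]`, matching the two columns discussed above.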
Returning to Fig. 5, in sub-step S1222 a plurality of row combinations are chosen from the cluster state matrix. Taking the cluster state matrix shown in Fig. 6 as an example, it has 10 rows, from which multiple row combinations can be chosen for subsequent processing.
Fig. 7 shows an example of a row combination chosen from the cluster state matrix shown in Fig. 6. The row combination shown in Fig. 7 consists of rows 1 to 3 of the cluster state matrix of Fig. 6. A solid circle in Fig. 7 indicates that the element of the cluster state matrix at that position is 1; where no solid circle appears, the element is 0. To simplify the subsequent statistics, every row combination may contain the same number of rows. For example, as shown in Fig. 7, each row combination of the cluster state matrix may include 3 rows, which gives $C_{10}^{3} = 120$ row combinations. According to another embodiment of the invention, if the order of the rows within each row combination is taken into account, i.e., rows 1-2-3 and rows 3-2-1 count as different row combinations, there are $A_{10}^{3} = 720$ row combinations.
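
The two ways of counting row combinations (ignoring or respecting the order of rows) can be checked with the Python standard library. This is only an illustrative sketch; the variable names are assumptions, and row indices are zero-based.

```python
from itertools import combinations, permutations
from math import comb, perm

NUM_ROWS = 10   # rows of the cluster state matrix (preset clusters)
N_DIMS = 3      # rows per row combination

# Order ignored: rows (0, 1, 2) and (2, 1, 0) are the same combination.
unordered = list(combinations(range(NUM_ROWS), N_DIMS))

# Order respected: rows (0, 1, 2) and (2, 1, 0) are different combinations.
ordered = list(permutations(range(NUM_ROWS), N_DIMS))

num_unordered = comb(NUM_ROWS, N_DIMS)  # 10! / (3! * 7!)
num_ordered = perm(NUM_ROWS, N_DIMS)    # 10! / 7!
```

The list `unordered` has 120 entries and `ordered` has 720, in agreement with the closed-form counts.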
Returning to Fig. 5, in sub-step S1223 double sliding window state statistics are performed in each of the row combinations chosen from the cluster state matrix of each time-series file, to generate the combined state statistics matrix of that file. This yields the state statistics information of the temporal features of each time-series file, which is used to classify the files. Then, in sub-step S1224, the combined state statistics matrix of each time-series file is flattened to obtain the state statistics vector of that file. The double sliding window state statistics, the combined state statistics matrix, and the flattening process are described in detail below.
Fig. 8 shows a flowchart of performing double sliding window state statistics in each of a plurality of row combinations to generate the combined state statistics matrix of a time-series file according to one embodiment of the invention. As shown in Fig. 8, sub-step S1223 may include sub-steps S1223a to S1223e. In sub-step S1223a, an outer window is set in each row combination, and an inner window is set within the outer window.
Fig. 9 shows an example of setting an outer window and an inner window in the row combination shown in Fig. 7. As shown in Fig. 9, an outer window $W_{out}$ is set in row combination 123. To capture the statistical information in this row combination, the height of the outer window $W_{out}$ must cover every row of the row combination (3 rows in this example), and its length should be greater than the number of rows in the row combination and less than the total length of the temporal features. Those skilled in the art can set the length of $W_{out}$ according to actual needs, for example to 20. As also shown in Fig. 9, an inner window $W_{inner}$ is set within the outer window $W_{out}$. The height of $W_{inner}$ covers every row of the row combination, and, to simplify the subsequent statistical calculation, its length is also 3.
Returning to Fig. 8, in sub-step S1223b the inner window $W_{inner}$ is slid within the outer window $W_{out}$ to obtain a plurality of state observation values. First, with the outer window $W_{out}$ held fixed, each time the inner window $W_{inner}$ slides one position within $W_{out}$, one state observation value of the corresponding time-series file is obtained. A state observation value is a vector that characterizes the values of the cluster state matrix elements contained in the inner window. For example, as shown in Fig. 9, the state observation value obtained from the inner window $W_{inner}$ at this position is $(1,1,0)^T$, where the 1 in the first and second positions indicates that a solid circle appears in the first and second rows of the inner window (i.e., the cluster state matrix has an element equal to 1 there), and the 0 in the third position indicates that no solid circle appears in the third row of the inner window. In other words, if a solid circle appears in a given row of the inner window, the corresponding element of the state observation vector is 1; otherwise it is 0. This state observation value of the inner window $W_{inner}$ thus characterizes the state of the temporal features of the time-series file over a few frames. In the example shown in Fig. 9, the state observation value has the following 8 possibilities:
$(0,0,0)^T$, $(0,0,1)^T$, $(0,1,0)^T$, $(0,1,1)^T$, $(1,0,0)^T$, $(1,0,1)^T$, $(1,1,0)^T$, $(1,1,1)^T$
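
The rule for reading one state observation value out of the inner window can be sketched in Python as below. The function name `state_observation`, the label sequence, and the zero-based row indices are assumptions for illustration; the cluster state matrix is rebuilt inline from the labels so that the example is self-contained.

```python
from itertools import product

def state_observation(matrix, rows, start, width=3):
    """Element r is 1 if row r has at least one 1 inside the inner window."""
    return tuple(int(any(matrix[r][start:start + width])) for r in rows)

# Hypothetical cluster labels for 5 frames (zero-based), 10 preset clusters.
labels = [0, 0, 1, 4, 2]
matrix = [[int(label == r) for label in labels] for r in range(10)]

# Inner window over frames 0..2 of the row combination (rows 0, 1, 2).
obs = state_observation(matrix, rows=(0, 1, 2), start=0)

# All observation values that a 3-row combination can produce.
possible = list(product((0, 1), repeat=3))
```

With these labels `obs` is `(1, 1, 0)`, and `possible` enumerates the 8 binary vectors of length 3.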
Returning to Fig. 8, in sub-step S1223c the state statistics value of the outer window is computed from the state observation values obtained by sliding the inner window within the outer window. As described in the example above, the state observation value of each inner window has the 8 possibilities listed. Sliding the inner window within the outer window yields a plurality of state observation values, and these are tallied as follows: each of the 8 possible state observation values either occurred or did not occur. The result of this tally, i.e., the state statistics value, therefore has $2^8 = 256$ possible outcomes. In other words, the state statistics value of the outer window is one of these 256 possible outcomes.
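
One way to represent the state statistics value of an outer window is as an 8-bit mask in which bit $c$ is set when the observation with binary code $c$ occurred. The text does not say how the 256 outcomes are indexed, so this bit-mask encoding, like the function names and the toy labels, is an assumption made for the sketch.

```python
def state_observation(matrix, rows, start, width):
    """Element r is 1 if row r has at least one 1 inside the inner window."""
    return tuple(int(any(matrix[r][start:start + width])) for r in rows)

def outer_window_value(matrix, rows, outer_start, outer_len, inner_len):
    """Record which observations occur while the inner window slides."""
    seen = set()
    last_start = outer_start + outer_len - inner_len
    for s in range(outer_start, last_start + 1):
        seen.add(state_observation(matrix, rows, s, inner_len))
    value = 0
    for obs in seen:
        code = int("".join(str(bit) for bit in obs), 2)  # e.g. (1, 1, 0) -> 6
        value |= 1 << code                               # mark as "occurred"
    return value

# Hypothetical labels; a short outer window keeps the example small.
labels = [0, 0, 1, 4, 2, 2]
matrix = [[int(label == r) for label in labels] for r in range(10)]
value = outer_window_value(matrix, (0, 1, 2), outer_start=0, outer_len=5, inner_len=3)
```

In this toy case the inner window produces only the observations $(1,1,0)^T$ and $(0,1,1)^T$, so bits 6 and 3 are set and `value` is 72.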
In sub-step S1223d, the outer window is slid along the row combination to accumulate a plurality of state statistics values. Continuing the example above, each position of the outer window yields one state statistics value, which falls into one of the 256 possible outcomes. By repeatedly sliding the outer window along the row combination (each time the outer window slides, the inner window must complete its sliding within the outer window so that the state statistics value can be computed from the resulting state observation values), a plurality of state statistics values is obtained. The frequency with which these state statistics values fall into each of the 256 possible outcomes is then counted. Table 1 below shows, by way of example, the result of tallying the state statistics values obtained in one row combination. Each row of Table 1 represents one state; a value of 0 in a state indicates that the corresponding vector did not appear among the state observation values forming the state statistics value of the outer window, and a value of 1 indicates that it did. The rightmost column of Table 1 gives the number of times each state occurred among the state statistics values.
Table 1
The benefit of the double sliding window state statistics model is as follows. If only a single sliding window were used, then at any frame of the temporal features only one cluster could take the value 1 (see Fig. 6), so the state features counted would be too absolute. With double-window sliding, the statistics are richer and more comprehensive: for each frame, its various possible states are counted, rather than only one state being 1 and all the rest being 0. The double sliding window statistics model therefore both simplifies the original temporal features through clustering, which improves processing speed, and retains sufficient statistical information to ensure classification accuracy.
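
The sliding of the outer window in sub-step S1223d, together with the frequency count over the 256 outcomes, can be sketched as follows. The sketch works directly on the cluster labels, which is equivalent to reading the 0/1 cluster state matrix because row `r` has a 1 in a window exactly when label `r` occurs in it. The function names, the bit-mask indexing of the outcomes, the one-frame slide step, and the toy labels are all assumptions for illustration.

```python
def observation_code(labels, rows, start, inner_len):
    """Binary code of the inner-window observation; first row is the top bit."""
    window = labels[start:start + inner_len]
    code = 0
    for r in rows:
        code = (code << 1) | int(r in window)
    return code

def outer_value(labels, rows, outer_start, outer_len, inner_len):
    """Bit mask of the observations that occur inside one outer window."""
    value = 0
    for s in range(outer_start, outer_start + outer_len - inner_len + 1):
        value |= 1 << observation_code(labels, rows, s, inner_len)
    return value

def state_statistics(labels, rows, outer_len=20, inner_len=3):
    """Frequency of each possible outer-window outcome for one row combination."""
    counts = [0] * (2 ** (2 ** len(rows)))
    for o in range(len(labels) - outer_len + 1):  # slide outer window by 1 frame
        counts[outer_value(labels, rows, o, outer_len, inner_len)] += 1
    return counts

# Hypothetical labels; outer_len is shortened so the toy sequence suffices.
labels = [0, 0, 1, 4, 2, 2, 0]
counts = state_statistics(labels, rows=(0, 1, 2), outer_len=5, inner_len=3)
```

`counts` has 256 entries, one per possible outcome, and corresponds to one row of the combined state statistics matrix; in this toy run the three outer-window positions fall into outcomes 72, 74, and 42.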
In sub-step S1223e, the state statistics values of the row combinations are assembled to form the combined state statistics matrix of the time-series file. Continuing the example above, Fig. 10 shows an example of a combined state statistics matrix according to this embodiment. As shown in Fig. 10, the state statistics values of the row combinations of one time-series file are assembled to form the combined state statistics matrix of that file. Each row of this matrix consists of the state statistics values of one row combination of the file.
After sub-steps S1223a to S1223e are complete, sub-step S1224 is performed. Taking the combined state statistics matrix shown in Fig. 10 as an example, flattening this matrix gives the following state statistics vector of the time-series file:
$(203, \ldots, 2901, 127, \ldots, 321, 29, \ldots, 92, \ldots, 231, \ldots, 102)$
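
Flattening in sub-step S1224 simply concatenates the rows of the combined state statistics matrix. The following sketch uses a small made-up matrix rather than the values of Fig. 10; the function name and the numbers are hypothetical.

```python
def flatten(matrix):
    """Concatenate the rows of a matrix into one vector, row by row."""
    return [value for row in matrix for value in row]

# Toy combined state statistics matrix: 2 row combinations x 4 outcomes.
# In the running example the matrix would have 120 rows of 256 counts each.
combined = [
    [3, 0, 1, 2],
    [0, 5, 0, 1],
]
vector = flatten(combined)
```

The resulting `vector` is `[3, 0, 1, 2, 0, 5, 0, 1]`: the counts of the first row combination followed by those of the second.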
Fig. 11 shows a flowchart of constructing a feature matrix from the state statistics vectors of a plurality of time-series files according to one embodiment of the invention. As shown in Fig. 11, step S1300 may include sub-steps S1310 and S1320. In sub-step S1310, the state statistics vectors of the plurality of time-series files are combined into a state statistics matrix. Continuing the example above, the dimension of the state statistics matrix $M$ is $m \times n$, where $m$ is the number of time-series files and $n = 2^{2^{nDims}} \times C_{10}^{nDims}$. Here $nDims$ is the number of rows in each row combination, so $2^{2^{nDims}}$ is the number of state statistics results of one row combination (i.e., the length of one row in Fig. 10), and $C_{10}^{nDims}$ is the number of row combinations of one time-series file. The state statistics matrix then has the form $M = (M_{ij})_{m \times n}$, where row $i$ is the state statistics vector of the $i$-th time-series file.
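
The dimensions of the state statistics matrix in the running example (10 preset clusters, 3 rows per row combination, order of rows ignored) can be checked with a short sketch. The function names are assumptions, and the all-zero vectors are placeholders rather than real statistics.

```python
from math import comb

def vector_length(n_dims, num_clusters=10):
    """Length n of one state statistics vector."""
    per_combination = 2 ** (2 ** n_dims)           # outcomes per row combination
    num_combinations = comb(num_clusters, n_dims)  # unordered row combinations
    return per_combination * num_combinations

def stack_vectors(vectors):
    """Stack per-file vectors row by row into the m x n matrix M."""
    n = len(vectors[0])
    if any(len(v) != n for v in vectors):
        raise ValueError("all state statistics vectors must have the same length")
    return [list(v) for v in vectors]

n = vector_length(3)                   # 256 outcomes * 120 row combinations
M = stack_vectors([[0] * n, [0] * n])  # placeholder vectors for m = 2 files
```

For $nDims = 3$ this gives $n = 256 \times 120 = 30720$ columns, so two files produce a $2 \times 30720$ matrix.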
In sub-step S1320, a weight conversion is applied to the state statistics matrix to form the feature matrix. According to one embodiment of the application, the weight conversion is performed by calculating the difference between each element of the state statistics matrix and the other elements of the row in which it is located.
According to another embodiment, the time-series files are music files. For music files, the weight conversion can be performed by calculating the ratio of the entropy of each element of the state statistics matrix to the marginal entropy of the column in which it is located. For example, for the state statistics matrix $M$, the entropy $H(M_{ij})$ of its element $M_{ij}$ can be calculated by Formula 1, and the marginal entropy $H(M_{\cdot j})$ of a column of $M$ by Formula 2.
The ratio of $H(M_{ij})$ to $H(M_{\cdot j})$ can then be used as the element $F_{ij}$ of the feature matrix, as expressed by Formula 3:
$$F_{ij} = \frac{H(M_{ij})}{H(M_{\cdot j})}$$
The feature matrix $F$ is obtained in this way. It serves as the input features with which the classifier classifies the music files. Classifying music files with the feature matrix $F$ is less complex than computing with the original extracted temporal audio features, yet it retains enough statistical feature information for the classification computation. The reliability of the classification results is thus ensured while computational complexity is reduced, enabling fast and accurate classification.
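
The entropy-ratio weight conversion can be sketched as below. The exact definitions of Formulas 1 and 2 are not reproduced in this text, so the sketch assumes a common convention: each column is normalized to $p_{ij} = M_{ij} / \sum_i M_{ij}$, the element entropy is $H(M_{ij}) = -p_{ij} \ln p_{ij}$, and the marginal entropy of column $j$ is $H(M_{\cdot j}) = \sum_i H(M_{ij})$. These definitions, the function name, and the handling of zero columns are assumptions and may differ from the patented formulas; only the final ratio $F_{ij} = H(M_{ij}) / H(M_{\cdot j})$ follows the text.

```python
import math

def entropy_weight(M):
    """F[i][j] = H(M[i][j]) / H(M[.][j]) under the assumed definitions."""
    m, n = len(M), len(M[0])
    F = [[0.0] * n for _ in range(m)]
    for j in range(n):
        col_sum = sum(M[i][j] for i in range(m))
        if col_sum == 0:
            continue  # empty column: leave the weights at 0 (assumed convention)
        terms = []
        for i in range(m):
            p = M[i][j] / col_sum
            terms.append(-p * math.log(p) if p > 0 else 0.0)
        marginal = sum(terms)
        if marginal > 0:  # a column concentrated in one file has zero entropy
            for i in range(m):
                F[i][j] = terms[i] / marginal
    return F

# Toy state statistics matrix (hypothetical): 2 files x 2 features.
M = [[1, 0],
     [1, 4]]
F = entropy_weight(M)
```

Under these assumptions the first column, which is split evenly between the two files, yields weights of 0.5 each, while the second column, which occurs in only one file, has zero marginal entropy and is left at 0.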
Fig. 12 shows a block diagram of a classification system for time-series files according to one embodiment of the invention. As shown in Fig. 12, the classification system 1200 may include a feature extraction device 1210, a calculation device 1220, a matrix construction device 1230, and a classifier 1240. The feature extraction device 1210 can extract temporal features from a plurality of time-series files. The calculation device 1220 can calculate the state statistics vector of each time-series file from the temporal features extracted by the feature extraction device 1210, where the elements of the state statistics vector reflect the state statistics results of the corresponding time-series file. The matrix construction device 1230 can construct a feature matrix from the state statistics vectors calculated by the calculation device 1220. The classifier 1240 can classify the plurality of time-series files according to the feature matrix constructed by the matrix construction device 1230.
Fig. 13 shows a block diagram of the calculation device according to one embodiment of the invention. As shown in Fig. 13, the calculation device 1220 may include a clustering unit 1221 and a calculation unit 1222. The clustering unit 1221 can cluster the temporal features that the feature extraction device 1210 extracts from each time-series file. The calculation unit 1222 can calculate the state statistics vector of each time-series file from the clustering result produced by the clustering unit 1221 for the temporal features of that file.
According to one embodiment of the invention, the calculation unit 1222 can calculate the state statistics vector of each time-series file based on the double sliding window statistics model.
Fig. 14 shows a block diagram of the calculation unit according to one embodiment of the invention. As shown in Fig. 14, the calculation unit 1222 may include a matrix generation subunit 1222a, a combination subunit 1222b, a statistics subunit 1222c, and a flattening subunit 1222d. The matrix generation subunit 1222a can generate the cluster state matrix of a time-series file from the clustering result produced by the clustering unit 1221 for the temporal features of that file. The combination subunit 1222b can choose a plurality of row combinations from the cluster state matrix generated by the matrix generation subunit 1222a. The statistics subunit 1222c can perform double sliding window state statistics in each of the row combinations chosen by the combination subunit 1222b, to generate the combined state statistics matrix of the time-series file. The flattening subunit 1222d can flatten this combined state statistics matrix to obtain the state statistics vector of the time-series file.
Fig. 15 shows a block diagram of the statistics subunit according to one embodiment of the invention. As shown in Fig. 15, the statistics subunit 1222c may include a window control module 1222c1, a recording module 1222c2, a statistics module 1222c3, and a matrix composition module 1222c4. The window control module 1222c1 can set an outer window in each row combination, set an inner window within that outer window, and control the sliding of the inner window within the outer window and the sliding of the outer window along the row combination. The recording module 1222c2 can record the state observation values of the inner window. The statistics module 1222c3 can compute state statistics values from the state observation values recorded by the recording module 1222c2. The matrix composition module 1222c4 can assemble the state statistics values of the row combinations into the combined state statistics matrix.
According to one embodiment of the invention, the number of rows of the cluster state matrix that the matrix generation subunit 1222a generates for each time-series file equals the number of clusters preset when the clustering unit 1221 clusters the temporal features, and each column of the cluster state matrix represents the clustering result of one frame of the temporal features of the time-series file.
Fig. 16 shows a block diagram of the matrix construction device according to one embodiment of the invention. As shown in Fig. 16, the matrix construction device 1230 may include a matrix formation unit 1231 and a weight conversion unit 1232. The matrix formation unit 1231 can combine the state statistics vectors calculated by the calculation device 1220 for the plurality of time-series files into a state statistics matrix. The weight conversion unit 1232 can apply a weight conversion to the state statistics matrix formed by the matrix formation unit 1231 to form the feature matrix.
According to one embodiment of the invention, the weight conversion unit 1232 can perform the weight conversion by calculating the difference between each element of the state statistics matrix and the other elements of the row in which it is located.
According to another embodiment of the invention, the plurality of time-series files may be a plurality of music files. In that case, the weight conversion unit 1232 can perform the weight conversion by calculating the ratio of the entropy of each element of the state statistics matrix to the marginal entropy of the column in which it is located.
It should also be noted that each component of the system described above can be configured by software, firmware, hardware, or a combination thereof. The specific means or manner of configuration are well known to those skilled in the art and are not repeated here. When the components are implemented by software or firmware, a program constituting the software is installed from a storage medium or a network onto a computer having a dedicated hardware structure (for example, the general-purpose computer 1700 shown in Fig. 17); with the various programs installed, this computer is able to perform various functions.
Figure 17 shows a schematic block diagram of a computer that can be used to implement
the method and system according to embodiments of the present invention.
In Figure 17, a central processing unit (CPU) 1701 performs various processes
according to a program stored in a read-only memory (ROM) 1702 or a program loaded
from a storage section 1708 into a random access memory (RAM) 1703. The RAM 1703 also
stores, as needed, the data required when the CPU 1701 performs the various
processes. The CPU 1701, the ROM 1702, and the RAM 1703 are connected to one another
via a bus 1704. An input/output interface 1705 is also connected to the bus 1704.
The following components are connected to the input/output interface 1705: an input
section 1706 (including a keyboard, a mouse, and the like), an output section 1707
(including a display, such as a cathode ray tube (CRT) or a liquid crystal display
(LCD), and a speaker), a storage section 1708 (including a hard disk and the like),
and a communication section 1709 (including a network interface card such as a LAN
card, a modem, and the like). The communication section 1709 performs communication
processing via a network such as the Internet. A drive 1710 can also be connected to
the input/output interface 1705 as needed. A removable medium 1711, such as a
magnetic disk, an optical disc, a magneto-optical disk, or a semiconductor memory,
can be mounted on the drive 1710 as needed, so that a computer program read from it
is installed into the storage section 1708 as needed.
When the above series of processes is implemented by software, the program
constituting the software is installed from a network such as the Internet or from a
storage medium such as the removable medium 1711.
Those skilled in the art will understand that such a storage medium is not limited to
the removable medium 1711 shown in Figure 17, which stores the program and is
distributed separately from the device in order to provide the program to the user.
Examples of the removable medium 1711 include a magnetic disk (including a floppy
disk (registered trademark)), an optical disc (including a compact disc read-only
memory (CD-ROM) and a digital versatile disc (DVD)), a magneto-optical disk
(including a MiniDisc (MD) (registered trademark)), and a semiconductor memory.
Alternatively, the storage medium can be the ROM 1702, a hard disk included in the
storage section 1708, or the like, which stores the program and is distributed to the
user together with the device that contains it.
The present invention also proposes a program product storing machine-readable
instruction code. When the instruction code is read and executed by a machine, the
above method according to embodiments of the present invention can be performed.
Accordingly, a storage medium carrying the above program product storing
machine-readable instruction code is also included within the scope of the present
invention. The storage medium includes, but is not limited to, a floppy disk, an
optical disc, a magneto-optical disk, a memory card, a memory stick, and the like.
It should be noted that the method of the present invention is not limited to being
performed in the chronological order described in the specification; it can also be
performed in another order, in parallel, or independently. Therefore, the order of
execution of the method described in this specification does not limit the technical
scope of the present invention.
The above description of the embodiments of the present invention is intended to aid
understanding of the invention; it is merely exemplary and is not intended to limit
the invention. It should be noted that, in the above description, features described
and/or illustrated for one embodiment can be used in the same or a similar manner in
one or more other embodiments, combined with features of other embodiments, or
substituted for features of other embodiments. Those skilled in the art will
understand that variations and modifications made to the embodiments described above
without departing from the inventive concept of the present invention fall within the
scope of the present invention.
In summary, the embodiments according to the present invention provide the following technical schemes.
Scheme 1. A classification method for sequential files, comprising:
extracting temporal features from multiple sequential files;
calculating a state statistics vector for each sequential file according to the
extracted temporal features, wherein the elements of the state statistics vector
reflect the state statistics results of the corresponding sequential file;
constructing a feature matrix using the state statistics vectors of the multiple
sequential files; and
classifying the multiple sequential files according to the feature matrix.
Scheme 2. The method of scheme 1, wherein calculating the state statistics vector of
each sequential file according to the extracted temporal features comprises:
clustering the temporal features extracted from each sequential file; and
calculating the state statistics vector of each sequential file according to the
clustering result of the temporal features of that sequential file.
Scheme 3. The method of scheme 2, wherein the state statistics vector of each
sequential file is calculated based on a double sliding window statistical model.
Scheme 4. The method of scheme 2, wherein the state statistics vector of each
sequential file is calculated by the following steps:
generating a cluster state matrix of the sequential file according to the clustering
result of its temporal features;
selecting multiple row combinations from the cluster state matrix;
performing double sliding window state statistics on each of the multiple row
combinations to generate a combined state statistics matrix of the sequential file;
and
flattening the combined state statistics matrix to obtain the state statistics vector
of the sequential file.
Scheme 5. The method of scheme 4, wherein performing double sliding window state
statistics on each of the multiple row combinations to generate the combined state
statistics matrix of the sequential file comprises:
setting an outer window in each row combination, and setting an inner window within
the outer window;
sliding the inner window within the outer window to obtain multiple state observation
values;
computing a state statistics value from the multiple state observation values;
sliding the outer window along the row combination to compute multiple state
statistics values; and
forming the combined state statistics matrix of the sequential file from the state
statistics values of the multiple row combinations.
Scheme 6. The method of scheme 4 or 5, wherein the number of rows of the cluster state
matrix of each sequential file is equal to the number of clusters preset when
clustering the temporal features, and each column of the cluster state matrix
represents the clustering result of one frame of the temporal features of the
sequential file.
Scheme 7. The method of any one of schemes 1-6, wherein constructing the feature
matrix using the state statistics vectors of the multiple sequential files comprises:
combining the state statistics vectors of the multiple sequential files into a state
statistics matrix; and
performing weight conversion on the state statistics matrix to form the feature
matrix.
Scheme 8. The method of scheme 7, wherein the weight conversion is performed by
calculating the difference between each element in the state statistics matrix and
the other elements in its row.
Scheme 9. The method of scheme 7, wherein the multiple sequential files are multiple
music files.
Scheme 10. The method of scheme 9, wherein the weight conversion is performed by
calculating the ratio of the entropy of each element in the state statistics matrix
to the marginal entropy of its row.
Scheme 11. A classification system for sequential files, comprising:
a feature extraction device, which extracts temporal features from multiple
sequential files;
a calculation device, which calculates a state statistics vector for each sequential
file according to the temporal features extracted by the feature extraction device,
wherein the elements of the state statistics vector reflect the state statistics
results of the corresponding sequential file;
a matrix construction device, which constructs a feature matrix using the state
statistics vectors of the multiple sequential files calculated by the calculation
device; and
a classifier, which classifies the multiple sequential files according to the feature
matrix constructed by the matrix construction device.
Scheme 12. The system of scheme 11, wherein the calculation device comprises:
a clustering unit, which clusters the temporal features that the feature extraction
device extracts from each sequential file; and
a computing unit, which calculates the state statistics vector of each sequential
file according to the clustering unit's clustering result for the temporal features
of that sequential file.
Scheme 13. The system of scheme 12, wherein the computing unit calculates the state
statistics vector of each sequential file based on a double sliding window
statistical model.
Scheme 14. The system of scheme 12, wherein the computing unit comprises:
a matrix generation subunit, which generates a cluster state matrix of each
sequential file according to the clustering unit's clustering result for the temporal
features of that sequential file;
a combination subunit, which selects multiple row combinations from the cluster state
matrix generated by the matrix generation subunit;
a statistics subunit, which performs double sliding window state statistics on each
of the multiple row combinations selected by the combination subunit to generate a
combined state statistics matrix of the sequential file; and
a flattening subunit, which flattens the combined state statistics matrix to obtain
the state statistics vector of the sequential file.
Scheme 15. The system of scheme 14, wherein the statistics subunit comprises:
a window control module, which sets an outer window in each row combination, sets an
inner window within the outer window, and controls the sliding of the inner window
within the outer window and the sliding of the outer window along the row
combination;
a recording module, which records the state observation values of the inner window;
a statistics module, which computes state statistics values from the multiple state
observation values recorded by the recording module; and
a matrix constituting module, which forms the combined state statistics matrix from
the state statistics values of the multiple row combinations.
Scheme 16. The system of scheme 14 or 15, wherein the number of rows of the cluster
state matrix that the matrix generation subunit generates for each sequential file is
equal to the number of clusters preset when the clustering unit clusters the temporal
features, and each column of the cluster state matrix represents the clustering
result of one frame of the temporal features of the sequential file.
Scheme 17. The system of any one of schemes 11-16, wherein the matrix construction
device comprises:
a matrix forming unit, which combines the state statistics vectors of the multiple
sequential files calculated by the calculation device into a state statistics matrix;
and
a weight conversion unit, which performs weight conversion on the state statistics
matrix formed by the matrix forming unit to form the feature matrix.
Scheme 18. The system of scheme 17, wherein the weight conversion unit performs the
weight conversion by calculating the difference between each element in the state
statistics matrix and the other elements in its row.
Scheme 19. The system of scheme 17, wherein the multiple sequential files are multiple
music files.
Scheme 20. The system of scheme 19, wherein the weight conversion unit performs the
weight conversion by calculating the ratio of the entropy of each element in the
state statistics matrix to the marginal entropy of its row.
Claims (10)
1. A classification method for sequential files, comprising:
extracting temporal features from multiple sequential files;
calculating a state statistics vector for each sequential file according to the
extracted temporal features, wherein the elements of the state statistics vector
reflect the state statistics results of the corresponding sequential file;
constructing a feature matrix using the state statistics vectors of the multiple
sequential files; and
classifying the multiple sequential files according to the feature matrix.
2. The method of claim 1, wherein calculating the state statistics vector of each
sequential file according to the extracted temporal features comprises:
clustering the temporal features extracted from each sequential file; and
calculating the state statistics vector of each sequential file according to the
clustering result of the temporal features of that sequential file.
3. The method of claim 2, wherein the state statistics vector of each sequential file
is calculated based on a double sliding window statistical model.
4. The method of claim 2, wherein the state statistics vector of each sequential file
is calculated by the following steps:
generating a cluster state matrix of the sequential file according to the clustering
result of its temporal features;
selecting multiple row combinations from the cluster state matrix;
performing double sliding window state statistics on each of the multiple row
combinations to generate a combined state statistics matrix of the sequential file;
and
flattening the combined state statistics matrix to obtain the state statistics vector
of the sequential file.
5. The method of claim 4, wherein performing double sliding window state statistics
on each of the multiple row combinations to generate the combined state statistics
matrix of the sequential file comprises:
setting an outer window in each row combination, and setting an inner window within
the outer window;
sliding the inner window within the outer window to obtain multiple state observation
values;
computing a state statistics value from the multiple state observation values;
sliding the outer window along the row combination to compute multiple state
statistics values; and
forming the combined state statistics matrix of the sequential file from the state
statistics values of the multiple row combinations.
6. The method of claim 4, wherein the number of rows of the cluster state matrix of
each sequential file is equal to the number of clusters preset when clustering the
temporal features, and each column of the cluster state matrix represents the
clustering result of one frame of the temporal features of the sequential file.
7. The method of any one of claims 1-6, wherein constructing the feature matrix using
the state statistics vectors of the multiple sequential files comprises:
combining the state statistics vectors of the multiple sequential files into a state
statistics matrix; and
performing weight conversion on the state statistics matrix to form the feature
matrix.
8. The method of claim 7, wherein the weight conversion is performed by calculating
the difference between each element in the state statistics matrix and the other
elements in its row.
9. The method of claim 7, wherein the multiple sequential files are multiple music
files, and the weight conversion is performed by calculating the ratio of the entropy
of each element in the state statistics matrix to the marginal entropy of its row.
10. A classification system for sequential files, comprising:
a feature extraction device, which extracts temporal features from multiple
sequential files;
a calculation device, which calculates a state statistics vector for each sequential
file according to the temporal features extracted by the feature extraction device,
wherein the elements of the state statistics vector reflect the state statistics
results of the corresponding sequential file;
a matrix construction device, which constructs a feature matrix using the state
statistics vectors of the multiple sequential files calculated by the calculation
device; and
a classifier, which classifies the multiple sequential files according to the feature
matrix constructed by the matrix construction device.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201510232775.6A CN106202128A (en) | 2015-05-08 | 2015-05-08 | The sorting technique of sequential file and categorizing system |
Publications (1)
Publication Number | Publication Date |
---|---|
CN106202128A true CN106202128A (en) | 2016-12-07 |
Family
ID=57459878
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201510232775.6A Pending CN106202128A (en) | 2015-05-08 | 2015-05-08 | The sorting technique of sequential file and categorizing system |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN106202128A (en) |
Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN101599271A (en) * | 2009-07-07 | 2009-12-09 | 华中科技大学 | A kind of recognition methods of digital music emotion |
CN102129456A (en) * | 2011-03-09 | 2011-07-20 | 天津大学 | Method for monitoring and automatically classifying music factions based on decorrelation sparse mapping |
CN102842310A (en) * | 2012-08-10 | 2012-12-26 | 上海协言科学技术服务有限公司 | Method for extracting and utilizing audio features for repairing Chinese national folk music audios |
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110134839A (en) * | 2019-03-27 | 2019-08-16 | 平安科技(深圳)有限公司 | Time series data characteristic processing method, apparatus and computer readable storage medium |
CN110134839B (en) * | 2019-03-27 | 2023-06-06 | 平安科技(深圳)有限公司 | Time sequence data characteristic processing method and device and computer readable storage medium |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
C06 | Publication | ||
PB01 | Publication | ||
C10 | Entry into substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
WD01 | Invention patent application deemed withdrawn after publication | Application publication date: 20161207 |