CN107341239A - Cluster data analysis method and device - Google Patents
- Publication number
- CN107341239A CN201710541642.6A
- Authority
- CN
- China
- Prior art keywords
- data
- time
- point
- time point
- change
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/22—Indexing; Data structures therefor; Storage structures
- G06F16/2282—Tablespace storage structures; Management thereof
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/23—Updating
- G06F16/2358—Change logging, detection, and notification
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/28—Databases characterised by their database models, e.g. relational or object models
- G06F16/284—Relational databases
- G06F16/285—Clustering or classification
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Databases & Information Systems (AREA)
- Data Mining & Analysis (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Software Systems (AREA)
- Management, Administration, Business Operations System, And Electronic Commerce (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
This application discloses a cluster data analysis method and device. The method includes: selecting, within a predetermined time period, the mobile cluster object data corresponding to time points separated by a predetermined time interval; establishing an abnormal data dynamic table; classifying the mobile cluster object data of each time point together with the abnormal data points in the abnormal data dynamic table to obtain a preliminary classification result, and storing the unclassified mobile cluster object data in the abnormal data dynamic table as abnormal data points; and, starting from the first time point, analyzing the change between the preliminary classification result of each time point and that of the preceding time point, and marking the preliminary classification result of each time point with its change status to obtain the classification result. Establishing an abnormal data dynamic table that can hold unclassified data stores the abnormal data and avoids losing useful data; because the abnormal data are also included in classification, the data analysis becomes more accurate.
Description
Technical field
This application relates to the field of big data and mobile data analysis, and in particular to a cluster data analysis method and device.
Background
With the spread of big data technology, big data applications have become common in daily life. In particular, vendors use big data analysis to push targeted content such as advertisements and push messages to the most suitable recipients, which is one of the important applications of big data. Meanwhile, the volume of mobile data, that is, data containing an object's movement and position information, keeps growing, which makes it possible to market products to objects in a more targeted way. Mobile data can also be used to study traffic congestion prediction and animal migration. However, mining the patterns of moving objects from mobile data is challenging, because the object data are of diverse types and the analysis has demanding real-time requirements.
Pattern mining of mobile data is typically applied to, for example, traffic management, logistics distribution and crowd monitoring. These applications need to analyze how clusters change, and in particular the nature of a cluster change: whether a cluster corresponding to a group of cars has simply disappeared or its members have moved into other clusters, and whether a newly emerging cluster reflects new vehicles or a new target group, or results from a shift in the preferences of existing customers.
Studying cluster change therefore means analyzing how cluster data change over a period of time. The raw data are first divided into classes so that they can be studied cluster by cluster, and changes are then determined from the differences between the clusters at different time points. This is the cluster data analysis method in general use today.
However, the current analysis method yields results close to reality only when it is applied to a small amount of data. As the data volume grows, the results of its pattern analysis deviate more from reality and do not meet expectations.
Therefore, how to reduce the large error of cluster data analysis methods is a problem of great interest to those skilled in the art.
Summary of the invention
The purpose of this application is to provide a cluster data analysis method and device. A table for storing abnormal data is added to the traditional analysis method, and the data in that table are also classified during classification, so that data with analytical value are not lost and the analysis does not produce large errors or results that fail to meet expectations.
To solve the above technical problem, this application provides a cluster data analysis method, including:
selecting, within a predetermined time period, the mobile cluster object data corresponding to time points separated by a predetermined time interval;
establishing an abnormal data dynamic table;
classifying the mobile cluster object data of each time point together with the abnormal data points in the abnormal data dynamic table to obtain a preliminary classification result, and storing the unclassified mobile cluster object data in the abnormal data dynamic table as abnormal data points;
starting from the first time point, analyzing the change between the preliminary classification result of each time point and the preliminary classification result of the preceding time point, and marking the preliminary classification result of each time point with its change status according to that change, to obtain the classification result.
Optionally, the method further includes:
determining the relationships between the classes at each time point according to the classification result, and building a mobile cluster pattern tree;
determining, from the mobile cluster pattern tree, the frequent information of related mobile clusters.
Optionally, the change status marks specifically include:
retain, merge, split, expand, shrink, disappear and appear.
Optionally, establishing the abnormal data dynamic table includes:
establishing the abnormal data dynamic table;
setting the relevant processing parameters, where the processing parameters include a dynamic change time and an update time.
Optionally, storing the mobile cluster object data left unclassified during classification in the abnormal data dynamic table as abnormal data points further includes:
determining, according to the processing parameters, whether the existence time of the abnormal data points exceeds the update time;
if so, updating the abnormal data points.
This application also provides a cluster data analysis device, the device including:
a data selection module, configured to select, within a predetermined time period, the mobile cluster object data corresponding to time points separated by a predetermined time interval;
a table building module, configured to establish an abnormal data dynamic table;
a preliminary classification module, configured to classify the mobile cluster object data of each time point together with the abnormal data points in the abnormal data dynamic table to obtain a preliminary classification result, and to store the unclassified mobile cluster object data in the abnormal data dynamic table as abnormal data points;
a change marking module, configured to analyze, starting from the first time point, the change between the preliminary classification result of each time point and the preliminary classification result of the preceding time point, and to mark the preliminary classification result of each time point with its change status according to that change, to obtain the classification result.
Optionally, the device further includes:
a tree building module, configured to determine the relationships between the classes at each time point according to the classification result and to build a mobile cluster pattern tree;
a mining module, configured to determine, from the mobile cluster pattern tree, the frequent information of related mobile clusters.
Optionally, the table building module includes:
a table building unit, configured to establish the abnormal data dynamic table;
a parameter setting unit, configured to set the relevant processing parameters, where the processing parameters include a dynamic change time and an update time.
Optionally, the preliminary classification module further includes an update unit, where the update unit includes:
a time judgment subunit, configured to determine, according to the processing parameters, whether the existence time of the abnormal data points exceeds the update time;
an update subunit, configured to update the abnormal data points when their existence time exceeds the update time.
Existing cluster data analysis methods discard all unclassified data during classification. Yet for data that span a period of time, the abnormal data left unclassified at the current moment can benefit the classification result at the next moment. Discarding them therefore leads to large errors in the analysis results, and the situation they describe does not meet expectations.
The cluster data analysis method provided by this application therefore includes: selecting, within a predetermined time period, the mobile cluster object data corresponding to time points separated by a predetermined time interval; establishing an abnormal data dynamic table; classifying the mobile cluster object data of each time point together with the abnormal data points in the abnormal data dynamic table to obtain a preliminary classification result, and storing the unclassified mobile cluster object data in the abnormal data dynamic table as abnormal data points; and, starting from the first time point, analyzing the change between the preliminary classification result of each time point and that of the preceding time point, and marking the preliminary classification result of each time point with its change status according to that change, to obtain the classification result.
Establishing an abnormal data dynamic table that can hold unclassified data stores the abnormal data and avoids losing useful data; because the abnormal data are also included in classification, the data analysis becomes more accurate. This application also provides a cluster data analysis device, which has the same beneficial effects; they are not repeated here.
Brief description of the drawings
To explain the technical solutions of the embodiments of this application or of the prior art more clearly, the drawings needed for describing the embodiments or the prior art are briefly introduced below. The drawings described below are merely embodiments of this application; those of ordinary skill in the art can derive other drawings from them without creative effort.
Fig. 1 is a flow chart of the cluster data analysis method provided by an embodiment of this application;
Fig. 2 is a detailed flow chart of the data analysis provided by an embodiment of this application;
Fig. 3 is a partial flow chart of the classification process provided by an embodiment of this application;
Fig. 4 is a flow chart of the pattern analysis provided by an embodiment of this application;
Fig. 5 is a diagram of the resulting pattern tree provided by an embodiment of this application;
Fig. 6 is a flow chart of establishing the dynamic table provided by an embodiment of this application;
Fig. 7 is a flow chart of updating the dynamic table provided by an embodiment of this application;
Fig. 8 is a block diagram of the cluster data analysis device provided by an embodiment of this application;
Fig. 9 is a block diagram of the pattern tree formation provided by an embodiment of this application;
Fig. 10 is a block diagram of the table building module provided by an embodiment of this application.
Detailed description
The core of this application is to provide a cluster data analysis method that establishes an abnormal data dynamic table, stores abnormal data in it and updates the stored data. This avoids the large analysis errors caused by losing useful data and improves the accuracy of the analysis method.
To make the purpose, technical solutions and advantages of the embodiments of this application clearer, the technical solutions in the embodiments are described clearly and completely below with reference to the drawings. The described embodiments are some, not all, of the embodiments of this application. All other embodiments obtained by those of ordinary skill in the art from these embodiments without creative effort fall within the scope of protection of this application.
Refer to Fig. 1, which is a flow chart of the cluster data analysis method provided by an embodiment of this application.
This embodiment may include:
S100, selecting, within a predetermined time period, the mobile cluster object data corresponding to time points separated by a predetermined time interval;
S200, establishing an abnormal data dynamic table;
It should be noted that steps S100 and S200 do not depend on each other, so there is no required order of execution: step S200 may be performed before step S100, or the two steps may be performed at the same time. No limitation is imposed here.
The predetermined time period in step S100 is the period that the study needs to analyze, and it can be chosen according to the actual situation of the analysis. For example, to study the cluster data of vehicles on a section of highway from 5 to 7 in the evening, a period that contains this interval should be selected. The selected period need not be exactly this interval: because the study concerns changing data, the changes at the beginning and end of the period must also be observed, so a suitable margin is added before the beginning and after the end in order to analyze the data in the period comprehensively.
The predetermined time interval is the spacing between consecutive sampling points within the period, and it can be determined by the actual situation of the analysis. Another important sampling parameter is the number of sampling points within the period: because a large amount of data must be analyzed, each added point increases the data volume to some degree, so a suitable number of sampling points is needed to obtain accurate results. For example, when studying the cluster data of vehicles on a section of highway from 5 to 7 in the evening, common sense suggests that traffic is heavy and speeds are low, so the number of sampling points can be reduced. When studying the same section from 5 to 7 in the morning, traffic is light and speeds are high, so the vehicles on the highway change quickly and the number of sampling points can be increased.
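The choice of time points described above can be illustrated with a minimal sketch. The function name `sample_time_points` and its `padding` argument (the margin added before and after the studied period) are hypothetical and do not come from the application; times are assumed to be minutes since midnight.

```python
def sample_time_points(start, end, interval, padding=0):
    """Return time points spaced `interval` apart over [start - padding, end + padding]."""
    if interval <= 0:
        raise ValueError("interval must be positive")
    first = start - padding   # margin before the studied period
    last = end + padding      # margin after the studied period
    points = []
    t = first
    while t <= last:
        points.append(t)
        t += interval
    return points


# 17:00 to 19:00, sampled every 30 minutes with a 30-minute margin on each side.
points = sample_time_points(17 * 60, 19 * 60, interval=30, padding=30)
```

A larger `interval` gives fewer sampling points, which matches the heavy, slow evening traffic case; a smaller one suits the fast-changing morning case.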
After the time points are determined, the mobile cluster object data corresponding to each time point are selected. The mobile cluster object data are expressed as the mobile data information O of a moving object at a given time point:
O = (oid, p(x, y), t)
where oid is the data type identifier, p(x, y) is the position of the moving object at time point t, x is the longitude, y is the latitude, and t is the time.
Define Ω(t) with O ∈ Ω(t): Ω(t) is a set of mobile object data, referred to as the mobile object location consistency set.
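As an illustration of these definitions, the following sketch models O as a small record and builds Ω(t) by grouping records by time point. The class name `MovingObject` and the function `group_by_time` are assumptions made for this example only.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class MovingObject:
    oid: str    # identifier (oid in the text)
    x: float    # longitude
    y: float    # latitude
    t: int      # time point


def group_by_time(records):
    """Build Omega(t): map each time point to the set of objects observed at it."""
    omega = defaultdict(set)
    for rec in records:
        omega[rec.t].add(rec)
    return dict(omega)


records = [
    MovingObject("a", 1.0, 2.0, 0),
    MovingObject("b", 1.5, 2.5, 0),
    MovingObject("a", 1.1, 2.1, 1),
]
omega = group_by_time(records)
```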
For the abnormal data dynamic table established in step S200, a data table should be created during data analysis that supports storing, modifying and deleting data. In this embodiment the dynamic table is named F-list.
S300, classifying the mobile cluster object data of each time point together with the abnormal data points in the abnormal data dynamic table to obtain a preliminary classification result, and storing the unclassified mobile cluster object data in the abnormal data dynamic table as abnormal data points;
It should be noted that the mobile cluster object data can be classified with an existing classification method, for example DBSCAN, KNN or K-means. The method can be chosen according to the performance requirements of the data analysis and the required accuracy of the results, and is not limited in this embodiment.
Unclassified data appear during classification and need to be saved as abnormal data in the abnormal data dynamic table. Likewise, classification operates on all the data, that is, both the data of the time point being classified and the data in the abnormal data dynamic table.
By establishing an abnormal data dynamic table that can hold unclassified data, this application stores the abnormal data and avoids losing useful data; because the abnormal data are also included in classification, the data analysis becomes more accurate.
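Step S300 can be sketched as follows. The application leaves the classifier open, so this example substitutes a simple distance-threshold grouping for DBSCAN, KNN or K-means; the function name `classify_with_flist` and the parameters `eps` and `min_size` are assumptions, not part of the application. Points are (x, y) tuples, and groups smaller than `min_size` are treated as unclassified.

```python
from math import dist


def classify_with_flist(points, f_list, eps=1.0, min_size=3):
    """Group current points plus stored abnormal points; return (clusters, abnormal)."""
    candidates = list(points) + list(f_list)   # F-list data take part in classification
    unvisited = set(range(len(candidates)))
    clusters, abnormal = [], []
    while unvisited:
        seed = unvisited.pop()
        component, frontier = [seed], [seed]
        while frontier:
            i = frontier.pop()
            # Collect every unvisited point within eps of the current point.
            near = [j for j in unvisited if dist(candidates[i], candidates[j]) <= eps]
            for j in near:
                unvisited.remove(j)
            component.extend(near)
            frontier.extend(near)
        members = [candidates[i] for i in component]
        if len(members) >= min_size:
            clusters.append(members)
        else:
            abnormal.extend(members)           # unclassified points go to the F-list
    return clusters, abnormal


# Time point t: two points are too few to form a class and are kept as abnormal data.
clusters_t0, f_list = classify_with_flist([(0, 0), (0.5, 0), (1, 0), (5, 5), (5.5, 5)], [])
# Time point t + Δt: a new nearby point lets the stored points form a class.
clusters_t1, f_list_t1 = classify_with_flist([(5.2, 5.3)], f_list)
```

The second call shows why keeping unclassified points can matter: points discarded at the first time point would have prevented the class from being found at the second.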
S400, starting from the first time point, analyzing the change between the preliminary classification result of each time point and the preliminary classification result of the preceding time point, and marking the preliminary classification result of each time point with its change status according to that change, to obtain the classification result.
The preliminary classification results obtained above are the classification results at each individual time point. Because the aim is to analyze the evolution patterns of cluster data objects, the classification results of the individual time points must be linked and analyzed to obtain their relationships. The preliminary classification result of each time point is therefore compared with that of the preceding time point, and the change category is derived and marked from the association between the two.
In this embodiment, Jaccard similarity is used to judge the change between two adjacent time points, which is then assigned to the corresponding change category and marked. Jaccard similarity here involves a confidence level: in the preliminary clustering results of adjacent time points, the change is judged from the proportion of data that the later time point shares with the earlier one. The similarity threshold must be determined empirically and is not limited here.
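For reference, the standard Jaccard similarity of two sets is the size of their intersection divided by the size of their union. A minimal sketch, in which clusters are represented as sets of object identifiers (a representation assumed for this example):

```python
def jaccard(a, b):
    """Jaccard similarity |A ∩ B| / |A ∪ B| of two collections of object ids."""
    a, b = set(a), set(b)
    union = a | b
    if not union:
        return 0.0   # convention chosen here for two empty sets
    return len(a & b) / len(union)


previous_cluster = {1, 2, 3}
current_cluster = {2, 3, 4}
similarity = jaccard(previous_cluster, current_cluster)
```

Whether a given similarity counts as the same cluster depends on the empirically chosen threshold mentioned above.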
The categories of change generally depend on the data being analyzed. The data usually correspond to a concrete real-world problem, which largely determines how the data change and therefore the change categories. For a simple problem, the data typically merge, split, disappear and appear, and the change categories can be divided accordingly. No limitation is imposed here.
In this embodiment the practical problem is the analysis of road traffic conditions, so the following seven change categories are used: survives (retention), merged (merging), splits (separation), expands (expansion), shrinks (contraction), disappears (disappearance) and appears (appearance).
Refer to Fig. 2, which is a detailed flow chart of the data analysis provided by an embodiment of this application.
The predetermined time period is denoted T, the predetermined time interval is denoted Δt, and the initial time point is denoted t.
Refer to Fig. 3, which is a partial flow chart of the classification process provided by an embodiment of this application.
The flow of this part of the classification process is as follows. Because of space limits the complete flow chart of the classification process cannot be shown; the flow shown here serves as an example, and the complete flow chart can be obtained by straightforward extension of this partial flow chart. It is therefore not described in full here.
Six time points are set in the period with interval Δt, namely t, t+Δt, t+2Δt, t+3Δt, t+4Δt and t+5Δt, and classification analysis is carried out at these six time points.
At t, classification yields four classes C1, C2, C3 and C4, all of which are marked appears (appearance). The points that cannot be classified at this time are stored in the abnormal data dynamic table F-list.
At t+Δt, classification shows that C1 and C2 of the previous time point have merged into one class C1', which is marked merged (merging). C3' has more members than C3 and is marked expands (expansion). C4 is unchanged and is marked survives (retention). The points that cannot be classified at this time continue to be stored in F-list.
At t+2Δt, C3' and C4 merge into a larger class C3'', which is marked merged (merging). Meanwhile C1' combines with some data from the abnormal data dynamic table to form C1''; this is marked expands (expansion), not merged (merging). The points that cannot be classified at this time continue to be stored in F-list.
At t+3Δt, because the table was filled at the preceding time point t+2Δt, it is updated, and the points that cannot be classified at this time are then stored in F-list. C1''' and C5 are formed by the splitting of C1'' from the previous time point, so both are marked splits (separation). C3''' is a reduced form of C3'' from the previous time point and is marked shrinks (contraction).
At t+4Δt, C1''' is unchanged and is marked survives (retention). C3'''' is a reduced form of C3''' from the previous time point and is marked shrinks (contraction). C5 has vanished entirely and is marked disappears (disappearance). The points that cannot be classified at this time continue to be stored in F-list.
At t+5Δt, C1''' and C3'''' show no change relative to the previous time point and are both marked survives (retention).
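The marking of changes between two adjacent time points can be sketched as below. The rules are one possible reading of the seven categories, not the application's definition: two clusters are treated as linked when the share of the smaller one contained in the other reaches a hypothetical `overlap` threshold (an overlap coefficient used here in place of a Jaccard threshold), and clusters are assumed to be sets of object identifiers.

```python
def label_changes(prev, curr, overlap=0.5):
    """Return (labels for clusters in curr, names of clusters in prev that disappeared)."""
    def linked(a, b):
        smaller = min(len(a), len(b))
        return smaller > 0 and len(a & b) / smaller >= overlap

    # For every current cluster, the previous clusters it is linked to.
    links = {c: [p for p in prev if linked(prev[p], curr[c])] for c in curr}
    # For every previous cluster, the number of current clusters linked to it.
    fanout = {p: sum(p in ps for ps in links.values()) for p in prev}

    labels = {}
    for c, parents in links.items():
        if not parents:
            labels[c] = "appears"
        elif len(parents) > 1:
            labels[c] = "merged"
        elif fanout[parents[0]] > 1:
            labels[c] = "splits"
        else:
            old, new = len(prev[parents[0]]), len(curr[c])
            labels[c] = "expands" if new > old else "shrinks" if new < old else "survives"
    disappeared = sorted(p for p, n in fanout.items() if n == 0)
    return labels, disappeared


# Transition from t to t + Δt in the example above (member ids are made up).
prev = {"C1": {1, 2, 3}, "C2": {4, 5}, "C3": {6, 7}, "C4": {8, 9}}
curr = {"C1'": {1, 2, 3, 4, 5}, "C3'": {6, 7, 10}, "C4": {8, 9}}
labels, disappeared = label_changes(prev, curr)
```

Object 10 in C3' stands for a point absorbed from outside the previous classes, for example from F-list, which is why C3' is marked as an expansion rather than a merge.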
Refer to Fig. 4 and Fig. 5. Fig. 4 is a flow chart of the pattern analysis provided by an embodiment of this application, and Fig. 5 is a diagram of the resulting pattern tree provided by an embodiment of this application.
Based on the above embodiment, this embodiment may further include:
S500, determining the relationships between the classes at each time point according to the classification result, and building a mobile cluster pattern tree;
S600, determining, from the mobile cluster pattern tree, the frequent information of related mobile clusters.
The mobile cluster pattern tree is built from the change categories marked at each time point. Starting from the first empty node under the root, the classes of C1 at successive time points are inserted in order to build the first branch, with each change marked. A second empty node is then inserted and the second branch is built from it; from the classification results and changes it is known that C2 merges into C1 at the second time point, so this change is marked in the tree and the process is recorded. The remaining branches are built in the same way to form the complete pattern tree.
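One simplified way to picture such a tree is a prefix tree in which every branch is the history of one initial class, each step being a (class, change mark) pair. This is an assumption made for illustration; the structure in Fig. 5, including its empty nodes, may differ. The function name `build_pattern_tree` is hypothetical.

```python
def build_pattern_tree(histories):
    """Insert each history, a list of (class, change mark) steps, as a branch under the root."""
    root = {}
    for steps in histories:
        node = root
        for step in steps:
            # Reuse an existing child for this step or create a new one.
            node = node.setdefault(step, {})
    return root


histories = [
    [("C1", "appears"), ("C1'", "merged"), ("C1''", "expands")],
    [("C2", "appears"), ("C1'", "merged"), ("C1''", "expands")],
]
tree = build_pattern_tree(histories)
```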
Then, in light of the actual situation, a suitable information mining method is selected to determine the frequent information of related mobile clusters, which yields the associated movement patterns that occur frequently.
For example, on a real road section, the period from 5 to 7 in the evening on a viaduct is selected. Analysis of the pattern tree shows that merging (merged) and expansion (expands) occur frequently, which characterizes the vehicle situation in that period and is of great significance for traffic control.
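The application does not fix a mining method. As a minimal sketch of one option, the change marks along all branches can simply be counted and those reaching a hypothetical `min_count` threshold reported; the function name `frequent_labels` is also an assumption.

```python
from collections import Counter


def frequent_labels(histories, min_count=2):
    """Count change marks over all histories and keep those seen at least min_count times."""
    counts = Counter(label for steps in histories for _, label in steps)
    return {label: n for label, n in counts.items() if n >= min_count}


histories = [
    [("C1", "appears"), ("C1'", "merged"), ("C1''", "expands")],
    [("C2", "appears"), ("C1'", "merged"), ("C1''", "expands")],
    [("C4", "appears"), ("C4", "survives")],
]
frequent = frequent_labels(histories)
```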
Refer to Fig. 6, which is a flow chart of establishing the dynamic table provided by an embodiment of this application.
Based on the above embodiment, establishing the abnormal data dynamic table in this embodiment may include:
S210, establishing the abnormal data dynamic table;
S220, setting the relevant processing parameters, where the processing parameters include a dynamic change time and an update time.
It should be noted that after the abnormal data dynamic table is established, the relevant processing parameters are set. The abnormal data dynamic table is expressed as:
F-list(τ, θ)
where τ = T/n, n = 1, 2, 3, …, is the length of time for which the selected abnormal data points are kept, and θ = τ/n, n = 1, 2, 3, …, is the sub-period of existence after which the selected abnormal data points are updated.
These parameters can be set according to the data and the actual situation. Their values affect the amount of data scanned in subsequent classification and the accuracy of the results: if the values are too large, too much data is held at once, which increases the classification load and slows data processing; if they are too small, useful data are removed too early, which increases the error of subsequent analysis. The values therefore depend on the specific situation and are not specifically limited here.
In this embodiment τ is set to 3, meaning the dynamic table is updated once it holds the data of 3 time points, and θ is set to 2, meaning the data stored at the first two time points are deleted when the table is updated.
Refer to Fig. 7, which is a flow chart of updating the dynamic table provided by an embodiment of this application.
Based on the above embodiment, this embodiment may further include:
S321, determining, according to the processing parameters, whether the existence time of the abnormal data points exceeds the update time;
S322, if so, updating the abnormal data points.
Corresponding to the above embodiment, a judgment is made during processing: when the abnormal data points are found to have exceeded the update time, that is, the value τ, the data stored at the first two time points are updated.
Data are updated to avoid storing too much redundant data in the abnormal data dynamic table, which would make the amount of data scanned during classification too large and increase the machine load. The time after which the data must be updated is therefore specified, and an update operation is performed on timeout. The update may delete all timed-out data, delete part of them after comparison, or move them to another table for later use instead of deleting them.
In this embodiment the timed-out data are deleted, in order to reduce the amount of data that must be scanned each time and to lighten the machine load.
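The behaviour of F-list(τ, θ) with deletion on timeout can be sketched as follows. The class name `FList` and its methods are hypothetical; the sketch assumes, as in the example with τ = 3 and θ = 2, that the table is trimmed when it already holds data from τ time points and a new time point arrives.

```python
class FList:
    """Abnormal data dynamic table keyed by time point (illustrative sketch only)."""

    def __init__(self, tau=3, theta=2):
        if not 0 < theta <= tau:
            raise ValueError("expected 0 < theta <= tau")
        self.tau = tau        # number of time points the table may hold before an update
        self.theta = theta    # number of oldest time points deleted on update
        self._by_time = {}    # insertion-ordered: oldest time point first

    def store(self, t, points):
        """Store the unclassified points of time point t, trimming first if the table is full."""
        if len(self._by_time) >= self.tau:
            for old_t in list(self._by_time)[: self.theta]:
                del self._by_time[old_t]
        self._by_time[t] = list(points)

    def points(self):
        """Return all stored abnormal points, oldest first."""
        return [p for pts in self._by_time.values() for p in pts]


f_list = FList(tau=3, theta=2)
f_list.store(0, ["a"])
f_list.store(1, ["b"])
f_list.store(2, ["c"])
before_update = f_list.points()
f_list.store(3, ["d"])      # table was full, so the data of time points 0 and 1 are deleted
after_update = f_list.points()
```

Moving the timed-out data to another table instead of deleting them, which the text also allows, would only change the body of the trimming loop.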
The embodiments of this application provide a cluster data analysis method that establishes an abnormal data dynamic table to store the abnormal data arising during classification, which avoids losing useful data and improves the accuracy of the analysis method.
The cluster data analysis device provided by the embodiments of this application is introduced below. The device described below and the method described above may be referred to in correspondence with each other.
Refer to Fig. 8, which is a block diagram of the cluster data analysis device provided by an embodiment of this application.
This embodiment provides a cluster data analysis device, which may include:
a data selection module 100, configured to select, within a predetermined time period, the mobile cluster object data corresponding to time points separated by a predetermined time interval;
a table building module 200, configured to establish an abnormal data dynamic table;
a preliminary classification module 300, configured to classify the mobile cluster object data of each time point together with the abnormal data points in the abnormal data dynamic table to obtain a preliminary classification result, and to store the unclassified mobile cluster object data in the abnormal data dynamic table as abnormal data points;
a change marking module 400, configured to analyze, starting from the first time point, the change between the preliminary classification result of each time point and the preliminary classification result of the preceding time point, and to mark the preliminary classification result of each time point with its change status according to that change, to obtain the classification result.
Please refer to Fig. 9, which is a block diagram of the formation of the pattern tree provided by an embodiment of the present application.
Based on the above embodiment, this embodiment may further include:
a tree-building module 500, configured to determine the relations between the classes at each time point according to the classification results and to build a mobile cluster pattern tree;
a mining module 600, configured to determine the frequent information of the related mobile clusters according to the mobile cluster pattern tree.
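One way to picture the work of modules 500 and 600 is a counted prefix tree over change-mark sequences. This section does not specify the layout of the tree built by module 500 or the criterion for frequent information, so everything in the sketch below is an assumption: each tracked class is represented by its sequence of change marks over consecutive time points, PatternNode, build_pattern_tree, and frequent_patterns are hypothetical names, and a pattern counts as frequent when at least min_support sequences share it.

```python
from dataclasses import dataclass, field
from typing import Dict, Iterable, List, Sequence, Tuple


@dataclass
class PatternNode:
    """One node of the (assumed) pattern tree: a change mark reached by `count` sequences."""
    count: int = 0
    children: Dict[str, "PatternNode"] = field(default_factory=dict)


def build_pattern_tree(sequences: Iterable[Sequence[str]]) -> PatternNode:
    """Insert every change-mark sequence into a counted prefix tree."""
    root = PatternNode()
    for seq in sequences:
        node = root
        for mark in seq:
            node = node.children.setdefault(mark, PatternNode())
            node.count += 1
    return root


def frequent_patterns(root: PatternNode, min_support: int) -> List[Tuple[Tuple[str, ...], int]]:
    """Return every prefix shared by at least `min_support` sequences, sorted."""
    found: List[Tuple[Tuple[str, ...], int]] = []
    stack = [((), root)]
    while stack:
        prefix, node = stack.pop()
        for mark, child in node.children.items():
            # Counts never grow along a path, so infrequent branches are pruned here.
            if child.count >= min_support:
                path = prefix + (mark,)
                found.append((path, child.count))
                stack.append((path, child))
    return sorted(found)


# Hypothetical change histories of three tracked classes.
histories = [
    ["appear", "expand", "merge"],
    ["appear", "expand", "shrink"],
    ["appear", "retain"],
]
tree = build_pattern_tree(histories)
patterns = frequent_patterns(tree, min_support=2)
print(patterns)
```

A prefix tree keeps shared beginnings of the histories in one branch, so the support of a pattern can be read from a single node instead of rescanning every history.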
Please refer to Fig. 10, which is a block diagram of the table-building module provided by an embodiment of the present application.
Based on the above embodiment, the table-building module 200 may include:
a table-building unit 210, configured to establish the abnormal data dynamic table;
a parameter-setting unit 220, configured to set the related processing parameters, where the processing parameters include a dynamic change time and an update time.
Based on the above embodiment, this embodiment may further include an update unit, where the update unit may include:
a time judgment subunit, configured to judge, according to the processing parameters, whether the existence time of an abnormal data point exceeds the update time;
an update subunit, configured to update the abnormal data point when its existence time exceeds the update time.
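The table-building, parameter-setting, and update units can be sketched together as a small Python class. The names AbnormalDataTable, store, and update are hypothetical. The sketch assumes that updating a timed-out point means deleting it, in line with the deletion of timed-out data described earlier for this embodiment, and that time is a plain number. The meaning of the dynamic change time is not defined in this section, so the sketch only stores it.

```python
from dataclasses import dataclass, field
from typing import Dict, Hashable, List, Tuple

Point = Tuple[float, float]  # assumed 2-D position of a moving object


@dataclass
class AbnormalDataTable:
    """Assumed in-memory form of the abnormal data dynamic table."""
    dynamic_change_time: float  # stored only; its semantics are not defined here
    update_time: float          # longest time a point may stay in the table
    _rows: Dict[Hashable, Tuple[Point, float]] = field(default_factory=dict)

    def store(self, obj_id: Hashable, point: Point, now: float) -> None:
        """Record an unclassified point together with the time it was stored."""
        self._rows[obj_id] = (point, now)

    def points(self) -> Dict[Hashable, Point]:
        """Points to be classified together with the next time point's data."""
        return {obj_id: point for obj_id, (point, _) in self._rows.items()}

    def update(self, now: float) -> List[Hashable]:
        """Delete points whose existence time exceeds update_time; return their ids."""
        expired = [obj_id for obj_id, (_, stored_at) in self._rows.items()
                   if now - stored_at > self.update_time]
        for obj_id in expired:
            del self._rows[obj_id]
        return expired


# Hypothetical usage: one point is stored at t=0, another at t=8.
table = AbnormalDataTable(dynamic_change_time=5, update_time=10)
table.store("a", (0.0, 0.0), now=0)
table.store("b", (1.0, 1.0), now=8)
removed = table.update(now=12)
print(removed, table.points())
```

Keeping the storage time next to each point lets the time judgment be a single subtraction, and returning the removed identifiers makes it easy to log which points were dropped.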
The embodiments in this specification are described in a progressive manner. Each embodiment focuses on its differences from the other embodiments, and the same or similar parts of the embodiments may be referred to one another. Because the device disclosed in the embodiments corresponds to the method disclosed in the embodiments, its description is relatively brief; for the relevant details, refer to the description of the method.
Those skilled in the art will further appreciate that the units and algorithm steps of the examples described in connection with the embodiments disclosed herein can be implemented in electronic hardware, computer software, or a combination of the two. To clearly illustrate the interchangeability of hardware and software, the composition and steps of each example have been described above generally in terms of function. Whether these functions are performed in hardware or in software depends on the specific application and the design constraints of the technical solution. Skilled practitioners may implement the described functions in different ways for each specific application, but such implementations should not be considered beyond the scope of the present application.
The steps of the method or algorithm described in connection with the embodiments disclosed herein may be implemented directly in hardware, in a software module executed by a processor, or in a combination of the two. The software module may reside in random access memory (RAM), internal memory, read-only memory (ROM), electrically programmable ROM, electrically erasable programmable ROM, a register, a hard disk, a removable disk, a CD-ROM, or any other form of storage medium known in the technical field.
The cluster data analysis method and device provided by the present application have been described in detail above. Specific examples are used herein to explain the principles and implementations of the present application, and the description of the above embodiments is intended only to help in understanding the method of the present application and its core idea. It should be noted that those of ordinary skill in the art may make improvements and modifications to the present application without departing from its principles, and these improvements and modifications also fall within the protection scope of the claims of the present application.
Claims (9)
1. A cluster data analysis method, characterized in that the method comprises:
selecting the mobile cluster object data corresponding to time points separated by a predetermined time interval within a predetermined time period;
establishing an abnormal data dynamic table;
classifying the mobile cluster object data at each time point together with the abnormal data points in the abnormal data dynamic table to obtain an initial classification result, and storing the unclassified mobile cluster object data in the abnormal data dynamic table as abnormal data points;
starting from the first time point, analyzing the change between the initial classification result at each time point and the initial classification result at the preceding time point, and marking the initial classification result at each time point according to that change to obtain the classification results.
2. The method according to claim 1, characterized by further comprising:
determining the relations between the classes at each time point according to the classification results, and building a mobile cluster pattern tree;
determining the frequent information of the related mobile clusters according to the mobile cluster pattern tree.
3. The method according to claim 2, characterized in that the marks of the change specifically include:
retain, merge, separate, expand, shrink, disappear, and appear.
4. The method according to claim 3, characterized in that establishing the abnormal data dynamic table comprises:
establishing the abnormal data dynamic table;
setting the related processing parameters, where the processing parameters include a dynamic change time and an update time.
5. The method according to claim 4, characterized in that storing the mobile cluster object data left unclassified during classification in the abnormal data dynamic table as abnormal data points further comprises:
judging, according to the processing parameters, whether the existence time of an abnormal data point exceeds the update time;
if so, updating the abnormal data point.
6. A cluster data analysis device, characterized in that the device comprises:
a data selection module, configured to select the mobile cluster object data corresponding to time points separated by a predetermined time interval within a predetermined time period;
a table-building module, configured to establish an abnormal data dynamic table;
an initial classification module, configured to classify the mobile cluster object data at each time point together with the abnormal data points in the abnormal data dynamic table to obtain an initial classification result, and to store the unclassified mobile cluster object data in the abnormal data dynamic table as abnormal data points;
a change marking module, configured to, starting from the first time point, analyze the change between the initial classification result at each time point and the initial classification result at the preceding time point, and to mark the initial classification result at each time point according to that change, obtaining the classification results.
7. The device according to claim 6, characterized by further comprising:
a tree-building module, configured to determine the relations between the classes at each time point according to the classification results and to build a mobile cluster pattern tree;
a mining module, configured to determine the frequent information of the related mobile clusters according to the mobile cluster pattern tree.
8. The device according to claim 7, characterized in that the table-building module comprises:
a table-building unit, configured to establish the abnormal data dynamic table;
a parameter-setting unit, configured to set the related processing parameters, where the processing parameters include a dynamic change time and an update time.
9. The device according to claim 8, characterized in that the initial classification module further comprises an update unit, where the update unit comprises:
a time judgment subunit, configured to judge, according to the processing parameters, whether the existence time of an abnormal data point exceeds the update time;
an update subunit, configured to update the abnormal data point when its existence time exceeds the update time.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201710541642.6A CN107341239B (en) | 2017-07-05 | 2017-07-05 | Cluster data analysis method and device |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201710541642.6A CN107341239B (en) | 2017-07-05 | 2017-07-05 | Cluster data analysis method and device |
Publications (2)
Publication Number | Publication Date |
---|---|
CN107341239A (en) | 2017-11-10 |
CN107341239B (en) | 2020-08-07 |
Family
ID=60217957
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201710541642.6A Active CN107341239B (en) | 2017-07-05 | 2017-07-05 | Cluster data analysis method and device |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN107341239B (en) |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109002261A (en) * | 2018-07-11 | 2018-12-14 | 佛山市云端容灾信息技术有限公司 | Difference block big data analysis method, apparatus, storage medium and server |
Citations (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN101908065A (en) * | 2010-07-27 | 2010-12-08 | 浙江大学 | On-line attribute abnormal point detecting method for supporting dynamic update |
US20120226475A1 (en) * | 2011-03-03 | 2012-09-06 | Hitachi Kokusai Electric Inc. | Substrate processing system, management apparatus, data analysis method |
CN104487991A (en) * | 2011-12-30 | 2015-04-01 | 施耐德电气(美国)公司 | Energy management with correspondence based data auditing signoff |
CN106101102A (en) * | 2016-06-15 | 2016-11-09 | 华东师范大学 | A kind of exception flow of network detection method based on PAM clustering algorithm |
CN106203519A (en) * | 2016-07-17 | 2016-12-07 | 合肥赑歌数据科技有限公司 | Fault pre-alarming algorithm based on taxonomic clustering |
CN106657065A (en) * | 2016-12-23 | 2017-05-10 | 陕西理工学院 | Network abnormality detection method based on data mining |
Non-Patent Citations (4)
Title |
---|
MANZOOR: "Efficient Clustering_based Outlier Detection Algorithm for Dynamic Data Stream", 《2008 FIFTH INTERNATIONAL CONFERENCE ON FUZZY SYSTEMS AND KNOWLEDGE DISCOVERY》 * |
TICIANA L: "Discovering Frequent Mobility Patterns on Moving Object Data", 《2014MOBIGIS》 * |
孟静: "异常数据挖掘算法研究与应用", 《中国优秀硕士学位论文全文数据库信息科技辑》 * |
王传玉: "基于异常数据挖掘算法的研究", 《中国优秀硕士学位论文全文数据库信息科技辑》 * |
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109002261A (en) * | 2018-07-11 | 2018-12-14 | 佛山市云端容灾信息技术有限公司 | Difference block big data analysis method, apparatus, storage medium and server |
CN109002261B (en) * | 2018-07-11 | 2022-03-22 | 佛山市云端容灾信息技术有限公司 | Method and device for analyzing big data of difference block, storage medium and server |
Also Published As
Publication number | Publication date |
---|---|
CN107341239B (en) | 2020-08-07 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||