CN107341239A - Cluster data analysis method and device - Google Patents

Cluster data analysis method and device

Info

Publication number
CN107341239A
Authority
CN
China
Prior art keywords
data
time
point
time point
change
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201710541642.6A
Other languages
Chinese (zh)
Other versions
CN107341239B (en
Inventor
程良伦
傅应龙
王卓薇
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Guangdong University of Technology
Original Assignee
Guangdong University of Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Guangdong University of Technology filed Critical Guangdong University of Technology
Priority to CN201710541642.6A priority Critical patent/CN107341239B/en
Publication of CN107341239A publication Critical patent/CN107341239A/en
Application granted granted Critical
Publication of CN107341239B publication Critical patent/CN107341239B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/22Indexing; Data structures therefor; Storage structures
    • G06F16/2282Tablespace storage structures; Management thereof
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/23Updating
    • G06F16/2358Change logging, detection, and notification
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/28Databases characterised by their database models, e.g. relational or object models
    • G06F16/284Relational databases
    • G06F16/285Clustering or classification

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Software Systems (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

This application discloses a cluster data analysis method and device. The method includes: selecting the mobile cluster object data corresponding to time points separated by a predetermined time interval within a predetermined time period; establishing an abnormal data dynamic table; classifying the mobile cluster object data at each time point together with the abnormal data points in the abnormal data dynamic table to obtain an initial classification result, and storing the unclassified mobile cluster object data in the abnormal data dynamic table as abnormal data points; and, starting from the first time point, analyzing the change between the initial classification result at each time point and that at the preceding time point, and labeling the initial classification result at each time point with its change status to obtain the classification result. By establishing an abnormal data dynamic table that preserves unclassified data, abnormal data is stored and the loss of useful data is avoided; because the abnormal data is also included in classification, the data analysis becomes more accurate.

Description

Cluster data analysis method and device
Technical field
This application relates to the field of big-data mobile data analysis, and in particular to a cluster data analysis method and device.
Background
With the widespread adoption of big data technology, applications of big data have become common in daily life. In particular, vendors use big-data analysis to push targeted content such as advertisements and messages to the most suitable recipients, which is one of the important applications of big data. Meanwhile, the volume of mobile data, that is, data containing an object's movement and position information, keeps growing, which allows products to be marketed to objects in a more targeted way. Mobile data can also be used to study traffic congestion prediction and animal migration. However, mining the patterns of moving objects from mobile data is challenging, because object data comes in diverse types and the analysis must meet strict real-time requirements.
Pattern mining on mobile data is commonly applied to, for example, traffic management, logistics distribution, and crowd monitoring. These applications require analyzing how clusters change, and in particular the nature of each change: whether a cluster corresponding to a group of cars has simply disappeared or its members have moved into other clusters, and whether a newly emerging cluster reflects new vehicles or a new target group, or results from a shift in the preferences of existing customers.
Studying cluster change therefore means analyzing how cluster data changes over a period of time. The raw data is first divided into classes so that it can be studied in units of clusters, and changes are then determined from the differences between the clusters at different time points. This is the cluster data analysis method in general use today.
However, when current analysis methods are applied to a small amount of data, the results deviate little from reality; when the data volume increases, the results of the pattern analysis deviate considerably from reality and do not meet expectations.
Therefore, how to reduce the large error of cluster data analysis methods is a problem of great interest to those skilled in the art.
Summary of the invention
The purpose of this application is to provide a cluster data analysis method and device. A table that stores abnormal data is added to the traditional analysis method, and the data in that table is also classified during classification, so that data of analytical value is not lost and the analysis does not produce large errors or results that fail to meet expectations.
To solve the above technical problem, this application provides a cluster data analysis method, including:
selecting the mobile cluster object data corresponding to time points separated by a predetermined time interval within a predetermined time period;
establishing an abnormal data dynamic table;
classifying the mobile cluster object data at each time point together with the abnormal data points in the abnormal data dynamic table to obtain an initial classification result, and storing the unclassified mobile cluster object data in the abnormal data dynamic table as abnormal data points;
starting from the first time point, analyzing the change between the initial classification result at each time point and the initial classification result at the preceding time point, and labeling the initial classification result at each time point with its change status according to that change, to obtain the classification result.
Optionally, the method further includes:
determining the relationships between the classes at each time point according to the classification result, and building a mobile cluster pattern tree;
determining the frequent information of related mobile clusters according to the mobile cluster pattern tree.
Optionally, the change-status labels specifically include:
retain, merge, split, expand, shrink, disappear, and appear.
Optionally, establishing the abnormal data dynamic table includes:
establishing the abnormal data dynamic table;
setting the relevant processing parameters, where the processing parameters include a dynamic change time and an update time.
Optionally, storing the mobile cluster object data left unclassified in the abnormal data dynamic table as abnormal data points further includes:
judging, according to the processing parameters, whether the existence time of the abnormal data points exceeds the update time;
if so, updating the abnormal data points.
This application also provides a cluster data analysis device, the device including:
a data selection module, configured to select the mobile cluster object data corresponding to time points separated by a predetermined time interval within a predetermined time period;
a table building module, configured to establish an abnormal data dynamic table;
an initial classification module, configured to classify the mobile cluster object data at each time point together with the abnormal data points in the abnormal data dynamic table to obtain an initial classification result, and to store the unclassified mobile cluster object data in the abnormal data dynamic table as abnormal data points;
a change labeling module, configured to analyze, starting from the first time point, the change between the initial classification result at each time point and the initial classification result at the preceding time point, and to label the initial classification result at each time point with its change status according to that change, to obtain the classification result.
Optionally, the device further includes:
a tree building module, configured to determine the relationships between the classes at each time point according to the classification result and build a mobile cluster pattern tree;
a mining module, configured to determine the frequent information of related mobile clusters according to the mobile cluster pattern tree.
Optionally, the table building module includes:
a table building unit, configured to establish the abnormal data dynamic table;
a parameter setting unit, configured to set the relevant processing parameters, where the processing parameters include a dynamic change time and an update time.
Optionally, the initial classification module further includes an update unit, where the update unit includes:
a time judgment subunit, configured to judge, according to the processing parameters, whether the existence time of the abnormal data points exceeds the update time;
an update subunit, configured to update the abnormal data points when their existence time exceeds the update time.
Existing cluster data analysis methods discard all unclassified data during classification. For data covering a period of time, however, the abnormal data left unclassified at the current moment can benefit the classification result at the next moment. Discarding it therefore leads to large errors in the analysis results, and the situation described does not meet expectations.
Accordingly, the cluster data analysis method provided by this application includes: selecting the mobile cluster object data corresponding to time points separated by a predetermined time interval within a predetermined time period; establishing an abnormal data dynamic table; classifying the mobile cluster object data at each time point together with the abnormal data points in the abnormal data dynamic table to obtain an initial classification result, and storing the unclassified mobile cluster object data in the abnormal data dynamic table as abnormal data points; and, starting from the first time point, analyzing the change between the initial classification result at each time point and the initial classification result at the preceding time point, and labeling the initial classification result at each time point with its change status according to that change, to obtain the classification result.
By establishing an abnormal data dynamic table that preserves unclassified data, abnormal data is stored and the loss of useful data is avoided; because the abnormal data is also included in classification, the data analysis becomes more accurate. This application also provides a cluster data analysis device with the same beneficial effects, which are not repeated here.
Brief description of the drawings
To describe the technical solutions of the embodiments of this application or of the prior art more clearly, the drawings needed for describing the embodiments or the prior art are briefly introduced below. The drawings described below are merely embodiments of this application; those of ordinary skill in the art can derive other drawings from them without creative effort.
Fig. 1 is a flowchart of the cluster data analysis method provided by an embodiment of this application;
Fig. 2 is a detailed flowchart of the data analysis provided by an embodiment of this application;
Fig. 3 is a partial flowchart of the classification process provided by an embodiment of this application;
Fig. 4 is a flowchart of the pattern analysis provided by an embodiment of this application;
Fig. 5 is a diagram of the formed pattern tree provided by an embodiment of this application;
Fig. 6 is a flowchart of establishing the dynamic table provided by an embodiment of this application;
Fig. 7 is a flowchart of updating the dynamic table provided by an embodiment of this application;
Fig. 8 is a block diagram of the cluster data analysis device provided by an embodiment of this application;
Fig. 9 is a block diagram of the modules that form the pattern tree provided by an embodiment of this application;
Fig. 10 is a block diagram of the table building module provided by an embodiment of this application.
Detailed description
The core of this application is to provide a cluster data analysis method that establishes an abnormal data dynamic table, stores abnormal data in it, and updates the stored data, thereby avoiding the large analysis errors caused by losing useful data and improving the accuracy of the analysis.
To make the purpose, technical solutions, and advantages of the embodiments of this application clearer, the technical solutions in the embodiments are described clearly and completely below with reference to the accompanying drawings. The described embodiments are some, not all, of the embodiments of this application. All other embodiments obtained by those of ordinary skill in the art from these embodiments without creative effort fall within the scope of protection of this application.
Refer to Fig. 1, which is a flowchart of the cluster data analysis method provided by an embodiment of this application.
This embodiment may include:
S100: select the mobile cluster object data corresponding to time points separated by a predetermined time interval within a predetermined time period;
S200: establish an abnormal data dynamic table;
Note that steps S100 and S200 do not depend on each other, so no execution order is imposed: step S200 may be performed before step S100, or the two steps may be performed at the same time. This is not limited here.
The predetermined time period in step S100 is the period to be analyzed, and it depends on the actual analysis. For example, to study the cluster data of vehicles on a section of highway from 5 pm to 7 pm, the selected period should contain that interval. The period need not be exactly that interval: because the study concerns changing data, the changes at the start and end of the interval must also be observed, so a suitable margin should be added before the start and after the end in order to analyze the data in the interval comprehensively.
The predetermined time interval is the spacing between consecutive sample points within the period, and it too is determined by the actual analysis. Another important sampling parameter is the number of sample points in the period. A large amount of data has to be analyzed and each additional point increases that amount, so a suitable number of sample points is needed to obtain accurate results. For example, when studying the cluster data of vehicles on a section of highway from 5 pm to 7 pm, common sense says that traffic is heavy and slow, so the number of sample points can be reduced. When studying the same section from 5 am to 7 am, traffic is light and fast and vehicles on the highway change quickly, so the number of sample points can be increased.
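As an illustration of how the period, the margin, and the sampling interval interact, the following minimal sketch generates the sample time points. The function name `sample_time_points`, the `margin` parameter, and the concrete date are assumptions made for this example and do not come from the patent.

```python
from datetime import datetime, timedelta


def sample_time_points(start, period, interval, margin=timedelta(0)):
    """Return evenly spaced time points covering the period plus a margin on both ends."""
    if interval <= timedelta(0):
        raise ValueError("interval must be positive")
    current = start - margin           # begin slightly before the period of interest
    last = start + period + margin     # end slightly after it
    points = []
    while current <= last:
        points.append(current)
        current += interval
    return points


# Hypothetical evening study: 17:00 to 19:00, sampled every 30 minutes,
# with a 30-minute margin before and after.
evening = sample_time_points(
    start=datetime(2017, 7, 5, 17, 0),
    period=timedelta(hours=2),
    interval=timedelta(minutes=30),
    margin=timedelta(minutes=30),
)
```

A shorter `interval` yields more sample points and more data to classify, which is the trade-off described above.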
Once the time points are determined, the mobile cluster object data corresponding to each time point is selected. Mobile cluster object data is expressed as the mobile data information O of a moving object at a given time point:
O = (oid, p(x, y), t)
where oid is the data type identifier, p(x, y) is the longitude and latitude of the moving object at time point t, x is the longitude, y is the latitude, and t is the time.
Define Ω(t) with O ∈ Ω(t): Ω(t) is a set of mobile object data, referred to as the mobile object location consistent set.
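The record O and the set Ω(t) could be represented as follows. This is only a sketch: the class name `MovingObject`, the helper `build_snapshots`, the use of an integer index for t, and the sample coordinates are all assumptions made for illustration.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class MovingObject:
    oid: str    # identifier, as in O = (oid, p(x, y), t)
    x: float    # longitude
    y: float    # latitude
    t: int      # time point (an integer index here, which is an assumption)


def build_snapshots(records):
    """Group records by time point; snapshots[t] plays the role of Omega(t)."""
    snapshots = defaultdict(list)
    for rec in records:
        snapshots[rec.t].append(rec)
    return dict(snapshots)


# Hypothetical sample data.
records = [
    MovingObject("car-1", 113.39, 23.04, 0),
    MovingObject("car-2", 113.40, 23.05, 0),
    MovingObject("car-1", 113.41, 23.06, 1),
]
snapshots = build_snapshots(records)
```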
The abnormal data dynamic table established in step S200 is a data table set up for the data analysis that supports operations such as storing, modifying, and deleting data. In this embodiment the dynamic table is named F-list.
S300: classify the mobile cluster object data at each time point together with the abnormal data points in the abnormal data dynamic table to obtain an initial classification result, and store the unclassified mobile cluster object data in the abnormal data dynamic table as abnormal data points;
Note that the mobile cluster object data can be classified with an existing method such as DBSCAN, KNN, or K-means. The method can be chosen according to the performance requirements of the analysis and the required accuracy of the results; it is not limited in this embodiment.
Unclassified data appears during classification and must be saved to the abnormal data dynamic table as abnormal data. Likewise, the objects of classification are all of the data, that is, both the data of the time point being classified and the data in the abnormal data dynamic table.
Therefore, by establishing an abnormal data dynamic table that preserves unclassified data, this application stores abnormal data and avoids the loss of useful data; because the abnormal data is also included in classification, the data analysis becomes more accurate.
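The effect of carrying unclassified points forward can be sketched as below. The clustering routine is a deliberately simplified density-style grouping (connected groups within a distance `eps`, discarded as noise when smaller than `min_size`); it is a stand-in chosen to keep the example self-contained, not DBSCAN itself and not the patent's implementation. All names and parameter values are assumptions.

```python
import math


def cluster_points(points, eps, min_size):
    """Group points linked by chains of neighbours within eps; small groups are noise.

    points maps oid -> (x, y). Returns (clusters, noise): a list of oid sets
    and the set of oids left unclassified.
    """
    unvisited = set(points)
    clusters, noise = [], set()
    while unvisited:
        seed = unvisited.pop()
        group, frontier = {seed}, [seed]
        while frontier:
            current = frontier.pop()
            near = {o for o in unvisited
                    if math.dist(points[current], points[o]) <= eps}
            unvisited -= near
            group |= near
            frontier.extend(near)
        if len(group) >= min_size:
            clusters.append(group)
        else:
            noise |= group
    return clusters, noise


def classify_time_point(snapshot, f_list, eps=1.0, min_size=3):
    """Classify the current snapshot together with the points kept in f_list."""
    candidates = {**f_list, **snapshot}    # current positions win on duplicate oids
    clusters, noise = cluster_points(candidates, eps, min_size)
    new_f_list = {oid: candidates[oid] for oid in noise}
    return clusters, new_f_list


# At the first time point nothing is dense enough, so every point goes to the table.
clusters0, f0 = classify_time_point(
    {"a": (0.0, 0.0), "b": (0.5, 0.0), "c": (5.0, 5.0)}, {})
# At the next time point, "d" joins the stored "a" and "b" to form a cluster.
clusters1, f1 = classify_time_point({"d": (1.0, 0.0)}, f0)
```

Had "a" and "b" been discarded at the first time point, "d" would have been left unclassified as well, which is the loss of useful data that the dynamic table is meant to avoid.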
S400: starting from the first time point, analyze the change between the initial classification result at each time point and the initial classification result at the preceding time point, and label the initial classification result at each time point with its change status according to that change, to obtain the classification result.
The initial classification result obtained above is the classification result at each individual time point. Because the aim is to study the evolution patterns of cluster data objects, the classification results at the individual time points must be linked to obtain the relationships between them. The initial classification result at each time point is therefore compared with that at the preceding time point; associating the two yields the change category, which is then labeled.
In this embodiment, Jaccard similarity is used to judge the change between two adjacent time points, which is then assigned to the corresponding change category and labeled. Jaccard similarity involves a confidence threshold: in the initial clustering results at adjacent time points, the change is judged from the proportion of data that the later time point shares with the earlier one. The similarity threshold must be determined empirically and is not limited here.
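For reference, the Jaccard similarity of two clusters, each represented as a set of object identifiers, is the size of their intersection divided by the size of their union. A minimal sketch follows; returning 0.0 for two empty sets is a convention chosen here, not something the patent specifies.

```python
def jaccard(a, b):
    """Jaccard similarity |a ∩ b| / |a ∪ b| of two collections of object ids."""
    a, b = set(a), set(b)
    union = a | b
    if not union:
        return 0.0    # convention for two empty clusters (an assumption)
    return len(a & b) / len(union)


# Two of three members are shared, and the union has four members.
similarity = jaccard({"car-1", "car-2", "car-3"}, {"car-2", "car-3", "car-4"})
```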
The change categories generally depend on the data being analyzed. The data usually corresponds to a concrete real-world problem, and that problem largely determines how the data can change and hence the change categories. For a simple problem, for instance, the data typically merges, splits, disappears, and appears, and the change categories can be limited to these. This is not limited here.
In this embodiment the practical problem chosen is the analysis of road traffic conditions, so the following seven change categories are used: survives (retain), merged (merge), splits (split), expands (expand), shrinks (shrink), disappears (disappear), and appears (appear).
Refer to Fig. 2, which is a detailed flowchart of the data analysis provided by an embodiment of this application.
Here the predetermined time period is denoted T, the predetermined time interval is denoted Δt, and the initial time point is denoted t.
Refer to Fig. 3, which is a partial flowchart of the classification process provided by an embodiment of this application.
The flow of this part of the classification process is as follows. Space does not permit showing the complete flowchart of the classification process; the flowchart in this part serves as an example, and the complete flowchart can be obtained by simply extending it. It is therefore not described in full here.
The number of time points in the period is set to 6 and the time interval is Δt, so starting from t the time points are t, t+Δt, t+2Δt, t+3Δt, t+4Δt, and t+5Δt. Classification analysis is carried out at these 6 time points.
At t, classification yields four classes C1, C2, C3, and C4, each of which is labeled appears (appear). The points that cannot be classified at this time are stored in the abnormal data dynamic table F-list.
At t+Δt, classification shows that C1 and C2 from the previous time point have merged into one class C1', which is labeled merged (merge); the cluster C3' is larger than C3 and is labeled expands (expand); C4 is unchanged and is labeled survives (retain). The points that cannot be classified at this time are again stored in the abnormal data dynamic table F-list.
At t+2Δt, C3' and C4 have merged into a larger class C3'', which is labeled merged (merge). Meanwhile C1' has merged with some data from the abnormal data dynamic table to form C1''; this is labeled expands (expand) rather than merged (merge). The points that cannot be classified at this time are again stored in the abnormal data dynamic table F-list.
At t+3Δt, because the table was filled at the previous time point t+2Δt, it is updated, and the points that cannot be classified at this time are then stored in the abnormal data dynamic table F-list. C1''' and C5 were formed by the splitting of C1'' from the previous time point, so both are labeled splits (split); C3''' is a contraction of C3'' from the previous time point and is labeled shrinks (shrink).
At t+4Δt, C1''' is unchanged and is labeled survives (retain); C3'''' is a contraction of C3''' from the previous time point and is labeled shrinks (shrink); C5 has disappeared entirely and is labeled disappears (disappear). The points that cannot be classified at this time are again stored in the abnormal data dynamic table F-list.
At t+5Δt, neither C1''' nor C3'''' has changed relative to the previous moment, so both are labeled survives (retain).
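The labeling walked through above can be approximated in code. The patent does not spell out decision rules, so the rules below are assumptions made for illustration: two clusters are treated as related when the overlap coefficient |p ∩ c| / min(|p|, |c|) reaches a threshold (a swapped-in measure that handles merges and splits more simply than a plain Jaccard threshold), and expansion versus contraction is decided by cluster size alone. The function name `label_changes` is hypothetical.

```python
def label_changes(previous, current, threshold=0.5):
    """Label each current cluster against the clusters of the previous time point.

    previous and current are lists of non-empty sets of object ids.
    Returns (labels, disappeared): labels[j] is the label of current[j], and
    disappeared lists the indices of previous clusters with no successor.
    """
    def linked(p, c):
        # Overlap coefficient; an assumed stand-in for the similarity test.
        return len(p & c) / min(len(p), len(c)) >= threshold

    parents = [[i for i, p in enumerate(previous) if linked(p, c)] for c in current]
    children = [[j for j, c in enumerate(current) if linked(p, c)] for p in previous]

    labels = []
    for j, c in enumerate(current):
        if not parents[j]:
            labels.append("appears")
        elif len(parents[j]) > 1:
            labels.append("merged")
        elif len(children[parents[j][0]]) > 1:
            labels.append("splits")
        else:
            p = previous[parents[j][0]]
            if len(c) > len(p):
                labels.append("expands")
            elif len(c) < len(p):
                labels.append("shrinks")
            else:
                labels.append("survives")
    disappeared = [i for i, kids in enumerate(children) if not kids]
    return labels, disappeared


# Transition from t to t+Δt in the walk-through: C1 and C2 merge, C3 grows, C4 is unchanged.
previous = [{1, 2, 3}, {4, 5, 6}, {7, 8, 9}, {10, 11}]
current = [{1, 2, 3, 4, 5, 6}, {7, 8, 9, 12}, {10, 11}]
labels, disappeared = label_changes(previous, current)
```

Passing an empty `previous` list labels every cluster appears, which matches the treatment of the first time point.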
Refer to Fig. 4 and Fig. 5. Fig. 4 is a flowchart of the pattern analysis provided by an embodiment of this application, and Fig. 5 is a diagram of the formed pattern tree provided by an embodiment of this application.
Based on the above embodiment, this embodiment may further include:
S500: determine the relationships between the classes at each time point according to the classification result, and build a mobile cluster pattern tree;
S600: determine the frequent information of related mobile clusters according to the mobile cluster pattern tree.
The mobile cluster pattern tree is built from the change categories labeled at each time point. Starting from the first empty root node, the classes of C1 at successive time points are inserted in order to build the first branch, and each change is marked. A second empty node is then inserted and the second branch is built from it; from the classification results and changes it is known that C2 merges into C1 at the second time point, so this change and the process are marked in the tree. The remaining branches are built in the same way to form the complete pattern tree.
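One possible in-memory shape for such a tree is sketched below. It simplifies the description by hanging every branch off a single empty root rather than inserting a separate empty node per branch; the class `PatternNode` and the helpers `add_branch` and `branch_labels` are hypothetical names, and the two branches are taken from the six-point walk-through.

```python
from dataclasses import dataclass, field


@dataclass
class PatternNode:
    name: str                 # cluster name at a time point, e.g. "C1'"
    label: str                # change label, e.g. "merged"
    children: list = field(default_factory=list)


def add_branch(root, history):
    """Append one branch; history is a list of (name, label) pairs in time order."""
    node = root
    for name, label in history:
        child = PatternNode(name, label)
        node.children.append(child)
        node = child
    return root


def branch_labels(node):
    """Return the sequence of labels along every branch below node."""
    if not node.children:
        return [[]]
    paths = []
    for child in node.children:
        for tail in branch_labels(child):
            paths.append([child.label] + tail)
    return paths


root = PatternNode("root", "")
# Branch for C1 across the six time points.
add_branch(root, [("C1", "appears"), ("C1'", "merged"), ("C1''", "expands"),
                  ("C1'''", "splits"), ("C1'''", "survives"), ("C1'''", "survives")])
# Branch for C2, which merges into C1' at the second time point.
add_branch(root, [("C2", "appears"), ("C1'", "merged")])
paths = branch_labels(root)
```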
Then, taking the practical situation into account, a suitable information mining method is selected to determine the frequent information of related mobile clusters, that is, the associated movement patterns that occur frequently.
For example, for an actual road section, the period from 5 pm to 7 pm on a viaduct is selected. Analysis of the pattern tree shows that merged (merge) and expands (expand) occur frequently, which characterizes the vehicle conditions in that period and is of great significance for traffic regulation.
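The patent leaves the mining method open. As a minimal sketch under that caveat, frequent change labels can be found by simple counting against a minimum count; the function `frequent_labels`, the threshold, and the use of plain counting are assumptions, and the input below reuses the labels from the six-point walk-through rather than the viaduct example.

```python
from collections import Counter


def frequent_labels(labels_by_time, min_count):
    """Return the change labels that occur at least min_count times overall."""
    counts = Counter(label for labels in labels_by_time for label in labels)
    return {label: n for label, n in counts.items() if n >= min_count}


# Labels assigned at t, t+Δt, ..., t+5Δt in the walk-through.
labels_by_time = [
    ["appears", "appears", "appears", "appears"],
    ["merged", "expands", "survives"],
    ["merged", "expands"],
    ["splits", "splits", "shrinks"],
    ["survives", "shrinks", "disappears"],
    ["survives", "survives"],
]
frequent = frequent_labels(labels_by_time, min_count=3)
```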
Refer to Fig. 6, which is a flowchart of establishing the dynamic table provided by an embodiment of this application.
Based on the above embodiment, establishing the abnormal data dynamic table in this embodiment may include:
S210: establish the abnormal data dynamic table;
S220: set the relevant processing parameters, where the processing parameters include a dynamic change time and an update time.
Note that after the abnormal data dynamic table is established, the relevant processing parameters are set. The abnormal data dynamic table is expressed as follows:
F-list(τ, θ)
where τ = T/n (n = 1, 2, 3, …) denotes the period for which the selected abnormal data points should be kept, and θ = τ/n (n = 1, 2, 3, …) denotes the sub-period of existence after which the selected abnormal data points should be updated.
These parameters can be set according to the data and the actual situation. Their values affect the amount of data scanned in subsequent classification and the accuracy of the results. If the values are too large, too much data exists at once, which increases the classification load and slows processing; if they are too small, useful data is removed too early and the subsequent analysis has larger errors. They depend on the specific case and are not specifically limited here.
In this embodiment τ is set to 3, that is, the dynamic table is updated once it has been filled with the data of 3 time points, and θ is set to 2, that is, the data stored at the first two time points is deleted when the table is updated.
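A minimal sketch of F-list(τ, θ) under the settings of this embodiment follows. It interprets τ as the number of time points the table may hold and θ as the number of oldest time points dropped on update, which is how the embodiment uses them; the class name `FList` and its methods are hypothetical.

```python
class FList:
    """Abnormal data dynamic table holding unclassified points per time point."""

    def __init__(self, tau=3, theta=2):
        if not 0 < theta <= tau:
            raise ValueError("expected 0 < theta <= tau")
        self.tau = tau        # number of time points the table may hold
        self.theta = theta    # number of oldest time points dropped on update
        self.batches = []     # (time_index, {oid: (x, y)}) pairs, oldest first

    def is_full(self):
        return len(self.batches) >= self.tau

    def update(self):
        """Drop the theta oldest batches."""
        del self.batches[:self.theta]

    def store(self, time_index, outliers):
        """Update first if the table is full, then store this time point's outliers."""
        if self.is_full():
            self.update()
        self.batches.append((time_index, dict(outliers)))

    def points(self):
        """All stored points; newer entries override older ones with the same oid."""
        merged = {}
        for _, batch in self.batches:
            merged.update(batch)
        return merged


# Store outliers for time points 0..3; the table fills at index 2 and is updated at index 3.
f_list = FList(tau=3, theta=2)
for i in range(4):
    f_list.store(i, {f"o{i}": (float(i), 0.0)})
kept_times = [t for t, _ in f_list.batches]
```

With τ = 3 and θ = 2, the table holds the data of time points 2 and 3 after the fourth store, mirroring the update at t+3Δt in the walk-through.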
Refer to Fig. 7, which is a flowchart of updating the dynamic table provided by an embodiment of this application.
Based on the above embodiment, this embodiment may further include:
S321: judge, according to the processing parameters, whether the existence time of the abnormal data points exceeds the update time;
S322: if so, update the abnormal data points.
Corresponding to the above embodiment, a judgment is made during processing: when the abnormal data points are judged to have exceeded the update time, that is, the τ value, the data stored at the first two time points is updated.
Data is updated in order to avoid storing too much redundant data in the abnormal data dynamic table, which would make the amount of data scanned during classification too large and increase the machine load. An update time is therefore specified, and the update operation is performed on data that has timed out. The update may delete all of the timed-out data, or delete part of it after comparison, or move the timed-out data to another table for later use instead of deleting it.
In this embodiment the timed-out data is deleted, to reduce the amount of data scanned each time and lighten the machine load.
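The judgment in S321 and S322, together with the alternative of moving timed-out data to another table, could be sketched as follows. The function `refresh`, the per-entry timestamp layout, and the `archive` parameter are assumptions made for this example.

```python
def refresh(table, now, update_time, archive=None):
    """Remove entries whose existence time exceeds update_time.

    table maps oid -> (stored_at, (x, y)). When archive is given, timed-out
    entries are moved there instead of being discarded. Returns the expired oids.
    """
    expired = [oid for oid, (stored_at, _) in table.items()
               if now - stored_at > update_time]
    for oid in expired:
        entry = table.pop(oid)
        if archive is not None:
            archive[oid] = entry
    return expired


# Hypothetical table: "a" was stored at time point 0 and "b" at time point 2.
table = {"a": (0, (0.0, 0.0)), "b": (2, (1.0, 1.0))}
archive = {}
expired = refresh(table, now=3, update_time=2, archive=archive)
```

Calling `refresh` without an `archive` corresponds to the deletion chosen in this embodiment.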
The embodiment of the present application provides a kind of company-data analysis method, by establishing abnormal data dynamic table, is stored in The abnormal data occurred in assorting process, the situation for losing useful data is avoided, improves the degree of accuracy of analysis method.
The company-data analytical equipment provided below the embodiment of the present application is introduced, company-data described below point Analysis apparatus can be mutually to should refer to above-described company-data analysis method.
Fig. 8 is refer to, Fig. 8 is the block diagram for the company-data analytical equipment that the embodiment of the present application provides.
The present embodiment provides a kind of company-data analytical equipment, can include:
A data selection module 100, configured to select the mobile cluster object data corresponding to time points separated by a predetermined time interval within a predetermined time period;
A table-building module 200, configured to establish an abnormal data dynamic table;
A preliminary classification module 300, configured to classify the mobile cluster object data at each time point together with the abnormal data points in the abnormal data dynamic table to obtain a preliminary classification result, and to store the unclassified mobile cluster object data in the abnormal data dynamic table as abnormal data points;
A change marking module 400, configured to analyze, starting from the first time point, the change between the preliminary classification result at each time point and the preliminary classification result at the immediately preceding time point, and to mark the preliminary classification result at each time point with its change situation according to that change, obtaining the classification results.
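As a rough illustration of what the change marking module does, the sketch below compares the classes of two adjacent time points and assigns one change mark to each class. The decision rules are assumptions made for illustration only (classes are treated as sets of object identifiers and related by shared members); the application does not specify them here, and the function name `mark_changes` is hypothetical.

```python
def mark_changes(prev, curr):
    """Mark each class in curr relative to prev.

    prev, curr: dict mapping a class name to a set of object identifiers.
    Returns (marks for the classes in curr, names of classes in prev that vanished).
    """
    marks = {}
    for name, members in curr.items():
        # Earlier classes that share at least one member with this class.
        sources = [p for p, old in prev.items() if old & members]
        if not sources:
            marks[name] = "appear"
        elif len(sources) > 1:
            marks[name] = "merge"
        else:
            old = prev[sources[0]]
            # Current classes that share members with the single source class.
            targets = [c for c, new in curr.items() if new & old]
            if len(targets) > 1:
                marks[name] = "separate"
            elif len(members) > len(old):
                marks[name] = "expand"
            elif len(members) < len(old):
                marks[name] = "shrink"
            else:
                marks[name] = "retain"
    disappeared = [p for p, old in prev.items()
                   if not any(old & new for new in curr.values())]
    return marks, disappeared


prev = {"A": {1, 2, 3}, "B": {4, 5}, "C": {6, 7}, "D": {8}}
curr = {"A": {1, 2, 3, 9}, "BC": {4, 5, 6, 7}, "E": {10, 11}}
marks, disappeared = mark_changes(prev, curr)
# marks == {"A": "expand", "BC": "merge", "E": "appear"}; disappeared == ["D"]
```

Comparing set sizes is the simplest way to tell expansion from shrinkage; a real implementation would more likely compare spatial extent or use a membership threshold.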
Refer to Fig. 9, which is a block diagram of forming the pattern tree provided by an embodiment of the present application.
Based on the above embodiment, this embodiment may further include:
A tree-building module 500, configured to determine the relationships between the classes at each time point according to the classification results and to build a mobile cluster pattern tree;
A mining module 600, configured to determine frequent information about related mobile clusters according to the mobile cluster pattern tree.
Refer to Fig. 10, which is a block diagram of the table-building module provided by an embodiment of the present application.
Based on the above embodiment, the table-building module 200 may include:
A table-building unit 210, configured to establish the abnormal data dynamic table;
A parameter setting unit 220, configured to set the relevant processing parameters, where the processing parameters include a dynamic change time and an update time.
Based on the above embodiment, this embodiment may further include an updating unit, where the updating unit may include:
A time judgment subunit, configured to judge, according to the processing parameters, whether the existence time of the abnormal data point exceeds the update time;
An updating subunit, configured to update the abnormal data point when its existence time exceeds the update time.
The embodiments in this specification are described progressively. Each embodiment focuses on its differences from the other embodiments, and the parts that are the same or similar across embodiments may be referred to one another. Because the device disclosed in the embodiments corresponds to the method disclosed in the embodiments, its description is relatively brief; for related details, refer to the description of the method.
Those skilled in the art will further appreciate that the units and algorithm steps of the examples described in connection with the embodiments disclosed herein can be implemented in electronic hardware, computer software, or a combination of the two. To illustrate the interchangeability of hardware and software clearly, the composition and steps of each example have been described above in general terms according to function. Whether these functions are performed in hardware or software depends on the specific application and design constraints of the technical solution. Skilled persons may use different methods to implement the described functions for each specific application, but such implementations should not be considered to exceed the scope of the present application.
The steps of the method or algorithm described in connection with the embodiments disclosed herein may be implemented directly in hardware, in a software module executed by a processor, or in a combination of the two. The software module may reside in random access memory (RAM), internal memory, read-only memory (ROM), electrically programmable ROM, electrically erasable programmable ROM, registers, a hard disk, a removable disk, a CD-ROM, or any other form of storage medium known in the technical field.
The cluster data analysis method and device provided by the present application have been described in detail above. Specific examples are used herein to explain the principle and implementation of the application, and the description of the above embodiments is intended only to help in understanding the method of the application and its core idea. It should be noted that those of ordinary skill in the art may make improvements and modifications to the application without departing from its principle, and such improvements and modifications also fall within the scope of protection of the claims of the application.

Claims (9)

1. A cluster data analysis method, characterized in that the method comprises:
selecting the mobile cluster object data corresponding to time points separated by a predetermined time interval within a predetermined time period;
establishing an abnormal data dynamic table;
classifying the mobile cluster object data at each time point together with the abnormal data points in the abnormal data dynamic table to obtain a preliminary classification result, and storing the unclassified mobile cluster object data in the abnormal data dynamic table as abnormal data points;
starting from the first time point, analyzing the change between the preliminary classification result at each time point and the preliminary classification result at the immediately preceding time point, and marking the preliminary classification result at each time point with its change situation according to that change, to obtain classification results.
2. The method according to claim 1, characterized by further comprising:
determining the relationships between the classes at each time point according to the classification results, and building a mobile cluster pattern tree;
determining frequent information about related mobile clusters according to the mobile cluster pattern tree.
3. The method according to claim 2, characterized in that the marks of the change situation specifically include:
retain, merge, separate, expand, shrink, disappear, and appear.
4. The method according to claim 3, characterized in that establishing the abnormal data dynamic table comprises:
establishing the abnormal data dynamic table;
setting the relevant processing parameters, wherein the processing parameters include a dynamic change time and an update time.
5. The method according to claim 4, characterized in that storing the mobile cluster object data that was not classified during classification in the abnormal data dynamic table as abnormal data points further comprises:
judging, according to the processing parameters, whether the existence time of the abnormal data point exceeds the update time;
if so, updating the abnormal data point.
6. A cluster data analysis device, characterized in that the device comprises:
a data selection module, configured to select the mobile cluster object data corresponding to time points separated by a predetermined time interval within a predetermined time period;
a table-building module, configured to establish an abnormal data dynamic table;
a preliminary classification module, configured to classify the mobile cluster object data at each time point together with the abnormal data points in the abnormal data dynamic table to obtain a preliminary classification result, and to store the unclassified mobile cluster object data in the abnormal data dynamic table as abnormal data points;
a change marking module, configured to analyze, starting from the first time point, the change between the preliminary classification result at each time point and the preliminary classification result at the immediately preceding time point, and to mark the preliminary classification result at each time point with its change situation according to that change, to obtain classification results.
7. The device according to claim 6, characterized by further comprising:
a tree-building module, configured to determine the relationships between the classes at each time point according to the classification results and to build a mobile cluster pattern tree;
a mining module, configured to determine frequent information about related mobile clusters according to the mobile cluster pattern tree.
8. The device according to claim 7, characterized in that the table-building module comprises:
a table-building unit, configured to establish the abnormal data dynamic table;
a parameter setting unit, configured to set the relevant processing parameters, wherein the processing parameters include a dynamic change time and an update time.
9. The device according to claim 8, characterized in that the preliminary classification module further comprises an updating unit, wherein the updating unit comprises:
a time judgment subunit, configured to judge, according to the processing parameters, whether the existence time of the abnormal data point exceeds the update time;
an updating subunit, configured to update the abnormal data point when its existence time exceeds the update time.
CN201710541642.6A 2017-07-05 2017-07-05 Cluster data analysis method and device Active CN107341239B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201710541642.6A CN107341239B (en) 2017-07-05 2017-07-05 Cluster data analysis method and device

Publications (2)

Publication Number Publication Date
CN107341239A (en) 2017-11-10
CN107341239B (en) 2020-08-07

Family

ID=60217957

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201710541642.6A Active CN107341239B (en) 2017-07-05 2017-07-05 Cluster data analysis method and device

Country Status (1)

Country Link
CN (1) CN107341239B (en)

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101908065A (en) * 2010-07-27 2010-12-08 浙江大学 On-line attribute abnormal point detecting method for supporting dynamic update
US20120226475A1 (en) * 2011-03-03 2012-09-06 Hitachi Kokusai Electric Inc. Substrate processing system, management apparatus, data analysis method
CN104487991A (en) * 2011-12-30 2015-04-01 施耐德电气(美国)公司 Energy management with correspondence based data auditing signoff
CN106101102A (en) * 2016-06-15 2016-11-09 华东师范大学 A kind of exception flow of network detection method based on PAM clustering algorithm
CN106203519A (en) * 2016-07-17 2016-12-07 合肥赑歌数据科技有限公司 Fault pre-alarming algorithm based on taxonomic clustering
CN106657065A (en) * 2016-12-23 2017-05-10 陕西理工学院 Network abnormality detection method based on data mining

Non-Patent Citations (4)

* Cited by examiner, † Cited by third party
Title
MANZOOR: "Efficient Clustering_based Outlier Detection Algorithm for Dynamic Data Stream", 《2008 FIFTH INTERNATIONAL CONFERENCE ON FUZZY SYSTEMS AND KNOWLEDGE DISCOVERY》 *
TICIANA L: "Discovering Frequent Mobility Patterns on Moving Object Data", 《2014MOBIGIS》 *
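Meng Jing: "Research and Application of Outlier Data Mining Algorithms", 《China Master's Theses Full-text Database, Information Science and Technology》 *
Wang Chuanyu: "Research Based on Outlier Data Mining Algorithms", 《China Master's Theses Full-text Database, Information Science and Technology》 *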

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109002261A (en) * 2018-07-11 2018-12-14 佛山市云端容灾信息技术有限公司 Difference block big data analysis method, apparatus, storage medium and server
CN109002261B (en) * 2018-07-11 2022-03-22 佛山市云端容灾信息技术有限公司 Method and device for analyzing big data of difference block, storage medium and server

Also Published As

Publication number Publication date
CN107341239B (en) 2020-08-07


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant