CN110096519A - Optimization method and device for big data classification rules - Google Patents


Info

Publication number
CN110096519A
Authority
CN
China
Prior art keywords
data
classification
rule
scene
keyword
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201910280279.6A
Other languages
Chinese (zh)
Inventor
黄浩
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Zhongkezhi Ying Technology Development Co Ltd
Original Assignee
Beijing Zhongkezhi Ying Technology Development Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Zhongkezhi Ying Technology Development Co Ltd
Priority to CN201910280279.6A
Publication of CN110096519A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 Querying
    • G06F16/245 Query processing
    • G06F16/2455 Query execution
    • G06F16/24564 Applying rules; Deductive queries
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/28 Databases characterised by their database models, e.g. relational or object models
    • G06F16/284 Relational databases
    • G06F16/285 Clustering or classification

Abstract

The present invention provides an optimization method and device for big data classification rules, addressing the technical problem that existing rule customization processes lack effective reuse and improvement. The method includes: establishing a data structure for storing rules; forming a rule set for a determined theme through the data structure; and performing scene classification on source data according to the rule set. The data structure is used to form a complete topology of theme classification and scene classification, and the parameters of the different classification processes are organized in an ordered structure, so that the complex parameter invocation process of a big data classification model forms a structured classification system. As a result, the classification iteration parameter data, key enumeration data, and threshold adjustment intervals of the classification process can be managed effectively and configured reasonably. The big data classification model thus gains a technical means of making moderate adjustments in response to changes in source data features. This avoids frequently adjusting the training set and the learning process as the objective scenes in which the source data is formed evolve, and reduces the time cost and labor cost of big data classification.

Description

Optimization method and device for big data classification rules
Technical field
The present invention relates to the field of data classification technology, and in particular to an optimization method and device for big data classification rules.
Background
In the prior art, big data classification uses data classification methods to obtain similar features among data, divides the data into different categories according to those features, and performs further feature processing by category. Among data classification methods, self-learning classification methods are limited by the training set: the resulting classification model is tightly bound to a particular big data industry field and is not general. Because of the labor cost of preparing training set data and the time cost of training, such models can only handle fixed cluster categories efficiently in flexible data classification scenes, and cannot adjust in time when the data classification types change as the source data evolves. Consequently, the classification rules of the model lack systematicness and continuity, the rule customization process offers no possibility of effective reuse and improvement, and no general means of customizing classification rules can be formed in response to changes in data classification theme scenes. This directly leaves the corresponding data processing model in the memory environment, the data composing the model, and the model addressing architecture without effective addressing and operation capabilities, so that data and model addressing cannot be reused effectively, which greatly increases the costs of model reconstruction.
Summary of the invention
In view of the above problems, embodiments of the present invention provide an optimization method and device for big data classification rules, addressing the technical problem that existing rule customization processes lack effective reuse and improvement.
The optimization method for big data classification rules of an embodiment of the present invention includes:
Establishing a data structure for storing rules;
Forming a rule set for a determined theme through the data structure;
Performing scene classification on source data according to the rule set.
In one embodiment of the invention, the method further includes:
Parsing the rule set and updating the stored rules.
In one embodiment of the invention, the method further includes:
Parsing the rule set, obtaining part of the data structure and rules, and forwarding them.
In one embodiment of the invention, establishing the data structure for storing rules includes:
Setting the structure fields of a single rule;
Setting the field keywords and compound field keywords of the single rule;
Forming the data structure of the rule according to the field topology of the single rule.
In one embodiment of the invention, forming the rule set for a determined theme through the data structure includes:
Establishing a rule set keyword;
Establishing the rule set subsets of the rule set and the corresponding subset keywords;
Establishing the scene sets of the rule set subsets and the corresponding scene keywords;
Establishing a keyword topology according to the rule set keyword, the subset keywords, or the scene keywords;
Adding the data structures of rules to the keyword topology, and adding rule parameters or associated parameters to the data structures of the rules.
In one embodiment of the invention, performing scene classification on source data according to the rule set includes:
Determining the corresponding preliminary source data according to a preliminary classification;
Determining, within the rule set, the rule set corresponding to the preliminary classification;
Extracting the scene classification categories and scene classification parameter data from the rule set;
Classifying the preliminary source data using the scene classification parameter data, forming the classified source data corresponding to each scene classification category under the determined preliminary classification;
Forming the result classification data of the preliminary classification from the classified source data.
In one embodiment of the invention, parsing the rule set and updating the stored rules includes:
Obtaining the update requirement for a determined classification and a determined scene;
Determining the rule data structure and the rule content according to the update requirement, and displaying them through an interactive frame;
Writing the updated content into the corresponding rule data structure and rule content through the interactive frame.
In one embodiment of the invention, parsing the rule set, obtaining part of the data structure and rules, and forwarding them includes:
Obtaining, according to the forwarding requirement, the associated topology of the data structure for a determined classification and a determined scene;
Forming a temporary rule data structure and rule content according to the associated topology;
Forming the temporary rule data structure and rule content into an independent data object, and providing a data link.
The big data classification rule optimization device of an embodiment of the present invention includes:
A memory, for storing program code corresponding to the processing of the big data classification rule optimization method of any one of claims 1 to 8;
A processor, for executing the program code.
The big data classification rule optimization device of an embodiment of the present invention includes:
A rule setting module, for establishing a data structure for storing rules;
A rule forming module, for forming a rule set for a determined theme through the data structure;
A rule application module, for performing scene classification on source data according to the rule set.
The big data classification rule optimization method and device of the embodiments of the present invention use the data structure to form a complete topology of theme classification and scene classification, and organize the parameters of the different classification processes in an ordered structure, so that the complex parameter invocation process of a big data classification model forms a structured classification system. As a result, the classification iteration parameter data, key enumeration data, and threshold adjustment intervals of the classification process can be managed effectively and configured reasonably. The big data classification model thus gains a technical means of making moderate adjustments in response to changes in source data features. This avoids frequently adjusting the training set and the learning process as the objective scenes in which the source data is formed evolve, reduces the time cost and labor cost of big data classification, and improves the application generality and tolerance of the big data classification model.
Brief description of the drawings
Fig. 1 is a flow diagram of the big data classification rule optimization method of an embodiment of the present invention.
Fig. 2 is a flow diagram of establishing the data structure in the big data classification rule optimization method of an embodiment of the present invention.
Fig. 3 is a flow diagram of establishing the rule set in the big data classification rule optimization method of an embodiment of the present invention.
Fig. 4 is a flow diagram of performing scene classification in the big data classification rule optimization method of an embodiment of the present invention.
Fig. 5 is a flow diagram of performing data processing on rules in the big data classification rule optimization method of an embodiment of the present invention.
Fig. 6 is an architecture diagram of the big data classification rule optimization device of an embodiment of the present invention.
Fig. 7 is a flow diagram of the classification optimization method for data display of an embodiment of the present invention.
Fig. 8 is a flow diagram of forming the classification data structure, establishing classification categories, and associating related data in the classification optimization method for data display of an embodiment of the present invention.
Fig. 9 is an architecture diagram of the classification optimization device for data display of an embodiment of the present invention.
Fig. 10 is a main flow diagram of the data classification optimization method of an embodiment of the present invention.
Fig. 11 is a detailed flow diagram of the data classification optimization method of an embodiment of the present invention.
Fig. 12 is an architecture diagram of the data classification optimization device of an embodiment of the present invention.
Fig. 13 is a main flow diagram of the data processing method for a classification interactive interface of an embodiment of the present invention.
Fig. 14 is a detailed flow diagram of the data processing method for a classification interactive interface of an embodiment of the present invention.
Fig. 15 is an architecture diagram of the data processing device for a classification interactive interface of an embodiment of the present invention.
Detailed description
To make the objectives, technical solutions, and advantages of the present invention clearer, the invention is further described below with reference to the drawings and specific embodiments. The described embodiments are only some of the embodiments of the present invention, not all of them. All other embodiments obtained by those of ordinary skill in the art from the embodiments of the present invention without creative effort fall within the protection scope of the present invention. The step numbers do not imply the order in which data is processed.
The big data classification rule optimization method of an embodiment of the present invention is shown in Fig. 1. In Fig. 1, this embodiment includes:
Step 110: establishing a data structure for storing rules.
The data structure stores the specific parameters of the rules. Based on these parameters, each rule determines a classification benchmark for the big data source data, and the classification process is formed accordingly.
A rule can be the numeric value, character string, feature vector value, data reference location, data judgment logic, or rule verification data of a specific parameter or associated parameter. The data structure is generally adaptable to the specific parameters and associated parameters of rules, which ensures that it accommodates their types and magnitudes.
Step 120: forming a rule set for a determined theme through the data structure.
The data evolution of the source data matches the process changes that form the source data in the objective environment. Changes in classification scenes reflect the data evolution of the source data under a classification theme. The set of classification scenes determines the classification features of the classification theme, and a classification theme contains further fine-grained classification scenes. The data structure stores the specific or associated parameters corresponding to each classification scene, and the rules corresponding to the classification scenes are composed into the rule set of a single theme through the data structure.
Step 130: performing scene classification on source data according to the rule set.
Theme classification of the source data is performed according to the rule set, and on the basis of the theme classification, classifications for the different scenes within the theme are formed.
Scene classification is performed according to the types and meanings of the rule parameters in the rule set and the decision logic formed among the parameters. During scene classification, the rule parameters in the rule set can be used for data processing such as retrieval, sorting, or exclusion.
The big data classification rule optimization method of this embodiment uses the data structure to form a complete topology of theme classification and scene classification, and organizes the parameters of the different classification processes in an ordered structure, so that the complex parameter invocation process of a big data classification model forms a structured classification system. As a result, the classification iteration parameter data, key enumeration data, and threshold adjustment intervals of the classification process can be managed effectively and configured reasonably. The big data classification model thus gains a technical means of making moderate adjustments in response to changes in source data features. This avoids frequently adjusting the training set and the learning process as the objective scenes in which the source data is formed evolve, reduces the time cost and labor cost of big data classification, and improves the application generality and tolerance of the big data classification model.
As shown in Fig. 1, in an embodiment of the present invention, the big data classification rule optimization method further includes:
Step 140: parsing the rule set and updating the stored rules.
Parsing and updating means reading the data structure of the rule set, obtaining the parameters or associated parameters of a determined rule from the data structure, updating that rule by modifying the parameter content, and thereby updating the rule set of the determined theme.
The big data classification rule optimization method of this embodiment provides a locate-amend-replace update process for local rules of a big data classification model. While preserving the integrity of the existing model, it provides a technical means of amending classification parameters to suit the source data. Changes in the data type or trend of the source data can be reflected in the model in time through appropriate adjustment, while the overall trend and accuracy of big data classification remain stable.
As shown in Fig. 1, in an embodiment of the present invention, the big data classification rule optimization method further includes:
Step 150: parsing the rule set, obtaining part of the data structure and rules, and forwarding them.
Parsing and obtaining means reading the data structure of the rule set to obtain, for a specific type of data in the big data classification model, the data topology of the relevant partial classification and scene classification together with the stored rule parameters, forming a rule set for a determined theme.
The big data classification rule optimization method of this embodiment provides an extraction process for local rules of a big data classification model. It ensures that the classification rules for a specific classification type or classification scene can be accurately extracted from the rules of a determined classification in an existing model, enabling effective reuse of reliable rule subsets from a mature model. Reusing rule subsets helps improve the classification rules of other, newly formed big data classification models, so that a model can be adapted in due course to the commercial or application field of the source data and meet the specificity and accuracy required of big data classification in business scenarios. It also responds well to combinations of business data, professional knowledge graphs, and retrieval structures in a business environment.
The process of establishing the data structure in the big data classification rule optimization method of an embodiment of the present invention is shown in Fig. 2. In Fig. 2, the process includes:
Step 111: setting the structure fields of a single rule.
The structure fields set data types that match the rule's specific or associated parameters and satisfy the format requirements of data storage. Structure fields may contain one another.
Step 112: setting the field keywords and compound field keywords of the single rule.
Field keywords refer to field types and make single structure fields more readable. Compound field keywords make compound structure fields more readable. Both help people read and understand the structure.
Step 113: forming the data structure of the rule according to the field topology of the single rule.
The data structure can use an existing standard structure such as XML or JSON. Taking XML (Extensible Markup Language) as the data structure standard, structure fields are formed using the XML standard, keywords are set for the structure fields, and the keywords are registered in the XML standard to form a determined field topology, which in turn forms the data structure of rules at each level. The basic form of the data structure can be as follows:
<theme rule keyword><
Rule 1 ((parameter-type parameter-data), (parameter-type parameter-data), (parameter-type parameter-data))
Rule 2 ((parameter-type parameter-data), (parameter-type parameter-data), (parameter-type parameter-data))
……
></theme rule keyword>
Example 1
<[1.0]<accrue>("electric business", "e-commerce", "difference quotient") // "electric business", "e-commerce" and "difference quotient" are classification keywords with a weight coefficient of 1.0 in the classification result
[0.5]<accrue>("shop", "B2B", "O2O") // "shop", "B2B" and "O2O" are classification keywords with a weight coefficient of 0.5 in the classification result
>
Example 2
<[1.0] "electric business", [1.0] "e-commerce", [1.0] "difference quotient", [0.5] "shop", [0.5] "B2B", [0.5] "O2O" // the classification keywords "electric business", "e-commerce", "difference quotient", "shop", "B2B" and "O2O" have weight coefficients [1.0], [1.0], [1.0], [0.5], [0.5] and [0.5] respectively
>
The big data classification rule optimization method of this embodiment establishes the basic data structure from keywords and the topology among keywords, so that the data structure matches the types of the rule parameters, and the keywords improve the readability of the data structure.
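As a rough illustration of how the two example forms relate, the following Python sketch holds the grouped form of Example 1 in memory and expands it into the per-keyword form of Example 2. The class name `WeightedKeywordRule`, the function `flatten`, and the JSON layout are assumptions made for illustration. The text above only names XML and JSON as possible standards and does not prescribe this layout.

```python
import json
from dataclasses import dataclass, asdict
from typing import Dict, List


@dataclass
class WeightedKeywordRule:
    """One rule: a weight coefficient shared by a group of keywords (hypothetical layout)."""
    weight: float
    keywords: List[str]


def flatten(rules: List[WeightedKeywordRule]) -> Dict[str, float]:
    """Expand grouped rules (Example 1 form) into per-keyword weights (Example 2 form)."""
    flat: Dict[str, float] = {}
    for rule in rules:
        for keyword in rule.keywords:
            flat[keyword] = rule.weight
    return flat


# Keywords and weights copied from Example 1.
rules = [
    WeightedKeywordRule(1.0, ["electric business", "e-commerce", "difference quotient"]),
    WeightedKeywordRule(0.5, ["shop", "B2B", "O2O"]),
]

flat_weights = flatten(rules)

# One possible JSON serialization of the grouped form (an assumed layout).
serialized = json.dumps([asdict(rule) for rule in rules])
```

Keeping the grouped form as the stored representation and deriving the flat form on demand means a weight only has to be edited in one place.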
The process of establishing the rule set in the big data classification rule optimization method of an embodiment of the present invention is shown in Fig. 3. In Fig. 3, the process includes:
Step 121: establishing a rule set keyword.
Step 122: establishing the rule set subsets of the rule set and the corresponding subset keywords.
Step 123: establishing the scene sets of the rule set subsets and the corresponding scene keywords.
Step 124: establishing a keyword topology according to the rule set keyword, the subset keywords, or the scene keywords.
Step 125: adding the data structures of rules to the keyword topology, and adding rule parameters or associated parameters to the data structures of the rules.
Taking the XML (Extensible Markup Language) data structure standard as an example, themes and their corresponding rule parameters are added to the data structure formed in the above embodiment to form the following rule set:
<rule set name>
<rule name>
<rule scene name><
Rule 1
Rule 2
…>
Example 1
<topic name="other"><[cdata (
{political economy},
{business},
{society},
)]>
<topic name="political economy"><[cdata (
{political situation of the time comment},
{macroeconomy},
{politician},
)]>
<topic name="political situation of the time comment"><[cdata (
[1.0] ("economic reform", "political reform", "political reform", "educational reform", "Education equity", "college entrance examination immigrant", "household registration system")
[0.5] ("Gini coefficient", "network public-opinion", "liberalism", "political examination", "political development", "political culture", "nihilism")
)]>
</topic>
The big data classification rule optimization method of this embodiment establishes the topology of classification rules and scene rules from the keywords at each level and fills in the specific rule parameter data. This keeps both the data structure and the data content readable while preserving parameter storage efficiency.
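The keyword topology of steps 121 to 125 can be pictured as a tree whose leaves carry rule parameters. The Python sketch below mirrors the topic names from Example 1, with the keyword lists shortened. The two dictionaries `children` and `scene_rules` and the helper `keyword_path` are hypothetical names chosen for illustration, not structures defined by the patent.

```python
from typing import Dict, List, Tuple

# Keyword topology: each theme or sub-theme keyword maps to its child keywords.
children: Dict[str, List[str]] = {
    "other": ["political economy", "business", "society"],
    "political economy": [
        "political situation of the time comment",
        "macroeconomy",
        "politician",
    ],
}

# Scene-level rules: a weight coefficient plus the keywords that share it
# (keyword lists shortened from Example 1).
scene_rules: Dict[str, List[Tuple[float, List[str]]]] = {
    "political situation of the time comment": [
        (1.0, ["economic reform", "household registration system"]),
        (0.5, ["Gini coefficient", "network public-opinion"]),
    ],
}


def keyword_path(root: str, target: str) -> List[str]:
    """Return the chain of keywords from root down to target, or [] if target is absent."""
    if root == target:
        return [root]
    for child in children.get(root, []):
        tail = keyword_path(child, target)
        if tail:
            return [root] + tail
    return []


path = keyword_path("other", "political situation of the time comment")
```

Here `path` is the sequence of rule set, subset, and scene keywords that addresses one scene's rules, which is the kind of address the later update and extraction steps would need.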
The process of performing scene classification in the big data classification rule optimization method of an embodiment of the present invention is shown in Fig. 4. In Fig. 4, the process includes:
Step 131: determining the corresponding preliminary source data according to a preliminary classification.
The preliminary classification can be a basic category, or a category produced by a basic data classification model. The preliminary source data is the source data corresponding to a determined category after the basic classification.
Step 132: determining, within the rule set, the rule set corresponding to the preliminary classification.
The rule set may include rule (sub)sets for different classifications, and a rule (sub)set may in turn include rule (sub)sets for the scene classification categories of different fields or classifications.
Step 133: extracting the scene classification categories and scene classification parameter data from the rule set.
A scene classification category can have default parameters and can also have customized parameter data.
Step 134: classifying the preliminary source data using the scene classification parameter data, forming the classified source data corresponding to each scene classification category under the determined preliminary classification.
The scene classification parameter data serves as the basis for further data classification: the preliminary source data is further classified into scene classification categories. Feature threshold judgments are applied during classification, so some source data may be excluded from every category when the classified source data is formed.
Step 135: forming the result classification data of the preliminary classification from the classified source data.
The set of classified source data acts as a screening of the preliminary source data, which helps exclude data noise introduced as the source data evolves.
The big data classification rule optimization method of this embodiment uses the rule set to perform further scene classification on the source data of a determined preliminary classification produced by a basic big data classification model. Scene classification filters deviating data out of the original source data and forms more precise classification scenes with corresponding classified data, improving both the further classification features formed from scene features and the classification accuracy of the basic classification model.
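Steps 133 to 135 can be sketched in Python as follows. The scoring rule used here (summing the weight of every keyword found in a text and comparing the sum with a threshold) is an assumption made for illustration. The description only states that weight coefficients and feature threshold judgments are involved, so the function names, the default threshold, and the sample texts are all hypothetical.

```python
from typing import Dict, List, Optional, Tuple

SceneRules = Dict[str, List[Tuple[float, List[str]]]]


def score(text: str, groups: List[Tuple[float, List[str]]]) -> float:
    """Sum the weight of every rule keyword that occurs in the text (assumed scoring)."""
    return sum(weight for weight, keywords in groups for kw in keywords if kw in text)


def classify_scenes(
    preliminary_data: List[str], scene_rules: SceneRules, threshold: float = 1.0
) -> Dict[str, List[str]]:
    """Assign each item to its best-scoring scene; drop items below the threshold."""
    result: Dict[str, List[str]] = {scene: [] for scene in scene_rules}
    for text in preliminary_data:
        best_scene: Optional[str] = None
        best_score = 0.0
        for scene, groups in scene_rules.items():
            s = score(text, groups)
            if s >= threshold and s > best_score:
                best_scene, best_score = scene, s
        if best_scene is not None:  # items matching no scene are treated as noise
            result[best_scene].append(text)
    return result


rules: SceneRules = {
    "political situation of the time comment": [
        (1.0, ["economic reform", "household registration system"]),
        (0.5, ["network public-opinion"]),
    ],
}
docs = [
    "economic reform and the household registration system",
    "a note on network public-opinion",
    "weather report",
]
classified = classify_scenes(docs, rules)
```

With these sample inputs only the first text reaches the threshold. The other two are excluded, which corresponds to the screening effect described for step 135.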
The process of performing data processing on rules in the big data classification rule optimization method of an embodiment of the present invention is shown in Fig. 5. In Fig. 5, the process of updating rules includes:
Step 141: obtaining the update requirement for a determined classification and a determined scene.
Step 142: determining the rule data structure and rule content according to the update requirement, and displaying them through an interactive frame.
Step 143: writing the updated content into the data structure and rule content of the corresponding rule through the interactive frame.
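The update steps above amount to locating one rule by its keyword path and replacing its content. A minimal Python sketch, assuming the rule set is held as nested dictionaries and leaving out the interactive frame entirely, might look like this. The function `update_rule`, the `threshold` parameter, and the sample rule set are hypothetical.

```python
import copy
from typing import Any, Dict, List


def update_rule(rule_set: Dict[str, Any], path: List[str], new_content: Any) -> Dict[str, Any]:
    """Return a copy of rule_set with the entry addressed by path replaced.

    The original rule set is left untouched so the existing model stays intact
    until the revised copy is accepted.
    """
    updated = copy.deepcopy(rule_set)
    node = updated
    for keyword in path[:-1]:
        node = node[keyword]  # raises KeyError if the path does not exist
    if path[-1] not in node:
        raise KeyError(path[-1])
    node[path[-1]] = new_content
    return updated


rule_set = {
    "political economy": {
        "macroeconomy": {"threshold": 1.0, "keywords": {"Gini coefficient": 0.5}},
    },
}
revised = update_rule(rule_set, ["political economy", "macroeconomy", "threshold"], 1.5)
```

Working on a deep copy is one way to respect the stated goal of not disturbing the rest of the model during an update. An in-place edit would also fit the description.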
In Fig. 5, the process of locally obtaining and forwarding rules includes:
Step 151: obtaining, according to the forwarding requirement, the associated topology of the data structure for a determined classification and a determined scene.
Step 152: forming a temporary rule data structure and rule content according to the associated topology.
Step 153: forming the temporary rule data structure and rule content into an independent data object, and providing a data link.
The big data classification rule optimization method of this embodiment extracts and updates part of the parameters of a complete big data classification model according to the data structure, avoiding damage to the integrity of the model and interference with the data classification process. Classification rules can be updated promptly, which largely accommodates the gradual evolution of the types and domain features of the classified source data. A locally extracted set of classification rules, formed into a corresponding data object, can exist as an independent rule data source and supply other big data classification models with reliable rules for data classification in related fields and scenes, improving the reuse of reliable rules and reducing the cost of big data classification.
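Steps 151 to 153 can be pictured as cutting one subtree out of the rule set and packaging it so another model can load it. The Python sketch below serializes the subtree to a JSON string as a stand-in for the independent data object. How the data link is actually provided is not specified in the description, so it is not modeled here, and the function name `extract_rules` and the sample rule set are assumptions.

```python
import json
from typing import Any, Dict, List


def extract_rules(rule_set: Dict[str, Any], path: List[str]) -> str:
    """Serialize the rule subtree addressed by path as a standalone JSON object."""
    node: Any = rule_set
    for keyword in path:
        node = node[keyword]  # raises KeyError if the path does not exist
    # The path is stored with the rules so the receiver knows where they came from.
    return json.dumps({"path": path, "rules": node}, ensure_ascii=False, sort_keys=True)


rule_set = {
    "political economy": {
        "macroeconomy": {"keywords": {"Gini coefficient": 0.5}},
        "politician": {"keywords": {}},
    },
    "business": {},
}
exported = extract_rules(rule_set, ["political economy"])
restored = json.loads(exported)
```

A receiving model could merge `restored["rules"]` into its own rule set under the same or a different theme keyword.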
The big data classification rule optimization device of an embodiment of the present invention includes:
A memory, for storing program code corresponding to the processing of the big data classification rule optimization method of the above embodiments;
A processor, for executing the program code corresponding to the processing of the big data classification rule optimization method of the above embodiments.
The framework that one embodiment of the invention big data classifying rules optimizes device is as shown in Figure 6.In Fig. 6, the present embodiment Include:
Rule setting module 1110, for establishing the data structure of storage rule;
Rule forms module 1120, for forming the rule set of determining theme by data structure;
Rule application module 1130, for carrying out scene classification to source data according to rule set.
As shown in fig. 6, in an embodiment of the present invention, further includes:
Policy Updates module 1140, for being parsed to rule set and updating storage rule.
As shown in fig. 6, in an embodiment of the present invention, further includes:
Rule Extraction module 1150 turns for Partial data structure and rule to be parsed and obtained to rule set Hair.
As shown in fig. 6, in an embodiment of the present invention, rule setting module 1110 includes:
Rule field forms unit 1111, for setting the texture field of single rule;
Regular keyword forms unit 1112, for setting the field keyword and compound fields keyword of single rule;
Regular texture forms unit 1113, for the data knot according to the field topological structure formation rule of single rule Structure.
As shown in fig. 6, in an embodiment of the present invention, rule forms module 1120 and includes:
Primary keyword setting unit 1121, for establishing rule set keyword;
Rules subset setting unit 1122, for establish rule set rule set subset and corresponding subset keyword;
Regular scene setting unit 1123, for establishing the scene collection and corresponding scene keyword of rule set subset;
Crucial word association setting unit 1124, for being built according to rule set keyword, subset keyword or scene keyword Vertical keyword topological structure;
Parameter of regularity fills unit 1125, for increasing the data structure of rule in keyword topological structure, in rule Data structure in add parameter of regularity or relevant parameter.
As shown in Fig. 6, in an embodiment of the present invention, the rule application module 1130 includes:
Preliminary data acquiring unit 1131, for determining the corresponding preliminary source data according to the preliminary classification;
Rule determination unit 1132, for determining, among the rule sets, the rule set corresponding to the preliminary classification;
Rule configuration unit 1133, for extracting the scene classification categories and scene classification parameter data from the rule set;
Classification execution unit 1134, for classifying the preliminary source data using the scene classification parameter data, to determine the classified source data corresponding to each scene classification category under the preliminary classification;
Data synthesis unit 1135, for forming the result classification data of the preliminary classification from the classified source data.
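As a minimal illustrative sketch of how units 1131 to 1135 might cooperate, the following Python fragment classifies preliminary source data into scene classification categories by keyword matching. The names `SceneRule` and `apply_rule_set`, the sample scenes and the use of plain substring matching are assumptions made for illustration only; the embodiment does not prescribe them.

```python
from dataclasses import dataclass, field


@dataclass
class SceneRule:
    """Hypothetical container: one scene classification category and its parameter data."""
    scene: str
    keywords: list = field(default_factory=list)


def apply_rule_set(preliminary_records, scene_rules):
    """Classify preliminary source data into scene categories (assumed: substring match)."""
    # One result bucket per scene classification category.
    result = {rule.scene: [] for rule in scene_rules}
    for record in preliminary_records:
        for rule in scene_rules:
            if any(keyword in record for keyword in rule.keywords):
                result[rule.scene].append(record)
    return result


# Hypothetical rule set and preliminary source data.
rules = [
    SceneRule("billing", ["invoice", "refund"]),
    SceneRule("support", ["ticket"]),
]
records = ["refund request 17", "ticket opened", "invoice and ticket"]
classified = apply_rule_set(records, rules)
```

In this sketch a record may fall under more than one scene classification category; whether categories are mutually exclusive is a design choice the source leaves open.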
As shown in Fig. 6, in an embodiment of the present invention, the rule update module 1140 includes:
Update receiving unit 1141, for obtaining an update demand for a determined classification and a determined scene;
Update interaction unit 1142, for determining the rule data structure and rule content according to the update demand and displaying them through an interactive frame;
Update receiving unit 1143, for writing the updated content, through the interactive frame, into the data structure and rule content of the corresponding rule.
As shown in Fig. 6, in an embodiment of the present invention, the rule extraction module 1150 includes:
Forwarding demand receiving unit 1151, for obtaining, according to a forwarding demand, the associated topology of the data structures of a determined classification and a determined scene;
Demand determination unit 1152, for forming an interim rule data structure and rule content according to the associated topology;
Demand separation unit 1153, for forming the interim rule data structure and rule content into an independent data object and providing a data link.
The optimization method of big data classification rules structures the classification rules and provides an overall framework for reusing, extending and updating them. It can satisfy the need for general-purpose classification rules across data of related fields, and supports the formation of a general-purpose classification framework between closely related fields.
Optimizing the data structures used in the data classification process of big data likewise yields a classification optimization method that facilitates data display.
The classification optimization method for data display of an embodiment of the invention is shown in Fig. 7. In Fig. 7, this embodiment includes:
Step 210: establish a classification data structure for storing classification categories.
The classification data structure is used to form the topology of the classification categories and constitutes the framework basis of all classification categories, so that the framework basis can adapt, through combination, to the changes in classification categories that arise in complex application fields.
The classification data structure is used to store the basic parameters of the classification categories, including but not limited to the reference source, identifier or name information.
The classification data structure also provides structural features such as a control index of the classification categories.
Step 220: establish the classification categories and store them through the classification data structure.
According to the classification features of the field data in the application field, the topology of the classification categories is formed through the classification data structure.
Through the classification data structure, the topology of the classification categories forms a user-readable classification category description format.
Step 230: store the classification rules of the classification categories through the classification data structure.
A classification system is established using the classification data structure, and the generality of the classification system makes it compatible with the data structure of the rules. The classification rules of the classification categories are formed through the rule processing procedure of the big data classification rule optimization method of an embodiment of the invention. In combination with that rule processing procedure, this embodiment establishes data mappings through the data structures, so that a data structure can be referenced or reused in part or as a whole.
The classification optimization method for data display of an embodiment of the invention uses a general-purpose classification data structure to plan and design the storage of classification categories. It keeps the data framework of the storage structure consistent for classification categories formed through deep learning and through supervised classification, so that classification category data gains the data processing advantages of being applicable and inheritable. Storing the parameters of classification categories in the classification data structure helps different classification categories display the same features or parameters consistently, and also helps determine the classification categories and their corresponding parameters or features. Interactive means for reusing or updating classification categories can further be formed, reducing the application difficulty for data users. The method is particularly beneficial for vertical data classification within a single data field.
The method for forming the classification data structure in the classification optimization method for data display of an embodiment of the invention is shown in Fig. 8. In Fig. 8, the process of forming the classification data structure includes:
Step 211: form the classification category basic framework.
The basic framework includes the essential elements that make up a complete framework, such as the basic data types used to construct the classification data structure and the basic data structure standard.
Step 212: form the classification category set in the basic framework.
The classification category set forms the classification basis of the classification categories. The classification basis serves as a data port for exchanging data with other data systems, or as the identifier of a group of classification categories.
Step 213: form the classification category levels in the classification category set.
A classification category level identifies the association logic of classification categories and determines the containment features of two adjacent classification categories. Classification category levels form the internal association between adjacent classification categories.
Step 214: form the classification category topology across the classification category levels in the classification category set.
The classification category topology is used to establish additional features of the classification category set as a whole. It forms the internal association of the classification categories as a whole.
Step 215: form the storage fields of the classification categories in the classification category levels.
Storage fields include but are not limited to numerical values, character strings, feature vector values, data reference locations, data decision logic or rule verification data. The storage fields are adaptable: they can be formed according to the preset conditions of an adaptation rule or according to the data input type. The number of storage fields is not specifically limited, and is accommodated by memory allocation in the manner of a queue or array structure.
The data structure can use a standard structure of the prior art, such as the XML standard or the JSON standard. Taking XML (Extensible Markup Language) as the data structure standard for example, structure fields are formed using the XML standard, keywords are set for the structure fields, and the keywords are registered in the XML standard to form a determined field topology, which in turn forms the data structures of the rules at each level. The basic form of the data structure can be as follows:
Example 3:
Step 216: store the associated data of the classification categories according to the storage topology and the storage fields.
Associated data can be data describing a classification category, such as the feature data of the category, or data associated with the category, such as its rule data.
The classification optimization method for data display of the embodiment of the invention, by establishing the classification data structure, forms a field data classification category framework with good generality for closely related data fields, so that classification categories have a general data processing basis for multiplexing, reuse and fast updating.
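Steps 211 to 216 can be pictured with a small in-memory sketch. The Python fragment below is an assumption-laden illustration, not the data structure of the embodiment: the class names `Category` and `CategorySet`, the field names, and the sample categories are all hypothetical. It shows a category set (step 212), levels derived from parent links (step 213), a topology expressed as a parent path (step 214), free-form storage fields (step 215) and associated data such as rules (step 216).

```python
from dataclasses import dataclass, field
from typing import Any, Dict, Optional


@dataclass
class Category:
    """Hypothetical classification category node."""
    name: str
    level: int                                   # classification category level
    parent: Optional[str] = None                 # containment link to the adjacent category
    fields: Dict[str, Any] = field(default_factory=dict)      # storage fields
    associated: Dict[str, Any] = field(default_factory=dict)  # associated data, e.g. rules


@dataclass
class CategorySet:
    """Hypothetical classification category set inside the basic framework."""
    identifier: str
    categories: Dict[str, Category] = field(default_factory=dict)

    def add(self, name, parent=None, **fields):
        # The level is derived from the parent, so adjacent levels stay consistent.
        level = 1 if parent is None else self.categories[parent].level + 1
        self.categories[name] = Category(name, level, parent, dict(fields))
        return self.categories[name]

    def path(self, name):
        """Return the chain of categories from the top level down to `name`."""
        chain = []
        node = self.categories.get(name)
        while node is not None:
            chain.append(node.name)
            node = self.categories.get(node.parent) if node.parent else None
        return list(reversed(chain))


taxonomy = CategorySet("enterprise-demo")
taxonomy.add("business", source="manual")
taxonomy.add("business role", parent="business", keywords=["manager", "clerk"])
taxonomy.categories["business role"].associated["rules"] = ["role-keyword-rule"]
```

Keeping the storage fields as an open dictionary mirrors the statement that the number of storage fields is not fixed; a production design would more likely serialize the same structure to XML or JSON.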
As shown in Fig. 8, in an embodiment of the present invention, the process of establishing classification categories includes:
Step 221: register a classification identifier through the classification category basic framework.
Registration forms a data connection between the classification category basic framework and the processing of existing data classification models, providing an addressable data port for data calls.
Step 222: establish a determined classification theme set through the classification category set.
The classification theme set determines the classification basis within one data field, and establishes a data domain or data addressing range that is relatively independent of other data fields. The data domain or data addressing range can be an independent data object, such as a linked data source, a linked independent data file or a linked classification theme set.
Step 223: establish graded classification themes through the classification category levels.
A classification category level holds the hierarchical logic description of two adjacent classification categories. Once the rank or containment logic of the two adjacent categories has been described, the construction of classification categories is managed through the graded classification themes.
Step 224: establish the substance parameters of a determined theme through the storage fields.
Classification categories include storage fields, and determined classification categories may include the same or different storage fields.
The classification optimization method for data display of the embodiment of the invention uses the classification data structure to form structured storage of classification theme sets, classification theme levels, and determined categories with their corresponding parameters. Complex classification hierarchies can be realized while also satisfying the extension, reuse and association of classification categories.
As shown in Fig. 8, in an embodiment of the present invention, the process of associating related data includes:
Step 231: determine the classification identifier and the corresponding classification theme set through the classification data structure.
The data interaction port of the classification theme set is obtained using the classification data structure, starting the interaction between the basic data framework and the data content.
Step 232: obtain the classification categories of the determined theme through the classification theme set.
The classification categories of the determined theme include the topology and the specific field structure of each classification category under that theme.
Step 233: extract the rule set of the determined theme through the data structure of the rules.
A mapping is formed between the data structure of the rules and the classification data structure, completing the rough-set association of the data structures, so that the rule set of the determined theme is connected to the classification categories of the determined theme.
Step 234: add the rule set to the classification category association field of the determined theme according to the classification data structure.
Using the mapping between the data structure of the rules and the classification data structure, the data structure of the rules or the rule set is added to the classification data structure, and the corresponding fields and field contents are transferred.
The classification optimization method for data display of the embodiment of the invention uses the mapping between the classification data structure and the rule data structure to make the two structures compatible, so that rules and classifications can be stored separately, adjusted independently and associated with one another. The classification process in adjacent data fields can thus be formed from distributed classification categories and classification rules, effectively extending the generality of the classification framework and its adaptability to data types.
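The mapping of steps 233 and 234 can be sketched as a field-name translation followed by insertion into an association field. In the Python fragment below, the function `attach_rule_set`, the field name `associated_rules`, and the sample rule fields `kw` and `weight` are hypothetical names chosen for illustration; the source does not define the concrete field layout.

```python
import copy


def attach_rule_set(category, rule_set, field_map):
    """Return a copy of `category` with the mapped rule set in its association field.

    `field_map` translates field names of the rule data structure into field
    names of the classification data structure (assumed one-to-one mapping).
    """
    updated = copy.deepcopy(category)  # rules and categories stay independently stored
    mapped = [
        {field_map.get(name, name): value for name, value in rule.items()}
        for rule in rule_set
    ]
    updated.setdefault("associated_rules", []).extend(mapped)
    return updated


# Hypothetical category of a determined theme and a separately stored rule set.
category = {"theme": "business", "name": "business role", "fields": {"level": 2}}
rule_set = [{"kw": "manager", "weight": 0.8}, {"kw": "clerk", "weight": 0.5}]
field_map = {"kw": "keyword"}

linked = attach_rule_set(category, rule_set, field_map)
```

Returning a copy rather than mutating the stored category reflects the stated goal that rules and classifications can be stored separately and adjusted independently.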
A classification optimization device for data display of an embodiment of the invention includes:
Memory, for storing the program code corresponding to the processing procedure of the classification optimization method for data display of the above embodiment;
Processor, for executing the program code of the processing procedure of the classification optimization method for data display of the above embodiment.
The classification optimization device for data display of an embodiment of the invention is shown in Fig. 9. In Fig. 9, this embodiment includes:
Classification structure forming module 2210, for establishing a classification data structure for storing classification categories;
Classification content storage module 2220, for establishing classification categories and storing them through the classification data structure;
Classification rule association module 2230, for storing the classification rules of the classification categories through the classification data structure.
As shown in Fig. 9, in one embodiment of the invention, the classification structure forming module 2210 includes:
Framework forming unit 2211, for forming the classification category basic framework;
Set forming unit 2212, for forming the classification category set in the basic framework;
Level forming unit 2213, for forming the classification category levels in the classification category set;
Overall topology unit 2214, for forming the classification category topology across the classification category levels in the classification category set;
Field forming unit 2215, for forming the storage fields of the classification categories in the classification category levels;
Parameter storage unit 2216, for storing the associated data of the classification categories according to the storage topology and the storage fields.
As shown in Fig. 9, in one embodiment of the invention, the classification content storage module 2220 includes:
Framework registering unit 2221, for registering a classification identifier through the classification category basic framework;
Theme registering unit 2222, for establishing a determined classification theme set through the classification category set;
Grading registering unit 2223, for establishing graded classification themes through the classification category levels;
Theme determination unit 2224, for establishing the substance parameters of a determined theme through the storage fields.
As shown in Fig. 9, in one embodiment of the invention, the classification rule association module 2230 includes:
Theme set structure determination unit 2231, for determining the classification identifier and the corresponding classification theme set through the classification data structure;
Classification category determination unit 2232, for obtaining the classification categories of the determined theme through the classification theme set;
Rule determination unit 2233, for extracting the rule set of the determined theme through the data structure of the rules;
Data structure mapping unit 2234, for adding the rule set to the classification category association field of the determined theme according to the classification data structure.
The data classification system formed using the big data classification rule optimization method of the above embodiments and the classification optimization method for data display of the above embodiments can form a unified structure covering both data classification and classification rules. Building on this unified structure, existing data classification methods can be effectively improved.
The data classification optimization method of an embodiment of the invention is shown in Fig. 10. In Fig. 10, the embodiment of the invention includes:
Step 310: form an industry-customized classification model using the classification data structure.
Based on the generality, extensibility and data reusability of the classification system formed by the classification data structure, the classification categories and corresponding classification rules of part of the data of a specific industry or specific field are organized, using the classification data structure, into an ordered industry-customized classification model.
Step 320: classify the source data with the industry-customized classification model to form data labels.
The classification of source data includes explicit classification and implicit classification. Explicit classification features can be defined and described directly in the industry-customized classification model, whereas the implicit data features underlying implicit classification are not suited to direct definition and description. Explicit classification is carried out using the industry-customized classification model, and the classification features of each item of source data are marked to form the corresponding data label.
Step 330: divide the source data into data training sets according to the data labels.
Using the data labels, source data of the same label type is combined into one data training set, which ensures the consistency of the explicit features.
Step 340: optimize the intelligent classification model according to the data training sets and complete the classification of the source data.
Supervised learning or semi-supervised learning is used to obtain the implicit features of the data training sets, completing the training of the intelligent classification model.
The source data can be source data that has not yet been classified, which facilitates the classification of incremental data. The source data can also be all source data, which benefits the accuracy of data classification.
The data classification optimization method of the embodiment of the invention uses the classification data structure to combine reusable manual classification rules with a classification model based on computer intelligence algorithms. Accurate large-scale data training sets are formed using the manual classification rules, the intelligent classification model is optimized using the large-scale data training sets, and the intelligent classification model delivers the accuracy and efficiency of implicit classification.
The data classification optimization method of an embodiment of the invention is shown in Fig. 11. In Fig. 11, the process of forming the industry-customized classification model in this embodiment includes:
Step 311: establish the classification categories and the topology of the classification categories through the classification data structure according to business demands.
The classification categories and their topology are formed using the normalization and configuration logic of the classification data structure.
Step 312: establish the corresponding classification rules according to the classification categories.
A classification rule can be a classification data parameter corresponding to a classification category, such as a keyword, which can serve as the basis for classification or retrieval.
Step 313: set classification thresholds according to the classification rules.
A classification threshold can apply to retrieval or search results: each item of data in the results has a corresponding relevance score, and data whose score is higher than the preset classification threshold is collected into the determined category.
Step 314: form the industry-customized classification model according to the topology, the classification rules and the classification thresholds.
The data structure can use the standard structure of the above embodiments or of the prior art, such as the XML standard or the JSON standard. Taking XML (Extensible Markup Language) as the data structure standard for example, structure fields are formed using the XML standard, keywords are set for the structure fields, and the keywords are registered in the XML standard to form a determined field topology, which in turn forms the data structures of the rules at each level. The basic form of the data structure can be as follows:
Example 4
A classification system is established according to the hierarchical structure provided by the enterprise. The classification system refers to the hierarchy of the categories and their specific names, as follows: the first-level category is "business", and under it the second-level category "business role" is included.
The data classification optimization method of the embodiment of the invention uses the classification data structure to form a classification system of ordered explicit classification features. The classification system expresses the topology between classification categories together with each category's classification rule parameters and classification weights, giving an ordered structure to the explicit features of industry demands that are adjusted manually. This provides a reliable technical basis for effectively adjusting, multiplexing, reusing or updating explicit features, so that the classification of explicit features for industry demands can be completed independently by operations staff, improving overall data classification efficiency.
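A minimal sketch of steps 311 to 314 follows, under the assumption that the relevance score of a text for a category is the fraction of that category's keywords found in the text. The scoring rule, the dictionary layout and the keyword and threshold values are hypothetical and are not taken from the source's examples.

```python
def relevance(text, keywords):
    """Assumed scoring rule: fraction of the category's keywords that occur in the text."""
    if not keywords:
        return 0.0
    lowered = text.lower()
    hits = sum(1 for keyword in keywords if keyword in lowered)
    return hits / len(keywords)


# Hypothetical industry-customized model: topology (parent), rules (keywords), thresholds.
model = {
    "business": {
        "parent": None,
        "keywords": ["contract", "order"],
        "threshold": 0.5,
    },
    "business role": {
        "parent": "business",
        "keywords": ["manager", "clerk", "buyer"],
        "threshold": 0.3,
    },
}


def classify(text, model):
    """Collect every category whose relevance score is higher than its threshold."""
    return [
        name
        for name, spec in model.items()
        if relevance(text, spec["keywords"]) > spec["threshold"]
    ]


matched = classify("Manager approved the contract and the order", model)
unmatched = classify("weather report", model)
```

Because each category carries its own threshold, operations staff could tune one category without touching the others, which is the kind of independent adjustment the paragraph above describes.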
As shown in Fig. 11, in an embodiment of the present invention, the process of forming data labels in this embodiment includes:
Step 321: obtain interim source data.
Interim source data refers to concurrent or individual persistent-state data within a unit of time or a cycle duration. It can be continuous static data or dynamic data of a corresponding business state.
Step 322: obtain explicit classification data by filtering the interim source data with the industry-customized classification model.
The filtered data or retrieved data formed using the keyword parameters of the classification categories in the model belongs to the explicit classification data.
Step 323: adjust the classification data using the classification thresholds to form preliminary classification category data.
The classification thresholds correspond to the classification rules. They act on the retrieved or filtered data produced by the classification rules, adjusting the degree of data deviation and the assignment of data to categories.
Step 324: identify each class of preliminary classification category data as the same kind of data to form a data label.
The data label treats the classification category as a data feature of an independent dimension, so that the linear classification features of the preliminary classification category data are quantified.
The data classification optimization method of the embodiment of the invention uses the rule keywords and rule weights of the classification categories in the industry-customized classification model to filter, classify and label the source data. It can make full use of the processing resources and storage resources of the computer system, efficiently complete the explicit feature classification and form data labels, improving overall data classification efficiency.
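Steps 321 to 324 can be illustrated by the following sketch, which filters interim source data with category keywords, applies a threshold, and attaches the best-scoring category as the data label. The function name `label_records`, the scoring rule (share of matched keywords), the single global threshold and the sample data are assumptions for illustration; every category is assumed to have at least one keyword.

```python
def label_records(records, category_keywords, threshold):
    """Attach the best-scoring category as a data label when its score exceeds the threshold."""
    labelled = []
    for text in records:
        lowered = text.lower()
        best_label, best_score = None, 0.0
        for category, keywords in category_keywords.items():
            # Assumed relevance score: share of the category's keywords found in the record.
            score = sum(keyword in lowered for keyword in keywords) / len(keywords)
            if score > best_score:
                best_label, best_score = category, score
        # Records at or below the threshold stay unlabelled and are left
        # for the intelligent classification model.
        if best_label is not None and best_score > threshold:
            labelled.append((text, best_label))
    return labelled


# Hypothetical interim source data and keyword rules.
records = ["Invoice overdue, refund issued", "Server outage ticket", "Lunch menu"]
category_keywords = {
    "finance": ["invoice", "refund"],
    "it": ["ticket", "outage", "server"],
}
labelled = label_records(records, category_keywords, threshold=0.4)
```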
As shown in Fig. 11, in an embodiment of the present invention, the process of forming data training sets in this embodiment includes:
Step 331: use the data label as the feature data of each item of preliminary classification category data.
Step 332: form the feature dimensions and feature vectors of each item of preliminary classification category data according to the various kinds of feature data.
Step 333: form the data training sets according to the feature dimensions and feature vectors.
The data classification optimization method of the embodiment of the invention forms the corresponding training sets according to the requirements of the intelligent classification model. A training set includes the necessary data feature dimensions and quantized vectors, and the data labels, as explicit features, form the basis for computing the implicit features. The explicit features produced with the processing resources and storage resources of the computer system serve in place of the training data features of supervised learning, so that the required training sets are formed efficiently and with high quality.
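One way to picture steps 331 to 333 is a bag-of-words encoding in which the vocabulary fixes the feature dimension and each labelled record becomes a count vector grouped by its data label. The vocabulary, the whitespace tokenization and the function name `build_training_sets` are assumptions for this sketch; the source does not specify how feature vectors are computed.

```python
from collections import defaultdict


def build_training_sets(labelled, vocabulary):
    """Group labelled records by data label and encode each as a term-count vector."""
    training_sets = defaultdict(list)
    for text, label in labelled:
        tokens = text.lower().split()
        # One feature dimension per vocabulary term.
        vector = [tokens.count(term) for term in vocabulary]
        training_sets[label].append(vector)
    return dict(training_sets)


# Hypothetical labelled data and vocabulary (feature dimension = 4).
labelled = [
    ("refund invoice refund", "finance"),
    ("server ticket", "it"),
    ("invoice paid", "finance"),
]
vocabulary = ["invoice", "refund", "server", "ticket"]
training_sets = build_training_sets(labelled, vocabulary)
```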
As shown in Fig. 11, in an embodiment of the present invention, the process of optimizing and classifying in this embodiment includes:
Step 341: iteratively optimize the intelligent classification model by changing the scale of the training set data.
Changing the training set data scale can mean applying training sets of contiguous local data, training sets of contiguous partial data, and training sets of random data, one by one, to the same intelligent classification model.
The intelligent classification model in an embodiment of the invention uses a naive Bayes model.
Step 342: classify all source data with the intelligent classification model.
The intelligent classification model can form an overall classification of all source data, meeting the need to analyze and classify data at a determined business scale.
Step 343: classify incremental source data with the intelligent classification model.
The intelligent classification model can form a continuous classification of incremental data, meeting the need to analyze and classify the production data of a determined business.
The data classification optimization method of the embodiment of the invention forms associated sub-training sets by splitting the training set, and uses the data differences between the sub-training sets to iteratively optimize the intelligent classification model, improving the efficiency and quality with which the model classifies implicit classification features. At the same time, performing incremental and full data classification according to the generation time of the source data gives existing classification data a further, finer-grained division.
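The source names naive Bayes as the intelligent classification model but gives no implementation. The following is a minimal multinomial naive Bayes sketch with add-one (Laplace) smoothing, written from scratch so that it is self-contained. Feeding the model successive sub-training sets through `partial_fit` is used here as a simplified stand-in for "iterative optimization by changing the training set data scale"; the class name, method names and sample data are hypothetical.

```python
import math
from collections import Counter, defaultdict


class NaiveBayes:
    """Minimal multinomial naive Bayes with add-one smoothing (illustrative only)."""

    def __init__(self):
        self.doc_counts = Counter()              # number of training records per label
        self.word_counts = defaultdict(Counter)  # token counts per label
        self.vocabulary = set()

    def partial_fit(self, samples):
        """Update the counts with one sub-training set of (text, label) pairs."""
        for text, label in samples:
            self.doc_counts[label] += 1
            for token in text.lower().split():
                self.word_counts[label][token] += 1
                self.vocabulary.add(token)

    def predict(self, text):
        """Return the label with the highest log posterior, or None if untrained."""
        total_docs = sum(self.doc_counts.values())
        best_label, best_score = None, -math.inf
        for label, count in self.doc_counts.items():
            score = math.log(count / total_docs)  # log prior
            total_words = sum(self.word_counts[label].values())
            for token in text.lower().split():
                smoothed = self.word_counts[label][token] + 1
                score += math.log(smoothed / (total_words + len(self.vocabulary)))
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Hypothetical labelled training data, split into sub-training sets of two records.
training = [
    ("invoice refund", "finance"),
    ("server outage", "it"),
    ("refund paid invoice", "finance"),
    ("ticket server", "it"),
]
model = NaiveBayes()
for start in range(0, len(training), 2):
    model.partial_fit(training[start:start + 2])

# Incremental source data can then be classified as it arrives.
incremental_label = model.predict("refund invoice")
```

Because the model only accumulates counts, full and incremental classification (steps 342 and 343) use the same `predict` call; only the set of records passed in differs.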
A data classification optimization device of an embodiment of the invention includes:
Memory, for storing the program code corresponding to the processing procedure of the data classification optimization method of the above embodiment;
Processor, for executing the program code of the processing procedure of the data classification optimization method of the above embodiment.
The architecture of the data classification optimization device of an embodiment of the invention is shown in Fig. 12. In Fig. 12, this embodiment includes:
Classification model forming module 3310, for forming an industry-customized classification model using the classification data structure;
Label marking module 3320, for classifying the source data with the industry-customized classification model to form data labels;
Training set forming module 3330, for dividing the source data into data training sets according to the data labels;
Classification forming module 3340, for optimizing the intelligent classification model according to the data training sets and completing the classification of the source data.
As shown in Fig. 12, in one embodiment of the invention, the classification model forming module 3310 further includes:
Topology forming unit 3311, for establishing the classification categories and the topology of the classification categories through the classification data structure according to business demands;
Classification rule forming unit 3312, for establishing the corresponding classification rules according to the classification categories;
Classification threshold forming unit 3313, for setting classification thresholds according to the classification rules;
Model forming unit 3314, for forming the industry-customized classification model according to the topology, the classification rules and the classification thresholds.
As shown in Fig. 12, in one embodiment of the invention, the label marking module 3320 further includes:
Source data acquiring unit 3321, for obtaining interim source data;
Explicit classification unit 3322, for obtaining explicit classification data by filtering the interim source data with the industry-customized classification model;
Classification adjustment unit 3323, for adjusting the classification data using the classification thresholds to form preliminary classification category data;
Label making unit 3324, for identifying each class of preliminary classification category data as the same kind of data to form a data label.
As shown in Fig. 12, in one embodiment of the invention, the training set forming module 3330 further includes:
Label feature forming unit 3331, for using the data label as the feature data of each item of preliminary classification category data;
Feature quantization unit 3332, for forming the feature dimensions and feature vectors of each item of preliminary classification category data according to the various kinds of feature data;
Training set synthesis unit 3333, for forming the data training sets according to the feature dimensions and feature vectors.
As shown in Fig. 12, in one embodiment of the invention, the classification forming module 3340 further includes:
Iterative optimization unit 3341, for iteratively optimizing the intelligent classification model by changing the scale of the training set data;
Overall classification unit 3342, for classifying all source data with the intelligent classification model;
Incremental classification unit 3343, for classifying incremental source data with the intelligent classification model.
The data classification system formed using the big data classification rule optimization method and the classification optimization method for data display of the above embodiments, together with the data classification system and data classification results formed by the data classification optimization method of the above embodiments, responds well to data demands. Combined with the data structure of the data classification system, an efficient interactive data display scheme can be formed.
The data processing method of the classification interactive interface of an embodiment of the invention is shown in Fig. 13. In Fig. 13, the embodiment of the invention includes:
Step 410: the class categories the of relevant classification classification are formed according to the first interactive data retrieval class categories data One topological structure.
First interaction data includes search key, and search key can be class categories keyword or and class categories Keyword senses are close or text similar in vocabulary, the similarity mode for passing through search key and class categories keyword obtains Immediate relevant classification classification, and classification class is formed comprising logic according to the class categories data under relevant classification classification Other first topological structure.
Step 420: class categories keyword being formed according to the first topological structure of class categories and is orderly shown.
The first topological structure of class categories can form tree-like, layer by mature data display technique in display frame Folded or grouping orderly display.
Step 430: the first classification results data to search result adaptation are formed according to the first topological structure of class categories Collection carries out classification results data and orderly shows.
Class categories (the classification class including dominant attribute and stealthy attribute retained in class categories data structure system Not there is corresponding classification data, classification data trains the data classification model formed or keyword classification system according to corresponding) It is formed.
Step 440: Form classification category combination logic according to the second interaction data, form a second topological structure of the classification categories according to that combination logic, and display the classification category keywords in order according to the second topological structure.
The second interaction data includes a selection of topological nodes (classification categories at different node positions in the data structure) in the first topological structure of the classification categories. The selection includes combining nodes or keeping and discarding nodes, and it expresses the classification category combination logic through the choice of the classification category keywords of the topological nodes in the first topological structure.
The determined classification category combination logic produces the combination or selection of classification categories, which in turn forms the second topological structure of the classification categories.
Using mature data display techniques, the second topological structure of the classification categories can be adjusted in the display frame into an ordered tree, stacked, or grouped display.
Step 450: Form, according to the second topological structure of the classification categories, a second classification result data set adapted to the search results, and display the classification result data in order.
The classification categories retained in the classification category data structure (including categories with explicit attributes and categories with implicit attributes) have corresponding classification data, and the classification data is formed according to the correspondingly trained data classification model or keyword classification system.
Adaptation includes sorting, deduplicating, or indexing the data in the different sets of classification data.
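The adaptation described above (sorting, deduplication, and indexing of data drawn from several sets of classification data) can be illustrated with a minimal Python sketch. The record fields `id` and `score`, the function name `adapt`, and the choice to sort by descending score are assumptions made for illustration; the patent does not specify them.

```python
def adapt(classified, sort_key="score"):
    """Merge per-category record lists into one ordered, deduplicated result.

    classified: dict mapping a category name to a list of records.
    Each record is a dict with a unique 'id' (hypothetical field names).
    Returns (ordered_records, index), where index maps a record id to the
    list of categories in which that record appeared.
    """
    unique = {}   # record id -> first record seen with that id (deduplication)
    index = {}    # record id -> categories containing it (indexing)
    for category, records in classified.items():
        for record in records:
            unique.setdefault(record["id"], record)
            index.setdefault(record["id"], []).append(category)
    # Sorting: highest value of sort_key first.
    ordered = sorted(unique.values(), key=lambda r: r[sort_key], reverse=True)
    return ordered, index
```

Keeping the index alongside the deduplicated list means a record that belongs to several categories is shown once while its category memberships remain available for display.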
The data processing method of the classification interactive interface of this embodiment of the invention uses the classification category data structure formed during data classification, together with the classification data formed from the classification categories and the classification source data. Combined with display technology, it provides timely classification, sorting and merging, and data-matching display for massive retrieved data. This avoids the defects of existing classification interactive interfaces, in which data processing and the interaction process interfere with the search process, the match between the data and the classification dimensions is limited, and data cannot be quickly located or combined by matching. The search classification process in the interactive interface can be combined in an orderly way with the classification system formed by the classification category data structure, so that the data information formed by data classification is fully presented during interaction.
The data processing method of the classification interactive interface of one embodiment of the invention is shown in Figure 14. In Figure 14, the process of forming the first topological structure of the classification categories includes:
Step 411: Determine similar keywords in the classification category data according to the first interaction data.
The first interaction data includes a search keyword, a fuzzy keyword, or a text paragraph.
Step 412: Determine the related classification category according to the similar keywords.
Similarity is compared using a text fuzzy-matching algorithm together with the weight parameters of the rules in the classification category data to determine the related classification category.
Step 413: Establish the first topological structure of the classification categories according to the related classification category and the classification category data structure.
Determining the related classification category yields its node position in the classification category data structure together with the associated subordinate or main node positions, from which the topological structure of the associated classification categories is formed.
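Steps 411 to 413 can be sketched in Python as follows. This is only an illustration under stated assumptions: the category table `CATEGORIES`, its keywords and weights, the 0.5 threshold, and the function names are hypothetical, and the standard-library `difflib.SequenceMatcher` stands in for whatever fuzzy-matching algorithm an implementation would actually use.

```python
from difflib import SequenceMatcher

# Hypothetical category data: name -> (parent, keywords, rule weight).
CATEGORIES = {
    "finance": (None, ["finance", "money"], 1.0),
    "loans": ("finance", ["loan", "credit", "mortgage"], 1.2),
    "insurance": ("finance", ["insurance", "policy"], 1.0),
    "car insurance": ("insurance", ["car insurance", "auto policy"], 1.1),
}

def similarity(a, b):
    """Fuzzy text similarity in [0, 1]."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()

def related_category(query, threshold=0.5):
    """Steps 411-412: pick the category whose keywords best match the query,
    scaling the best keyword similarity by the category's rule weight."""
    best, best_score = None, threshold
    for name, (_, keywords, weight) in CATEGORIES.items():
        score = weight * max(similarity(query, kw) for kw in keywords)
        if score > best_score:
            best, best_score = name, score
    return best

def first_topology(category):
    """Step 413: collect the category, its ancestors, and its descendants.
    Returns a dict mapping each node to its parent."""
    ancestors = []
    parent = CATEGORIES[category][0]
    while parent is not None:
        ancestors.append(parent)
        parent = CATEGORIES[parent][0]
    subtree = {category}
    grew = True
    while grew:  # repeat until no further descendant is found
        grew = False
        for name, (par, _, _) in CATEGORIES.items():
            if par in subtree and name not in subtree:
                subtree.add(name)
                grew = True
    return {name: CATEGORIES[name][0] for name in subtree.union(ancestors)}
```

Sibling categories of the matched node (here `loans`) are deliberately left out of the topology, since only the matched node's own upward and downward positions are described in the text.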
The data processing method of the classification interactive interface of one embodiment of the invention is shown in Figure 14. In Figure 14, the process of displaying the classification category keywords in order includes:
Step 421: Form, from the classification category keywords in the first topological structure of the classification categories and according to that topological structure, first display data with an optimized topological structure.
According to the classification category data structure in the data classification embodiment above, the first topological structure of the classification categories includes classification information such as the keywords, classification rules, and classification weights corresponding to each classification category.
Step 422: Display the first display data according to the data display strategy of the display frame.
The first display data includes the topological mapping structure among the classification category keywords, for example a tree data structure among terms.
The data processing method of the classification interactive interface of one embodiment of the invention is shown in Figure 14. In Figure 14, the process of forming the first classification result data set includes:
Step 431: Determine the classification categories in the first topological structure of the classification categories.
The classification categories can be obtained from each data node of the first topological structure of the classification categories.
Step 432: Determine the corresponding classification data according to the classification categories.
According to the classification category data structure in the data classification embodiment above, the source data can be formed into classification data.
Step 433: Deduplicate and merge the classification data to form the first classification result data set.
Because classification features are diverse, the classification data contains redundant data.
Step 434: Display the first classification result data set according to the data display strategy of the display frame.
The first classification result data set forms the displayed content as query or search result data.
The data processing method of the classification interactive interface of one embodiment of the invention is shown in Figure 14. In Figure 14, the process of forming the second topological structure of the classification categories includes:
Step 441: Receive an interactive selection of classification category keywords.
The interactive selection consists of selecting keywords of classification categories, including adding or excluding them.
Step 442: Determine the classification category combination logic according to the interactive selection and form the second topological structure of the classification categories.
The selection result formed after the keywords are selected is judged for the data structure node to which it belongs, and the topological structure is formed accordingly.
Step 443: Form second display data with an optimized topological structure according to the second topological structure of the classification categories, and display it according to the data display strategy of the display frame.
The second display data includes the topological mapping structure among the classification category keywords, for example a tree data structure among terms.
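One possible reading of steps 441 and 442 is sketched below in Python: the user's added and excluded keywords are resolved to nodes of the first topological structure, and the second topological structure keeps only the paths consistent with that selection. The function names, the node-to-parent dictionary representation, and the rule that excluding a node also drops everything beneath it are assumptions for illustration, not details given in the text.

```python
def ancestors(tree, node):
    """Return the chain of parents of `node`, nearest first.
    `tree` maps each node to its parent; the root maps to None."""
    chain = []
    parent = tree.get(node)
    while parent is not None:
        chain.append(parent)
        parent = tree.get(parent)
    return chain

def second_topology(first, include=(), exclude=()):
    """Form a second topology from the first one and an interactive selection.

    include: keywords the user added; if empty, every node is a candidate.
    exclude: keywords the user excluded; a node is dropped when it or any
             of its ancestors is excluded.
    """
    excluded = set(exclude)
    kept = set()
    candidates = include if include else list(first)
    for node in candidates:
        path = [node] + ancestors(first, node)
        if excluded.isdisjoint(path):
            kept.update(path)  # keep ancestors so the tree stays connected
    return {node: first[node] for node in first if node in kept}
```

For example, with a first topology of `finance` > `insurance` > `car insurance`, including only `car insurance` keeps that node together with its two ancestors, while excluding `insurance` removes it and everything beneath it.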
The data processing method of the classification interactive interface of one embodiment of the invention is shown in Figure 14. In Figure 14, the process of forming the second classification result data set includes:
Step 451: Determine the corresponding classification data, following changes in the second topological structure of the classification categories.
According to the classification category data structure in the data classification embodiment above, the source data can be formed into classification data.
Step 452: Deduplicate and merge the classification data to form the second classification result data set.
Because classification features are diverse, the classification data contains redundant data.
Step 453: Following the display of the second display data, display the second classification result data set according to the data display strategy of the display frame.
The data processing device of the classification interactive interface of one embodiment of the invention includes:
A memory, for storing program code corresponding to the data processing method of the classification interactive interface of the above embodiment;
A processor, for executing the program code of the data processing method of the classification interactive interface of the above embodiment.
The data processing device of the classification interactive interface of one embodiment of the invention is shown in Figure 15. In Figure 15, this embodiment includes:
First structure forming module 4410, for retrieving the classification category data according to the first interaction data and forming the first topological structure of the classification categories for the related classification category;
First structure display module 4420, for displaying the classification category keywords in order according to the first topological structure of the classification categories;
First data presentation module 4430, for forming, according to the first topological structure of the classification categories, a first classification result data set adapted to the search results, and displaying the classification result data in order;
Second structure forming module 4440, for forming classification category combination logic according to the second interaction data, forming the second topological structure of the classification categories according to the combination logic, and displaying the classification category keywords in order according to the second topological structure;
Second data presentation module 4450, for forming, according to the second topological structure of the classification categories, a second classification result data set adapted to the search results, and displaying the classification result data in order.
As shown in Figure 15, in one embodiment of the invention, the first structure forming module 4410 includes:
Similar vocabulary determination unit 4411, for determining similar keywords in the classification category data according to the first interaction data;
Category determination unit 4412, for determining the related classification category according to the similar keywords;
First topology establishing unit 4413, for establishing the first topological structure of the classification categories according to the related classification category and the classification category data structure.
As shown in Figure 15, in one embodiment of the invention, the first structure display module 4420 includes:
First display planning unit 4421, for forming, from the classification category keywords in the first topological structure of the classification categories and according to that topological structure, first display data with an optimized topological structure;
First display transmission unit 4422, for displaying the first display data according to the data display strategy of the display frame.
As shown in Figure 15, in one embodiment of the invention, the first data presentation module 4430 includes:
Category determination unit 4431, for determining the classification categories in the first topological structure of the classification categories;
Classification data determination unit 4432, for determining the corresponding classification data according to the classification categories;
Classification data integration unit 4433, for deduplicating and merging the classification data to form the first classification result data set;
Classification data transmission unit 4434, for displaying the first classification result data set according to the data display strategy of the display frame.
As shown in Figure 15, in one embodiment of the invention, the second structure forming module 4440 includes:
Interaction establishing unit 4441, for receiving an interactive selection of classification category keywords;
Second topology determination unit 4442, for determining the classification category combination logic according to the interactive selection and forming the second topological structure of the classification categories;
Second data transmission unit 4443, for forming second display data with an optimized topological structure according to the second topological structure of the classification categories and displaying it according to the data display strategy of the display frame.
As shown in Figure 15, in one embodiment of the invention, the second data presentation module 4450 includes:
Second category determination unit 4451, for determining the corresponding classification data, following changes in the second topological structure of the classification categories;
Second data integration unit 4452, for deduplicating and merging the classification data to form the second classification result data set;
Data set transmission unit 4453, for displaying the second classification result data set according to the data display strategy of the display frame, following the display of the second display data.
In one embodiment of the invention, the processor may be a digital signal processor (DSP), a field-programmable gate array (FPGA), a microcontroller unit (MCU) system board, a system on a chip (SoC) system board, or a programmable logic controller (PLC) minimum system including I/O.
The foregoing is only a preferred embodiment of the present invention, but the scope of protection of the invention is not limited to it. Any change or substitution that a person skilled in the art could readily conceive within the technical scope disclosed by the invention shall fall within the scope of protection of the invention. Therefore, the scope of protection of the invention shall be subject to the scope of protection of the claims.

Claims (10)

1. A method for optimizing big data classification rules, characterized by comprising:
Establishing a data structure for storing rules;
Forming a rule set for a determined theme by means of the data structure;
Performing scene classification on source data according to the rule set.
2. The method for optimizing big data classification rules according to claim 1, characterized by further comprising:
Parsing the rule set and updating the stored rules.
3. The method for optimizing big data classification rules according to claim 2, characterized by further comprising:
Parsing the rule set, obtaining part of the data structure and rules, and forwarding them.
4. The method for optimizing big data classification rules according to claim 3, characterized in that establishing the data structure for storing rules comprises:
Setting the structure fields of a single rule;
Setting the field keywords and compound field keywords of a single rule;
Forming the data structure of the rule according to the field topological structure of the single rule.
5. The method for optimizing big data classification rules according to claim 3, characterized in that forming the rule set for the determined theme by means of the data structure comprises:
Establishing a rule set keyword;
Establishing rule set subsets of the rule set and corresponding subset keywords;
Establishing scene sets of the rule set subsets and corresponding scene keywords;
Establishing a keyword topological structure according to the rule set keyword, the subset keywords, or the scene keywords;
Adding the data structure of a rule to the keyword topological structure, and adding rule parameters or associated parameters to the data structure of the rule.
6. The method for optimizing big data classification rules according to claim 3, characterized in that performing scene classification on source data according to the rule set comprises:
Determining corresponding preliminary source data according to a preliminary classification;
Determining, among the rule sets, the rule set corresponding to the preliminary classification;
Extracting scene classification categories and scene classification parameter data from the rule set;
Classifying the preliminary source data using the scene classification parameter data to form the classification source data corresponding to each scene classification category under the determined preliminary classification;
Forming the result classification data of the preliminary classification according to the classification source data.
7. The method for optimizing big data classification rules according to claim 3, characterized in that parsing the rule set and updating the stored rules comprises:
Obtaining an update requirement for a determined classification and a determined scene;
Determining the rule data structure and rule content according to the update requirement, and displaying them through an interactive frame;
Updating the update content into the corresponding rule data structure and rule content through the interactive frame.
8. The method for optimizing big data classification rules according to claim 3, characterized in that parsing the rule set, obtaining part of the data structure and rules, and forwarding them comprises:
Obtaining, according to a forwarding requirement, the associated topological structure of the data structure of a determined classification and a determined scene;
Forming a temporary rule data structure and rule content according to the associated topological structure;
Forming the temporary rule data structure and rule content into an independent data object and providing a data link.
9. A device for optimizing big data classification rules, characterized by comprising:
A memory, for storing program code corresponding to the processing of the method for optimizing big data classification rules according to any one of claims 1 to 8;
A processor, for executing the program code.
10. A device for optimizing big data classification rules, characterized by comprising:
A rule setting module, for establishing a data structure for storing rules;
A rule forming module, for forming a rule set for a determined theme by means of the data structure;
A rule application module, for performing scene classification on source data according to the rule set.
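As an informal illustration of the rule-set structure and scene classification described in claims 4 to 6, the following Python sketch nests rules under a rule set keyword, subset keywords, and scene keywords, and then applies them to source data. It is not part of the claims. The dictionary layout, the record fields `category` and `text`, the substring-based matching, and the function names are all assumptions chosen for brevity.

```python
def rule_matches(rule, record):
    """A rule matches when every structure field of the rule finds at least
    one of its field keywords in the corresponding field of the record."""
    return all(
        any(keyword in record.get(field_name, "") for keyword in keywords)
        for field_name, keywords in rule["fields"].items()
    )

def classify_scenes(source, rule_sets, preliminary):
    """Scene classification sketch.

    source:      list of records, each with a 'category' (preliminary
                 classification) and free-text fields (hypothetical layout).
    rule_sets:   rule set keyword -> subset keyword -> scene keyword -> rules.
    preliminary: the preliminary classification to process.
    Returns a dict mapping (subset keyword, scene keyword) to matching records.
    """
    # Preliminary source data for the chosen preliminary classification.
    preliminary_data = [r for r in source if r["category"] == preliminary]
    # Rule set corresponding to the preliminary classification.
    rule_set = rule_sets[preliminary]
    result = {}
    for subset_keyword, scenes in rule_set.items():
        for scene_keyword, rules in scenes.items():
            # A record belongs to a scene when any rule of that scene matches.
            result[(subset_keyword, scene_keyword)] = [
                record for record in preliminary_data
                if any(rule_matches(rule, record) for rule in rules)
            ]
    return result
```

A rule here is a plain dictionary such as `{"fields": {"text": ["refund"]}}`; a record may fall under several scenes, which is consistent with the deduplication step described earlier in the description.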
CN201910280279.6A 2019-04-09 2019-04-09 A kind of optimization method and device of big data classifying rules Pending CN110096519A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910280279.6A CN110096519A (en) 2019-04-09 2019-04-09 A kind of optimization method and device of big data classifying rules

Publications (1)

Publication Number Publication Date
CN110096519A true CN110096519A (en) 2019-08-06

Family

ID=67444547

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910280279.6A Pending CN110096519A (en) 2019-04-09 2019-04-09 A kind of optimization method and device of big data classifying rules

Country Status (1)

Country Link
CN (1) CN110096519A (en)

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1889582A (en) * 2005-06-30 2007-01-03 华为技术有限公司 Method for conducting sorting to multi-protocol tag exchange business stream
CN102414677A (en) * 2009-04-22 2012-04-11 微软公司 Data classification pipeline including automatic classification rules
CN103678447A (en) * 2012-09-04 2014-03-26 Sap股份公司 Multivariate transaction classification
CN103729428A (en) * 2013-12-25 2014-04-16 中国科学院计算技术研究所 Big data classification method and system
US20150278313A1 (en) * 2005-05-24 2015-10-01 International Business Machines Corporation Tagging of facet elements in a facet tree
CN107704869A (en) * 2017-09-01 2018-02-16 厦门快商通科技股份有限公司 A kind of corpus data methods of sampling and model training method

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
张明卫: "一种大数据环境中分布式辅助关联分类算法", 《软件学报》 *

Cited By (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111881287A (en) * 2019-09-10 2020-11-03 马上消费金融股份有限公司 Classification ambiguity analysis method and device
CN111881287B (en) * 2019-09-10 2021-08-17 马上消费金融股份有限公司 Classification ambiguity analysis method and device
CN113271232A (en) * 2020-10-27 2021-08-17 苏州铁头电子信息科技有限公司 Online office network disturbance processing method and device
CN112445810A (en) * 2020-12-11 2021-03-05 中国人寿保险股份有限公司 Data updating method and device for data warehouse, electronic device and storage medium
CN112800138A (en) * 2021-02-04 2021-05-14 广东云曌医疗科技有限公司 Big data classification method and system
CN112800138B (en) * 2021-02-04 2021-10-15 广东云曌医疗科技有限公司 Big data classification method and system
CN113190650A (en) * 2021-04-21 2021-07-30 武汉卓尔信息科技有限公司 Method and system for screening big data of industrial product
CN114168075A (en) * 2021-11-29 2022-03-11 华中科技大学 Method, equipment and system for improving load access performance based on data relevance
CN114860797A (en) * 2022-03-16 2022-08-05 电子科技大学 Data derivation processing method

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20190806