CN110096519A - A kind of optimization method and device of big data classifying rules - Google Patents
A kind of optimization method and device of big data classifying rules
- Publication number: CN110096519A
- Application number: CN201910280279.6A
- Authority: CN (China)
- Prior art keywords: data, classification, rule, scene, keyword
- Legal status: Pending (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/24—Querying
- G06F16/245—Query processing
- G06F16/2455—Query execution
- G06F16/24564—Applying rules; Deductive queries
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/28—Databases characterised by their database models, e.g. relational or object models
- G06F16/284—Relational databases
- G06F16/285—Clustering or classification
Abstract
The present invention provides a method and device for optimizing big data classification rules, solving the technical problem that existing rule customization processes lack effective reuse and improvement. The method includes: establishing a data structure for storing rules; forming a rule set for a given theme through the data structure; and performing scene classification on source data according to the rule set. The data structure forms a complete topology of theme classification and scene classification and keeps the parameters of the different classification processes in an ordered structure, so that the complex parameter invocation of a big data classification model becomes a structured classification system. As a result, the classification iteration parameters, key enumeration data, and classification threshold adjustment ranges in the model can be managed effectively and configured reasonably. The big data classification model thus gains a technical means of moderate adjustment in response to changes in source data features, which avoids frequently adjusting the training set and learning process as the objective scenes that generate the source data evolve, and reduces the time cost and labor cost of big data classification.
Description
Technical field
The present invention relates to the technical field of data classification, and in particular to a method and device for optimizing big data classification rules.
Background art
In the prior art, big data classification obtains the similar features among data through an effective data classification method, divides the data into different categories according to those features, and performs further feature processing by category. Among data classification methods, self-learning classification methods are limited by the training set: the resulting classification model is closely tied to a particular big data industry field and is not general. They are also limited by the labor cost of preparing training data and by the training time cost, so for flexible data classification scenes they can only process fixed cluster categories efficiently and cannot adjust in time to changes in data classification types as the source data evolves. The classification rules of the model therefore lack systematicness and continuity, the rule customization process offers no possibility of effective reuse and improvement, and no general means of customizing classification rules can be formed for changes in data classification themes and scenes. This directly causes the corresponding data processing model, model composition data, and model addressing framework in the memory environment to lack effective addressing and operation capability, so that data and model addressing cannot be reused effectively, which greatly increases the various costs of rebuilding the model.
Summary of the invention
In view of the above problems, embodiments of the present invention provide a method and device for optimizing big data classification rules, solving the technical problem that existing rule customization processes lack effective reuse and improvement.
The method for optimizing big data classification rules of an embodiment of the present invention comprises:
Establishing a data structure for storing rules;
Forming a rule set for a given theme through the data structure;
Performing scene classification on source data according to the rule set.
In one embodiment of the invention, the method further comprises:
Parsing the rule set and updating the stored rules.
In one embodiment of the invention, the method further comprises:
Parsing the rule set, obtaining part of the data structure and rules, and forwarding them.
In one embodiment of the invention, establishing the data structure for storing rules includes:
Setting the structure fields of a single rule;
Setting the field keywords and compound field keywords of the single rule;
Forming the data structure of the rule according to the field topology of the single rule.
In one embodiment of the invention, forming the rule set for a given theme through the data structure includes:
Establishing a rule set keyword;
Establishing rule set subsets of the rule set and corresponding subset keywords;
Establishing scene sets of the rule set subsets and corresponding scene keywords;
Establishing a keyword topology according to the rule set keyword, the subset keywords, or the scene keywords;
Adding the data structures of rules to the keyword topology, and adding rule parameters or associated parameters to the data structure of each rule.
In one embodiment of the invention, performing scene classification on the source data according to the rule set includes:
Determining the corresponding preliminary source data according to a preliminary classification;
Determining, within the rule set, the rule set corresponding to the preliminary classification;
Extracting scene classification categories and scene classification parameter data from the rule set;
Classifying the preliminary source data using the scene classification parameter data, to form the classified source data corresponding to each scene classification category under the given preliminary classification;
Forming the result classification data of the preliminary classification according to the classified source data.
In one embodiment of the invention, parsing the rule set and updating the stored rules includes:
Obtaining an update demand for a given classification and a given scene;
Determining the rule data structure and rule content according to the update demand, and displaying them through an interactive frame;
Writing the updated content into the corresponding rule data structure and rule content through the interactive frame.
In one embodiment of the invention, parsing the rule set, obtaining part of the data structure and rules, and forwarding them includes:
Obtaining, according to a forwarding demand, the associated topology of the data structures of a given classification and a given scene;
Forming a temporary rule data structure and rule content according to the associated topology;
Forming the temporary rule data structure and rule content into an independent data object and providing a data link.
The big data classification rule optimization device of an embodiment of the present invention comprises:
Memory, for storing the program code corresponding to the processing of the big data classification rule optimization method of any one of claims 1 to 8;
Processor, for executing the program code.
The big data classification rule optimization device of an embodiment of the present invention comprises:
Rule setting module, for establishing a data structure for storing rules;
Rule forming module, for forming a rule set for a given theme through the data structure;
Rule application module, for performing scene classification on source data according to the rule set.
The big data classification rule optimization method and device of the embodiments of the present invention use the data structure to form a complete topology of theme classification and scene classification and keep the parameters of the different classification processes in an ordered structure, so that the complex parameter invocation of a big data classification model becomes a structured classification system. The classification iteration parameters, key enumeration data, and classification threshold adjustment ranges in the model can therefore be managed effectively and configured reasonably. The big data classification model gains a technical means of moderate adjustment in response to changes in source data features, which avoids frequently adjusting the training set and learning process as the objective scenes that generate the source data evolve, reduces the time cost and labor cost of big data classification, and improves the application generality and tolerance of the model.
Brief description of the drawings
Fig. 1 is a flow diagram of the big data classification rule optimization method of one embodiment of the invention.
Fig. 2 is a flow diagram of establishing the data structure in the big data classification rule optimization method of one embodiment of the invention.
Fig. 3 is a flow diagram of establishing the rule set in the big data classification rule optimization method of one embodiment of the invention.
Fig. 4 is a flow diagram of performing scene classification in the big data classification rule optimization method of one embodiment of the invention.
Fig. 5 is a flow diagram of the data processing performed on rules in the big data classification rule optimization method of one embodiment of the invention.
Fig. 6 is an architecture diagram of the big data classification rule optimization device of one embodiment of the invention.
Fig. 7 is a flow diagram of the classification optimization method for data display of one embodiment of the invention.
Fig. 8 is a flow diagram of forming the classification data structure, establishing classification categories, and associating related data in the classification optimization method for data display of one embodiment of the invention.
Fig. 9 is an architecture diagram of the classification optimization device for data display of one embodiment of the invention.
Fig. 10 is a main flow diagram of the data classification optimization method of one embodiment of the invention.
Fig. 11 is a detailed flow diagram of the data classification optimization method of one embodiment of the invention.
Fig. 12 is an architecture diagram of the data classification optimization device of one embodiment of the invention.
Fig. 13 is a main flow diagram of the data processing method for a classification interactive interface of one embodiment of the invention.
Fig. 14 is a detailed flow diagram of the data processing method for a classification interactive interface of one embodiment of the invention.
Fig. 15 is an architecture diagram of the data processing device for a classification interactive interface of one embodiment of the invention.
Detailed description of the embodiments
To make the objectives, technical solutions, and advantages of the present invention clearer, the invention is further described below with reference to the drawings and specific embodiments. Obviously, the described embodiments are only some of the embodiments of the present invention, not all of them. All other embodiments obtained by those of ordinary skill in the art based on the embodiments of the present invention without creative effort fall within the protection scope of the present invention. The numbering of the steps is unrelated to the order of data processing.
The big data classification rule optimization method of one embodiment of the invention is shown in Fig. 1. In Fig. 1, the present embodiment includes:
Step 110: Establish a data structure for storing rules.
The data structure stores the specific parameters of rules. The classification benchmark that each rule defines for the big data source data is determined from those parameters, and the benchmarks together form the classification process.
A rule can be a numerical value, character string, feature vector value, data reference location, data judgment logic, or rule verification data of a specific parameter or associated parameter. The data structure is universally adaptable to the specific parameters and associated parameters of rules, which ensures that it accommodates their types and magnitudes.
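The rule storage described in step 110 can be illustrated with a minimal Python sketch. It is not the data structure of the patent: the class names `RuleParam` and `Rule`, the `kind` labels, and the helper methods are hypothetical, chosen only to show one container accepting parameters of different types.

```python
from dataclasses import dataclass, field
from typing import Union

# Hypothetical parameter kinds. The text lists numerical values, strings,
# feature vectors, data reference locations, judgment logic and verification
# data; a few of them are modelled here as plain Python values.
ParamValue = Union[float, str, list, dict]


@dataclass
class RuleParam:
    kind: str          # assumed label, e.g. "weight", "keyword", "vector"
    value: ParamValue  # the stored parameter data


@dataclass
class Rule:
    name: str
    params: list = field(default_factory=list)

    def add(self, kind: str, value: ParamValue) -> "Rule":
        """Append one parameter and return the rule so calls can be chained."""
        self.params.append(RuleParam(kind, value))
        return self

    def values(self, kind: str) -> list:
        """Return the values of all parameters of the given kind, in order."""
        return [p.value for p in self.params if p.kind == kind]


rule = Rule("e-commerce").add("weight", 1.0).add("keyword", "B2B").add("keyword", "O2O")
```

Because every parameter carries its own kind label, the same container can hold a weight, a keyword list, or a feature vector without changing the structure, which is one possible reading of the universal adaptability described above.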
Step 120: Form a rule set for a given theme through the data structure.
The evolution of the source data matches the changes in the processes that produce it in the objective environment. Changes in classification scenes reflect the set of data evolution processes of the source data under a classification theme. The classification scenes determine the classification features of the theme, and a classification theme contains further, finer classification scenes. The data structure stores the specific parameters or associated parameters corresponding to each classification scene, and the rules corresponding to the classification scenes are assembled through the data structure into the rule set of a single theme.
Step 130: Perform scene classification on source data according to the rule set.
The source data is first classified by theme according to the rule set, and on the basis of that theme classification the data is further classified into the different scenes within the theme.
Scene classification is carried out according to the types and meanings of the rule parameters in the rule set and the judgment logic formed among the parameters. During scene classification, the rule parameters in the rule set can be used for data processing such as retrieval, sorting, or exclusion.
The big data classification rule optimization method of this embodiment uses the data structure to form a complete topology of theme classification and scene classification and keeps the parameters of the different classification processes in an ordered structure, so that the complex parameter invocation of a big data classification model becomes a structured classification system. The classification iteration parameters, key enumeration data, and classification threshold adjustment ranges in the model can therefore be managed effectively and configured reasonably. The big data classification model gains a technical means of moderate adjustment in response to changes in source data features, which avoids frequently adjusting the training set and learning process as the objective scenes that generate the source data evolve, reduces the time cost and labor cost of big data classification, and improves the application generality and tolerance of the model.
As shown in Fig. 1, in one embodiment of the invention, the big data classification rule optimization method further includes:
Step 140: Parse the rule set and update the stored rules.
Parsing and updating means reading the data structure of the rule set, locating the parameters or associated parameters of a given rule through the data structure, updating that rule by modifying the parameter content, and thereby updating the rule set of the given theme.
The big data classification rule optimization method of this embodiment provides a locate-amend-replace update process for local rules of a big data classification model. While preserving the integrity of the existing model, it provides a technical means of amending classification parameters to fit the source data. Changes in the data types or trends of the source data can be reflected in the model by timely, appropriate adjustments, while the overall trend and accuracy of big data classification remain stable.
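The locate-amend-replace update can be sketched as follows. This is only an illustration under assumptions: the rule set is modelled as a nested Python dictionary keyed by keywords, and the function name `update_rule`, the "finance" scene, and the "bank" keyword are hypothetical.

```python
import copy


def update_rule(rule_set: dict, path: tuple, new_params: dict) -> dict:
    """Return a copy of rule_set in which only the rule at `path` is changed."""
    updated = copy.deepcopy(rule_set)   # the original model stays intact
    node = updated
    for key in path[:-1]:               # locate: walk the keyword topology
        node = node[key]
    amended = dict(node[path[-1]])      # amend: modify the parameter content
    amended.update(new_params)
    node[path[-1]] = amended            # replace: put the amended rule back
    return updated


rule_set = {
    "business": {
        "e-commerce": {"shop": 0.5, "B2B": 0.5},
        "finance": {"bank": 1.0},       # hypothetical neighbouring scene
    }
}
new_set = update_rule(rule_set, ("business", "e-commerce"), {"shop": 0.8})
```

Only the rule addressed by the keyword path changes. Sibling rules and the original rule set are left untouched, which mirrors the integrity requirement stated above.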
As shown in Fig. 1, in one embodiment of the invention, the big data classification rule optimization method further includes:
Step 150: Parse the rule set, obtain part of the data structure and rules, and forward them.
Parsing and obtaining means reading the data structure of the rule set, obtaining from the big data classification model the data topology of the partial classification and scene classification for a specific type of data together with the stored rule parameters, and forming a rule set for a given theme.
The big data classification rule optimization method of this embodiment provides an extraction process for local rules of a big data classification model. It ensures that the classification rules for a specific classification type or classification scene can be extracted accurately from the rules of a given classification in the existing model, achieving effective reuse of reliable rule subsets from a mature model. Reusing rule subsets helps improve the classification rules of other, newly formed big data classification models, so that a model can be adapted in time to the commercial or application field of its source data and meet the specificity and accuracy requirements of big data classification in business scenarios. It also gives a good data response when combining business data, professional knowledge graphs, and search structures in a business environment.
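Extracting a rule subset for reuse can be sketched in the same spirit. The nested-dictionary layout, the function name `extract_subset`, and the sample themes and keywords are assumptions made for illustration. JSON is used as the forwarding format only because the text names it as one possible standard structure.

```python
import copy
import json


def extract_subset(rule_set: dict, path: tuple) -> dict:
    """Copy the sub-tree at `path`, keeping the keywords that lead to it."""
    node = rule_set
    for key in path:
        node = node[key]
    subset = copy.deepcopy(node)        # independent of the source model
    for key in reversed(path):          # rebuild the partial topology
        subset = {key: subset}
    return subset


model_rules = {
    "business": {
        "e-commerce": {"shop": 0.5, "B2B": 0.5},
        "finance": {"bank": 1.0},       # hypothetical scene
    },
    "society": {"education": {"college entrance examination": 1.0}},
}

subset = extract_subset(model_rules, ("business", "e-commerce"))
payload = json.dumps(subset, ensure_ascii=False)  # standalone object to forward
```

The extracted object carries its own keyword path, so a receiving model can merge it at the matching position without access to the rest of the source rule set.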
The process of establishing the data structure in the big data classification rule optimization method of an embodiment of the invention is shown in Fig. 2. In Fig. 2, the process includes:
Step 111: Set the structure fields of a single rule.
Structure fields set the data types that match the specific parameters or associated parameters of the rule and satisfy the formal requirements of data storage. Containment relations exist between structure fields.
Step 112: Set the field keywords and compound field keywords of the single rule.
Field keywords indicate the field type and make a single structure field more readable. Compound field keywords make compound structure fields more readable, so that people can read and understand them.
Step 113: Form the data structure of the rule according to the field topology of the single rule.
The data structure can use a standard structure from the prior art, such as the XML standard or the JSON standard. Taking the XML (Extensible Markup Language) data structure standard as an example, structure fields are formed using the XML standard, keywords are set for the structure fields, and the keywords are registered in the XML standard to form a definite field topology, which in turn forms the data structures of rules at each level. The basic form of the data structure can be as follows:
<theme rule keyword><
Rule 1 ((parameter type parameter data), (parameter type parameter data), (parameter type parameter data))
Rule 2 ((parameter type parameter data), (parameter type parameter data), (parameter type parameter data))
……
></theme rule keyword>
Example 1
<[1.0]<accrue>("electric business", "e-commerce", "difference quotient") // classification keywords "electric business", "e-commerce", "difference quotient"; weight coefficient 1.0 in the classification result
[0.5]<accrue>("shop", "B2B", "O2O") // classification keywords "shop", "B2B", "O2O"; weight coefficient 0.5 in the classification result
>
Example 2
<([1.0] "electric business", [1.0] "e-commerce", [1.0] "difference quotient", [0.5] "shop", [0.5] "B2B", [0.5] "O2O") // classification keywords "electric business", "e-commerce", "difference quotient", "shop", "B2B" and "O2O" have weight coefficients [1.0], [1.0], [1.0], [0.5], [0.5] and [0.5] respectively
>
The big data classification rule optimization method of this embodiment establishes the basic data structure from keywords and the keyword topology, adapts the data structure to the types of the rule parameters, and improves the readability of the data structure through keywords.
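The weighted keyword notation of Example 1 can be read by a short parser. The sketch below is an assumption-laden illustration, not the parser of the patent: it treats `<accrue>` as "add the group weight for each keyword that occurs", which the text does not define, and the names `GROUP`, `parse_rule` and `score` are hypothetical.

```python
import re

# Matches one weighted group such as: [0.5]<accrue>("shop", "B2B")
GROUP = re.compile(r'\[(\d+(?:\.\d+)?)\]\s*<accrue>\s*\(([^)]*)\)')


def parse_rule(text: str) -> dict:
    """Return {keyword: weight} for every weighted group found in text."""
    weights = {}
    for weight, body in GROUP.findall(text):
        for keyword in re.findall(r'"([^"]*)"', body):
            weights[keyword.strip()] = float(weight)
    return weights


def score(document: str, weights: dict) -> float:
    """Assumed semantics: sum the weights of the keywords present in the text."""
    return sum(w for k, w in weights.items() if k in document)


rule_text = '''
[1.0]<accrue>("electric business", "e-commerce", "difference quotient")
[0.5]<accrue>("shop", "B2B", "O2O")
'''
weights = parse_rule(rule_text)
total = score("A B2B shop for e-commerce", weights)
```

With these inputs `total` is 2.0: "e-commerce" contributes 1.0, and "shop" and "B2B" contribute 0.5 each. A threshold on this score is one way the weight coefficients could enter a classification decision.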
The process of establishing the rule set in the big data classification rule optimization method of one embodiment of the invention is shown in Fig. 3. In Fig. 3, the process includes:
Step 121: Establish a rule set keyword.
Step 122: Establish rule set subsets of the rule set and corresponding subset keywords.
Step 123: Establish scene sets of the rule set subsets and corresponding scene keywords.
Step 124: Establish a keyword topology according to the rule set keyword, the subset keywords, or the scene keywords.
Step 125: Add the data structures of rules to the keyword topology, and add rule parameters or associated parameters to the data structure of each rule.
Taking the XML (Extensible Markup Language) data structure standard as an example, themes and their corresponding rule parameters are added to the data structure formed in the above embodiment to form the following rule set:
<rule set name>
<rule name>
<rule scene name><
Rule 1
Rule 2
…>
Example 1
<topic name="other"><[cdata (
{political economy},
{business},
{society},
)]>
<topic name="political economy"><[cdata (
{political situation of the time comment},
{macroeconomy},
{politician},
)]>
<topic name="political situation of the time comment"><[cdata (
[1.0] ("economic reform", "political reform", "political reform", "educational reform", "Education equity", "college entrance examination immigrant", "household registration system")
[0.5] ("Gini coefficient", "network public-opinion", "liberalism", "political examination", "political development", "political culture", "nihilism")
)]>
</topic>
The big data classification rule optimization method of this embodiment establishes the topology of classification rules and scene rules according to the keywords at each level and fills in the specific rule parameter data. It keeps the data structure and data content readable while preserving parameter storage efficiency.
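The keyword topology of the example rule set can be mirrored in memory as a small tree. The following is a sketch only: the class name `Topic`, its fields, and the `path_to` helper are hypothetical, and only a few of the keywords from the example are reproduced.

```python
from dataclasses import dataclass, field


@dataclass
class Topic:
    name: str                                      # keyword at this level
    rules: dict = field(default_factory=dict)      # weight -> list of keywords
    children: list = field(default_factory=list)   # nested subsets or scenes

    def path_to(self, name: str):
        """Return the keyword path from this topic to `name`, or None."""
        if self.name == name:
            return [self.name]
        for child in self.children:
            sub = child.path_to(name)
            if sub:
                return [self.name] + sub
        return None


scene = Topic(
    "political situation of the time comment",
    rules={1.0: ["economic reform", "educational reform"],
           0.5: ["Gini coefficient", "political culture"]},
)
root = Topic("other", children=[
    Topic("political economy",
          children=[scene, Topic("macroeconomy"), Topic("politician")]),
    Topic("business"),
    Topic("society"),
])
```

Calling `root.path_to("macroeconomy")` walks rule set, subset, and scene keywords in order and returns the full keyword path, which is the kind of address that an update or extraction step needs.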
The process of scene classification in the big data classification rule optimization method of one embodiment of the invention is shown in Fig. 4. In Fig. 4, the process of scene classification includes:
Step 131: Determine the corresponding preliminary source data according to a preliminary classification.
The preliminary classification can be a basic classification, or a classification produced by a basic data classification model. The preliminary source data is the source data that corresponds to a given classification after the basic classification.
Step 132: Determine, within the rule set, the rule set corresponding to the preliminary classification.
The rule set may include rule (sub)sets for different classifications, and a rule (sub)set may in turn include rule (sub)sets for different fields or for the scene classification categories of a classification.
Step 133: Extract scene classification categories and scene classification parameter data from the rule set.
A scene classification category can have default parameters and can also have customized parameter data.
Step 134: Classify the preliminary source data using the scene classification parameter data, to form the classified source data corresponding to each scene classification category under the given preliminary classification.
The scene classification parameter data serves as the basis for further classification: the preliminary source data is further classified into scene classification categories. Feature threshold judgments exist in the classification process, so some source data may be excluded from every category when the classified source data is formed.
Step 135: Form the result classification data of the preliminary classification according to the classified source data.
The set of classified source data acts as a filter on the preliminary source data, which helps exclude the data noise that appears as the source data evolves.
The big data classification rule optimization method of this embodiment uses the rule set to perform further scene classification on the source data of a given preliminary classification produced by a basic big data classification model. Scene classification filters out deviating data in that source data and forms more accurate classification scenes with their corresponding classified data, which improves the further classification features and classification precision that the basic classification model derives from scene features.
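Steps 133 to 135 can be sketched as a threshold filter over the preliminary source data. The function name `classify_scenes`, the keyword-sum scoring, the default threshold, and the sample scenes are all assumptions made for illustration, since the text does not fix a scoring formula.

```python
def classify_scenes(preliminary_docs, scene_rules, threshold=1.0):
    """Assign each preliminary document to every scene whose score passes.

    preliminary_docs: texts already placed in one preliminary classification.
    scene_rules: {scene: {keyword: weight}} taken from the rule set.
    Returns (documents per scene, documents kept as result data).
    """
    by_scene = {scene: [] for scene in scene_rules}
    for doc in preliminary_docs:
        for scene, weights in scene_rules.items():
            total = sum(w for k, w in weights.items() if k in doc)
            if total >= threshold:      # step 134: feature threshold judgment
                by_scene[scene].append(doc)
    # Step 135: the result data is the union of the classified source data;
    # documents that reached no scene are treated as noise and dropped.
    kept = [d for d in preliminary_docs
            if any(d in docs for docs in by_scene.values())]
    return by_scene, kept


scene_rules = {
    "retail": {"e-commerce": 1.0, "shop": 0.5, "B2B": 0.5},
    "logistics": {"delivery": 1.0},     # hypothetical scene
}
docs = ["e-commerce shop opens", "parcel delivery delayed", "weather report"]
by_scene, kept = classify_scenes(docs, scene_rules)
```

Here "weather report" matches no scene keyword, falls below the threshold for every scene, and is therefore excluded from the result data, which corresponds to the noise filtering described in step 135.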
The process of performing data processing on rules in the big data classification rule optimization method of one embodiment of the invention is shown in Fig. 5. In Fig. 5, the process of updating rules includes:
Step 141: Obtain an update demand for a given classification and a given scene.
Step 142: Determine the rule data structure and rule content according to the update demand, and display them through an interactive frame.
Step 143: Write the updated content into the data structure and rule content of the corresponding rule through the interactive frame.
In Fig. 5, the process of locally obtaining and forwarding rules includes:
Step 151: Obtain, according to a forwarding demand, the associated topology of the data structures of a given classification and a given scene.
Step 152: Form a temporary rule data structure and rule content according to the associated topology.
Step 153: Form the temporary rule data structure and rule content into an independent data object and provide a data link.
The big data classification rule optimization method of this embodiment extracts and updates part of the parameters of a complete big data classification model through the data structure, avoiding damage to the integrity of the model and interference with the data classification process. Classification rules can be updated in a timely manner, which largely accommodates the gradual evolution of the types and field features of the classified source data. A locally extracted set of classification rules, formed into a corresponding data object, can exist as an independent rule data source that provides other big data classification models with reliable data classification rules for related fields and scenes, improving the reusability of reliable data rules and reducing the cost of big data classification.
The big data classification rule optimization device of one embodiment of the invention includes:
Memory, for storing the program code corresponding to the processing of the big data classification rule optimization method of the above embodiments;
Processor, for executing the program code of the big data classification rule optimization method of the above embodiments.
The architecture of the big data classification rule optimization device of one embodiment of the invention is shown in Fig. 6. In Fig. 6, the present embodiment includes:
Rule setting module 1110, for establishing a data structure for storing rules;
Rule forming module 1120, for forming a rule set for a given theme through the data structure;
Rule application module 1130, for performing scene classification on source data according to the rule set.
As shown in Fig. 6, in one embodiment of the invention, the device further includes:
Rule update module 1140, for parsing the rule set and updating the stored rules.
As shown in Fig. 6, in one embodiment of the invention, the device further includes:
Rule extraction module 1150, for parsing the rule set, obtaining part of the data structure and rules, and forwarding them.
As shown in Fig. 6, in one embodiment of the invention, the rule setting module 1110 includes:
Rule field forming unit 1111, for setting the structure fields of a single rule;
Rule keyword forming unit 1112, for setting the field keywords and compound field keywords of the single rule;
Rule structure forming unit 1113, for forming the data structure of the rule according to the field topology of the single rule.
As shown in Fig. 6, in one embodiment of the invention, the rule forming module 1120 includes:
Primary keyword setting unit 1121, for establishing a rule set keyword;
Rule subset setting unit 1122, for establishing rule set subsets of the rule set and corresponding subset keywords;
Rule scene setting unit 1123, for establishing scene sets of the rule set subsets and corresponding scene keywords;
Keyword association setting unit 1124, for establishing a keyword topology according to the rule set keyword, the subset keywords, or the scene keywords;
Rule parameter filling unit 1125, for adding the data structures of rules to the keyword topology and adding rule parameters or associated parameters to the data structure of each rule.
As shown in Fig. 6, in one embodiment of the invention, the rule application module 1130 includes:
Preliminary data acquiring unit 1131, for determining the corresponding preliminary source data according to a preliminary classification;
Rule determination unit 1132, for determining, within the rule set, the rule set corresponding to the preliminary classification;
Rule configuration unit 1133, for extracting scene classification categories and scene classification parameter data from the rule set;
Classification execution unit 1134, for classifying the preliminary source data using the scene classification parameter data, to form the classified source data corresponding to each scene classification category under the given preliminary classification;
Data synthesis unit 1135, for forming the result classification data of the preliminary classification according to the classified source data.
As shown in Fig. 6, in one embodiment of the invention, the rule update module 1140 includes:
Update receiving unit 1141, for obtaining an update demand for a given classification and a given scene;
Update interaction unit 1142, for determining the rule data structure and rule content according to the update demand and displaying them through an interactive frame;
Update receiving unit 1143, for writing the updated content into the data structure and rule content of the corresponding rule through the interactive frame.
As shown in Fig. 6, in one embodiment of the invention, the rule extraction module 1150 includes:
Forwarding demand receiving unit 1151, for obtaining, according to a forwarding demand, the associated topology of the data structures of a given classification and a given scene;
Demand determination unit 1152, for forming a temporary rule data structure and rule content according to the associated topology;
Demand separation unit 1153, for forming the temporary rule data structure and rule content into an independent data object and providing a data link.
The optimization method for big data classification rules structures the classification rules and provides an overall framework for reusing, extending and updating them. It can satisfy the need for general-purpose classification rules across data in related fields, and supports the formation of a classification framework that is general across closely related fields.
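The rule extraction described above (copying the rule of one determined scene into an independent data object that can be passed on) can be sketched as follows. This is a minimal illustration only: the `RULES` dictionary, its keys and the function `extract_rule` are hypothetical names chosen for the example and are not taken from the patent.

```python
import copy
import json

# Hypothetical structured rule set: one theme with per-scene rules.
RULES = {
    "theme": "retail",
    "scenes": {
        "complaint": {"keywords": ["refund", "broken"], "threshold": 0.6},
        "praise": {"keywords": ["great", "thanks"], "threshold": 0.5},
    },
}


def extract_rule(rules: dict, scene: str) -> str:
    """Copy one scene's rule into an independent, serialisable object."""
    temp = {
        "theme": rules["theme"],
        "scene": scene,
        "rule": copy.deepcopy(rules["scenes"][scene]),  # detach from the source structure
    }
    return json.dumps(temp, ensure_ascii=False)


payload = extract_rule(RULES, "complaint")
# Later edits to the stored rule do not affect the extracted copy.
RULES["scenes"]["complaint"]["threshold"] = 0.9
print(payload)
```

Because the extracted object is a deep copy serialised to JSON, it can be forwarded or linked without exposing the rest of the rule structure.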
Optimizing the data structure for the big data classification process can likewise yield a classification optimization method that facilitates data display.
The classification optimization method for data display according to an embodiment of the present invention is shown in Fig. 7. In Fig. 7, this embodiment includes:
Step 210: establish a classification data structure for storing classification categories.
The classification data structure is used to form the topological structure of the classification categories and constitutes the framework basis for all classification categories, so that the framework can accommodate changes in classification categories in complex application fields through combination.
The classification data structure stores the basic parameters of classification categories, including but not limited to the reference source, identifier or name information.
The classification data structure also provides structural features such as a control index for the classification categories.
Step 220: establish and store classification categories through the classification data structure.
According to the classification features of the domain data in the application field, the topological structure of the classification categories is formed through the classification data structure.
Through the classification data structure, the topological structure of the classification categories takes a description format that is readable by users.
Step 230: store the classification rules of the classification categories through the classification data structure.
A classification system is established using the classification data structure, and the generality of the classification system is used to make it compatible with the data structure of the rules. The classification rules of the classification categories are formed through the rule processing procedure of the big data classification rule optimization method of an embodiment of the present invention. Combining that rule processing procedure with this embodiment, a data mapping is established between the data structures so that a data structure can be referenced or reused in part or as a whole.
The classification optimization method for data display according to an embodiment of the present invention uses a general-purpose classification data structure to plan and design the storage of classification categories. Classification categories produced by deep learning and by supervised classification keep a consistent data framework in their storage structure, so that classification category data can be applied and inherited during data processing. Storing the parameters of classification categories in the classification data structure helps display the same features or parameters of different classification categories consistently, and also helps identify classification categories and their corresponding parameters or features. Interactive means for reusing or updating classification categories can be built on this basis, reducing the difficulty of use for data users. This is particularly beneficial for vertical data classification within a single data field.
The method of forming the classification data structure in the classification optimization method for data display according to an embodiment of the present invention is shown in Fig. 8. In Fig. 8, the formation of the classification data structure includes:
Step 211: form the basic framework for classification categories.
The basic framework includes the essential elements needed to form a complete framework, such as the basic data types used to construct the classification data structure and the basic data structure standard.
Step 212: form a classification category set in the basic framework.
The classification category set forms the basis of the classification categories. This basis serves as a data port for exchanging data with other data systems, or as the identifier of a group of classification categories.
Step 213: form classification category levels in the classification category set.
A classification category level identifies the association logic between classification categories and determines the containment relationship between two adjacent classification categories. The classification category levels form the internal association between adjacent classification categories.
Step 214: form the classification category topological structure over the classification category levels in the classification category set.
The classification category topological structure establishes the additional features of the classification category set as a whole and forms the internal association of the classification categories as a whole.
Step 215: form the storage fields of the classification categories in the classification category levels.
Storage fields include but are not limited to numerical values, character strings, feature vector values, data reference locations, data decision logic or rule verification data. The storage fields are adaptable: they can be formed according to the preset conditions of the adaptation rules or according to the data input type. The number of storage fields is not specifically limited and is accommodated by queue-style or array-style memory allocation.
The data structure can use a standard structure from the prior art, such as the XML standard or the JSON standard. Taking XML (Extensible Markup Language) as the data structure standard, for example, structure fields are formed using the XML standard, keywords are set for the structure fields, and the keywords are registered in the XML standard to form a determined field topological structure, which in turn forms the data structure of the rules at each level. The basic form of the data structure can be as follows:
Example 3:
Step 216: store the associated data of the classification categories according to the storage topological structure and the storage fields.
The associated data can be data describing a classification category, such as the feature data of the classification category, or data associated with the classification category, such as its rule data.
By establishing the classification data structure, the classification optimization method for data display according to the embodiment of the present invention forms a domain data classification category framework with good generality across closely related data fields, so that classification categories have a general data processing basis for multiplexing, reuse and rapid updating.
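Steps 211 to 216 can be illustrated with a small sketch of a classification data structure. It is a minimal illustration under stated assumptions, not the patent's own schema: the class names `CategoryNode` and `CategorySet`, their attribute names and the sample categories are hypothetical, and JSON is used only because the text names it as one possible standard.

```python
import json
from dataclasses import dataclass, field, asdict
from typing import Any, Dict, List


@dataclass
class CategoryNode:
    """One classification category at some level of the hierarchy."""
    identifier: str                                        # basic parameter: identifier
    name: str                                              # basic parameter: name
    source: str = ""                                       # basic parameter: reference source
    fields: Dict[str, Any] = field(default_factory=dict)   # storage fields (values, strings, ...)
    children: List["CategoryNode"] = field(default_factory=list)  # next category level

    def add_child(self, child: "CategoryNode") -> "CategoryNode":
        self.children.append(child)
        return child

    def depth(self) -> int:
        """Number of category levels in the subtree rooted at this node."""
        return 1 + max((c.depth() for c in self.children), default=0)


@dataclass
class CategorySet:
    """A set of categories that is addressed as one unit."""
    set_id: str
    roots: List[CategoryNode] = field(default_factory=list)

    def to_json(self) -> str:
        return json.dumps(asdict(self), ensure_ascii=False, indent=2)


business = CategoryNode("c1", "business", source="enterprise hierarchy")
business.add_child(
    CategoryNode("c1.1", "business persona", fields={"keywords": ["manager", "clerk"]})
)
category_set = CategorySet("demo-set", roots=[business])
print(category_set.to_json())
```

Keeping the storage fields in a free-form dictionary mirrors the statement that the number of storage fields is not fixed in advance.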
As shown in Fig. 8, in an embodiment of the present invention, the process of establishing classification categories includes:
Step 221: register a classification identifier through the classification category basic framework.
Registration connects the classification category basic framework to the processing of existing data classification models, forming an addressable data port for data calls.
Step 222: establish a determined classification theme set through the classification category set.
The classification theme set determines the basis of classification within one data field and establishes a data domain or data addressing range that is relatively independent of other data fields. The data domain or data addressing range can be an independent data object, such as a linked data source, a linked independent data file or a linked classification theme set.
Step 223: establish hierarchical classification themes through the classification category levels.
A classification category level carries the hierarchical logic description of two adjacent classification categories, that is, their rank or containment logic. The construction of classification categories is managed through the hierarchical classification themes.
Step 224: establish the substance parameters of a determined theme through the storage fields.
A classification category includes storage fields, and different determined classification categories may include the same or different storage fields.
The classification optimization method for data display according to the embodiment of the present invention uses the classification data structure to form the classification theme set and the classification theme levels, and to store determined classifications and their corresponding parameters in structured form. Complex classification hierarchies can be realized while supporting the extension, reuse and association of classification categories.
As shown in Fig. 8, in an embodiment of the present invention, the process of associating related data includes:
Step 231: determine the classification identifier and the corresponding classification theme set through the classification data structure.
The data interaction port of the classification theme set is obtained through the classification data structure, triggering the interaction process between the basic data framework and the data content.
Step 232: obtain the classification categories of a determined theme through the classification theme set.
The classification categories of a determined theme include the topological structure and the specific field structure of each classification category under that theme.
Step 233: extract the rule set of the determined theme through the data structure of the rules.
A mapping is formed between the data structure of the rules and the classification data structure, completing a coarse-grained combination of the data structures, and the rule set of the determined theme is connected to the classification categories of the determined theme.
Step 234: add the rule set to the association field of the classification categories of the determined theme according to the classification data structure.
Using the mapping between the data structure of the rules and the classification data structure, the data structure of the rules or the rule set is added to the classification data structure, and the corresponding fields and field contents are transferred.
The classification optimization method for data display according to the embodiment of the present invention uses the mapping between the classification data structure and the rule data structure to make the two structures compatible, so that rules and classifications can be stored separately, adjusted independently and associated with each other. The classification process in adjacent data fields can therefore be formed from distributed classification categories and classification rules, which effectively extends the generality of the classification framework and its adaptability to data types.
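The separation of rules from categories described in steps 231 to 234 can be sketched as two independent stores joined by references. This is one possible reading, offered as an assumption: the dictionaries `categories` and `rule_sets`, the `rule_refs` association field and both helper functions are hypothetical names.

```python
from typing import Dict, List

# Categories and rule sets are stored separately, each keyed by its own identifier.
categories: Dict[str, dict] = {
    "c1": {"name": "business", "rule_refs": []},
    "c1.1": {"name": "business persona", "rule_refs": []},
}
rule_sets: Dict[str, dict] = {
    "r-persona": {"keywords": ["manager", "clerk"], "threshold": 0.5},
}


def attach_rule_set(category_id: str, rule_set_id: str) -> None:
    """Add a reference to a rule set in the category's association field."""
    if rule_set_id not in rule_sets:
        raise KeyError(f"unknown rule set: {rule_set_id}")
    refs = categories[category_id]["rule_refs"]
    if rule_set_id not in refs:  # attaching twice has no further effect
        refs.append(rule_set_id)


def rules_for(category_id: str) -> List[dict]:
    """Resolve the stored references back into rule set contents."""
    return [rule_sets[r] for r in categories[category_id]["rule_refs"]]


attach_rule_set("c1.1", "r-persona")
attach_rule_set("c1.1", "r-persona")
rule_sets["r-persona"]["threshold"] = 0.6  # the rule is adjusted without touching the category
print(rules_for("c1.1"))
```

Because the category holds only a reference, a later change to the rule set is visible through the category without rewriting the category itself.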
The classification optimization device for data display according to an embodiment of the present invention includes:
Memory, for storing the program code corresponding to the processing procedure of the classification optimization method for data display of the above embodiment;
Processor, for executing the program code of the processing procedure of the classification optimization method for data display of the above embodiment.
The classification optimization device for data display according to an embodiment of the present invention is shown in Fig. 9. In Fig. 9, this embodiment includes:
Classification structure forming module 2210, for establishing the classification data structure for storing classification categories;
Classification content storage module 2220, for establishing and storing classification categories through the classification data structure;
Classification rule association module 2230, for storing the classification rules of the classification categories through the classification data structure.
As shown in Fig. 9, in an embodiment of the present invention, the classification structure forming module 2210 includes:
Framework forming unit 2211, for forming the classification category basic framework;
Set forming unit 2212, for forming the classification category set in the basic framework;
Level forming unit 2213, for forming the classification category levels in the classification category set;
Overall topology unit 2214, for forming the classification category topological structure over the classification category levels in the classification category set;
Field forming unit 2215, for forming the storage fields of the classification categories in the classification category levels;
Parameter storage unit 2216, for storing the associated data of the classification categories according to the storage topological structure and the storage fields.
As shown in Fig. 9, in an embodiment of the present invention, the classification content storage module 2220 includes:
Framework registering unit 2221, for registering a classification identifier through the classification category basic framework;
Theme registering unit 2222, for establishing a determined classification theme set through the classification category set;
Level registering unit 2223, for establishing hierarchical classification themes through the classification category levels;
Theme determination unit 2224, for establishing the substance parameters of a determined theme through the storage fields.
As shown in Fig. 9, in an embodiment of the present invention, the classification rule association module 2230 includes:
Theme set structure determination unit 2231, for determining the classification identifier and the corresponding classification theme set through the classification data structure;
Classification category determination unit 2232, for obtaining the classification categories of a determined theme through the classification theme set;
Rule determination unit 2233, for extracting the rule set of the determined theme through the data structure of the rules;
Data structure mapping unit 2234, for adding the rule set to the association field of the classification categories of the determined theme according to the classification data structure.
The data classification system formed using the optimization method for big data classification rules of the above embodiment and the classification optimization method for data display of the above embodiment provides a unified structure covering both data classification and classification rules. This unified structure can be used to effectively improve existing data classification methods.
The data classification optimization method according to an embodiment of the present invention is shown in Fig. 10. In Fig. 10, the embodiment includes:
Step 310: form an industry customized classification model using the classification data structure.
Based on the generality, extensibility and data reusability of the classification system formed by the classification data structure, some of the classification categories of a specific industry or specific field, together with their corresponding classification rules, are organized through the classification data structure into an ordered industry customized classification model.
Step 320: classify the source data with the industry customized classification model to form data labels.
Source data has explicit and implicit classifications. Explicit classification features can be defined and described directly in the industry customized classification model, whereas the implicit features underlying implicit classification are not suited to direct definition and description. The industry customized classification model is used to perform explicit classification, and the classification features of each piece of source data are marked to form the corresponding data labels.
Step 330: divide the source data into data training sets according to the data labels.
Using the data labels, source data with the same label type is combined into one data training set, which ensures the consistency of the explicit features.
Step 340: optimize an intelligent classification model with the data training sets and complete the classification of the source data.
Supervised or semi-supervised learning is used to capture the implicit features of the data training sets, which trains and finalizes the intelligent classification model.
The source data can be source data that has not yet been classified, which facilitates the classification of incremental data. The source data can also be the entire source data, which benefits the accuracy of data classification.
The data classification optimization method of the embodiment of the present invention uses the classification data structure to form reusable manual classification rules and combines them with a classification model based on computer intelligence algorithms. The manual classification rules are used to form an accurate large-scale data training set, the large-scale data training set is used to optimize the intelligent classification model, and the intelligent classification model is used to achieve accuracy and efficiency in implicit classification.
The data classification optimization method according to an embodiment of the present invention is shown in Fig. 11. In Fig. 11, the process of forming the industry customized classification model in this embodiment includes:
Step 311: establish the classification categories and their topological structure through the classification data structure according to business requirements.
The classification categories and their topological structure are formed using the standardization and logic-setting capabilities of the classification data structure.
Step 312: establish the corresponding classification rules according to the classification categories.
A classification rule can be a variety of data parameters corresponding to a classification category, such as keywords, that serve as the basis for classification or retrieval.
Step 313: set classification thresholds according to the classification rules.
A classification threshold can be applied to retrieval or search results. Each piece of data in the results has a corresponding relevance score, and the data whose score is higher than the preset classification threshold is assigned to the classification.
Step 314: form the industry customized classification model from the topological structure, the classification rules and the classification thresholds.
The data structure can use the standard structure of the above embodiment or of the prior art, such as the XML standard or the JSON standard. Taking the XML (Extensible Markup Language) data structure standard as an example, structure fields are formed using the XML standard, keywords are set for the structure fields, and the keywords are registered in the XML standard to form a determined field topological structure, which in turn forms the data structure of the rules at each level. The basic form of the data structure can be as follows:
Example 4
A classification system is established according to the hierarchical structure provided by the enterprise. The classification system refers to the hierarchy of the classifications and their specific names, as follows: the first-level classification is "business", and under it is the second-level classification "business persona".
The data classification optimization method of the embodiment of the present invention uses the classification data structure to form an ordered classification system of explicit classification features. The classification system expresses the topological structure between classification categories together with each category's classification rule parameters and classification weights, so that the explicit features of industry requirements, which are adjusted manually, are organized in an ordered structure. This provides a reliable technical basis for effectively adjusting, multiplexing, reusing or updating explicit features. As a result, classification by the explicit features of industry requirements can be completed independently by operations personnel, improving overall data classification efficiency.
As shown in Fig. 11, in an embodiment of the present invention, the process of forming data labels in this embodiment includes:
Step 321: obtain staged source data.
Staged source data refers to concurrent or individual persistent-state data within a unit of time or a cycle duration. It can be continuous static data or dynamic data corresponding to business conditions.
Step 322: filter the staged source data with the industry customized classification model to obtain explicit classification data.
The filtered data or retrieved data produced using the keyword parameters of the classification categories in the model is explicit classification data.
Step 323: adjust the classification data using the classification thresholds to form preliminary classification category data.
A classification threshold corresponds to a classification rule and acts on the retrieved or filtered data produced by that rule, adjusting the degree of data deviation and the assignment of data to categories.
Step 324: mark each class of preliminary classification category data as the same data to form a data label.
A data label treats the classification category as a data feature in an independent dimension, so that the linear classification features of the preliminary classification category data are quantized.
The data classification optimization method of the embodiment of the present invention uses the rule keywords and rule weights of the classification categories in the industry customized classification model to filter, classify and mark the source data. This makes full use of the processing and storage resources of the computer system, efficiently completes explicit feature classification and forms data labels, improving overall data classification efficiency.
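The labelling described in steps 321 to 324 (keyword filtering, a relevance score per record, and a threshold that decides whether the record is assigned to a category) can be sketched as below. The scoring formula is an assumption made for illustration, since the text does not specify one; `MODEL`, `score` and `label` are hypothetical names, and the sample categories and records are invented.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical customized model: category -> weighted rule keywords and a threshold.
MODEL: Dict[str, dict] = {
    "finance": {"keywords": {"invoice": 2.0, "payment": 1.0}, "threshold": 2.0},
    "logistics": {"keywords": {"shipment": 2.0, "warehouse": 1.0}, "threshold": 2.0},
}


def score(text: str, keywords: Dict[str, float]) -> float:
    """Assumed relevance score: weighted count of rule keywords in the record."""
    words = text.lower().split()
    return sum(weight * words.count(kw) for kw, weight in keywords.items())


def label(text: str) -> Optional[str]:
    """Return the best category whose score is higher than its threshold, else None."""
    best, best_score = None, 0.0
    for category, rule in MODEL.items():
        s = score(text, rule["keywords"])
        if s > rule["threshold"] and s > best_score:
            best, best_score = category, s
    return best


records: List[str] = [
    "invoice payment overdue",
    "shipment left the warehouse",
    "payment reminder",
]
labelled: List[Tuple[str, Optional[str]]] = [(r, label(r)) for r in records]
print(labelled)
```

Records that stay below every threshold receive no label here; in the workflow described above they would be left to the trained model rather than to the keyword rules.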
As shown in Fig. 11, in an embodiment of the present invention, the process of forming the data training set in this embodiment includes:
Step 331: use the data label as feature data of each piece of preliminary classification category data.
Step 332: form the feature dimensions and feature vectors of each piece of preliminary classification category data from the various feature data.
Step 333: form the data training set from the feature dimensions and feature vectors.
The data classification optimization method of the embodiment of the present invention forms the corresponding training set according to the requirements of the intelligent classification model. The training set includes the necessary data feature dimensions and quantized vectors, and the data labels, as explicit features, form the basis for computing implicit features. The explicit features produced with the processing and storage resources of the computer system replace the training data features of supervised learning, so that the training set is formed efficiently and with high quality.
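Steps 331 to 333 can be sketched as follows, assuming a simple bag-of-words representation. The text does not name a particular feature encoding, so the vocabulary-based vectors, the `vectorize` helper and the sample records are illustrative assumptions.

```python
from collections import Counter, defaultdict
from typing import Dict, List, Tuple

# Labelled records, as produced by the explicit (rule-based) classification.
labelled: List[Tuple[str, str]] = [
    ("invoice payment overdue", "finance"),
    ("payment received", "finance"),
    ("shipment left warehouse", "logistics"),
]

# Feature dimensions: the sorted vocabulary of all labelled records.
vocabulary: List[str] = sorted({w for text, _ in labelled for w in text.split()})


def vectorize(text: str) -> List[int]:
    """Feature vector: count of each vocabulary word in the record."""
    counts = Counter(text.split())
    return [counts[w] for w in vocabulary]


# One training set per data label, so each set shares the same explicit feature.
training_sets: Dict[str, List[List[int]]] = defaultdict(list)
for text, tag in labelled:
    training_sets[tag].append(vectorize(text))

print(vocabulary)
print(dict(training_sets))
```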
As shown in Fig. 11, in an embodiment of the present invention, the process of optimization and classification in this embodiment includes:
Step 341: iteratively optimize the intelligent classification model by changing the scale of the training set data.
The training sets of different data scale can be training sets of continuous local data, training sets of continuous complete data, or training sets of random data, applied one by one to the same intelligent classification model.
The intelligent classification model in an embodiment of the present invention uses a naive Bayes model.
Step 342: classify the entire source data with the intelligent classification model.
The intelligent classification model can classify the entire source data as a whole, meeting the need to analyze and classify data at a determined business scale.
Step 343: classify incremental source data with the intelligent classification model.
The intelligent classification model can classify incremental data continuously, meeting the need to analyze and classify data produced by a determined business.
The data classification optimization method of the embodiment of the present invention forms associated sub-training sets by splitting the training set and uses the data differences between the sub-training sets to iteratively optimize the intelligent classification model, improving the efficiency and quality with which the model classifies implicit classification features. At the same time, incremental and full data classification is carried out according to the time at which source data is generated, so that existing classified data is classified in further detail.
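A naive Bayes model that is fed sub-training sets one at a time, and is then applied to full and incremental data, can be sketched in plain Python. This is a minimal multinomial naive Bayes with add-one smoothing; the class `NaiveBayes`, its `partial_fit` method, the batch size and the sample data are assumptions for illustration and do not reproduce the patent's implementation.

```python
import math
from collections import Counter, defaultdict
from typing import Dict, Iterable, List, Set, Tuple


class NaiveBayes:
    """Multinomial naive Bayes with add-one smoothing and incremental updates."""

    def __init__(self) -> None:
        self.class_docs: Counter = Counter()                      # records seen per label
        self.word_counts: Dict[str, Counter] = defaultdict(Counter)
        self.vocab: Set[str] = set()

    def partial_fit(self, samples: Iterable[Tuple[str, str]]) -> None:
        """Update the counts with one more sub-training set."""
        for text, tag in samples:
            words = text.lower().split()
            self.class_docs[tag] += 1
            self.word_counts[tag].update(words)
            self.vocab.update(words)

    def predict(self, text: str) -> str:
        total_docs = sum(self.class_docs.values())
        best_tag, best_logp = "", -math.inf
        for tag, n_docs in self.class_docs.items():
            logp = math.log(n_docs / total_docs)                  # log prior
            total_words = sum(self.word_counts[tag].values())
            for w in text.lower().split():                        # smoothed log likelihood
                logp += math.log((self.word_counts[tag][w] + 1) / (total_words + len(self.vocab)))
            if logp > best_logp:
                best_tag, best_logp = tag, logp
        return best_tag


train: List[Tuple[str, str]] = [
    ("invoice payment overdue", "finance"),
    ("shipment left warehouse", "logistics"),
    ("payment received for invoice", "finance"),
    ("warehouse stock shipment delayed", "logistics"),
]
model = NaiveBayes()
for start in range(0, len(train), 2):     # apply the sub-training sets one by one
    model.partial_fit(train[start:start + 2])

full_data = ["overdue invoice", "delayed shipment"]
incremental_data = ["payment for stock"]
print([model.predict(t) for t in full_data + incremental_data])
```

Because the model only accumulates counts, newly labelled data can be added later with another `partial_fit` call without retraining from scratch.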
The data classification optimization device according to an embodiment of the present invention includes:
Memory, for storing the program code corresponding to the processing procedure of the data classification optimization method of the above embodiment;
Processor, for executing the program code of the processing procedure of the data classification optimization method of the above embodiment.
The architecture of the data classification optimization device according to an embodiment of the present invention is shown in Fig. 12. In Fig. 12, this embodiment includes:
Classification model forming module 3310, for forming the industry customized classification model using the classification data structure;
Label marking module 3320, for classifying the source data with the industry customized classification model to form data labels;
Training set forming module 3330, for dividing the source data into data training sets according to the data labels;
Classification forming module 3340, for optimizing the intelligent classification model with the data training sets and completing the classification of the source data.
As shown in Fig. 12, in an embodiment of the present invention, the classification model forming module 3310 further includes:
Topological structure forming unit 3311, for establishing the classification categories and their topological structure through the classification data structure according to business requirements;
Classification rule forming unit 3312, for establishing the corresponding classification rules according to the classification categories;
Classification threshold forming unit 3313, for setting classification thresholds according to the classification rules;
Model forming unit 3314, for forming the industry customized classification model from the topological structure, the classification rules and the classification thresholds.
As shown in Fig. 12, in an embodiment of the present invention, the label marking module 3320 further includes:
Source data acquiring unit 3321, for obtaining staged source data;
Explicit classification unit 3322, for filtering the staged source data with the industry customized classification model to obtain explicit classification data;
Classification adjustment unit 3323, for adjusting the classification data using the classification thresholds to form preliminary classification category data;
Label making unit 3324, for marking each class of preliminary classification category data as the same data to form a data label.
As shown in Fig. 12, in an embodiment of the present invention, the training set forming module 3330 further includes:
Label feature forming unit 3331, for using the data label as feature data of each piece of preliminary classification category data;
Feature quantization unit 3332, for forming the feature dimensions and feature vectors of each piece of preliminary classification category data from the various feature data;
Training set synthesis unit 3333, for forming the data training set from the feature dimensions and feature vectors.
As shown in Fig. 12, in an embodiment of the present invention, the classification forming module 3340 further includes:
Iterative optimization unit 3341, for iteratively optimizing the intelligent classification model by changing the scale of the training set data;
Whole classification unit 3342, for classifying the entire source data with the intelligent classification model;
Incremental classification unit 3343, for classifying incremental source data with the intelligent classification model.
The data classification systems and data classification results formed using the optimization method for big data classification rules, the classification optimization method for data display and the data classification optimization method of the above embodiments respond well to data requirements. Combined with the data structure of the data classification system, an efficient technical solution for interactive data display can be formed.
The data processing method for a classification interactive interface according to an embodiment of the present invention is shown in Fig. 13. In Fig. 13, the embodiment includes:
Step 410: retrieve the classification category data according to first interaction data to form a first topological structure of the related classification categories.
The first interaction data includes a search keyword. The search keyword can be a classification category keyword or text that is close to a classification category keyword in meaning or wording. The closest related classification category is obtained by similarity matching between the search keyword and the classification category keywords, and the first topological structure of classification categories is formed from the containment logic of the classification category data under the related classification category.
Step 420: display the classification category keywords in order according to the first topological structure of classification categories.
The first topological structure of classification categories can be displayed in order in the display frame as a tree, as stacked layers or in groups, using mature data display techniques.
Step 430: form, according to the first topological structure of classification categories, a first classification result data set adapted to the search results, and display the classification result data in order.
The classification categories retained in the classification category data structure system (including categories with explicit attributes and categories with implicit attributes) have corresponding classification data, which is formed by the correspondingly trained data classification model or by the keyword classification system.
Step 440: form classification category combination logic according to second interaction data, form a second topological structure of classification categories according to the combination logic, and display the classification category keywords in order according to the second topological structure.
The second interaction data includes a selection of topological nodes (classification categories at different node positions in the data structure) in the first topological structure of classification categories. The selection includes combining nodes or keeping and discarding them, and the classification category combination logic is expressed by selecting the classification category keywords of topological nodes in the first topological structure.
The determined classification category combination logic forms a combination or selection of classification categories, which in turn forms the second topological structure of classification categories.
The second topological structure of classification categories can be adjusted in order in the display frame as a tree, as stacked layers or in groups, using mature data display techniques.
Step 450: form, according to the second topological structure of classification categories, a second classification result data set adapted to the search results, and display the classification result data in order.
The classification categories retained in the classification category data structure system (including categories with explicit attributes and categories with implicit attributes) have corresponding classification data, which is formed by the correspondingly trained data classification model or by the keyword classification system.
Adaptation includes sorting, deduplicating or indexing the data in the different classification data.
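The adaptation in steps 440 and 450 (merging the classification data of the categories selected by the user, removing duplicates and sorting the result) can be sketched as follows. The `classified` mapping, the relevance scores and the `adapt` function are hypothetical and serve only to show one way the merge could work.

```python
from typing import Dict, List, Tuple

# Hypothetical classification data: category -> list of (record id, relevance score).
classified: Dict[str, List[Tuple[str, float]]] = {
    "finance": [("d1", 0.9), ("d2", 0.4)],
    "logistics": [("d2", 0.7), ("d3", 0.5)],
    "hr": [("d4", 0.8)],
}


def adapt(selected: List[str]) -> List[Tuple[str, float]]:
    """Merge the selected categories, keep the best score per record, sort by score."""
    best: Dict[str, float] = {}
    for category in selected:
        for record_id, s in classified.get(category, []):
            if s > best.get(record_id, float("-inf")):   # duplicate removal
                best[record_id] = s
    # Highest score first; ties are broken by record id for a stable order.
    return sorted(best.items(), key=lambda item: (-item[1], item[0]))


# The user combines two nodes of the first topology; "hr" is discarded.
second_result = adapt(["finance", "logistics"])
print(second_result)
```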
In the data processing method for a classification interactive interface of the embodiment of the present invention, the classification category data structure formed during data classification, and the classification data formed from the classification categories and the classified source data, are combined with display techniques to classify, sort, merge and match large volumes of retrieved data for display in a timely manner. This avoids the defects of existing classification interactive interfaces, in which the interaction process affects the search process, the matching of data classification dimensions is limited, and data cannot be located or matched and combined quickly. The search and classification process in the interactive interface can be combined in an orderly way with the classification system formed by the classification category data structure, so that the data information formed by data classification is fully presented during interaction.
The data processing method for a classification interactive interface according to an embodiment of the present invention is shown in Fig. 14. In Fig. 14, the process of forming the first topological structure of classification categories includes:
Step 411: Determine similar keywords in the classification category data according to the first interaction data.
The first interaction data includes search keywords, fuzzy keywords, or text paragraphs.
Step 412: Determine the related classification categories according to the similar keywords.
Similarity is compared using a text fuzzy-matching algorithm and the rule weight parameters in the classification category data to determine the related classification categories.
Step 413: Establish the first topological structure of the classification categories according to the related classification categories and the classification category data structure.
Determining the related classification categories yields the corresponding node positions in the classification category data structure and the associated lower-level or main node positions, from which the topological structure of the associated classification categories is formed.
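Steps 411 to 413 can be sketched roughly as follows. This is only an illustration under stated assumptions: the category tree `CATEGORIES`, its keywords and weights, the 0.6 threshold, and the function names are all hypothetical, and Python's `difflib.SequenceMatcher` stands in for the unspecified text fuzzy-matching algorithm.

```python
from difflib import SequenceMatcher

# Hypothetical category data: node -> (parent node, keywords, rule weight).
CATEGORIES = {
    "finance": (None, ["finance", "money"], 1.0),
    "loans": ("finance", ["loan", "credit", "mortgage"], 0.9),
    "insurance": ("finance", ["insurance", "policy"], 0.8),
    "health": (None, ["health", "medical"], 1.0),
}


def similarity(a, b):
    """Fuzzy text similarity in [0, 1] (stand-in for the patent's algorithm)."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def related_categories(query, threshold=0.6):
    """Steps 411-412: score each category by its best weighted keyword match."""
    scores = {}
    for name, (_parent, keywords, weight) in CATEGORIES.items():
        best = max(similarity(query, kw) for kw in keywords) * weight
        if best >= threshold:
            scores[name] = best
    return scores


def first_topology(related):
    """Step 413: keep matched nodes, their ancestors, and their direct children."""
    nodes = set()
    for name in related:
        node = name
        while node is not None:  # walk up to the root
            nodes.add(node)
            node = CATEGORIES[node][0]
        nodes.update(c for c, (p, _, _) in CATEGORIES.items() if p == name)
    return {n: CATEGORIES[n][0] for n in nodes}  # node -> parent mapping


print(first_topology(related_categories("loan")))
```

The result is a node-to-parent mapping, which is one simple way to carry the node positions of the related categories forward to the display steps.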
The data processing method for a classification interactive interface according to an embodiment of the present invention is shown in Figure 14. In Figure 14, the process of forming the ordered display of classification category keywords includes:
Step 421: Form first display data of an optimized topological structure from the classification category keywords in the first topological structure of the classification categories, according to the first topological structure.
As shown by the classification category data structure in the data classification embodiment above, the first topological structure of the classification categories includes classification information such as the keywords, classification rules, and classification weights corresponding to the classification categories.
Step 422: Display the first display data according to the data display strategy of the display framework.
The first display data includes the topological mapping structure between classification category keywords and the tree data structure between vocabulary items.
The data processing method for a classification interactive interface according to an embodiment of the present invention is shown in Figure 14. In Figure 14, the process of forming the first classification result data set includes:
Step 431: Determine the classification categories in the first topological structure of the classification categories.
The classification categories can be determined from the data nodes of the first topological structure of the classification categories.
Step 432: Determine the corresponding classification data according to the classification categories.
As shown by the classification category data structure in the data classification embodiment above, source data can form classification data.
Step 433: Merge and deduplicate the classification data to form the first classification result data set.
Because classification features are diverse, the classification data contains redundant data.
Step 434: Display the first classification result data set according to the data display strategy of the display framework.
The first classification result data set, as query or search result data, forms the display content.
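The merge and deduplication in steps 431 to 433 can be sketched as below. The record layout (dictionaries with an `id` field), the sample data, and the function name `merge_classification_data` are assumptions made for illustration; the patent does not specify how records are identified.

```python
def merge_classification_data(categories, data_by_category, key="id"):
    """Merge the records of every category, keeping each record once."""
    seen = set()
    merged = []
    for category in categories:                             # step 431
        for record in data_by_category.get(category, []):   # step 432
            if record[key] not in seen:                     # step 433: dedupe
                seen.add(record[key])
                merged.append(record)
    return merged


# Hypothetical classification data; record 2 belongs to two categories.
data_by_category = {
    "loans": [{"id": 1, "text": "mortgage offer"}, {"id": 2, "text": "credit line"}],
    "insurance": [{"id": 2, "text": "credit line"}, {"id": 3, "text": "home policy"}],
}
result = merge_classification_data(["loans", "insurance"], data_by_category)
print([r["id"] for r in result])  # [1, 2, 3]
```

Keeping the first occurrence preserves the category order of the topology, so the merged set can be passed to the display step without re-sorting.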
The data processing method for a classification interactive interface according to an embodiment of the present invention is shown in Figure 14. In Figure 14, the process of forming the second topological structure of the classification categories includes:
Step 441: Receive the interactive selection of classification category keywords.
The interactive selection consists of selecting keywords of classification categories, either to add them or to exclude them.
Step 442: Determine the classification category combination logic according to the interactive selection and form the second topological structure of the classification categories.
The selection result formed after keywords are selected is subjected to a membership logic judgment against the corresponding data structure nodes, which in turn forms the topological structure.
Step 443: Form second display data of an optimized topological structure according to the second topological structure of the classification categories, and display it according to the data display strategy of the display framework.
The second display data includes the topological mapping structure between classification category keywords and the tree data structure between vocabulary items.
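One possible reading of steps 441 and 442 is sketched below. The semantics are an assumption, not the patent's definition: an included keyword keeps its node, its ancestors, and its descendants; an excluded keyword drops its node and everything beneath it; exclusion wins over inclusion. The tree and the function name `second_topology` are hypothetical.

```python
def second_topology(first, include=(), exclude=()):
    """Filter a node -> parent mapping by the user's include/exclude selection."""
    def chain(node):
        # The node followed by all of its ancestors up to the root.
        path = []
        while node is not None:
            path.append(node)
            node = first[node]
        return path

    include, exclude = set(include), set(exclude)
    # Ancestors of included nodes are kept so the tree stays connected.
    context = {a for n in include for a in chain(n)}
    kept = {}
    for node in first:
        path = chain(node)
        if exclude.intersection(path):
            continue  # the node or one of its ancestors was excluded
        if include and not (include.intersection(path) or node in context):
            continue  # outside every included branch
        kept[node] = first[node]
    return kept


# Hypothetical first topology (node -> parent).
first = {"finance": None, "loans": "finance", "insurance": "finance",
         "mortgage": "loans", "health": None}
print(second_topology(first, include=["loans"], exclude=["mortgage"]))
```

With this selection only `finance` and `loans` remain, which is the kind of narrowed structure that step 443 would then turn into display data.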
The data processing method for a classification interactive interface according to an embodiment of the present invention is shown in Figure 14. In Figure 14, the process of forming the second classification result data set includes:
Step 451: Determine the corresponding classification data, following changes in the second topological structure of the classification categories.
As shown by the classification category data structure in the data classification embodiment above, source data can form classification data.
Step 452: Merge and deduplicate the classification data to form the second classification result data set.
Because classification features are diverse, the classification data contains redundant data.
Step 453: Following the display of the second display data, display the second classification result data set according to the data display strategy of the display framework.
A data processing device for a classification interactive interface according to an embodiment of the present invention includes:
A memory, for storing program code corresponding to the data processing method for a classification interactive interface of the above embodiment;
A processor, for executing the program code of the data processing method for a classification interactive interface of the above embodiment.
A data processing device for a classification interactive interface according to an embodiment of the present invention is shown in Figure 15. In Figure 15, this embodiment includes:
A first structure forming module 4410, for retrieving the classification category data according to the first interaction data to form the first topological structure of the related classification categories;
A first structure display module 4420, for displaying the classification category keywords in order according to the first topological structure of the classification categories;
A first data presentation module 4430, for forming, according to the first topological structure of the classification categories, a first classification result data set adapted to the search results, and displaying the classification result data in order;
A second structure forming module 4440, for forming classification category combination logic according to the second interaction data, forming the second topological structure of the classification categories according to the combination logic, and displaying the classification category keywords in order according to the second topological structure;
A second data presentation module 4450, for forming, according to the second topological structure of the classification categories, a second classification result data set adapted to the search results, and displaying the classification result data in order.
As shown in Figure 15, in an embodiment of the present invention, the first structure forming module 4410 includes:
A similar vocabulary determination unit 4411, for determining similar keywords in the classification category data according to the first interaction data;
A category determination unit 4412, for determining the related classification categories according to the similar keywords;
A first topology establishing unit 4413, for establishing the first topological structure of the classification categories according to the related classification categories and the classification category data structure.
As shown in Figure 15, in an embodiment of the present invention, the first structure display module 4420 includes:
A first display planning unit 4421, for forming first display data of an optimized topological structure from the classification category keywords in the first topological structure of the classification categories, according to the first topological structure;
A first display transmission unit 4422, for displaying the first display data according to the data display strategy of the display framework.
As shown in Figure 15, in an embodiment of the present invention, the first data presentation module 4430 includes:
A category determination unit 4431, for determining the classification categories in the first topological structure of the classification categories;
A classification data determination unit 4432, for determining the corresponding classification data according to the classification categories;
A classification data integration unit 4433, for merging and deduplicating the classification data to form the first classification result data set;
A classification data transmission unit 4434, for displaying the first classification result data set according to the data display strategy of the display framework.
As shown in Figure 15, in an embodiment of the present invention, the second structure forming module 4440 includes:
An interaction establishing unit 4441, for receiving the interactive selection of classification category keywords;
A second topology determination unit 4442, for determining the classification category combination logic according to the interactive selection and forming the second topological structure of the classification categories;
A second data transmission unit 4443, for forming second display data of an optimized topological structure according to the second topological structure of the classification categories and displaying it according to the data display strategy of the display framework.
As shown in Figure 15, in an embodiment of the present invention, the second data presentation module 4450 includes:
A second classification determination unit 4451, for determining the corresponding classification data, following changes in the second topological structure of the classification categories;
A second data integration unit 4452, for merging and deduplicating the classification data to form the second classification result data set;
A data set transmission unit 4453, for displaying the second classification result data set according to the data display strategy of the display framework, following the display of the second display data.
In an embodiment of the present invention, the processor may be a DSP (digital signal processor), an FPGA (field-programmable gate array), an MCU (microcontroller unit) system board, an SoC (system on a chip) system board, or a PLC (programmable logic controller) minimum system including I/O.
The foregoing is only a preferred embodiment of the present invention, but the scope of protection of the present invention is not limited thereto. Any change or substitution that a person skilled in the art could readily conceive within the technical scope disclosed by the present invention shall fall within the scope of protection of the present invention. Therefore, the scope of protection of the present invention shall be subject to the scope of protection of the claims.
Claims (10)
1. An optimization method for big data classification rules, characterized by comprising:
Establishing a data structure for storing rules;
Forming a rule set for a determined theme through the data structure;
Carrying out scene classification on source data according to the rule set.
2. The optimization method for big data classification rules according to claim 1, characterized by further comprising:
Parsing the rule set and updating the stored rules.
3. The optimization method for big data classification rules according to claim 2, characterized by further comprising:
Parsing the rule set, obtaining a partial data structure and rules, and forwarding them.
4. The optimization method for big data classification rules according to claim 3, characterized in that said establishing a data structure for storing rules comprises:
Setting the structure fields of a single rule;
Setting the field keywords and compound-field keywords of a single rule;
Forming the data structure of the rule according to the field topological structure of the single rule.
5. The optimization method for big data classification rules according to claim 3, characterized in that said forming a rule set for a determined theme through the data structure comprises:
Establishing a rule set keyword;
Establishing rule set subsets of the rule set and corresponding subset keywords;
Establishing scene sets of the rule set subsets and corresponding scene keywords;
Establishing a keyword topological structure according to the rule set keyword, the subset keywords, or the scene keywords;
Adding the data structure of rules to the keyword topological structure, and adding rule parameters or associated parameters to the data structure of the rules.
6. The optimization method for big data classification rules according to claim 3, characterized in that said carrying out scene classification on source data according to the rule set comprises:
Determining the corresponding preliminary source data according to a preliminary classification;
Determining, among the rule sets, the rule set corresponding to the preliminary classification;
Extracting scene classification categories and scene classification parameter data from the rule set;
Classifying the preliminary source data using the scene classification parameter data to form the classification source data corresponding to each scene classification category under the determined preliminary classification;
Forming the result classification data of the preliminary classification according to the classification source data.
7. The optimization method for big data classification rules according to claim 3, characterized in that said parsing the rule set and updating the stored rules comprises:
Obtaining an update requirement for a determined classification and a determined scene;
Determining the rule data structure and rule content according to the update requirement, and displaying them through an interactive framework;
Updating the new content into the corresponding rule data structure and rule content through the interactive framework.
8. The optimization method for big data classification rules according to claim 3, characterized in that said parsing the rule set, obtaining a partial data structure and rules, and forwarding them comprises:
Obtaining, according to a forwarding requirement, the associated topological structure of the data structure for a determined classification and a determined scene;
Forming a temporary rule data structure and rule content according to the associated topological structure;
Forming the temporary rule data structure and rule content into an independent data object, and providing a data link.
9. A big data classification rule optimization device, characterized by comprising:
A memory, for storing program code corresponding to the big data classification rule optimization method of any one of claims 1 to 8;
A processor, for executing the program code.
10. A big data classification rule optimization device, characterized by comprising:
A rule setting module, for establishing a data structure for storing rules;
A rule forming module, for forming a rule set for a determined theme through the data structure;
A rule application module, for carrying out scene classification on source data according to the rule set.
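For illustration only, and not as part of the claims, the rule data structure and scene classification described in claims 4 to 6 might look like the following sketch. Every name here (`Rule`, `Scene`, `RuleSet`, `classify_by_scene`), the term-counting rule, and the sample data are assumptions; the rule set subset level of claim 5 is omitted for brevity.

```python
from dataclasses import dataclass, field


@dataclass
class Rule:
    keyword: str        # field keyword identifying the rule
    terms: list         # rule parameter: terms to look for in a record
    threshold: int = 1  # rule parameter: minimum number of distinct term hits


@dataclass
class Scene:
    keyword: str        # scene keyword
    rules: list = field(default_factory=list)


@dataclass
class RuleSet:
    keyword: str        # rule set keyword (the theme)
    scenes: list = field(default_factory=list)


def classify_by_scene(rule_set, records):
    """Assign each text record to every scene that has at least one matching rule."""
    result = {scene.keyword: [] for scene in rule_set.scenes}
    for text in records:
        lowered = text.lower()
        for scene in rule_set.scenes:
            if any(sum(t in lowered for t in r.terms) >= r.threshold
                   for r in scene.rules):
                result[scene.keyword].append(text)
    return result


# Hypothetical rule set for a "finance" theme with two scenes.
rule_set = RuleSet("finance", [
    Scene("loan_application", [Rule("apply", ["apply", "loan"], threshold=2)]),
    Scene("complaint", [Rule("complain", ["complaint", "refund"])]),
])
records = ["I want to apply for a loan", "Please refund my fee", "Hello"]
print(classify_by_scene(rule_set, records))
```

Because thresholds and terms live in the rule objects rather than in the classification code, adjusting a scene means editing data, which matches the stated aim of managing classification parameters through the data structure.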
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910280279.6A CN110096519A (en) | 2019-04-09 | 2019-04-09 | A kind of optimization method and device of big data classifying rules |
Publications (1)
Publication Number | Publication Date |
---|---|
CN110096519A (en) | 2019-08-06 |
Family
ID=67444547
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201910280279.6A Pending CN110096519A (en) | 2019-04-09 | 2019-04-09 | A kind of optimization method and device of big data classifying rules |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN110096519A (en) |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
 | PB01 | Publication | |
 | SE01 | Entry into force of request for substantive examination | |
 | RJ01 | Rejection of invention patent application after publication | Application publication date: 20190806 |