CN107562854A - A modeling method for the quantitative analysis of Party building data - Google Patents
- Publication number: CN107562854A
- Application number: CN201710751678.7A
- Authority: CN (China)
- Prior art keywords: party, developing, work, keyword, node
- Legal status: Granted (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Abstract
The present invention discloses a modeling method for the quantitative analysis of Party building data. The method first extracts Party building keywords from Party building texts and quantitatively measures the correlation between every pair of keywords. It then abstracts each keyword as a node of a polytree model, determines the model's directed edges from the pairwise correlations, and finally determines the model's conditional probability parameters. The invention thus provides a quantitative measure of the correlation between Party building keywords and uses a polytree model to represent the correlations among all keywords intuitively, supporting further analysis of Party building big data.
Description
Technical field
The invention belongs to the field of data mining technology and relates to a modeling method for the quantitative analysis of Party building data.
Background
Party building work is the basic guarantee for building a strong body of Party members and for carrying out all other work, and raising the scientific level of Party building is one of its vital current tasks. Carrying out Party building work produces massive amounts of Party building data, including data on ideological building, organizational building, work-style improvement, and institutional improvement, so intelligent management and effective analysis of Party building big data have become an urgent need. Quantitative modeling and association analysis of Party building big data, together with research into effective analysis and mining methods, are the key to analyzing such data effectively and the basis for raising the scientific level of Party building. The polytree model is a simple probabilistic graphical model for representing and reasoning about uncertain knowledge; it not only captures quantitative uncertain relationships between data items but also provides an efficient inference mechanism for the quantitative analysis of Party building work. The present invention uses a polytree model to model Party building work quantitatively. By quantitatively measuring the correlation between Party building keywords, it provides a modeling means for mining global correlations, supports the analysis of Party building texts and Party building work, and provides technical support for raising the scientific level of Party building.
Summary of the invention
For the massive data produced in Party building work, the present invention provides an effective modeling method for mining the global correlations in Party building data, supporting the analysis of Party building big data. The method mainly comprises the following steps:
Step 1: quantify each Party building text, specifically:
1.1. From a collection of $n$ Party building texts $D=\{d_1,d_2,\dots,d_n\}$, extract a set of $m$ Party building keywords $W=\{w_1,w_2,\dots,w_m\}$;
1.2. Define the document frequency function $f(x)$, where $x$ denotes a combination of keywords that occur or do not occur in a document. For a Party building keyword $\alpha\in W$, $\alpha_1$ denotes that $\alpha$ occurs in a document and $\alpha_0$ denotes that $\alpha$ does not occur in a document. For example, $(\alpha_1,\beta_0)$ denotes the keyword combination in which $\alpha$ occurs and $\beta$ does not, and $f(\alpha_1,\beta_0)$ is the number of documents in which $\alpha$ occurs but $\beta$ does not;
Step 2: for any Party building keywords $\alpha$, $\beta$ and $\gamma$ in $W$, define [formula missing] and use the chi-square test to determine whether $\alpha$ and $\beta$ are mutually independent. The degree of correlation quantitatively measures the direct correlation between $\alpha$ and $\beta$: if $\alpha$ and $\beta$ are mutually independent, then [formula missing]; otherwise [formula missing];
Step 3: build the maximum weight spanning tree $T$ over the $m$ nodes:
3.1. Abstract each Party building keyword in $W=\{w_1,w_2,\dots,w_m\}$ as a node of $T$;
3.2. Examine the pairwise degrees of correlation between keywords in descending order. If adding the undirected edge $\alpha - \beta$ to $T$ does not create a cycle, add it; otherwise discard it. Continue until $T$ has $m-1$ edges or [condition missing];
Step 4: for each subgraph $\alpha - \gamma - \beta$ in $T$, compute [formula missing] and use the chi-square test to determine whether $\alpha$ and $\beta$ are conditionally independent given $\gamma$. If $\alpha$ and $\beta$ are not conditionally independent given $\gamma$, and [condition missing], orient the subgraph $\alpha - \gamma - \beta$ as the converging structure $\alpha\to\gamma\leftarrow\beta$. Repeat until no subgraph satisfies the condition, yielding a graph $G'$;
Step 5: orient all undirected edges in $G'$ as directed edges, subject to the condition that no new converging structure is created, yielding the polytree graph structure $G$;
Step 6: for each node $v$ in $G$, compute its conditional probability given its parent nodes $\mathrm{pa}(v)$, obtaining the set of conditional probabilities $P$. This finally gives the complete polytree model $(G,P)$ of Party building big data.
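For reference, a polytree is a directed acyclic graph whose underlying undirected graph is a tree, so a model $(G,P)$ of this kind factorizes the joint distribution over the keyword variables in the usual Bayesian-network form. The notation $X_v$ for the occurrence indicator of keyword $v$ is an assumption introduced here for illustration, not notation from the source:

```latex
\[
P\bigl(X_{w_1},\dots,X_{w_m}\bigr) \;=\; \prod_{v \in W} P\bigl(X_v \mid X_{\mathrm{pa}(v)}\bigr)
\]
```

Each factor on the right corresponds to one entry of the conditional probability set $P$ obtained in the last step.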
Brief description of the drawings
Fig. 1 shows the process of building the polytree model of Party building data.
Embodiment
An embodiment of the invention is described in detail below with reference to Fig. 1.
Step 1: quantify each Party building text, specifically:
1.1. From a collection of $n$ Party building texts $D=\{d_1,d_2,\dots,d_n\}$, extract a set of $m$ Party building keywords $W=\{w_1,w_2,\dots,w_m\}$;
1.2. Define the document frequency function $f(x)$, where $x$ denotes a combination of keywords that occur or do not occur in a document. For a Party building keyword $\alpha\in W$, $\alpha_1$ denotes that $\alpha$ occurs in a document and $\alpha_0$ denotes that $\alpha$ does not occur in a document. For example, $(\alpha_1,\beta_0)$ denotes the keyword combination in which $\alpha$ occurs and $\beta$ does not, and $f(\alpha_1,\beta_0)$ is the number of documents in which $\alpha$ occurs but $\beta$ does not;
Suppose that $n=100$ and that the document frequencies counted for two keywords $\alpha$ and $\beta$ are $f(\alpha_1,\beta_1)=20$, $f(\alpha_1,\beta_0)=20$, $f(\alpha_0,\beta_1)=10$, $f(\alpha_0,\beta_0)=50$, $f(\alpha_1)=40$, $f(\alpha_0)=60$, $f(\beta_1)=30$, $f(\beta_0)=70$.
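The document frequency function can be sketched as follows. This is a minimal illustration, assuming each document has already been reduced to the set of keywords it contains; the function name `doc_frequency`, the keyword labels `"a"` and `"b"` (standing in for α and β), and the toy corpus are hypothetical and chosen only to reproduce the counts in the example above.

```python
from typing import Dict, Iterable, Set


def doc_frequency(docs: Iterable[Set[str]], pattern: Dict[str, int]) -> int:
    """Count the documents that match a presence/absence pattern.

    pattern maps keyword -> 1 (must occur) or 0 (must not occur);
    keywords not mentioned in the pattern are unconstrained.
    """
    count = 0
    for words in docs:
        if all((kw in words) == bool(flag) for kw, flag in pattern.items()):
            count += 1
    return count


# Toy corpus with n = 100 documents matching the example counts.
docs = (
    [{"a", "b"}] * 20   # alpha and beta both occur
    + [{"a"}] * 20      # alpha occurs, beta does not
    + [{"b"}] * 10      # beta occurs, alpha does not
    + [set()] * 50      # neither occurs
)

f_a1_b0 = doc_frequency(docs, {"a": 1, "b": 0})   # f(alpha_1, beta_0)
f_a1 = doc_frequency(docs, {"a": 1})              # f(alpha_1)
f_b0 = doc_frequency(docs, {"b": 0})              # f(beta_0)
```

Representing a combination as a dictionary keeps single-keyword and multi-keyword frequencies behind one function, which matches how $f$ is used with both one and two arguments in the text.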
Step 2: for any Party building keywords $\alpha$, $\beta$ and $\gamma$ in $W$, define [formula missing] and use the chi-square test to determine whether $\alpha$ and $\beta$ are mutually independent. The degree of correlation quantitatively measures the direct correlation between $\alpha$ and $\beta$: if $\alpha$ and $\beta$ are mutually independent, then [formula missing]; otherwise [formula missing];
For example, if the document frequencies of the two keywords $\alpha$ and $\beta$ are those counted in Step 1, then the degree of correlation is 0.063;
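The formula for the degree of correlation is not reproduced in this text, so the following sketch rests on an assumption: mutual information with the natural logarithm, which is a common choice for tree-structured models and which does yield 0.063 on the example counts. The chi-square statistic uses the standard 2x2 formula; the 0.05 significance level (critical value 3.841 for one degree of freedom) and all function names are likewise assumptions.

```python
import math


def mutual_information(n11: int, n10: int, n01: int, n00: int) -> float:
    """Mutual information (in nats) of two binary variables from a 2x2 table.

    n11 = f(alpha_1, beta_1), n10 = f(alpha_1, beta_0),
    n01 = f(alpha_0, beta_1), n00 = f(alpha_0, beta_0).
    """
    n = n11 + n10 + n01 + n00
    rows = (n11 + n10, n01 + n00)   # alpha occurs / does not occur
    cols = (n11 + n01, n10 + n00)   # beta occurs / does not occur
    cells = ((n11, n10), (n01, n00))
    mi = 0.0
    for i in range(2):
        for j in range(2):
            nij = cells[i][j]
            if nij > 0:  # empty cells contribute nothing
                mi += (nij / n) * math.log(nij * n / (rows[i] * cols[j]))
    return mi


def chi_square_2x2(n11: int, n10: int, n01: int, n00: int) -> float:
    """Pearson chi-square statistic for a 2x2 contingency table."""
    n = n11 + n10 + n01 + n00
    den = (n11 + n10) * (n01 + n00) * (n11 + n01) * (n10 + n00)
    return n * (n11 * n00 - n10 * n01) ** 2 / den if den else 0.0


CRITICAL_005_DF1 = 3.841  # assumed significance level of 0.05, 1 degree of freedom

mi = mutual_information(20, 20, 10, 50)
chi2 = chi_square_2x2(20, 20, 10, 50)
dependent = chi2 > CRITICAL_005_DF1
```

On the example counts the statistic is well above the critical value, so the two keywords would be treated as dependent and their degree of correlation would be nonzero.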
Step 3: build the maximum weight spanning tree $T$ over the $m$ nodes, specifically:
3.1. Abstract each Party building keyword in $W=\{w_1,w_2,\dots,w_m\}$ as a node of $T$;
3.2. Examine the pairwise degrees of correlation between keywords in descending order. If adding the undirected edge $\alpha - \beta$ to $T$ does not create a cycle, add it; otherwise discard it. Continue until $T$ has $m-1$ edges or [condition missing]. Fig. 1 (left) shows one such maximum weight spanning tree $T$;
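Step 3.2 describes a Kruskal-style greedy construction, which can be sketched as below. The second stopping condition is not reproduced in this text; the sketch assumes it means "stop once the remaining degrees of correlation are zero", and that assumption is marked in the code. The function name and the example weights are hypothetical.

```python
from typing import Dict, List, Tuple

Edge = Tuple[str, str]


def max_weight_spanning_tree(nodes: List[str], weights: Dict[Edge, float]) -> List[Edge]:
    """Greedy (Kruskal-style) maximum weight spanning tree with union-find."""
    parent = {v: v for v in nodes}

    def find(v: str) -> str:
        # Follow parent links to the component root, halving the path as we go.
        while parent[v] != v:
            parent[v] = parent[parent[v]]
            v = parent[v]
        return v

    tree: List[Edge] = []
    ranked = sorted(weights.items(), key=lambda kv: kv[1], reverse=True)
    for (a, b), w in ranked:
        # ASSUMPTION: stop at zero correlation; the source's condition is missing.
        if w <= 0 or len(tree) == len(nodes) - 1:
            break
        ra, rb = find(a), find(b)
        if ra != rb:          # adding a - b creates no cycle
            parent[ra] = rb
            tree.append((a, b))
    return tree


nodes = ["w1", "w2", "w3", "w4"]
weights = {
    ("w1", "w2"): 0.30, ("w2", "w4"): 0.25, ("w3", "w4"): 0.20,
    ("w1", "w4"): 0.10, ("w2", "w3"): 0.05, ("w1", "w3"): 0.00,
}
tree = max_weight_spanning_tree(nodes, weights)
```

If some keywords are independent of all others, the zero-weight cutoff leaves them disconnected, so the result may be a forest rather than a single tree.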
Step 4: for the subgraph $w_1 - w_2 - w_4$ in $T$, if [condition missing], the directed edges cannot be determined. Check the subgraph $w_3 - w_4 - w_2$: if $w_2$ and $w_3$ are not conditionally independent given $w_4$ and [condition missing], orient $w_3 - w_4 - w_2$ as the converging structure $w_3\to w_4\leftarrow w_2$. Check the other subgraphs that meet the condition in the same way. The graph $G'$ shown in Fig. 1 (middle) is one possible structure;
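The orientation of converging structures can be sketched as follows. Because the second condition of Step 4 is not reproduced in this text, the sketch does not guess it: the full decision is passed in as a caller-supplied predicate `is_collider`, a hypothetical name. The helper `stratified_chi_square` shows one common way to test conditional independence, by summing the 2x2 chi-square statistics within each value of the conditioning keyword; this choice, and the critical value 5.991 (0.05 level, 2 degrees of freedom), are assumptions rather than the source's stated procedure.

```python
from itertools import combinations
from typing import Callable, Dict, Iterable, Set, Tuple

Edge = Tuple[str, str]


def stratified_chi_square(counts: Dict[Tuple[int, int, int], int]) -> float:
    """Sum of 2x2 chi-square statistics of (alpha, beta) within each gamma value.

    counts[(a, b, g)] is the number of documents with alpha=a, beta=b, gamma=g.
    """
    total = 0.0
    for g in (0, 1):
        n11, n10 = counts[(1, 1, g)], counts[(1, 0, g)]
        n01, n00 = counts[(0, 1, g)], counts[(0, 0, g)]
        n = n11 + n10 + n01 + n00
        den = (n11 + n10) * (n01 + n00) * (n11 + n01) * (n10 + n00)
        if den:
            total += n * (n11 * n00 - n10 * n01) ** 2 / den
    return total


def orient_colliders(tree_edges: Iterable[Edge],
                     is_collider: Callable[[str, str, str], bool]) -> Set[Edge]:
    """Return directed edges (parent, child) for every accepted a -> g <- b."""
    adj: Dict[str, Set[str]] = {}
    for a, b in tree_edges:
        adj.setdefault(a, set()).add(b)
        adj.setdefault(b, set()).add(a)
    directed: Set[Edge] = set()
    for g in sorted(adj):
        for a, b in combinations(sorted(adj[g]), 2):
            if (g, a) in directed or (g, b) in directed:
                continue  # would contradict an earlier orientation
            if is_collider(a, g, b):
                directed.update({(a, g), (b, g)})
    return directed


CRITICAL_005_DF2 = 5.991  # assumed 0.05 level, 2 degrees of freedom
counts = {(a, b, g): 10 for a in (0, 1) for b in (0, 1) for g in (0, 1)}
counts.update({(1, 1, 1): 20, (1, 0, 1): 20, (0, 1, 1): 10, (0, 0, 1): 50})
cond_dependent = stratified_chi_square(counts) > CRITICAL_005_DF2

tree = [("w1", "w2"), ("w2", "w4"), ("w3", "w4")]
# Hypothetical outcome: only w2 - w4 - w3 passes both collider conditions.
colliders = orient_colliders(tree, lambda a, g, b: (a, g, b) == ("w2", "w4", "w3"))
```

A single pass over all triples is enough here because each decision depends only on the data, not on orientations made earlier, apart from the conflict check.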
Step 5: orient all undirected edges in $G'$ as directed edges, subject to the condition that no new converging structure is created. For example, set $w_1\to w_2$ or $w_1\leftarrow w_2$; likewise set $w_4\to w_6$. However, $w_4\leftarrow w_6$ is not allowed, because it would create the new converging structure $w_2\to w_4\leftarrow w_6$. Following this principle, the polytree graph structure $G$ is finally obtained;
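One way to carry out Step 5 is sketched below, under the assumption that the converging structures found so far are mutually consistent. Any node that already has a parent must point away along its remaining undirected edges, since an additional incoming edge would create a new converging structure; components left untouched are oriented outward from an arbitrary root. The function name and the edge set (taken from the keywords mentioned in the example) are hypothetical.

```python
from collections import deque
from typing import Dict, Iterable, Set, Tuple

Edge = Tuple[str, str]


def orient_remaining(tree_edges: Iterable[Edge], directed: Set[Edge]) -> Set[Edge]:
    """Orient every undirected tree edge without creating a new collider."""
    directed = set(directed)
    undirected = {frozenset(e) for e in tree_edges
                  if tuple(e) not in directed and tuple(reversed(e)) not in directed}
    adj: Dict[str, Set[str]] = {}
    for e in undirected:
        a, b = tuple(e)
        adj.setdefault(a, set()).add(b)
        adj.setdefault(b, set()).add(a)

    def push_outward(start: str) -> None:
        # Breadth-first: each reached node gets exactly one new parent.
        queue = deque([start])
        while queue:
            v = queue.popleft()
            for u in sorted(adj.get(v, ())):
                edge = frozenset((v, u))
                if edge in undirected:
                    undirected.discard(edge)
                    directed.add((v, u))
                    queue.append(u)

    # Nodes that already have a parent must point away from themselves.
    for child in sorted({c for _, c in directed}):
        push_outward(child)
    # Remaining components are unconstrained: pick a root and orient outward.
    while undirected:
        push_outward(min(min(e) for e in undirected))
    return directed


tree = [("w1", "w2"), ("w2", "w4"), ("w3", "w4"), ("w4", "w6")]
colliders = {("w2", "w4"), ("w3", "w4")}
G = orient_remaining(tree, colliders)
```

On this input the sketch sets $w_4\to w_6$ because $w_4$ already has parents, and picks $w_1\to w_2$ for the free edge; the opposite choice for that edge would be equally valid.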
Step 6: for each node $v$ in $G$, compute its conditional probability given its parent nodes $\mathrm{pa}(v)$, obtaining the set of conditional probabilities $P$. This finally gives the complete polytree model $(G,P)$, as shown in Fig. 1 (right).
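The conditional probabilities in Step 6 can be estimated from document counts. The source does not state the estimator, so this sketch assumes plain relative frequencies (maximum likelihood); the function name, the keyword labels, and the edge direction a → b used in the example are hypothetical.

```python
from collections import Counter
from typing import Dict, Iterable, List, Set, Tuple


def estimate_cpts(docs: Iterable[Set[str]], keywords: List[str],
                  parents: Dict[str, List[str]]) -> Dict[str, Dict[Tuple[int, ...], float]]:
    """Estimate P(v occurs | parent configuration) by relative frequency.

    The result maps each keyword v to {parent configuration: probability},
    where a configuration is a tuple of 0/1 flags, one per parent of v.
    """
    docs = list(docs)
    cpts: Dict[str, Dict[Tuple[int, ...], float]] = {}
    for v in keywords:
        pa = parents.get(v, [])
        hits: Counter = Counter()    # documents with this configuration where v occurs
        total: Counter = Counter()   # all documents with this configuration
        for words in docs:
            cfg = tuple(int(p in words) for p in pa)
            total[cfg] += 1
            hits[cfg] += int(v in words)
        cpts[v] = {cfg: hits[cfg] / total[cfg] for cfg in total}
    return cpts


# Toy corpus with the counts from the Step 1 example (n = 100).
docs = [{"a", "b"}] * 20 + [{"a"}] * 20 + [{"b"}] * 10 + [set()] * 50
cpts = estimate_cpts(docs, ["a", "b"], {"b": ["a"]})
```

Parent configurations that never occur in the corpus are simply absent from the table; a smoothed estimator would be needed if such configurations must receive a probability.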
Claims (1)
- 1. A modeling method for the quantitative analysis of Party building data, characterized in that it comprises the following steps:
Step 1: quantify each Party building text
1.1. From a collection of $n$ Party building texts $D=\{d_1,d_2,\dots,d_n\}$, extract a set of $m$ Party building keywords $W=\{w_1,w_2,\dots,w_m\}$;
1.2. Define the document frequency function $f(x)$, where $x$ denotes a combination of keywords that occur or do not occur in a document; for a Party building keyword $\alpha\in W$, $\alpha_1$ denotes that $\alpha$ occurs in a document and $\alpha_0$ denotes that $\alpha$ does not occur in a document;
Step 2: for any Party building keywords $\alpha$, $\beta$ and $\gamma$ in $W$, define [formula missing] and use the chi-square test to determine whether $\alpha$ and $\beta$ are mutually independent; the degree of correlation quantitatively measures the direct correlation between $\alpha$ and $\beta$: if $\alpha$ and $\beta$ are mutually independent, then [formula missing], otherwise [formula missing];
Step 3: build the maximum weight spanning tree $T$ over the $m$ nodes
3.1. Abstract each Party building keyword in $W=\{w_1,w_2,\dots,w_m\}$ as a node of $T$;
3.2. Examine the degrees of correlation in descending order; if adding the undirected edge $\alpha - \beta$ to $T$ does not create a cycle, add it, otherwise discard it, until $T$ has $m-1$ edges or [condition missing];
Step 4: for each subgraph $\alpha - \gamma - \beta$ in $T$, compute [formula missing] and use the chi-square test to determine whether $\alpha$ and $\beta$ are conditionally independent given $\gamma$; if $\alpha$ and $\beta$ are not conditionally independent given $\gamma$, and [condition missing], orient the subgraph $\alpha - \gamma - \beta$ as the converging structure $\alpha\to\gamma\leftarrow\beta$; repeat until no subgraph satisfies the condition, yielding a graph $G'$;
Step 5: orient all undirected edges in $G'$ as directed edges, subject to the condition that no new converging structure is created, yielding the polytree graph structure $G$;
Step 6: for each node $v$ in $G$, compute its conditional probability given its parent nodes $\mathrm{pa}(v)$, obtaining the set of conditional probabilities $P$, and finally obtain the complete polytree model $(G,P)$ of Party building data.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201710751678.7A CN107562854B (en) | 2017-08-28 | 2017-08-28 | Modeling method for quantitatively analyzing party building data |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201710751678.7A CN107562854B (en) | 2017-08-28 | 2017-08-28 | Modeling method for quantitatively analyzing party building data |
Publications (2)
Publication Number | Publication Date |
---|---|
CN107562854A (en) | 2018-01-09 |
CN107562854B (en) | 2020-09-22 |
Family
ID=60977304
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201710751678.7A Active CN107562854B (en) | 2017-08-28 | 2017-08-28 | Modeling method for quantitatively analyzing party building data |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN107562854B (en) |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN103049569A (en) * | 2012-12-31 | 2013-04-17 | 武汉传神信息技术有限公司 | Text similarity matching method on basis of vector space model |
US20160224564A1 (en) * | 2013-09-29 | 2016-08-04 | Peking University Founder Group Co., Ltd. | Method and system for key knowledge point recommendation |
CN106598999A (en) * | 2015-10-19 | 2017-04-26 | 北京国双科技有限公司 | Method and device for calculating text theme membership degree |
CN106844328A (en) * | 2016-08-23 | 2017-06-13 | 华南师范大学 | A kind of new extensive document subject matter semantic analysis and system |
CN106874695A (en) * | 2017-03-22 | 2017-06-20 | 北京大数医达科技有限公司 | The construction method and device of medical knowledge collection of illustrative plates |
Also Published As
Publication number | Publication date |
---|---|
CN107562854B (en) | 2020-09-22 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||