CN107562854A - Modeling method for quantitatively analyzing Party building data - Google Patents

Modeling method for quantitatively analyzing Party building data

Info

Publication number
CN107562854A
Authority
CN
China
Prior art keywords
party
developing
work
keyword
node
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201710751678.7A
Other languages
Chinese (zh)
Other versions
CN107562854B (en)
Inventor
李维华
王兵益
郭延哺
王顺芳
何敏
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Yunnan University YNU
Original Assignee
Yunnan University YNU
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Yunnan University YNU
Priority to CN201710751678.7A
Publication of CN107562854A
Application granted
Publication of CN107562854B
Active
Anticipated expiration

Abstract

The invention discloses a modeling method for quantitatively analyzing Party building data. The method first extracts Party building work keywords from Party building work texts and quantitatively measures the correlation between each pair of keywords. It then abstracts the keywords as the nodes of a polytree model, determines the directed edges of the model from the pairwise correlations, and finally determines the conditional probability parameters of the model. The invention provides a quantitative measure of the correlation between Party building work keywords and uses a polytree model to model the correlations among all keywords intuitively, which supports further analysis of Party building big data.

Description

Modeling method for quantitatively analyzing Party building data
Technical field
The invention belongs to the field of data mining technology and relates to a modeling method for quantitatively analyzing Party building data.
Background
Party building work is the basic guarantee for building the Party membership and for carrying out all other work, and raising the scientific level of Party building work is an important current task. Party building work produces massive amounts of data, including data on ideological building, organizational building, work-style building and institutional building, so intelligent management and effective analysis of Party building big data have become an urgent need. Quantitative modeling and association analysis of this data, together with research into effective analysis and mining methods, are the key to analyzing it effectively and the basis for raising the scientific level of Party building. The polytree model is a simple probabilistic graphical model for representing and reasoning about uncertain knowledge. It not only captures quantitative uncertain relationships among data but also provides an efficient inference mechanism for the quantitative analysis of Party building work. The invention uses a polytree model to model Party building work quantitatively: by quantitatively measuring the correlations among Party building work keywords, it provides a modeling approach for mining global correlations, supports Party building text analysis and work analysis, and provides technical support for raising the scientific level of Party building.
Summary of the invention
For the massive data produced in Party building work, the invention provides an effective modeling method for mining the global correlations in Party building data, supporting the analysis of Party building work big data. The method mainly comprises the following steps:
Step 1: Quantify each Party building work text, specifically:
1.1 From a collection of n Party building work texts D = {d_1, d_2, …, d_n}, extract a set of m Party building work keywords W = {w_1, w_2, …, w_m};
1.2 Define the document frequency function f(x), where x denotes a combination of keywords that occur and do not occur in a document. For a Party building work keyword α ∈ W, α_1 denotes that α occurs in a document and α_0 denotes that α does not occur in a document. For example, (α_1, β_0) denotes the keyword combination in which α occurs and β does not, and f(α_1, β_0) denotes the document frequency with which α occurs but β does not;
Step 2: For any Party building work keywords α, β and γ in W, define [formula missing from this text], and use the chi-square test to determine whether α and β are mutually independent. Use the degree of correlation to measure quantitatively the direct correlation between α and β: if α and β are mutually independent, then [formula missing from this text], otherwise [formula missing from this text];
Step 3: Build the maximum weight spanning tree T over the m nodes:
3.1 Abstract each Party building work keyword in W = {w_1, w_2, …, w_m} as a node of T;
3.2 Examine the degrees of correlation between keyword pairs in descending order. For each pair, if no cycle is produced, add an undirected edge α - β to T, otherwise discard the edge, until T has m − 1 edges or [condition missing from this text];
Step 4: For each subgraph α - γ - β in T, calculate [formula missing from this text] and use the chi-square test to determine whether α and β are conditionally independent given γ. If α and β are not conditionally independent given γ, and [condition missing from this text], orient the subgraph α - γ - β as the converging structure α → γ ← β. Repeat until no subgraph satisfies the condition, obtaining a graph G′;
Step 5: Provided that no new converging structure is produced, orient all remaining undirected edges in G′ as directed edges, obtaining the polytree graph structure G;
Step 6: For each node v in G, calculate the conditional probability of v given its parent nodes pa(v), obtain the set of conditional probabilities P, and finally obtain the complete Party building big data polytree model (G, P).
Brief description of the drawings
Fig. 1 shows the process of building the Party building data polytree model.
Detailed description of the embodiment
An embodiment of the invention is described in detail below with reference to Fig. 1.
Step 1: Quantify each Party building work text, specifically:
1.1 From a collection of n Party building work texts D = {d_1, d_2, …, d_n}, extract a set of m Party building work keywords W = {w_1, w_2, …, w_m};
1.2 Define the document frequency function f(x), where x denotes a combination of keywords that occur and do not occur in a document. For a Party building work keyword α ∈ W, α_1 denotes that α occurs in a document and α_0 denotes that α does not occur in a document. For example, (α_1, β_0) denotes the keyword combination in which α occurs and β does not, and f(α_1, β_0) denotes the document frequency with which α occurs but β does not;
Suppose that, for n = 100, the document frequencies counted for two keywords α and β are f(α_1, β_1) = 20, f(α_1, β_0) = 20, f(α_0, β_1) = 10, f(α_0, β_0) = 50, f(α_1) = 40, f(α_0) = 60, f(β_1) = 30 and f(β_0) = 70.
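The document frequency function of step 1.2 can be sketched in a few lines of Python. This is only an illustration: the names `doc_frequency` and `contingency`, and the representation of each text as a set of keywords, are assumptions and do not come from the patent. The toy corpus is built so that it reproduces the counts of the example above.

```python
from itertools import product

def doc_frequency(docs, present=(), absent=()):
    """Count documents that contain every keyword in `present` and none in `absent`."""
    return sum(
        1
        for words in docs
        if all(w in words for w in present) and not any(w in words for w in absent)
    )

def contingency(docs, a, b):
    """Return {(i, j): f(a_i, b_j)} where 1 means 'occurs' and 0 means 'does not occur'."""
    table = {}
    for i, j in product((1, 0), repeat=2):
        flags = ((a, i), (b, j))
        present = [w for w, flag in flags if flag]
        absent = [w for w, flag in flags if not flag]
        table[(i, j)] = doc_frequency(docs, present, absent)
    return table

# Hypothetical corpus of n = 100 texts, each reduced to the set of keywords it contains.
docs = [{"alpha", "beta"}] * 20 + [{"alpha"}] * 20 + [{"beta"}] * 10 + [set()] * 50
table = contingency(docs, "alpha", "beta")
print(table)  # the four joint document frequencies of alpha and beta
```

The marginal frequencies follow from the same function, for example `doc_frequency(docs, present=["alpha"])` for f(α_1).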
Step 2: For any Party building work keywords α, β and γ in W, define [formula missing from this text], and use the chi-square test to determine whether α and β are mutually independent. Use the degree of correlation to measure quantitatively the direct correlation between α and β: if α and β are mutually independent, then [formula missing from this text], otherwise [formula missing from this text];
For example, if the two keywords α and β have the document frequencies given in step 1, then [formula missing from this text] = 0.063;
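The formulas of step 2 are not reproduced in this text, so the following Python sketch rests on assumptions. It assumes a Pearson chi-square test on the 2×2 table at the 5% level (critical value 3.841 for one degree of freedom) and assumes that the degree of correlation is the mutual information of the two keywords with natural logarithms, set to 0 when the test does not reject independence. Under these assumptions the example counts from step 1 give 0.063, which matches the value stated above, but the patent's actual definition may differ. All function names are hypothetical.

```python
import math

CRITICAL_5PCT_DF1 = 3.841  # chi-square critical value, 1 degree of freedom, 5% level (assumed level)

def chi_square_2x2(n11, n10, n01, n00):
    """Pearson chi-square statistic of a 2x2 table of document frequencies."""
    n = n11 + n10 + n01 + n00
    denom = (n11 + n10) * (n01 + n00) * (n11 + n01) * (n10 + n00)
    return n * (n11 * n00 - n10 * n01) ** 2 / denom

def mutual_information(n11, n10, n01, n00):
    """Mutual information (in nats) estimated from the same 2x2 table."""
    n = n11 + n10 + n01 + n00
    cells = {(1, 1): n11, (1, 0): n10, (0, 1): n01, (0, 0): n00}
    row = {1: n11 + n10, 0: n01 + n00}
    col = {1: n11 + n01, 0: n10 + n00}
    return sum(
        (c / n) * math.log(c * n / (row[i] * col[j]))
        for (i, j), c in cells.items()
        if c > 0
    )

def correlation_degree(n11, n10, n01, n00, critical=CRITICAL_5PCT_DF1):
    """Assumed rule: 0 if the chi-square test finds independence, else mutual information."""
    if chi_square_2x2(n11, n10, n01, n00) <= critical:
        return 0.0
    return mutual_information(n11, n10, n01, n00)

# Counts from the example: f(a1,b1)=20, f(a1,b0)=20, f(a0,b1)=10, f(a0,b0)=50
print(round(correlation_degree(20, 20, 10, 50), 3))  # 0.063
```

For these counts the chi-square statistic is about 12.7, well above 3.841, so the two keywords are treated as dependent and the mutual information is returned.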
Step 3: Build the maximum weight spanning tree T over the m nodes, specifically:
3.1 Abstract each Party building work keyword in W = {w_1, w_2, …, w_m} as a node of T;
3.2 Examine the degrees of correlation between keyword pairs in descending order. For each pair, if no cycle is produced, add an undirected edge α - β to T, otherwise discard the edge, until T has m − 1 edges or [condition missing from this text]. Fig. 1 (left) shows a maximum weight spanning tree T.
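Step 3.2 describes a greedy, Kruskal-style construction. A minimal Python sketch is given below, using a union-find structure to detect cycles. The function name, the `min_weight` parameter and the example weights are hypothetical. Because the second stopping condition is missing from this text, the sketch assumes that the loop also stops once the remaining degrees of correlation are no greater than `min_weight`.

```python
def max_weight_spanning_tree(nodes, weights, min_weight=0.0):
    """Greedy maximum weight spanning tree.

    weights maps an unordered pair (u, v) to its degree of correlation.
    Returns the chosen undirected edges as (u, v, weight) tuples.
    """
    parent = {v: v for v in nodes}

    def find(v):
        # Union-find lookup with path halving.
        while parent[v] != v:
            parent[v] = parent[parent[v]]
            v = parent[v]
        return v

    tree = []
    ranked = sorted(weights.items(), key=lambda item: item[1], reverse=True)
    for (u, v), w in ranked:
        # Stop at m - 1 edges, or (assumed) when no positive correlation is left.
        if len(tree) == len(nodes) - 1 or w <= min_weight:
            break
        root_u, root_v = find(u), find(v)
        if root_u != root_v:  # adding the edge does not create a cycle
            parent[root_u] = root_v
            tree.append((u, v, w))
    return tree

# Hypothetical degrees of correlation between four keywords.
nodes = ["w1", "w2", "w3", "w4"]
weights = {
    ("w1", "w2"): 0.5,
    ("w2", "w3"): 0.4,
    ("w1", "w3"): 0.3,  # would close the cycle w1 - w2 - w3, so it is discarded
    ("w3", "w4"): 0.2,
    ("w1", "w4"): 0.0,
}
tree = max_weight_spanning_tree(nodes, weights)
print(tree)
```

If some keywords are independent of all others, the result under this assumption is a forest rather than a single tree.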
Step 4: For the subgraph w_1 - w_2 - w_4 in T, if [condition missing from this text], the directed edges cannot be determined. Check the subgraph w_3 - w_4 - w_2: if w_2 and w_3 are not conditionally independent given w_4 and [condition missing from this text], orient w_3 - w_4 - w_2 as the converging structure w_3 → w_4 ← w_2. Check the other subgraphs that satisfy the condition in the same way. The graph G′ in Fig. 1 (middle) is one possible resulting structure;
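The search for converging structures can be sketched as follows. Several parts are assumptions, because the corresponding formulas are missing from this text: conditional independence is tested with a chi-square statistic summed over the two strata of the middle keyword (2 degrees of freedom, critical value 5.991 at the 5% level), and the missing second condition is taken to be marginal independence of the two outer keywords, a common criterion in polytree structure learning. All names are hypothetical, and each text is again represented as a set of keywords.

```python
from itertools import combinations

def chi2_2x2(n11, n10, n01, n00):
    """Pearson chi-square of a 2x2 table; 0.0 if a margin is empty."""
    n = n11 + n10 + n01 + n00
    denom = (n11 + n10) * (n01 + n00) * (n11 + n01) * (n10 + n00)
    return 0.0 if denom == 0 else n * (n11 * n00 - n10 * n01) ** 2 / denom

def counts(docs, a, b):
    """Joint document frequencies f(a1,b1), f(a1,b0), f(a0,b1), f(a0,b0)."""
    n11 = sum(1 for d in docs if a in d and b in d)
    n10 = sum(1 for d in docs if a in d and b not in d)
    n01 = sum(1 for d in docs if a not in d and b in d)
    return n11, n10, n01, len(docs) - n11 - n10 - n01

def independent(docs, a, b, critical=3.841):
    return chi2_2x2(*counts(docs, a, b)) <= critical

def cond_independent(docs, a, b, g, critical=5.991):
    """Stratified test: sum the statistics for documents with and without g."""
    with_g = [d for d in docs if g in d]
    without_g = [d for d in docs if g not in d]
    stat = chi2_2x2(*counts(with_g, a, b)) + chi2_2x2(*counts(without_g, a, b))
    return stat <= critical

def orient_colliders(edges, docs):
    """Return the directed edges (parent, child) of all detected converging structures."""
    neighbours = {}
    for u, v in edges:
        neighbours.setdefault(u, set()).add(v)
        neighbours.setdefault(v, set()).add(u)
    directed = set()
    for g, adjacent in neighbours.items():
        for a, b in combinations(sorted(adjacent), 2):
            # Assumed criterion: a and b independent marginally, dependent given g.
            if independent(docs, a, b) and not cond_independent(docs, a, b, g):
                directed.update({(a, g), (b, g)})
    return directed

# Toy corpus: "g" occurs exactly when both "a" and "b" occur.
docs = [{"a", "b", "g"}] * 25 + [{"a"}] * 25 + [{"b"}] * 25 + [set()] * 25
colliders = orient_colliders([("a", "g"), ("g", "b")], docs)
print(colliders)  # both edges point into "g"
```

In this toy corpus a and b are marginally independent (statistic 0) but strongly dependent once g is known (statistic 18.75), so the subgraph a - g - b is oriented as a → g ← b.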
Step 5: Provided that no new converging structure is produced, orient all remaining undirected edges in G′ as directed edges. For example, set w_1 → w_2 or w_1 ← w_2, and likewise set w_4 → w_6. However, w_4 ← w_6 cannot be set, because this would produce the new converging structure w_2 → w_4 ← w_6. Following this principle, the polytree graph structure G is finally obtained;
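One way to carry out step 5 is to propagate directions away from nodes that already have a parent: if x → y is fixed and y - z is still undirected, then z → y would create a new converging structure at y, so the edge must become y → z. Edges that no constraint reaches can be oriented freely. The Python sketch below implements this idea. It is an assumed procedure, not the patent's stated algorithm, and the edge list only partially mirrors the example (Fig. 1 is not available here).

```python
from collections import deque

def orient_remaining(edges, directed):
    """Orient all undirected tree edges without creating new converging structures.

    edges: undirected tree edges (u, v); directed: already fixed (parent, child) pairs.
    Returns the full set of directed edges.
    """
    directed = set(directed)
    undirected = {frozenset(e) for e in edges} - {frozenset(e) for e in directed}
    queue = deque(directed)
    while undirected:
        if not queue:
            # No constraint applies: the direction of this edge is a free choice.
            u, v = sorted(undirected.pop())
            directed.add((u, v))
            queue.append((u, v))
            continue
        _, child = queue.popleft()
        # The child already has a parent, so its other edges must point away from it.
        for edge in [e for e in undirected if child in e]:
            (other,) = edge - {child}
            undirected.discard(edge)
            directed.add((child, other))
            queue.append((child, other))
    return directed

# Edges loosely following the example; w3 -> w4 <- w2 is fixed by step 4.
edges = [("w1", "w2"), ("w2", "w4"), ("w3", "w4"), ("w4", "w6")]
fixed = {("w3", "w4"), ("w2", "w4")}
result = orient_remaining(edges, fixed)
print(sorted(result))
```

Here w_4 - w_6 is forced to w_4 → w_6, while the direction of w_1 - w_2 is a free choice, in line with the example above.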
Step 6: For each node v in G, calculate the conditional probability of v given its parent nodes pa(v), obtain the set of conditional probabilities P, and finally obtain the complete polytree model (G, P), as shown in Fig. 1 (right).
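The conditional probabilities of step 6 can be estimated from document frequencies, as in the Python sketch below. The text does not state the estimator, so relative frequencies without smoothing are assumed here, and parent configurations that never occur in the corpus are simply left out of the table. The function name and the toy corpus are hypothetical.

```python
from itertools import product

def conditional_probabilities(docs, directed):
    """Estimate P(v occurs | parent configuration) for every node of the polytree.

    directed: set of (parent, child) edges.
    Returns {v: (parents, {config: probability})}, where config is a tuple of 0/1
    flags in the order of `parents`.
    """
    parents = {}
    for p, c in directed:
        parents.setdefault(c, []).append(p)
        parents.setdefault(p, [])
    cpts = {}
    for v, pa in parents.items():
        pa = sorted(pa)
        table = {}
        for config in product((0, 1), repeat=len(pa)):
            matching = [
                d for d in docs
                if all((p in d) == bool(flag) for p, flag in zip(pa, config))
            ]
            if matching:  # skip parent configurations that never occur
                table[config] = sum(1 for d in matching if v in d) / len(matching)
        cpts[v] = (tuple(pa), table)
    return cpts

# Toy corpus in which "g" occurs exactly when both "a" and "b" occur.
docs = [{"a", "b", "g"}] * 25 + [{"a"}] * 25 + [{"b"}] * 25 + [set()] * 25
cpts = conditional_probabilities(docs, {("a", "g"), ("b", "g")})
print(cpts["g"])
```

A root node has no parents, so its table has the single empty configuration `()` holding its marginal probability.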

Claims (1)

  1. A modeling method for quantitatively analyzing Party building data, characterized in that it comprises the following steps:
    Step 1: Quantify each Party building work text
    1.1 From a collection of n Party building work texts D = {d_1, d_2, …, d_n}, extract a set of m Party building work keywords W = {w_1, w_2, …, w_m};
    1.2 Define the document frequency function f(x), where x denotes a combination of keywords that occur and do not occur in a document; for a Party building work keyword α ∈ W, α_1 denotes that α occurs in a document and α_0 denotes that α does not occur in a document;
    Step 2: For any Party building work keywords α, β and γ in W, define [formula missing from this text], and use the chi-square test to determine whether α and β are mutually independent; use the degree of correlation to measure quantitatively the direct correlation between α and β: if α and β are mutually independent, then [formula missing from this text], otherwise [formula missing from this text];
    Step 3: Build the maximum weight spanning tree T over the m nodes
    3.1 Abstract each Party building work keyword in W = {w_1, w_2, …, w_m} as a node of T;
    3.2 Examine the degrees of correlation in descending order; if no cycle is produced, add an undirected edge α - β to T, otherwise discard the edge, until T has m − 1 edges or [condition missing from this text];
    Step 4: For each subgraph α - γ - β in T, calculate [formula missing from this text] and use the chi-square test to determine whether α and β are conditionally independent given γ; if α and β are not conditionally independent given γ, and [condition missing from this text], orient the subgraph α - γ - β as the converging structure α → γ ← β, until no subgraph satisfies the condition, obtaining a graph G′;
    Step 5: Provided that no new converging structure is produced, orient all remaining undirected edges in G′ as directed edges, obtaining the polytree graph structure G;
    Step 6: For each node v in G, calculate the conditional probability of v given its parent nodes pa(v), obtain the set of conditional probabilities P, and finally obtain the complete Party building data polytree model (G, P).
CN201710751678.7A 2017-08-28 2017-08-28 Modeling method for quantitatively analyzing party building data Active CN107562854B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201710751678.7A CN107562854B (en) 2017-08-28 2017-08-28 Modeling method for quantitatively analyzing party building data

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201710751678.7A CN107562854B (en) 2017-08-28 2017-08-28 Modeling method for quantitatively analyzing party building data

Publications (2)

Publication Number Publication Date
CN107562854A true CN107562854A (en) 2018-01-09
CN107562854B CN107562854B (en) 2020-09-22

Family

ID=60977304

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201710751678.7A Active CN107562854B (en) 2017-08-28 2017-08-28 Modeling method for quantitatively analyzing party building data

Country Status (1)

Country Link
CN (1) CN107562854B (en)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103049569A (en) * 2012-12-31 2013-04-17 武汉传神信息技术有限公司 Text similarity matching method on basis of vector space model
US20160224564A1 (en) * 2013-09-29 2016-08-04 Peking University Founder Group Co., Ltd. Method and system for key knowledge point recommendation
CN106598999A (en) * 2015-10-19 2017-04-26 北京国双科技有限公司 Method and device for calculating text theme membership degree
CN106844328A (en) * 2016-08-23 2017-06-13 华南师范大学 A kind of new extensive document subject matter semantic analysis and system
CN106874695A (en) * 2017-03-22 2017-06-20 北京大数医达科技有限公司 The construction method and device of medical knowledge collection of illustrative plates

Also Published As

Publication number Publication date
CN107562854B (en) 2020-09-22

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant