CN115859968B - Policy granulation analysis system based on natural language analysis and machine learning - Google Patents


Info

Publication number
CN115859968B
Authority
CN
China
Prior art keywords
policy
machine learning
unit
natural language
word
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202310166168.9A
Other languages
Chinese (zh)
Other versions
CN115859968A (en
Inventor
杨显华
杨弋
丁春利
王铮
牛颢
高屹嵩
龙树全
姚晗
王舒
魏兵兵
李�浩
廖建雄
周文安
唐山
聂珊
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Sichuan Institute Of Standardization
SICHUAN INSTITUTE OF COMPUTER SCIENCES
Original Assignee
Sichuan Institute Of Standardization
SICHUAN INSTITUTE OF COMPUTER SCIENCES
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Sichuan Institute Of Standardization, SICHUAN INSTITUTE OF COMPUTER SCIENCES filed Critical Sichuan Institute Of Standardization
Priority to CN202310166168.9A priority Critical patent/CN115859968B/en
Publication of CN115859968A publication Critical patent/CN115859968A/en
Application granted granted Critical
Publication of CN115859968B publication Critical patent/CN115859968B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Landscapes

  • Machine Translation (AREA)

Abstract

The invention relates to a policy granulation analysis system based on natural language analysis and machine learning, which addresses the technical problem of low accuracy. The system comprises a policy file acquisition input module, a natural language processing module, a machine learning optimization module and a policy granulation analysis output module. The policy granulation analysis output module analyzes and outputs the granulation parameters of the policy according to preset policy dimension characteristics and the result of the natural language processing module. The natural language processing module comprises a file preprocessing unit, a core processing component unit, a word normalization unit, a part-of-speech tagging unit, a primary analysis unit, a dictionary query unit, a deep analysis unit and a natural language processing output unit. The machine learning optimization module comprises a part-of-speech quantization unit, a machine learning algorithm library and an optimization fusion unit. The system can be used for policy granulation analysis.

Description

Policy granulation analysis system based on natural language analysis and machine learning
Technical Field
The invention relates to the field of policy analysis systems, in particular to a policy granulation analysis system based on natural language analysis and machine learning.
Background
Policy analysis is the process by which individuals, groups and research institutions systematically investigate and observe the policies, decision procedures and activities that an organization is currently implementing or plans to implement, together with the information, conditions and problems they reflect, and analyze them quantitatively and qualitatively. Its purpose is to help policy makers maintain or improve policy goals, so as to achieve social development and benefit the majority of people. The concept was first proposed by the American political scientist Lindblom, who held that policy analysis is a common part of policy formulation. The main theoretical models of policy analysis include the political system model, community model, elite model, functional process model, system model, rational model, incremental model and game model.
The invention provides a policy granulation analysis system based on natural language analysis and machine learning to address the low accuracy of existing policy analysis.
Disclosure of Invention
The invention aims to solve the technical problem of low accuracy of policy granulation analysis in the prior art. It provides a novel policy granulation analysis system based on natural language analysis and machine learning that is characterized by high accuracy.
In order to solve the technical problems, the technical scheme adopted is as follows:
a policy granulation analysis system based on natural language analysis and machine learning, comprising:
a policy file acquisition input module, a natural language processing module, a policy granulation analysis output module and a machine learning optimization module, wherein the machine learning optimization module is connected with the natural language processing module;
the policy granulation analysis output module analyzes and outputs the granulation parameters of the policy according to preset policy dimension characteristics and the result of the natural language processing module;
the natural language processing module comprises a file preprocessing unit, a core processing component unit, a word normalization unit, a part-of-speech tagging unit, a primary analysis unit, a dictionary query unit, a deep analysis unit and a natural language processing output unit;
the machine learning optimization module comprises a part-of-speech quantization unit, a machine learning algorithm library and an optimization fusion unit; the part-of-speech quantization unit is used for processing natural language into machine quantized language, the machine learning algorithm library is used for loading various machine learning algorithms, and the machine learning optimization module executes the following steps:
step s1, the part-of-speech quantization unit processes natural language into machine language;
step s2, the policy text is divided into $s_k$ groups, and $s_k$ kinds of machine learning algorithm models are correspondingly retrieved from the machine learning algorithm library;
step s3, the $s_{ki}$-th group of subset data is selected as the validation set and the remaining $k-1$ groups of subset data are used as the training set, which is input into the $s_{ki}$-th machine learning algorithm model, yielding $s_k \times s_k$ model calculation values, $ki = 1, 2, 3, \ldots, k$;
step s4, define the intermediate parameter $\psi(w)$ [formula missing in the source text], wherein $\{x_1, x_2, \ldots, x_{ki}, \ldots, x_k\}$ are the calculated values of the $k$ independent algorithm models obtained when the $s_{ki}$-th group of subset data is defined as the validation set; $ki = 1, 2, 3, \ldots, k$; $j$ and $w$ are predefined parameters, and $\{w_1, w_2, \ldots, w_k\}$ is a set of real numbers;
step s5, the characteristic index $\alpha$ and the weight dispersion coefficient $\gamma$ are calculated from $y_{ki} = \mu + \alpha t_{ki} + \varepsilon_{ki}$ with $\mu = \log(2\gamma)$, wherein [definition of $y_{ki}$ missing in the source text] $\varepsilon_{ki}$ are predefined error terms that are independent and identically distributed with mean 0, and $t_{ki} = \log|w_{ki}|$;
step s6, the position parameter $\delta$ is calculated from $z_{ki} = \delta w_{ki} + \varepsilon_{ki}$, wherein $z_{ki} = \arctan(\mathrm{Im}(w_{ki})/\mathrm{Re}(w_{ki}))$ and $\varepsilon_{ki}$ are predefined error terms that are independent and identically distributed with mean 0;
step s7, the characteristic index $\alpha$, the weight dispersion coefficient $\gamma$ and the position parameter $\delta$ obtained in steps s5 and s6 are substituted into $\Phi(w) = \exp\{j\delta w - \gamma|w|^{\alpha}\}$, a Fourier transform is performed to obtain the weight distribution function $f(x)$, and the model calculation values are multiplied by $f(x)$ to complete the fitting of the $k$ algorithm model calculation values.
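Steps s4 to s7 can be illustrated with a minimal sketch. The defining formula of the intermediate quantity in step s4 is not available in this text, so the sketch assumes it is the empirical characteristic function of the model calculation values, and it evaluates the arctangent of step s6 on that function rather than on $w$ itself. The function names, the frequency grid and the integration limits are all hypothetical choices made for illustration, not the patented method.

```python
import numpy as np

def empirical_cf(x, w):
    """ASSUMPTION: psi(w) is the mean of exp(j*w*x) over the model values x."""
    x = np.asarray(x, dtype=float)
    w = np.asarray(w, dtype=float)
    return np.exp(1j * np.outer(w, x)).mean(axis=1)

def estimate_stable_params(x, w):
    """Regress for alpha, gamma (step s5) and delta (step s6)."""
    w = np.asarray(w, dtype=float)
    psi = empirical_cf(x, w)
    # Step s5: y = mu + alpha * t, with t = log|w| and mu = log(2 * gamma).
    t = np.log(np.abs(w))
    y = np.log(-np.log(np.abs(psi) ** 2))
    alpha, mu = np.polyfit(t, y, 1)
    gamma = np.exp(mu) / 2.0
    # Step s6: z = delta * w, fitted by least squares through the origin.
    z = np.arctan2(psi.imag, psi.real)
    delta = np.dot(w, z) / np.dot(w, w)
    return float(alpha), float(gamma), float(delta)

def weight_density(x_grid, alpha, gamma, delta, w_max=50.0, n=20001):
    """Step s7: numerical inverse Fourier transform of Phi(w)."""
    w = np.linspace(-w_max, w_max, n)
    dw = w[1] - w[0]
    phi = np.exp(1j * delta * w - gamma * np.abs(w) ** alpha)
    kernel = np.exp(-1j * np.outer(np.asarray(x_grid, dtype=float), w))
    return (kernel * phi).sum(axis=1).real * dw / (2.0 * np.pi)
```

With normally distributed inputs the regression should recover a characteristic index close to 2, which gives a quick sanity check of the sketch before applying it to real model outputs.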
The working principle of the invention is as follows: by combining natural language recognition and analysis with machine learning, the invention realizes policy granulation analysis efficiently. To further improve accuracy, the invention provides a machine learning optimization module in which the part-of-speech quantization unit, the machine learning algorithm library and the optimization fusion unit work together to fuse multiple machine learning algorithms. The policy text is divided into $s_k$ groups and $s_k$ kinds of machine learning algorithm models are correspondingly retrieved from the machine learning algorithm library; the model outputs are then combined by a dedicated fusion algorithm, which weights the individual algorithms and yields a high-accuracy calculated value for natural language recognition and analysis.
As a further optimization of the above scheme, the core processing component unit comprises a word segmenter, a sentence boundary annotator, a substitute sentence detector, a mark generator and a document segment description annotator; the sentence boundary annotator is the OpenNLP sentence detection module.
Further, the file preprocessing unit: converts the policy file into a plain text file, inserts paragraph marks into the text, corrects wrongly joined words, and inserts hyphens;
word normalization unit: provides a canonical representation for each word in the policy text and normalizes words according to lexical attributes, specifically letter case, singular and plural forms, spelling variants, punctuation marks, attribute marks, stop words, inflection marks, symbols and conjunctions; it can map the same word to its different written forms; it can be implemented with the existing SPECIALIST lexical tools;
part-of-speech tagging unit: assigns an appropriate part of speech to each word in a sentence, the parts of speech including nouns, verbs, adjectives and adverbs; existing rule-based, stochastic or hybrid tagging algorithms can be used;
primary analysis unit: completes keyword marking; the chunking module of the existing cTAKES model can be used;
policy feature entity identification unit: maps each policy feature entity from a term to a concept using the existing dictionary query method, searching for exact matches between words in dictionary entries and words in the policy text, and matching canonical word forms by searching the ordering of words in the dictionary;
deep analysis unit: provides syntactic information and determines the association relations among words; words in natural language are expressed as numerical vectors to obtain word vectors;
natural language processing output unit: outputs the natural language recognition and processing results for the subsequent policy granulation analysis;
the deep analysis unit performs the following steps to realize word association:
step k1, a word word1 is taken from the seed word set, and its association degree with a word word2 in the candidate word set is calculated;
step k2, the association degree of word1 and word2 is calculated as $\mathrm{PMI}(word1, word2) = \log\frac{P(word1, word2)}{P(word1)\,P(word2)}$, wherein $P(word1, word2)$ is the probability that word1 and word2 appear together, $P(word1)$ is the probability that word1 appears in the article, and $P(word2)$ is the probability that word2 appears in the article;
step k3, the association degree $\mathrm{PMI}(word1, word2)$ is compared with a predefined threshold; if it is larger than the threshold, word1 and word2 are defined as associated, word1 is assigned to the associated words of word2 and word2 to the associated words of word1; otherwise they are defined as unrelated.
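Steps k1 to k3 can be sketched as follows. This is only an illustration under stated assumptions: probabilities are estimated as the fraction of documents in which a word or a word pair occurs, the logarithm is natural, and the function names `pmi_table` and `associated` are hypothetical. The text does not specify how the probabilities are estimated or which logarithm base is used.

```python
import math
from collections import Counter
from itertools import combinations

def pmi_table(documents):
    """Compute PMI for every word pair that co-occurs in at least one document.

    ASSUMPTION: P(.) is the fraction of documents containing the word or pair.
    """
    n_docs = len(documents)
    word_counts = Counter()
    pair_counts = Counter()
    for doc in documents:
        words = sorted(set(doc))
        word_counts.update(words)
        pair_counts.update(combinations(words, 2))  # pairs are in sorted order
    table = {}
    for (a, b), count in pair_counts.items():
        p_ab = count / n_docs
        p_a = word_counts[a] / n_docs
        p_b = word_counts[b] / n_docs
        table[(a, b)] = math.log(p_ab / (p_a * p_b))
    return table

def associated(table, word1, word2, threshold):
    """Step k3: the pair is associated when its PMI exceeds the threshold."""
    key = tuple(sorted((word1, word2)))
    return table.get(key, float("-inf")) > threshold
```

Pairs that never co-occur are treated as unrelated, which matches the "otherwise" branch of step k3.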
Further, the deep analysis unit performs the following steps to realize feature space map fusion of the associated words:
step r1, words word1 and word2 are selected and each is defined as a center node; the association degree values of the words associated with word1 are calculated, normalized and sorted, and the association degree values of the words associated with word2 are calculated, normalized and sorted, yielding the association relation space map gl1 of word1 and the association relation space map gl2 of word2 respectively; association degree values in the maps are represented by color depth values;
step r2, either the association relation space map gl1 or the association relation space map gl2 is selected as the source space map, and the other is taken as the target space map;
step r3, a center node is selected as the starting point and an adjacent word as the end point; the association degree value $pw_1$ between the starting point and the end point in gl1 and the association degree value $pw_2$ between the starting point and the end point in gl2 are retrieved, and $pw_1 \times pw_2$ is calculated; if the value is smaller than a predefined threshold, step r7 is executed, otherwise step r4 is executed;
step r4, $pw_1 - pw_2$ is calculated; if the value is smaller than a predefined threshold, coincidence fusion is performed, otherwise difference fusion is performed; in coincidence fusion the starting-point words and the end-point words are each merged, and the association line takes the smaller color depth value; in difference fusion gl1 or gl2 is rotated about the starting point so that the starting points are merged while the end points remain distinct, and the association lines keep their respective color depth values;
step r5, all words are traversed to complete the feature space map fusion of the associated words.
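The decision logic of steps r1 to r5 can be sketched as follows, under several assumptions that the text leaves open: each association relation space map is represented as a dictionary from a neighbouring word to its association degree value, normalization divides by the largest (positive) value, the difference in step r4 is taken as an absolute value, and a pair that fails the product test is simply skipped. The names `normalize` and `fuse_maps` are hypothetical, and words that appear in only one map are not handled here.

```python
def normalize(assoc):
    """Step r1: scale association values to (0, 1] and sort them, largest first.

    ASSUMPTION: all values are positive, so dividing by the maximum is safe.
    """
    top = max(assoc.values())
    scaled = {word: value / top for word, value in assoc.items()}
    return dict(sorted(scaled.items(), key=lambda kv: kv[1], reverse=True))

def fuse_maps(gl1, gl2, product_threshold, diff_threshold):
    """Steps r3 to r5 for every end-point word present in both maps."""
    fused = {}
    for end in gl1.keys() & gl2.keys():
        pw1, pw2 = gl1[end], gl2[end]
        if pw1 * pw2 < product_threshold:
            continue  # weak joint association: move on to the next word
        if abs(pw1 - pw2) < diff_threshold:
            # Coincidence fusion: one merged edge with the smaller value.
            fused[end] = ("coincident", min(pw1, pw2))
        else:
            # Difference fusion: keep both edges with their own values.
            fused[end] = ("different", (pw1, pw2))
    return fused
```

The "color depth value" of the text is carried here as the plain association value; a rendering layer could map it to a color intensity.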
Further, the dictionary comprises an objective dictionary and a subjective dictionary; the subjective dictionary is a dictionary composed of words indicating policy tendency.
The beneficial effects of the invention are as follows: by combining natural language recognition and analysis with machine learning, the invention realizes policy granulation analysis efficiently. To further improve accuracy, the invention provides a machine learning optimization module in which the part-of-speech quantization unit, the machine learning algorithm library and the optimization fusion unit work together to fuse multiple machine learning algorithms. The policy text is divided into $s_k$ groups and $s_k$ kinds of machine learning algorithm models are correspondingly retrieved from the machine learning algorithm library; the model outputs are then combined by a dedicated fusion algorithm, which weights the individual algorithms and yields a high-accuracy calculated value for natural language recognition and analysis. In the deep analysis unit, the invention adopts an intelligent map fusion technique to recognize the associations between words and sentences with high accuracy and efficiency.
Drawings
The invention will be further described with reference to the drawings and examples.
FIG. 1 is a schematic diagram of a policy granular analysis system based on natural language parsing and machine learning.
Detailed Description
The present invention will be described in further detail with reference to the following examples in order to make the objects, technical solutions and advantages of the present invention more apparent. It should be understood that the specific embodiments described herein are for purposes of illustration only and are not intended to limit the scope of the invention.
Example 1
The present embodiment provides a policy granulation analysis system based on natural language analysis and machine learning. As shown in FIG. 1, the system comprises:
a policy file acquisition input module, a natural language processing module, a policy granulation analysis output module and a machine learning optimization module, wherein the machine learning optimization module is connected with the natural language processing module;
the policy granulation analysis output module analyzes and outputs the granulation parameters of the policy according to preset policy dimension characteristics and the result of the natural language processing module;
the natural language processing module comprises a file preprocessing unit, a core processing component unit, a word normalization unit, a part-of-speech tagging unit, a primary analysis unit, a dictionary query unit, a deep analysis unit and a natural language processing output unit;
the machine learning optimization module comprises a part-of-speech quantization unit, a machine learning algorithm library and an optimization fusion unit; the part-of-speech quantization unit is used for processing natural language into machine quantized language, the machine learning algorithm library is used for loading various machine learning algorithms, and the machine learning optimization module executes the following steps:
step s1, the part-of-speech quantization unit processes natural language into machine language;
step s2, the policy text is divided into $s_k$ groups, and $s_k$ kinds of machine learning algorithm models are correspondingly retrieved from the machine learning algorithm library;
step s3, the $s_{ki}$-th group of subset data is selected as the validation set and the remaining $k-1$ groups of subset data are used as the training set, which is input into the $s_{ki}$-th machine learning algorithm model, yielding $s_k \times s_k$ model calculation values, $ki = 1, 2, 3, \ldots, k$;
step s4, define the intermediate parameter $\psi(w)$ [formula missing in the source text], wherein $\{x_1, x_2, \ldots, x_{ki}, \ldots, x_k\}$ are the calculated values of the $k$ independent algorithm models obtained when the $s_{ki}$-th group of subset data is defined as the validation set; $ki = 1, 2, 3, \ldots, k$; $j$ and $w$ are predefined parameters, and $\{w_1, w_2, \ldots, w_k\}$ is a set of real numbers;
step s5, the characteristic index $\alpha$ and the weight dispersion coefficient $\gamma$ are calculated from $y_{ki} = \mu + \alpha t_{ki} + \varepsilon_{ki}$ with $\mu = \log(2\gamma)$, wherein [definition of $y_{ki}$ missing in the source text] $\varepsilon_{ki}$ are predefined error terms that are independent and identically distributed with mean 0, and $t_{ki} = \log|w_{ki}|$;
step s6, the position parameter $\delta$ is calculated from $z_{ki} = \delta w_{ki} + \varepsilon_{ki}$, wherein $z_{ki} = \arctan(\mathrm{Im}(w_{ki})/\mathrm{Re}(w_{ki}))$ and $\varepsilon_{ki}$ are predefined error terms that are independent and identically distributed with mean 0;
step s7, the characteristic index $\alpha$, the weight dispersion coefficient $\gamma$ and the position parameter $\delta$ obtained in steps s5 and s6 are substituted into $\Phi(w) = \exp\{j\delta w - \gamma|w|^{\alpha}\}$, a Fourier transform is performed to obtain the weight distribution function $f(x)$, and the model calculation values are multiplied by $f(x)$ to complete the fitting of the $k$ algorithm model calculation values.
By combining natural language recognition and analysis with machine learning, the embodiment realizes policy granulation analysis efficiently. To further improve accuracy, the invention provides a machine learning optimization module in which the part-of-speech quantization unit, the machine learning algorithm library and the optimization fusion unit work together to fuse multiple machine learning algorithms. The policy text is divided into $s_k$ groups and $s_k$ kinds of machine learning algorithm models are correspondingly retrieved from the machine learning algorithm library; the model outputs are then combined by a dedicated fusion algorithm, which weights the individual algorithms and yields a high-accuracy calculated value for natural language recognition and analysis.
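The grouping in steps s2 and s3 resembles k-fold cross-validation run over several models. The sketch below assumes that every retrieved model is evaluated on every held-out group, which gives a k by k table of calculation values; the text is not explicit on this point. The function `cross_validate` and the three toy "models" are hypothetical stand-ins, not algorithms named by the patent.

```python
from statistics import mean, median

def cross_validate(items, models):
    """Return a k x k table: row i holds each model's value for held-out group i.

    ASSUMPTION: the number of groups equals the number of models (s_k).
    """
    k = len(models)
    groups = [items[i::k] for i in range(k)]  # deal items into k groups
    values = []
    for i in range(k):
        validation = groups[i]
        training = [x for j in range(k) if j != i for x in groups[j]]
        values.append([model(training, validation) for model in models])
    return values

# Hypothetical stand-in models: each takes (training, validation) and returns
# one calculation value. Real models would be trained and then scored.
toy_models = [
    lambda training, validation: mean(training),
    lambda training, validation: median(training),
    lambda training, validation: max(training),
]
values = cross_validate(list(range(1, 10)), toy_models)
```

Each row of `values` plays the role of the set of model calculation values that the later fusion steps weight and combine.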
Specifically, the core processing component unit comprises a word segmenter, a sentence boundary annotator, a substitute sentence detector, a mark generator and a document segment description annotator; the sentence boundary annotator is the OpenNLP sentence detection module.
Specifically, the file preprocessing unit: converts the policy file into a plain text file, inserts paragraph marks into the text, corrects wrongly joined words, and inserts hyphens;
word normalization unit: provides a canonical representation for each word in the policy text and normalizes words according to lexical attributes, specifically letter case, singular and plural forms, spelling variants, punctuation marks, attribute marks, stop words, inflection marks, symbols and conjunctions; it can map the same word to its different written forms; it can be implemented with the existing SPECIALIST lexical tools;
part-of-speech tagging unit: assigns an appropriate part of speech to each word in a sentence, the parts of speech including nouns, verbs, adjectives and adverbs; existing rule-based, stochastic or hybrid tagging algorithms can be used;
primary analysis unit: completes keyword marking; the chunking module of the existing cTAKES model can be used;
policy feature entity identification unit: maps each policy feature entity from a term to a concept using the existing dictionary query method, searching for exact matches between words in dictionary entries and words in the policy text, and matching canonical word forms by searching the ordering of words in the dictionary;
deep analysis unit: provides syntactic information and determines the association relations among words; words in natural language are expressed as numerical vectors to obtain word vectors;
natural language processing output unit: outputs the natural language recognition and processing results for the subsequent policy granulation analysis;
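The first two units in the list above can be illustrated with a small sketch. It is a simplified stand-in rather than the SPECIALIST tools mentioned in the text: the preprocessing step only rejoins words that were hyphenated across a line break and splits paragraphs on blank lines, and the normalization step only lowercases, strips punctuation and drops stop words. The stop-word list and the function names are hypothetical.

```python
import re

# Hypothetical stop-word list, for illustration only.
STOP_WORDS = {"the", "of", "and", "a", "an", "to", "in"}

def preprocess(raw):
    """Rejoin words hyphenated across line breaks and split into paragraphs."""
    text = re.sub(r"(\w)-\s*\n\s*(\w)", r"\1\2", raw)
    return [" ".join(p.split()) for p in re.split(r"\n\s*\n", text) if p.strip()]

def normalize_words(paragraph):
    """Lowercase, keep alphabetic tokens only, and remove stop words."""
    tokens = re.findall(r"[A-Za-z]+", paragraph.lower())
    return [t for t in tokens if t not in STOP_WORDS]
```

A fuller normalizer would also handle plural forms and spelling variants, which this sketch leaves out.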
the deep analysis unit performs the following steps to realize word association:
step k1, a word word1 is taken from the seed word set, and its association degree with a word word2 in the candidate word set is calculated;
step k2, the association degree of word1 and word2 is calculated as $\mathrm{PMI}(word1, word2) = \log\frac{P(word1, word2)}{P(word1)\,P(word2)}$, wherein $P(word1, word2)$ is the probability that word1 and word2 appear together, $P(word1)$ is the probability that word1 appears in the article, and $P(word2)$ is the probability that word2 appears in the article;
step k3, the association degree $\mathrm{PMI}(word1, word2)$ is compared with a predefined threshold; if it is larger than the threshold, word1 and word2 are defined as associated, word1 is assigned to the associated words of word2 and word2 to the associated words of word1; otherwise they are defined as unrelated.
Preferably, the deep analysis unit performs the following steps to realize feature space map fusion of the associated words:
step r1, words word1 and word2 are selected and each is defined as a center node; the association degree values of the words associated with word1 are calculated, normalized and sorted, and the association degree values of the words associated with word2 are calculated, normalized and sorted, yielding the association relation space map gl1 of word1 and the association relation space map gl2 of word2 respectively; association degree values in the maps are represented by color depth values;
step r2, either the association relation space map gl1 or the association relation space map gl2 is selected as the source space map, and the other is taken as the target space map;
step r3, a center node is selected as the starting point and an adjacent word as the end point; the association degree value $pw_1$ between the starting point and the end point in gl1 and the association degree value $pw_2$ between the starting point and the end point in gl2 are retrieved, and $pw_1 \times pw_2$ is calculated; if the value is smaller than a predefined threshold, step r7 is executed, otherwise step r4 is executed;
step r4, $pw_1 - pw_2$ is calculated; if the value is smaller than a predefined threshold, coincidence fusion is performed, otherwise difference fusion is performed; in coincidence fusion the starting-point words and the end-point words are each merged, and the association line takes the smaller color depth value; in difference fusion gl1 or gl2 is rotated about the starting point so that the starting points are merged while the end points remain distinct, and the association lines keep their respective color depth values;
step r5, all words are traversed to complete the feature space map fusion of the associated words.
Preferably, the dictionary includes an objective dictionary and a subjective dictionary; the subjective dictionary is a dictionary composed of words indicating policy tendency. By adopting the subjective policy trend dictionary, policy granulation analysis can be further enriched on the basis of the conventional policy trend judgment.
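A dictionary lookup that uses both an objective and a subjective dictionary might look like the sketch below. The dictionary contents, the concept labels and the +1 / -1 tendency scores are invented for illustration; the text only states that the subjective dictionary consists of words indicating policy tendency and that terms are matched exactly against dictionary entries.

```python
# Hypothetical dictionaries, for illustration only.
OBJECTIVE = {
    "tax relief": "fiscal_incentive",
    "subsidy": "fiscal_incentive",
    "loan": "financing",
}
SUBJECTIVE = {"encourage": 1, "support": 1, "restrict": -1, "prohibit": -1}

def lookup(tokens):
    """Map terms to concepts by exact match and sum the tendency scores.

    Two-word terms are tried before single words so that "tax relief"
    is matched as one entry.
    """
    concepts = []
    tendency = 0
    i = 0
    while i < len(tokens):
        two_word = " ".join(tokens[i:i + 2])
        if len(tokens) - i >= 2 and two_word in OBJECTIVE:
            concepts.append(OBJECTIVE[two_word])
            i += 2
            continue
        token = tokens[i]
        if token in OBJECTIVE:
            concepts.append(OBJECTIVE[token])
        tendency += SUBJECTIVE.get(token, 0)
        i += 1
    return concepts, tendency
```

A positive total would suggest a supportive policy tendency and a negative total a restrictive one, under the scoring assumed here.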
By combining natural language recognition and analysis with machine learning, the embodiment realizes policy granulation analysis efficiently. To further improve accuracy, the invention provides a machine learning optimization module in which the part-of-speech quantization unit, the machine learning algorithm library and the optimization fusion unit work together to fuse multiple machine learning algorithms. The policy text is divided into $s_k$ groups and $s_k$ kinds of machine learning algorithm models are correspondingly retrieved from the machine learning algorithm library; the model outputs are then combined by a dedicated fusion algorithm, which weights the individual algorithms and yields a high-accuracy calculated value for natural language recognition and analysis. In the deep analysis unit, the invention adopts an intelligent map fusion technique to recognize the associations between words and sentences with high accuracy and efficiency.
The illustrative embodiments of the invention are described above so that those skilled in the art can understand the invention. The invention is not limited to these specific embodiments: all variations that make use of the inventive concept and fall within the spirit and scope of the invention as defined by the appended claims are protected.

Claims (5)

1. A policy granulation analysis system based on natural language analysis and machine learning, characterized in that the system comprises:
a policy file acquisition input module, a natural language processing module, a policy granulation analysis output module and a machine learning optimization module, wherein the machine learning optimization module is connected with the natural language processing module;
the policy granulation analysis output module analyzes and outputs the granulation parameters of the policy according to preset policy dimension characteristics and the result of the natural language processing module;
the natural language processing module comprises a file preprocessing unit, a core processing component unit, a word normalization unit, a part-of-speech tagging unit, a primary analysis unit, a dictionary query unit, a deep analysis unit and a natural language processing output unit;
the machine learning optimization module comprises a part-of-speech quantization unit, a machine learning algorithm library and an optimization fusion unit; the part-of-speech quantization unit is used for processing natural language into machine quantized language, the machine learning algorithm library is used for loading various machine learning algorithms, and the machine learning optimization module executes the following steps:
step s1, the part-of-speech quantization unit processes natural language into machine language;
step s2, the original text is divided into $s_k$ groups, and $s_k$ kinds of machine learning algorithm models are correspondingly retrieved from the machine learning algorithm library;
step s3, the $s_{ki}$-th group of subset data is selected as the validation set and the remaining $k-1$ groups of subset data are used as the training set, which is input into the $s_{ki}$-th machine learning algorithm model, yielding $s_k \times s_k$ model calculation values, $ki = 1, 2, 3, \ldots, k$, where $k$ is an integer greater than 1;
step s4, an intermediate parameter $\psi(w)$ is defined [formula missing in the source text], wherein $\{x_1, x_2, \ldots, x_{ki}, \ldots, x_k\}$ are the calculated values of the $k$ independent algorithm models obtained when the $s_{ki}$-th group of subset data is defined as the validation set; $ki = 1, 2, 3, \ldots, k$; $j$ and $w$ are predefined parameters, $\{w_1, w_2, \ldots, w_k\}$ is a set of real numbers, and $w_{ki}$ is the $ki$-th $w$ value;
step s5, the characteristic index $\alpha$ and the weight dispersion coefficient $\gamma$ are calculated from the intermediate parameter $y_{ki} = \mu + \alpha t_{ki} + \varepsilon_{ki}$ with the predefined coefficient $\mu = \log(2\gamma)$, wherein [definition of $y_{ki}$ missing in the source text] $\varepsilon_{ki}$ are predefined error terms that are independent and identically distributed with mean 0, and the intermediate parameter $t_{ki} = \log|w_{ki}|$;
step s6, the position parameter $\delta$ is calculated from the intermediate parameter $z_{ki} = \delta w_{ki} + \varepsilon_{ki}$, wherein $z_{ki} = \arctan(\mathrm{Im}(w_{ki})/\mathrm{Re}(w_{ki}))$ and $\varepsilon_{ki}$ are predefined error terms that are independent and identically distributed with mean 0;
step s7, the characteristic index $\alpha$, the weight dispersion coefficient $\gamma$ and the position parameter $\delta$ obtained in steps s5 and s6 are substituted into the intermediate function $\Phi(w) = \exp\{j\delta w - \gamma|w|^{\alpha}\}$, a Fourier transform is performed to obtain the weight distribution function $f(x)$, and the model calculation values are multiplied by $f(x)$ to obtain fitting values, completing the fitting of the $k$ algorithm model calculation values.
2. The policy granular analysis system based on natural language parsing and machine learning according to claim 1, wherein: the core processing component unit comprises a word segmentation device, a sentence boundary annotator, a substitute sentence detector, a mark generator and a document segment description annotator; the sentence boundary annotator is an OpenNLP sentence detection module.
3. The policy granular analysis system based on natural language parsing and machine learning according to claim 1, wherein: the deep analysis unit comprises the following steps of realizing word association:
step k1, using word1 in the seed word set, and performing association degree calculation on the word1 and word2 in the candidate word set;
step k2, calculating the association degree of word1 and word2Wherein, P (word 1, word 2) is the probability that word1 and word2 appear together; p (word 1) is the probability that word1 appears in the article, and P (word 2) is the probability that word2 appears in the article;
step k3, comparing the association degree PMI(word1, word2) with a predefined threshold; if the association degree is larger than the threshold, defining word1 and word2 as associated, classifying word1 into word2 and classifying word2 into word1; otherwise, defining them as unrelated.
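Steps k1 to k3 can be sketched in Python as follows. This is an illustrative sketch, not the patented implementation: it assumes probabilities are estimated as document frequencies over a non-empty collection of token sets and that the logarithm is base 2, neither of which the claim specifies. The function names and the threshold value are hypothetical.

```python
import math


def pmi(word1, word2, documents):
    """Pointwise mutual information of two words.

    `documents` is a non-empty list of token sets. P(word) is the
    fraction of documents containing the word, and P(word1, word2)
    is the fraction containing both (an assumed estimator).
    """
    n = len(documents)
    p1 = sum(word1 in doc for doc in documents) / n
    p2 = sum(word2 in doc for doc in documents) / n
    p12 = sum(word1 in doc and word2 in doc for doc in documents) / n
    if p12 == 0:
        # The words never co-occur, so the association is minimal.
        return float("-inf")
    return math.log2(p12 / (p1 * p2))


def is_associated(word1, word2, documents, threshold):
    """Step k3: the words are associated when PMI exceeds the threshold."""
    return pmi(word1, word2, documents) > threshold


# Hypothetical toy corpus: each policy document is reduced to a token set.
docs = [
    {"tax", "subsidy"},
    {"tax", "subsidy"},
    {"tax"},
    {"loan"},
]
print(is_associated("tax", "subsidy", docs, threshold=0.3))
```

A seed word would be compared in this way against every candidate word, and the pairs that pass the threshold form the associated-word combinations used later for map fusion.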
4. The policy granular analysis system according to claim 3, wherein: the deep analysis unit performs the following steps to realize feature space map fusion of the associated words:
step r1, selecting word1 and word2 and defining each as a center node; normalizing the association degree values of the associated-word combinations of word1 and sorting them, and normalizing the association degree values of the associated-word combinations of word2 and sorting them, to obtain the association relation space map gl1 of word1 and the association relation space map gl2 of word2, respectively; the association degree values in each association relation space map are represented by color depth values;
step r2, selecting the association relation space map gl1 or the association relation space map gl2 as the source space map, and the other as the target space map;
step r3, selecting a center node as a starting point and an adjacent word as an end point, retrieving the association degree value $pw_1$ between the starting point and the end point in the association relation space map gl1 and the corresponding association degree value $pw_2$ in the association relation space map gl2, and calculating $pw_1 \times pw_2$; if the value is smaller than a predefined threshold, executing step r7, otherwise executing step r4;
step r4, calculating $pw_1 - pw_2$; if the value is smaller than a predefined threshold, performing coincidence fusion, otherwise performing difference fusion; in coincidence fusion, the starting-point words and the end-point words of the two maps are merged, and the association line takes the smaller of the two color depth values; in difference fusion, the association relation space map gl1 or the association relation space map gl2 is rotated about the starting point so that the starting points are merged while the end points remain distinct, and the association lines keep their respective color depth values;
step r5, traversing all words to complete the feature space map fusion of the associated words.
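The decision logic of steps r3 and r4 can be sketched in Python. This is a simplified illustration, not the patented implementation: each map is modeled as a dictionary from a (start, end) word pair to a normalized association degree, the difference test uses the absolute value of pw_1 - pw_2, and edges that fail the product test are skipped, because the step r7 they branch to is not described in the claim. Both threshold values and all names are hypothetical.

```python
def classify_edge(pw1, pw2, product_threshold, diff_threshold):
    """Return 'skip', 'coincidence', or 'difference' for one edge."""
    if pw1 * pw2 < product_threshold:
        return "skip"  # step r3: joint association too weak
    if abs(pw1 - pw2) < diff_threshold:  # absolute value is an assumption
        return "coincidence"
    return "difference"


def fuse_maps(gl1, gl2, product_threshold=0.1, diff_threshold=0.2):
    """Fuse edges present in both maps.

    The result maps each (start, end) pair to a list of color depth
    values: one value (the smaller) for coincidence fusion, or both
    values for difference fusion.
    """
    fused = {}
    for edge in sorted(gl1.keys() & gl2.keys()):
        pw1, pw2 = gl1[edge], gl2[edge]
        kind = classify_edge(pw1, pw2, product_threshold, diff_threshold)
        if kind == "coincidence":
            fused[edge] = [min(pw1, pw2)]
        elif kind == "difference":
            fused[edge] = [pw1, pw2]
    return fused


# Hypothetical normalized association degrees around the node "policy".
gl1 = {("policy", "subsidy"): 0.9, ("policy", "tax"): 0.8, ("policy", "loan"): 0.1}
gl2 = {("policy", "subsidy"): 0.85, ("policy", "tax"): 0.3, ("policy", "loan"): 0.2}
fused = fuse_maps(gl1, gl2)
```

The rotation described for difference fusion is a layout operation and is not modeled here; the sketch only records which color depth values each fused edge would carry.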
5. The policy granular analysis system based on natural language parsing and machine learning according to claim 1, wherein: the dictionary comprises an objective dictionary and a subjective dictionary; the subjective dictionary is a dictionary composed of words indicating policy tendency.
CN202310166168.9A 2023-02-27 2023-02-27 Policy granulation analysis system based on natural language analysis and machine learning Active CN115859968B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310166168.9A CN115859968B (en) 2023-02-27 2023-02-27 Policy granulation analysis system based on natural language analysis and machine learning

Publications (2)

Publication Number Publication Date
CN115859968A 2023-03-28
CN115859968B 2023-11-21

Family

ID=85658938

Country Status (1)

Country Link
CN (1) CN115859968B (en)

Citations (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107679041A (en) * 2017-10-20 2018-02-09 苏州大学 English event synchronous anomalies method and system based on convolutional neural networks
CN108228701A (en) * 2017-10-23 2018-06-29 武汉大学 A kind of system for realizing Chinese near-nature forest language inquiry interface
CN108733653A (en) * 2018-05-18 2018-11-02 华中科技大学 A kind of sentiment analysis method of the Skip-gram models based on fusion part of speech and semantic information
AU2019100371A4 (en) * 2019-04-05 2019-05-16 Ba, He Mr A Sentiment Analysis System Based on Deep Learning
CN109766416A (en) * 2018-11-27 2019-05-17 中国电力科学研究院有限公司 A kind of new energy policy information abstracting method and system
CN110609983A (en) * 2019-08-19 2019-12-24 广州利科科技有限公司 Structured decomposition method for policy file
CN113032552A (en) * 2021-05-25 2021-06-25 南京鸿程信息科技有限公司 Text abstract-based policy key point extraction method and system
CN113254512A (en) * 2021-04-26 2021-08-13 中国人民解放军军事科学院国防科技创新研究院 Military and civil fusion policy information data analysis and optimization system
CN115455189A (en) * 2022-10-08 2022-12-09 浙江浙里信征信有限公司 Policy text classification method based on prompt learning

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
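Research status of cognitive computing at home and abroad and its application in the library and information field; Guo Shunli et al.; Information Science (《情报科学》); 137-146 *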

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant