CN117829634A

CN117829634A - Information policy processing method and terminal

Info

Publication number: CN117829634A
Application number: CN202311523085.7A
Authority: CN
Inventors: 李源非; 郑楠; 陈思敏; 陈紫晗; 陈津莼
Original assignee: State Grid Fujian Electric Power Co Ltd; Economic and Technological Research Institute of State Grid Fujian Electric Power Co Ltd
Current assignee: State Grid Fujian Electric Power Co Ltd; Economic and Technological Research Institute of State Grid Fujian Electric Power Co Ltd
Priority date: 2023-11-15
Filing date: 2023-11-15
Publication date: 2024-04-05

Abstract

The invention discloses a method and a terminal for processing information strategies. The method collects information strategies in the domestic and foreign energy fields; constructs a comprehensive index evaluation system; extracts and clusters key information from the information strategies based on TF-IDF and TextRank, according to the key information corresponding to the bottom-layer indexes in the comprehensive index evaluation system, to obtain the index values of the bottom-layer indexes for each information strategy; and comprehensively evaluates and ranks the importance of the information strategies based on the TOPSIS method. The invention thus automatically collects information strategies in the domestic and foreign energy fields, extracts keywords and analyzes their importance with a text analysis method that combines TF-IDF and TextRank on the basis of the constructed comprehensive index evaluation system, and comprehensively evaluates and ranks the importance of the information strategies based on the TOPSIS method, which helps an intelligence institution formulate future development strategies or plans.

Description

Information policy processing method and terminal

Technical Field

The present invention relates to the field of information processing technologies, and in particular, to a method and a terminal for processing an information policy.

Background

With the rise of a new round of technological innovation and industrial revolution, power grid enterprises are the main undertakers of power grid construction, operation and maintenance. As they pursue technological development and tackle key technical problems, they urgently need intelligence institutions to provide decision-making information by virtue of their information and intelligence advantages. Oriented to the information demands of power grid enterprises and society, the collection, processing and conversion of information have become important parts of building an enterprise intelligence institution.

However, information research in energy enterprises still has many shortcomings. First, most energy enterprises have no professional information department, and some enterprises that do have an information institution lack a complete information working system, so there is no set of management methods that can serve as a reference. Second, current information acquisition channels are limited: no fixed acquisition mode or standardized acquisition route has been determined, and information collection is insufficiently systematic and only coarsely classified. In addition, comprehensive evaluation of information is a key process by which an intelligence institution converts its information collection results. By collecting information strategies in the energy field from the dual "domestic-foreign" perspective and processing them on that basis, multi-angle evaluation can be carried out, key information can be extracted from the information strategies, and corresponding development strategies or plans can be made. The processing of information strategies is therefore particularly important.

Disclosure of Invention

The technical problem to be solved by the invention is to provide a method and a terminal for processing information strategies that automatically analyze and rank the importance of information strategies.

In order to solve the technical problems, the invention adopts the following technical scheme:

A method for processing an information strategy, comprising the steps of:

S1, collecting information strategies in the domestic and foreign energy fields;

S2, constructing a comprehensive index evaluation system;

S3, extracting and clustering key information from the information strategies based on TF-IDF and TextRank, according to the key information corresponding to the bottom-layer indexes in the comprehensive index evaluation system, to obtain the index values of the bottom-layer indexes for each information strategy;

S4, comprehensively evaluating and ranking the importance of the information strategies based on the TOPSIS method.

In order to solve the technical problems, the invention adopts another technical scheme that:

an information policy processing terminal comprising a processor, a memory and a computer program stored in the memory and executable on the processor, the processor implementing the steps in a method for processing an information policy as described above when executing the computer program.

The beneficial effects of the invention are as follows: the information strategy processing method and terminal automatically collect information strategies in the domestic and foreign energy fields; extract keywords and analyze their importance with a text analysis method that combines TF-IDF and TextRank, on the basis of the constructed comprehensive index evaluation system; and comprehensively evaluate and rank the importance of the information strategies based on the TOPSIS method, which helps an intelligence institution formulate future development strategies or plans.

Drawings

FIG. 1 is a flowchart of a method for processing an information policy according to an embodiment of the present invention;

FIG. 2 is a block diagram of a processing terminal of an information policy according to an embodiment of the present invention;

FIG. 3 is a diagram illustrating an exemplary integrated index evaluation architecture of a method for processing an information policy according to an embodiment of the present invention;

FIG. 4 is a flowchart illustrating a TextRank method in a processing method of an information policy according to an embodiment of the present invention;

FIG. 5 is an example flowchart of the evaluation based on AHP, the entropy weight method and TOPSIS in a method for processing an information policy according to an embodiment of the present invention;

Description of the reference numerals:

1. a processing terminal of an information strategy; 2. a processor; 3. a memory.

Detailed Description

In order to describe the technical contents, the achieved objects and effects of the present invention in detail, the following description will be made with reference to the embodiments in conjunction with the accompanying drawings.

Referring to fig. 1 and fig. 3 to fig. 5, a method for processing an information policy includes the steps of:

S1, collecting information strategies in the domestic and foreign energy fields;

S2, constructing a comprehensive index evaluation system;

From the above description, the beneficial effects of the invention are as follows: the information strategy processing method and terminal automatically collect information strategies in the domestic and foreign energy fields; extract keywords and analyze their importance with a text analysis method that combines TF-IDF and TextRank, on the basis of the constructed comprehensive index evaluation system; and comprehensively evaluate and rank the importance of the information strategies based on the TOPSIS method, which helps an intelligence institution formulate future development strategies or plans.

Further, the comprehensive index evaluation system is a three-level index system, wherein the highest-level index is an importance degree comprehensive index, and is obtained by weighting calculation of the middle-level index;

the intermediate layer indexes comprise an importance index, a timeliness index and an applicability index;

the importance index is obtained by weighting calculation of a bottom index comprising an information release unit index and an information category index;

the timeliness index is obtained by weighting calculation of a bottom index comprising information release time, strategy duration time and information effective time;

the applicability index is obtained by weighting calculation of an underlying index comprising an information coverage range and an information content related field.
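The three-level structure described above can be sketched as a nested mapping with a weighted roll-up from the bottom layer to the top-level composite index. This is only an illustration: the index names follow the text, but every numeric weight below is a hypothetical placeholder, and the function name `composite_index` is an assumption.

```python
# Three-level index system from the text. All numeric weights are
# hypothetical placeholders, not values given in the source.
INDEX_SYSTEM = {
    "importance": {
        "weight": 0.5,
        "bottom": {"release_unit": 0.6, "information_category": 0.4},
    },
    "timeliness": {
        "weight": 0.3,
        "bottom": {"release_time": 0.4, "strategy_duration": 0.3, "effective_time": 0.3},
    },
    "applicability": {
        "weight": 0.2,
        "bottom": {"coverage_range": 0.5, "related_field": 0.5},
    },
}


def composite_index(bottom_values, system=INDEX_SYSTEM):
    """Weight bottom-layer values into middle-layer indexes, then into the top index."""
    total = 0.0
    for middle in system.values():
        # Middle-layer index = weighted sum of its bottom-layer index values.
        middle_value = sum(w * bottom_values[name] for name, w in middle["bottom"].items())
        total += middle["weight"] * middle_value
    return total


# Example: every bottom-layer index takes the value 1.0.
values = {name: 1.0 for m in INDEX_SYSTEM.values() for name in m["bottom"]}
print(composite_index(values))
```

In the described method the weights would come from the combined AHP and entropy weighting rather than being fixed by hand.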

From the above description, it can be seen that the degree of importance is analyzed from the three angles of importance, timeliness and applicability, which are specifically reflected in the indexes for the information release unit, information category, information release time, strategy duration, information effective time, information coverage range and information content related field.

Further, the weight determination of each index includes the steps of:

linearly combining the first weight $p_j$ determined by the AHP method and the second weight $q_j$ determined by the entropy weight method, with the calculation formula:

$w_j = \alpha_1 p_j + \alpha_2 q_j$;

where $\alpha_1$ and $\alpha_2$ are the linear combination coefficients of the AHP method and the entropy weight method, respectively;

solving for the Nash equilibrium point using game theory:

$\min \lVert w_j - p_j,\ w_j - q_j \rVert_2$;

where $\lVert \cdot \rVert_2$ denotes the 2-norm of a matrix;

according to the differentiation properties of matrices, the 2-norm minimization is transformed into the following system of linear equations given by the first-order derivative condition:

solving for the weight coefficients and normalizing them:

calculating the combination weight:

From the above description, the subjective weights from the AHP method and the objective weights from the entropy weight method are combined by a game theory method, so that each weight is determined more reasonably.
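The combination step can be sketched in pure Python as follows. The linear system itself is not reproduced in the text, so this sketch assumes the commonly used first-order condition for game-theoretic combination weighting, $\begin{bmatrix} p\cdot p & p\cdot q \\ q\cdot p & q\cdot q \end{bmatrix}\begin{bmatrix}\alpha_1 \\ \alpha_2\end{bmatrix} = \begin{bmatrix} p\cdot p \\ q\cdot q\end{bmatrix}$, followed by normalizing the coefficients to sum to 1. The function name `combine_weights` and the sample vectors are illustrative.

```python
def combine_weights(p, q):
    """Combine AHP weights p and entropy weights q into w = a1*p + a2*q.

    Assumption: the coefficients solve the 2x2 system
        (p.p) a1 + (p.q) a2 = p.p
        (q.p) a1 + (q.q) a2 = q.q
    and are then normalized so that a1 + a2 = 1.
    """
    def dot(a, b):
        return sum(x * y for x, y in zip(a, b))

    pp, pq, qq = dot(p, p), dot(p, q), dot(q, q)
    det = pp * qq - pq * pq
    if abs(det) < 1e-12:
        # p and q are (nearly) parallel, so the system is singular:
        # fall back to equal shares (an assumption of this sketch).
        a1 = a2 = 0.5
    else:
        # Cramer's rule for the 2x2 system.
        a1 = (pp * qq - pq * qq) / det
        a2 = (pp * qq - pq * pp) / det
    total = abs(a1) + abs(a2)
    a1, a2 = abs(a1) / total, abs(a2) / total
    return [a1 * pj + a2 * qj for pj, qj in zip(p, q)]


# Hypothetical subjective (AHP) and objective (entropy) weight vectors.
w = combine_weights([0.5, 0.3, 0.2], [0.2, 0.3, 0.5])
print(w)
```

Because both input vectors sum to 1 and the normalized coefficients sum to 1, the combined weights also sum to 1.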

Further, in step S1, the collection of the information policy includes crawling information of each specified target website by a crawler, so as to obtain the information policy.

From the above description, crawling the information of the specified websites with a crawler ensures the timeliness, validity and completeness of automatic information collection.

Further, step S3 includes the steps of:

S31, segmenting the text of the information strategy into words, and filtering stop words using a preset stop word list;

S32, calculating importance scores of the words in the text based on the TF-IDF method and the TextRank method respectively, and combining TF-IDF with the TextRank similarity matrix to calculate a comprehensive score for each word:

$\mathrm{score}(w) = (1-\alpha)\,\mathrm{tfidf}(w) + \alpha \sum_{u} \dfrac{\mathrm{sim}(w,u)\,\mathrm{score}(u)}{\sum_{v} \mathrm{sim}(u,v)}$;

where $\mathrm{tfidf}(w)$ is the TF-IDF score of word w, $\alpha$ is the damping coefficient of the TextRank algorithm, and $\mathrm{sim}(w,u)$ is the similarity between word w and word u, which is calculated by a method such as cosine similarity. The cosine similarity formula is:

$\mathrm{sim}(w,u) = \dfrac{w \cdot u}{\lVert w \rVert \times \lVert u \rVert}$;

where $w \cdot u$ is the dot product of the word vectors w and u, and $\lVert w \rVert$ and $\lVert u \rVert$ are the moduli of the word vectors w and u, respectively.

From the above description, it can be seen that the TF-IDF method and the TextRank method are combined to determine a total importance score for each word, and this score corresponds to a bottom-layer index of the comprehensive index evaluation system, i.e., it serves as the index value of that bottom-layer index.
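A minimal sketch of the comprehensive score, written as a fixed-point iteration of the score formula. It assumes the similarity matrix is supplied as a dict of dicts, that alpha = 0.85 (a value not stated in the text), and that a fixed number of rounds is enough. The names `cosine_similarity` and `combined_scores` are illustrative.

```python
import math


def cosine_similarity(a, b):
    """sim(w, u) = (w . u) / (||w|| * ||u||) for two word vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def combined_scores(tfidf, sim, alpha=0.85, rounds=200):
    """Iterate score(w) = (1-alpha)*tfidf(w) + alpha*sum_u sim(w,u)*score(u)/sum_v sim(u,v)."""
    words = list(tfidf)
    # Denominator of the formula: total similarity leaving each word u.
    out_sum = {u: sum(sim[u].values()) for u in words}
    score = dict(tfidf)  # start from the TF-IDF scores
    for _ in range(rounds):
        score = {
            w: (1 - alpha) * tfidf[w]
            + alpha * sum(
                sim[w].get(u, 0.0) * score[u] / out_sum[u]
                for u in words
                if out_sum[u] > 0
            )
            for w in words
        }
    return score


# Hypothetical two-word example: only "x" has a TF-IDF score, but "y" is similar to it.
tfidf = {"x": 1.0, "y": 0.0}
sim = {"x": {"y": 1.0}, "y": {"x": 1.0}}
scores = combined_scores(tfidf, sim)
print(scores)
```

Here the similarity term passes part of the score of "x" to "y", so a word with no TF-IDF weight can still rank if it is close to an important word.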

Further, calculating importance scores of words in the text based on the TF-IDF method specifically comprises:

$\mathrm{TF\text{-}IDF}(w) = \mathrm{TF}(w) \times \mathrm{IDF}(w)$;

where $\mathrm{TF}(w)$ is the frequency with which a given word w occurs in the text, $\mathrm{count}(w)$ is the number of occurrences of word w in the text, $|D_i|$ is the total number of words in text $D_i$, $\mathrm{IDF}(w)$ is the inverse document frequency of word w, N is the total number of texts, and $\mathrm{DF}(w)$ is the number of texts containing word w.

From the above description, these are the specific steps of the TF-IDF method, which analyzes importance based on word frequency.
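The TF-IDF calculation can be sketched as below. The TF and IDF formulas are not reproduced in the text, so the sketch assumes the usual forms consistent with the symbols defined above: TF(w) = count(w) / |D_i| and IDF(w) = log(N / DF(w)). Other IDF variants (for example with smoothing) would change the numbers but not the structure. The function name `tf_idf` and the sample documents are illustrative.

```python
import math
from collections import Counter


def tf_idf(documents):
    """Return one {word: TF-IDF score} dict per tokenized document.

    Assumed forms: TF(w) = count(w) / |D_i| and IDF(w) = log(N / DF(w)).
    """
    n_docs = len(documents)
    # DF(w): number of documents that contain w at least once.
    df = Counter(word for doc in documents for word in set(doc))
    scores = []
    for doc in documents:
        counts = Counter(doc)
        scores.append(
            {w: (c / len(doc)) * math.log(n_docs / df[w]) for w, c in counts.items()}
        )
    return scores


# Hypothetical pre-segmented texts.
docs = [
    ["grid", "energy", "policy"],
    ["energy", "storage"],
    ["energy", "price", "price"],
]
result = tf_idf(docs)
print(result[2])
```

A word that occurs in every text, such as "energy" here, gets IDF = log(1) = 0 under this form and therefore a zero score.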

Further, calculating importance scores of words in the text based on the TextRank method specifically comprises the following steps:

representing each word in the text as a node in a graph, and connecting two words with an edge if they co-occur in the text;

assigning a preset initial weight to each node, and iteratively calculating the weight of each node for a preset number of rounds until convergence, with the calculation formula:

where $PR(V_i)$ is the importance of word i, $In(V_i)$ is the set of words pointing to word i, $Out(V_j)$ is the set of words pointed to from word j, and d is the damping coefficient. When no words point to word i, the rightmost term of the formula is 0, and the damping coefficient prevents $PR(V_i)$ from being 0.

From the above description, these are the specific steps for calculating the importance scores of words in the text based on the TextRank method.
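A minimal TextRank sketch follows. The iteration formula is not reproduced in the text, so this assumes the standard form that matches the symbol definitions above, $PR(V_i) = (1-d) + d \sum_{V_j \in In(V_i)} PR(V_j) / |Out(V_j)|$, on an undirected graph in which adjacent words are linked. The defaults d = 0.85, `window=2` and `rounds=100` are assumptions, as is the function name `text_rank`.

```python
def text_rank(tokens, d=0.85, rounds=100, window=2):
    """Score words with PR(V_i) = (1 - d) + d * sum_{j in In(V_i)} PR(V_j) / |Out(V_j)|."""
    # Undirected co-occurrence graph: words fewer than `window` positions apart are linked
    # (window=2 links directly adjacent words only).
    neighbors = {w: set() for w in tokens}
    for i, w in enumerate(tokens):
        for u in tokens[i + 1:i + window]:
            if u != w:
                neighbors[w].add(u)
                neighbors[u].add(w)
    pr = {w: 1.0 for w in neighbors}  # preset initial weight of 1 per node
    for _ in range(rounds):
        # In an undirected graph, In(V) and Out(V) are both the neighbor set.
        pr = {
            w: (1 - d) + d * sum(pr[u] / len(neighbors[u]) for u in neighbors[w])
            for w in neighbors
        }
    return pr


# Hypothetical token sequence: "a" is adjacent to both "b" and "c".
ranks = text_rank(["a", "b", "a", "c"])
print(ranks)
```

A word with no neighbors keeps the score 1 - d, which matches the remark that the damping coefficient prevents a zero score.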

Further, step S4 includes the steps of:

S41, for the words of the key information corresponding to the bottom-layer indexes in the comprehensive index evaluation system, taking the comprehensive scores as index values and constructing the original data decision matrix X for the current information strategies:

standardizing the decision matrix by the extremum processing method, and calculating the weighted standardized judgment matrix U according to the index weights:

where m is the number of evaluation objects, i.e., the number of information strategies, and n is the number of bottom-layer indexes in the comprehensive index evaluation system;

S42, determining the relative positive and negative ideal solutions, with the calculation formula:

where the two expressions give the positive ideal solution and the negative ideal solution, respectively;

S43, calculating the Euclidean distance from each element of the weighted standardized judgment matrix to the ideal solutions:

where the two expressions give the Euclidean distance from each element to the positive ideal solution and to the negative ideal solution, respectively;

S44, calculating the closeness of each evaluation object to the optimal scheme:

where $C_i$ is the calculated evaluation value, with range [0, 1]; the closer $C_i$ is to 1, the better the score. The evaluation objects are ranked by $C_i$ from high to low: the larger $C_i$ is, the better the evaluated object, i.e., the more important the corresponding information strategy.

From the above description, the importance of the information strategies is comprehensively evaluated and ranked by the TOPSIS method, a multi-attribute decision-making method for deterministic settings that compares each decision scheme under study with both the optimal scheme and the worst scheme, and judges the merits of the schemes by their distances to these two references.
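Steps S41 to S44 can be sketched as follows. The formulas are not reproduced in the text, so the sketch assumes the usual TOPSIS forms: min-max (extremum) standardization, column-wise maxima and minima of the weighted matrix as the positive and negative ideal solutions, and closeness $C_i = D_i^- / (D_i^+ + D_i^-)$. It also assumes every index is benefit-type (larger is better). The function name `topsis` and the sample data are illustrative.

```python
import math


def topsis(matrix, weights):
    """Return the closeness C_i of each row (evaluation object) to the ideal solution.

    Assumes all n indexes (columns) are benefit-type, i.e. larger is better.
    """
    cols = list(zip(*matrix))
    lo, hi = [min(c) for c in cols], [max(c) for c in cols]
    # S41: extremum (min-max) standardization, then weighting.
    u = [
        [w * ((x - l) / (h - l) if h > l else 0.0) for x, l, h, w in zip(row, lo, hi, weights)]
        for row in matrix
    ]
    # S42: positive and negative ideal solutions, taken column by column.
    u_cols = list(zip(*u))
    best, worst = [max(c) for c in u_cols], [min(c) for c in u_cols]
    closeness = []
    for row in u:
        # S43: Euclidean distances to both ideal solutions.
        d_pos = math.sqrt(sum((x - b) ** 2 for x, b in zip(row, best)))
        d_neg = math.sqrt(sum((x - w) ** 2 for x, w in zip(row, worst)))
        # S44: relative closeness in [0, 1].
        total = d_pos + d_neg
        closeness.append(d_neg / total if total else 0.0)
    return closeness


# Three hypothetical information strategies scored on two indexes.
c = topsis([[1.0, 1.0], [2.0, 2.0], [3.0, 3.0]], [0.5, 0.5])
ranking = sorted(range(len(c)), key=lambda i: c[i], reverse=True)
print(c, ranking)
```

Sorting the closeness values from high to low gives the importance ranking of the information strategies.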

Referring to fig. 2, an information policy processing terminal includes a processor, a memory, and a computer program stored in the memory and capable of running on the processor, where the steps in the above information policy processing method are implemented when the processor executes the computer program.

The information strategy processing method and terminal are suitable for scenarios in which information strategies in the energy field need to be analyzed in order to formulate future development strategies or plans.

Referring to fig. 1 and fig. 3 to 5, a first embodiment of the present invention is as follows:

A method for processing an information strategy, comprising the steps of:

S1, collecting information strategies in the domestic and foreign energy fields;

In step S1, collecting the information strategies comprises crawling the information of each specified target website with a crawler to obtain the information strategies.

In this embodiment, information strategies in the "domestic-foreign" energy fields are collected. Domestic and foreign information is collected mainly from selected fixed websites, which ensures the authenticity of the information, and the collection relies mainly on crawler software and manual work.

S2, constructing a comprehensive index evaluation system;

the comprehensive index evaluation system is a three-level index system, wherein the highest-level index is an importance degree comprehensive index, and is obtained by weighting calculation of the middle-level index;

In this embodiment, the selection of the evaluation index needs to follow the following principles:

1) The principle of simplification: more indexes are not necessarily better. What matters is the role an index plays in the evaluation process, and the purpose of the evaluation is the starting point for index selection.

2) The principle of independence: each index should have a clear connotation and be independent of the others; indexes should not overlap or have causal relationships with one another.

3) The principle of comparability: each index should reflect well the characteristics of a certain aspect of the object under study, and there should be obvious differences among the indexes.

4) The principle of feasibility: the selected indexes should conform to objective reality, with clear meaning and standardized data.

Referring to FIG. 3, a hierarchical structure model is built by decomposing and layering the factors that influence the importance of information strategies in the energy field. All factors affecting the evaluation result are extracted, summarized and arranged into layers according to their distance from the target, giving a hierarchical model whose layers fall into 3 types: the highest layer (target layer), the middle layer (criterion layer), and the bottom layer (factor layer).

The weight determination of each index comprises the following steps:

$w_j = \alpha_1 p_j + \alpha_2 q_j$;

solving for the Nash equilibrium point using game theory:

$\min \lVert w_j - p_j,\ w_j - q_j \rVert_2$;

where $\lVert \cdot \rVert_2$ denotes the 2-norm of a matrix;

solving for the weight coefficients and normalizing them:

calculating the combination weight:

In this embodiment, the first weight $p_j$ is determined by the AHP method and the second weight $q_j$ is determined by the entropy weight method, as follows:

Constructing the judgment matrices of each level: the numbers 1-9 and their reciprocals are introduced as scales representing the relative importance of different factors. Experts in the energy field are invited to score the relative importance of the factors according to their own experience, based on the 9-point scale method, and all judgment matrices at each level are constructed. The judgment matrices satisfy the following conditions:

where:

TABLE 1

Performing a consistency test, with the calculation formula:

where $\lambda_{max}$ is the maximum eigenvalue of the judgment matrix:

where $w_i$ and $w_j$ represent the importance of the elements of the judgment matrix, e.g. $a_{11} = w_1 / w_1$, $a_{12} = w_1 / w_2$.

Looking up the consistency index RI: the average random consistency index RI depends on the order of the judgment matrix A. The corresponding RI is looked up in the following table, and the consistency ratio CR is calculated to perform the consistency test, with the calculation formula:

TABLE 2

Consistency check formula:

The result is acceptable when CR < 0.10; otherwise the judgment matrix must be suitably corrected until CR < 0.10, so that a consistent result is obtained.

Calculating the weight vector of the indexes: since each column of the judgment matrix A approximately reflects the distribution of the weights, the summation method is used to calculate the weight $W_i$ of each index, i.e., $p_j$, with the calculation formula:
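The AHP steps above can be sketched as follows. Neither the formulas nor Table 2 are reproduced in the text, so the sketch assumes the standard forms: the summation method (normalize each column, then average each row), $\lambda_{max} = \frac{1}{n}\sum_i (Aw)_i / w_i$, $CI = (\lambda_{max} - n)/(n - 1)$ and $CR = CI / RI$, with the commonly cited Saaty RI values. The function name `ahp_weights` and the sample matrix are illustrative.

```python
# Average random consistency index by matrix order. These are the commonly cited
# Saaty values and are an assumption: the source's Table 2 is not reproduced.
RI = {1: 0.0, 2: 0.0, 3: 0.58, 4: 0.90, 5: 1.12, 6: 1.24, 7: 1.32, 8: 1.41, 9: 1.45}


def ahp_weights(a):
    """Return (weights, CR) for a pairwise judgment matrix using the summation method."""
    n = len(a)
    col_sums = [sum(a[i][j] for i in range(n)) for j in range(n)]
    # Normalize each column, then average across each row.
    w = [sum(a[i][j] / col_sums[j] for j in range(n)) / n for i in range(n)]
    # lambda_max = (1/n) * sum_i (A w)_i / w_i
    aw = [sum(a[i][j] * w[j] for j in range(n)) for i in range(n)]
    lam = sum(aw[i] / w[i] for i in range(n)) / n
    ci = (lam - n) / (n - 1) if n > 1 else 0.0
    cr = ci / RI[n] if RI[n] > 0 else 0.0
    return w, cr


# Hypothetical, perfectly consistent judgment matrix built from weights 0.6 : 0.3 : 0.1.
judgment = [
    [1.0, 2.0, 6.0],
    [0.5, 1.0, 3.0],
    [1 / 6, 1 / 3, 1.0],
]
weights, cr = ahp_weights(judgment)
print(weights, cr)
```

A matrix scored by real experts is rarely perfectly consistent; the CR < 0.10 threshold from the text decides whether it is acceptable.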

Weighting by the entropy weight method: the data are first normalized to remove the dimensions of each index, with the calculation formula:

where $X_1, X_2, \ldots, X_m$ are the m given indexes, with $X_i = \{x_1, x_2, \ldots, x_m\}$, and $Y_1, Y_2, \ldots, Y_m$ are the normalized values of each index.

Calculating the proportion of each index under each scheme, with the calculation formula:

where $p_{ij}$ is the proportion that the i-th scheme accounts for within the j-th index;

Calculating the information entropy of each index, where the information entropy of a group of data is calculated as:

where $E_j \ge 0$; if $p_{ij} = 0$, then $E_j = 0$.

According to the information entropy formula, the information entropy of each index is calculated, and the weight $w_j$ of each index, i.e., $q_j$, is calculated from the information entropy, with the calculation formula:

where $E_1, \ldots, E_m$ are the information entropies of the indexes, and k is the number of indexes, i.e., $k = m$.
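The entropy weight calculation can be sketched as follows. The formulas are not reproduced in the text, so this assumes the usual forms: min-max normalization, $p_{ij} = Y_{ij} / \sum_i Y_{ij}$, $E_j = -\frac{1}{\ln m}\sum_i p_{ij} \ln p_{ij}$ with zero terms skipped, and $w_j = (1 - E_j) / (k - \sum_j E_j)$. In the code, rows are schemes and columns are indexes; the function name `entropy_weights` and the sample data are illustrative.

```python
import math


def entropy_weights(matrix):
    """Return one objective weight per index (column); rows are schemes (at least 2)."""
    m = len(matrix)
    if m < 2:
        raise ValueError("at least two schemes are needed to compute entropy")
    spread = []
    for col in zip(*matrix):
        lo, hi = min(col), max(col)
        # Min-max normalization removes the dimension of the index;
        # a constant column is treated as uniform (an assumption of this sketch).
        y = [(x - lo) / (hi - lo) if hi > lo else 1.0 for x in col]
        total = sum(y)
        p = [v / total for v in y]
        # Terms with p_ij = 0 contribute nothing (0 * ln 0 is taken as 0).
        e = -sum(v * math.log(v) for v in p if v > 0) / math.log(m)
        spread.append(max(0.0, 1.0 - e))
    denom = sum(spread)
    if denom == 0:
        # Every index is constant: fall back to equal weights.
        return [1.0 / len(spread)] * len(spread)
    # w_j = (1 - E_j) / sum_j (1 - E_j), which equals (1 - E_j) / (k - sum_j E_j).
    return [s / denom for s in spread]


# Three hypothetical schemes scored on two indexes.
q = entropy_weights([[1.0, 1.0], [2.0, 1.0], [3.0, 3.0]])
print(q)
```

The second index separates the schemes more sharply (lower entropy), so it receives the larger weight.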

Step S3 comprises the steps of:

S31, segmenting the text of the information strategy into words, and filtering stop words using a preset stop word list.

In this embodiment, the text of the "domestic-foreign" information strategies collected in step S1 is preprocessed: jieba.lcut() is used for word segmentation, and stop words are filtered using the stop word list. In word segmentation, a word segmentation tool (jieba) splits a given text and converts it into a word sequence, using the default precise mode word=jieba. Before the keywords of the information are extracted, stop words need to be removed. These are meaningless high-frequency words such as "yes" and "and": they occur very often but carry little semantic information of their own and contribute little to keyword extraction. A stop word list is constructed, and during text processing each word is compared with the words in the list; if the word is a stop word it is removed, otherwise it is retained. The types of stop words used are shown in Table 3 below.

TABLE 3
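The stop word filtering step can be sketched as follows. To keep the sketch runnable without third-party packages, it takes an already segmented token list; in the described pipeline those tokens would come from `jieba.lcut(text)`. The function name `filter_stop_words`, the sample tokens and the sample stop word list are all hypothetical.

```python
def filter_stop_words(tokens, stop_words):
    """Keep only tokens that are non-blank and not in the stop word list."""
    stop_set = set(stop_words)  # set lookup keeps each comparison O(1)
    return [t for t in tokens if t.strip() and t not in stop_set]


# In the described pipeline the tokens would come from jieba.lcut(text);
# a pre-segmented list is used here so the sketch runs without jieba installed.
tokens = ["energy", "is", "the", "key", " ", "policy"]
stop_words = ["is", "the", "and"]  # hypothetical stop word list
keywords = filter_stop_words(tokens, stop_words)
print(keywords)
```

The filtered list is what the TF-IDF and TextRank scoring steps would then operate on.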

$\mathrm{sim}(w,u) = \dfrac{w \cdot u}{\lVert w \rVert \times \lVert u \rVert}$;

where $w \cdot u$ is the dot product of the word vectors w and u, and $\lVert w \rVert$ and $\lVert u \rVert$ are the moduli of the word vectors w and u, respectively. The importance scores of words in the text are calculated based on the TF-IDF method as follows:

$\mathrm{TF\text{-}IDF}(w) = \mathrm{TF}(w) \times \mathrm{IDF}(w)$;

The TF-IDF method calculates the score, i.e., the importance, of each word in the text. The score is composed of the word frequency and the inverse document frequency, and its core idea is that the more often a word appears in one document and the less often it appears in other documents, the better it represents that document. TF (term frequency) is the frequency with which a word occurs in a document, and IDF (inverse document frequency) is the inverse of the frequency with which the word occurs across all documents. Considering TF and IDF together gives the TF-IDF value of a word; the higher the value, the more important the word.

The importance scores of words in the text are calculated based on the TextRank method as follows:

Each node is assigned a preset initial weight.

In this embodiment, the initial value is 1.

The weight of each node is calculated iteratively for a preset number of rounds until convergence; in this embodiment, 126 rounds.

The calculation formula is as follows:

TextRank is a graph-based ranking algorithm that computes a similarity matrix between words and can be used to extract text keywords. For a given word sequence it constructs an undirected graph in which each word corresponds to a node; if two words are adjacent in the text, an edge connects them. The weight of each node (i.e., word) is then calculated iteratively to identify the most important words. The nodes in the graph are ranked using the PageRank algorithm to obtain the score of each word.

S4, comprehensively evaluating and sequencing importance of the information strategy based on a TOPSIS method;

step S4 includes the steps of:

Referring to fig. 2, a second embodiment of the present invention is as follows:

an information policy processing terminal 1 comprising a processor 2, a memory 3 and a computer program stored in said memory 3 and executable on said processor 2, said processor 2 implementing the steps of a method for processing an information policy as described above when executing said computer program.

In summary, the method and terminal for processing information strategies provided by the invention automatically collect information strategies in the domestic and foreign energy fields; extract keywords and analyze their importance with a text analysis method that combines TF-IDF and TextRank, on the basis of the constructed comprehensive index evaluation system; and comprehensively evaluate and rank the importance of the information strategies based on the TOPSIS method, which helps an intelligence institution formulate future development strategies or plans.

Taking the particular nature of information strategies into account, the method selects the characteristics of information strategies that enterprises care about most as evaluation indexes and builds an evaluation system from them. The indexes are weighted by a combined subjective and objective method, creating an applicable evaluation index system for the importance of information policies. Based on the collection of information policies in the "domestic-foreign" energy fields, the importance of information policies to enterprises is examined from this dual perspective, a set of evaluation indexes for information importance is screened and established, and a comprehensive evaluation and ranking flow that combines subjective and objective weighting with the TOPSIS method is established, which makes the evaluation and ranking process more convincing.

The foregoing description is only illustrative of the present invention and is not intended to limit the scope of the invention, and all equivalent changes made by the specification and drawings of the present invention, or direct or indirect application in the relevant art, are included in the scope of the present invention.

Claims

1. A method for processing an information policy, comprising the steps of:

s1, acquiring information strategies in the domestic and foreign energy fields;

s2, constructing a comprehensive index evaluation system;

2. The method for processing an information policy according to claim 1, wherein the comprehensive index evaluation system is a three-level index system, wherein the highest level index is an importance degree comprehensive index, and is obtained by weighting calculation of an intermediate level index;

3. The method for processing an information policy according to claim 2, wherein the weight determining of each index comprises the steps of:

$w_j = \alpha_1 p_j + \alpha_2 q_j$;

solving for the Nash equilibrium point using game theory:

$\min \lVert w_j - p_j,\ w_j - q_j \rVert_2$;

where $\lVert \cdot \rVert_2$ denotes the 2-norm of a matrix;

solving weight coefficients and carrying out normalization treatment:

calculating a combination weight:

4. The method according to claim 1, wherein in step S1, collecting the information policy comprises crawling the information of each specified target website with a crawler to obtain the information policy.

5. The method for processing an information policy according to claim 1, wherein step S3 comprises the steps of:

s32, calculating importance scores of words in the text based on a TF-IDF method and a TextRank method respectively, combining the TF-IDF and the TextRank similarity matrix, and calculating a comprehensive score of each word:

score(w)＝(1-alpha)×tfidf(w)+alpha×sum(sim(w,u)×score(u)/sum(sim(u,v)))；

sim(w,u)＝(w,u)/(||w||×||u||)；

6. The method for processing an information policy according to claim 5, wherein calculating the importance score of the word in the text based on TF-IDF method specifically comprises:

TF-IDF(w)＝TF(w)×IDF(w)；

7. The method for processing an information policy according to claim 5, wherein calculating importance scores of words in a text based on TextRank method specifically comprises:

8. The method for processing an information policy according to claim 5, wherein step S4 comprises the steps of:

9. An information policy handling terminal comprising a processor, a memory and a computer program stored in said memory and executable on said processor, characterized in that said processor, when executing said computer program, implements the steps of a method for handling an information policy according to any of the preceding claims 1-8.