CN109033132B

CN109033132B - Method and device for calculating text and subject correlation by using knowledge graph

Info

Publication number: CN109033132B
Application number: CN201810567101.5A
Authority: CN
Inventors: 孙雨轩; 吴成龙; 周劼人
Original assignee: Zhongzheng Zhengxin Shenzhen Co ltd
Current assignee: Zhongzheng Zhengxin Shenzhen Co ltd
Priority date: 2018-06-05
Filing date: 2018-06-05
Publication date: 2020-12-11
Anticipated expiration: 2038-06-05
Also published as: CN109033132A

Abstract

The invention discloses a method and a device for calculating the relevancy between a text and an enterprise subject by using a knowledge graph. The method comprises the following steps: acquiring a text; performing word segmentation on the text, extracting the set of keywords appearing in the text, and retrieving the enterprise subjects associated with the keywords through a pre-established knowledge graph, so as to take those enterprise subjects as a candidate enterprise set, wherein the knowledge graph comprises target node information, associated node information, the relation between the target node information and the associated node information, and an association weight, the target node information comprises first enterprise subject information, and the associated node information comprises second enterprise subject information, product information or natural person information associated with the first enterprise subject information; and calculating the degree of association between the text and each candidate enterprise subject according to the word frequency of the keywords associated with that candidate enterprise subject in the candidate enterprise set.

Description

Method and device for calculating text and subject correlation by using knowledge graph

Technical Field

The invention relates to a method and a device for calculating the relevancy between a text and an enterprise subject by using a knowledge graph.

Background

In the information age, acquiring, processing and analyzing massive data is a major difficulty. In some industries (e.g., the financial industry), people rely on information about many dimensions of an enterprise to support decisions such as investments. On the one hand, market participants require broader and more complete data; on the other hand, they require that the data be processed in a timely manner. Enterprise public opinion information is a dimension to which market participants pay close attention. As unstructured text, public opinion information is dispersed, large in volume, complex in format and highly time-sensitive. Many financial practitioners therefore need to process such data efficiently and extract valuable information from it by technical means such as natural language processing. Faced with complicated public opinion information, associating it with the enterprises of interest, so as to screen out information that is of low value or irrelevant to the subject, is an important step in data analysis and mining.

The common method is to construct a keyword library for the enterprise subject, including the registered name, abbreviation, listing code, etc. of the enterprise, to perform keyword matching retrieval in the text information library on this basis, and to take the matched texts as information related to the enterprise subject. This method has several drawbacks. First, a relatively complete enterprise keyword library must be constructed in advance as the basis for retrieval. Second, ranking the matched results by degree of association works poorly: a keyword is often found in a text that is not actually about the enterprise, so considerable redundant information remains. Third, because keywords are matched and associated directly, important information about enterprises closely associated with the enterprise may be missed, causing information loss.

Disclosure of Invention

In view of the defects of the prior art, the technical problem to be solved by the invention is to provide a method and a device for calculating the relevancy between a text and an enterprise subject by using a knowledge graph, which improve on the traditional approach of relying on keyword matching alone when analyzing massive texts. By incorporating a knowledge graph, the degree of association between the target subject and the text information can be quantified, the dimensions along which text information is associated with the target subject are enriched, and a basis is provided for subsequent analysis.

In order to solve the technical problems, the invention adopts a technical scheme that: the method for calculating the relevancy between the text and the enterprise main body by using the knowledge graph comprises the following steps:

acquiring a text;

performing word segmentation on the text, extracting the set of keywords appearing in the text, and retrieving the enterprise subjects associated with the keywords through a pre-established knowledge graph, so as to take those enterprise subjects as a candidate enterprise set, wherein the knowledge graph comprises target node information, associated node information, the relation between the target node information and the associated node information, and an association weight, the target node information comprises first enterprise subject information, and the associated node information comprises second enterprise subject information, product information or natural person information associated with the first enterprise subject information;

and calculating the association degree of the text and the candidate enterprise subject according to the word frequency of the keywords associated with the candidate enterprise subject in the candidate enterprise set.

Further, the step of performing word segmentation processing on the text, extracting a keyword set appearing in the text, and retrieving an enterprise subject associated with the keyword through a pre-established knowledge graph so as to take the enterprise subject associated with the keyword as a candidate enterprise set includes:

performing word segmentation on the text to obtain all keywords, which form a keyword set denoted K; searching the knowledge graph for the keywords in the keyword set K; and acquiring the enterprise subjects associated with the keyword set K as a candidate enterprise set, denoted C.

Further, in the step of calculating the association degree between the text and the candidate enterprise main body according to the word frequency of the keyword occurrence associated with the candidate enterprise main body in the candidate enterprise set, the method includes:

let F be the word frequency matrix of the keyword set K:

where f_i denotes the word frequency of the i-th keyword;

let R be the association matrix between the candidate enterprise set C and the keyword set K, in which an entry is 1 if the corresponding enterprise subject and keyword are connected in the knowledge graph and 0 if they are not connected:

the summed word-frequency vector of the candidate enterprise set C over its associated keywords is:

where each element represents the sum of the word frequencies of all keywords in the text that are associated with the i-th candidate enterprise subject;

defining a relevance factor RX, which measures the relevance ranking among the candidate enterprise subjects within the text;

defining a relevance factor RY, which measures the relevance ranking of a candidate enterprise subject across different texts, where β > 0 is a scaling adjustment parameter and scale > 0 is the number of words remaining after the segmented words of the text are cleaned, used to measure the text length;

where 0 ≤ ry_i ≤ 1;

obtaining the association matrix R^KC of the text and the candidate enterprise set C,

where the product is the matrix dot product operation, and each element of R^KC indicates the degree of association between the text and the i-th candidate enterprise subject.

Further, in the step of calculating the association degree between the text and the candidate business entity, the method further includes:

and calculating the association degree of the text and the candidate enterprise subject according to the word frequency and the relation weight of the keywords associated with the candidate enterprise subject in the candidate enterprise set.

Further, the step of calculating the association degree between the text and the candidate enterprise subject according to the word frequency and the relationship weight of the occurrence of the keyword associated with the candidate enterprise subject in the candidate enterprise set includes:

firstly, counting the word frequency vector F of the keyword set K:

where f_i denotes the word frequency of the i-th keyword;

let R be the correlation coefficient matrix of the candidate enterprise set C and the keyword set K:

where r_ij denotes the correlation coefficient between the i-th candidate enterprise subject and the j-th keyword;

weighting the word frequency matrix by the correlation coefficients:

where each element represents the sum of the weighted word frequencies of the keywords of the i-th candidate enterprise subject;

where 0 ≤ ry_i ≤ 1;

obtaining the association matrix R^KC of the text and the candidate enterprise set C;

where the product is the matrix dot product operation.

Further, before the step of performing word segmentation processing on the text, the method further includes:

performing paragraph division preprocessing on the text, and giving corresponding weight to the position of a paragraph;

in the step of calculating the association degree of the text with the candidate business entity, the method further comprises:

and calculating the association degree between the text and the candidate enterprise subject according to the word frequency, the paragraph position, the relation weight and the text length of the keywords associated with the candidate enterprise subject in the candidate enterprise set.

Further, the text is subjected to paragraph segmentation preprocessing by the following formula:

where the rounding operator denotes an integer not less than x; P is the number of natural paragraphs of the text, P ≥ 1; H is the number of parts into which the text is split, H ≥ 1, the parts being denoted part₁, …, part_H respectively and the title being denoted part₀; the number of paragraphs in each part is recorded as L = (l₀, l₁, …, l_H);

one limiting parameter represents the maximum proportion of the first part relative to the total number of paragraphs P, and

the other represents the maximum proportion of the H-th part relative to the total number of paragraphs P.

Further, the step of calculating the association degree between the text and the candidate enterprise subject according to the word frequency, the paragraph position, the relation weight and the text length of the keywords associated with the candidate enterprise subject in the candidate enterprise set comprises the following substeps:

let W be the weight matrix of the keywords by paragraph position:

where w_i denotes the weight of a keyword appearing in part_i, and w₀ is the weight of a keyword appearing in the title;

let R be the correlation coefficient matrix of the candidate enterprise set C and the keyword set K:

let F be the word frequency matrix of the keywords in K at the different paragraph positions:

where f_ij denotes the word frequency of the i-th keyword in part_j;

weighting the word frequency matrix by the correlation coefficients:

where each element represents the sum of the weighted word frequencies of the i-th candidate enterprise subject in part_j;

where 0 ≤ ry_i ≤ 1;

where the product is the matrix dot product operation.

In order to solve the technical problem, the invention adopts another technical scheme that: an apparatus for calculating the relevancy of a text and an enterprise main body by using a knowledge graph is provided, which comprises:

the text acquisition module is used for acquiring a text;

the word segmentation module is used for performing word segmentation on the text, extracting the set of keywords appearing in the text, and retrieving the enterprise subjects associated with the keywords through a pre-established knowledge graph, so as to take those enterprise subjects as a candidate enterprise set, wherein the knowledge graph comprises a plurality of pieces of node information together with the relation and the association weight between each piece of node information and its corresponding node information, and, among the plurality of pieces of node information, one piece is enterprise subject information and the remaining pieces are product information or natural person information corresponding to that enterprise subject;

and the association degree calculation module is used for calculating the association degree of the text and the candidate enterprise subject according to the word frequency of the occurrence of the keywords associated with the candidate enterprise subject in the candidate enterprise set.

Further, the relevancy calculation module is further configured to calculate relevancy of the text and the candidate enterprise subject according to word frequency and relationship weight of occurrence of keywords associated with the candidate enterprise subject in the candidate enterprise set.

The invention constructs a knowledge graph for the financial field and uses it as the relation network of candidate matching keywords, covering relations such as registered full names, abbreviations, products, senior executives, shareholders and investments, with the enterprise as the target subject. Different weights are given according to the paragraph position in which a keyword appears, so that the differing importance of the paragraphs of a text is taken into account. The association degrees of all possible keywords are calculated over the complex relation network built with knowledge graph technology and are finally weighted and quantified, improving the success rate and accuracy of associating a text with a target subject.

Drawings

FIG. 1 is a flowchart of a first embodiment of a method for calculating relevance of text to an enterprise principal using a knowledge-graph of the present invention.

FIG. 2 is a schematic diagram of the structure of a knowledge-graph of the present invention.

FIG. 3 is a flowchart of a second embodiment of a method for calculating relevance of text to an enterprise principal using a knowledge-graph of the present invention.

FIG. 4 is a schematic illustration of a sample article in a specific example.

Fig. 5 is a schematic illustration of a knowledge-graph associated with the sample article in a specific example.

FIG. 6 is a block diagram of an embodiment of an apparatus for calculating relevance of text to an enterprise principal using a knowledge-graph of the present invention.

Detailed Description

The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.

Referring to fig. 1, the method for calculating the relevance between a text and an enterprise subject by using a knowledge graph of the present invention includes the following steps:

s101, acquiring a text;

the text may be public opinion text (i.e., public opinion information).

S102, performing word segmentation on the text, extracting the set of keywords appearing in the text, and retrieving the enterprise subjects associated with the keywords through a pre-established knowledge graph, so as to take those enterprise subjects as a candidate enterprise set, wherein the knowledge graph comprises target node information, associated node information, the relation between the target node information and the associated node information, and an association weight, the target node information comprises first enterprise subject information, and the associated node information comprises second enterprise subject information, product information or natural person information associated with the first enterprise subject information;

The knowledge graph is established as follows: target node information and associated node information are extracted from a database (such as a corpus), and a relevance weight is assigned according to the relation between the target node information and the associated node information, thereby forming the knowledge graph (see fig. 2). The target node information is first enterprise subject information (e.g., the name of an enterprise: XX corporation). The node information associated with the target node information may be second enterprise subject information associated with the first enterprise subject, natural person information associated with the first enterprise subject (e.g., a senior executive or shareholder of the first enterprise subject), or a product associated with the first enterprise subject (e.g., a product developed and marketed by the first enterprise subject). In the knowledge graph, both the first enterprise subject and a second enterprise subject can serve as the target node. When the second enterprise subject A in fig. 2 serves as the target node, the original first enterprise subject in fig. 2 becomes node information associated with the second enterprise subject A, and the relation between the two changes accordingly. The relation between each target node and its associated nodes, together with the relevance weight, is also embodied in the knowledge graph. The relation between the first enterprise subject and a second enterprise subject includes but is not limited to investment relations, supply-demand relations and guarantee relations; the relation between a natural person and the first enterprise subject includes employment relations and the like (e.g., shareholder, senior executive, employee).
For example, the second enterprise subject A is a supplier of the first enterprise subject, with a relevance weight of 0.65; the product A is a product of the first enterprise subject, with a relevance weight of 0.5; and the natural person B is a shareholder of the first enterprise subject, with a relevance weight of 1. In the knowledge graph, the relevance weight is assigned according to the attribute information of each relation: for example, the larger the proportion of an investment, the larger the weight, and the more important a position, the larger the weight. The specific construction method is not described in detail here. The constructed knowledge graph can be stored in a graph database and used for retrieval and query.
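As a rough illustration of the structure just described, the following sketch stores the three example relations (supplier 0.65, product 0.5, shareholder 1) as weighted edges. The `Edge` class, the `neighbours` helper and the node names are hypothetical and are not taken from the patent; a real deployment would use a graph database, as the text notes.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Edge:
    source: str    # associated node: enterprise, product, or natural person
    target: str    # target node: an enterprise subject
    relation: str  # relation type, e.g. "supplier"
    weight: float  # relevance weight attached to the relation

# The three relations and weights come from the example in the text;
# the node names are placeholders.
EDGES = [
    Edge("second enterprise A", "first enterprise", "supplier", 0.65),
    Edge("product A", "first enterprise", "product", 0.5),
    Edge("natural person B", "first enterprise", "shareholder", 1.0),
]

def neighbours(target, edges):
    """Map each node associated with `target` to its (relation, weight)."""
    return {e.source: (e.relation, e.weight) for e in edges if e.target == target}

if __name__ == "__main__":
    for node, (relation, weight) in neighbours("first enterprise", EDGES).items():
        print(node, relation, weight)
```

Treating an enterprise as either target or associated node, as the text describes, would only require adding edges in the opposite direction with their own relation labels.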

In the step S102, all keywords are obtained through word segmentation processing to form a keyword set, where the keyword set is denoted as K, keywords in the keyword set K are searched in the knowledge graph, and an enterprise subject associated with the keyword set K is obtained to serve as a candidate enterprise set, and the candidate enterprise set is denoted as C.
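The retrieval in step S102 can be sketched as follows, assuming the text has already been segmented into tokens and that the knowledge graph is exposed as a simple lookup from node name to linked enterprises. The `GRAPH_INDEX` dictionary and both function names are hypothetical stand-ins for a graph-database query.

```python
# Hypothetical index: graph node name -> {linked enterprise subject: weight}.
GRAPH_INDEX = {
    "product A": {"first enterprise": 0.5},
    "natural person B": {"first enterprise": 1.0},
    "second enterprise A": {"first enterprise": 0.65},
}

def extract_keywords(tokens, index):
    """Keyword set K: tokens that exist as nodes in the graph, first-seen order."""
    keywords = []
    for token in tokens:
        if token in index and token not in keywords:
            keywords.append(token)
    return keywords

def candidate_enterprises(keywords, index):
    """Candidate set C: every enterprise linked to at least one keyword."""
    candidates = []
    for keyword in keywords:
        for enterprise in index[keyword]:
            if enterprise not in candidates:
                candidates.append(enterprise)
    return candidates

if __name__ == "__main__":
    tokens = ["natural person B", "said", "product A", "sells", "product A"]
    K = extract_keywords(tokens, GRAPH_INDEX)
    C = candidate_enterprises(K, GRAPH_INDEX)
    print(K, C)
```

Lists are used instead of sets so that the order of K and C stays stable, which matters once they index the rows and columns of the matrices used in the later steps.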

S103, calculating the association degree of the text and the candidate enterprise subject according to the word frequency of the keywords associated with the candidate enterprise subject in the candidate enterprise set. The method for calculating the association degree according to the word frequency comprises the following steps:

let F be the word frequency matrix of the keyword set K:

where f_i denotes the word frequency of the i-th keyword;

a relevance factor RX and a relevance factor RY are then defined from the summed word frequencies of the keywords associated with each candidate enterprise subject, in the manner set out in the disclosure above,

where 0 ≤ ry_i ≤ 1;

and the association matrix R^KC of the text and the candidate enterprise set C is obtained, where the product is the matrix dot product operation and each element of R^KC indicates the degree of association between the text and the i-th candidate enterprise subject. Based on the association degree, a threshold can be set to screen out the enterprise subjects most closely associated with the text; likewise, different texts associated with the i-th subject can be screened and ranked.
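The first part of this calculation, summing the word frequencies of the keywords linked to each candidate, can be sketched as below. It uses the binary association matrix described in the disclosure (1 for connected, 0 for not connected). The formulas for RX and RY are not reproduced in this text, so the sketch deliberately stops at the summed frequencies; the function names and the toy data are assumptions.

```python
from collections import Counter

def word_frequencies(tokens, keywords):
    """Word frequency vector F: occurrences of each keyword in the token list."""
    counts = Counter(tokens)
    return [counts[k] for k in keywords]

def summed_frequencies(R, F):
    """For each candidate (row of R), sum the frequencies of its linked keywords."""
    return [sum(r * f for r, f in zip(row, F)) for row in R]

if __name__ == "__main__":
    keywords = ["k1", "k2", "k3"]
    tokens = ["k1", "x", "k2", "k1", "k3", "k3", "k3"]
    F = word_frequencies(tokens, keywords)
    # Rows: candidates; columns: keywords; 1 = connected in the graph.
    R = [
        [1, 1, 0],  # candidate 0 is linked to k1 and k2
        [0, 1, 1],  # candidate 1 is linked to k2 and k3
    ]
    print(F, summed_frequencies(R, F))
```

A keyword linked to several candidates (here `k2`) contributes to each of their sums, which is why the later factors are needed to rank candidates against one another.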

Preferably or optionally, the association degree between the text and the candidate enterprise subject can further be calculated from both the word frequency and the correlation coefficient between each keyword and the candidate enterprise subject, as follows:

firstly, counting the word frequency vector F of the keyword set K:

where f_i denotes the word frequency of the i-th keyword;

weighting the word frequency matrix by the correlation coefficients;

and defining a relevance factor RY, which measures the relevance ranking of a candidate enterprise subject across different texts, where β > 0 is a scaling adjustment parameter and scale > 0 is the number of words remaining after the segmented words of the text are cleaned, used to measure the text length,

where 0 ≤ ry_i ≤ 1,

and where the product is the matrix dot product operation.

It is understood that the relation weight is used to calculate the association between the keywords and the candidate enterprise subject more accurately; in some embodiments, the relation weight is not a necessary technical feature.
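To make the effect of the relation weight concrete, the sketch below replaces the 0/1 entries with correlation coefficients r_ij and ranks candidates by the resulting weighted sums. The coefficients reuse the illustrative weights from the graph example (0.65, 0.5, 1); the function name and the candidate labels are hypothetical, and the RX and RY factors are again left out because their formulas are not given in this text.

```python
def weighted_frequencies(R, F):
    """Sum of keyword frequencies for each candidate, weighted by r_ij."""
    return [sum(r * f for r, f in zip(row, F)) for row in R]

if __name__ == "__main__":
    F = [2, 1, 3]              # word frequencies of three keywords
    R = [
        [1.0, 0.65, 0.0],      # candidate "A": coefficients to each keyword
        [0.0, 0.5, 1.0],       # candidate "B"
    ]
    names = ["A", "B"]
    totals = weighted_frequencies(R, F)
    # Rank candidates from most to least strongly associated.
    ranking = sorted(zip(names, totals), key=lambda pair: pair[1], reverse=True)
    print(ranking)
```

With binary entries both candidates would receive a sum based only on counts; the coefficients let a weakly related keyword count for less than a strongly related one.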

In this embodiment of the invention, after the keywords in a text are extracted, each keyword is searched in the pre-established knowledge graph to obtain the enterprise subjects corresponding to it, and these enterprise subjects are taken as candidate enterprise subjects forming a candidate enterprise subject set. The association degree between the text and each candidate enterprise subject is then obtained from the word frequency of the keywords appearing in the text and the relation weight between the keywords and the candidate enterprise subject. This improves the success rate and accuracy of associating the text with an enterprise subject (called the target enterprise subject), enriches the dimensions along which text information is associated with the target enterprise subject, and provides a more accurate basis for subsequent analysis.

Referring to fig. 3, fig. 3 is a flowchart illustrating a second embodiment of a method for calculating the relevance of a text to an enterprise principal by using a knowledge-graph according to the present invention. The method for calculating the relevance between the text and the enterprise main body by using the knowledge graph comprises the following steps:

s201, acquiring a text;

s202, performing paragraph division preprocessing on the text;

in this step, the text is subjected to paragraph segmentation preprocessing in the following manner:

The public opinion text is assumed to comprise two main parts, a title and a body, and the body comprises P ≥ 1 natural paragraphs. The body is split into H ≥ 1 parts, denoted part₁, …, part_H respectively, and the title is denoted part₀; the number of paragraphs in each part is recorded as L = (l₀, l₁, …, l_H). Considering that different paragraphs have different importance in a text, the lengths of the beginning and the end of the body are limited when the body is split, such that

the maximum proportions of the 1st part and the H-th part relative to the total number of paragraphs P are bounded; in this embodiment they are taken as

The number of paragraphs contained in each part is calculated as:

where the rounding operator denotes an integer not less than x.

In this step, after the paragraph segmentation preprocessing, corresponding weights are also given to the paragraph positions. Generally, the title, the front part and the tail part of the text are given higher weights, and the middle of the text is given a relatively lower weight. For example, the weight w₀ of the title is 0.35, the weight w₁ of the front part is 0.25, the weight w_H of the tail part is 0.25, and the weight of each middle part, w₂ to w_H-1, is 0.15.
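The example weighting can be written as a small helper that builds the weight vector W for a given number of parts H. The default values are the example weights from the text (0.35, 0.25, 0.25, 0.15); the function name and the handling of H = 1 (a single part treated as the front part) are assumptions, since the text does not cover that case.

```python
def position_weights(H, title=0.35, front=0.25, tail=0.25, middle=0.15):
    """Return [w_0, w_1, ..., w_H] for a title plus H body parts."""
    if H < 1:
        raise ValueError("H must be at least 1")
    if H == 1:
        # Assumption: a single body part is treated as the front part.
        return [title, front]
    # Title, front part, H - 2 middle parts, tail part.
    return [title, front] + [middle] * (H - 2) + [tail]

if __name__ == "__main__":
    print(position_weights(3))
    print(position_weights(5))
```

For H = 3 this yields the same four weights that the worked example later lists for W.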

S203, performing word segmentation on the text, extracting the set of keywords appearing in the text, and retrieving the enterprise subjects associated with the keywords through a pre-established knowledge graph, so as to take those enterprise subjects as a candidate enterprise set, wherein the knowledge graph comprises target node information, associated node information, the relation between the target node information and the associated node information, and an association weight, the target node information comprises first enterprise subject information, and the associated node information comprises second enterprise subject information, product information or natural person information associated with the first enterprise subject information;

In this step, word segmentation is performed on the paragraph-segmented text obtained in step S202. With reference to the knowledge graph, all candidate words in the text that can be found in the knowledge graph are obtained and labeled as keywords, and the set formed by all keywords is denoted K. The keywords in K are then searched in the knowledge graph, and the enterprise subjects associated with K are obtained as the candidate enterprise set, denoted C.

S204, calculating the association degree between the text and the candidate enterprise subject according to the word frequency, the paragraph position, the relation weight and the text length of the keywords associated with the candidate enterprise subject in the candidate enterprise set, wherein the text length is determined by the number of words obtained in the word segmentation step.

In the step, the association degree of the text and the candidate enterprise subject is calculated in the following mode:

let W be the weight matrix of the keywords by paragraph position:

let R be the correlation coefficient matrix of the candidate enterprise set C and the keyword set K:

weighting the word frequency matrix by the correlation coefficients,

where 0 ≤ ry_i ≤ 1,

and where the product is the matrix dot product operation.

In this embodiment of the invention, the text is subjected to paragraph segmentation preprocessing and corresponding weights are given to the paragraphs, so that after word segmentation the weight matrix of the keywords is determined by the positions of the paragraphs in which they appear. The word frequency matrix is then weighted by the correlation coefficients, the relevance factors are obtained, and the association matrix of the text and the candidate enterprise set C is derived, so that the association between the whole text and each enterprise subject in C is obtained more accurately.
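One way to combine the three ingredients named in this embodiment is sketched below: the per-part word frequency matrix F is first weighted by the correlation coefficients R, giving each candidate's weighted frequency in every part, and the result is then reduced with the position weights W. The final reduction with W is an assumption made for illustration; the text does not reproduce the exact formula, and the RX and RY factors are omitted for the same reason. All names and numbers are hypothetical.

```python
def matmul(A, B):
    """Plain-Python matrix product of A (m x n) and B (n x p)."""
    return [
        [sum(A[i][k] * B[k][j] for k in range(len(B))) for j in range(len(B[0]))]
        for i in range(len(A))
    ]

def position_weighted_scores(R, F, W):
    """Weight F by R, then combine the per-part sums with position weights W."""
    per_part = matmul(R, F)  # rows: candidates, columns: part_0 .. part_H
    return [sum(v * w for v, w in zip(row, W)) for row in per_part]

if __name__ == "__main__":
    W = [0.35, 0.25, 0.15, 0.25]   # title, front, middle, tail (H = 3)
    R = [[1.0, 0.5]]               # one candidate, two keywords
    F = [
        [1, 0, 2, 0],              # keyword 0: frequency in part_0 .. part_3
        [0, 2, 0, 0],              # keyword 1
    ]
    print(position_weighted_scores(R, F, W))
```

In this toy input a single mention in the title contributes more than a single mention in the middle part, which reflects the stated intent of giving the title, front and tail higher weights.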

The following method for calculating the relevance of a text to an enterprise subject by using a knowledge graph is described in detail by a specific example:

Referring to fig. 4 and fig. 5, fig. 4 is the sample article of the example, and fig. 5 is the knowledge graph corresponding to the sample article; due to limited space, only the part of the knowledge graph centered on "Leye information technology (Beijing) stock Limited" is shown.

In the first step, the sample article is preprocessed. The body of the sample article has four natural paragraphs in total, so P = 4, and the following values are taken:

H = 3,

The paragraph division and weights obtained according to the formula are given in the following table:

Table 1

W = (0.35, 0.25, 0.15, 0.25)

In the second step, the keywords in the text and the candidate subject set are extracted.

(1) The keyword set in the title and body text is:

K = {look, grandbin, circle of friends, video net, new look intellectuality family, Tengchun video, video TV, creation and entertainment}

(2) Searching the knowledge graph, the set of enterprises directly related to K is:

C = {Leye information technology (Beijing) stock Limited, Shenzhen City Tengchen computer systems Limited}

In the third step, the relevance between the public opinion text and each candidate target subject is calculated.

Combining the correlation coefficients (the numbers on the connecting lines) in the knowledge graph, the correlation coefficient matrix R of the subject set C and the keyword set K is obtained:

Table 2

The word frequency matrix F is as follows:

The resulting weighted word frequency matrix is as follows:

After cleaning, the segmented text contains 148 words, so scale = 148, and β = 100 is taken.

The association matrix R^KC of the text and the subject set C is obtained as follows:

therefore, the relevance of the sample article to the "Leye information technology (Beijing) stock Limited" is 0.526, and the relevance to the "Shenzhen Tengchen computer systems Limited" is 0.122. (the coefficients in the above specific examples are all example assumptions)

Referring to fig. 6, the present invention also discloses an apparatus for calculating the relevancy between a text and an enterprise subject by using a knowledge graph, which includes:

the text acquisition module is used for acquiring a text;

and the association degree calculation module is used for calculating the association degree of the text and the candidate enterprise subject according to the word frequency and the relation weight of the occurrence of the keywords associated with the candidate enterprise subject in the candidate enterprise set.

Optionally, the system further comprises a paragraph segmentation preprocessing module, configured to perform paragraph segmentation preprocessing on the text, and further configured to assign corresponding weights to text paragraphs;

the relevancy calculation module is further used for calculating the relevancy between the text and the candidate enterprise subject according to the word frequency, the paragraph position, the relation weight and the text length of the keywords associated with the candidate enterprise subject in the candidate enterprise set.

Optionally, the paragraph segmentation preprocessing module performs paragraph segmentation preprocessing according to the following formula:


Optionally, the word segmentation module is further configured to perform word segmentation on the segmented text obtained from paragraph segmentation, so as to obtain all keywords and form a keyword set, denoted K. The keywords in the keyword set K are then searched in the knowledge graph to obtain the enterprise subjects associated with them, and these enterprise subjects are taken as the candidate enterprise set, denoted C.

In the embodiment of the present invention, for the functional description of each module of the apparatus for calculating the relevancy between a text and an enterprise subject by using a knowledge graph, reference may be made to the description of the method above, which is not repeated here.
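How the paragraph segmentation preprocessing module and the word segmentation module might cooperate can be sketched as follows. This is a minimal sketch under stated assumptions: the knowledge graph is a hypothetical in-memory dictionary mapping each keyword to the enterprise subjects it is linked to, whitespace splitting stands in for a real word segmenter, and the paragraph weights are made-up values rather than those given by the formula of the invention.

```python
from collections import defaultdict

# Hypothetical knowledge graph: keyword -> {enterprise subject: association weight}.
GRAPH = {
    "phone": {"Company A": 0.6},
    "alice": {"Company A": 0.9, "Company B": 0.3},
    "cloud": {"Company B": 0.7},
}


def paragraph_weights(count):
    """Assumed weighting: the first paragraph counts double, the rest once."""
    return [2.0 if index == 0 else 1.0 for index in range(count)]


def extract(text, graph=GRAPH):
    """Return the keyword set K, the candidate set C and weighted frequencies."""
    # Paragraph segmentation: one paragraph per non-empty line.
    paragraphs = [p for p in text.split("\n") if p.strip()]
    weights = paragraph_weights(len(paragraphs))

    # Word segmentation (whitespace split as a stand-in) and keyword matching;
    # each occurrence contributes the weight of the paragraph it appears in.
    freq = defaultdict(float)
    for paragraph, weight in zip(paragraphs, weights):
        for token in paragraph.lower().split():
            if token in graph:
                freq[token] += weight

    K = set(freq)
    # Candidate set C: every enterprise subject linked to a keyword in K.
    C = {subject for keyword in K for subject in graph[keyword]}
    return K, C, dict(freq)


text = "Alice unveils new phone\nThe phone ships in May"
K, C, freq = extract(text)
print(sorted(K), sorted(C), freq)
```

Keeping the graph lookup keyed by keyword means the candidate set C is built only from subjects actually reachable from words in the text, so unrelated enterprises are never scored.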

The above description is only an embodiment of the present invention, and not intended to limit the scope of the present invention, and all modifications of equivalent structures and equivalent processes, which are made by using the contents of the present specification and the accompanying drawings, or directly or indirectly applied to other related technical fields, are included in the scope of the present invention.

Claims

1. A method for calculating the relevancy of a text and an enterprise subject by using a knowledge graph, comprising the following steps:

acquiring a text;

performing word segmentation on the text, extracting a keyword set appearing in the text, and retrieving the enterprise subjects associated with the keywords through a pre-established knowledge graph, so as to take the enterprise subjects associated with the keywords as a candidate enterprise set, wherein the knowledge graph comprises target node information, associated node information, a relation between the target node information and the associated node information, and an association weight, the target node information comprises first enterprise subject information, and the associated node information comprises second subject information, product information or natural person information associated with the first enterprise subject information;

calculating the association degree of the text and the candidate enterprise subject according to the word frequency of the keywords associated with the candidate enterprise subject in the candidate enterprise set;

the method is characterized in that in the steps of performing word segmentation processing on a text, extracting a keyword set appearing in the text, and searching an enterprise subject related to the keywords through a pre-established knowledge graph to take the enterprise subject related to the keywords as a candidate enterprise set, the method comprises the following steps:

2. The method of calculating the relevance of text to an enterprise subject using a knowledge-graph of claim 1, wherein in the step of calculating the relevance of text to the candidate enterprise subject, further comprising:

3. The method for calculating relevance of text to an enterprise subject using a knowledge graph of claim 2, wherein before the step of performing word segmentation on the text, the method further comprises:

4. An apparatus for calculating relevance of text to an enterprise subject using a knowledge graph, comprising:

the text acquisition module is used for acquiring a text;

5. The apparatus for calculating relevance of text to an enterprise subject using a knowledge graph as claimed in claim 4, wherein the association degree calculation module is further configured to calculate the association degree between the text and the candidate enterprise subject according to the word frequency and relationship weight of the keywords associated with the candidate enterprise subjects in the candidate enterprise set.