CN106469187B - Keyword extraction method and device - Google Patents
Keyword extraction method and device
- Publication number: CN106469187B
- Application number: CN201610751325.2A
- Authority: CN (China)
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/30—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F16/36—Creation of semantic tools, e.g. ontology or thesauri
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Computational Linguistics (AREA)
- Data Mining & Analysis (AREA)
- Databases & Information Systems (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
The invention discloses a keyword extraction method and device in the technical field of data processing, intended to solve the problem that existing keyword extraction is insufficiently intelligent and inefficient. The main technical solution is as follows: obtain the topic influence vector of each word in a target text, where the topic influence vector of a word indicates the word's influence on the topics in the target text; calculate the importance of each word in the target text according to the word graph of the target text and the topic influence vectors of the words, where the importance indicates the degree of association between the word and the target text; and select, from the target text, the words that meet a preset importance as the keywords of the target text. The invention is mainly used for extracting keywords.
Description
Technical field
The present invention relates to the technical field of data processing, and in particular to a keyword extraction method and device.
Background
Keyword extraction is the task of extracting, from a given text, the words or phrases that reflect the main idea of the text. It plays an important role in automatic summarization, text mining, and information retrieval, and is in particular a key method for automatic annotation. Depending on whether an annotated training corpus is required, keyword extraction methods fall into two broad classes: supervised keyword extraction and unsupervised keyword extraction.
Unsupervised keyword extraction based on a word graph builds the nodes of the word graph from the distribution of words in a document, and then calculates the influence passed between neighbouring words as a weighted combination of three aspects: the coverage influence, the position influence, and the frequency influence of a word. That is, it calculates the edge weights between nodes in the word graph and extracts keywords from the document according to those edge weights.
However, in word-graph-based keyword extraction, the word frequency of the text represents the frequency influence, and the word co-occurrence relationship represents the position influence and the coverage influence. The keywords extracted by such a method are therefore the words that stand out in word frequency and co-occurrence, and these words are often unrelated to the topic of the text. To make the extracted keywords fit the topic of the text more closely and obtain good results, manual experience usually has to be applied: when weighing the importance of words, a relatively simple experience-based assignment is often used, for example giving a higher weight to words that appear in the topic. Existing word-graph-based keyword extraction methods therefore require manual intervention, and their keyword extraction is insufficiently intelligent and inefficient.
Summary of the invention
In view of this, the present invention provides a keyword extraction method and device, whose main purpose is to solve the problem that existing keyword extraction is insufficiently intelligent and inefficient.
According to one aspect of the present invention, a keyword extraction method is provided, comprising:
obtaining the topic influence vector of each word in a target text, where the topic influence vector of a word indicates the word's influence on the topics in the target text;
calculating the importance of each word in the target text according to the word graph of the target text and the topic influence vectors of the words, where the importance indicates the degree of association between the word and the target text;
selecting, from the target text, the words that meet a preset importance as the keywords of the target text.
Specifically, obtaining the topic influence vector of each word in the target text comprises:
calculating, by the document topic generation model LDA, the probability of each topic occurring in the target text and the probability of each word occurring in each topic;
taking the dot product of the probability of each topic occurring in the target text with the probability of each word occurring in each topic, to obtain the topic influence vector of each word in the target text.
Further, before calculating the importance of each word in the target text according to the word graph of the target text and the topic influence vectors of the words, the method further comprises:
constructing the word graph of the target text by taking the words in the target text as the nodes of the word graph and the positional adjacency of words in the target text as the connecting edges between the nodes.
Specifically, calculating the importance of each word in the target text according to the word graph of the target text and the topic influence vectors of the words comprises:
calculating the similarity between words in the target text from the topic influence vectors of the words;
calculating the importance of each word in the target text according to the word graph of the target text and the similarity between words.
Specifically, calculating the similarity between words in the target text from the topic influence vectors of the words comprises:
obtaining the words corresponding to two nodes joined by a connecting edge in the word graph of the target text;
determining the similarity between the words by calculating the cosine similarity of the topic influence vectors of the words corresponding to the two nodes joined by the connecting edge.
Specifically, calculating the importance of each word in the target text according to the word graph of the target text and the similarity between words comprises:
taking the similarity between two words as the edge weight of the connecting edge between the corresponding nodes in the word graph of the target text;
accumulating the edge weights of all connecting edges of a node in the word graph of the target text to obtain the importance of the corresponding word.
Specifically, calculating the importance of each word in the target text according to the word graph of the target text and the topic influence vectors of the words comprises:
setting the topic influence vector of each word as the weight of the corresponding node in the word graph of the target text;
calculating the importance of each word in the target text according to TextRank, a keyword extraction algorithm based on a word graph model, and the weights of the nodes.
Specifically, selecting from the target text the words that meet the preset importance as the keywords of the target text comprises:
selecting the word with the highest importance in the target text as the keyword of the target text.
According to another aspect of the present invention, a keyword extraction device is provided, comprising:
an acquiring unit, configured to obtain the topic influence vector of each word in a target text, where the topic influence vector of a word indicates the word's influence on the topics in the target text;
a computing unit, configured to calculate the importance of each word in the target text according to the word graph of the target text and the topic influence vectors of the words, where the importance indicates the degree of association between the word and the target text;
a selection unit, configured to select, from the target text, the words that meet a preset importance as the keywords of the target text.
Specifically, the acquiring unit comprises:
a computing module, configured to calculate, by the document topic generation model LDA, the probability of each topic occurring in the target text and the probability of each word occurring in each topic;
a dot product module, configured to take the dot product of the probability of each topic occurring in the target text with the probability of each word occurring in each topic, to obtain the topic influence vector of each word in the target text.
Further, the device further comprises:
a construction unit, configured to construct the word graph of the target text by taking the words in the target text as the nodes of the word graph and the positional adjacency of words in the target text as the connecting edges between the nodes.
Specifically, the computing unit comprises:
a first computing module, configured to calculate the similarity between words in the target text from the topic influence vectors of the words;
a second computing module, configured to calculate the importance of each word in the target text according to the word graph of the target text and the similarity between words.
Specifically, the first computing module comprises:
an acquisition submodule, configured to obtain the words corresponding to two nodes joined by a connecting edge in the word graph of the target text;
a determining submodule, configured to determine the similarity between the words by calculating the cosine similarity of the topic influence vectors of the words corresponding to the two nodes joined by the connecting edge.
Specifically, the second computing module comprises:
a configuration submodule, configured to take the similarity between two words as the edge weight of the connecting edge between the corresponding nodes in the word graph of the target text;
an accumulation submodule, configured to accumulate the edge weights of all connecting edges of a node in the word graph of the target text to obtain the importance of the corresponding word.
Specifically, the computing unit further comprises:
a setting module, configured to set the topic influence vector of each word as the weight of the corresponding node in the word graph of the target text;
a third computing module, configured to calculate the importance of each word in the target text according to TextRank, a keyword extraction algorithm based on a word graph model, and the weights of the nodes.
The selection unit is specifically configured to select the word with the highest importance in the target text as the keyword of the target text.
Through the above technical solutions, the technical solutions provided by the embodiments of the present invention have at least the following advantages:
In the keyword extraction method and device provided by the embodiments of the present invention, the topic influence vector of each word in the target text is obtained first, where the topic influence vector of a word indicates the word's influence on the topics in the target text. The importance of each word in the target text is then calculated according to the word graph of the target text and the topic influence vectors of the words, where the importance indicates the degree of association between the word and the target text. Finally, the words that meet a preset importance are selected from the target text as its keywords. Compared with the current practice of extracting keywords by manually assigning, from experience, the importance of words to the topic of a text, the embodiments of the present invention calculate the topic influence vector of each word in the target text with a document topic generation model and use that vector as the measure of the word's importance to the topics of the target text. The embodiments therefore no longer need manual experience to set the importance of words to the topic of the text, and the topic influence vector obtained from the document topic generation model can accurately indicate a word's influence on the topics in the target text. Keywords can thus be extracted from the target text according to the word graph of the target text and the topic influence vectors of the words, so the embodiments of the present invention can improve both the efficiency and the intelligence of keyword extraction.
The above is only an overview of the technical solutions of the present invention. So that the technical means of the invention can be understood more clearly and implemented in accordance with the content of the specification, and so that the above and other objects, features, and advantages of the invention are more readily apparent, specific embodiments of the invention are set out below.
Brief description of the drawings
Various other advantages and benefits will become clear to those of ordinary skill in the art upon reading the following detailed description of the preferred embodiments. The drawings serve only to illustrate the preferred embodiments and are not to be considered a limitation of the invention. Throughout the drawings, the same reference numerals denote the same parts. In the drawings:
Fig. 1 shows a flowchart of a keyword extraction method provided by an embodiment of the present invention;
Fig. 2 shows a flowchart of another keyword extraction method provided by an embodiment of the present invention;
Fig. 3 shows a structural block diagram of a keyword extraction device provided by an embodiment of the present invention;
Fig. 4 shows a structural block diagram of another keyword extraction device provided by an embodiment of the present invention;
Fig. 5 shows a schematic diagram of the word graph of a target text provided by an embodiment of the present invention.
Detailed description of the embodiments
Exemplary embodiments of the present disclosure are described in more detail below with reference to the accompanying drawings. Although the drawings show exemplary embodiments of the disclosure, it should be understood that the disclosure may be implemented in various forms and should not be limited by the embodiments set forth here. Rather, these embodiments are provided so that the disclosure can be understood more thoroughly and its scope can be fully conveyed to those skilled in the art.
An embodiment of the present invention provides a keyword extraction method. As shown in Fig. 1, the method comprises:
101. Obtain the topic influence vector of each word in the target text.
The topic influence vector of a word indicates the word's influence on the topics in the target text; that is, it holds the word's influence on every topic in the target text. Note that, for a word w in a target text d, F denotes the topic influence vector of w in d. The larger the probability that w occurs in a topic z, the larger the influence of w on z; and the larger the probability that the topic z containing w occurs in d, the larger the influence of z on d. The product of the probability that topic z occurs in target text d and the probability that word w occurs in topic z therefore determines the influence of w on topic z in d. That product, however, is only the influence of w on the single topic z, not the influence of w on all topics in d. A target text d may contain several topics, and w may occur in several topics at once, so the topic influence vector F of w in d must be determined from the dot product of the probability of each topic occurring in d with the probability of w occurring in each topic.
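The definition above can be written compactly. The following is one reading of it, assuming K topics z_1, ..., z_K, with P(z_k | d) the probability that topic z_k occurs in target text d and P(w | z_k) the probability that word w occurs in topic z_k. The symbol K and the conditional-probability notation are introduced here for illustration and do not appear in the original text.

```latex
\[
F(w, d) = \bigl( P(z_1 \mid d)\,P(w \mid z_1),\;
                 P(z_2 \mid d)\,P(w \mid z_2),\;
                 \dots,\;
                 P(z_K \mid d)\,P(w \mid z_K) \bigr)
\]
```

Each component is the influence of w on one topic, so F collects the influence of w on all topics in d.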
Based on the above analysis, the embodiments of the present invention can obtain the topic influence vector of each word in the target text with LDA (Latent Dirichlet Allocation, a document topic generation model). Specifically, the process can be as follows: first segment the target text into words; then use LDA to calculate the probability of each topic occurring in the target text and the probability of each word occurring in each topic; then take the result of the dot product of the probability of each topic occurring in the target text with the probability of each word occurring in each topic as the topic influence vector of each word in the target text.
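A minimal Python sketch of the last step, assuming the two LDA outputs are already available from some LDA implementation. The function name, the argument layout (a list of topic probabilities for the document and one word-probability dictionary per topic), and the sample numbers are illustrative assumptions, not part of the patent. Because the text describes the result as a vector with one influence value per topic, the sketch reads the "dot product" as a topic-by-topic product.

```python
def topic_influence_vectors(doc_topic_probs, topic_word_probs):
    """Return {word: [P(z|d) * P(w|z) for each topic z]}.

    doc_topic_probs:  list of P(z|d), one entry per topic (assumed LDA output).
    topic_word_probs: list of dicts mapping word -> P(w|z), one dict per topic.
    """
    vocabulary = set()
    for word_probs in topic_word_probs:
        vocabulary.update(word_probs)

    vectors = {}
    for word in vocabulary:
        # A word missing from a topic is treated as having probability 0 there.
        vectors[word] = [
            p_topic * word_probs.get(word, 0.0)
            for p_topic, word_probs in zip(doc_topic_probs, topic_word_probs)
        ]
    return vectors


# Hypothetical numbers for a document with two topics.
doc_topic_probs = [0.7, 0.3]
topic_word_probs = [
    {"graph": 0.20, "topic": 0.10},
    {"graph": 0.05, "topic": 0.40},
]
vectors = topic_influence_vectors(doc_topic_probs, topic_word_probs)
print(vectors["graph"])  # one component per topic
```

Keeping one component per topic, rather than summing them, preserves the information that later steps compare between words.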
102. Calculate the importance of each word in the target text according to the word graph of the target text and the topic influence vectors of the words.
The importance indicates the degree of association between a word and the target text: the greater the importance of a word, the stronger its association with the target text; the smaller the importance, the weaker the association. Note that the word graph of the target text is constructed following the TextRank algorithm (a keyword extraction algorithm based on a word graph model): the words in the target text are the nodes of the word graph, and the positional adjacency of words in the target text forms the connecting edges between the nodes.
In the embodiments of the present invention, the topic influence vector of a word can be used as the weight of the corresponding node in the word graph of the target text. The node weights are then substituted into the TextRank formula, and the importance of each word in the target text is calculated from the node weights and the influence passed between neighbouring words in the word graph. Alternatively, the topic influence vectors of the words can be used to calculate the edge weight of the connecting edge between two adjacent nodes in the word graph. The edge weights of all connecting edges attached to the same node are then accumulated, and the accumulated result is taken as the importance of the corresponding word in the target text.
For example, suppose the word graph of the target text contains nodes A, B, C, D, and E, which correspond to words A, B, C, D, and E respectively. Node A is connected to nodes B, C, and D; that is, there are connecting edges between node A and each of nodes B, C, and D. Let the topic influence vectors of nodes A, B, C, D, and E be a, b, c, d, and e respectively. Then a, b, c, d, and e can be used as the weights of nodes A, B, C, D, and E, and the TextRank algorithm can be applied to the node weights and the node relationships in the word graph to obtain the importance of each node, that is, the importance of each word in the target text. Alternatively, the importance of each word can be obtained from the edge weights of the connecting edges in the word graph: first calculate the edge weight ab of the edge between nodes A and B from the topic influence vectors of words A and B, the edge weight ac of the edge between nodes A and C from the topic influence vectors of words A and C, and the edge weight ad of the edge between nodes A and D from the topic influence vectors of words A and D; then accumulate the edge weights ab, ac, and ad of the edges attached to node A to obtain the importance of node A, that is, the degree of association between word A and the target text.
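The edge-weight variant of this example can be summarised as follows, assuming (as in the later embodiment) that the edge weight between two connected words is the cosine similarity of their vectors. The symbols I and N(A), and the use of x for the vector of a neighbouring node X, are introduced here for illustration only.

```latex
\[
I(A) = \sum_{X \in N(A)} \operatorname{sim}(a, x) = ab + ac + ad,
\qquad
\operatorname{sim}(a, x) = \frac{a \cdot x}{\lVert a \rVert \, \lVert x \rVert}
\]
```

Here N(A) = {B, C, D} is the set of nodes joined to node A by a connecting edge, and ab, ac, and ad are the edge weights named in the example.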
103. Select, from the target text, the words that meet a preset importance as the keywords of the target text.
The preset importance can be configured according to actual needs. For example, the word with the highest importance in the target text can be selected as the keyword, or the words whose importance exceeds a preset value can be selected as keywords; the embodiments of the present invention impose no specific limitation. Note that, because only words whose importance exceeds the preset value are kept, the larger the preset value, the fewer keywords are extracted from the target text, and the smaller the preset value, the more keywords are extracted.
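The two selection rules mentioned in step 103 can be sketched in Python as follows. The function name, the `threshold` parameter, and the sample scores are hypothetical and serve only to illustrate the rules; the patent does not prescribe an implementation.

```python
def select_keywords(importance, threshold=None):
    """Pick keywords from a {word: importance} mapping.

    threshold=None  -> keep the word(s) with the highest importance.
    threshold=value -> keep every word whose importance exceeds the value,
                       ordered from most to least important.
    """
    if not importance:
        return []
    if threshold is None:
        best = max(importance.values())
        return [word for word, score in importance.items() if score == best]
    ranked = sorted(importance.items(), key=lambda item: item[1], reverse=True)
    return [word for word, score in ranked if score > threshold]


# Hypothetical importance scores.
scores = {"graph": 0.9, "topic": 0.6, "the": 0.1}
print(select_keywords(scores))                 # highest-importance rule
print(select_keywords(scores, threshold=0.5))  # preset-value rule
```

Raising `threshold` can only shrink the returned list, which is the behaviour a preset value on importance is expected to have.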
In the keyword extraction method provided by this embodiment of the present invention, the topic influence vector of each word in the target text is first calculated with a document topic generation model and is used as the measure of the word's importance to the topics in the target text. The importance of each word in the target text is then calculated according to the word graph of the target text and the topic influence vectors of the words, and finally the words that meet a preset importance are selected from the target text as its keywords. Because the embodiment obtains the keywords of the target text without manual experience setting the importance of words to the topic of the text, and because the topic influence vector obtained from the document topic generation model can accurately indicate a word's influence on the topics in the target text, the embodiment can improve both the efficiency and the intelligence of keyword extraction.
An embodiment of the present invention provides another keyword extraction method. As shown in Fig. 2, the method comprises:
201. Obtain the topic influence vector of each word in the target text.
The topic influence vector of a word indicates the word's influence on the topics in the target text; that is, it holds the word's influence on every topic in the target text. In this embodiment, obtaining the topic influence vector of each word in the target text comprises: calculating, by the document topic generation model LDA, the probability of each topic occurring in the target text and the probability of each word occurring in each topic; and taking the dot product of the probability of each topic occurring in the target text with the probability of each word occurring in each topic, to obtain the topic influence vector of each word in the target text. For further details on obtaining the topic influence vector of each word in the target text, refer to the description of the corresponding part of Fig. 1, which is not repeated here.
202. Construct the word graph of the target text by taking the words in the target text as the nodes of the word graph and the positional adjacency of words in the target text as the connecting edges between the nodes.
The positional adjacency of words is the order in which the words appear in the sentences of the target text, and the connecting edges between nodes are undirected. For example, suppose the target text contains the words A, B, C, D, and E, and these words appear in the target text in the order ABCDBEA. The word graph constructed from that order is shown in Fig. 5: node B is adjacent in position to nodes A, C, D, and E, so there are connecting edges between node B and each of nodes A, C, D, and E; node E is adjacent in position to node A, so there is a connecting edge between node E and node A.
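The construction in step 202 can be sketched in Python as follows, assuming the target text has already been segmented into a list of words. The function name and the dictionary-of-sets representation are illustrative choices, not taken from the patent. Applying the adjacency rule to the sequence ABCDBEA also yields the adjacent pair C, D, in addition to the pairs named in the example.

```python
def build_word_graph(tokens):
    """Build an undirected word graph from a token sequence.

    Each distinct word becomes a node; two words are joined by an edge
    when they appear next to each other somewhere in the sequence.
    """
    graph = {token: set() for token in tokens}
    for left, right in zip(tokens, tokens[1:]):
        if left != right:  # ignore immediate repeats, so no self-loops
            graph[left].add(right)
            graph[right].add(left)
    return graph


graph = build_word_graph(list("ABCDBEA"))
print(sorted(graph["B"]))  # ['A', 'C', 'D', 'E']
print(sorted(graph["E"]))  # ['A', 'B']
```

Storing neighbours as sets keeps the edges undirected and collapses repeated adjacencies of the same pair into a single edge.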
203. Calculate the importance of each word in the target text according to the word graph of the target text and the topic influence vectors of the words.
The importance indicates the degree of association between a word and the target text: the greater the importance of a word, the stronger its association with the target text; the smaller the importance, the weaker the association.
In this embodiment, step 203 comprises: calculating the similarity between words in the target text from the topic influence vectors of the words; and calculating the importance of each word in the target text according to the word graph of the target text and the similarity between words. The similarity between words can be calculated with algorithms such as Euclidean distance or cosine similarity; the embodiments of the present invention impose no specific limitation. Specifically, the similarity between two words can be obtained by calculating the Euclidean distance or the cosine similarity between their topic influence vectors. For example, if the topic influence vector of word A is a and that of word B is b, the similarity of words A and B can be obtained as the cosine similarity of a and b, and this similarity is then used as the edge weight of the connecting edge between nodes A and B in the word graph of the target text.
Specifically, calculating the similarity between words in the target text from the topic influence vectors of the words comprises: obtaining the words corresponding to two nodes joined by a connecting edge in the word graph of the target text; and determining the similarity between the words by calculating the cosine similarity of the topic influence vectors of the words corresponding to the two nodes joined by the connecting edge. For example, in the word graph of the target text in Fig. 5, nodes A, B, C, D, and E correspond to words A, B, C, D, and E respectively, and there are connecting edges between node B and each of nodes A, C, D, and E. The cosine similarity of the topic influence vectors of words B and A is then taken as the similarity of words B and A, the cosine similarity of the topic influence vectors of words B and C as the similarity of words B and C, the cosine similarity of the topic influence vectors of words B and D as the similarity of words B and D, and the cosine similarity of the topic influence vectors of words B and E as the similarity of words B and E.
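A short Python sketch of this similarity step, assuming the word graph is a dictionary mapping each word to the set of its neighbours and each word has a vector of equal length. The function names, the sorted-tuple edge keys, and the sample vectors are illustrative assumptions.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # convention chosen for this sketch: a zero vector matches nothing
    return dot / (norm_u * norm_v)


def edge_weights(graph, vectors):
    """Compute one cosine similarity per undirected edge of the word graph."""
    weights = {}
    for word, neighbours in graph.items():
        for other in neighbours:
            edge = tuple(sorted((word, other)))  # same key for (A, B) and (B, A)
            if edge not in weights:
                weights[edge] = cosine_similarity(vectors[word], vectors[other])
    return weights


# Hypothetical three-word graph with two-topic vectors.
graph = {"A": {"B"}, "B": {"A", "C"}, "C": {"B"}}
vectors = {"A": [1.0, 0.0], "B": [1.0, 1.0], "C": [0.0, 1.0]}
weights = edge_weights(graph, vectors)
print(sorted(weights))  # [('A', 'B'), ('B', 'C')]
```

Only pairs joined by a connecting edge are compared, so the number of similarity computations grows with the number of edges rather than with the square of the vocabulary size.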
Specifically, calculating the importance of each word in the target text according to the word graph of the target text and the similarity between words comprises: taking the similarity between two words as the edge weight of the connecting edge between the corresponding nodes in the word graph of the target text; and accumulating the edge weights of all connecting edges of a node in the word graph to obtain the importance of the corresponding word. For example, in the word graph of the target text in Fig. 5, nodes A, B, C, D, and E correspond to words A, B, C, D, and E respectively, and there are connecting edges between node B and each of nodes A, C, D, and E. The similarity of words B and A is then taken as the edge weight ba of the edge between nodes B and A, the similarity of words B and C as the edge weight bc of the edge between nodes B and C, the similarity of words B and D as the edge weight bd of the edge between nodes B and D, and the similarity of words B and E as the edge weight be of the edge between nodes B and E. The importance of word B in the target text is obtained by accumulating the edge weights of the edges attached to node B, that is, from the sum ba+bc+bd+be.
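The accumulation step can be sketched in Python as follows, assuming the edge weights are stored in a dictionary keyed by word pairs. The function name and the numeric edge weights are hypothetical; only the edges named in the text are included.

```python
def importance_from_edges(weights):
    """Sum, for every word, the weights of all edges attached to it."""
    importance = {}
    for (first, second), weight in weights.items():
        importance[first] = importance.get(first, 0.0) + weight
        importance[second] = importance.get(second, 0.0) + weight
    return importance


# Hypothetical similarities for the edges named in the Fig. 5 example.
weights = {
    ("A", "B"): 0.9,  # ba
    ("B", "C"): 0.8,  # bc
    ("B", "D"): 0.7,  # bd
    ("B", "E"): 0.6,  # be
    ("A", "E"): 0.4,  # edge between E and A
}
importance = importance_from_edges(weights)
print(max(importance, key=importance.get))  # B
```

With these sample values, the importance of B is the sum of the four weights attached to it, which mirrors the sum over the edges of node B described in the example.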
In this embodiment, step 203 further comprises: setting the topic influence vector of each word as the weight of the corresponding node in the word graph of the target text; and calculating the importance of each word in the target text according to TextRank, a keyword extraction algorithm based on a word graph model, and the weights of the nodes. Setting the topic influence vector of a word as the weight of the corresponding node means that the importance of the word in the target text is measured by its topic influence vector. This removes the step of assigning values to the words in the target text by manual experience and thereby improves the iterative word-importance formula of TextRank. Calculating the importance of each word in the target text according to TextRank and the node weights can therefore improve both the efficiency and the intelligence of keyword extraction.
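The text does not give the modified TextRank formula, so the following Python sketch is only one possible interpretation. It assumes each node weight has been reduced to a positive scalar (for example, the sum of the components of the word's vector) and uses the normalised weights as the restart distribution of a PageRank-style iteration. The function name, the damping factor of 0.85, the iteration count, and the sample weights are all assumptions.

```python
def weighted_textrank(graph, node_weights, damping=0.85, iterations=50):
    """PageRank-style scoring on an undirected word graph with node priors.

    graph:        {word: set of neighbouring words}
    node_weights: {word: positive scalar weight} (assumed, see lead-in)
    """
    total = sum(node_weights[word] for word in graph)
    prior = {word: node_weights[word] / total for word in graph}
    scores = dict(prior)
    for _ in range(iterations):
        updated = {}
        for word in graph:
            # Each neighbour passes on an equal share of its current score.
            incoming = sum(scores[n] / len(graph[n]) for n in graph[word])
            updated[word] = (1 - damping) * prior[word] + damping * incoming
        scores = updated
    return scores


# Word graph for the sequence ABCDBEA, with hypothetical scalar node weights.
graph = {
    "A": {"B", "E"},
    "B": {"A", "C", "D", "E"},
    "C": {"B", "D"},
    "D": {"B", "C"},
    "E": {"A", "B"},
}
node_weights = {"A": 0.2, "B": 0.3, "C": 0.2, "D": 0.2, "E": 0.1}
scores = weighted_textrank(graph, node_weights)
print(max(scores, key=scores.get))  # B
```

In this reading, a word scores highly when it is both well connected in the word graph and strongly tied to the topics of the text, without any manually assigned weights.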
204. Select the word with the highest importance in the target text as the keyword of the target text.
In this other keyword extraction method provided by an embodiment of the present invention, the structure of the target text and the topic information it contains are important bases for keyword extraction. The embodiment therefore obtains the topic influence vector of each word in the target text from the LDA topic model, then calculates the importance of each word according to the word graph of the target text and the similarity between words, and finally takes the word with the highest importance in the target text as the keyword of the target text. In other words, the embodiment extracts keywords with the LDA topic model and the TextRank algorithm. Because the topic influence vector of a word can serve as the measure of the word's importance to the topics in the target text, and because the topic influence vector obtained from the document topic generation model can accurately indicate the word's influence on those topics, the embodiment can improve both the efficiency and the intelligence of keyword extraction.
Further, an embodiment of the present invention provides a keyword extraction device. As shown in Fig. 3,
the device includes: an acquiring unit 31, a computing unit 32, and a selection unit 33.
The acquiring unit 31 is configured to obtain the topic influence vector of each word in the target text.
The topic influence vector of a word represents the influence of the word on the topics in the target
text, that is, its influence on all topics in the target text.
It should be noted that, for a word w in a target text d, let F denote the topic influence vector of
word w with respect to the topics in target text d. The greater the probability that word w appears in a
topic z, the greater the influence of the word on topic z; and the greater the probability that topic z
appears in target text d, the greater the influence of topic z on target text d. The influence of word w
on topic z in target text d can therefore be determined from the product of the probability that topic z
appears in target text d and the probability that word w appears in topic z. That product, however,
gives only the influence of word w on the single topic z, not its influence on all topics in target
text d. A target text d may contain multiple topics, and word w may appear in several of them at once,
so the topic influence vector F of word w in target text d has to be determined from the dot product of
the probabilities with which each topic appears in target text d and the probabilities with which word w
appears in each topic.
Based on the above analysis, the embodiment of the present invention can obtain the topic influence
vector of each word in the target text according to LDA (Latent Dirichlet Allocation, a document topic
generation model). The specific process may be as follows: first segment the target text into words;
then use LDA to calculate the probability that each topic appears in the target text and the probability
that each word appears in each topic; finally, take the result of the dot product of the topic
probabilities in the target text and the word probabilities in each topic as the topic influence vector
of each word in the target text.
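The last step of this process can be sketched in Python as follows. This is only an illustration under stated assumptions: the text calls the operation a dot product but describes its result as a vector, so the sketch reads it as an element-wise product with one component per topic. The function name topic_influence_vectors, the variable names theta and phi, and all probability values are hypothetical; training the LDA model itself is not shown.

```python
from typing import Dict, List


def topic_influence_vectors(
    topic_probs: List[float],
    word_probs: List[Dict[str, float]],
) -> Dict[str, List[float]]:
    """Return one vector per word, with one component per topic.

    topic_probs[k]   : probability that topic k appears in the target text
    word_probs[k][w] : probability that word w appears in topic k
    Component k of a word's vector is topic_probs[k] * word_probs[k][w]
    (assumption: "dot product" is read as an element-wise product).
    """
    if len(topic_probs) != len(word_probs):
        raise ValueError("one word distribution is required per topic")
    vocabulary = sorted({w for dist in word_probs for w in dist})
    return {
        w: [p * dist.get(w, 0.0) for p, dist in zip(topic_probs, word_probs)]
        for w in vocabulary
    }


# Hypothetical LDA output for a target text with two topics (made-up numbers).
theta = [0.75, 0.25]
phi = [
    {"keyword": 0.5, "graph": 0.25, "topic": 0.25},
    {"keyword": 0.25, "graph": 0.25, "topic": 0.5},
]
vectors = topic_influence_vectors(theta, phi)
print(vectors["keyword"])  # [0.375, 0.0625]
```

Keeping one component per topic, rather than summing the products into a single number, preserves the per-topic detail that the later similarity calculation between words relies on.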
The computing unit 32 is configured to calculate the importance of each word in the target text according
to the word graph of the target text and the topic influence vector of the word, where the importance
indicates the degree of association between the word and the target text.
The greater the importance of a word, the stronger its association with the target text; the smaller the
importance, the weaker the association. It should be noted that the word graph of the target text is
built on the basis of the TextRank algorithm (a keyword extraction algorithm based on a word graph
model): the words in the target text serve as the nodes of the word graph, and the positional adjacency
of words in the target text provides the connecting edges between the nodes.
In the embodiment of the present invention, the topic influence vector of a word may be used as the weight
value of the corresponding node in the word graph of the target text. The node weight values are then
substituted into the TextRank formula, and the importance of each word is calculated from the node
weight values and the influence passed on by neighboring words in the word graph. Alternatively, the
topic influence vectors of words may be used to calculate the edge value of the edge connecting two
nodes in the word graph; the edge values of the edges connected to the same node are then accumulated,
and the accumulated result is taken as the importance of the corresponding word in the target text.
For example, suppose the word graph of the target text contains nodes A, B, C, D and E, corresponding to
words A, B, C, D and E respectively, and node A is connected to nodes B, C and D, i.e. there are
connecting edges between node A and nodes B, C and D. Let the topic influence vectors of nodes A, B, C,
D and E be a, b, c, d and e. Then a, b, c, d and e can be used as the weight values of nodes A, B, C, D
and E respectively, and the TextRank algorithm can be applied to the node weight values and the node
relations in the word graph to obtain the importance of each node, i.e. the importance of each word in
the target text. Alternatively, the importance of each word can be obtained from the edge values of the
edges connected to its node: the topic influence vectors of words A and B are first used to calculate
the edge value ab of the edge between nodes A and B, those of words A and C to calculate the edge value
ac of the edge between nodes A and C, and those of words A and D to calculate the edge value ad of the
edge between nodes A and D; the edge values ab, ac and ad of the edges connected to node A are then
accumulated to obtain the importance of node A, i.e. the degree of association between word A and the
target text.
The selection unit 33 is configured to select, from the target text, words that meet a preset importance
as keywords of the target text.
The preset importance can be configured according to actual needs. For example, the word with the highest
importance may be selected from the target text as its keyword, or the words whose importance exceeds a
preset value may be selected as its keywords; the embodiment of the present invention does not
specifically limit this. It should be noted that, when words are selected because their importance
exceeds a preset value, a larger preset value yields fewer keywords from the target text and a smaller
preset value yields more.
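The two selection strategies mentioned above can be sketched as follows. The function name select_keywords, the parameter name threshold, and the importance scores are hypothetical and serve only as an illustration; the text does not prescribe a particular implementation.

```python
from typing import Dict, List, Optional


def select_keywords(
    importance: Dict[str, float],
    threshold: Optional[float] = None,
) -> List[str]:
    """Pick keywords by highest importance, or by a preset value if one is given."""
    if not importance:
        return []
    if threshold is None:
        # Strategy 1: the single word with the highest importance.
        return [max(importance, key=importance.get)]
    # Strategy 2: every word whose importance exceeds the preset value,
    # listed from most to least important.
    chosen = [word for word, score in importance.items() if score > threshold]
    return sorted(chosen, key=importance.get, reverse=True)


# Hypothetical importance scores for five words.
scores = {"A": 0.75, "B": 1.375, "C": 1.0, "D": 0.875, "E": 0.75}
print(select_keywords(scores))                  # ['B']
print(select_keywords(scores, threshold=0.8))   # ['B', 'C', 'D']
```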
It should be noted that, for other descriptions of the functional units of the keyword extraction device
provided by the embodiment of the present invention, reference may be made to the corresponding
description of the method shown in Fig. 2, which is not repeated here. It should be understood that the
device in this embodiment can implement all of the content of the foregoing method embodiment.
The keyword extraction device provided by the embodiment of the present invention first calculates the
topic influence vector of each word in the target text through a document topic generation model, then
uses the topic influence vector of a word as the measure of its importance to the topics in the target
text, calculates the importance of each word according to the word graph of the target text and the
topic influence vectors, and finally selects words that meet the preset importance as keywords of the
target text. Because the embodiment obtains the keywords of the target text without setting the
importance of words to the text topics by manual experience, and the topic influence vector obtained
from the document topic generation model can accurately represent the influence of a word on the topics
in the target text, the embodiment can improve the efficiency and intelligence of keyword extraction.
Further, an embodiment of the present invention provides another keyword extraction device. As shown in
Fig. 4, the device includes: an acquiring unit 41, a computing unit 42, and a selection unit 43.
The acquiring unit 41 is configured to obtain the topic influence vector of each word in the target text,
where the topic influence vector of a word represents the influence of the word on the topics in the
target text;
The computing unit 42 is configured to calculate the importance of each word in the target text according
to the word graph of the target text and the topic influence vector of the word, where the importance
indicates the degree of association between the word and the target text;
The selection unit 43 is configured to select, from the target text, words that meet a preset importance
as keywords of the target text.
Specifically, the acquiring unit 41 includes:
A computing module 411, configured to calculate, through the document topic generation model LDA, the
probability that each topic appears in the target text and the probability that each word appears in
each topic;
A dot product module 412, configured to perform a dot product calculation on the probability that each
topic appears in the target text and the probability that each word appears in each topic, to obtain
the topic influence vector of each word in the target text.
Further, the device also includes:
A construction unit 44, configured to construct the word graph of the target text, using the words in the
target text as the nodes of the word graph and the positional adjacency of words in the target text as
the connecting edges between the nodes.
The positional adjacency of words is determined by the order in which words appear in the target text,
and the connecting edges between nodes are undirected. For example, suppose the target text contains
words A, B, C, D and E, and these words appear in the order ABCDBEA. The word graph constructed from
this order is shown in Fig. 5: node B is adjacent in position to nodes A, C, D and E, so there are
connecting edges between node B and nodes A, C, D and E; node E is adjacent in position to node A, so
there is a connecting edge between node E and node A.
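The construction of the word graph for the sequence ABCDBEA can be sketched as follows. The sketch assumes that only immediately neighboring words count as adjacent and that a word repeated back to back does not create a self-loop; the function name build_word_graph is hypothetical.

```python
from typing import Dict, List, Set


def build_word_graph(tokens: List[str]) -> Dict[str, Set[str]]:
    """Build an undirected word graph: one node per word, one edge per adjacent pair."""
    graph: Dict[str, Set[str]] = {token: set() for token in tokens}
    for left, right in zip(tokens, tokens[1:]):
        if left != right:  # assumption: ignore self-loops from repeated words
            graph[left].add(right)
            graph[right].add(left)
    return graph


graph = build_word_graph(list("ABCDBEA"))
for node in sorted(graph):
    print(node, sorted(graph[node]))
# A ['B', 'E']
# B ['A', 'C', 'D', 'E']
# C ['B', 'D']
# D ['B', 'C']
# E ['A', 'B']
```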
Specifically, the computing unit 42 includes:
A first computing module 421, configured to calculate the similarity between words in the target text
from the topic influence vectors of the words;
A second computing module 422, configured to calculate the importance of each word in the target text
according to the word graph of the target text and the similarity between words.
The embodiment of the present invention may calculate the similarity between words using algorithms such
as Euclidean distance or cosine similarity; the embodiment does not specifically limit this.
Specifically, the similarity between words can be obtained by calculating the Euclidean distance or
cosine similarity between their topic influence vectors. For example, if the topic influence vector of
word A is a and that of word B is b, the similarity of word A and word B can be obtained by calculating
the cosine similarity of a and b, and this similarity is then used as the edge value of the edge
connecting node A and node B in the word graph of the target text.
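A minimal sketch of the cosine similarity between two topic influence vectors is given below. The vectors a and b are made-up values, and returning 0.0 for an all-zero vector is a choice made for this sketch rather than something the text specifies.

```python
import math
from typing import Sequence


def cosine_similarity(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine of the angle between two vectors of equal length."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same number of topics")
    dot = sum(x * y for x, y in zip(u, v))
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(y * y for y in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # assumption: a word with no topic influence is similar to nothing
    return dot / (norm_u * norm_v)


# Hypothetical topic influence vectors of word A and word B.
a = [0.375, 0.0625]
b = [0.1875, 0.125]
edge_value_ab = cosine_similarity(a, b)
print(round(edge_value_ab, 4))
```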
Specifically, the first computing module 421 includes:
An acquisition submodule 4211, configured to obtain the words corresponding to two nodes that share a
connecting edge in the word graph of the target text;
A determination submodule 4212, configured to determine the similarity between words by calculating the
cosine similarity of the topic influence vectors of the words corresponding to two nodes that share a
connecting edge.
Specifically, the second computing module 422 includes:
A configuration submodule 4221, configured to use the similarity between two words as the edge value of
the edge connecting the corresponding nodes in the word graph of the target text;
An accumulation submodule 4222, configured to accumulate the edge values of the edges connected to a node
in the word graph of the target text to obtain the importance of the word.
For example, in the word graph of the target text in Fig. 5, nodes A, B, C, D and E correspond to words
A, B, C, D and E respectively, and there are connecting edges between node B and nodes A, C, D and E.
The similarity of word B and word A is then used as the edge value ba of the edge connecting nodes B
and A, the similarity of word B and word C as the edge value bc of the edge connecting nodes B and C,
the similarity of word B and word D as the edge value bd of the edge connecting nodes B and D, and the
similarity of word B and word E as the edge value be of the edge connecting nodes B and E. The
importance of word B in the target text is obtained by accumulating the edge values of the edges
connected to node B, that is, as the sum ba+bc+bd+be.
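The accumulation in this example can be sketched as follows. The edges are those of the ABCDBEA word graph; the edge values are made-up similarities chosen only for illustration, and the function name importance_from_edges is hypothetical.

```python
from collections import defaultdict
from typing import Dict, Tuple


def importance_from_edges(edge_values: Dict[Tuple[str, str], float]) -> Dict[str, float]:
    """Sum, for every node, the values of the undirected edges connected to it."""
    importance: Dict[str, float] = defaultdict(float)
    for (u, v), value in edge_values.items():
        importance[u] += value
        importance[v] += value
    return dict(importance)


# Hypothetical similarities on the edges of the ABCDBEA word graph.
edge_values = {
    ("A", "B"): 0.5,    # ba
    ("B", "C"): 0.25,   # bc
    ("C", "D"): 0.75,
    ("B", "D"): 0.125,  # bd
    ("B", "E"): 0.5,    # be
    ("A", "E"): 0.25,
}
importance = importance_from_edges(edge_values)
print(importance["B"])  # ba + bc + bd + be = 1.375
```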
Specifically, the computing unit 42 further includes:
A setup module 423, configured to set the topic influence vector of the word as the weight value of the
corresponding node in the word graph of the target text;
A third computing module 424, configured to calculate the importance of each word in the target text
according to TextRank, a keyword extraction algorithm based on a word graph model, and the weight
values of the nodes.
In the embodiment of the present invention, the topic influence vector of a word is set as the weight
value of the corresponding node in the word graph of the target text, i.e. the importance of the word
in the target text is measured by its topic influence vector. The step of assigning values to words in
the target text by manual experience is thereby omitted, and the iterative word importance formula of
the TextRank algorithm is improved accordingly. Calculating the importance of each word in the target
text from the TextRank algorithm and the node weight values can therefore improve the efficiency and
intelligence of keyword extraction.
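The improved iteration formula itself is not reproduced in this section, so the following is only a sketch of one way node weights could enter a TextRank-style iteration, not the formula of the embodiment. It rests on several assumptions: each topic influence vector is reduced to a scalar by summing its components, the normalized scalars replace the uniform term of the standard iteration (as in personalized PageRank), edges are unweighted and undirected, every node has at least one neighbor, and the damping factor is 0.85. All names and numbers are hypothetical.

```python
from typing import Dict, List, Set


def weighted_textrank(
    graph: Dict[str, Set[str]],
    node_vectors: Dict[str, List[float]],
    damping: float = 0.85,
    iterations: int = 50,
) -> Dict[str, float]:
    """TextRank-style scores biased by node weights (illustrative stand-in)."""
    # Assumption: the scalar node weight is the sum of the vector's components.
    weight = {node: sum(node_vectors[node]) for node in graph}
    total = sum(weight.values())
    if total <= 0.0:
        raise ValueError("node weights must have a positive sum")
    prior = {node: weight[node] / total for node in graph}
    score = dict(prior)
    for _ in range(iterations):
        # Each node keeps a share of its prior and receives an equal share
        # of every neighbor's current score.
        score = {
            node: (1.0 - damping) * prior[node]
            + damping * sum(score[nb] / len(graph[nb]) for nb in graph[node])
            for node in graph
        }
    return score


# Word graph of the sequence ABCDBEA and made-up topic influence vectors.
graph = {
    "A": {"B", "E"},
    "B": {"A", "C", "D", "E"},
    "C": {"B", "D"},
    "D": {"B", "C"},
    "E": {"A", "B"},
}
node_vectors = {
    "A": [0.2, 0.1],
    "B": [0.3, 0.2],
    "C": [0.1, 0.1],
    "D": [0.1, 0.1],
    "E": [0.2, 0.1],
}
scores = weighted_textrank(graph, node_vectors)
print(max(scores, key=scores.get))  # B
```

With these inputs, word B has both the largest node weight and the most neighbors, so it receives the highest score.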
The selection unit 43 is specifically configured to select the word with the highest importance from the
target text as the keyword of the target text.
It should be noted that, for other descriptions of the functional units of the keyword extraction device
provided by the embodiment of the present invention, reference may be made to the corresponding
description of the method shown in Fig. 2, which is not repeated here. It should be understood that the
device in this embodiment can implement all of the content of the foregoing method embodiment.
In this other keyword extraction device provided by the embodiment of the present invention, the
structural composition of the target text and the topic information it contains are important bases
for keyword extraction. The embodiment therefore obtains the topic influence vector of each word in the
target text based on the LDA topic model, then calculates the importance of each word according to the
word graph of the target text and the similarity between words, and finally takes the word with the
highest importance as the keyword of the target text. That is, the embodiment extracts keywords by
combining the LDA topic model with the TextRank algorithm. Because the topic influence vector of a word
can be used to measure the importance of the word to the topics in the target text, and the topic
influence vector obtained from the document topic generation model can accurately represent the
influence of the word on those topics, the embodiment can improve the efficiency and intelligence of
keyword extraction.
In the above embodiments, the description of each embodiment has its own emphasis. For parts not
described in detail in one embodiment, reference may be made to the related descriptions of other
embodiments.
It is understood that the related features of the above method and device may refer to each other. In
addition, "first", "second" and the like in the above embodiments are used to distinguish the
embodiments and do not represent the relative merits of the embodiments.
It is apparent to those skilled in the art that, for convenience and brevity of description, the specific
working processes of the system, device and units described above may refer to the corresponding
processes in the foregoing method embodiment, and are not repeated here.
The algorithms and displays provided herein are not inherently related to any particular computer,
virtual system or other device. Various general-purpose systems may also be used with the teachings
herein. From the description above, the structure required to construct such a system is apparent. In
addition, the present invention is not directed to any particular programming language. It should be
understood that the content of the invention described herein may be implemented in various programming
languages, and that the above description of a specific language is given to disclose the best mode of
the invention.
In the specification provided here, numerous specific details are set forth. It should be understood,
however, that embodiments of the invention can be practiced without these specific details. In some
instances, well-known methods, structures and techniques are not shown in detail so as not to obscure
the understanding of this specification.
Similarly, it should be understood that, in order to streamline the disclosure and aid understanding of
one or more of the various inventive aspects, features of the invention are sometimes grouped together
in a single embodiment, figure, or description thereof in the above description of exemplary
embodiments. This method of disclosure, however, should not be interpreted as reflecting an intention
that the claimed invention requires more features than are expressly recited in each claim. Rather, as
the following claims reflect, inventive aspects lie in less than all features of a single embodiment
disclosed above. The claims following the detailed description are therefore expressly incorporated
into the detailed description, with each claim standing on its own as a separate embodiment of the
invention.
Those skilled in the art will understand that the modules in the device of an embodiment can be
adaptively changed and arranged in one or more devices different from that embodiment. The modules or
units or components in an embodiment may be combined into one module or unit or component, and may
furthermore be divided into multiple submodules or subunits or subcomponents. Except where at least
some of such features and/or processes or units are mutually exclusive, all features disclosed in this
specification (including the accompanying claims, abstract and drawings) and all processes or units of
any method or device so disclosed may be combined in any combination. Unless expressly stated
otherwise, each feature disclosed in this specification (including the accompanying claims, abstract
and drawings) may be replaced by an alternative feature serving the same, equivalent or similar
purpose.
In addition, those skilled in the art will appreciate that, although some embodiments described herein
include certain features that are included in other embodiments and not others, combinations of
features of different embodiments are meant to be within the scope of the invention and form different
embodiments. For example, in the following claims, any of the claimed embodiments can be used in any
combination.
The various component embodiments of the invention may be implemented in hardware, in software modules
running on one or more processors, or in a combination thereof. Those skilled in the art will
understand that a microprocessor or digital signal processor (DSP) may be used in practice to implement
some or all of the functions of some or all of the components of the keyword extraction method and
device according to the embodiments of the invention. The invention may also be implemented as a device
or device program (for example, a computer program and a computer program product) for performing part
or all of the method described herein. Such a program implementing the invention may be stored on a
computer-readable medium, or may take the form of one or more signals. Such signals may be downloaded
from an Internet website, provided on a carrier signal, or provided in any other form.
It should be noted that the above embodiments illustrate rather than limit the invention, and that those
skilled in the art can design alternative embodiments without departing from the scope of the appended
claims. In the claims, any reference sign placed between parentheses shall not be construed as limiting
the claim. The word "comprising" does not exclude the presence of elements or steps not listed in a
claim. The word "a" or "an" preceding an element does not exclude the presence of a plurality of such
elements. The invention can be implemented by means of hardware comprising several distinct elements
and by means of a suitably programmed computer. In a unit claim enumerating several devices, several of
these devices may be embodied by one and the same item of hardware. The use of the words first, second
and third does not indicate any order; these words may be interpreted as names.
Claims (10)
1. A keyword extraction method, characterized by comprising:
Obtaining the topic influence vector of each word in a target text, the topic influence vector of the
word being used to represent the influence of the word on the topics in the target text;
Wherein obtaining the topic influence vector of each word in the target text comprises:
Calculating, through the document topic generation model LDA, the probability that each topic appears in
the target text and the probability that each word appears in each topic;
Performing a dot product calculation on the probability that each topic appears in the target text and
the probability that each word appears in each topic, to obtain the topic influence vector of each word
in the target text;
Calculating the importance of each word in the target text according to the word graph of the target
text and the topic influence vector of the word, the importance being used to indicate the degree of
association between the word and the target text;
Wherein calculating the importance of each word in the target text according to the word graph of the
target text and the topic influence vector of the word comprises:
Calculating the similarity between words in the target text from the topic influence vectors of the
words;
Calculating the importance of each word in the target text according to the word graph of the target
text and the similarity between words;
Wherein calculating the importance of each word in the target text according to the word graph of the
target text and the similarity between words comprises:
Using the similarity between two words as the edge value of the edge connecting the corresponding nodes
in the word graph of the target text;
Accumulating the edge values of the edges connected to a node in the word graph of the target text to
obtain the importance of the word;
Selecting, from the target text, words that meet a preset importance as keywords of the target text.
2. The method according to claim 1, characterized in that, before calculating the importance of each word
in the target text according to the word graph of the target text and the topic influence vector of
the word, the method further comprises:
Constructing the word graph of the target text, using the words in the target text as the nodes of the
word graph and the positional adjacency of words in the target text as the connecting edges between the
nodes.
3. The method according to claim 1, characterized in that calculating the similarity between words in
the target text from the topic influence vectors of the words comprises:
Obtaining the words corresponding to two nodes that share a connecting edge in the word graph of the
target text;
Determining the similarity between words by calculating the cosine similarity of the topic influence
vectors of the words corresponding to two nodes that share a connecting edge.
4. The method according to claim 2, characterized in that calculating the importance of each word in the
target text according to the word graph of the target text and the topic influence vector of the word
comprises:
Setting the topic influence vector of the word as the weight value of the corresponding node in the word
graph of the target text;
Calculating the importance of each word in the target text according to TextRank, a keyword extraction
algorithm based on a word graph model, and the weight values of the nodes.
5. The method according to any one of claims 1 to 4, characterized in that selecting, from the target
text, words that meet a preset importance as keywords of the target text comprises:
Selecting the word with the highest importance from the target text as the keyword of the target text.
6. A keyword extraction device, characterized by comprising:
An acquiring unit, configured to obtain the topic influence vector of each word in a target text, the
topic influence vector of the word being used to represent the influence of the word on the topics in
the target text;
The acquiring unit comprising:
A computing module, configured to calculate, through the document topic generation model LDA, the
probability that each topic appears in the target text and the probability that each word appears in
each topic;
A dot product module, configured to perform a dot product calculation on the probability that each topic
appears in the target text and the probability that each word appears in each topic, to obtain the
topic influence vector of each word in the target text;
A computing unit, configured to calculate the importance of each word in the target text according to
the word graph of the target text and the topic influence vector of the word, the importance being used
to indicate the degree of association between the word and the target text;
The computing unit comprising:
A first computing module, configured to calculate the similarity between words in the target text from
the topic influence vectors of the words;
A second computing module, configured to calculate the importance of each word in the target text
according to the word graph of the target text and the similarity between words;
The second computing module comprising:
A configuration submodule, configured to use the similarity between two words as the edge value of the
edge connecting the corresponding nodes in the word graph of the target text;
An accumulation submodule, configured to accumulate the edge values of the edges connected to a node in
the word graph of the target text to obtain the importance of the word;
A selection unit, configured to select, from the target text, words that meet a preset importance as
keywords of the target text.
7. The device according to claim 6, characterized in that the device further comprises:
A construction unit, configured to construct the word graph of the target text, using the words in the
target text as the nodes of the word graph and the positional adjacency of words in the target text as
the connecting edges between the nodes.
8. The device according to claim 6, characterized in that the first computing module comprises:
An acquisition submodule, configured to obtain the words corresponding to two nodes that share a
connecting edge in the word graph of the target text;
A determination submodule, configured to determine the similarity between words by calculating the
cosine similarity of the topic influence vectors of the words corresponding to two nodes that share a
connecting edge.
9. The device according to claim 7, characterized in that the computing unit further comprises:
A setup module, configured to set the topic influence vector of the word as the weight value of the
corresponding node in the word graph of the target text;
A third computing module, configured to calculate the importance of each word in the target text
according to TextRank, a keyword extraction algorithm based on a word graph model, and the weight
values of the nodes.
10. The device according to any one of claims 6 to 9, characterized in that the selection unit is
specifically configured to select the word with the highest importance from the target text as the
keyword of the target text.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201610751325.2A CN106469187B (en) | 2016-08-29 | 2016-08-29 | The extracting method and device of keyword |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201610751325.2A CN106469187B (en) | 2016-08-29 | 2016-08-29 | The extracting method and device of keyword |
Publications (2)
Publication Number | Publication Date |
---|---|
CN106469187A CN106469187A (en) | 2017-03-01 |
CN106469187B true CN106469187B (en) | 2019-12-03 |
Family
ID=58229950
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201610751325.2A Active CN106469187B (en) | 2016-08-29 | 2016-08-29 | The extracting method and device of keyword |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN106469187B (en) |
Families Citing this family (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN106997382B (en) * | 2017-03-22 | 2020-12-01 | 山东大学 | Innovative creative tag automatic labeling method and system based on big data |
CN107220232B (en) * | 2017-04-06 | 2021-06-11 | 北京百度网讯科技有限公司 | Keyword extraction method and device based on artificial intelligence, equipment and readable medium |
CN107193973B (en) * | 2017-05-25 | 2021-07-20 | 百度在线网络技术(北京)有限公司 | Method, device and equipment for identifying field of semantic analysis information and readable medium |
CN107193803B (en) * | 2017-05-26 | 2020-07-10 | 北京东方科诺科技发展有限公司 | Semantic-based specific task text keyword extraction method |
CN108304377B (en) * | 2017-12-28 | 2021-08-06 | 东软集团股份有限公司 | Extraction method of long-tail words and related device |
CN108846023A (en) * | 2018-05-24 | 2018-11-20 | 普强信息技术(北京)有限公司 | The unconventional characteristic method for digging and device of text |
CN110705282A (en) * | 2019-09-04 | 2020-01-17 | 东软集团股份有限公司 | Keyword extraction method and device, storage medium and electronic equipment |
Citations (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN103092950A (en) * | 2013-01-15 | 2013-05-08 | 重庆邮电大学 | Online public opinion geographical location real time monitoring system and method |
CN105488023A (en) * | 2015-03-20 | 2016-04-13 | 广州爱九游信息技术有限公司 | Text similarity assessment method and device |
2016-08-29: CN201610751325.2A, patent CN106469187B, status Active
Non-Patent Citations (3)
Title |
---|
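Topic and keyword re-ranking for LDA-based topic modeling; Yangqiu Song et al.; Proceedings of the 18th ACM Conference on Information and Knowledge Management; 2009-11-30; pp. 1757-1760 * |
Keyword extraction algorithm based on graph and LDA topic model (基于图和LDA主题模型的关键词抽取算法); Liu Xiaojian (刘啸剑) et al.; 《情报学报》; 2016-06-30; Vol. 35, No. 6; pp. 664-672, main text sections 3.2, 3.4 and 5, Fig. 4 * |
Research on keyword extraction combining LDA and TextRank (融合LDA与TextRank的关键词抽取研究); Gu Yijun (顾益军) et al.; 《现代图书情报技术》; 2014-12-31; No. 248/249; pp. 41-47, main text sections 3 and 4, Fig. 1 * |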
Also Published As
Publication number | Publication date |
---|---|
CN106469187A (en) | 2017-03-01 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||