CN110019801A - Method and apparatus for determining text relevance
- Publication number
- CN110019801A (application CN201711252358.3A)
- Authority
- CN
- China
- Prior art keywords
- keyword
- text
- vector
- dimension
- under
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/30—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F16/35—Clustering; Classification
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/20—Natural language analysis
- G06F40/279—Recognition of textual entities
- G06F40/289—Phrasal analysis, e.g. finite state techniques or chunking
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/90—Details of database functions independent of the retrieved data types
- G06F16/95—Retrieval from the web
- G06F16/951—Indexing; Web crawling techniques
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Databases & Information Systems (AREA)
- Data Mining & Analysis (AREA)
- Health & Medical Sciences (AREA)
- Artificial Intelligence (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Computational Linguistics (AREA)
- General Health & Medical Sciences (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
- Management, Administration, Business Operations System, And Electronic Commerce (AREA)
Abstract
An embodiment of the present application discloses a method for determining text relevance. When the relevance between a keyword and a text in a bipartite graph is analyzed, features of the keyword under different dimensions are extracted and the feature vector corresponding to the feature under each dimension is calculated, which yields data about the keyword in different dimensions. Propagating the feature vectors yields a propagation vector under each dimension, from which a relevance feature under that dimension is calculated. Integrating the relevance features under the different dimensions gives the relevance result between the keyword and the texts opened via that keyword. Because data about the keyword is obtained in several dimensions, the relevance between the keyword and the text is reflected from different dimensions, which makes the calculated relevance more credible. Texts displayed according to this relevance more readily satisfy the user's search needs and improve the user's search experience.
Description
Technical field
This application relates to the field of data processing, and in particular to a method and apparatus for determining text relevance.
Background art
With the spread of the Internet, users can search for the information they need by entering keywords into a search engine. A keyword search returns texts related to the keyword, and the user can select the desired text from these and open it for browsing. From this search behavior, a bipartite graph of keywords and the corresponding opened texts can be established. For example, as shown in Fig. 1, a node q on the left side of the bipartite graph may be a keyword, a node d on the right side may be an opened text, and the number on the line between a q and a d may represent the number of times that d was searched for and opened via that q. In the example of Fig. 1, users opened d1 three times by searching the keyword q1.
The relevance between keywords and texts can be determined from the bipartite graph, so that when a user again searches for a keyword in the bipartite graph, the search engine can determine the display order of the texts in the search results according to their relevance.
In the conventional approach, to analyze the relevance between a keyword and a text, the keyword is usually segmented into words, and the relevance is calculated from the word vectors of the segments. However, the word vectors of the segments mainly reflect the meaning of the segments or of the keyword, which may differ from the user's actual purpose in searching with that keyword. Relevance calculated from the word-meaning dimension alone therefore yields displayed texts that hardly meet the user's search needs, which degrades the user's search experience.
Summary of the invention
To solve the above technical problem, the present application provides a method and apparatus for determining text relevance that calculate the relevance between a keyword and a text according to multiple dimensions of the keyword. The calculated relevance is thus more accurate, so that texts displayed according to it more readily satisfy the user's search needs and improve the user's search experience.
The embodiments of the present application disclose the following technical solutions:
In a first aspect, an embodiment of the present application provides a method for determining text relevance, applied to a bipartite graph, where the bipartite graph includes the correspondence between keywords and the texts opened via those keywords. The method comprises:
Extracting a first feature of a first keyword under a first dimension and a second feature of the first keyword under a second dimension, where the first keyword is a keyword in the bipartite graph;
Calculating a first feature vector of the first feature and a second feature vector of the second feature;
Propagating the first feature vector in the bipartite graph, according to the correspondence between the first keyword and the texts opened via the first keyword, to obtain a first propagation vector of the first keyword under the first dimension; and propagating the second feature vector in the bipartite graph to obtain a second propagation vector of the first keyword under the second dimension;
Calculating, according to the first propagation vector, a first relevance feature between the first keyword and the texts opened via the first keyword under the first dimension; and calculating, according to the second propagation vector, a second relevance feature between the first keyword and the texts opened via the first keyword under the second dimension;
Obtaining, according to the first relevance feature and the second relevance feature, a relevance result between the first keyword and the texts opened via the first keyword.
Optionally, propagating the first feature vector in the bipartite graph, according to the correspondence between the first keyword and the texts opened via the first keyword, to obtain the first propagation vector of the first keyword under the first dimension comprises:
Determining, according to the number of opens, the texts to propagate to from the correspondence between the first keyword and the texts opened via the first keyword;
Propagating the first feature vector in the bipartite graph, according to the correspondence between the first keyword and the determined texts, to obtain the first propagation vector of the first keyword under the first dimension.
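The selection step can be sketched as a simple filter on open counts. This is a minimal illustration only: the source says texts are chosen "according to the number of opens" but does not state the rule, so the threshold `min_opens` and the function name are hypothetical.

```python
def select_texts_for_propagation(opened, min_opens=2):
    """Keep only the texts whose open count reaches a threshold.

    opened: {text_id: open_count} for one keyword.
    min_opens: hypothetical cutoff; the source does not specify the rule.
    """
    return {d: n for d, n in opened.items() if n >= min_opens}


# q1 opened d1 three times and d2 once (the d2 count is illustrative).
selected = select_texts_for_propagation({"d1": 3, "d2": 1})
```

With these inputs only d1 remains a propagation target, so rarely opened texts do not dilute the keyword's propagation vector.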
Optionally, obtaining, according to the first relevance feature and the second relevance feature, the relevance result between the first keyword and the texts opened via the first keyword comprises:
Obtaining, by means of a logistic regression model and according to the first relevance feature and the second relevance feature, the relevance result between the first keyword and the texts opened via the first keyword.
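As a rough illustration of combining two relevance features with a logistic regression model, the sketch below applies a sigmoid to a weighted sum. The weights, bias, and feature values are hypothetical; in practice they would be learned from labeled data, which the source does not describe.

```python
import math


def relevance_result(features, weights, bias):
    """Logistic regression scoring: sigmoid of a weighted sum of features.

    features: relevance features, one per dimension (e.g. word meaning, attribute).
    weights, bias: hypothetical model parameters, normally learned from data.
    """
    z = bias + sum(w * f for w, f in zip(weights, features))
    return 1.0 / (1.0 + math.exp(-z))


# First and second relevance features of one keyword-text pair (made-up values).
score = relevance_result([0.9, 0.4], weights=[2.0, 1.0], bias=-1.0)
```

The output lies strictly between 0 and 1 and grows with either feature when the corresponding weight is positive, so it can be used directly to rank texts.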
Optionally, the method also includes:
Extracting a third feature of a second keyword under the first dimension and a fourth feature of the second keyword under the second dimension, where some or all of the texts opened via the second keyword are the same as the texts opened via the first keyword;
Calculating a third feature vector of the third feature and a fourth feature vector of the fourth feature;
Propagating the third feature vector in the bipartite graph, according to the correspondence between the second keyword and the texts opened via the second keyword, to obtain a third propagation vector of the second keyword under the first dimension; and propagating the fourth feature vector in the bipartite graph to obtain a fourth propagation vector of the second keyword under the second dimension;
Calculating, according to the third propagation vector, a third relevance feature between the second keyword and the first keyword under the first dimension; and calculating, according to the fourth propagation vector, a fourth relevance feature between the second keyword and the first keyword under the second dimension;
Obtaining, according to the third relevance feature and the fourth relevance feature, a relevance result between the second keyword and the first keyword.
Optionally, the method also includes:
Obtain text to be analyzed;
Determining, according to the bipartite graph, the keywords that have a relevance result with the text to be analyzed;
Taking the keywords whose relevance results meet a preset condition as the keywords corresponding to the text to be analyzed.
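The keyword selection for a text to be analyzed can be sketched as follows. The source leaves the "preset condition" open; a minimum-score threshold is assumed here purely for illustration, and the function name and scores are hypothetical.

```python
def keywords_for_text(relevance_by_keyword, threshold=0.5):
    """Return the keywords whose relevance result meets a preset condition.

    relevance_by_keyword: {keyword: relevance_result} for one text.
    threshold: assumed form of the preset condition (not given in the source).
    Keywords are returned in descending order of relevance.
    """
    hits = [(q, s) for q, s in relevance_by_keyword.items() if s >= threshold]
    hits.sort(key=lambda pair: pair[1], reverse=True)
    return [q for q, _ in hits]


result = keywords_for_text({"q1": 0.9, "q2": 0.3, "q3": 0.6})
```

A top-k cutoff would be an equally plausible reading of the preset condition.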
In a second aspect, an embodiment of the present application provides an apparatus for determining text relevance, applied to a bipartite graph, where the bipartite graph includes the correspondence between keywords and the texts opened via those keywords. The apparatus comprises a first extraction unit, a first calculation unit, a first propagation unit, a second calculation unit, and a first acquisition unit:
The first extraction unit is configured to extract a first feature of a first keyword under a first dimension and a second feature of the first keyword under a second dimension, where the first keyword is a keyword in the bipartite graph;
The first calculation unit is configured to calculate a first feature vector of the first feature and a second feature vector of the second feature;
The first propagation unit is configured to propagate the first feature vector in the bipartite graph, according to the correspondence between the first keyword and the texts opened via the first keyword, to obtain a first propagation vector of the first keyword under the first dimension; and to propagate the second feature vector in the bipartite graph to obtain a second propagation vector of the first keyword under the second dimension;
The second calculation unit is configured to calculate, according to the first propagation vector, a first relevance feature between the first keyword and the texts opened via the first keyword under the first dimension; and to calculate, according to the second propagation vector, a second relevance feature between the first keyword and the texts opened via the first keyword under the second dimension;
The first acquisition unit is configured to obtain, according to the first relevance feature and the second relevance feature, a relevance result between the first keyword and the texts opened via the first keyword.
Optionally, the first propagation unit comprises a determining subunit and a first propagation subunit:
The determining subunit is configured to determine, according to the number of opens, the texts to propagate to from the correspondence between the first keyword and the texts opened via the first keyword;
The first propagation subunit is configured to propagate the first feature vector in the bipartite graph, according to the correspondence between the first keyword and the determined texts, to obtain the first propagation vector of the first keyword under the first dimension.
Optionally, the first acquisition unit comprises a first acquisition subunit:
The first acquisition subunit is configured to obtain, by means of a logistic regression model and according to the first relevance feature and the second relevance feature, the relevance result between the first keyword and the texts opened via the first keyword.
Optionally, the apparatus further comprises a second extraction unit, a third calculation unit, a second propagation unit, a fourth calculation unit, and a second acquisition unit:
The second extraction unit is configured to extract a third feature of a second keyword under the first dimension and a fourth feature of the second keyword under the second dimension, where some or all of the texts opened via the second keyword are the same as the texts opened via the first keyword;
The third calculation unit is configured to calculate a third feature vector of the third feature and a fourth feature vector of the fourth feature;
The second propagation unit is configured to propagate the third feature vector in the bipartite graph, according to the correspondence between the second keyword and the texts opened via the second keyword, to obtain a third propagation vector of the second keyword under the first dimension; and to propagate the fourth feature vector in the bipartite graph to obtain a fourth propagation vector of the second keyword under the second dimension;
The fourth calculation unit is configured to calculate, according to the third propagation vector, a third relevance feature between the second keyword and the first keyword under the first dimension; and to calculate, according to the fourth propagation vector, a fourth relevance feature between the second keyword and the first keyword under the second dimension;
The second acquisition unit is configured to obtain, according to the third relevance feature and the fourth relevance feature, a relevance result between the second keyword and the first keyword.
Optionally, the apparatus further comprises a third acquisition unit, a determination unit, and a fourth acquisition unit:
The third acquisition unit is configured to obtain a text to be analyzed;
The determination unit is configured to determine, according to the bipartite graph, the keywords that have a relevance result with the text to be analyzed;
The fourth acquisition unit is configured to take the keywords whose relevance results meet a preset condition as the keywords corresponding to the text to be analyzed.
In a third aspect, an embodiment of the present application provides a processing device for determining text relevance, comprising a memory and one or more programs, where the one or more programs are stored in the memory and are configured to be executed by one or more processors, and the one or more programs include instructions for performing the following operations:
Extracting a first feature of a first keyword under a first dimension and a second feature of the first keyword under a second dimension, where the first keyword is a keyword in the bipartite graph;
Calculating a first feature vector of the first feature and a second feature vector of the second feature;
Propagating the first feature vector in the bipartite graph, according to the correspondence between the first keyword and the texts opened via the first keyword, to obtain a first propagation vector of the first keyword under the first dimension; and propagating the second feature vector in the bipartite graph to obtain a second propagation vector of the first keyword under the second dimension;
Calculating, according to the first propagation vector, a first relevance feature between the first keyword and the texts opened via the first keyword under the first dimension; and calculating, according to the second propagation vector, a second relevance feature between the first keyword and the texts opened via the first keyword under the second dimension;
Obtaining, according to the first relevance feature and the second relevance feature, a relevance result between the first keyword and the texts opened via the first keyword.
In a fourth aspect, an embodiment of the present application provides a machine-readable medium storing instructions that, when executed by one or more processors, cause a device to perform the method for determining text relevance described in one or more implementations of the first aspect.
As can be seen from the above technical solutions, when the relevance between a keyword and a text in the bipartite graph is analyzed, the relevance can be calculated according to multiple dimensions of the keyword. Specifically, the features of the first keyword under different dimensions are extracted, and the feature vector corresponding to the feature under each dimension is calculated. The features under different dimensions reflect information relating the keyword to texts from different dimensions, so that data about the keyword in different dimensions is available for calculating the relevance between the first keyword and the texts opened via the first keyword. Propagating the feature vectors yields a propagation vector under each dimension, from which the relevance feature under that dimension is calculated. One relevance feature reflects the degree of relevance between the keyword and a text under one dimension, and another relevance feature reflects that degree under another dimension. Integrating the relevance features obtained under the different dimensions gives the relevance result between the first keyword and the texts opened via the first keyword. Compared with the conventional approach, which calculates the relevance between a keyword and a text from a single dimension such as word meaning, the embodiments of the present application obtain data about the keyword in different dimensions and thus provide more information for the calculation. The relevance between the keyword and the text is reflected from different dimensions, so the calculated relevance is more credible, and texts displayed according to it more readily satisfy the user's search needs and improve the user's search experience.
Brief description of the drawings
To describe the technical solutions in the embodiments of the present application or in the prior art more clearly, the drawings needed for describing the embodiments or the prior art are briefly introduced below. The drawings described below are merely some embodiments of the present application, and a person of ordinary skill in the art can derive other drawings from them without creative effort.
Fig. 1 is an example of a bipartite graph provided by an embodiment of the present application;
Fig. 2 is a flowchart of a method for determining text relevance provided by an embodiment of the present application;
Fig. 3 is a flowchart of a method for calculating the relevance between keywords provided by an embodiment of the present application;
Fig. 4 shows a method for displaying texts according to keywords provided by an embodiment of the present application;
Fig. 5 is a structural block diagram of an apparatus for determining text relevance provided by an embodiment of the present application;
Fig. 6 is a block diagram of a device for determining text relevance provided by an embodiment of the present application;
Fig. 7 is a block diagram of a server for determining text relevance provided by an embodiment of the present application.
Detailed description of the embodiments
The embodiments of the present application are described below with reference to the accompanying drawings.
The inventors found through research that conventional methods for determining the relevance between a keyword and a text typically just segment the keyword and calculate the relevance from the single dimension of the keyword's word meaning. Relevance calculated this way may fail to reflect the user's actual search purpose, so its accuracy is insufficient. Texts displayed according to such relevance hardly meet the user's search needs, which degrades the user's search experience.
For example, a user enters the keyword "Xiaomi mobile phone" hoping to find information about the Xiaomi mobile phone brand. To calculate the relevance between "Xiaomi mobile phone" and texts, and thereby decide which texts to display for it, the conventional approach segments the keyword, for example into "Xiaomi" and "mobile phone". From the segmentation result it concludes that the keyword mainly means a mobile phone, that is, that the user wants to search for mobile phones. As a result, Xiaomi mobile phones themselves are shown to the user rather than information about the Xiaomi mobile phone brand, which degrades the user's search experience.
To address this problem, the embodiments of the present application provide a solution that calculates the relevance between a keyword and a text according to multiple dimensions of the keyword. Suppose the dimensions include a first dimension and a second dimension, and the keywords include a first keyword. First, a first feature of the first keyword under the first dimension and a second feature under the second dimension are extracted, and a first feature vector of the first feature and a second feature vector of the second feature are calculated. Then, according to the correspondence in the bipartite graph between the first keyword and the texts opened via the first keyword, the first feature vector is propagated in the bipartite graph to obtain a first propagation vector of the first keyword under the first dimension, and the second feature vector is propagated in the bipartite graph to obtain a second propagation vector of the first keyword under the second dimension. Next, a first relevance feature between the first keyword and the texts opened via the first keyword under the first dimension is calculated from the first propagation vector, and a second relevance feature under the second dimension is calculated from the second propagation vector. Finally, the relevance result between the first keyword and the texts opened via the first keyword is obtained from the first relevance feature and the second relevance feature. Compared with the conventional approach, which calculates the relevance between a keyword and a text from a single dimension such as word meaning, the embodiments of the present application obtain data about the keyword in different dimensions and thus provide more information for the calculation. The relevance is reflected from different dimensions, so the calculated relevance is more credible, and texts displayed according to it more readily satisfy the user's search needs and improve the user's search experience.
The bipartite graph mentioned in the embodiments of the present application is constructed from historical search data and may include the correspondence between the keywords that users query and the texts opened via those keywords. A text opened via a keyword is a text that a user obtained by searching with the keyword and then opened. The text may be a web page, an advertisement, and so on. As shown in Fig. 1, the bipartite graph may include nodes, edges, and numbers on the edges. A node may represent a keyword or a text opened via a keyword: a node q on the left side of the bipartite graph may be a keyword, and a node d on the right side may be an opened text. An edge is a line between a q and a d and represents the correspondence between a keyword and a text opened via it. An edge exists between a q and a d only if that d was opened by searching that q; for example, the edge between q1 and d1 indicates that text d1 was opened by searching q1. The number on an edge may represent the open count, that is, the number of times in the historical search data that a d was searched for and opened via a q. The open count can indicate the weight of the correspondence between a q and a text d opened via it among all the d that have a correspondence with that q. For example, in Fig. 1 users opened d1 three times by searching the keyword q1, so the edge between q1 and d1 is labeled 3, and the number 3 can indicate the weight of the correspondence between q1 and d1 among all the d that correspond to q1 (such as d1 and d2).
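The structure described above can be sketched by counting open events in a click log. This is an illustrative assumption rather than the patent's construction: the log format, the function names, and the choice to express a weight as a share of the keyword's total opens are all hypothetical, and the counts other than the 3 opens of d1 via q1 are made up.

```python
from collections import defaultdict


def build_bipartite_graph(click_log):
    """Count opens per (keyword, text) pair.

    click_log: iterable of (keyword, text) pairs, one pair per open event.
    Returns {keyword: {text: open_count}}.
    """
    counts = defaultdict(lambda: defaultdict(int))
    for keyword, text in click_log:
        counts[keyword][text] += 1
    return {q: dict(ds) for q, ds in counts.items()}


def edge_weights(graph, keyword):
    """Express each open count as a share of the keyword's total opens (assumed normalization)."""
    total = sum(graph[keyword].values())
    return {d: n / total for d, n in graph[keyword].items()}


# q1 opened d1 three times, as in Fig. 1; the remaining events are illustrative.
log = [("q1", "d1")] * 3 + [("q1", "d2"), ("q2", "d1")]
graph = build_bipartite_graph(log)
weights = edge_weights(graph, "q1")
```

Here the edge q1-d1 carries the count 3, and under the assumed normalization it accounts for three quarters of the opens made via q1.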
A keyword may be the content that a user enters into a search engine in order to find the information needed. A keyword may be a word, for example "mountain", "mobile phone", or "Xiaomi mobile phone"; a phrase, for example "how to tie a silk scarf" or "we are"; or a sentence, for example "Can I eat an apple before going to sleep?".
A dimension may refer to a perspective from which the relevance between a keyword and a text is analyzed; analyzing the relevance from several different perspectives can be regarded as involving several dimensions. A feature may represent the specific data or information that corresponds to the keyword under a given perspective. For example, for the keyword "Sogou Browser", the relevance between "Sogou Browser" and a text can be analyzed from two dimensions: the word meaning of the keyword and the attribute of the keyword. Specifically, the keyword can be segmented into "Sogou" and "browser", and "Sogou" and "browser" can then serve as the features of "Sogou Browser" under the word-meaning dimension. Since "Sogou Browser" belongs to the category of browsers in terms of attribute, "browser" can serve as the feature of "Sogou Browser" under the attribute dimension.
A feature vector may be a representation of a keyword's feature in vector form, so that a computer can perform related processing, such as recognition and calculation, on the feature according to the feature vector. For example, the feature of the keyword "Sogou Browser" under the attribute dimension is "browser". Stating this feature as text is inconvenient for subsequent calculation, so the feature vector corresponding to the feature "browser" can be calculated and used to represent the feature, which facilitates subsequent processing such as calculation.
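One simple way to turn textual features into vectors is a one-hot style encoding over a fixed vocabulary. The source does not say how feature vectors are computed, so this encoding, the vocabulary, and the function name are assumptions made only to illustrate the idea; a learned embedding would serve the same role.

```python
def encode_features(features, vocabulary):
    """Encode a set of textual features as a 0/1 vector over a fixed vocabulary.

    features: feature strings of one keyword under one dimension.
    vocabulary: hypothetical ordered list of all known feature strings.
    Features absent from the vocabulary are ignored.
    """
    index = {term: i for i, term in enumerate(vocabulary)}
    vector = [0.0] * len(vocabulary)
    for feature in features:
        if feature in index:
            vector[index[feature]] = 1.0
    return vector


vocab = ["Sogou", "browser", "mobile phone"]
meaning_vec = encode_features(["Sogou", "browser"], vocab)  # word-meaning dimension
attribute_vec = encode_features(["browser"], vocab)         # attribute dimension
```

The two vectors differ even though they describe the same keyword, which is what allows each dimension to contribute separately to the later relevance calculation.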
Propagation of a feature vector in the bipartite graph may mean that the feature vector is passed from one node to another along the line between them. When the feature vector reaches the other node, that node can process it to obtain a new feature vector for itself, and this new feature vector may be the propagation vector. Passing a feature vector from one node to another and generating a new feature vector can be regarded as one propagation step in the bipartite graph. The new feature vector can continue to propagate along the lines between nodes until a preset number of steps is reached or the feature vector of each node becomes stable.
Taking the bipartite graph shown in Fig. 1 as an example, the feature vector of q1 can propagate from q1 to node d1, and d1 can process the received feature vector to obtain a new feature vector as the propagation vector of d1. d1 can then propagate its new feature vector back to q1, and q1 can process the new feature vector received from d1 to obtain a new feature vector as the propagation vector of q1. After obtaining its new feature vector, d1 can also propagate it to q2, and q2 can process the new feature vector received from d1 to obtain a new feature vector as the propagation vector of q2. q2 can in turn propagate its new feature vector to d2, and d2 can process the new feature vector received from q2 to obtain a new feature vector as the propagation vector of d2. d2 can further propagate its new feature vector to q3, and q3 can process the new feature vector received from d2 to obtain a new feature vector as the propagation vector of q3. In this implementation, feature vectors may also propagate in other ways, provided that propagation follows the correspondence in the bipartite graph between keywords and the texts opened via them; these are not enumerated here.
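The back-and-forth propagation described above can be sketched as alternating weighted averages over the edges. The source does not specify how a node processes a received vector, so the update rule (an open-count-weighted mean of neighbor vectors), the stopping tolerance, and all names here are assumptions for illustration.

```python
def propagate(edges, q_vectors, rounds=10, tol=1e-6):
    """Alternate keyword -> text and text -> keyword propagation.

    edges: {(keyword, text): open_count}.
    q_vectors: {keyword: [float, ...]} initial feature vectors under one dimension.
    Stops after `rounds` iterations or when keyword vectors change by less than `tol`.
    The weighted-mean update is an assumed rule, not taken from the source.
    """
    dim = len(next(iter(q_vectors.values())))

    def weighted_mean(pairs):
        total = sum(w for w, _ in pairs)
        return [sum(w * v[i] for w, v in pairs) / total for i in range(dim)]

    q_vecs = {q: list(v) for q, v in q_vectors.items()}
    d_vecs = {}
    for _ in range(rounds):
        # Keyword -> text: each text averages the vectors of keywords that opened it.
        incoming = {}
        for (q, d), n in edges.items():
            incoming.setdefault(d, []).append((n, q_vecs[q]))
        d_vecs = {d: weighted_mean(pairs) for d, pairs in incoming.items()}
        # Text -> keyword: each keyword averages the vectors of the texts it opened.
        incoming = {}
        for (q, d), n in edges.items():
            incoming.setdefault(q, []).append((n, d_vecs[d]))
        new_q = {q: weighted_mean(pairs) for q, pairs in incoming.items()}
        delta = max(abs(a - b) for q in new_q for a, b in zip(new_q[q], q_vecs[q]))
        q_vecs = {**q_vecs, **new_q}
        if delta < tol:
            break
    return q_vecs, d_vecs


edges = {("q1", "d1"): 3, ("q2", "d1"): 1, ("q2", "d2"): 1}
start = {"q1": [1.0, 0.0], "q2": [0.0, 1.0]}
q_prop, d_prop = propagate(edges, start, rounds=1)
```

After one round, d1 leans toward q1's vector because q1 opened it three times, and q2 picks up part of q1's signal through the shared text d1, which is the effect the propagation is meant to capture.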
A relevance feature can be used to measure the degree of relevance between one node and another node in the bipartite graph. This may include the degree of relevance between two keywords in the bipartite graph, for example between q1 and q2, as well as the degree of relevance between a keyword and a text, for example between q1 and d1. The relevance feature can be expressed as a percentage; for example, a relevance feature may be 90%.
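A relevance feature between two nodes could, for instance, be computed as the cosine similarity of their propagation vectors. The source does not name the similarity measure, so cosine similarity is an assumption chosen for illustration, as are the example vectors.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two vectors; 0.0 if either vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


# Propagation vectors of a keyword and a text under one dimension (made-up values).
feature = cosine_similarity([0.75, 0.25], [0.0, 1.0])
```

For non-negative vectors the value falls between 0 and 1, so it can be read as a percentage in the way the text describes.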
A correlation result can be obtained by integrating the correlation features under different dimensions, and indicates the correlation between a keyword and a text in the bipartite graph.
The method for determining text correlation provided by the embodiments of the present application is described in detail below with reference to the accompanying drawings.
Referring to FIG. 2, FIG. 2 is a flowchart of a method for determining text correlation provided by an embodiment of the present application. The method can be applied to a bipartite graph that includes keywords and the correspondence between the keywords and the texts opened according to them. The method includes:
S201: Extract a first feature of a first keyword under a first dimension and a second feature of the first keyword under a second dimension.
After the bipartite graph is established from the keywords input by users and the correspondence between those keywords and the texts opened according to them, this embodiment can, for a keyword in the bipartite graph, determine the correlation result between the keyword and a text opened according to it on the basis of multiple dimensions of the keyword. Taking as an example a first keyword in the bipartite graph and multiple dimensions that include a first dimension and a second dimension, the first feature of the first keyword under the first dimension and the second feature of the first keyword under the second dimension can be extracted first.
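As an illustration only, the bipartite graph described above might be built from a search log as follows. The log format of (keyword, opened text) pairs, the function name build_bipartite_graph, and the nested-dictionary layout are assumptions made for this sketch and are not specified in the original.

```python
from collections import defaultdict

def build_bipartite_graph(click_log):
    """Return {keyword: {text: open_count}} from (keyword, opened_text) records.

    Each edge joins a keyword to a text opened according to it; the stored
    count is the number of times that text was opened for that keyword.
    """
    graph = defaultdict(lambda: defaultdict(int))
    for keyword, text in click_log:
        graph[keyword][text] += 1
    return {keyword: dict(texts) for keyword, texts in graph.items()}

# Hypothetical log with the edges of FIG. 1: q1-d1, q2-d1, q2-d2, q3-d2.
click_log = [("q1", "d1"), ("q2", "d1"), ("q2", "d1"), ("q2", "d2"), ("q3", "d2")]
graph = build_bipartite_graph(click_log)
```

Keeping the open count on each edge makes it available later, when texts are selected for propagation.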
It should be noted that the features under different dimensions, such as the first feature and the second feature, can be obtained using a classifier. The dimensions may include word sense, attribute, field, and so on, and the features may include the specific data corresponding to dimensions such as word segmentation, attribute, and field.
For example, suppose the first keyword is "Sogou browser", the first dimension is word sense, and the second dimension is attribute. The classifier segments "Sogou browser" into "Sogou" and "browser"; the word senses are "Sogou" and "browser" themselves, and the attribute of "Sogou browser" is browser. The first feature extracted in this embodiment can then be "Sogou" and "browser", and the second feature can be browser.
S202: Calculate a first feature vector of the first feature and a second feature vector of the second feature.
Because the first feature and the second feature subsequently need to be propagated in the bipartite graph and subjected to data processing operations such as calculation, they can be represented as vectors to facilitate those operations. That is, the first feature is converted into the first feature vector by calculation, and the second feature is converted into the second feature vector.
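As an illustration only, steps S201 and S202 might be sketched as follows for the "Sogou browser" example. The whitespace split stands in for the classifier, and the attribute lookup table, the fixed vocabulary, and the bag-of-words encoding are all assumptions of this sketch; the original does not specify how a feature is converted into a vector.

```python
def extract_features(keyword, attribute_lookup):
    """Return (word-sense feature, attribute feature) for a keyword.

    A whitespace split stands in for a real segmenter or classifier, and
    attribute_lookup is a hypothetical table mapping keywords to attributes.
    """
    tokens = keyword.lower().split()
    attributes = attribute_lookup.get(keyword.lower(), [])
    return tokens, attributes

def to_vector(features, vocabulary):
    """Encode a list of feature terms as term counts over a fixed vocabulary."""
    return [float(features.count(term)) for term in vocabulary]

vocabulary = ["sogou", "browser", "flower"]  # hypothetical vocabulary
first_feature, second_feature = extract_features(
    "Sogou browser", {"sogou browser": ["browser"]}
)
first_vector = to_vector(first_feature, vocabulary)    # word-sense dimension
second_vector = to_vector(second_feature, vocabulary)  # attribute dimension
```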
S203: According to the correspondence between the first keyword and the texts opened according to the first keyword, propagate the first feature vector in the bipartite graph to obtain a first propagation vector of the first keyword under the first dimension; and propagate the second feature vector in the bipartite graph to obtain a second propagation vector of the first keyword under the second dimension.
In this embodiment, the feature vector of the first keyword can be propagated along an edge of the bipartite graph to a text opened according to the first keyword, and the text can obtain a new feature vector by calculation, which serves as the propagation vector of the text. After the text propagates the new feature vector back along the same edge to the first keyword, the first keyword obtains a new feature vector by calculation, which serves as the propagation vector of the first keyword. Therefore, for the feature vectors of the first keyword under different dimensions, the first feature vector can be propagated in the bipartite graph and, after the propagation and the calculation at the first keyword, yields the first propagation vector of the first keyword under the first dimension; correspondingly, the second feature vector can be propagated in the bipartite graph and, after the propagation and the calculation at the first keyword, yields the second propagation vector of the first keyword under the second dimension.
It should be noted that, after one round of propagation is completed, the new feature vector of the first keyword can continue to be propagated along the edges between nodes until a preset number of rounds is reached or the feature vector of each node becomes stable.
It can be understood that, to facilitate operations on the propagation vectors such as calculation and comparison, the obtained propagation vectors can be normalized.
It should be noted that, in this embodiment, the propagation and calculation of the first feature vector in the bipartite graph and the propagation and calculation of the second feature vector in the bipartite graph can be separate and do not affect each other. The order of the two is not limited: they can be carried out one after the other or in parallel.
Taking q1 in FIG. 1 as the first keyword, the first feature vector of q1 under the first dimension can be A and its second feature vector under the second dimension can be B; the text opened according to q1 is d1, and the feature vector of d1 can be C. After the first feature vector A of q1 is propagated, d1 can perform a calculation with the first feature vector on the basis of its feature vector C to obtain a new feature vector C', which serves as the first feature vector of d1 under the first dimension. After the second feature vector B of q1 is propagated, d1 can perform a calculation with the second feature vector on the basis of its feature vector C to obtain a new feature vector C", which serves as the second feature vector of d1 under the second dimension. It can be seen that the propagation and calculation of the first feature vector in the bipartite graph and those of the second feature vector can be separate and do not affect each other.
d1 can then propagate the new feature vector C' to q1. After receiving C', q1 calculates a new feature vector A' on the basis of the first feature vector A, and A' serves as the first propagation vector of q1 under the first dimension. Likewise, d1 can propagate the new feature vector C" to q1. After receiving C", q1 calculates a new feature vector B' on the basis of the second feature vector B, and B' serves as the second propagation vector of q1 under the second dimension. After q1 has received C' and C", the new feature vectors can continue to be propagated along the edges between nodes until a preset number of rounds is reached or the feature vector of each node becomes stable.
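As an illustration only, the round-trip propagation described above might be sketched as follows. The original does not define the "calculation processing" performed at each node; this sketch assumes that a node adds the vectors received from its neighbors, weighted by open count, to its own vector and then normalizes the result. The names propagate, max_rounds, and tol, the initial text vectors, and the open counts are likewise assumptions.

```python
import math

def normalize(vec):
    """Scale a vector to unit length; a zero vector is returned unchanged."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)

def propagate(graph, keyword_vecs, text_vecs, max_rounds=10, tol=1e-6):
    """Propagate vectors over {keyword: {text: open_count}} for one dimension."""
    kw = {k: normalize(v) for k, v in keyword_vecs.items()}
    tx = {t: normalize(v) for t, v in text_vecs.items()}
    for _ in range(max_rounds):
        # Keywords -> texts: each text updates itself from its keywords.
        new_tx = {}
        for t, own in tx.items():
            acc = list(own)
            for k, opened in graph.items():
                if t in opened:
                    acc = [a + opened[t] * x for a, x in zip(acc, kw[k])]
            new_tx[t] = normalize(acc)
        # Texts -> keywords: each keyword updates itself from its texts.
        new_kw = {}
        for k, own in kw.items():
            acc = list(own)
            for t, count in graph.get(k, {}).items():
                acc = [a + count * x for a, x in zip(acc, new_tx[t])]
            new_kw[k] = normalize(acc)
        # Stop early once the keyword vectors have become stable.
        delta = max(
            (abs(a - b) for k in kw for a, b in zip(kw[k], new_kw[k])),
            default=0.0,
        )
        kw, tx = new_kw, new_tx
        if delta < tol:
            break
    return kw, tx

# Edges of FIG. 1 with hypothetical open counts and feature vectors.
graph = {"q1": {"d1": 1}, "q2": {"d1": 1, "d2": 1}, "q3": {"d2": 1}}
text_vecs = {"d1": [0.0, 0.0], "d2": [0.0, 0.0]}
first_dim = {"q1": [1.0, 0.0], "q2": [0.0, 1.0], "q3": [1.0, 1.0]}
second_dim = {"q1": [0.0, 1.0], "q2": [0.0, 1.0], "q3": [1.0, 0.0]}
# Each dimension is propagated on its own, so the two runs do not interact.
first_prop, _ = propagate(graph, first_dim, text_vecs)
second_prop, _ = propagate(graph, second_dim, text_vecs)
```

Because the two calls share no state, they could equally be run one after the other or in parallel.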
It should be noted that the count on an edge of the bipartite graph can indicate the number of times a text was opened after searching with a keyword, and this number can indicate the degree of correlation between the keyword and the text opened according to it. The more times the text was opened, the greater the weight of the correspondence between the keyword and that text among all texts that correspond to the keyword, and the greater its influence on the calculated correlation between the keyword and the text. The fewer times the text was opened, the smaller that weight, and the smaller its influence on the calculated correlation between the keyword and the text.
For a keyword, if propagation is carried out to all texts opened according to the keyword in the bipartite graph, propagation to texts with a small weight among all texts corresponding to the keyword may not help the correlation calculation. In this case, this embodiment can select the texts to which propagation is carried out according to the open counts in the bipartite graph, and so determine the texts to be propagated to. The first feature vector is then propagated in the bipartite graph according to the correspondence between the first keyword and the determined texts, obtaining the first propagation vector of the first keyword under the first dimension; correspondingly, the second feature vector is propagated in the bipartite graph according to the correspondence between the first keyword and the determined texts, obtaining the second propagation vector of the first keyword under the second dimension. For example, the texts with more opens can be selected from the texts opened according to the first keyword for propagation. This prevents the first feature vector and the second feature vector of the first keyword from being propagated along edges that indicate a low degree of correlation between the first keyword and the opened text, so that propagation vectors are calculated from texts more highly correlated with the first keyword and the amount of calculation is reduced.
Taking q2 in FIG. 1 as the first keyword, suppose d1 was opened 5 times according to q2 and d2 was opened once according to q2. The correspondence between q2 and the opened text d2 then has a small weight among all texts d that correspond to q2 (here d1 and d2), so propagating the feature vector of q2 to d2 helps little in calculating the correlation between q2 and the texts. Therefore, d1 can be determined as the text to be propagated to, and the first feature vector of q2 can be propagated in the bipartite graph according to the correspondence between q2 and d1 to obtain the first propagation vector of q2 under the first dimension.
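As an illustration only, the selection of texts by open count in the q2 example might be sketched as follows. The original says only that texts with more opens can be selected; the rule used here, keeping texts whose share of the keyword's total opens reaches min_share, and the value 0.5 are assumptions of this sketch.

```python
def select_texts(open_counts, min_share=0.5):
    """Keep the texts whose share of all opens for the keyword is at least min_share.

    open_counts maps each text opened according to one keyword to its open count.
    """
    total = sum(open_counts.values())
    if total == 0:
        return {}
    return {
        text: count
        for text, count in open_counts.items()
        if count / total >= min_share
    }

# q2 opened d1 five times and d2 once, as in the example above.
q2_counts = {"d1": 5, "d2": 1}
selected = select_texts(q2_counts)
```

With these counts only d1 remains, so the feature vectors of q2 would be propagated along the q2-d1 edge alone.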
S204: Calculate, according to the first propagation vector, a first correlation feature between the first keyword and the text opened according to the first keyword under the first dimension; and calculate, according to the second propagation vector, a second correlation feature between the first keyword and the text opened according to the first keyword under the second dimension.
In this embodiment, the propagation vector obtained after a feature vector is propagated carries richer information than the original feature vector. According to the propagation vectors under different dimensions, the correlation features between the first keyword and the text opened according to the first keyword under those dimensions can be calculated.
Taking the first propagation vector A' and the second propagation vector B' obtained above for the first keyword q1 as an example, if correlation features are expressed as percentages, then the first correlation feature between q1 and the text d1 opened according to q1 under the first dimension, calculated from the first propagation vector A', can be 90%, and the second correlation feature between q1 and d1 under the second dimension, calculated from the second propagation vector B', can be 85%.
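As an illustration only, a correlation feature might be derived from propagation vectors as follows. The original does not state the formula; this sketch assumes the cosine similarity between the propagation vector of the keyword and that of the text under the same dimension, clipped to the range 0 to 1 so that it can be read as a percentage. The function names are hypothetical.

```python
import math

def cosine_similarity(u, v):
    """Cosine of the angle between two vectors; 0.0 if either vector is zero."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)

def correlation_feature(keyword_vec, text_vec):
    """Correlation feature in [0, 1] for one dimension (assumed: clipped cosine)."""
    return max(0.0, cosine_similarity(keyword_vec, text_vec))

same_direction = correlation_feature([1.0, 0.0], [2.0, 0.0])
orthogonal = correlation_feature([1.0, 0.0], [0.0, 1.0])
```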
S205: Obtain, according to the first correlation feature and the second correlation feature, the correlation result between the first keyword and the text opened according to the first keyword.
The first correlation feature and the second correlation feature obtained above indicate the degree of correlation between the first keyword and the text opened according to the first keyword under different dimensions. To obtain the correlation between the first keyword and that text accurately, the degrees of correlation under the different dimensions can be considered together. Therefore, the first correlation feature and the second correlation feature can be integrated to obtain the correlation result between the first keyword and the text opened according to the first keyword. The correlation result can be expressed as a percentage.
Optionally, the first correlation feature and the second correlation feature can be integrated by a model, such as a logistic regression model, to obtain the correlation result between the first keyword and the text opened according to the first keyword.
As an example, the correlation result between the first keyword and the text opened according to the first keyword can be obtained from the first correlation feature and the second correlation feature by calculating the average of the two correlation features and taking the average as the correlation result.
Because the first correlation feature and the second correlation feature may influence the correlation result to different extents, a weight can be set for each of them, and the correlation result can be obtained as the weighted combination of the first correlation feature and the second correlation feature.
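As an illustration only, the three ways of integrating correlation features mentioned above (a plain average, a weighted combination, and a logistic regression model) might be sketched as follows, using the 90% and 85% values of the earlier example. The weights and bias shown are hypothetical; in a logistic regression model they would be learned from labeled data, which the original does not describe.

```python
import math

def average(features):
    """Plain mean of the correlation features."""
    return sum(features) / len(features)

def weighted(features, weights):
    """Weighted mean; weights need not sum to 1."""
    return sum(f * w for f, w in zip(features, weights)) / sum(weights)

def logistic(features, weights, bias):
    """Logistic-regression style score in (0, 1)."""
    z = bias + sum(f * w for f, w in zip(features, weights))
    return 1.0 / (1.0 + math.exp(-z))

features = [0.90, 0.85]  # first and second correlation features
mean_result = average(features)
weighted_result = weighted(features, [0.6, 0.4])          # hypothetical weights
logistic_result = logistic(features, [2.0, 2.0], -3.0)    # hypothetical parameters
```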
Of course, the correlation result can also be calculated by other models, which are not described here.
It can be seen from the above technical solution that, when analyzing the correlation between a keyword and a text in a bipartite graph, the correlation can be calculated according to multiple dimensions of the keyword. Specifically, the features of the first keyword under different dimensions can be extracted, and the feature vector corresponding to each feature can be calculated separately. The features under different dimensions can reflect information relevant to the keyword and the text from different dimensions, so that data of the keyword in different dimensions is available for calculating the correlation between the first keyword and the text opened according to the first keyword. When the feature vectors are propagated, propagation vectors under different dimensions are obtained, and correlation features under different dimensions can be calculated from them. One correlation feature can reflect the degree of correlation between the keyword and the text under one dimension, and another can reflect it under another dimension; integrating the correlation features under different dimensions thus yields the correlation result between the first keyword and the text opened according to the first keyword. Compared with the traditional approach, which calculates the correlation between a keyword and a text from only one dimension, for example only the word-sense dimension, the embodiments of the present application obtain data of the keyword in different dimensions, thereby providing more information for the calculation and reflecting the correlation from different dimensions. The calculated correlation therefore has higher confidence, and texts displayed according to it more readily meet the user's search needs, improving the user's search experience.
It should be noted that the method of this embodiment can be used not only to calculate the correlation between a keyword and a text opened according to the keyword, but also to calculate the correlation between different keywords. On the basis of the embodiment corresponding to FIG. 2, this embodiment provides a method for calculating the correlation between keywords. Referring to FIG. 3, which shows a flowchart of the method, the method includes:
S301: Extract a third feature of a second keyword under the first dimension and a fourth feature of the second keyword under the second dimension.
In this embodiment, when some or all of the texts opened according to different keywords are the same, there may be a correlation between those keywords. Therefore, in this embodiment, some or all of the texts opened according to the second keyword are the same as the texts opened according to the first keyword.
Taking FIG. 1 as an example, it can be seen from the figure that d1 can be opened according to both q1 and q2. Therefore, q1 and q2 can be correlated, where q2 can serve as the first keyword and q1 can serve as the second keyword.
S302: Calculate a third feature vector of the third feature and a fourth feature vector of the fourth feature.
S303: According to the correspondence between the second keyword and the texts opened according to the second keyword, propagate the third feature vector in the bipartite graph to obtain a third propagation vector of the second keyword under the first dimension; and propagate the fourth feature vector in the bipartite graph to obtain a fourth propagation vector of the second keyword under the second dimension.
In this embodiment, the specific implementations of S301 to S303 are the same as those of the foregoing S201 to S203, respectively, and are not described again here.
S304: Calculate, according to the third propagation vector, a third correlation feature between the second keyword and the first keyword under the first dimension; and calculate, according to the fourth propagation vector, a fourth correlation feature between the second keyword and the first keyword under the second dimension.
Because d1 can be opened according to both q1 and q2, after the feature vector of the second keyword q1 is propagated once, d1 obtains a new feature vector by calculation. That new feature vector can be propagated along the edge between d1 and q2 to q2, and q2 obtains a new feature vector by calculation. The new feature vector of q2 can be propagated back along the edge between d1 and q2 to d1, so that d1, by calculation, obtains a new feature vector containing information related to q2. The new feature vector containing information related to q2 can then be propagated along the edge between d1 and q1 to q1, so that q1, by calculation, obtains a new feature vector containing information related to q2. In this way, through multiple rounds of propagation, the propagation vectors of the second keyword q1 under different dimensions may contain the same information as the first keyword q2. Therefore, the correlation features between the second keyword and the first keyword under different dimensions can be calculated according to the propagation vectors under those dimensions.
S305: Obtain, according to the third correlation feature and the fourth correlation feature, the correlation result between the second keyword and the first keyword.
The calculation in S304 and S305 targets a keyword and another keyword, whereas the calculation in S204 and S205 targets a keyword and a text opened according to the keyword. Apart from this difference in the objects targeted, the specific implementation of S304 and S305 is the same as that of the foregoing S204 and S205 and is not described again here.
The method for calculating the correlation between keywords provided above can determine the correlation between one keyword and another. Thus, when a user inputs a keyword, other keywords related to it can be recommended to the user for searching according to the correlation between the input keyword and the other keywords, improving the user's search experience. For example, if the user inputs the keyword "fresh flowers" and the keyword to be recommended to the user is "fresh flower delivery", the correlation result between "fresh flowers" and "fresh flower delivery" can be calculated according to the bipartite graph. When that correlation result reaches a certain threshold, the degree of correlation between "fresh flowers" and "fresh flower delivery" is considered high, and the user may wish to search for "fresh flower delivery"; at this point, it can be determined that the keyword "fresh flower delivery" is recommended to the user for searching.
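As an illustration only, keyword recommendation by threshold might be sketched as follows. The stored correlation results, the extra candidate "flower vase", the threshold value, and the function name recommend_keywords are all hypothetical; the original states only that a keyword is recommended when the correlation result reaches a certain threshold.

```python
def recommend_keywords(query, keyword_correlations, threshold):
    """Return keywords whose correlation result with query reaches threshold.

    keyword_correlations maps a keyword to {other_keyword: correlation_result};
    results are returned in order of decreasing correlation.
    """
    candidates = keyword_correlations.get(query, {})
    ranked = sorted(candidates.items(), key=lambda item: item[1], reverse=True)
    return [keyword for keyword, score in ranked if score >= threshold]

# Hypothetical precomputed keyword-to-keyword correlation results.
keyword_correlations = {
    "fresh flowers": {"fresh flower delivery": 0.90, "flower vase": 0.40},
}
recommended = recommend_keywords("fresh flowers", keyword_correlations, 0.80)
```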
Based on the method for determining text correlation provided above: in some cases, when a user searches with a keyword, the search engine often displays some texts to the user. When the correlation between a text and the keyword input by the user is relatively high, displaying the text to the user can meet the user's search needs and give the user a good experience. When the correlation between a text and the keyword input by the user is relatively low, the displayed text may not be what the user needs and may instead give the user a poor experience. For this reason, before a text is displayed to the user, the text needs to be analyzed and, according to the correlation between keywords and the text, it is determined whether the text should be displayed when the user searches for a given keyword, so as to improve the user's search experience.
This embodiment further provides a method for displaying text according to keywords. Referring to FIG. 4, which shows a flowchart of the method, the method includes:
S401: Obtain a text to be analyzed.
In this embodiment, to determine whether to display a text to the user, the text can be taken as the text to be analyzed. The text to be analyzed can be a text in the bipartite graph that users found and opened by searching with keywords. The text to be analyzed can be, for example, an advertisement text.
S402: Determine, according to the bipartite graph, the keywords that have correlation results with the text to be analyzed.
Because the foregoing method for determining text correlation can determine the correlation results between keywords and the texts opened according to them in the bipartite graph, and these correlation results are saved, after the text to be analyzed is obtained, the keywords that have correlation results with the text to be analyzed can be determined according to the bipartite graph.
Taking an advertisement text as the text to be analyzed and the bipartite graph shown in FIG. 1 as an example, suppose the foregoing method for determining text correlation determines the correlation results between keywords and the texts opened according to them in the bipartite graph to be, in order:
the correlation result between q1 and d1 is 80%;
the correlation result between q2 and d1 is 95%;
the correlation result between q2 and d2 is 20%;
the correlation result between q3 and d2 is 90%.
The advertisement text can then be split, for example into an advertisement keyword, a title description, and so on. From the split result and the bipartite graph, it is found that the advertisement keyword corresponds to text d1 in the bipartite graph and the title description corresponds to text d2 in the bipartite graph. The keywords that have correlation results with d1 are q1 and q2, and the keywords that have correlation results with d2 are q2 and q3. Thus, according to the bipartite graph shown in FIG. 1, the keywords that have correlation results with the advertisement text are q1, q2, and q3.
S403: Take the keywords whose correlation results meet a preset condition as the keywords corresponding to the text to be analyzed.
Correlation results may differ in size, and some may be very large. To make it more reliable that a text displayed to the user meets the user's search needs, the keywords whose correlation results meet a preset condition can be taken as the keywords corresponding to the text to be analyzed, and the text is displayed to the user only when the user searches for a keyword whose correlation result meets the preset condition. The preset condition can be that the correlation result is greater than a threshold, and the threshold can be set manually based on experience.
Taking the correlation results obtained above as an example, suppose the preset condition is that the correlation result is greater than a threshold, and the threshold is 85%. According to the above correlation results, the correlation result between q1 and d1 is 80%, which is less than 85%, so q1 cannot serve as a keyword corresponding to the text to be analyzed. The correlation result between q2 and d1 is 95%, which is greater than 85%; judged by this correlation result, q2 can serve as a keyword corresponding to the text to be analyzed. The correlation result between q2 and d2 is 20%, which is much less than 85%; judged by this correlation result, q2 cannot serve as a keyword corresponding to the text to be analyzed. The correlation result between q3 and d2 is 90%, which is greater than 85%, so q3 can serve as a keyword corresponding to the text to be analyzed.
It can be seen that keyword q from above-mentioned analysis result2With d1And d2All there is correlation results, a but correlation
As a result meet preset condition, another correlation results is unsatisfactory for preset condition, for this purpose, can use q2With d1Between correlation
Property result 95% and q2With d2Between the mean values of correlation results 20% judged, according to the mean value of two correlation results
Can determine the mean value less than 85%, therefore, q2The corresponding keyword of the text to be analyzed cannot be used as, i.e., it is final to determine
Keyword corresponding with the text to be analyzed out is q3.Therefore, when user inputs keyword q3When, search engine just can be to
User shows the text to be analyzed.
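As an illustration only, the selection in S402 and S403 might be sketched as follows with the correlation results of the example above. The data layout keyed by (keyword, text) pairs and the function name keywords_for_text are assumptions; the averaging of a keyword's correlation results over the parts of the text and the "greater than threshold" condition follow the example.

```python
def keywords_for_text(correlations, text_parts, threshold):
    """Return keywords whose mean correlation result over text_parts exceeds threshold.

    correlations maps (keyword, text) pairs to saved correlation results;
    text_parts is the set of bipartite-graph texts the analyzed text maps to.
    """
    per_keyword = {}
    for (keyword, text), score in correlations.items():
        if text in text_parts:
            per_keyword.setdefault(keyword, []).append(score)
    return sorted(
        keyword
        for keyword, scores in per_keyword.items()
        if sum(scores) / len(scores) > threshold
    )

correlations = {
    ("q1", "d1"): 0.80,
    ("q2", "d1"): 0.95,
    ("q2", "d2"): 0.20,
    ("q3", "d2"): 0.90,
}
# The advertisement text maps to d1 (advertisement keyword) and d2 (title description).
result = keywords_for_text(correlations, {"d1", "d2"}, 0.85)
```

Here q1 (80%) and q2 (mean 57.5%) fall below the 85% threshold, so only q3 is kept.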
In this embodiment, before a text is displayed to the user, the keywords corresponding to the text, namely those whose correlation results meet the preset condition, can be determined according to the calculated correlation results between keywords and the text. When the user searches for a keyword, whether to display the text to the user is determined according to whether that keyword is one of the keywords corresponding to the text, thereby improving the user's search experience.
Based on the method for determining text correlation provided by the foregoing embodiment, this embodiment provides an apparatus for determining text correlation. FIG. 5 shows a structural block diagram of the apparatus. The apparatus is applied to a bipartite graph that includes keywords and the correspondence between the keywords and the texts opened according to them. The apparatus includes a first extraction unit 501, a first calculation unit 502, a first propagation unit 503, a second calculation unit 504, and a first acquisition unit 505:
The first extraction unit 501 is configured to extract a first feature of a first keyword under a first dimension and a second feature of the first keyword under a second dimension, the first keyword being a keyword in the bipartite graph;
The first calculation unit 502 is configured to calculate a first feature vector of the first feature and a second feature vector of the second feature;
The first propagation unit 503 is configured to propagate the first feature vector in the bipartite graph according to the correspondence between the first keyword and the texts opened according to the first keyword, to obtain a first propagation vector of the first keyword under the first dimension; and to propagate the second feature vector in the bipartite graph, to obtain a second propagation vector of the first keyword under the second dimension;
The second calculation unit 504 is configured to calculate, according to the first propagation vector, a first correlation feature between the first keyword and the text opened according to the first keyword under the first dimension; and to calculate, according to the second propagation vector, a second correlation feature between the first keyword and the text opened according to the first keyword under the second dimension;
The first acquisition unit 505 is configured to obtain, according to the first correlation feature and the second correlation feature, the correlation result between the first keyword and the text opened according to the first keyword.
Optionally, the first propagation unit includes a determination subunit and a first propagation subunit:
The determination subunit is configured to determine, according to open counts, the texts to be propagated to from the correspondence between the first keyword and the texts opened according to the first keyword;
The first propagation subunit is configured to propagate the first feature vector in the bipartite graph according to the correspondence between the first keyword and the determined texts to be propagated to, to obtain the first propagation vector of the first keyword under the first dimension.
Optionally, the first acquisition unit includes a first acquisition subunit:
The first acquisition subunit is configured to obtain, by a logistic regression model and according to the first correlation feature and the second correlation feature, the correlation result between the first keyword and the text opened according to the first keyword.
Optionally, the apparatus further includes a second extraction unit, a third calculation unit, a second propagation unit, a fourth calculation unit, and a second acquisition unit:
The second extraction unit is configured to extract a third feature of a second keyword under the first dimension and a fourth feature of the second keyword under the second dimension, some or all of the texts opened according to the second keyword being the same as the texts opened according to the first keyword;
The third calculation unit is configured to calculate a third feature vector of the third feature and a fourth feature vector of the fourth feature;
The second propagation unit is configured to propagate the third feature vector in the bipartite graph according to the correspondence between the second keyword and the texts opened according to the second keyword, to obtain a third propagation vector of the second keyword under the first dimension; and to propagate the fourth feature vector in the bipartite graph, to obtain a fourth propagation vector of the second keyword under the second dimension;
The fourth calculation unit is configured to calculate, according to the third propagation vector, a third correlation feature between the second keyword and the first keyword under the first dimension; and to calculate, according to the fourth propagation vector, a fourth correlation feature between the second keyword and the first keyword under the second dimension;
The second acquisition unit is configured to obtain, according to the third correlation feature and the fourth correlation feature, the correlation result between the second keyword and the first keyword.
Optionally, the apparatus further includes a third acquisition unit, a determination unit, and a fourth acquisition unit:
The third acquisition unit is configured to obtain a text to be analyzed;
The determination unit is configured to determine, according to the bipartite graph, the keywords that have correlation results with the text to be analyzed;
The fourth acquisition unit is configured to take the keywords whose correlation results meet a preset condition as the keywords corresponding to the text to be analyzed.
It can be seen from the above technical solution that, when analyzing the correlation between a keyword and a text in a bipartite graph, the correlation can be calculated according to multiple dimensions of the keyword. Specifically, the features of the first keyword under different dimensions can be extracted, and the feature vector corresponding to each feature can be calculated separately. The features under different dimensions can reflect information relevant to the keyword and the text from different dimensions, so that data of the keyword in different dimensions is available for calculating the correlation between the first keyword and the text opened according to the first keyword. When the feature vectors are propagated, propagation vectors under different dimensions are obtained, and correlation features under different dimensions can be calculated from them. One correlation feature can reflect the degree of correlation between the keyword and the text under one dimension, and another can reflect it under another dimension; integrating the correlation features under different dimensions thus yields the correlation result between the first keyword and the text opened according to the first keyword.
It can be seen that relative in traditional approach, only from one
Dimension calculates keyword and the correlation between text, such as only calculates keyword and the correlation between text from meaning of a word dimension, this
Application embodiment can be provided to calculate keyword with the correlation between text by obtaining keyword in the data of different dimensions
More information, the correlation that can be embodied the correlation between keyword and text from different dimensions, and then be calculated
Confidence level is higher, and the search need of user is more easily satisfied according to the text that the correlation is shown, improves the search body of user
It tests.
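The integration step can be illustrated with a small sketch. The claims name a logistic regression model as one way to combine the two correlation features; the Python below applies a logistic function to a weighted sum. The function name `correlation_result` and the `weights` and `bias` values are hypothetical placeholders, not trained parameters from the source.

```python
import math


def correlation_result(first_feature, second_feature, weights=(2.0, 2.0), bias=-2.0):
    """Combine two per-dimension correlation features into one score in (0, 1).

    weights and bias are illustrative values; in practice they would be
    learned from labelled keyword-text pairs.
    """
    z = bias + weights[0] * first_feature + weights[1] * second_feature
    return 1.0 / (1.0 + math.exp(-z))


# A pair that is strongly correlated under both dimensions scores higher
# than a pair that is correlated under only one of them.
both = correlation_result(0.9, 0.8)
one = correlation_result(0.9, 0.0)
```

A learned combination lets one dimension compensate for another: a keyword with little lexical overlap with a text can still score well if the second dimension shows a strong relationship.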
Fig. 6 is a block diagram of a device 600 for determining text relevance according to an exemplary embodiment. For example, the device 600 may be a robot, a mobile phone, a computer, a digital broadcast terminal, a messaging device, a game console, a tablet device, a medical device, a fitness device, a personal digital assistant, or the like.
Referring to Fig. 6, the device 600 may include one or more of the following components: a processing component 602, a memory 604, a power component 606, a multimedia component 608, an audio component 610, an input/output (I/O) interface 612, a sensor component 614, and a communication component 616.
The processing component 602 generally controls the overall operation of the device 600, such as operations associated with display, telephone calls, data communication, camera operation, and recording operations. The processing component 602 may include one or more processors 620 to execute instructions so as to perform all or part of the steps of the methods described above. In addition, the processing component 602 may include one or more modules that facilitate interaction between the processing component 602 and other components. For example, the processing component 602 may include a multimedia module to facilitate interaction between the multimedia component 608 and the processing component 602.
The memory 604 is configured to store various types of data to support operation of the device 600. Examples of such data include instructions for any application or method operating on the device 600, contact data, phone book data, messages, pictures, videos, and so on. The memory 604 may be implemented by any type of volatile or non-volatile storage device or a combination thereof, such as static random access memory (SRAM), electrically erasable programmable read-only memory (EEPROM), erasable programmable read-only memory (EPROM), programmable read-only memory (PROM), read-only memory (ROM), magnetic memory, flash memory, a magnetic disk, or an optical disc.
The power component 606 provides power for the various components of the device 600. The power component 606 may include a power management system, one or more power supplies, and other components associated with generating, managing, and distributing power for the device 600.
The multimedia component 608 includes a screen that provides an output interface between the device 600 and the user. In some embodiments, the screen may include a liquid crystal display (LCD) and a touch panel (TP). If the screen includes a touch panel, the screen may be implemented as a touch screen to receive input signals from the user. The touch panel includes one or more touch sensors to sense touches, slides, and gestures on the touch panel. The touch sensors may sense not only the boundary of a touch or slide action but also the duration and pressure associated with the touch or slide operation. In some embodiments, the multimedia component 608 includes a front camera and/or a rear camera. When the device 600 is in an operation mode, such as a shooting mode or a video mode, the front camera and/or the rear camera can receive external multimedia data. Each front camera and rear camera may be a fixed optical lens system or have focal length and optical zoom capability.
The audio component 610 is configured to output and/or input audio signals. For example, the audio component 610 includes a microphone (MIC) that is configured to receive external audio signals when the device 600 is in an operation mode, such as a call mode, a recording mode, or a voice recognition mode. The received audio signal may be further stored in the memory 604 or sent via the communication component 616. In some embodiments, the audio component 610 further includes a speaker for outputting audio signals.
I/O interface 612 provides interface between processing component 602 and peripheral interface module, and above-mentioned peripheral interface module can
To be keyboard, click wheel, button etc..These buttons may include, but are not limited to: home button, volume button, start button and lock
Determine button.
The sensor component 614 includes one or more sensors for providing status assessments of various aspects of the device 600. For example, the sensor component 614 can detect the open/closed state of the device 600 and the relative positioning of components, such as the display and keypad of the device 600. The sensor component 614 can also detect a change in position of the device 600 or of one of its components, the presence or absence of user contact with the device 600, the orientation or acceleration/deceleration of the device 600, and a change in temperature of the device 600. The sensor component 614 may include a proximity sensor configured to detect the presence of nearby objects without any physical contact. The sensor component 614 may also include an optical sensor, such as a CMOS or CCD image sensor, for use in imaging applications. In some embodiments, the sensor component 614 may also include an acceleration sensor, a gyroscope sensor, a magnetic sensor, a pressure sensor, or a temperature sensor.
The communication component 616 is configured to facilitate wired or wireless communication between the device 600 and other devices. The device 600 can access a wireless network based on a communication standard, such as WiFi, 2G, or 3G, or a combination thereof. In one exemplary embodiment, the communication component 616 receives a broadcast signal or broadcast-related information from an external broadcast management system via a broadcast channel. In one exemplary embodiment, the communication component 616 further includes a near-field communication (NFC) module to facilitate short-range communication. For example, the NFC module may be implemented based on radio frequency identification (RFID) technology, Infrared Data Association (IrDA) technology, ultra-wideband (UWB) technology, Bluetooth (BT) technology, and other technologies.
In an exemplary embodiment, the device 600 may be implemented by one or more application-specific integrated circuits (ASICs), digital signal processors (DSPs), digital signal processing devices (DSPDs), programmable logic devices (PLDs), field-programmable gate arrays (FPGAs), controllers, microcontrollers, microprocessors, or other electronic components, for performing the above method.
In an exemplary embodiment, a non-transitory computer-readable storage medium including instructions is also provided, for example the memory 604 including instructions, and the instructions can be executed by the processor 620 of the device 600 to perform the above method. For example, the non-transitory computer-readable storage medium may be a ROM, a random access memory (RAM), a CD-ROM, a magnetic tape, a floppy disk, an optical data storage device, or the like.
A non-transitory computer-readable storage medium is provided such that, when instructions in the storage medium are executed by a processor of a mobile terminal, the mobile terminal is enabled to perform a method for determining text relevance, the method comprising:
Extracting a first feature of a first keyword under a first dimension and a second feature of the first keyword under a second dimension, the first keyword being a keyword in the bipartite graph;
Calculating a first feature vector of the first feature and a second feature vector of the second feature;
Propagating, according to the correspondence between the first keyword and the texts opened according to the first keyword, the first feature vector in the bipartite graph to obtain a first propagation vector of the first keyword under the first dimension; and propagating the second feature vector in the bipartite graph to obtain a second propagation vector of the first keyword under the second dimension;
Calculating, according to the first propagation vector, a first correlation feature between the first keyword and the text opened according to the first keyword under the first dimension; and calculating, according to the second propagation vector, a second correlation feature between the first keyword and the text opened according to the first keyword under the second dimension;
Obtaining, according to the first correlation feature and the second correlation feature, a correlation result between the first keyword and the text opened according to the first keyword.
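The propagation step in the method above can be sketched as follows for a single dimension. The propagation rule is an assumption: the sketch pushes each keyword's feature vector to the texts opened according to it, averages by open count, and then pulls the text vectors back to the keywords. The function name `propagate`, the use of open counts as edge weights, the `rounds` parameter, and the sparse dictionary representation are all hypothetical.

```python
from collections import defaultdict


def propagate(edges, keyword_vectors, rounds=1):
    """Propagate sparse keyword feature vectors over a keyword-text bipartite graph.

    edges: dict keyword -> dict text -> open count (assumed positive).
    keyword_vectors: dict keyword -> dict feature -> weight, for one dimension.
    Returns (keyword propagation vectors, text propagation vectors).
    """
    kw_vecs = {kw: dict(vec) for kw, vec in keyword_vectors.items()}
    text_vecs = {}
    for _ in range(rounds):
        # Keyword -> text: each text takes the open-count-weighted average
        # of the vectors of the keywords that led to it being opened.
        acc = defaultdict(lambda: defaultdict(float))
        totals = defaultdict(float)
        for kw, texts in edges.items():
            for text, count in texts.items():
                totals[text] += count
                for feat, weight in kw_vecs.get(kw, {}).items():
                    acc[text][feat] += count * weight
        text_vecs = {
            text: {feat: w / totals[text] for feat, w in vec.items()}
            for text, vec in acc.items()
        }
        # Text -> keyword: each keyword takes the open-count-weighted average
        # of the vectors of the texts opened according to it.
        new_kw_vecs = {}
        for kw, texts in edges.items():
            total = sum(texts.values())
            vec = defaultdict(float)
            for text, count in texts.items():
                for feat, weight in text_vecs.get(text, {}).items():
                    vec[feat] += count * weight / total
            new_kw_vecs[kw] = dict(vec)
        kw_vecs = new_kw_vecs
    return kw_vecs, text_vecs


# Two keywords that open the same text end up sharing each other's features.
edges = {"a": {"t1": 1}, "b": {"t1": 1}}
vectors = {"a": {"x": 1.0}, "b": {"y": 1.0}}
kw_prop, text_prop = propagate(edges, vectors)
```

Running the same function once per dimension, with that dimension's feature vectors, gives the first and second propagation vectors separately.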
Fig. 7 is a schematic structural diagram of a server in an embodiment of the present invention. The server 700 may vary considerably depending on configuration or performance, and may include one or more central processing units (CPUs) 722 (for example, one or more processors), memory 732, and one or more storage media 730 (for example, one or more mass storage devices) storing application programs 742 or data 744. The memory 732 and the storage medium 730 may provide transient or persistent storage. The program stored in the storage medium 730 may include one or more modules (not shown in the figure), and each module may include a series of instruction operations on the server. Further, the central processing unit 722 may be configured to communicate with the storage medium 730 and to execute, on the server 700, the series of instruction operations in the storage medium 730.
The server 700 may also include one or more power supplies 724, one or more wired or wireless network interfaces 750, one or more input/output interfaces 758, one or more keyboards 754, and/or one or more operating systems 741, such as Windows Server™, Mac OS X™, Unix™, Linux™, FreeBSD™, and so on.
It should be noted that the embodiments in this specification are described in a progressive manner; for the same or similar parts, the embodiments may refer to one another, and each embodiment focuses on its differences from the others. In particular, the device and system embodiments are described relatively briefly because they are substantially similar to the method embodiments, and the relevant parts can be found in the description of the method embodiments. The device and system embodiments described above are merely illustrative: units described as separate components may or may not be physically separate, and components shown as units may or may not be physical units, that is, they may be located in one place or distributed over multiple network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of the embodiment. Those of ordinary skill in the art can understand and implement this without creative effort.
The above is only one specific embodiment of the present application, but the protection scope of the present application is not limited thereto. Any change or substitution that a person skilled in the art could readily conceive within the technical scope disclosed in the present application shall fall within the protection scope of the present application. Therefore, the protection scope of the present application shall be subject to the protection scope of the claims.
Claims (10)
1. A method for determining text relevance, characterized in that the method is applied to a bipartite graph, the bipartite graph including correspondences between keywords and texts opened according to the keywords, the method comprising:
Extracting a first feature of a first keyword under a first dimension and a second feature of the first keyword under a second dimension, the first keyword being a keyword in the bipartite graph;
Calculating a first feature vector of the first feature and a second feature vector of the second feature;
Propagating, according to the correspondence between the first keyword and the texts opened according to the first keyword, the first feature vector in the bipartite graph to obtain a first propagation vector of the first keyword under the first dimension; and propagating the second feature vector in the bipartite graph to obtain a second propagation vector of the first keyword under the second dimension;
Calculating, according to the first propagation vector, a first correlation feature between the first keyword and the text opened according to the first keyword under the first dimension; and calculating, according to the second propagation vector, a second correlation feature between the first keyword and the text opened according to the first keyword under the second dimension;
Obtaining, according to the first correlation feature and the second correlation feature, a correlation result between the first keyword and the text opened according to the first keyword.
2. The method according to claim 1, characterized in that propagating, according to the correspondence between the first keyword and the texts opened according to the first keyword, the first feature vector in the bipartite graph to obtain the first propagation vector of the first keyword under the first dimension comprises:
Determining, according to the number of opens, the text used for propagation from the correspondence between the first keyword and the texts opened according to the first keyword;
Propagating, according to the correspondence between the first keyword and the determined text used for propagation, the first feature vector in the bipartite graph to obtain the first propagation vector of the first keyword under the first dimension.
3. The method according to claim 1, characterized in that obtaining, according to the first correlation feature and the second correlation feature, the correlation result between the first keyword and the text opened according to the first keyword comprises:
Obtaining, through a logistic regression model and according to the first correlation feature and the second correlation feature, the correlation result between the first keyword and the text opened according to the first keyword.
4. The method according to claim 1, characterized in that the method further comprises:
Extracting a third feature of a second keyword under the first dimension and a fourth feature of the second keyword under the second dimension, where some or all of the texts opened according to the second keyword are the same as the texts opened according to the first keyword;
Calculating a third feature vector of the third feature and a fourth feature vector of the fourth feature;
Propagating, according to the correspondence between the second keyword and the texts opened according to the second keyword, the third feature vector in the bipartite graph to obtain a third propagation vector of the second keyword under the first dimension; and propagating the fourth feature vector in the bipartite graph to obtain a fourth propagation vector of the second keyword under the second dimension;
Calculating, according to the third propagation vector, a third correlation feature between the second keyword and the first keyword under the first dimension; and calculating, according to the fourth propagation vector, a fourth correlation feature between the second keyword and the first keyword under the second dimension;
Obtaining, according to the third correlation feature and the fourth correlation feature, a correlation result between the second keyword and the first keyword.
5. The method according to any one of claims 1 to 4, characterized in that the method further comprises:
Obtaining a text to be analyzed;
Determining, according to the bipartite graph, keywords that have a correlation result with the text to be analyzed;
Taking the keywords whose correlation results meet a preset condition as the keywords corresponding to the text to be analyzed.
6. A device for determining text relevance, characterized in that the device is applied to a bipartite graph, the bipartite graph including correspondences between keywords and texts opened according to the keywords, the device including a first extraction unit, a first computing unit, a first propagation unit, a second computing unit, and a first acquisition unit:
The first extraction unit is configured to extract a first feature of a first keyword under a first dimension and a second feature of the first keyword under a second dimension, the first keyword being a keyword in the bipartite graph;
The first computing unit is configured to calculate a first feature vector of the first feature and a second feature vector of the second feature;
The first propagation unit is configured to propagate, according to the correspondence between the first keyword and the texts opened according to the first keyword, the first feature vector in the bipartite graph to obtain a first propagation vector of the first keyword under the first dimension, and to propagate the second feature vector in the bipartite graph to obtain a second propagation vector of the first keyword under the second dimension;
The second computing unit is configured to calculate, according to the first propagation vector, a first correlation feature between the first keyword and the text opened according to the first keyword under the first dimension, and to calculate, according to the second propagation vector, a second correlation feature between the first keyword and the text opened according to the first keyword under the second dimension;
The first acquisition unit is configured to obtain, according to the first correlation feature and the second correlation feature, a correlation result between the first keyword and the text opened according to the first keyword.
7. The device according to claim 6, characterized in that the first propagation unit includes a determination subunit and a first propagation subunit:
The determination subunit is configured to determine, according to the number of opens, the text used for propagation from the correspondence between the first keyword and the texts opened according to the first keyword;
The first propagation subunit is configured to propagate, according to the correspondence between the first keyword and the determined text used for propagation, the first feature vector in the bipartite graph to obtain the first propagation vector of the first keyword under the first dimension.
8. The device according to claim 6, characterized in that the first acquisition unit includes a first acquisition subunit:
The first acquisition subunit is configured to obtain, through a logistic regression model and according to the first correlation feature and the second correlation feature, the correlation result between the first keyword and the text opened according to the first keyword.
9. A processing device for determining text relevance, characterized by comprising a memory and one or more programs, wherein the one or more programs are stored in the memory and configured to be executed by one or more processors, the one or more programs including instructions for performing the following operations:
Extracting a first feature of a first keyword under a first dimension and a second feature of the first keyword under a second dimension, the first keyword being a keyword in the bipartite graph;
Calculating a first feature vector of the first feature and a second feature vector of the second feature;
Propagating, according to the correspondence between the first keyword and the texts opened according to the first keyword, the first feature vector in the bipartite graph to obtain a first propagation vector of the first keyword under the first dimension; and propagating the second feature vector in the bipartite graph to obtain a second propagation vector of the first keyword under the second dimension;
Calculating, according to the first propagation vector, a first correlation feature between the first keyword and the text opened according to the first keyword under the first dimension; and calculating, according to the second propagation vector, a second correlation feature between the first keyword and the text opened according to the first keyword under the second dimension;
Obtaining, according to the first correlation feature and the second correlation feature, a correlation result between the first keyword and the text opened according to the first keyword.
10. A machine-readable medium having instructions stored thereon which, when executed by one or more processors, cause a device to perform the method for determining text relevance according to one or more of claims 1 to 5.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201711252358.3A CN110019801B (en) | 2017-12-01 | 2017-12-01 | Text relevance determining method and device |
Publications (2)
Publication Number | Publication Date |
---|---|
CN110019801A true CN110019801A (en) | 2019-07-16 |
CN110019801B CN110019801B (en) | 2021-03-23 |
Family
ID=67185941
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201711252358.3A Active CN110019801B (en) | 2017-12-01 | 2017-12-01 | Text relevance determining method and device |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN110019801B (en) |
Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN104424279A (en) * | 2013-08-30 | 2015-03-18 | 腾讯科技(深圳)有限公司 | Text relevance calculating method and device |
US20160378765A1 (en) * | 2015-06-29 | 2016-12-29 | Microsoft Technology Licensing, Llc | Concept expansion using tables |
CN106682095A (en) * | 2016-12-01 | 2017-05-17 | 浙江大学 | Subjectterm and descriptor prediction and ordering method based on diagram |
Non-Patent Citations (1)
Title |
---|
董宇欣 等: ""一种面向不确定图的SimRank算法"", 《哈尔滨工程大学学报》 * |
Also Published As
Publication number | Publication date |
---|---|
CN110019801B (en) | 2021-03-23 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN107102746B (en) | Candidate word generation method and device and candidate word generation device | |
CN106708282B (en) | A kind of recommended method and device, a kind of device for recommendation | |
CN109189987A (en) | Video searching method and device | |
CN109918669B (en) | Entity determining method, device and storage medium | |
CN109800325A (en) | Video recommendation method, device and computer readable storage medium | |
CN107992812A (en) | A kind of lip reading recognition methods and device | |
CN107239535A (en) | Similar pictures search method and device | |
CN110175223A (en) | A kind of method and device that problem of implementation generates | |
CN113792207B (en) | Cross-modal retrieval method based on multi-level feature representation alignment | |
CN109522419A (en) | Session information complementing method and device | |
CN109493852A (en) | A kind of evaluating method and device of speech recognition | |
CN108121736A (en) | A kind of descriptor determines the method for building up, device and electronic equipment of model | |
CN108073606A (en) | A kind of news recommends method and apparatus, a kind of device recommended for news | |
CN109933714A (en) | A kind of calculation method, searching method and the relevant apparatus of entry weight | |
CN109471919A (en) | Empty anaphora resolution method and device | |
CN116166843B (en) | Text video cross-modal retrieval method and device based on fine granularity perception | |
CN111984749A (en) | Method and device for ordering interest points | |
CN111611490A (en) | Resource searching method, device, equipment and storage medium | |
CN108255940A (en) | A kind of cross-language search method and apparatus, a kind of device for cross-language search | |
CN108304412A (en) | A kind of cross-language search method and apparatus, a kind of device for cross-language search | |
CN109521888A (en) | A kind of input method, device and medium | |
CN112307281A (en) | Entity recommendation method and device | |
CN108572979A (en) | A kind of position service method and device, a kind of device for location-based service | |
CN109977293A (en) | A kind of calculation method and device of search result relevance | |
CN106156299B (en) | The subject content recognition methods of text information and device |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||