CN110019801A - Method and apparatus for determining text relevance - Google Patents

Method and apparatus for determining text relevance

Info

Publication number
CN110019801A
Authority
CN
China
Prior art keywords
keyword
text
vector
dimension
under
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201711252358.3A
Other languages
Chinese (zh)
Other versions
CN110019801B (en)
Inventor
朱昌磊
叶祺
刘志敏
王峰
李刚
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Sogou Technology Development Co Ltd
Original Assignee
Beijing Sogou Technology Development Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Sogou Technology Development Co Ltd
Priority to CN201711252358.3A
Publication of CN110019801A
Application granted
Publication of CN110019801B
Active (current legal status)
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/35 Clustering; Classification
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/279 Recognition of textual entities
    • G06F40/289 Phrasal analysis, e.g. finite state techniques or chunking
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/951 Indexing; Web crawling techniques

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Databases & Information Systems (AREA)
  • Data Mining & Analysis (AREA)
  • Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Computational Linguistics (AREA)
  • General Health & Medical Sciences (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

An embodiment of this application discloses a method for determining text relevance. When the relevance between a keyword and a text in a bipartite graph is analyzed, features of the keyword under different dimensions can be extracted and the feature vector corresponding to each feature calculated, which yields data about the keyword in different dimensions. Propagating the feature vectors yields propagation vectors under the different dimensions, from which relevance features under the different dimensions can be calculated. Integrating these relevance features gives the relevance result between the keyword and the texts opened according to the keyword. Because data about the keyword is obtained in different dimensions, the relevance between keyword and text is reflected from different dimensions and the calculated relevance is more reliable; texts displayed according to this relevance more readily meet the user's search needs, improving the user's search experience.

Description

Method and apparatus for determining text relevance
Technical field
This application relates to the field of data processing, and in particular to a method and apparatus for determining text relevance.
Background
With the spread of the Internet, users can search for the information they need by entering keywords into a search engine. A keyword search returns texts related to the keyword, and the user can select the required text from the results and open it for browsing. From this search behavior, a bipartite graph of keywords and the texts opened for them can be built. For example, as shown in Fig. 1, a node q on the left side of the bipartite graph can be a keyword, a node d on the right side can be an opened text, and the number on the line between a q and a d can represent the number of times that this d was found and opened by searching for this q. In the example of Fig. 1, d1 was opened 3 times through the behavior of searching for the keyword q1.
The relevance between a keyword and a text can be determined from the bipartite graph, so that when a user again searches for a keyword in the bipartite graph, the search engine can determine the order in which texts are displayed in the search results according to their relevance.
In the traditional approach, to analyze the relevance between a keyword and a text, the keyword is usually segmented into words, and the relevance is calculated from the word vectors of the segments. However, these word vectors mainly reflect the word meaning of the segments or of the keyword, which may differ from the user's actual purpose in searching with that keyword. When relevance is calculated from the word-meaning dimension alone, the texts displayed according to it hardly meet the user's search needs, which degrades the user's search experience.
Summary of the invention
To solve the above technical problem, this application provides a method and apparatus for determining text relevance that can calculate the relevance between a keyword and a text according to multiple dimensions of the keyword. The calculated relevance is thereby more accurate, so that the texts displayed according to it more readily meet the user's search needs, improving the user's search experience.
The embodiments of this application disclose the following technical solutions:
In a first aspect, an embodiment of this application provides a method for determining text relevance, applied to a bipartite graph that includes keywords and the correspondence between each keyword and the texts opened according to that keyword. The method comprises:
Extracting a first feature of a first keyword under a first dimension and a second feature of the first keyword under a second dimension, the first keyword being a keyword in the bipartite graph;
Calculating a first feature vector of the first feature and a second feature vector of the second feature;
Propagating the first feature vector in the bipartite graph according to the correspondence between the first keyword and the texts opened according to the first keyword, to obtain a first propagation vector of the first keyword under the first dimension; and propagating the second feature vector in the bipartite graph, to obtain a second propagation vector of the first keyword under the second dimension;
Calculating, according to the first propagation vector, a first relevance feature between the first keyword and the texts opened according to the first keyword under the first dimension; and calculating, according to the second propagation vector, a second relevance feature between the first keyword and those texts under the second dimension;
Obtaining, according to the first relevance feature and the second relevance feature, a relevance result between the first keyword and the texts opened according to the first keyword.
Optionally, propagating the first feature vector in the bipartite graph according to the correspondence between the first keyword and the texts opened according to the first keyword, to obtain the first propagation vector of the first keyword under the first dimension, comprises:
Determining, according to the number of openings, the texts to which propagation is performed from the correspondence between the first keyword and the texts opened according to the first keyword;
Propagating the first feature vector in the bipartite graph according to the correspondence between the first keyword and the determined texts, to obtain the first propagation vector of the first keyword under the first dimension.
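The selection step above can be sketched as follows. This is only an illustration: the text does not state how the number of openings is used, so a simple minimum-count threshold is assumed here, and the function name `select_propagation_texts`, the parameter `min_opens`, and the count for d2 are hypothetical.

```python
from typing import Dict


def select_propagation_texts(opens: Dict[str, int], min_opens: int = 2) -> Dict[str, int]:
    """Keep only the texts that were opened at least `min_opens` times.

    `opens` maps each text opened for one keyword to its number of openings.
    The threshold rule is an assumption; the source only says that the texts
    are chosen "according to the number of openings".
    """
    return {text: count for text, count in opens.items() if count >= min_opens}


# Openings for keyword q1: d1 was opened 3 times (as in Fig. 1);
# the count for d2 is a made-up value for illustration.
q1_opens = {"d1": 3, "d2": 1}
selected = select_propagation_texts(q1_opens, min_opens=2)
print(selected)
```

With such a filter, rarely opened texts would not take part in propagation, which may reduce noise from accidental openings.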
Optionally, obtaining, according to the first relevance feature and the second relevance feature, the relevance result between the first keyword and the texts opened according to the first keyword comprises:
Obtaining, through a logistic regression model and according to the first relevance feature and the second relevance feature, the relevance result between the first keyword and the texts opened according to the first keyword.
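As a minimal sketch of how a logistic regression model might combine the two relevance features, the following applies a sigmoid to a weighted sum. The weights and bias are hypothetical placeholders; in practice they would be learned from labeled data, which the text does not describe.

```python
import math
from typing import Sequence


def relevance_result(first_feature: float,
                     second_feature: float,
                     weights: Sequence[float] = (2.0, 1.5),
                     bias: float = -1.0) -> float:
    """Combine two relevance features into one score in (0, 1).

    Standard logistic regression form: sigmoid(bias + w1*x1 + w2*x2).
    The default weights and bias are illustrative assumptions only.
    """
    z = bias + weights[0] * first_feature + weights[1] * second_feature
    return 1.0 / (1.0 + math.exp(-z))


# Two hypothetical relevance features, one per dimension.
score = relevance_result(0.9, 0.8)
print(score)
```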
Optionally, the method further comprises:
Extracting a third feature of a second keyword under the first dimension and a fourth feature of the second keyword under the second dimension, where some or all of the texts opened according to the second keyword are the same as texts opened according to the first keyword;
Calculating a third feature vector of the third feature and a fourth feature vector of the fourth feature;
Propagating the third feature vector in the bipartite graph according to the correspondence between the second keyword and the texts opened according to the second keyword, to obtain a third propagation vector of the second keyword under the first dimension; and propagating the fourth feature vector in the bipartite graph, to obtain a fourth propagation vector of the second keyword under the second dimension;
Calculating, according to the third propagation vector, a third relevance feature between the second keyword and the first keyword under the first dimension; and calculating, according to the fourth propagation vector, a fourth relevance feature between the second keyword and the first keyword under the second dimension;
Obtaining, according to the third relevance feature and the fourth relevance feature, a relevance result between the second keyword and the first keyword.
Optionally, the method further comprises:
Obtaining a text to be analyzed;
Determining, according to the bipartite graph, keywords that have a relevance result with the text to be analyzed;
Taking the keywords whose relevance results meet a preset condition as the keywords corresponding to the text to be analyzed.
In a second aspect, an embodiment of this application provides an apparatus for determining text relevance, applied to a bipartite graph that includes keywords and the correspondence between each keyword and the texts opened according to that keyword. The apparatus comprises a first extraction unit, a first calculation unit, a first propagation unit, a second calculation unit, and a first acquisition unit:
The first extraction unit is configured to extract a first feature of a first keyword under a first dimension and a second feature of the first keyword under a second dimension, the first keyword being a keyword in the bipartite graph;
The first calculation unit is configured to calculate a first feature vector of the first feature and a second feature vector of the second feature;
The first propagation unit is configured to propagate the first feature vector in the bipartite graph according to the correspondence between the first keyword and the texts opened according to the first keyword, to obtain a first propagation vector of the first keyword under the first dimension; and to propagate the second feature vector in the bipartite graph, to obtain a second propagation vector of the first keyword under the second dimension;
The second calculation unit is configured to calculate, according to the first propagation vector, a first relevance feature between the first keyword and the texts opened according to the first keyword under the first dimension; and to calculate, according to the second propagation vector, a second relevance feature between the first keyword and those texts under the second dimension;
The first acquisition unit is configured to obtain, according to the first relevance feature and the second relevance feature, a relevance result between the first keyword and the texts opened according to the first keyword.
Optionally, the first propagation unit comprises a determining subunit and a first propagation subunit:
The determining subunit is configured to determine, according to the number of openings, the texts to which propagation is performed from the correspondence between the first keyword and the texts opened according to the first keyword;
The first propagation subunit is configured to propagate the first feature vector in the bipartite graph according to the correspondence between the first keyword and the determined texts, to obtain the first propagation vector of the first keyword under the first dimension.
Optionally, the first acquisition unit comprises a first acquisition subunit:
The first acquisition subunit is configured to obtain, through a logistic regression model and according to the first relevance feature and the second relevance feature, the relevance result between the first keyword and the texts opened according to the first keyword.
Optionally, the apparatus further comprises a second extraction unit, a third calculation unit, a second propagation unit, a fourth calculation unit, and a second acquisition unit:
The second extraction unit is configured to extract a third feature of a second keyword under the first dimension and a fourth feature of the second keyword under the second dimension, where some or all of the texts opened according to the second keyword are the same as texts opened according to the first keyword;
The third calculation unit is configured to calculate a third feature vector of the third feature and a fourth feature vector of the fourth feature;
The second propagation unit is configured to propagate the third feature vector in the bipartite graph according to the correspondence between the second keyword and the texts opened according to the second keyword, to obtain a third propagation vector of the second keyword under the first dimension; and to propagate the fourth feature vector in the bipartite graph, to obtain a fourth propagation vector of the second keyword under the second dimension;
The fourth calculation unit is configured to calculate, according to the third propagation vector, a third relevance feature between the second keyword and the first keyword under the first dimension; and to calculate, according to the fourth propagation vector, a fourth relevance feature between the second keyword and the first keyword under the second dimension;
The second acquisition unit is configured to obtain, according to the third relevance feature and the fourth relevance feature, a relevance result between the second keyword and the first keyword.
Optionally, the apparatus further comprises a third acquisition unit, a determination unit, and a fourth acquisition unit:
The third acquisition unit is configured to obtain a text to be analyzed;
The determination unit is configured to determine, according to the bipartite graph, keywords that have a relevance result with the text to be analyzed;
The fourth acquisition unit is configured to take the keywords whose relevance results meet a preset condition as the keywords corresponding to the text to be analyzed.
In a third aspect, an embodiment of this application provides a processing device for determining text relevance, characterized in that it comprises a memory and one or more programs, where the one or more programs are stored in the memory and are configured to be executed by one or more processors, the one or more programs including instructions for performing the following operations:
Extracting a first feature of a first keyword under a first dimension and a second feature of the first keyword under a second dimension, the first keyword being a keyword in the bipartite graph;
Calculating a first feature vector of the first feature and a second feature vector of the second feature;
Propagating the first feature vector in the bipartite graph according to the correspondence between the first keyword and the texts opened according to the first keyword, to obtain a first propagation vector of the first keyword under the first dimension; and propagating the second feature vector in the bipartite graph, to obtain a second propagation vector of the first keyword under the second dimension;
Calculating, according to the first propagation vector, a first relevance feature between the first keyword and the texts opened according to the first keyword under the first dimension; and calculating, according to the second propagation vector, a second relevance feature between the first keyword and those texts under the second dimension;
Obtaining, according to the first relevance feature and the second relevance feature, a relevance result between the first keyword and the texts opened according to the first keyword.
In a fourth aspect, an embodiment of this application provides a machine-readable medium storing instructions that, when executed by one or more processors, cause an apparatus to perform the method for determining text relevance described in one or more implementations of the first aspect.
As can be seen from the above technical solutions, when the relevance between a keyword and a text in a bipartite graph is analyzed, the relevance can be calculated according to multiple dimensions of the keyword. Specifically, features of the first keyword under different dimensions can be extracted and the feature vector corresponding to each feature calculated. The features under different dimensions reflect information relating the keyword to the text from different angles, so data about the keyword is obtained in different dimensions for calculating the relevance between the first keyword and the texts opened according to the first keyword. Propagating the feature vectors yields propagation vectors under the different dimensions, from which relevance features under the different dimensions can be calculated. One relevance feature reflects the degree of relevance between keyword and text under one dimension, and another reflects it under another dimension; integrating the relevance features obtained under the different dimensions gives the relevance result between the first keyword and the texts opened according to the first keyword. Compared with the traditional approach, which calculates the relevance between keyword and text from a single dimension such as word meaning, the embodiments of this application obtain data about the keyword in different dimensions and so provide more information for the calculation. The relevance between keyword and text is reflected from different dimensions, the calculated relevance is more reliable, and texts displayed according to it more readily meet the user's search needs, improving the user's search experience.
Brief description of the drawings
To describe the technical solutions in the embodiments of this application or in the prior art more clearly, the accompanying drawings needed for describing the embodiments or the prior art are briefly introduced below. The drawings in the following description show only some embodiments of this application, and a person of ordinary skill in the art may derive other drawings from them without creative effort.
Fig. 1 is an example of a bipartite graph provided by an embodiment of this application;
Fig. 2 is a flowchart of a method for determining text relevance provided by an embodiment of this application;
Fig. 3 is a flowchart of a method for calculating the relevance between keywords provided by an embodiment of this application;
Fig. 4 shows a method of displaying text according to a keyword provided by an embodiment of this application;
Fig. 5 is a structural block diagram of an apparatus for determining text relevance provided by an embodiment of this application;
Fig. 6 is a block diagram of a device for determining text relevance provided by an embodiment of this application;
Fig. 7 is a block diagram of a server for determining text relevance provided by an embodiment of this application.
Detailed description
The embodiments of this application are described below with reference to the accompanying drawings.
The inventors found through research that traditional methods of determining the relevance between a keyword and a text usually just segment the keyword and calculate the relevance using only the word-meaning dimension of the keyword. Relevance calculated this way may fail to reflect the user's actual search purpose, so its accuracy is insufficient; texts displayed according to it hardly meet the user's search needs, which degrades the user's search experience.
For example, a user enters the keyword "Xiaomi mobile phone" hoping to find information about the Xiaomi phone brand. In the traditional approach, when the relevance between "Xiaomi mobile phone" and texts is calculated in order to display corresponding texts to the user, the keyword is segmented, for example into "Xiaomi" (literally "millet") and "mobile phone". From the segmentation result, the word meaning of the keyword may be determined to refer mainly to mobile phones, that is, the user is taken to be searching for a mobile phone. Texts about mobile phones are then displayed to the user rather than information about the Xiaomi phone brand, which degrades the user's search experience.
To address this problem, the embodiments of this application provide a solution in which the relevance between a keyword and a text is calculated according to multiple dimensions of the keyword. For example, suppose the multiple dimensions include a first dimension and a second dimension and the keywords include a first keyword. First, a first feature of the first keyword under the first dimension and a second feature under the second dimension are extracted, and a first feature vector of the first feature and a second feature vector of the second feature are calculated. Then, according to the correspondence in the bipartite graph between the first keyword and the texts opened according to the first keyword, the first feature vector is propagated in the bipartite graph to obtain a first propagation vector of the first keyword under the first dimension, and the second feature vector is propagated in the bipartite graph to obtain a second propagation vector of the first keyword under the second dimension. Next, a first relevance feature between the first keyword and the texts opened according to the first keyword under the first dimension is calculated from the first propagation vector, and a second relevance feature between the first keyword and those texts under the second dimension is calculated from the second propagation vector. Finally, the relevance result between the first keyword and the texts opened according to the first keyword is obtained from the first relevance feature and the second relevance feature. Compared with the traditional approach, which calculates the relevance between keyword and text from a single dimension such as word meaning, the embodiments of this application obtain data about the keyword in different dimensions and so provide more information for the calculation. The relevance between keyword and text is reflected from different dimensions, the calculated relevance is more reliable, and texts displayed according to it more readily meet the user's search needs, improving the user's search experience.
The bipartite graph mentioned in the embodiments of this application is built from historical search data and may include the keywords queried by users and the correspondence between each keyword and the texts opened according to that keyword. A text opened according to a keyword is a text that a user obtained by searching for the keyword and then opened. The text may be a web page, an advertisement, and so on. As shown in Fig. 1, the bipartite graph includes nodes, edges, and numbers on the edges. The nodes represent keywords and the texts opened according to keywords: a node q on the left side of the bipartite graph can be a keyword, and a node d on the right side can be an opened text. An edge is a line between a q and a d and represents the correspondence between a keyword and a text opened according to it. An edge exists between a q and a d only if the text d was opened by searching for the keyword q; for example, the edge between q1 and d1 indicates that the text d1 was opened by searching for q1. The number on an edge represents how many times, in the historical search data, the d was found and opened by searching for the q. This number of openings can represent the weight of the correspondence between a q and a text d opened according to it among all d that correspond to that q. For example, in Fig. 1, d1 was opened 3 times through the behavior of searching for the keyword q1, so the edge between q1 and d1 is marked 3, and the number 3 can represent the weight of the correspondence between q1 and d1 among all d corresponding to q1 (such as d1 and d2).
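The structure described above can be sketched in a few lines. This is an illustrative sketch, not the patent's implementation: the log format of (keyword, opened text) pairs, the function names, and the normalization of counts into fractional weights are assumptions, and the single q1-d2 opening is a made-up value.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple


def build_bipartite_graph(search_log: Iterable[Tuple[str, str]]) -> Dict[str, Dict[str, int]]:
    """Count how many times each text was opened after searching each keyword.

    `search_log` is an iterable of (keyword, opened_text) pairs, one per opening.
    The result maps keyword -> {text: number of openings}; each nonzero entry
    is one edge of the bipartite graph.
    """
    graph: Dict[str, Dict[str, int]] = defaultdict(lambda: defaultdict(int))
    for keyword, text in search_log:
        graph[keyword][text] += 1
    return graph


def edge_weights(graph: Dict[str, Dict[str, int]], keyword: str) -> Dict[str, float]:
    """Turn opening counts into each edge's share among all texts of `keyword`.

    Dividing by the total is an assumed normalization; the source only says
    the count can represent the weight of the edge.
    """
    total = sum(graph[keyword].values())
    return {text: count / total for text, count in graph[keyword].items()}


# q1 -> d1 opened 3 times as in Fig. 1; the single q1 -> d2 opening is hypothetical.
log = [("q1", "d1")] * 3 + [("q1", "d2")]
graph = build_bipartite_graph(log)
weights = edge_weights(graph, "q1")
print(dict(graph["q1"]), weights)
```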
A keyword is the content that a user enters into a search engine in order to find the required information. A keyword can be a word, for example "mountain", "mobile phone", or "Xiaomi mobile phone"; a phrase, for example "how to tie a silk scarf" or "we are"; or a sentence, for example "Can I eat an apple before going to sleep?".
A dimension is an angle from which the relevance between a keyword and a text is analyzed. Analyzing the relevance from several different angles can be regarded as using several dimensions, and a feature represents the specific data or information corresponding to the keyword under a given angle. For example, for the keyword "Sogou browser", the relevance between "Sogou browser" and a text can be analyzed in two dimensions: the word meaning of the keyword and the attribute of the keyword. Specifically, the keyword can be segmented into "Sogou" and "browser", and "Sogou" and "browser" then serve as the features of "Sogou browser" under the word-meaning dimension. Since "Sogou browser" belongs to the attribute category of browsers, "browser" can serve as the feature of "Sogou browser" under the attribute dimension.
A feature vector represents a feature of a keyword in vector form, so that a computer can identify the feature, perform calculations on it, and carry out other related processing. For example, the feature of the keyword "Sogou browser" under the attribute dimension is "browser". Expressing this feature as text is inconvenient for subsequent calculation, so the feature vector corresponding to the feature "browser" can be calculated, and the feature is then expressed in vector form, which facilitates subsequent calculation and other processing.
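The "Sogou browser" example can be sketched as follows. The text does not say how a feature vector is calculated, so this sketch assumes a lookup table of pre-trained embeddings and averages the vectors of the features in one dimension; the table `EMBEDDINGS`, its values, and the dimension names are hypothetical.

```python
from typing import Dict, List

# Hypothetical 3-dimensional embeddings; real values would come from a trained model.
EMBEDDINGS: Dict[str, List[float]] = {
    "sogou": [1.0, 0.0, 0.0],
    "browser": [0.0, 1.0, 0.0],
}


def feature_vector(features: List[str]) -> List[float]:
    """Average the embeddings of all features extracted under one dimension."""
    vectors = [EMBEDDINGS[f] for f in features]
    return [sum(column) / len(vectors) for column in zip(*vectors)]


# Features of the keyword "Sogou browser" under the two dimensions from the text.
features_by_dimension = {
    "word_meaning": ["sogou", "browser"],  # result of segmenting the keyword
    "attribute": ["browser"],              # category the keyword belongs to
}
vectors_by_dimension = {
    dimension: feature_vector(features)
    for dimension, features in features_by_dimension.items()
}
print(vectors_by_dimension)
```

Keeping one vector per dimension, rather than merging them early, lets each dimension be propagated and scored separately in the later steps.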
Propagating a feature vector in the bipartite graph means transferring it from one node to another along the line between the nodes. When the feature vector reaches the other node, that node can perform calculation processing on it to obtain its own new feature vector, which can be called a propagation vector. Transferring a feature vector once from one node to another and generating a new feature vector can be regarded as one propagation in the bipartite graph. The new feature vector can continue to be propagated along the lines between nodes until a preset number of propagations is reached or the feature vector of each node stabilizes.
Taking the bipartite graph shown in Fig. 1 as an example, the feature vector of q1 can be propagated from q1 to node d1, and d1 can perform calculation processing on the received feature vector to obtain a new feature vector as the propagation vector of d1. d1 can propagate the new feature vector back to q1, and q1 can perform calculation processing on the vector received from d1 to obtain a new feature vector as the propagation vector of q1. After obtaining its new feature vector, d1 can also propagate it to q2, and q2 can perform calculation processing on the vector received from d1 to obtain a new feature vector as the propagation vector of q2. q2 can in turn propagate its new feature vector to d2, and d2 can perform calculation processing on the vector received from q2 to obtain a new feature vector as the propagation vector of d2. d2 can then propagate its new feature vector to q3, and q3 can perform calculation processing on the vector received from d2 to obtain a new feature vector as the propagation vector of q3. In this implementation, feature vectors can also be propagated in other ways, as long as propagation follows the correspondence in the bipartite graph between keywords and the texts opened according to them; these are not described one by one here.
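One possible form of this propagation is sketched below. The text leaves the "calculation processing" at each node unspecified, so the sketch assumes that a node's new vector is the opening-count-weighted average of the vectors it receives; the function name `propagate`, the edge counts other than q1-d1, and the initial vectors are all hypothetical.

```python
from typing import Dict, List, Tuple

Vectors = Dict[str, List[float]]


def propagate(edges: Dict[str, Dict[str, int]],
              keyword_vectors: Vectors,
              rounds: int = 1) -> Tuple[Vectors, Vectors]:
    """Propagate keyword vectors to texts and back for a fixed number of rounds.

    `edges` maps keyword -> {text: number of openings}. In each round every
    text receives the weighted average of its keywords' vectors, then every
    keyword receives the weighted average of its texts' vectors.
    """
    q_vecs: Vectors = dict(keyword_vectors)
    d_vecs: Vectors = {}
    for _ in range(rounds):
        # Keyword -> text: accumulate count-weighted sums per text.
        sums: Vectors = {}
        totals: Dict[str, int] = {}
        for q, texts in edges.items():
            for d, count in texts.items():
                acc = sums.setdefault(d, [0.0] * len(q_vecs[q]))
                for i, value in enumerate(q_vecs[q]):
                    acc[i] += count * value
                totals[d] = totals.get(d, 0) + count
        d_vecs = {d: [x / totals[d] for x in acc] for d, acc in sums.items()}

        # Text -> keyword: each keyword averages the vectors of its texts.
        new_q: Vectors = {}
        for q, texts in edges.items():
            total = sum(texts.values())
            size = len(q_vecs[q])
            new_q[q] = [
                sum(count * d_vecs[d][i] for d, count in texts.items()) / total
                for i in range(size)
            ]
        q_vecs = new_q
    return q_vecs, d_vecs


# Only the q1-d1 count of 3 comes from Fig. 1; the other counts are made up.
edges = {"q1": {"d1": 3}, "q2": {"d1": 1, "d2": 1}, "q3": {"d2": 2}}
initial = {"q1": [1.0, 0.0], "q2": [0.0, 1.0], "q3": [0.0, 1.0]}
q_vecs, d_vecs = propagate(edges, initial, rounds=1)
print(q_vecs, d_vecs)
```

After one round, q2 has picked up part of q1's vector through the shared text d1, which is the effect that lets keywords with common opened texts become comparable.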
A correlation feature can be used to measure the degree of correlation between one node and another node in the bipartite graph. The degree of correlation may be that between a keyword and a keyword in the bipartite graph, such as between q1 and q2, or that between a keyword and a text, such as between q1 and d1. The correlation feature can be expressed as a percentage; for example, the correlation feature can be 90%.
A correlation result can be obtained by integrating the correlation features under different dimensions, and is used to indicate the correlation between a keyword and a text in the bipartite graph.
The method for determining text relevance provided by the embodiments of the present application is described in detail below with reference to the accompanying drawings.
Referring to FIG. 2, FIG. 2 is a flowchart of a method for determining text relevance provided by an embodiment of the present application. The method can be applied to a bipartite graph, the bipartite graph includes correspondences between keywords and the texts opened according to the keywords, and the method includes:
S201: Extract a first feature of a first keyword under a first dimension and a second feature of the first keyword under a second dimension.
After the bipartite graph is established from the keywords input by users and the correspondences between the keywords and the texts opened according to them, this embodiment can, for a keyword in the bipartite graph, determine the correlation result between the keyword and a text opened according to the keyword on the basis of multiple dimensions of the keyword. Taking a first keyword included in the bipartite graph as an example, and taking a first dimension and a second dimension as the multiple dimensions, the first feature of the first keyword under the first dimension and the second feature of the first keyword under the second dimension can be extracted first.
It should be noted that the features under different dimensions, such as the first feature and the second feature, can be obtained using a classifier. The dimensions may include word sense, attribute, field, and so on, and the features may include the specific data corresponding to dimensions such as word segmentation, attribute, and field.
For example, the first keyword is "Sogou browser", the first dimension is word sense, and the second dimension is attribute. Segmenting "Sogou browser" with the classifier yields "Sogou" and "browser"; the word sense is "Sogou" and "browser" themselves, and the attribute of "Sogou browser" is browser. The first feature extracted in this embodiment can then be "Sogou" and "browser", and the second feature can be browser.
S202: Calculate a first feature vector of the first feature and a second feature vector of the second feature.
Because the first feature and the second feature subsequently need to be propagated in the bipartite graph and subjected to data processing operations such as calculation, they can be expressed in vector form to facilitate those operations. That is, the first feature is converted into the first feature vector by calculation, and the second feature is converted into the second feature vector.
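As a minimal sketch of S202, the features of the "Sogou browser" example can be turned into vectors with a multi-hot encoding over a fixed vocabulary. The encoding scheme, the function name `to_feature_vector`, and both vocabularies are assumptions made for illustration; the embodiment only states that the features are converted into vectors by calculation.

```python
def to_feature_vector(feature_items, vocabulary):
    """Multi-hot encoding: 1.0 where a vocabulary entry occurs in the feature."""
    items = set(feature_items)
    return [1.0 if term in items else 0.0 for term in vocabulary]

# Hypothetical vocabularies, one per dimension.
WORD_VOCAB = ["Sogou", "browser", "flowers", "delivery"]
ATTR_VOCAB = ["browser", "input method", "shop"]

first_feature = ["Sogou", "browser"]   # word-sense dimension
second_feature = ["browser"]           # attribute dimension

first_vector = to_feature_vector(first_feature, WORD_VOCAB)
second_vector = to_feature_vector(second_feature, ATTR_VOCAB)
```

Each dimension keeps its own vocabulary here, so the two vectors can have different lengths and are never mixed.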
S203: According to the correspondence between the first keyword and the text opened according to the first keyword, propagate the first feature vector in the bipartite graph to obtain a first propagation vector of the first keyword under the first dimension; and propagate the second feature vector in the bipartite graph to obtain a second propagation vector of the first keyword under the second dimension.
In this embodiment, the feature vector of the first keyword can be propagated along an edge of the bipartite graph to a text opened according to the first keyword, and the text can obtain a new feature vector by calculation as the propagation vector of the text. After the text propagates the new feature vector back along the same edge to the first keyword, the first keyword obtains a new feature vector by calculation as the propagation vector of the first keyword. Therefore, for the feature vectors of the first keyword under different dimensions, the first feature vector can be propagated in the bipartite graph, and after the propagation and the calculation by the first keyword, the first propagation vector of the first keyword under the first dimension is obtained. Correspondingly, the second feature vector can be propagated in the bipartite graph, and after the propagation and the calculation by the first keyword, the second propagation vector of the first keyword under the second dimension is obtained.
It should be noted that, after one propagation is completed, the new feature vector corresponding to the first keyword can continue to be propagated along the lines between nodes until a preset number of propagations is reached or the feature vector of each node tends to be stable.
It can be understood that, to facilitate operations on the propagation vectors such as calculation and comparison, the obtained propagation vectors can be normalized.
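The normalization mentioned above could, for instance, scale each propagation vector to unit length. The choice of the L2 norm and the function name `l2_normalize` are assumptions; the embodiment does not name a particular normalization.

```python
import math

def l2_normalize(vector):
    """Scale a vector to unit Euclidean length; a zero vector is returned unchanged."""
    norm = math.sqrt(sum(x * x for x in vector))
    if norm == 0.0:
        return list(vector)
    return [x / norm for x in vector]

normalized = l2_normalize([3.0, 4.0])
```

With all propagation vectors on the same scale, a later comparison between them depends on direction only and not on how many propagation rounds were accumulated.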
It should be noted that, in this embodiment, the propagation and calculation of the first feature vector in the bipartite graph and the propagation and calculation of the second feature vector in the bipartite graph can be separate and do not affect each other. The order of the two is not limited: they can be performed one after the other or in parallel.
Taking q1 in FIG. 1 as the first keyword, the first feature vector of q1 under the first dimension can be A, and its second feature vector under the second dimension can be B. The text opened according to q1 is d1, and the feature vector of d1 can be C. After the first feature vector A of q1 is propagated, d1 can calculate, on the basis of its feature vector C and the received first feature vector, a new feature vector C' as the first feature vector of d1 under the first dimension. After the second feature vector B of q1 is propagated, d1 can calculate, on the basis of its feature vector C and the received second feature vector, a new feature vector C'' as the second feature vector of d1 under the second dimension. It can be seen that the propagation and calculation of the first feature vector in the bipartite graph are separate from those of the second feature vector and the two do not affect each other.
d1 can then propagate the new feature vector C' back to q1. After receiving C', q1 calculates a new feature vector A' on the basis of the first feature vector A, as the first propagation vector of q1 under the first dimension. d1 can likewise propagate the new feature vector C'' back to q1. After receiving C'', q1 calculates a new feature vector B' on the basis of the second feature vector B, as the second propagation vector of q1 under the second dimension. After q1 has received C' and C'', the new feature vectors can continue to be propagated along the lines between nodes until a preset number of propagations is reached or the feature vector of each node tends to be stable.
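The independence of the two dimensions in the A, B, C example can be illustrated with a short sketch. The mixing rule in `mix`, the weight `alpha`, and the concrete numbers are hypothetical; only the data flow (A to C' to A', and B to C'' to B') follows the description.

```python
def mix(own, received, alpha=0.5):
    """Hypothetical node calculation: blend the node's own vector with the received one."""
    return [alpha * o + (1 - alpha) * r for o, r in zip(own, received)]

A = [1.0, 0.0]   # first feature vector of q1 (first dimension)
B = [0.0, 1.0]   # second feature vector of q1 (second dimension)
C = [0.5, 0.5]   # feature vector of d1

# First dimension: q1 -> d1 -> q1
C1 = mix(C, A)    # C'
A1 = mix(A, C1)   # A', first propagation vector of q1

# Second dimension: q1 -> d1 -> q1, computed without touching C1 or A1
C2 = mix(C, B)    # C''
B1 = mix(B, C2)   # B', second propagation vector of q1
```

Because neither chain reads a value produced by the other, the two chains can run in either order or in parallel, as the description states.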
It should be noted that the number on an edge of the bipartite graph can indicate how many times the text was opened after searching with the keyword, and this number can indicate the degree of correlation between the keyword and the text opened according to the keyword. The more times the text is opened, the greater the weight of the correspondence between the keyword and that text among all the texts that have a correspondence with the keyword, and the greater its influence on the calculated correlation between the keyword and the text. The fewer times the text is opened, the smaller that weight, and the smaller its influence on the calculated correlation between the keyword and the text.
For a keyword, if propagation is performed to all the texts opened according to the keyword in the bipartite graph, propagation to the texts with a small weight among all the texts that have a correspondence with the keyword may not help the calculation of the correlation. In this case, this embodiment can select the texts to propagate to according to the open counts in the bipartite graph and thereby determine the texts to which propagation is performed. The first feature vector is then propagated in the bipartite graph according to the correspondence between the first keyword and the determined texts, to obtain the first propagation vector of the first keyword under the first dimension; correspondingly, the second feature vector is propagated in the bipartite graph according to the correspondence between the first keyword and the determined texts, to obtain the second propagation vector of the first keyword under the second dimension. For example, the texts with larger open counts can be selected from the texts opened according to the first keyword. This prevents the first feature vector and the second feature vector of the first keyword from being propagated along edges that indicate a low degree of correlation between the first keyword and the text, so that the propagation vectors are calculated from texts that are more highly correlated with the first keyword and the amount of calculation is reduced.
Taking q2 in FIG. 1 as the first keyword, the number of times d1 is opened according to q2 is 5, and the number of times d2 is opened according to q2 is 1. The correspondence between q2 and text d2 therefore has a small weight among all the texts d that have a correspondence with q2 (d1 and d2), and propagating the feature vector of q2 to d2 contributes little to calculating the correlation between q2 and the texts. It can therefore be determined that the text to propagate to is d1. In this way, the first feature vector of q2 can be propagated in the bipartite graph according to the correspondence between q2 and d1, to obtain the first propagation vector of q2 under the first dimension.
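The selection of texts by open count can be sketched as below, using the counts of the q2 example. The weight definition (open count divided by the total open count of the keyword) and the cut-off `min_weight` are assumptions; the embodiment only says that texts with more opens can be selected.

```python
def select_texts(open_counts, min_weight=0.5):
    """Keep the texts whose share of the keyword's total opens is at least `min_weight`.

    `open_counts` maps a text id to the number of times it was opened for the keyword.
    """
    total = sum(open_counts.values())
    if total == 0:
        return []
    return [text for text, count in open_counts.items() if count / total >= min_weight]

# q2 opened d1 five times and d2 once.
texts_for_q2 = select_texts({"d1": 5, "d2": 1})
```

Here d1 holds five sixths of the opens and d2 one sixth, so only d1 is kept and the edge between q2 and d2 is skipped during propagation.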
S204: Calculate, according to the first propagation vector, a first correlation feature between the first keyword and the text opened according to the first keyword under the first dimension; and calculate, according to the second propagation vector, a second correlation feature between the first keyword and the text opened according to the first keyword under the second dimension.
In this embodiment, the propagation vector obtained after a feature vector is propagated carries richer information than the original feature vector. From the propagation vectors under different dimensions, the correlation features between the first keyword and the text opened according to the first keyword under the different dimensions can be calculated.
Taking the first propagation vector A' and the second propagation vector B' of the first keyword q1 obtained above as an example, if the correlation feature is expressed as a percentage, then from the first propagation vector A' it can be calculated that the first correlation feature between q1 and the text d1 opened according to q1 under the first dimension is 90%, and from the second propagation vector B' it can be calculated that the second correlation feature between q1 and d1 under the second dimension is 85%.
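One possible way to turn propagation vectors into a percentage is sketched below. Using the cosine similarity between the keyword's propagation vector and the text's propagation vector is an assumption made for illustration, as are the function names and the example vectors; the embodiment does not state which calculation produces the 90% and 85% values.

```python
import math

def cosine_similarity(u, v):
    """Cosine of the angle between two vectors; 0.0 if either vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)

def correlation_feature(keyword_vector, text_vector):
    """Correlation feature as a percentage in [0, 100]; negative similarities are clipped to 0."""
    return max(0.0, cosine_similarity(keyword_vector, text_vector)) * 100.0

# Hypothetical propagation vectors of q1 and d1 under the first dimension.
a_prime = [0.875, 0.125]
c_prime = [0.75, 0.25]
first_correlation = correlation_feature(a_prime, c_prime)
```

The same function would be called once per dimension, with that dimension's pair of propagation vectors.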
S205: Obtain, according to the first correlation feature and the second correlation feature, the correlation result between the first keyword and the text opened according to the first keyword.
Because the first correlation feature and the second correlation feature obtained above indicate the degree of correlation between the first keyword and the text opened according to the first keyword under different dimensions, the degrees of correlation under the different dimensions can be considered together in order to obtain the correlation accurately. Therefore, the first correlation feature and the second correlation feature can be integrated to obtain the correlation result between the first keyword and the text opened according to the first keyword. The correlation result can be expressed as a percentage.
Optionally, the first correlation feature and the second correlation feature can be integrated by a model, such as a logistic regression model, to obtain the correlation result between the first keyword and the text opened according to the first keyword.
As an example, the correlation result between the first keyword and the text opened according to the first keyword can be obtained from the first correlation feature and the second correlation feature by calculating the average of the two and using the average as the correlation result.
Because the first correlation feature and the second correlation feature may influence the correlation result to different degrees, a weight can be set for each of them, and the correlation result can be obtained as a weighted combination of the first correlation feature and the second correlation feature.
Of course, the correlation result can also be calculated by other models, which are not described here.
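The three integration options mentioned above (average, weighted combination, logistic regression model) can be sketched as follows, using the 90% and 85% correlation features from the earlier example expressed as fractions. The weights, coefficients, and bias are hypothetical values chosen for illustration; in practice a logistic regression model would be fitted on labelled data, which this sketch does not do.

```python
import math

def average(features):
    """Plain mean of the correlation features."""
    return sum(features) / len(features)

def weighted(features, weights):
    """Weighted mean; the weights do not need to sum to 1."""
    return sum(f * w for f, w in zip(features, weights)) / sum(weights)

def logistic(features, coefficients, bias):
    """Logistic model: sigmoid of a linear combination of the features."""
    z = bias + sum(f * c for f, c in zip(features, coefficients))
    return 1.0 / (1.0 + math.exp(-z))

features = [0.90, 0.85]   # first and second correlation features

mean_result = average(features)
weighted_result = weighted(features, [0.6, 0.4])         # hypothetical weights
logistic_result = logistic(features, [2.0, 1.5], -1.0)   # hypothetical parameters
```

With these inputs the plain average is 87.5% and the 0.6/0.4 weighting gives 88%; the logistic output depends entirely on the assumed parameters.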
It can be seen from the above technical solution that, when analyzing the correlation between a keyword and a text in the bipartite graph, the correlation can be calculated according to multiple dimensions of the keyword. Specifically, the features of the first keyword under different dimensions can be extracted, and the feature vector corresponding to the feature under each dimension can be calculated. The features under different dimensions can reflect information about the relation between the keyword and the text from different dimensions, so data on the keyword in different dimensions is obtained for calculating the correlation between the first keyword and the text opened according to the first keyword. When the feature vectors are propagated, the propagation vectors under the different dimensions can be obtained, and the correlation features under the different dimensions can be calculated from those propagation vectors. One correlation feature reflects the degree of correlation between the keyword and the text under one dimension, and another correlation feature reflects the degree of correlation between the keyword and the text under another dimension, so integrating the correlation features under the different dimensions yields the correlation result between the first keyword and the text opened according to the first keyword. Compared with the traditional approach, which calculates the correlation between a keyword and a text from only one dimension, for example only from the word-sense dimension, the embodiments of the present application obtain data on the keyword in different dimensions, which provides more information for calculating the correlation and reflects the correlation between the keyword and the text from different dimensions. The calculated correlation therefore has higher confidence, the texts displayed according to the correlation satisfy the user's search needs more readily, and the user's search experience is improved.
It should be noted that the method of this embodiment can be used to calculate not only the correlation between a keyword and a text opened according to the keyword, but also the correlation between different keywords. On the basis of the embodiment corresponding to FIG. 2, this embodiment provides a method for calculating the correlation between keywords. Referring to FIG. 3, FIG. 3 is a flowchart of a method for calculating the correlation between keywords, and the method includes:
S301: Extract a third feature of a second keyword under the first dimension and a fourth feature of the second keyword under the second dimension.
In this embodiment, when some or all of the texts opened according to different keywords are the same, there may be a correlation between those keywords. Therefore, in this embodiment, some or all of the texts opened according to the second keyword are the same as the texts opened according to the first keyword.
Taking FIG. 1 as an example, it can be seen from the figure that d1 can be opened according to both q1 and q2. Therefore, q1 and q2 may be correlated, where q2 can serve as the first keyword and q1 can serve as the second keyword.
S302: Calculate a third feature vector of the third feature and a fourth feature vector of the fourth feature.
S303: According to the correspondence between the second keyword and the text opened according to the second keyword, propagate the third feature vector in the bipartite graph to obtain a third propagation vector of the second keyword under the first dimension; and propagate the fourth feature vector in the bipartite graph to obtain a fourth propagation vector of the second keyword under the second dimension.
In this embodiment, the specific implementations of S301 to S303 are the same as those of S201 to S203, respectively, and are not repeated here.
S304: Calculate, according to the third propagation vector, a third correlation feature between the second keyword and the first keyword under the first dimension; and calculate, according to the fourth propagation vector, a fourth correlation feature between the second keyword and the first keyword under the second dimension.
Because d1 can be opened according to both q1 and q2, after the feature vector of the second keyword q1 is propagated once, d1 obtains a new feature vector by calculation. This new feature vector can be propagated along the line between d1 and q2 to q2, and q2 obtains its own new feature vector by calculation. The new feature vector of q2 can be propagated back along the line between d1 and q2 to d1, and through calculation d1 obtains a new feature vector that contains information about q2. This new feature vector can be propagated along the line between d1 and q1 to q1, and through calculation q1 obtains a new feature vector that contains information about q2. In this way, after multiple propagations, the propagation vectors of the second keyword q1 under different dimensions can contain the same information as the first keyword q2. Therefore, the correlation features between the second keyword and the first keyword under different dimensions can be calculated from the propagation vectors under those dimensions.
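The way information about q2 reaches q1 through the shared text d1 can be shown with a small sketch restricted to the nodes q1, q2 and d1. The update rule in `step`, the weight `alpha`, the initial vectors, and the use of cosine similarity as the keyword-to-keyword measure are all assumptions made for illustration.

```python
import math

# Sub-graph of FIG. 1: both keywords opened the same text d1.
GRAPH = {"q1": ["d1"], "q2": ["d1"], "d1": ["q1", "q2"]}

def step(vectors, graph, alpha=0.5):
    """One propagation round: blend each node's vector with the mean of its neighbours'."""
    new = {}
    for node, own in vectors.items():
        received = [vectors[n] for n in graph[node]]
        mean = [sum(col) / len(received) for col in zip(*received)]
        new[node] = [alpha * o + (1 - alpha) * m for o, m in zip(own, mean)]
    return new

def cosine(u, v):
    """Cosine similarity; 0.0 if either vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return 0.0 if nu == 0.0 or nv == 0.0 else dot / (nu * nv)

# Component 0 marks information from q1, component 1 information from q2.
vectors = {"q1": [1.0, 0.0], "q2": [0.0, 1.0], "d1": [0.0, 0.0]}
round1 = step(vectors, GRAPH)
round2 = step(round1, GRAPH)
similarity = cosine(round2["q1"], round2["q2"])
```

After the first round q1 still has no q2 component, because d1 had nothing to pass on yet; after the second round q1 holds a non-zero q2 component, which is why more than one propagation is needed before two keywords can be compared.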
S305: Obtain, according to the third correlation feature and the fourth correlation feature, the correlation result between the second keyword and the first keyword.
The objects of the calculation in S304 and S305 are a keyword and a keyword, whereas the objects of the calculation in S204 and S205 are a keyword and the text opened according to the keyword. Apart from this difference in objects, the specific implementation of S304 and S305 is the same as that of S204 and S205 and is not repeated here.
The method for calculating the correlation between keywords provided above can determine the correlation between one keyword and another. In this way, when the user inputs a keyword, other keywords related to it can be recommended to the user for searching according to the correlation between the input keyword and those other keywords, which improves the user's search experience.
For example, the user inputs the keyword "fresh flowers", and the keyword to be recommended to the user is "fresh flower delivery". The correlation result between "fresh flowers" and "fresh flower delivery" can be calculated according to the bipartite graph. When this correlation result reaches a certain threshold, the degree of correlation between "fresh flowers" and "fresh flower delivery" can be considered high, and the user may wish to search for "fresh flower delivery". At this point, it can be determined that the keyword "fresh flower delivery" is recommended to the user for searching.
On the basis of the method for determining text relevance provided above, consider that when a user searches with a keyword, the search engine often displays some texts to the user. When the correlation between a text and the keyword input by the user is relatively high, displaying the text to the user can satisfy the user's search needs and give the user a good experience. When the correlation between the text and the keyword input by the user is relatively low, the displayed text may not be what the user needs and may instead give the user a poor experience. For this reason, before a text is displayed to the user, the text needs to be analyzed, and whether to display the text when the user searches for a certain keyword is determined according to the correlation between the keyword and the text, so as to improve the user's search experience.
This embodiment also provides a method for displaying text according to a keyword. Referring to FIG. 4, which shows a flowchart of a method for displaying text according to a keyword, the method includes:
S401: Obtain a text to be analyzed.
In this embodiment, to determine whether to display a text to the user, the text can be taken as the text to be analyzed. The text to be analyzed can be a text in the bipartite graph that a user found by searching with a keyword and then opened. The text to be analyzed can be, for example, an advertisement text.
S402: Determine, according to the bipartite graph, the keywords that have a correlation result with the text to be analyzed.
Because the foregoing method for determining text relevance can determine the correlation results between the keywords in the bipartite graph and the texts opened according to those keywords, and the correlation results can be saved, the keywords that have a correlation result with the text to be analyzed can be determined according to the bipartite graph once the text to be analyzed has been obtained.
Taking an advertisement text as the text to be analyzed and the bipartite graph shown in FIG. 1 as an example, suppose the correlation results between the keywords in the bipartite graph and the texts opened according to the keywords, determined with the foregoing method for determining text relevance, are as follows:
The correlation result between q1 and d1 is 80%;
The correlation result between q2 and d1 is 95%;
The correlation result between q2 and d2 is 20%;
The correlation result between q3 and d2 is 90%.
The advertisement text can then be split, for example into an advertisement keyword, a title description, and so on. From the split result and the bipartite graph it is found that the advertisement keyword corresponds to text d1 in the bipartite graph and the title description corresponds to text d2 in the bipartite graph. The keywords that have a correlation result with d1 include q1 and q2, and the keywords that have a correlation result with d2 include q2 and q3. In this way, according to the bipartite graph shown in FIG. 1, it can be determined that the keywords that have a correlation result with the advertisement text are q1, q2 and q3.
S403: Take the keywords whose correlation results meet a preset condition as the keywords corresponding to the text to be analyzed.
The correlation results may differ in size, and some of them may be very large. To make it more reliable that a text displayed to the user satisfies the user's search needs, the keywords whose correlation results meet a preset condition can be taken as the keywords corresponding to the text to be analyzed, and the text is displayed to the user only when the user searches for such a keyword. The preset condition can be that the correlation result is greater than a threshold, and the threshold can be set manually based on experience.
Taking the correlation results obtained above as an example, suppose the preset condition is that the correlation result is greater than a threshold and the threshold is 85%. According to the above correlation results, the correlation result between q1 and d1 is 80%, which is less than 85%, so q1 cannot serve as a keyword corresponding to the text to be analyzed. The correlation result between q2 and d1 is 95%, which is greater than 85%, so judged by this correlation result q2 can serve as a keyword corresponding to the text to be analyzed. The correlation result between q2 and d2 is 20%, which is far less than 85%, so judged by this correlation result q2 cannot serve as a keyword corresponding to the text to be analyzed. The correlation result between q3 and d2 is 90%, which is greater than 85%, so q3 can serve as a keyword corresponding to the text to be analyzed.
It can be seen from the above analysis that keyword q2 has a correlation result with both d1 and d2, but one correlation result meets the preset condition and the other does not. In this case, the mean of the correlation result between q2 and d1 (95%) and the correlation result between q2 and d2 (20%) can be used for the judgment. The mean of the two correlation results is 57.5%, which is less than 85%, so q2 cannot serve as a keyword corresponding to the text to be analyzed. The keyword finally determined to correspond to the text to be analyzed is therefore q3, and the search engine displays the text to be analyzed to the user only when the user inputs keyword q3.
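The selection in S403 can be reproduced with the numbers of this example. The function name `keywords_for_text` and the data layout are hypothetical; the threshold of 85% and the use of the mean for a keyword linked to several parts of the text follow the description.

```python
from collections import defaultdict

# Saved correlation results (percent) between keywords and texts, as in the example.
RESULTS = {
    ("q1", "d1"): 80.0,
    ("q2", "d1"): 95.0,
    ("q2", "d2"): 20.0,
    ("q3", "d2"): 90.0,
}

def keywords_for_text(results, text_parts, threshold):
    """Return the keywords whose mean correlation result over the given
    text parts is greater than the threshold."""
    per_keyword = defaultdict(list)
    for (keyword, text), score in results.items():
        if text in text_parts:
            per_keyword[keyword].append(score)
    selected = []
    for keyword, scores in per_keyword.items():
        if sum(scores) / len(scores) > threshold:
            selected.append(keyword)
    return sorted(selected)

# The advertisement text maps to d1 (advertisement keyword) and d2 (title description).
selected = keywords_for_text(RESULTS, {"d1", "d2"}, 85.0)
```

The means are 80 for q1, 57.5 for q2 and 90 for q3, so only q3 passes the threshold, matching the conclusion above.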
In this embodiment, before a text is displayed to the user, the keywords corresponding to the text can be determined from the calculated correlation results between keywords and the text, namely those keywords whose correlation results meet the preset condition. When the user searches for a certain keyword, whether to display the text to the user is determined according to whether that keyword is one of the keywords corresponding to the text, so as to improve the user's search experience.
On the basis of the method for determining text relevance provided by the foregoing embodiments, this embodiment provides an apparatus for determining text relevance. FIG. 5 shows a structural block diagram of the apparatus. The apparatus is applied to a bipartite graph, the bipartite graph includes correspondences between keywords and the texts opened according to the keywords, and the apparatus includes a first extraction unit 501, a first calculation unit 502, a first propagation unit 503, a second calculation unit 504, and a first obtaining unit 505:
The first extraction unit 501 is configured to extract a first feature of a first keyword under a first dimension and a second feature of the first keyword under a second dimension, the first keyword being a keyword in the bipartite graph;
The first calculation unit 502 is configured to calculate a first feature vector of the first feature and a second feature vector of the second feature;
The first propagation unit 503 is configured to propagate, according to the correspondence between the first keyword and the text opened according to the first keyword, the first feature vector in the bipartite graph to obtain a first propagation vector of the first keyword under the first dimension, and to propagate the second feature vector in the bipartite graph to obtain a second propagation vector of the first keyword under the second dimension;
The second calculation unit 504 is configured to calculate, according to the first propagation vector, a first correlation feature between the first keyword and the text opened according to the first keyword under the first dimension, and to calculate, according to the second propagation vector, a second correlation feature between the first keyword and the text opened according to the first keyword under the second dimension;
The first obtaining unit 505 is configured to obtain, according to the first correlation feature and the second correlation feature, the correlation result between the first keyword and the text opened according to the first keyword.
Optionally, the first propagation unit includes a determining subunit and a first propagation subunit:
The determining subunit is configured to determine, according to open counts, the texts to propagate to from the correspondences between the first keyword and the texts opened according to the first keyword;
The first propagation subunit is configured to propagate the first feature vector in the bipartite graph according to the correspondence between the first keyword and the determined texts, to obtain the first propagation vector of the first keyword under the first dimension.
Optionally, the first obtaining unit includes a first obtaining subunit:
The first obtaining subunit is configured to obtain, through a logistic regression model and according to the first correlation feature and the second correlation feature, the correlation result between the first keyword and the text opened according to the first keyword.
Optionally, the apparatus further includes a second extraction unit, a third calculation unit, a second propagation unit, a fourth calculation unit, and a second obtaining unit:
The second extraction unit is configured to extract a third feature of a second keyword under the first dimension and a fourth feature of the second keyword under the second dimension, where some or all of the texts opened according to the second keyword are the same as the texts opened according to the first keyword;
The third calculation unit is configured to calculate a third feature vector of the third feature and a fourth feature vector of the fourth feature;
The second propagation unit is configured to propagate, according to the correspondence between the second keyword and the text opened according to the second keyword, the third feature vector in the bipartite graph to obtain a third propagation vector of the second keyword under the first dimension, and to propagate the fourth feature vector in the bipartite graph to obtain a fourth propagation vector of the second keyword under the second dimension;
The fourth calculation unit is configured to calculate, according to the third propagation vector, a third correlation feature between the second keyword and the first keyword under the first dimension, and to calculate, according to the fourth propagation vector, a fourth correlation feature between the second keyword and the first keyword under the second dimension;
The second obtaining unit is configured to obtain, according to the third correlation feature and the fourth correlation feature, the correlation result between the second keyword and the first keyword.
Optionally, the device further includes a third acquisition unit, a determination unit, and a fourth acquisition unit:
The third acquisition unit is configured to obtain a text to be analyzed;
The determination unit is configured to determine, according to the bipartite graph, the keywords that have correlation results with the text to be analyzed;
The fourth acquisition unit is configured to take the keywords whose correlation results meet a preset condition as the keywords corresponding to the text to be analyzed.
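Selecting the keywords for a text to be analyzed can be sketched as a lookup followed by a filter. The preset condition is not specified in the text; the score threshold and the `top_k` cut-off below are assumptions chosen for illustration, as is the function name `keywords_for_text`.

```python
def keywords_for_text(text_id, correlation_results, threshold=0.5, top_k=3):
    """Return up to `top_k` keywords whose correlation with `text_id`
    is at least `threshold`, strongest first.

    `correlation_results` maps (keyword, text) pairs to a score, as
    would be read off the bipartite graph.
    """
    candidates = [
        (keyword, score)
        for (keyword, text), score in correlation_results.items()
        if text == text_id and score >= threshold
    ]
    candidates.sort(key=lambda pair: pair[1], reverse=True)
    return [keyword for keyword, _ in candidates[:top_k]]


# Hypothetical correlation results.
results = {
    ("apple", "t1"): 0.9,
    ("fruit", "t1"): 0.6,
    ("car", "t1"): 0.2,
    ("apple", "t2"): 0.8,
}
selected = keywords_for_text("t1", results)
```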
As can be seen from the above technical solution, when analyzing the correlation between a keyword and a text in the bipartite graph, the correlation can be calculated according to multiple dimensions of the keyword. Specifically, the features of the first keyword under different dimensions can be extracted, and the feature vector corresponding to the feature under each dimension can be calculated. Features under different dimensions reflect the relevant information between the keyword and the text from different angles, so the correlation between the first keyword and the text opened according to the first keyword is calculated from the keyword's data in different dimensions. Propagating the feature vectors yields propagation vectors under the different dimensions, from which the correlation features under the different dimensions are calculated: one correlation feature reflects the degree of correlation between the keyword and the text under one dimension, and another reflects it under another dimension. Integrating the correlation features obtained under the different dimensions then yields the correlation result between the first keyword and the text opened according to the first keyword. Compared with the traditional approach, which calculates the correlation between a keyword and a text from only one dimension (for example, only the word-meaning dimension), the embodiments of the present application obtain the keyword's data in different dimensions and thereby provide more information for calculating the correlation. Because the correlation between the keyword and the text is reflected from different dimensions, the calculated correlation has higher confidence, the texts displayed according to that correlation more readily satisfy the user's search needs, and the user's search experience is improved.
Fig. 6 is a block diagram of a device 600 for determining text relevance according to an exemplary embodiment. For example, the device 600 may be a robot, a mobile phone, a computer, a digital broadcast terminal, a messaging device, a game console, a tablet device, a medical device, fitness equipment, a personal digital assistant, or the like.
Referring to Fig. 6, the device 600 may include one or more of the following components: a processing component 602, a memory 604, a power component 606, a multimedia component 608, an audio component 610, an input/output (I/O) interface 612, a sensor component 614, and a communication component 616.
The processing component 602 generally controls the overall operation of the device 600, such as operations associated with display, telephone calls, data communication, camera operations, and recording operations. The processing component 602 may include one or more processors 620 to execute instructions so as to perform all or part of the steps of the methods described above. In addition, the processing component 602 may include one or more modules that facilitate interaction between the processing component 602 and other components. For example, the processing component 602 may include a multimedia module to facilitate interaction between the multimedia component 608 and the processing component 602.
The memory 604 is configured to store various types of data to support operation of the device 600. Examples of such data include instructions for any application or method operated on the device 600, contact data, phonebook data, messages, pictures, videos, and so on. The memory 604 may be implemented by any type of volatile or non-volatile storage device or a combination thereof, such as static random access memory (SRAM), electrically erasable programmable read-only memory (EEPROM), erasable programmable read-only memory (EPROM), programmable read-only memory (PROM), read-only memory (ROM), magnetic memory, flash memory, a magnetic disk, or an optical disk.
The power component 606 supplies power to the various components of the device 600. The power component 606 may include a power management system, one or more power supplies, and other components associated with generating, managing, and distributing power for the device 600.
The multimedia component 608 includes a screen that provides an output interface between the device 600 and the user. In some embodiments, the screen may include a liquid crystal display (LCD) and a touch panel (TP). If the screen includes a touch panel, the screen may be implemented as a touch screen to receive input signals from the user. The touch panel includes one or more touch sensors to sense touches, slides, and gestures on the touch panel. The touch sensors may sense not only the boundary of a touch or slide action but also the duration and pressure associated with the touch or slide operation. In some embodiments, the multimedia component 608 includes a front camera and/or a rear camera. When the device 600 is in an operating mode, such as a shooting mode or a video mode, the front camera and/or the rear camera can receive external multimedia data. Each front camera and rear camera may be a fixed optical lens system or have focal length and optical zoom capability.
The audio component 610 is configured to output and/or input audio signals. For example, the audio component 610 includes a microphone (MIC), which is configured to receive external audio signals when the device 600 is in an operating mode, such as a call mode, a recording mode, or a voice recognition mode. The received audio signals may be further stored in the memory 604 or sent via the communication component 616. In some embodiments, the audio component 610 further includes a speaker for outputting audio signals.
The I/O interface 612 provides an interface between the processing component 602 and peripheral interface modules, which may be a keyboard, a click wheel, buttons, and the like. The buttons may include, but are not limited to, a home button, a volume button, a start button, and a lock button.
The sensor component 614 includes one or more sensors for providing status assessments of various aspects of the device 600. For example, the sensor component 614 can detect the open/closed state of the device 600 and the relative positioning of components, such as the display and keypad of the device 600. The sensor component 614 can also detect a change in position of the device 600 or of one of its components, the presence or absence of user contact with the device 600, the orientation or acceleration/deceleration of the device 600, and temperature changes of the device 600. The sensor component 614 may include a proximity sensor configured to detect the presence of nearby objects without any physical contact. The sensor component 614 may also include a light sensor, such as a CMOS or CCD image sensor, for use in imaging applications. In some embodiments, the sensor component 614 may also include an acceleration sensor, a gyroscope sensor, a magnetic sensor, a pressure sensor, or a temperature sensor.
The communication component 616 is configured to facilitate wired or wireless communication between the device 600 and other devices. The device 600 can access a wireless network based on a communication standard, such as WiFi, 2G, or 3G, or a combination thereof. In an exemplary embodiment, the communication component 616 receives a broadcast signal or broadcast-related information from an external broadcast management system via a broadcast channel. In an exemplary embodiment, the communication component 616 further includes a near-field communication (NFC) module to facilitate short-range communication. For example, the NFC module may be implemented based on radio frequency identification (RFID) technology, Infrared Data Association (IrDA) technology, ultra-wideband (UWB) technology, Bluetooth (BT) technology, and other technologies.
In an exemplary embodiment, the device 600 may be implemented by one or more application-specific integrated circuits (ASICs), digital signal processors (DSPs), digital signal processing devices (DSPDs), programmable logic devices (PLDs), field-programmable gate arrays (FPGAs), controllers, microcontrollers, microprocessors, or other electronic components for performing the above method.
In an exemplary embodiment, a non-transitory computer-readable storage medium including instructions is also provided, for example the memory 604 including instructions, which can be executed by the processor 620 of the device 600 to complete the above method. For example, the non-transitory computer-readable storage medium may be a ROM, a random access memory (RAM), a CD-ROM, a magnetic tape, a floppy disk, an optical data storage device, or the like.
A non-transitory computer-readable storage medium, where the instructions in the storage medium, when executed by a processor of a mobile terminal, enable the mobile terminal to perform a method for determining text relevance, the method comprising:
Extracting a first feature of a first keyword under a first dimension and a second feature under a second dimension, the first keyword being a keyword in the bipartite graph;
Calculating a first feature vector of the first feature and a second feature vector of the second feature;
Propagating the first feature vector in the bipartite graph according to the correspondence between the first keyword and the text opened according to the first keyword, obtaining a first propagation vector of the first keyword under the first dimension; and propagating the second feature vector in the bipartite graph, obtaining a second propagation vector of the first keyword under the second dimension;
Calculating, according to the first propagation vector, a first correlation feature between the first keyword and the text opened according to the first keyword under the first dimension; and calculating, according to the second propagation vector, a second correlation feature between the first keyword and the text opened according to the first keyword under the second dimension;
Obtaining, according to the first correlation feature and the second correlation feature, the correlation result between the first keyword and the text opened according to the first keyword.
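The steps above can be strung together in a compact end-to-end sketch. Everything beyond the order of the steps is an assumption made for illustration: the dimension names "word" and "click", the open-count-weighted averaging used for propagation, cosine similarity as the per-dimension correlation feature, and the logistic weights are not taken from the text.

```python
import math


def weighted_mean(pairs):
    """Weighted mean of (vector, weight) pairs."""
    total = sum(weight for _, weight in pairs)
    dim = len(pairs[0][0])
    return [sum(w * v[i] for v, w in pairs) / total for i in range(dim)]


def cosine(u, v):
    """Cosine similarity of two vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def relevance(keyword, text, edges, features, weights, bias):
    """Correlation result between `keyword` and `text`.

    edges:    {(keyword, text): open count}, i.e. the bipartite graph
    features: {dimension: {keyword: feature vector}}
    """
    correlation_features = []
    for dimension in sorted(features):
        vectors = features[dimension]
        # Propagate keyword vectors to the texts they opened.
        text_vectors = {
            t: weighted_mean(
                [(vectors[k], n) for (k, tt), n in edges.items() if tt == t]
            )
            for t in {t for _, t in edges}
        }
        # Propagate back to obtain the keyword's propagation vector.
        propagation = weighted_mean(
            [(text_vectors[t], n) for (k, t), n in edges.items() if k == keyword]
        )
        # Per-dimension correlation feature (assumed to be cosine similarity).
        correlation_features.append(cosine(propagation, text_vectors[text]))
    # Integrate the per-dimension features with a logistic function.
    z = bias + sum(w * f for w, f in zip(weights, correlation_features))
    return 1.0 / (1.0 + math.exp(-z))


# Hypothetical graph and features under two dimensions.
edges = {("apple", "t1"): 3, ("fruit", "t1"): 2, ("apple", "t2"): 1}
features = {
    "word": {"apple": [1.0, 0.0], "fruit": [0.0, 1.0]},
    "click": {"apple": [0.2, 0.8], "fruit": [0.5, 0.5]},
}
score = relevance("fruit", "t1", edges, features, weights=(1.0, 1.0), bias=0.0)
```

Because "fruit" opens only one text in this toy graph, its propagation vector coincides with that text's vector under both dimensions, so both correlation features are close to 1 and the fused score is high.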
Fig. 7 is a schematic structural diagram of a server in an embodiment of the present invention. The server 700 may vary considerably depending on configuration or performance, and may include one or more central processing units (CPUs) 722 (for example, one or more processors), a memory 732, and one or more storage media 730 (for example, one or more mass storage devices) storing application programs 742 or data 744. The memory 732 and the storage medium 730 may provide transient or persistent storage. The program stored in the storage medium 730 may include one or more modules (not shown in the figure), each of which may include a series of instruction operations on the server. Further, the central processing unit 722 may be configured to communicate with the storage medium 730 and execute, on the server 700, the series of instruction operations in the storage medium 730.
The server 700 may also include one or more power supplies 724, one or more wired or wireless network interfaces 750, one or more input/output interfaces 758, one or more keyboards 754, and/or one or more operating systems 741, such as Windows Server™, Mac OS X™, Unix™, Linux™, FreeBSD™, and so on.
It should be noted that the embodiments in this specification are described in a progressive manner; for the same or similar parts, the embodiments may refer to one another, and each embodiment focuses on its differences from the others. In particular, because the device and system embodiments are substantially similar to the method embodiments, they are described relatively briefly, and the relevant parts may refer to the description of the method embodiments. The device and system embodiments described above are merely illustrative: units described as separate components may or may not be physically separate, and components shown as units may or may not be physical units, that is, they may be located in one place or distributed across multiple network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of the embodiment. Those of ordinary skill in the art can understand and implement it without creative effort.
The above is only one specific embodiment of the present application, but the protection scope of the present application is not limited thereto. Any change or substitution that a person skilled in the art could readily conceive of within the technical scope disclosed in the present application shall fall within the protection scope of the present application. Therefore, the protection scope of the present application shall be subject to the protection scope of the claims.

Claims (10)

1. A method for determining text relevance, characterized in that the method is applied to a bipartite graph, the bipartite graph including correspondences between keywords and the texts opened according to the keywords, the method comprising:
Extracting a first feature of a first keyword under a first dimension and a second feature under a second dimension, the first keyword being a keyword in the bipartite graph;
Calculating a first feature vector of the first feature and a second feature vector of the second feature;
Propagating the first feature vector in the bipartite graph according to the correspondence between the first keyword and the text opened according to the first keyword, obtaining a first propagation vector of the first keyword under the first dimension; and propagating the second feature vector in the bipartite graph, obtaining a second propagation vector of the first keyword under the second dimension;
Calculating, according to the first propagation vector, a first correlation feature between the first keyword and the text opened according to the first keyword under the first dimension; and calculating, according to the second propagation vector, a second correlation feature between the first keyword and the text opened according to the first keyword under the second dimension;
Obtaining, according to the first correlation feature and the second correlation feature, the correlation result between the first keyword and the text opened according to the first keyword.
2. The method according to claim 1, characterized in that propagating the first feature vector in the bipartite graph according to the correspondence between the first keyword and the text opened according to the first keyword, obtaining the first propagation vector of the first keyword under the first dimension, comprises:
Determining, according to the number of opens, the texts to be propagated from the correspondence between the first keyword and the text opened according to the first keyword;
Propagating the first feature vector in the bipartite graph according to the correspondence between the first keyword and the determined texts to be propagated, obtaining the first propagation vector of the first keyword under the first dimension.
3. The method according to claim 1, characterized in that obtaining, according to the first correlation feature and the second correlation feature, the correlation result between the first keyword and the text opened according to the first keyword comprises:
Obtaining, through a logistic regression model and according to the first correlation feature and the second correlation feature, the correlation result between the first keyword and the text opened according to the first keyword.
4. The method according to claim 1, characterized in that the method further comprises:
Extracting a third feature of a second keyword under the first dimension and a fourth feature under the second dimension, where some or all of the texts opened according to the second keyword are the same as the texts opened according to the first keyword;
Calculating a third feature vector of the third feature and a fourth feature vector of the fourth feature;
Propagating the third feature vector in the bipartite graph according to the correspondence between the second keyword and the text opened according to the second keyword, obtaining a third propagation vector of the second keyword under the first dimension; and propagating the fourth feature vector in the bipartite graph, obtaining a fourth propagation vector of the second keyword under the second dimension;
Calculating, according to the third propagation vector, a third correlation feature between the second keyword and the first keyword under the first dimension; and calculating, according to the fourth propagation vector, a fourth correlation feature between the second keyword and the first keyword under the second dimension;
Obtaining the correlation result between the second keyword and the first keyword according to the third correlation feature and the fourth correlation feature.
5. The method according to any one of claims 1 to 4, characterized in that the method further comprises:
Obtaining a text to be analyzed;
Determining, according to the bipartite graph, the keywords that have correlation results with the text to be analyzed;
Taking the keywords whose correlation results meet a preset condition as the keywords corresponding to the text to be analyzed.
6. A device for determining text relevance, characterized in that the device is applied to a bipartite graph, the bipartite graph including correspondences between keywords and the texts opened according to the keywords, the device comprising a first extraction unit, a first computing unit, a first propagation unit, a second computing unit, and a first acquisition unit:
The first extraction unit is configured to extract a first feature of a first keyword under a first dimension and a second feature under a second dimension, the first keyword being a keyword in the bipartite graph;
The first computing unit is configured to calculate a first feature vector of the first feature and a second feature vector of the second feature;
The first propagation unit is configured to propagate the first feature vector in the bipartite graph according to the correspondence between the first keyword and the text opened according to the first keyword, obtaining a first propagation vector of the first keyword under the first dimension; and to propagate the second feature vector in the bipartite graph, obtaining a second propagation vector of the first keyword under the second dimension;
The second computing unit is configured to calculate, according to the first propagation vector, a first correlation feature between the first keyword and the text opened according to the first keyword under the first dimension; and to calculate, according to the second propagation vector, a second correlation feature between the first keyword and the text opened according to the first keyword under the second dimension;
The first acquisition unit is configured to obtain, according to the first correlation feature and the second correlation feature, the correlation result between the first keyword and the text opened according to the first keyword.
7. The device according to claim 6, characterized in that the first propagation unit includes a determining subunit and a first propagation subunit:
The determining subunit is configured to determine, according to the number of opens, the texts to be propagated from the correspondence between the first keyword and the text opened according to the first keyword;
The first propagation subunit is configured to propagate the first feature vector in the bipartite graph according to the correspondence between the first keyword and the determined texts to be propagated, obtaining the first propagation vector of the first keyword under the first dimension.
8. The device according to claim 6, characterized in that the first acquisition unit includes a first acquisition subunit:
The first acquisition subunit is configured to obtain, through a logistic regression model and according to the first correlation feature and the second correlation feature, the correlation result between the first keyword and the text opened according to the first keyword.
9. A processing device for determining text relevance, characterized by comprising a memory and one or more programs, wherein the one or more programs are stored in the memory and configured to be executed by one or more processors, the one or more programs including instructions for performing the following operations:
Extracting a first feature of a first keyword under a first dimension and a second feature under a second dimension, the first keyword being a keyword in the bipartite graph;
Calculating a first feature vector of the first feature and a second feature vector of the second feature;
Propagating the first feature vector in the bipartite graph according to the correspondence between the first keyword and the text opened according to the first keyword, obtaining a first propagation vector of the first keyword under the first dimension; and propagating the second feature vector in the bipartite graph, obtaining a second propagation vector of the first keyword under the second dimension;
Calculating, according to the first propagation vector, a first correlation feature between the first keyword and the text opened according to the first keyword under the first dimension; and calculating, according to the second propagation vector, a second correlation feature between the first keyword and the text opened according to the first keyword under the second dimension;
Obtaining, according to the first correlation feature and the second correlation feature, the correlation result between the first keyword and the text opened according to the first keyword.
10. A machine-readable medium having instructions stored thereon which, when executed by one or more processors, cause a device to perform the method for determining text relevance according to one or more of claims 1 to 5.
CN201711252358.3A 2017-12-01 2017-12-01 Text relevance determining method and device Active CN110019801B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201711252358.3A CN110019801B (en) 2017-12-01 2017-12-01 Text relevance determining method and device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201711252358.3A CN110019801B (en) 2017-12-01 2017-12-01 Text relevance determining method and device

Publications (2)

Publication Number Publication Date
CN110019801A (en) 2019-07-16
CN110019801B CN110019801B (en) 2021-03-23

Family

ID=67185941

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201711252358.3A Active CN110019801B (en) 2017-12-01 2017-12-01 Text relevance determining method and device

Country Status (1)

Country Link
CN (1) CN110019801B (en)

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104424279A (en) * 2013-08-30 2015-03-18 腾讯科技(深圳)有限公司 Text relevance calculating method and device
US20160378765A1 (en) * 2015-06-29 2016-12-29 Microsoft Technology Licensing, Llc Concept expansion using tables
CN106682095A (en) * 2016-12-01 2017-05-17 浙江大学 Subjectterm and descriptor prediction and ordering method based on diagram

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Dong Yuxin et al.: "A SimRank algorithm for uncertain graphs" (一种面向不确定图的SimRank算法), Journal of Harbin Engineering University (《哈尔滨工程大学学报》) *

Also Published As

Publication number Publication date
CN110019801B (en) 2021-03-23

Similar Documents

Publication Publication Date Title
CN107102746B (en) Candidate word generation method and device and candidate word generation device
CN106708282B (en) A kind of recommended method and device, a kind of device for recommendation
CN109189987A (en) Video searching method and device
CN109918669B (en) Entity determining method, device and storage medium
CN109800325A (en) Video recommendation method, device and computer readable storage medium
CN107992812A (en) A kind of lip reading recognition methods and device
CN107239535A (en) Similar pictures search method and device
CN110175223A (en) A kind of method and device that problem of implementation generates
CN113792207B (en) Cross-modal retrieval method based on multi-level feature representation alignment
CN109522419A (en) Session information complementing method and device
CN109493852A (en) A kind of evaluating method and device of speech recognition
CN108121736A (en) A kind of descriptor determines the method for building up, device and electronic equipment of model
CN108073606A (en) A kind of news recommends method and apparatus, a kind of device recommended for news
CN109933714A (en) A kind of calculation method, searching method and the relevant apparatus of entry weight
CN109471919A (en) Empty anaphora resolution method and device
CN116166843B (en) Text video cross-modal retrieval method and device based on fine granularity perception
CN111984749A (en) Method and device for ordering interest points
CN111611490A (en) Resource searching method, device, equipment and storage medium
CN108255940A (en) A kind of cross-language search method and apparatus, a kind of device for cross-language search
CN108304412A (en) A kind of cross-language search method and apparatus, a kind of device for cross-language search
CN109521888A (en) A kind of input method, device and medium
CN112307281A (en) Entity recommendation method and device
CN108572979A (en) A kind of position service method and device, a kind of device for location-based service
CN109977293A (en) A kind of calculation method and device of search result relevance
CN106156299B (en) The subject content recognition methods of text information and device

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant