Summary of the invention
The embodiments of this specification aim to provide a more effective text retrieval scheme that addresses deficiencies in the prior art.
To achieve the above object, one aspect of this specification provides a text retrieval method. The method is performed on the basis of a text material library obtained in advance, the text material library including multiple texts for a specific business. A predetermined number of text components are preset for the multiple texts, each text component corresponding to content of a respective type in the multiple texts. The method comprises:
obtaining a user input;
identifying, from the user input, the text components included therein; and
retrieving multiple texts from the text material library, based on the identified text components, through a pre-established inverted index table of the text material library that uses text components as index keys.
In one embodiment, identifying the text components included in the user input comprises identifying them from the user input by means of a pre-trained sequence labeling model.
In one embodiment, retrieving multiple texts from the text material library, based on the identified text components, through the pre-established inverted index table that uses text components as index keys comprises:
obtaining all non-empty subsets of the set of identified text components; and
for each subset, performing retrieval through the inverted index table to obtain a corresponding retrieval result, wherein the retrieval result for a subset is the intersection of the retrieval results obtained by using each text component in the subset as an index key, and each retrieval result is a list of text identifiers of the corresponding texts.
In one embodiment, the retrieving further comprises, after performing retrieval for each subset through the inverted index table and obtaining the corresponding retrieval results, sorting all text identifiers included in all of the retrieval results based on at least one of the following: the number of text components included in the subset corresponding to each text identifier, and the similarity between the text corresponding to each text identifier and the user input.
In one embodiment, sorting all the text identifiers comprises first performing a first-level sort based on the number of text components included in the subset corresponding to each text identifier, and then performing a second-level sort, within the first-level order, based on the similarity between the text corresponding to each text identifier and the user input.
In one embodiment, the method further comprises, after retrieving multiple texts from the text material library, displaying the retrieved texts to the user based on the ordering of all the text identifiers.
In one embodiment, displaying the retrieved texts to the user based on the ordering of all the text identifiers comprises, on a display page, displaying the retrieved texts to the user based on the ordering and, mixed in with them, also displaying multiple texts obtained by keyword retrieval on the user input.
In another aspect, this specification provides a method of constructing an inverted index table of a text material library. The text material library includes multiple texts for a specific business, and a predetermined number of text components are preset for the multiple texts, each text component corresponding to content of a respective type in the multiple texts. The method comprises:
for each text in the text material library, identifying from the text the text components included therein; and
constructing the inverted index table of the text material library based on the text components included in each text, wherein a first index key of the inverted index table is a first text component among the predetermined number of text components, and the index value corresponding to the first index key is the text identifiers of the texts that include the first text component.
In one embodiment, constructing the inverted index table of the text material library based on the text components included in each text comprises constructing the inverted index table based on both the text components and the keywords included in each text, wherein a second index key of the inverted index table is a first keyword, the index value corresponding to the second index key is the text identifiers of the texts that include the first keyword, the first keyword is a keyword included in the multiple texts, and, in the inverted index table, a predetermined marker indicates that the first index key corresponds to a text component.
In another aspect, this specification provides a text retrieval device. The device is implemented on the basis of a text material library obtained in advance, the text material library including multiple texts for a specific business. A predetermined number of text components are preset for the multiple texts, each text component corresponding to content of a respective type in the multiple texts. The device comprises:
an obtaining unit configured to obtain a user input;
an identification unit configured to identify, from the user input, the text components included therein; and
a retrieval unit configured to retrieve multiple texts from the text material library, based on the identified text components, through a pre-established inverted index table of the text material library that uses text components as index keys.
In one embodiment, the identification unit is further configured to identify the text components included in the user input by means of a pre-trained sequence labeling model.
In one embodiment, the retrieval unit further comprises:
an obtaining subunit configured to obtain all non-empty subsets of the set of identified text components; and
a retrieval subunit configured to perform, for each subset, retrieval through the inverted index table to obtain a corresponding retrieval result, wherein the retrieval result for a subset is the intersection of the retrieval results obtained by using each text component in the subset as an index key, and each retrieval result is a list of text identifiers of the corresponding texts.
In one embodiment, the retrieval unit further comprises:
a sorting subunit configured to, after retrieval has been performed for each subset through the inverted index table and the corresponding retrieval results have been obtained, sort all text identifiers included in all of the retrieval results based on at least one of the following: the number of text components included in the subset corresponding to each text identifier, and the similarity between the text corresponding to each text identifier and the user input.
In one embodiment, the sorting subunit is further configured to first perform a first-level sort on all the text identifiers based on the number of text components included in the subset corresponding to each text identifier, and then perform a second-level sort, within the first-level order, based on the similarity between the text corresponding to each text identifier and the user input.
In one embodiment, the device further comprises a display unit configured to, after multiple texts have been retrieved from the text material library, display the retrieved texts to the user based on the ordering of all the text identifiers.
In one embodiment, the display unit is further configured to, on a display page, display the retrieved texts to the user based on the ordering and, mixed in with them, also display multiple texts obtained by keyword retrieval on the user input.
In another aspect, this specification provides a device for constructing an inverted index table of a text material library. The text material library includes multiple texts for a specific business, and a predetermined number of text components are preset for the multiple texts, each text component corresponding to content of a respective type in the multiple texts. The device comprises:
an identification unit configured to identify, for each text in the text material library, the text components included in the text; and
a construction unit configured to construct the inverted index table of the text material library based on the text components included in each text, wherein a first index key of the inverted index table is a first text component among the predetermined number of text components, and the index value corresponding to the first index key is the text identifiers of the texts that include the first text component.
In one embodiment, the construction unit is further configured to construct the inverted index table of the text material library based on both the text components and the keywords included in each text, wherein a second index key of the inverted index table is a first keyword, the index value corresponding to the second index key is the text identifiers of the texts that include the first keyword, the first keyword is a keyword included in the multiple texts, and, in the inverted index table, a predetermined marker indicates that the first index key corresponds to a text component.
In another aspect, this specification provides a computer-readable storage medium storing a computer program which, when executed on a computer, causes the computer to perform any of the above methods.
In another aspect, this specification provides a computing device comprising a memory and a processor, wherein the memory stores executable code and the processor, when executing the executable code, implements any of the above methods.
With the text retrieval scheme according to the embodiments of this specification, retrieval is performed based on text components, or based on both text components and text keywords, which satisfies both the accuracy and the diversity of retrieval.
Detailed description
Embodiments of this specification are described below with reference to the accompanying drawings.
Fig. 1 shows a text retrieval system 100 according to an embodiment of this specification. As shown, the system 100 includes an index construction unit 11, a retrieval unit 12, a sorting unit 13 and a display unit 14. The index construction unit 11 constructs the inverted index of the text material library. In the embodiments of this specification, the inverted index may be based on text components alone, or on both text components and keywords. To build the text component inverted index, the text components of each text in the material library can be extracted by a pre-trained sequence labeling model, and an inverted index table is then constructed from the text components of each text, with a text component as the key and the text identifiers of the texts containing that component as the value. The retrieval unit 12 performs retrieval based on a user input. The user is typically a person who writes texts and who, when writing a text for a specific business, inputs to the retrieval unit 12 keywords related to the text to be written, so as to retrieve at least one text from the text material library for that business for reference. After receiving the user input, the retrieval unit 12 can perform retrieval using text components as index keys, or can simultaneously perform retrieval using text components as index keys and retrieval using keywords as index keys, to obtain retrieval results. The retrieval unit 12 can then send the retrieval results to the sorting unit 13, which sorts them according to a predetermined sorting criterion to obtain ranked retrieval results, and can send the top-ranked texts to the display unit 14 for display to the user.
Each of the above processes is described below.
Fig. 2 shows a text retrieval method according to an embodiment of this specification. The method is performed on the basis of a text material library obtained in advance, the text material library including multiple texts for a specific business. A predetermined number of text components are preset for the multiple texts, each text component corresponding to content of a respective type in the multiple texts. The method comprises:
Step S202, obtaining a user input;
Step S204, identifying, from the user input, the text components included therein; and
Step S206, retrieving multiple texts from the text material library, based on the identified text components, through a pre-established inverted index table of the text material library that uses text components as index keys.
The text retrieval method can be used, for example, on an intelligent creative platform to help users write copy, and can be executed, for example, by a copy search engine on the platform. After receiving the search keywords input by a user, the search engine performs retrieval in the text material library and returns the retrieval results. As described above, the texts in the text material library are texts for a specific business, such as a marketing business, an advertising business or a publicity business. The key elements of the texts in the library are therefore relatively fixed, that is, the texts usually share similar content or structural features, so the keywords they include can be abstracted into components. For example, where the specific business is a marketing business, the multiple texts in the text material library are marketing copy, which may have been accumulated by the creative platform itself or obtained through external channels. Typical marketing copy includes, for example, "** Supermarket, Spring Festival promotion, 50% off alcoholic drinks" and "Pay with Alipay, 1 yuan cashback on every order". Abstracting such marketing copy into components, its content can be grouped into, for example, eight components: brand (e.g., "** Supermarket", "Alipay"), action (e.g., "scan code", "register", "pay"), amount (e.g., a discount amount or cashback amount, such as the "1 yuan" above), discount (e.g., the "50% off" above), gift (e.g., "iphonex"), holiday (e.g., Christmas, Spring Festival), activity scenario (e.g., online, offline) and activity location (e.g., India, China). It will be understood that the specific business targeted by the text material library is not limited to marketing, advertising or publicity and may be any of various other businesses; component abstraction can then be performed based on the features of the texts for that business to obtain the predetermined number of text components.
Each step shown in Fig. 2 is described below in detail.
In step S202, a user input is obtained.
The user can input search keywords to the above search engine, so that the search engine obtains the user input and executes the method based on it. The user input can be any text. For example, it can be a list of keywords related to the current operational activity, with the keywords separated by spaces, such as "Alipay cashback 1 yuan".
In step S204, the text components included in the user input are identified from the user input.
The text components are those preset for the text material library as described above, such as the eight text components preset for the marketing text material library. The user input can be fed into a pre-trained sequence labeling model, which outputs the text components included in the user input. The sequence labeling model can be a BiLSTM+CRF model, an HMM or CRF model, or even a model based on rule knowledge and dictionaries. The sequence labeling model can be trained with multiple texts that also belong to the specific business: for example, multiple texts of the specific business are obtained and each is annotated with the text components it includes, yielding multiple training samples with which the model is trained. Once the trained sequence labeling model is available, the above user input "Alipay cashback 1 yuan", for example, can be fed into it in sequence, and the model outputs, for example, the text component set {brand, amount}. When the user input contains so much content that the text components identified from it include duplicates, the identified text components can also be deduplicated to obtain the final set of text components. For example, if the user input includes several amount-related items, such as "1 yuan cashback, 5 yuan red envelope", the sequence labeling model identifies two "amount" components, one of which is removed.
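As a rough illustration of identification followed by deduplication, the sketch below stands in for the trained sequence labeling model with a dictionary-and-rule lookup, which the description allows as one option. The lexicon, the regular expressions and the function name `identify_components` are assumptions made for this example only and are not part of the specification.

```python
import re

# Hypothetical lexicon and patterns; in practice a trained sequence labeling
# model (e.g. BiLSTM+CRF) would produce the labels instead of this lookup.
BRAND_LEXICON = {"alipay"}
AMOUNT_PATTERN = re.compile(r"\d+(?:\.\d+)?\s*yuan")
DISCOUNT_PATTERN = re.compile(r"\d+(?:\.\d+)?\s*%\s*off")


def identify_components(user_input: str) -> list[str]:
    """Return the deduplicated text components found in the user input."""
    text = user_input.lower()
    labels = []
    # Dictionary lookup for brands.
    for word in re.findall(r"[a-z]+", text):
        if word in BRAND_LEXICON:
            labels.append("brand")
    # One label per matched amount or discount expression.
    labels += ["amount"] * len(AMOUNT_PATTERN.findall(text))
    labels += ["discount"] * len(DISCOUNT_PATTERN.findall(text))
    # Deduplicate while preserving first-seen order.
    return list(dict.fromkeys(labels))
```

With this sketch, an input containing two amounts yields a single "amount" component, mirroring the deduplication described above.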
In step S206, based on the identified text components, multiple texts are retrieved from the text material library through a pre-established inverted index table of the text material library that uses text components as index keys.
The construction of the inverted index table that uses text components as index keys is described in detail below. In the inverted index table, each of the preset predetermined number of text components serves as an index key (key), and the text identifiers of the texts containing that text component serve as the index value (value) corresponding to that key.
Multiple texts can be retrieved from the text material library, based on the identified text components and through the above inverted index table, by a variety of schemes. One specific retrieval scheme is described below as an example.
For example, all non-empty subsets of the above text component set {brand, amount} can first be obtained. In general, a set of $n$ elements has $2^n - 1$ non-empty subsets, so the above set {brand, amount} yields three non-empty subsets: {brand, amount}, {brand} and {amount}. Retrieval can then be performed for each subset to obtain the corresponding retrieval result. For example, for the subset {brand, amount}, retrieval can be performed with the text components "brand" and "amount" as index keys respectively, and the intersection of the two retrieval results is taken as the retrieval result for the subset. For the subset {brand}, retrieval can be performed with the text component "brand" as the index key, and the result is taken as the retrieval result for the subset. Each retrieval result consists of the text identifiers corresponding to the respective text component. For example, the retrieval results for the subsets may be as shown in Table 1 below:
Table 1
In Table 1, the numbers 1 to 8 are the text identifiers of the corresponding texts, each corresponding to one of 8 different texts in the text material library. For example, identifier 4 corresponds to the text "Pay with Alipay, 1 yuan cashback on every order" in the material library. Since the components of this text are {brand, action, amount}, that is, it includes both "brand" and "amount", it appears in the retrieval results of all three subsets {brand, amount}, {brand} and {amount}.
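The subset-based retrieval described above can be sketched as follows. The posting lists in `INDEX` are assumed values chosen for illustration, and the function names are hypothetical; only the procedure (enumerate the non-empty subsets, then intersect the postings of each subset's components) comes from the description.

```python
from itertools import combinations


def nonempty_subsets(components):
    """Yield every non-empty subset of the components, largest subsets first."""
    items = sorted(components)
    for size in range(len(items), 0, -1):
        for combo in combinations(items, size):
            yield frozenset(combo)


def retrieve_by_subsets(index, components):
    """Map each non-empty subset to the intersection of its components' postings."""
    results = {}
    for subset in nonempty_subsets(components):
        postings = [set(index.get(component, ())) for component in subset]
        results[subset] = sorted(set.intersection(*postings))
    return results


# Assumed posting lists: text component -> text identifiers.
INDEX = {"brand": [1, 2, 3, 4, 5], "amount": [4, 6, 7, 8]}
RESULTS = retrieve_by_subsets(INDEX, {"brand", "amount"})
```

For two identified components this produces three retrieval results, in line with the $2^n - 1$ count of non-empty subsets.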
After the retrieval results for the subsets have been obtained, a set of triples $(i, p_i, s_i)$ can be formed from the text identifiers they contain, where $i$ is a text identifier. $p_i$ is the number of elements in the subset to which the text identifier corresponds and can be regarded as a sorting priority value; for example, the priority value of a text identifier retrieved for the subset {brand, amount} is 2, and that of a text identifier retrieved for the subset {brand} is 1. For a text identifier that is repeated in Table 1, such as "4", the triples with smaller $p_i$ can be removed, leaving only the triple with the largest $p_i$. $s_i$ is the similarity between the text corresponding to the text identifier and the user input, and can be computed, for example, with the Jaccard coefficient.
Thus, for example, the triples shown in Table 2 can be obtained from Table 1:
(4,2,0.8)
(1,1,0.7)
(2,1,0.6)
(3,1,0.5)
(4,1,0.8) (deleted)
(5,1,0.2)
(4,1,0.8) (deleted)
(6,1,0.9)
(7,1,0.4)
(8,1,0.3)
Table 2
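A minimal sketch of how such triples might be assembled is given below. The per-subset results are assumed for illustration, the similarity values are copied from Table 2, and the whitespace tokenization inside `jaccard` is an assumption of this example (Chinese copy would more likely be compared on characters or segmented words). The function names are hypothetical.

```python
def jaccard(a: str, b: str) -> float:
    """Jaccard coefficient of the whitespace-token sets of two strings."""
    tokens_a, tokens_b = set(a.lower().split()), set(b.lower().split())
    union = tokens_a | tokens_b
    return len(tokens_a & tokens_b) / len(union) if union else 0.0


def build_triples(results_by_subset, similarity_by_id):
    """Return one (i, p_i, s_i) triple per text identifier, keeping the largest p_i."""
    best_priority = {}
    for subset, text_ids in results_by_subset.items():
        for text_id in text_ids:
            # p_i is the size of the largest subset that retrieved this text.
            best_priority[text_id] = max(best_priority.get(text_id, 0), len(subset))
    return [(i, p, similarity_by_id[i]) for i, p in sorted(best_priority.items())]


# Per-subset results assumed for illustration; similarities copied from Table 2.
RESULTS = {
    frozenset({"brand", "amount"}): [4],
    frozenset({"brand"}): [1, 2, 3, 4, 5],
    frozenset({"amount"}): [4, 6, 7, 8],
}
SIMILARITY = {1: 0.7, 2: 0.6, 3: 0.5, 4: 0.8, 5: 0.2, 6: 0.9, 7: 0.4, 8: 0.3}
TRIPLES = build_triples(RESULTS, SIMILARITY)
```

Keeping only the maximum $p_i$ per identifier has the same effect as deleting the two lower-priority triples for identifier 4.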
Each text identifier can then be sorted based on Table 2. First, a first-level sort can be performed on the identifiers based on their priority values; that is, the triple (4,2,0.8), whose priority value is 2, is placed first, and the triples whose priority value is 1 are placed after it. Then, a second-level sort can be performed on the triples whose priority value is 1 based on the similarity in each triple, giving the sorted triples shown in Table 3:
(4,2,0.8)
(6,1,0.9)
(1,1,0.7)
(2,1,0.6)
(3,1,0.5)
(7,1,0.4)
(8,1,0.3)
(5,1,0.2)
Table 3
Once Table 3, that is, the ranked result list, has been obtained, the search engine can display the corresponding texts on the display page according to the ranking. For example, if the page is preset to display 5 texts, the five texts are displayed on the page in the order 4, 6, 1, 2, 3.
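The two-level sort can be expressed as a single sort on a composite key, as in the sketch below. The triples are those of Table 2 with the deleted entries removed; the variable names and the page size constant are illustrative.

```python
# (text identifier, priority value p_i, similarity s_i), from Table 2
# after the deleted triples for identifier 4 have been removed.
triples = [
    (4, 2, 0.8), (1, 1, 0.7), (2, 1, 0.6), (3, 1, 0.5),
    (5, 1, 0.2), (6, 1, 0.9), (7, 1, 0.4), (8, 1, 0.3),
]

# First level: priority value, descending. Second level: similarity, descending.
ranked = sorted(triples, key=lambda t: (-t[1], -t[2]))

# Texts shown when the page is preset to display 5 texts.
PAGE_SIZE = 5
top_ids = [text_id for text_id, _, _ in ranked[:PAGE_SIZE]]
```

Sorting on the tuple `(-p_i, -s_i)` reproduces the order of Table 3 and the displayed order 4, 6, 1, 2, 3.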
It will be understood that the above retrieval scheme is only illustrative. The embodiments of this specification are not limited to it, and any retrieval scheme that may occur to those skilled in the art can be used. For example, after the text component set {brand, amount} is obtained, it need not be split into three subsets that are retrieved separately; instead, retrieval can be performed directly with the components "brand" and "amount" as index keys respectively, and the texts in the intersection of the two retrieval results can be sorted by similarity to obtain the ranked retrieval results. Further examples are not listed here.
In one embodiment, in addition to the retrieval based on text component index keys described above, retrieval is also performed on the user input based on the keywords it contains. Specifically, keywords are first extracted from the user input and processed by, for example, stop-word filtering and deduplication to obtain a keyword set. Then, based on the keyword set and through a pre-established index table of the text material library that uses keywords as index keys, a ranked result list corresponding to the user input can be retrieved in a manner similar to the retrieval scheme described above. The two kinds of retrieval results can then be displayed mixed together on the display page in a predetermined ratio. For example, if the display page has 10 preset display slots, the two result lists can be set to be mixed in a ratio of 5:5; for example, the top 5 texts can be taken from each of the two result lists and displayed mixed together on the display page.
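One possible way to mix the two ranked lists is sketched below. The description fixes only the ratio (for example 5:5 over 10 slots); alternating between the two lists and skipping texts that appear in both are assumptions of this example, as are the function and parameter names.

```python
def mix_results(component_ranked, keyword_ranked, component_slots=5, keyword_slots=5):
    """Interleave the top entries of two ranked lists, skipping duplicates."""
    head_a = component_ranked[:component_slots]
    head_b = keyword_ranked[:keyword_slots]
    mixed = []
    for position in range(max(len(head_a), len(head_b))):
        for head in (head_a, head_b):
            if position < len(head) and head[position] not in mixed:
                mixed.append(head[position])
    return mixed


# Hypothetical ranked identifier lists from the two retrieval paths.
component_list = [4, 6, 1, 2, 3, 7]
keyword_list = [10, 11, 12, 13, 14, 15]
display_order = mix_results(component_list, keyword_list)
```

Other mixing policies, such as showing one list's block before the other, would satisfy the same ratio; interleaving simply keeps both kinds of result visible near the top of the page.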
Fig. 3 schematically shows the process by which the search engine performs retrieval based on the above two kinds of index keys. As shown in Fig. 3, the left side corresponds to keyword-based retrieval and the right side to text-component-based retrieval. Specifically, in step S31, the user input is obtained. In step S32, keywords are extracted from the user input; in step S34, keyword subsets are generated from the extracted keywords; in step S36, retrieval is performed through the keyword index table based on the keyword subsets, and the retrieval results are sorted to obtain a first result list. In step S33, text components are identified from the user input; in step S35, text component subsets are generated from the identified text components; in step S37, retrieval is performed through the text component index table based on the text component subsets, and the retrieval results are sorted to obtain a second result list. The dashed box in the figure indicates that steps S36 and S37 can be performed through the same inverted index table, which contains both index keys that are text components and index keys that are keywords. In step S38, the top-ranked texts of the two result lists are mixed in a predetermined ratio; and in step S39, the mixed texts are output for display on the display page.
Fig. 4 shows a method of constructing an inverted index table of a text material library according to an embodiment of this specification. The text material library includes multiple texts for a specific business, and a predetermined number of text components are preset for the multiple texts, each text component corresponding to content of a respective type in the multiple texts. The method comprises:
Step S402, for each text in the text material library, identifying from the text the text components included therein; and
Step S404, constructing the inverted index table of the text material library based on the text components included in each text, wherein a first index key of the inverted index table is a first text component among the predetermined number of text components, and the index value corresponding to the first index key is the text identifiers of the texts that include the first text component.
Specifically, in step S402, for each text in the text material library, the text components included in the text are identified from the text. For this step, reference can be made to the description of step S204 above: each text in the text material library can be fed into the pre-trained sequence labeling model to identify the set of text components corresponding to that text. For example, for the material library text "Pay with Alipay, 1 yuan cashback on every order", feeding it into the sequence labeling model identifies the text component set {brand, action, amount}, where "Alipay" corresponds to "brand", "pay" to "action" and "1 yuan" to "amount".
In step S404, the inverted index table of the text material library is constructed based on the text components included in each text, wherein a first index key of the inverted index table is a first text component among the predetermined number of text components, and the index value corresponding to the first index key is the text identifiers of the texts that include the first text component.
The inverted index table is a key-value table with a mapping (map) structure, in which each key is a text component and each value is the text identifiers of the texts containing that text component. For example, where the text material library is a marketing copy material library, the keys of the inverted index table include, for example, the above eight text components. For the key "brand", for example, the corresponding value is the copy identifiers of the pieces of copy in the copy material library that include a "brand" component.
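A minimal sketch of building such a key-value table is shown below, assuming the text components of each text have already been identified in step S402. The function name and the sample data are hypothetical.

```python
from collections import defaultdict


def build_inverted_index(components_by_text):
    """Map each text component to the sorted identifiers of the texts containing it."""
    index = defaultdict(list)
    # Visiting identifiers in ascending order keeps every posting list sorted.
    for text_id in sorted(components_by_text):
        for component in components_by_text[text_id]:
            index[component].append(text_id)
    return dict(index)


# Hypothetical output of step S402: text identifier -> identified components.
COMPONENTS_BY_TEXT = {
    4: {"brand", "action", "amount"},
    1: {"brand"},
    6: {"amount"},
}
INVERTED_INDEX = build_inverted_index(COMPONENTS_BY_TEXT)
```

Looking up `INVERTED_INDEX["brand"]` then returns the identifiers of all texts with a "brand" component, which is the lookup used in step S206.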
In one embodiment, as shown in Fig. 3, retrieval is performed with the two kinds of index keys in combination. In this case, the search engine can build two index tables from the copy material library, one for each kind of index key, or can include both kinds of index keys in a single index table. In the latter case, the two kinds of index keys can be distinguished by a predetermined marker. For example, the index key "brand" used as a text component can be marked as "#brand#" to distinguish it from the index key "brand" used as a keyword. Then, when retrieving with the keyword "brand" as the index key, the search engine obtains the text identifiers corresponding to "brand" in the index table, and when retrieving with the text component "brand" as the index key, it obtains the text identifiers corresponding to "#brand#".
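The single-table variant can be sketched as below, using the "#...#" marker from the description to keep component keys apart from keyword keys. The function names, the input layout and the sample texts are assumptions made for this example.

```python
from collections import defaultdict


def component_key(component: str) -> str:
    """Wrap a text component name in the predetermined marker, e.g. '#brand#'."""
    return f"#{component}#"


def build_combined_index(texts):
    """texts maps text_id -> (keywords, components); returns one shared index."""
    index = defaultdict(list)
    for text_id in sorted(texts):
        keywords, components = texts[text_id]
        for keyword in keywords:
            index[keyword].append(text_id)
        for component in components:
            index[component_key(component)].append(text_id)
    return dict(index)


def lookup(index, term, as_component=False):
    """Look a term up as a keyword, or as a text component when as_component is True."""
    key = component_key(term) if as_component else term
    return index.get(key, [])


# Hypothetical texts: identifier -> (extracted keywords, identified components).
TEXTS = {
    1: ({"brand", "story"}, {"action"}),
    4: ({"alipay", "cashback"}, {"brand", "action", "amount"}),
}
COMBINED_INDEX = build_combined_index(TEXTS)
```

Here text 1 contains the literal keyword "brand" while text 4 contains a "brand" component, so the two lookups return different identifiers even though they share one table.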
Fig. 5 shows the process of building an inverted index table that includes both kinds of index keys according to an embodiment of this specification. As shown in Fig. 5, in step S51, keywords are extracted from each text in the copy material library; in step S52, text components are identified in each text in the copy material library; in step S53, the inverted index table is built based on the keywords and text components of each text, yielding the inverted index table in the figure that includes both keyword index keys and text component index keys.
Fig. 6 shows a text retrieval device 600 according to an embodiment of this specification. The device is implemented on the basis of a text material library obtained in advance, the text material library including multiple texts for a specific business. A predetermined number of text components are preset for the multiple texts, each text component corresponding to content of a respective type in the multiple texts. The device comprises:
an obtaining unit 61 configured to obtain a user input;
an identification unit 62 configured to identify, from the user input, the text components included therein; and
a retrieval unit 63 configured to retrieve multiple texts from the text material library, based on the identified text components, through a pre-established inverted index table of the text material library that uses text components as index keys.
In one embodiment, the identification unit 62 is further configured to identify the text components included in the user input by means of a pre-trained sequence labeling model.
In one embodiment, the retrieval unit 63 further includes:
an obtaining subunit 631, configured to obtain all nonempty subsets of the set of identified text
components; and
a retrieval subunit 632, configured to perform, for each subset, retrieval through the inverted
index table to obtain a corresponding search result, wherein the search result corresponding to a
subset is the intersection of the search results obtained by retrieving with each text component
in the subset as an index key, and each search result is a list of text identifiers of the
corresponding texts.
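The subset enumeration and per-subset intersection can be sketched as follows. The function names and the toy index are hypothetical, and the index is assumed to map each text component to a set of text identifiers.

```python
from itertools import combinations

def nonempty_subsets(components):
    """Yield every nonempty subset of the identified components."""
    items = sorted(set(components))
    for size in range(1, len(items) + 1):
        for combo in combinations(items, size):
            yield frozenset(combo)

def retrieve(index, components):
    """For each subset, intersect the posting sets of its components."""
    results = {}
    for subset in nonempty_subsets(components):
        postings = [index.get(component, set()) for component in subset]
        results[subset] = set.intersection(*postings)
    return results

# Hypothetical index: text component -> identifiers of texts containing it.
toy_index = {"A": {"t1", "t2"}, "B": {"t1"}, "C": {"t3"}}
results = retrieve(toy_index, ["A", "B"])
```

Note that n identified components produce 2^n - 1 subsets, so this approach is practical only when the number of identified components is small.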
In one embodiment, the retrieval unit 63 further includes:
a sorting subunit 633, configured to, after retrieval is performed through the inverted index
table for each subset and the corresponding search results are obtained, sort all text
identifiers included in all of the search results based on at least one of the following: the
number of text components included in the subset corresponding to each text identifier, and the
similarity between the text corresponding to each text identifier and the user input.
In one embodiment, the sorting subunit 633 is further configured to first perform first-level
sorting of all the text identifiers based on the number of text components included in the subset
corresponding to each text identifier, and then perform second-level sorting within each
first-level group based on the similarity between the text corresponding to each text identifier
and the user input.
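The two-level sorting can be sketched as below, under two assumptions that the specification does not fix: when a text identifier appears in the results of several subsets, the largest such subset is taken as its corresponding subset, and the similarity is approximated with the character-level ratio from Python's standard difflib module. The function name and sample data are hypothetical.

```python
from difflib import SequenceMatcher

def rank(results, texts, user_input):
    """Sort text ids by matched-component count, then by similarity to the input."""
    best = {}
    for subset, text_ids in results.items():
        for text_id in text_ids:
            # Assumption: use the largest subset in which the text appears.
            best[text_id] = max(best.get(text_id, 0), len(subset))

    def similarity(text_id):
        # Stand-in similarity measure; any text similarity could be used here.
        return SequenceMatcher(None, texts[text_id], user_input).ratio()

    # First level: more matched components first; second level: higher similarity first.
    return sorted(best, key=lambda t: (-best[t], -similarity(t)))

sample_results = {
    frozenset({"A"}): {"t1", "t2", "t3"},
    frozenset({"A", "B"}): {"t1"},
}
sample_texts = {
    "t1": "pay now get red envelope",
    "t2": "pay now",
    "t3": "totally unrelated words",
}
ranking = rank(sample_results, sample_texts, "pay now")
```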
In one embodiment, the device 600 further includes a display unit 64, configured to, after the
multiple texts are retrieved from the text material library, display the retrieved texts to the
user based on the sorting of all the text identifiers.
In one embodiment, the display unit 64 is further configured to display on the display page, in
addition to the retrieved texts shown to the user based on the sorting, multiple texts obtained by
performing keyword retrieval on the user input, the two groups of texts being displayed mixed
together.
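The specification does not say how the two groups of results are mixed. One possible scheme, shown here purely as an assumption, is to interleave the two ranked lists in round-robin order while dropping duplicates; the function name and sample identifiers are hypothetical.

```python
from itertools import zip_longest

def mix(component_hits, keyword_hits):
    """Interleave two ranked lists of text ids, keeping the first occurrence of each id."""
    seen, merged = set(), []
    for pair in zip_longest(component_hits, keyword_hits):
        for text_id in pair:
            # zip_longest pads the shorter list with None.
            if text_id is not None and text_id not in seen:
                seen.add(text_id)
                merged.append(text_id)
    return merged

page = mix(["t1", "t2", "t3"], ["t2", "t4"])
```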
Fig. 7 shows a device 700 for constructing an inverted index table of a text material library
according to an embodiment of this specification. The text material library includes multiple
texts for a specific business. A predetermined number of text components are preset in the device
for the multiple texts, each text component corresponding to content of a respective type in the
multiple texts. The device includes:
a recognition unit 71, configured to identify, for each text in the text material library, the
text components included in that text; and
a construction unit 72, configured to construct the inverted index table of the text material
library based on the text components included in each text, wherein a first index key of the
inverted index table is a first text component among the predetermined number of text components,
and the index value corresponding to the first index key is the text identifiers of the texts that
include the first text component.
In one embodiment, the construction unit is further configured to construct the inverted index
table of the text material library based on the text components included in each text and the
keywords included in each text, wherein a second index key of the inverted index table is a first
keyword, and the index value corresponding to the second index key is the text identifiers of the
texts that include the first keyword, wherein the first keyword is a keyword included in the
multiple texts, and wherein, in the inverted index table, a predetermined mark indicates that the
first index key corresponds to a text component.
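One simple way to realize such a predetermined mark, offered only as an assumption, is a reserved string prefix on component keys, so that a text component and a keyword with the same surface form occupy different entries. The prefix value and helper names below are hypothetical, and the scheme presumes that no keyword begins with the prefix.

```python
COMPONENT_MARK = "#C:"  # hypothetical reserved prefix marking text-component keys

def component_key(value):
    """Build the index key for a text component."""
    return COMPONENT_MARK + value

def is_component_key(key):
    """Tell whether an index key corresponds to a text component."""
    return key.startswith(COMPONENT_MARK)

# The same surface string is indexed once as a component and once as a keyword.
marked_index = {
    component_key("red envelope"): ["t1", "t3"],
    "red envelope": ["t1"],
}
component_keys = [key for key in marked_index if is_component_key(key)]
```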
In another aspect, this specification provides a computer-readable storage medium on which a
computer program is stored, wherein, when the computer program is executed in a computer, the
computer is caused to perform any of the above methods.
In another aspect, this specification provides a computing device including a memory and a
processor, wherein executable code is stored in the memory, and when the processor executes the
executable code, any of the above methods is implemented.
With the text retrieval scheme according to the embodiments of this specification, retrieval is
performed based on text components, or based on both text components and text keywords, so that
texts that are literally identical can be recalled, and texts that are literally different but
have the same text components can also be recalled. This satisfies both the accuracy and the
diversity of retrieval, meets the need for variety and richness in text writing, and avoids the
user fatigue caused by repeatedly delivering the same text.
It should be understood that descriptions such as "first" and "second" herein are used only for
simplicity of description and to distinguish similar concepts, and have no other limiting effect.
The embodiments in this specification are described in a progressive manner. For the same or
similar parts of the embodiments, reference may be made to one another, and each embodiment
focuses on its differences from the other embodiments. In particular, since the system
embodiments are substantially similar to the method embodiments, they are described relatively
briefly; for relevant details, refer to the description of the method embodiments.
Specific embodiments of this specification have been described above. Other embodiments fall
within the scope of the appended claims. In some cases, the actions or steps recited in the claims
may be performed in an order different from that in the embodiments and still achieve the desired
results. In addition, the processes depicted in the drawings do not necessarily require the
particular order shown, or a sequential order, to achieve the desired results. In some
embodiments, multitasking and parallel processing are also possible or may be advantageous.
Those of ordinary skill in the art should further appreciate that the units and algorithm steps of
the examples described in connection with the embodiments disclosed herein can be implemented in
electronic hardware, computer software, or a combination of the two. To clearly illustrate the
interchangeability of hardware and software, the composition and steps of each example have been
described above generally in terms of function. Whether these functions are executed in hardware
or in software depends on the specific application and design constraints of the technical
solution. Those of ordinary skill in the art may use different methods to implement the described
functions for each specific application, but such implementation should not be considered to go
beyond the scope of the present application.
The steps of the methods or algorithms described in connection with the embodiments disclosed
herein may be implemented in hardware, in a software module executed by a processor, or in a
combination of the two. The software module may reside in random access memory (RAM), internal
memory, read-only memory (ROM), electrically programmable ROM, electrically erasable programmable
ROM, a register, a hard disk, a removable disk, a CD-ROM, or any other form of storage medium
known in the technical field.
The specific embodiments described above further explain in detail the purpose, technical
solutions, and beneficial effects of the present invention. It should be understood that the
foregoing is merely specific embodiments of the present invention and is not intended to limit the
protection scope of the present invention. Any modification, equivalent substitution, improvement,
or the like made within the spirit and principles of the present invention shall fall within the
protection scope of the present invention.