WO2017097075A1 - Method and apparatus for keyword fuzzy matching - Google Patents

Info

Publication number
WO2017097075A1
Authority
WO
WIPO (PCT)
Prior art keywords
character
keyword
matching
text
matched
Prior art date
Application number
PCT/CN2016/104693
Other languages
English (en)
Chinese (zh)
Inventor
李剑
毛宏
Original Assignee
北京搜狗科技发展有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 北京搜狗科技发展有限公司
Publication of WO2017097075A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/903 Querying
    • G06F16/90335 Query processing
    • G06F16/90344 Query processing by using string matching techniques
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/35 Clustering; Classification
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/953 Querying, e.g. by the use of web search engines
    • G06F16/9535 Search customisation based on user profiles and personalisation

Definitions

  • The present invention relates to the field of Internet technologies, and in particular to a method and apparatus for keyword fuzzy matching.
  • Fuzzy keyword matching usually relies on a regular expression, that is, a single pattern string that describes and matches a series of strings conforming to a given rule. The regular-expression matching engine generally compiles the expression into one of the following automata:
  • NFA: non-deterministic finite automaton
  • DFA: deterministic finite automaton
  • The invention provides a method for keyword fuzzy matching that can, to a certain extent, remedy the low efficiency of existing keyword matching.
  • The invention provides the following solutions:
  • A method for keyword fuzzy matching, comprising: for each character in the text to be matched, obtaining, according to the keyword set, the keyword to which the character belongs and the index bit of the character in that keyword; determining, according to the index bit of the character in the keyword, whether the character is the first character of the keyword; if the character is the first character of the keyword, recording the keyword to which the character belongs in the matching information set, and marking in the record that the first character of the keyword is present in the text to be matched; if the character is not the first character of the keyword and a record of the keyword to which the character belongs exists in the matching information set, obtaining that record and marking in it that the character of the keyword is present in the text to be matched; and, when every character of a keyword is marked as present in the text to be matched, determining that the text to be matched hits the keyword.
  • An apparatus for keyword fuzzy matching, comprising: an obtaining module, configured to obtain, for each character in the text to be matched and according to the keyword set, the keyword to which the character belongs and the index bit of the character in that keyword; a judging module, configured to determine, according to the index bit of the character in the keyword, whether the character is the first character of the keyword; a first marking module, configured to, when the judging module determines that it is, record the keyword to which the character belongs in the matching information set and mark in the record that the first character of the keyword is present in the text to be matched; a second marking module, configured to, when the judging module determines that it is not and a record of the keyword to which the character belongs exists in the matching information set, obtain that record and mark in it that the character of the keyword is present in the text to be matched; and a determining module, configured to determine that the text to be matched hits a keyword when every character of the keyword is marked as present in the text to be matched.
  • An apparatus for keyword fuzzy matching, comprising a memory and one or more programs, wherein the one or more programs are stored in the memory and configured to be executed by one or more processors.
  • The one or more programs include instructions for: for each character in the text to be matched, obtaining, according to the keyword set, the keyword to which the character belongs and the index bit of the character in that keyword; determining, according to the index bit of the character in the keyword, whether the character is the first character of the keyword; if the character is the first character of the keyword, recording the keyword to which the character belongs in the matching information set, and marking in the record that the first character of the keyword is present in the text to be matched; if the character is not the first character of the keyword and a record of the keyword to which the character belongs exists in the matching information set, obtaining that record and marking in it that the character of the keyword is present in the text to be matched; and, when every character of a keyword is marked as present in the text to be matched, determining that the text to be matched hits the keyword.
  • A program comprising readable code that, when executed on a server, causes the server to perform the method of keyword fuzzy matching according to any of the embodiments of the present invention.
  • A readable medium in which the program described in the embodiments of the present invention is stored.
  • The present invention discloses the following technical effects:
  • For each character in the text to be matched, the keyword to which the character belongs and the index bit of the character in that keyword are obtained, and it is determined whether the character is the first character of the keyword. If it is the first character, the keyword to which the character belongs is recorded and its first character is marked as present in the text to be matched; if it is not the first character, the keyword to which the character belongs is looked up among the recorded keywords and the character is marked as present in the text to be matched. When every character of a keyword is marked as present in the text to be matched, the text to be matched hits the keyword.
  • FIG. 1 is a flowchart of a method for keyword fuzzy matching according to an embodiment of the present invention.
  • FIG. 2 is a flowchart of a method for performing keyword fuzzy matching for each character in a text to be matched according to an embodiment of the present invention.
  • FIG. 3 is a block diagram of a multi-keyword fuzzy matching device according to an embodiment of the present invention.
  • FIG. 4 is a block diagram of an apparatus for keyword fuzzy matching, according to an exemplary embodiment.
  • FIG. 5 is a schematic structural diagram of a server in an embodiment of the present invention.
  • FIG. 6 is a block diagram of a server for performing a multi-keyword fuzzy matching method according to the present invention.
  • FIG. 7 shows a storage unit for holding or carrying program code implementing the multi-keyword fuzzy matching method according to the present invention.
  • The invention provides a method for keyword fuzzy matching which, as shown in FIG. 1, comprises:
  • Step S101: for each character in the text to be matched, obtain, according to the keyword set, the keyword to which the character belongs and the index bit of the character in that keyword.
  • The text to be matched is scanned; each time a character is scanned, the keyword to which the character belongs is obtained from the keyword set, together with the index bit of the character in that keyword.
  • A character in the text to be matched may correspond to one or more keywords, or to none.
  • The method further includes: constructing, for each character of each keyword in the keyword set, a matching rule corresponding to that character, where the matching rule includes the character, the keyword to which the character belongs, the number of characters in that keyword, and the index bit of the character in that keyword; collecting the matching rules of all characters in a keyword to form the matching rule set corresponding to the keyword; and constructing, from the matching rule set, an inverted index table from characters to matching rules.
  • The inverted index table includes each character together with all matching rules corresponding to it, where a matching rule corresponding to a character is a matching rule that contains that character.
  • The keyword to which a character belongs and the index bit of the character in that keyword are obtained according to the inverted index table. Specifically, each character in the text to be matched is looked up in the inverted index table; when the table contains the character, all matching rules corresponding to the character are obtained, and for each matching rule the keyword to which the character belongs and the index bit of the character in that keyword are read from the rule.
  • Because the inverted index table is built in advance, all keywords corresponding to a character, and the index bit of the character in each of them, can be obtained by a lookup in the table. The text no longer has to be matched against each keyword in turn, which makes the matching process simpler, faster, and more efficient.
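  • The matching rules and the inverted index table described above can be sketched as follows. This is a minimal illustration, not the patented implementation: the names `MatchingRule`, `build_matching_rules` and `build_inverted_index`, the wildcard symbols "?" and "*", and the ASCII keywords "a??b*cd" and "f??ag" are assumptions chosen for readability.

```python
from collections import defaultdict
from typing import NamedTuple

WILDCARDS = {"?", "*"}  # assumed wildcard symbols


class MatchingRule(NamedTuple):
    character: str  # the non-wildcard character
    keyword: str    # the keyword the character belongs to
    size: int       # number of non-wildcard characters in the keyword
    index: int      # index bit of the character within the keyword (0-based)


def build_matching_rules(keyword):
    """One matching rule per non-wildcard character of the keyword."""
    chars = [c for c in keyword if c not in WILDCARDS]
    return [MatchingRule(c, keyword, len(chars), i) for i, c in enumerate(chars)]


def build_inverted_index(keywords):
    """Map every character to all matching rules that contain it."""
    table = defaultdict(list)
    for keyword in keywords:
        for rule in build_matching_rules(keyword):
            table[rule.character].append(rule)
    return dict(table)


# Hypothetical keyword set: the character "a" occurs in both keywords.
table = build_inverted_index(["a??b*cd", "f??ag"])
for rule in table.get("a", []):
    print(rule.keyword, rule.index)
# a??b*cd 0
# f??ag 1
```

  • A character that occurs in no keyword has no entry in the table, so a single dictionary lookup is enough to decide that it can be ignored.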
  • The keywords may include wildcards and non-wildcards.
  • Non-wildcards are collectively referred to as characters.
  • A keyword may contain one or several wildcards. The wildcards may be consecutive or separated, and a wildcard may stand for an interval of any character length.
  • Keywords may be added to, modified in, and deleted from the keyword set. When this happens, the content of the inverted index table is updated according to the specific operation.
  • Step S102: determine, according to the index bit of the character in the keyword, whether the character is the first character of the keyword. If it is the first character, perform step S103; if it is not, perform step S104.
  • The characters in a keyword are divided into a first character and non-first characters, where the first character is the first non-wildcard in the keyword.
  • Non-first characters are the non-wildcards in the keyword other than the first character. When a keyword contains only one non-wildcard, that character is the first character.
  • Step S103: record the keyword to which the character belongs in the matching information set, and mark in the record that the first character of the keyword is present in the text to be matched.
  • Specifically, in step S103, if the character is the first character of the keyword, matching process information corresponding to the keyword to which the character belongs is newly created and saved in the matching information set, and the index bit of the character in the text to be matched is recorded in the matching process information.
  • The matching process information is in one-to-one correspondence with the keywords to which the characters belong.
  • Recording the index bit of the character in the text to be matched into the matching process information marks the character of the keyword as present in the text to be matched. The recorded index bit is also used to output matching information once the keyword is hit.
  • Step S104: when a record of the keyword to which the character belongs exists in the matching information set, obtain that record and mark in it that the character of the keyword is present in the text to be matched. Specifically, in step S104, if the character is not the first character of the keyword, the matching information set is searched to determine whether it holds a record of the keyword to which the character belongs. If it does, the record is obtained and the character is marked in it as present in the text to be matched; if it does not, the character is ignored and scanning continues with the next character in the text to be matched.
  • Determining whether a record of the keyword exists in the matching information set may mean determining whether the matching information set holds matching process information for the keyword to which the character belongs. If it does, that matching process information is obtained and the index bit of the character in the text to be matched is recorded in it; if it does not, the character is ignored.
  • The record of the keyword to which the character belongs may thus be the matching process information corresponding to that keyword.
  • The character is marked by recording its index bit in the text to be matched in the matching process information.
  • When a character is ignored, the next character is obtained from the text to be matched and matched in turn.
  • The matching process information is in one-to-one correspondence with the keyword to which the character belongs, and the number of bits in each matching process information equals the number of characters in the corresponding keyword. Each bit marks whether the character at the corresponding position of the keyword has appeared in the text to be matched; if it has, the bit is set to the index bit of that character in the text to be matched.
  • When every bit of a matching process information has been set to the index bit of the corresponding character in the text to be matched, every character of the keyword corresponding to that matching process information has appeared in the text to be matched, which indicates that the text to be matched hits the keyword.
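  • A minimal sketch of matching process information as a fixed-length list follows. The helper names `new_process_info`, `mark` and `is_hit` are hypothetical; the initial value -1 is the one this description uses in its worked example.

```python
INITIAL = -1  # initial value meaning "this character has not been seen yet"


def new_process_info(size):
    """One element per non-wildcard character of the keyword."""
    return [INITIAL] * size


def mark(info, keyword_index, text_index):
    """Record where the character at keyword_index appeared in the text."""
    info[keyword_index] = text_index


def is_hit(info):
    """The keyword is hit once no element still holds the initial value."""
    return all(value != INITIAL for value in info)


info = new_process_info(3)
mark(info, 0, 1)
print(info, is_hit(info))  # [1, -1, -1] False
mark(info, 1, 4)
mark(info, 2, 5)
print(info, is_hit(info))  # [1, 4, 5] True
```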
  • This specific matching process will be described in detail in the subsequent embodiments.
  • The method may further include outputting the matching information. Specifically, when the matching process information holds the index bits in the text to be matched of all characters of the corresponding keyword, the text to be matched is determined to hit the keyword; matching information is then obtained from the text to be matched according to the index bit of each character, and output.
  • Using the matching process information, it can be determined whether each character of the keyword is present in the text to be matched and, if so, at which index bit.
  • In step S104, after the matching process information corresponding to the keyword to which the character belongs is obtained, and before the index bit of the character in the text to be matched is recorded in it, the method further includes: determining whether an index bit for that character has already been recorded in the matching process information. If it has, the matching process information is copied and the current index bit of the character in the text to be matched is recorded in the copy, replacing the earlier value; if it has not, the index bit of the character in the text to be matched is recorded in the matching process information directly.
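  • The copy-on-repeat behaviour can be sketched as below. The function name `record` is hypothetical, and resetting the elements after the updated position in the copy is an assumption of this sketch; the description does not say what happens to them.

```python
INITIAL = -1  # initial value of an element that has not been set


def record(infos, info, keyword_index, text_index):
    """Record text_index in info, or in a copy if the element is already set.

    infos is the list of matching process information kept for one keyword.
    Returns the list that actually received the new index bit.
    """
    if info[keyword_index] != INITIAL:
        # Already recorded: keep the original untouched and work on a copy.
        # Assumption: elements after keyword_index start again from INITIAL.
        copy = info[:keyword_index] + [text_index]
        copy += [INITIAL] * (len(info) - len(copy))
        infos.append(copy)
        return copy
    info[keyword_index] = text_index
    return info


infos = [[1, -1, -1]]
record(infos, infos[0], 1, 4)  # first occurrence: recorded in place
record(infos, infos[0], 1, 7)  # second occurrence: recorded in a copy
print(infos)  # [[1, 4, -1], [1, 7, -1]]
```

  • Keeping both candidates lets a later character complete whichever of them still satisfies the keyword.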
  • The method further includes: constructing, for each keyword in the keyword set, a character distance rule that contains the number of characters in the keyword and the effective distance between each character and its previous character, to form a character distance rule set.
  • Matching and verifying the keyword of the current character according to the character distance rule set includes: obtaining, from the matching process information of the keyword to which the current character belongs, the index bit in the text to be matched of the character preceding the current character; calculating, from that index bit and the index bit of the current character in the text to be matched, a first distance between the current character and its previous character; and obtaining, from the character distance rule corresponding to the keyword of the current character, a second distance between the current character and its previous character.
  • If the second distance indicates an interval of arbitrary length, or the second distance is greater than the first distance, the verification succeeds, and the next character of the current character is obtained and matched against the keyword to which it belongs.
  • If the second distance does not indicate an interval of arbitrary length and the interval length it indicates is smaller than the interval length indicated by the first distance, the verification fails, the matching process information becomes invalid, and the matching of the keyword to which the character belongs ends.
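  • The distance verification can be sketched as follows. Several points are assumptions of this sketch rather than statements of the description: the first distance is computed as the number of text positions strictly between the two characters (so adjacent characters are at distance 0), equal first and second distances are accepted (otherwise contiguous characters with rule distance 0 could never pass), and an unrecorded previous character counts as a failure. The name `check_distance` is hypothetical.

```python
INITIAL = -1     # element of matching process information not yet set
ANY_LENGTH = -1  # rule distance standing for an interval of arbitrary length


def check_distance(rule, info, keyword_index):
    """Verify the character at keyword_index (>= 1) against its predecessor.

    rule is a character distance rule [size, d1, d2, ...], where d_k is the
    allowed distance between character k and character k - 1 of the keyword.
    info holds the index bits already recorded for the keyword.
    """
    prev_pos, cur_pos = info[keyword_index - 1], info[keyword_index]
    if prev_pos == INITIAL:
        return False                   # assumption: predecessor must be known
    second = rule[keyword_index]       # second distance, from the rule
    if second == ANY_LENGTH:
        return True
    first = cur_pos - prev_pos - 1     # first distance, from the text
    return first <= second


print(check_distance([3, 2, 0], [1, 4, -1], 1))      # True: 2 characters between
print(check_distance([3, 2, 0], [1, 5, -1], 1))      # False: 3 characters between
print(check_distance([4, 2, -1, 0], [4, 6, 9, -1], 2))  # True: arbitrary length
```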
  • This embodiment is illustrated with a specific example, as follows:
  • For each character of each keyword in the keyword set, a matching rule is constructed that contains the character, the keyword to which the character belongs, the number of characters in that keyword, and the index bit of the character in that keyword. The matching rules form a matching rule set, from which an inverted index table from characters to matching rules is constructed.
  • Each character of each keyword in the keyword set therefore has at least one matching rule. The matching rules of all characters of all keywords in the keyword set constitute the matching rule set, and for every distinct character an entry from that character to its matching rules is added to the inverted index table.
  • For each keyword, a character distance rule containing the number of characters in the keyword and the effective distance between each character and its previous character is constructed, to form a character distance rule set.
  • Each keyword corresponds to one character distance rule.
  • The number of characters is the number of non-wildcards in the keyword.
  • The effective distance between a character and its previous character is the distance between a non-wildcard and the non-wildcard preceding it.
  • A character may be contiguous with its previous character, in which case the character distance can be set to 0.
  • A character and its previous character may also be non-contiguous. There are two cases. In the first case the interval has arbitrary length: a wildcard "*" between the character and its previous character indicates an interval of arbitrary length, and the character distance can be set to -1. In the second case the interval has a fixed length: n wildcards "?" lie between the character and its previous character, where n is a natural number, and the character distance can be set to n.
  • The character distance rule is used to verify a keyword during matching.
  • The character distance rule set may also be generated at the time a keyword is matched and verified.
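  • Deriving a character distance rule from a keyword can be sketched as below. The function name `build_distance_rule` and the ASCII keywords are hypothetical; treating a run that mixes "?" and "*" as arbitrary length is an assumption, since the description only covers the pure cases.

```python
ANY_LENGTH = -1  # distance used for an interval of arbitrary length ("*")


def build_distance_rule(keyword):
    """Return [size, d1, d2, ...] for a keyword with "?" and "*" wildcards.

    size is the number of non-wildcard characters; d_k is the distance
    between the k-th character and the character before it.
    """
    size = 0
    distances = []
    gap = 0
    for symbol in keyword:
        if symbol == "*":
            gap = ANY_LENGTH
        elif symbol == "?":
            if gap != ANY_LENGTH:      # assumption: "*" dominates a mixed run
                gap += 1
        else:
            if size > 0:               # the first character has no predecessor
                distances.append(gap)
            size += 1
            gap = 0
    return [size] + distances


print(build_distance_rule("a??b*cd"))  # [4, 2, -1, 0]
print(build_distance_rule("f??ag"))    # [3, 2, 0]
```

  • The two hypothetical keywords were chosen so that their rules have the same shape as the rules [4, 2, -1, 0] and [3, 2, 0] given for the two example keywords of this description.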
  • For example, the keyword set contains two keywords: keyword one is "generation??open*invoice" and keyword two is "find??proxy".
  • A matching rule is constructed for each character in each keyword. The matching rule includes the character, the keyword to which the character belongs, the number of characters in that keyword, and the index bit of the character in the keyword.
  • The matching rules of each character in keyword one and keyword two are constructed and form the matching rule set, as shown in Table 1.
  • The mapping between each character and its matching rules is then constructed and numbered to obtain the inverted index table, as shown in Table 2.
  • Table 1 gives an example of a matching rule set. For example, the keyword "generation??open*invoice" contains 4 valid characters, "generation", "open", "send" and "ticket" (the last two appear together as "invoice" in the keyword), so 4 matching rules are constructed for it.
  • The first character (character) is "generation"; the corresponding keyword (keyword) is "generation??open*invoice"; the number of characters in the keyword (size) is 4; and the index bit of "generation" in the keyword (index) is 0.
  • The second character (character) is "open"; the corresponding keyword (keyword) is "generation??open*invoice"; the number of characters in the keyword (size) is 4; and the index bit of "open" in the keyword (index) is 1.
  • The third character (character) is "send"; the corresponding keyword (keyword) is "generation??open*invoice"; the number of characters in the keyword (size) is 4; and the index bit of "send" in the keyword (index) is 2.
  • The fourth character (character) is "ticket"; the corresponding keyword (keyword) is "generation??open*invoice"; the number of characters in the keyword (size) is 4; and the index bit of "ticket" in the keyword (index) is 3.
  • The character "generation" exists in both the keyword "generation??open*invoice" and the keyword "find??proxy", so it corresponds to two matching rules. The mappings between the character "generation" and these two matching rules are constructed, numbered, and stored in the inverted index table, as shown in Table 2.
  • The character "open" exists only in the keyword "generation??open*invoice", so it corresponds to one matching rule. The mapping between the character "open" and that matching rule is constructed, numbered, and stored in the inverted index table, as shown in Table 2.
  • Keyword one contains four characters, and two wildcards "??" lie between the second character "open" and the first character "generation", so their character distance is 2.
  • The character distance rule constructed for keyword one is [4, 2, -1, 0].
  • The character distance rule constructed for keyword two is [3, 2, 0].
  • The character distance rule set therefore contains two character distance rules: [4, 2, -1, 0] for the keyword "generation??open*invoice" and [3, 2, 0] for the keyword "find??proxy".
  • Keyword fuzzy matching is then performed for each character in the text to be matched, as shown in FIG. 2, as follows:
  • Step 201: obtain a character from the text to be matched as the current character.
  • For example, the text to be matched is "Looking for a cheap agent to open a business invoice". The text to be matched is scanned, and the character 'find' is retrieved as the current character.
  • Step 202: obtain all matching rules corresponding to the current character from the inverted index table and, for each matching rule, determine whether the matching process information set of the keyword to which the matching rule belongs is empty. If the matching process information set is empty, perform step 203; if it is not empty, perform step 207.
  • If no matching rule corresponding to the current character is found in the inverted index table, the current character is ignored, the next character is obtained from the text to be matched and taken as the current character, and step 202 is performed again.
  • Step 203: determine whether the current character is the first character of the keyword. If it is, perform step 204; otherwise, perform step 212, that is, obtain the next character of the current character from the text to be matched, take it as the current character, and then perform step 202.
  • The determination is made according to the index bit of the current character in the keyword, as recorded in the matching rule.
  • Step 204: add a new matching process information to the matching process information set, record the index bit of the current character in the text to be matched into it, and perform step 205.
  • The matching process information records the index bit in the text to be matched of each character of a keyword.
  • An initial value may be set for each element of the matching process information. Each element represents whether the corresponding character of the keyword has appeared in the text to be matched and, if so, its index bit in the text to be matched.
  • The number of elements in the matching process information equals the number of characters in the corresponding keyword.
  • If a character of the keyword has not appeared in the text to be matched, the element corresponding to that character holds the initial value; if it has appeared, the element holds the index bit of the character in the text to be matched.
  • Each element of newly created matching process information is set to the initial value.
  • For example, the initial value of each element is set to -1, indicating that no character of the keyword has yet appeared in the text to be matched. The keyword "find??proxy" has three characters, so the matching process information created for it has three elements, each initialized to -1.
  • Step 205: determine from the matching process information whether matching is complete. If it is complete, perform step 206. If it is not, perform step 212: obtain the next character of the current character from the text to be matched, take it as the current character, and perform step 202 again.
  • Determining whether matching is complete according to the matching process information may include: determining whether the matching process information contains an element whose value is the initial value. If it does, matching is not complete; otherwise, matching is complete.
  • If a character of the keyword is contained in the text to be matched, the element corresponding to that character in the matching process information of the keyword is set to the index bit of the character in the text to be matched; otherwise, the element retains the initial value. Therefore, when all elements of the matching process information are non-initial values, it can be determined that all characters of the keyword are contained in the text to be matched, that is, matching is complete.
  • Step 206: output matching information according to the matching process information.
  • The characters lying between two index bits of the text to be matched, including the characters at the two index bits themselves, are obtained from the text to be matched and output as the matching information.
  • Step 207: determine whether the current character is the first character of the keyword. If it is, perform step 208; otherwise, perform step 209.
  • Step 208: add a new matching process information to the matching process information set, record the index bit of the current character in the text to be matched into it, and perform step 205.
  • Step 209: obtain from the matching process information set the matching process information corresponding to all keywords to which the current character belongs, record the index bit of the current character in the text to be matched in each of them, and perform step 210.
  • Before the index bit is recorded, the method further includes: determining whether an index bit for the current character has already been recorded in the corresponding matching process information. If it has, the current matching process information is copied and the index bit of the current character in the text to be matched is written into the copy; otherwise, the index bit of the current character in the text to be matched is recorded directly into the matching process information. Step 210 is then performed.
  • Step 210: perform a distance check on each matching process information according to the character distance rule. If the check succeeds, perform step 205; if it fails, perform step 211.
  • Step 211: mark the keyword as invalid, that is, end the matching of the keyword to which the current character belongs.
  • Step 212: obtain the next character of the current character from the text to be matched and take it as the current character; then perform step 202.
  • The distance check covers the case where characters of the keyword have appeared in the text to be matched and it must still be determined whether the character distance rule of the keyword and the index bits of those characters in the text to be matched satisfy the preset relationship.
  • The check can be implemented as follows: obtain the character distance rule corresponding to the keyword to which the current character belongs; obtain, from the matching process information of that keyword, the index bit in the text to be matched of the character preceding the current character; calculate, from that index bit and the index bit of the current character in the text to be matched, the first distance between the current character and its previous character; and obtain, from the character distance rule, the second distance between the current character and its previous character.
  • When the second distance indicates an interval of arbitrary length, or the second distance is greater than the first distance, it is determined whether the matching process information still contains an element with the initial value. If it does, the character has been added successfully but matching is not complete, and the next character of the current character is obtained from the text to be matched and matched as the current character; otherwise, the match is complete. If the second distance does not indicate an interval of arbitrary length and the interval length it indicates is smaller than the interval length indicated by the first distance, the keyword match for the current character is invalid and matching of that keyword ends; the next character of the current character can then be obtained from the text to be matched and matched.
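  • Putting the steps together, the scan can be sketched end to end as below. This is an illustrative reading of the flow, not the patented implementation. Assumptions of the sketch: index bits are 1-based; a non-first character is only recorded when its predecessor has been recorded; a first distance equal to the second distance passes; a copy starts again from the initial value after the updated element; completed matching process information is dropped after it is reported; and the branches of steps 203 and 207 are merged. The names `compile_keyword` and `fuzzy_match` and the ASCII keywords are hypothetical.

```python
from collections import defaultdict

INITIAL = -1     # "not yet seen" marker in matching process information
ANY_LENGTH = -1  # rule distance for an interval of arbitrary length ("*")


def compile_keyword(keyword):
    """Return (characters, character distance rule [size, d1, d2, ...])."""
    chars, distances, gap = [], [], 0
    for symbol in keyword:
        if symbol == "*":
            gap = ANY_LENGTH
        elif symbol == "?":
            gap = gap if gap == ANY_LENGTH else gap + 1
        else:
            if chars:
                distances.append(gap)
            chars.append(symbol)
            gap = 0
    return chars, [len(chars)] + distances


def fuzzy_match(keywords, text):
    """Scan text once; return (keyword, matched substring) pairs in hit order."""
    table = defaultdict(list)   # inverted index: character -> [(keyword, index)]
    rules = {}                  # keyword -> character distance rule
    for keyword in keywords:
        chars, rules[keyword] = compile_keyword(keyword)
        for index, char in enumerate(chars):
            table[char].append((keyword, index))

    active = defaultdict(list)  # keyword -> matching process information set
    hits = []
    for pos, char in enumerate(text, start=1):
        for keyword, index in table.get(char, []):  # no rule: ignore character
            rule, infos = rules[keyword], active[keyword]
            updated_infos = []
            if index == 0:                          # first character: new record
                updated_infos.append([pos] + [INITIAL] * (rule[0] - 1))
            else:
                for info in list(infos):
                    if info[index - 1] == INITIAL:
                        continue                    # predecessor not seen yet
                    updated = info[:index] + [pos] + [INITIAL] * (rule[0] - index - 1)
                    if info[index] == INITIAL:
                        infos.remove(info)          # record in place of the old one
                    gap = pos - updated[index - 1] - 1      # first distance
                    if rule[index] == ANY_LENGTH or gap <= rule[index]:
                        updated_infos.append(updated)       # distance check passed
            for info in updated_infos:
                if INITIAL not in info:             # every character marked: hit
                    hits.append((keyword, text[info[0] - 1:info[-1]]))
                elif info not in infos:
                    infos.append(info)
    return hits


print(fuzzy_match(["a??b*cd", "f??ag"], "fxxagbyycd"))
# [('f??ag', 'fxxag'), ('a??b*cd', 'agbyycd')]
```

  • Each text character triggers one dictionary lookup plus work proportional to the matching process information of the keywords that contain it, so characters that occur in no keyword cost almost nothing.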
  • For example, fuzzy matching is performed on the text "Looking for a cheap agent to open a business invoice":
  • The text to be matched is: “Looking for a cheap agent to open a business invoice”. The text to be matched is scanned as follows:
  • The corresponding matching rule 6 is found in the inverted index table, and the current character 'find' is the first character of the keyword "find the proxy" corresponding to matching rule 6.
  • New matching process information is created with process information number 1, and the initial value of the corresponding element in the matching process information for the keyword "find the proxy" is replaced with index bit 1 of the current character 'find' in the text to be matched, as shown in Table 3:
  • The matching process information corresponding to process information number 1 is verified: in the matching process information index[]: [1, -1, -1], only the first bit is not -1 and the other bits still hold the initial value -1, which indicates that the match is not yet complete. The current character 'find' is the first character of its keyword "find the proxy" and has no corresponding character distance rule, so no check against the character distance rule is needed.
  • the character ‘Yes’ is retrieved, and the index bit in the text to be matched is 2, and the corresponding matching rule is not found from the inverted index table, and is ignored.
  • the character ‘Yes’ is retrieved, and the index bit in the text to be matched is 3, and the corresponding matching rule is not found from the inverted index table, and is ignored.
  • the character 'generation' is retrieved, and the index bit in the text to be matched is 4, and the corresponding matching rules 1 and 2 are found from the inverted index table.
  • the matching rule 1 is processed, and the current character 'generation' is the first character of the matching rule 1 corresponding keyword "generation? invoice”.
  • New matching process information is created with process information number 2, and the initial value of the corresponding element in matching process information 2 is replaced with index bit 4 of the current character 'generation' in the text to be matched, as shown in Table 4:
  • The matching process information corresponding to process information number 2 is checked: in the matching process information index[]: [4, -1, -1, -1], only the first bit is not the initial value -1 and the remaining three bits all hold the initial value -1, so the match is not yet complete.
  • Matching rule 2 is processed: the current character 'generation' is not the first character of the keyword corresponding to matching rule 2, so no new matching process information is created.
  • The corresponding keyword "find the proxy" already has matching process information 1. Therefore, the initial value of the corresponding element in matching process information 1 is replaced with index bit 4 of the current character 'generation' in the text to be matched.
  • the matching process information table is shown in Table 5:
  • The character distance rule is [3, 2, 0]; in this keyword, the character distance rule between 'find' and 'generation' is 2, which represents the distance between the two characters, that is, the second distance equals 2. The first distance and the second distance therefore conform to the character distance rule of the character 'generation' in the keyword "find the proxy".
  • The last bit is still -1, so the match is not yet complete.
  • the corresponding matching rule 7 is found from the inverted index table.
  • The current character 'reason' is not the first character of the keyword corresponding to matching rule 7, so no new matching process information is created.
  • The corresponding keyword "find the proxy" already has matching process information 1, so index bit 5 of the current character 'reason' in the text to be matched is updated into matching process information 1, and the updated matching process information table is shown in Table 6:
  • the corresponding matching rule 3 is found from the inverted index table.
  • The current character 'on' is not the first character of the keyword corresponding to matching rule 3, so no new matching process information is created.
  • The corresponding keyword "generation? invoice" already has matching process information 2; therefore, index bit 6 of the current character 'on' in the text to be matched is updated into matching process information 2, and the updated matching process information table is shown in Table 7:
  • the character ‘battalion' is retrieved, and the index bit in the text to be matched is 7, and the corresponding matching rule is not found from the inverted index table, and is ignored.
  • the character ' industry' is retrieved, and the index bit in the text to be matched is 8, and the corresponding matching rule is not found from the inverted index table, and is ignored.
  • The character 'send' is retrieved; its index bit in the text to be matched is 9, and the corresponding matching rule 4 is found in the inverted index table.
  • The current character 'send' is not the first character of the keyword corresponding to matching rule 4, so no new matching process information is created.
  • The corresponding keyword "generation? invoice" already has matching process information 2; therefore, index bit 9 of the current character 'send' in the text to be matched is updated into matching process information 2, and the updated matching process information table is shown in Table 8:
  • The character distance rule between 'on' and 'send' in this keyword is -1, indicating that the distance between the two characters, that is, the second distance, is an arbitrary interval, so the addition is successful. The last bit is still -1, so the match is not yet complete.
  • the character 'ticket' is retrieved, and the index bit in the text to be matched is 10, and the corresponding matching rule 5 is found from the inverted index table.
  • the current character 'ticket' is not the first character of the keyword corresponding to the matching rule 5, and the new matching process is not created.
  • The corresponding keyword "generation? invoice" already has matching process information 2; therefore, the index bit of the current character 'ticket' in the text to be matched is updated into matching process information 2, and the updated matching process information table is shown in Table 9:
  • The character distance rule between 'on' and 'send' is -1, indicating that the distance between the two characters, that is, the second distance, is an arbitrary interval, so the addition is successful and conforms to the keyword's character distance rule.
  • All bits of the matching process information are now different from -1, so the match is complete, and the character string "proxy opening business invoice" is hit according to the first bit and the last bit of the matching process information.
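The scan walked through above can be reproduced with a short sketch. Several things here are assumptions made for illustration only: each character of the original text is represented by its English gloss as a token ("c2" and "c3" stand in for the two characters that have no matching rule), the distances 1 and 0 in the second keyword's rule are guesses (the text only gives -1 between 'on' and 'send'), the first distance is taken as the number of characters between two positions, and only one piece of matching process information is kept per keyword.

```python
from typing import Dict, List, Tuple

ANY = -1      # distance rule value: any interval
INITIAL = -1  # initial value of a matching process information element

# Hypothetical token stand-ins, one per character of the text to be matched.
TEXT = ["find", "c2", "c3", "generation", "reason", "on",
        "battalion", "industry", "send", "ticket"]

# keyword -> (its characters, allowed distance to the previous character)
KEYWORDS: Dict[str, Tuple[List[str], List[int]]] = {
    "find the proxy": (["find", "generation", "reason"], [2, 0]),
    "generation? invoice": (["generation", "on", "send", "ticket"], [1, ANY, 0]),
}


def build_inverted_index(keywords):
    """Map each character to (keyword, position in keyword) pairs."""
    index: Dict[str, List[Tuple[str, int]]] = {}
    for name, (chars, _) in keywords.items():
        for pos, ch in enumerate(chars):
            index.setdefault(ch, []).append((name, pos))
    return index


def match(text, keywords):
    index = build_inverted_index(keywords)
    process: Dict[str, List[int]] = {}  # keyword -> matching process information
    hits = []
    for i, ch in enumerate(text, start=1):  # index bits are 1-based
        for name, pos in index.get(ch, []):
            chars, dists = keywords[name]
            if pos == 0:
                # First character of a keyword: create matching process information.
                process[name] = [i] + [INITIAL] * (len(chars) - 1)
                continue
            info = process.get(name)
            if info is None or info[pos - 1] == INITIAL:
                continue  # no record yet, or the previous character was not seen
            limit = dists[pos - 1]
            gap = i - info[pos - 1] - 1
            if limit != ANY and gap > limit:
                del process[name]  # distance check failed: record invalidated
                continue
            info[pos] = i
            if INITIAL not in info:
                hits.append((name, info[0], info[-1]))  # first and last index bits
                del process[name]
    return hits


print(match(TEXT, KEYWORDS))
```

Under these assumptions the second keyword ends with matching process information [4, 6, 9, 10], so the hit spans index bits 4 to 10, which corresponds to the string reported in the walkthrough; the first keyword completes earlier with [1, 4, 5].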
  • the keyword is “generation??* invoice”
  • the text to be matched is “Beijing agent opens a business to open a tax invoice”.
  • the matching process information is updated, and the second bit of the matching process information array is updated to 4, that is, the matching process information array is [2, 4, -1, -1 ].
  • The second bit of the matching process information array in the existing matching process information already holds a value (not the initial value -1), so a new copy of the matching process information is made at this time, whose matching process information array is [2, 7, -1, -1].
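The copy step in this example can be sketched as below. It is only an illustration under stated assumptions: `record_with_copy` is a hypothetical helper, the copy is assumed to keep the earlier elements and reset the later ones to the initial value, and duplicate copies are suppressed.

```python
from typing import List

INITIAL = -1


def record_with_copy(infos: List[List[int]], slot: int, text_index: int) -> None:
    """Record text_index at position `slot` of a keyword's matching process information.

    If the slot still holds the initial value it is filled in place. If it
    already holds a value, the existing array is kept and a copy carrying the
    new index bit is added, so both candidate matches stay alive.
    """
    copies: List[List[int]] = []
    for info in infos:
        if info[slot] == INITIAL:
            info[slot] = text_index
        else:
            copy = info[:slot] + [text_index] + [INITIAL] * (len(info) - slot - 1)
            if copy not in infos and copy not in copies:
                copies.append(copy)
    infos.extend(copies)


infos = [[2, -1, -1, -1]]
record_with_copy(infos, 1, 4)   # first occurrence of the second character
record_with_copy(infos, 1, 7)   # second occurrence: the array is copied
print(infos)
```

Keeping both arrays lets a later character complete whichever candidate still satisfies the distance rule.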
  • the character to be matched has a certain keyword, but it is a non-first character in the keyword to be matched, for example, the keyword is “generation??* invoice” and “open* ticket”.
  • the text to be matched is “Beijing Opens a Tax Invoice”.
  • a matching rule is established for each character, and an inverted index table is created.
  • Keywords whose first character does not appear in the text to be matched are filtered out. It is therefore not necessary to exhaust all the keywords, which makes the keyword matching operation simpler.
  • The embodiment provides a device for multi-keyword fuzzy matching. As shown in FIG. 3, the device includes: an obtaining module 301, a determining module 302, a first marking module 303, a second marking module 304, and a determining module 305.
  • The obtaining module 301 is configured to obtain, for each character in the text to be matched and according to the keyword set, the keyword to which the character belongs and the index bit of the character in that keyword;
  • The determining module 302 is configured to determine, according to the index bit of the character in the keyword to which it belongs, whether the character is the first character of that keyword;
  • The first marking module 303 is configured to: when the determination result of the determining module 302 is yes, record the keyword to which the character belongs in the matching information set, and mark in the record that the first character of the keyword exists in the text to be matched;
  • The second marking module 304 is configured to: when the determination result of the determining module 302 is negative and a record of the keyword to which the character belongs exists in the matching information set, obtain the record of that keyword, and mark in the record that the character of the keyword exists in the text to be matched;
  • The determining module 305 is configured to determine that the to-be-matched text hits a keyword when every character of that keyword is marked as existing in the to-be-matched text.
  • the apparatus may further include: a matching rule building module, a matching rule set building module, and an inverted index building module.
  • the matching rule construction module is configured to respectively construct a matching rule corresponding to each character for each character of each keyword in the keyword set;
  • the matching rule includes: a character, the keyword to which the character belongs, the number of characters included in that keyword, and the index bit of the character in that keyword;
  • the matching rule set construction module is configured to acquire the matching rule corresponding to each character in the keyword, and form a matching rule set corresponding to the keyword;
  • the inverted index construction module is configured to construct an inverted index table from characters to matching rules according to the matching rule set; the inverted index table includes: a character, and all matching rules corresponding to the character.
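The matching rule and inverted index table described by these three modules can be sketched as below. The class and function names are hypothetical, the keywords "cat" and "act" are made-up examples, index bits are 0-based here, and wildcard characters in keywords are not handled.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class MatchingRule:
    character: str         # the character itself
    keyword: str           # keyword to which the character belongs
    keyword_length: int    # number of characters in that keyword
    index_in_keyword: int  # index bit of the character in the keyword (0-based)


def build_rule_set(keyword: str) -> List[MatchingRule]:
    """One matching rule per character of the keyword."""
    return [MatchingRule(ch, keyword, len(keyword), i)
            for i, ch in enumerate(keyword)]


def build_inverted_index(keywords: List[str]) -> Dict[str, List[MatchingRule]]:
    """Inverted index table: character -> all matching rules for that character."""
    index: Dict[str, List[MatchingRule]] = {}
    for keyword in keywords:
        for rule in build_rule_set(keyword):
            index.setdefault(rule.character, []).append(rule)
    return index


index = build_inverted_index(["cat", "act"])
for rule in index["a"]:
    print(rule.keyword, rule.index_in_keyword)
```

A character of the text that is absent from the table can be skipped at once, which is how keywords that cannot match are filtered out without being enumerated.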
  • the obtaining module 301 may include: a traversal unit and a first acquiring unit.
  • the traversing unit is configured to traverse the inverted index table, and when the inverted index table includes the character, acquire all matching rules corresponding to the character;
  • the first obtaining unit is configured to acquire, for each matching rule, a keyword of the character included in the matching rule, and an index bit of the character in the associated keyword.
  • the first marking module 303 may include: a first recording unit.
  • the first recording unit is configured to: when the determination result of the determining module is yes, create matching process information corresponding to the keyword to which the character belongs, save the matching process information in the matching process information set, and record the index bit of the character in the text to be matched into the matching process information.
  • the second marking module 304 may include: a second recording unit.
  • the second recording unit is configured to: when the determination result of the determining module is negative, search for a matching information set, and determine whether there is matching process information corresponding to the keyword to which the character belongs in the matching process set, if yes, And acquiring matching process information corresponding to the keyword to which the character belongs, and recording an index bit of the character in the to-be-matched text into the matching process information.
  • the second recording unit may further include: a determining subunit, a copy updating subunit, and an index bit recording subunit.
  • the determining subunit is configured to determine whether an index bit of the character in the to-be-matched text has been recorded in the matching process information corresponding to the keyword to which the character belongs;
  • the copy update subunit is configured to: when the judgment result of the judgment subunit is YES, copy the matching process information corresponding to the keyword to which the character belongs, and use the current index bit of the character in the to-be-matched text. Updating an index bit of the character recorded in the matching process information in the to-be-matched text;
  • the index bit recording subunit is configured to: when the determination result of the determining subunit is negative, perform the step of acquiring the matching process information corresponding to the keyword to which the character belongs and recording the index bit of the character in the to-be-matched text into the matching process information.
  • Each character in the keyword being marked in the to-be-matched text means that every bit in the matching process information corresponding to the keyword has been set to the index bit, in the text to be matched, of the character at the corresponding position.
  • the apparatus may further include: an output module.
  • the output module is configured to: after the determining module determines that the to-be-matched text hits the keyword, obtain matching information from the to-be-matched text according to an index bit of each character in the text to be matched, and output the Match information.
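As a small illustration of the output module, the matched string can be cut from the text using the first and last index bits of the completed matching process information. The helper name is hypothetical and index bits are assumed to be 1-based, as in the worked example.

```python
from typing import List

INITIAL = -1


def extract_hit(text: str, process_info: List[int]) -> str:
    """Return the span of text covered by a completed match (1-based index bits)."""
    if INITIAL in process_info:
        raise ValueError("match is not complete")
    first, last = process_info[0], process_info[-1]
    return text[first - 1:last]


print(extract_hit("abcdefghij", [4, 6, 9, 10]))  # prints "defghij"
```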
  • the apparatus may further include: a character distance construction module and a matching verification module.
  • the character distance construction module is configured to separately construct a character distance rule for each keyword in the keyword set to form a character distance rule set, where the character distance rule includes: a number of characters included in the keyword, the key The effective distance between each character in the word and its previous character;
  • the matching check module is configured to perform a distance matching check on the keyword to which the current character belongs, according to the character distance rule set, when the text to be matched is matched.
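A character distance rule of the form described above (character count followed by the distance from each character to its previous character) could be derived from a keyword pattern as sketched here. The wildcard semantics are an assumption, not something the text states: '?' is taken to allow one intervening character and '*' any interval (rule value -1). The function name and the Latin-letter patterns are hypothetical.

```python
from typing import List, Tuple

ANY_INTERVAL = -1


def build_distance_rule(pattern: str) -> Tuple[List[str], List[int]]:
    """Split a pattern into its literal characters and its distance rule.

    The rule is [number of characters, d2, d3, ...], where d_k is the
    allowed distance between character k and the character before it.
    """
    chars: List[str] = []
    distances: List[int] = []
    gap = 0
    for ch in pattern:
        if ch == "?":
            if gap != ANY_INTERVAL:
                gap += 1            # assumed: one more intervening character allowed
        elif ch == "*":
            gap = ANY_INTERVAL      # assumed: any interval allowed
        else:
            if chars:               # the first character has no previous character
                distances.append(gap)
            chars.append(ch)
            gap = 0
    return chars, [len(chars)] + distances


print(build_distance_rule("a??bc"))   # (['a', 'b', 'c'], [3, 2, 0])
print(build_distance_rule("a?b*cd"))  # (['a', 'b', 'c', 'd'], [4, 1, -1, 0])
```

Under this reading, the first pattern yields the same shape of rule as the [3, 2, 0] example in the walkthrough.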
  • the matching verification module may include: a second obtaining unit, a third obtaining unit, a first checking unit, and a second checking unit.
  • the second obtaining unit is configured to obtain, from the keyword to which the current character belongs, an index bit of the previous character of the current character in the to-be-matched text, and the to-be-matched according to the previous character of the current character Calculating a first distance between the current character and a previous character of the current character by using an index bit in the text and an index bit of the current character in the to-be-matched text;
  • the third obtaining unit is configured to obtain, as a second distance, an effective distance between the current character and a previous character of the current character from a character distance rule corresponding to a keyword to which the current character belongs;
  • the first checking unit is configured to: if the second distance represents any interval length, or the second distance is greater than the first distance, indicating that the distance check is successful, acquire the next character of the current character for matching;
  • the second checking unit is configured to: if the second distance does not represent any interval length and the second distance is smaller than the first distance, indicating that the distance check fails, invalidate the matching process information and end the matching of the keyword to which the character belongs.
  • FIG. 4 is a block diagram of an apparatus 800 for keyword fuzzy matching, according to an exemplary embodiment.
  • device 800 can be a mobile phone, a computer, a digital broadcast terminal, a messaging device, a gaming console, a tablet device, a medical device, a fitness device, a personal digital assistant, and the like.
  • apparatus 800 can include one or more of the following components: processing component 802, memory 804, power component 806, multimedia component 808, audio component 810, input/output (I/O) interface 812, sensor component 814, And a communication component 816.
  • Processing component 802 typically controls the overall operation of device 800, such as operations associated with display, telephone calls, data communications, camera operations, and recording operations.
  • Processing component 802 can include one or more processors 820 to execute instructions to perform all or part of the steps of the above described methods.
  • processing component 802 can include one or more modules to facilitate interaction between component 802 and other components.
  • processing component 802 can include a multimedia module to facilitate interaction between multimedia component 808 and processing component 802.
  • Memory 804 is configured to store various types of data to support operation at device 800. Examples of such data include instructions for any application or method operating on device 800, contact data, phone book data, messages, pictures, videos, and the like.
  • the memory 804 can be implemented by any type of volatile or non-volatile storage device, or a combination thereof, such as static random access memory (SRAM), electrically erasable programmable read only memory (EEPROM), erasable programmable read only memory (EPROM), programmable read only memory (PROM), read only memory (ROM), magnetic memory, flash memory, a magnetic disk, or an optical disk.
  • Power component 806 provides power to various components of device 800.
  • Power component 806 can include a power management system, one or more power sources, and other components associated with generating, managing, and distributing power for device 800.
  • the multimedia component 808 includes a screen between the device 800 and the user that provides an output interface.
  • the screen can include a liquid crystal display (LCD) and a touch panel (TP). If the screen includes a touch panel, the screen can be implemented as a touch screen to receive input signals from the user.
  • the touch panel includes one or more touch sensors to sense touches, slides, and gestures on the touch panel. The touch sensor may sense not only the boundary of the touch or sliding action, but also the duration and pressure associated with the touch or slide operation.
  • the multimedia component 808 includes a front camera and/or a rear camera. When the device 800 is in an operation mode, such as a shooting mode or a video mode, the front camera and/or the rear camera can receive external multimedia data. Each front and rear camera can be a fixed optical lens system or have focal length and optical zoom capabilities.
  • the audio component 810 is configured to output and/or input an audio signal.
  • the audio component 810 includes a microphone (MIC) that is configured to receive external audio signals when the device 800 is in an operational mode, such as a call mode, a recording mode, or a voice recognition mode.
  • the received audio signal may be further stored in memory 804 or transmitted via communication component 816.
  • the audio component 810 also includes a speaker for outputting an audio signal.
  • the I/O interface 812 provides an interface between the processing component 802 and the peripheral interface module, which may be a keyboard, a click wheel, a button, or the like. These buttons may include, but are not limited to, a home button, a volume button, a start button, and a lock button.
  • Sensor assembly 814 includes one or more sensors for providing device 800 with a status assessment of various aspects.
  • sensor assembly 814 can detect an open/closed state of device 800 and the relative positioning of components, such as the display and keypad of device 800; sensor assembly 814 can also detect a change in position of device 800 or of one component of device 800, the presence or absence of user contact with device 800, the orientation or acceleration/deceleration of device 800, and temperature changes of device 800.
  • Sensor assembly 814 can include a proximity sensor configured to detect the presence of nearby objects without any physical contact.
  • Sensor assembly 814 may also include a light sensor, such as a CMOS or CCD image sensor, for use in imaging applications.
  • the sensor assembly 814 can also include an acceleration sensor, a gyro sensor, a magnetic sensor, a pressure sensor, or a temperature sensor.
  • Communication component 816 is configured to facilitate wired or wireless communication between device 800 and other devices.
  • the device 800 can access a wireless network based on a communication standard, such as WiFi, 2G or 3G, or a combination thereof.
  • the communication component 816 receives broadcast signals or broadcast associated information from an external broadcast management system via a broadcast channel.
  • the communication component 816 also includes a near field communication (NFC) module to facilitate short range communication.
  • the NFC module can be implemented based on radio frequency identification (RFID) technology, infrared data association (IrDA) technology, ultra-wideband (UWB) technology, Bluetooth (BT) technology, and other technologies.
  • device 800 may be implemented by one or more application specific integrated circuits (ASICs), digital signal processors (DSPs), digital signal processing devices (DSPDs), programmable logic devices (PLDs), field programmable gate arrays (FPGAs), controllers, microcontrollers, microprocessors, or other electronic components for performing the above methods.
  • In an exemplary embodiment, there is also provided a non-transitory computer readable storage medium comprising instructions, such as a memory 804 comprising instructions executable by processor 820 of apparatus 800 to perform the above method.
  • the non-transitory computer readable storage medium may be a ROM, a random access memory (RAM), a CD-ROM, a magnetic tape, a floppy disk, or an optical data storage device.
  • A non-transitory computer readable storage medium: when instructions in the storage medium are executed by a processor of a mobile terminal, the mobile terminal is enabled to perform a method for keyword fuzzy matching, the method comprising: for each character in the text to be matched, acquiring, according to the keyword set, the keyword to which the character belongs and the index bit of the character in that keyword; determining, according to the index bit of the character in the keyword, whether the character is the first character of the keyword; and if the character is the first character of the keyword, recording the keyword to which the character belongs in the matching information set, and marking in the record that the first character of the keyword exists in the text to be matched.
  • FIG. 5 is a schematic structural diagram of a server in an embodiment of the present invention.
  • the server 1900 can vary considerably depending on configuration or performance, and can include one or more central processing units (CPUs) 1922 (e.g., one or more processors), memory 1932, and one or more storage media 1930 (e.g., one or more mass storage devices) storing application programs 1942 or data 1944.
  • the memory 1932 and the storage medium 1930 may be short-term storage or persistent storage.
  • the program stored on storage medium 1930 may include one or more modules (not shown), each of which may include a series of instruction operations in the server.
  • central processor 1922 can be configured to communicate with storage medium 1930 and to execute, on the server 1900, the series of instruction operations in storage medium 1930.
  • the embodiment of the present invention further provides a program, including a readable code, when the readable code is run on a server, causing the server to perform the keyword fuzzy matching method according to any one of the embodiments of the present invention.
  • The embodiment of the present invention further provides a readable medium in which the program described in the embodiment of the present invention is stored.
  • FIG. 6 shows a server that can implement the keyword fuzzy matching method according to the present invention.
  • the server conventionally includes a processor 1610 and a program product or readable medium in the form of a memory 1620.
  • the memory 1620 may be an electronic memory such as a flash memory, an EEPROM (Electrically Erasable Programmable Read Only Memory), an EPROM, or a ROM.
  • Memory 1620 has a memory space 1630 for program code 1631 for performing any of the method steps described above.
  • storage space 1630 for program code may include various program code 1631 for implementing various steps in the above methods, respectively.
  • These program codes can be read from or written to one or more program products.
  • These program products include program code carriers such as memory cards.
  • Such a program product is typically a portable or fixed storage unit as described with reference to FIG.
  • the storage unit may have a storage section, a storage space, and the like arranged similarly to the storage 1620 in the server of FIG.
  • the program code can be compressed, for example, in an appropriate form.
  • the storage unit includes readable code 1631', i.e., code that can be read by, for example, a processor such as 1610, which when executed by the server causes the server to perform various steps in the methods described above.
  • Server 1900 may also include one or more power sources 1926, one or more wired or wireless network interfaces 1950, one or more input and output interfaces 1958, one or more keyboards 1956, and/or one or more operating systems 1941, for example Windows Server™, Mac OS X™, Unix™, Linux™, FreeBSD™, etc.

Abstract

The invention relates to a method and apparatus for fuzzy keyword matching. The method comprises: acquiring, according to a keyword set, the keyword to which each character in a text to be matched belongs and the index bit of the character in that keyword (101); determining the index bit of the character in the keyword to which it belongs (102); if the character is a first character, recording the keyword to which the character belongs, and marking the first character of the keyword as present in the text to be matched (103); and if the character is not the first character and recorded keywords exist, looking up, among the recorded keywords, the keyword to which the character belongs, and marking the character of the keyword as present in the text to be matched (104). When every character of a keyword is marked in the text to be matched, it is determined that the text to be matched hits the keyword. The method and apparatus can, to some extent, overcome the low keyword matching efficiency of the prior art.
PCT/CN2016/104693 2015-12-11 2016-11-04 Method and apparatus for fuzzy keyword matching WO2017097075A1 (fr)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201510921094.0 2015-12-11
CN201510921094.0A CN105550298B (zh) 2015-12-11 2015-12-11 一种关键词模糊匹配的方法及装置

Publications (1)

Publication Number Publication Date
WO2017097075A1 true WO2017097075A1 (fr) 2017-06-15

Family

ID=55829487

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2016/104693 WO2017097075A1 (fr) 2015-12-11 2016-11-04 Method and apparatus for fuzzy keyword matching

Country Status (2)

Country Link
CN (1) CN105550298B (fr)
WO (1) WO2017097075A1 (fr)

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109783607A (zh) * 2018-12-19 2019-05-21 南京莱斯信息技术股份有限公司 一种在任意文本中匹配识别海量关键词的方法
CN109977422A (zh) * 2019-04-18 2019-07-05 中国石油大学(华东) 一种基于分词技术的病历关键信息提取模型
CN110134686A (zh) * 2019-05-07 2019-08-16 浪潮软件集团有限公司 一种中文关键词模糊查询的索引创建方法及系统
CN112052413A (zh) * 2020-08-28 2020-12-08 上海谋乐网络科技有限公司 Url模糊匹配方法、装置和系统
CN115210708A (zh) * 2019-08-07 2022-10-18 齐纳特科技公司 信息跟踪系统的数据条目特征

Families Citing this family (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105550298B (zh) * 2015-12-11 2019-12-10 北京搜狗科技发展有限公司 一种关键词模糊匹配的方法及装置
CN106649427B (zh) * 2016-08-08 2020-07-03 中国移动通信集团湖北有限公司 一种信息识别的方法及装置
CN109635009B (zh) * 2018-12-27 2023-09-15 北京航天智造科技发展有限公司 模糊匹配查询系统
CN110008383B (zh) * 2019-04-11 2021-07-27 北京安护环宇科技有限公司 一种基于多索引的黑白名单检索方法及装置
CN110442570B (zh) * 2019-06-06 2021-08-17 北京左江科技股份有限公司 一种BitMap高速模糊查找方法
CN113420192B (zh) * 2021-06-09 2022-04-05 湖南大学 一种基于模糊匹配的ui元素搜索方法

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20100169341A1 (en) * 2008-12-30 2010-07-01 Ebay Inc. Predictive algorithm for search box auto-complete
CN102323929A (zh) * 2011-08-23 2012-01-18 上海粱江通信技术有限公司 一种实现中文短信模糊匹配关键字的方法
CN103902714A (zh) * 2014-04-03 2014-07-02 北京国双科技有限公司 关键词过滤方法和装置
CN104598464A (zh) * 2013-10-31 2015-05-06 联想(北京)有限公司 一种信息处理方法及电子设备
CN104750673A (zh) * 2013-12-31 2015-07-01 中国移动通信集团公司 文本匹配过滤方法及装置
CN105550298A (zh) * 2015-12-11 2016-05-04 北京搜狗科技发展有限公司 一种关键词模糊匹配的方法及装置

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102932421A (zh) * 2012-09-28 2013-02-13 中国联合网络通信集团有限公司 云备份方法及装置
CN104602206A (zh) * 2014-12-31 2015-05-06 上海大汉三通通信股份有限公司 一种垃圾短信识别方法与系统
CN105205048B (zh) * 2015-10-21 2018-05-04 迪爱斯信息技术股份有限公司 一种热词分析统计系统及方法

Patent Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20100169341A1 (en) * 2008-12-30 2010-07-01 Ebay Inc. Predictive algorithm for search box auto-complete
CN102323929A (zh) * 2011-08-23 2012-01-18 上海粱江通信技术有限公司 一种实现中文短信模糊匹配关键字的方法
CN104598464A (zh) * 2013-10-31 2015-05-06 联想(北京)有限公司 一种信息处理方法及电子设备
CN104750673A (zh) * 2013-12-31 2015-07-01 中国移动通信集团公司 文本匹配过滤方法及装置
CN103902714A (zh) * 2014-04-03 2014-07-02 北京国双科技有限公司 关键词过滤方法和装置
CN105550298A (zh) * 2015-12-11 2016-05-04 北京搜狗科技发展有限公司 一种关键词模糊匹配的方法及装置

Cited By (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
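CN109783607A (en) * 2018-12-19 2019-05-21 南京莱斯信息技术股份有限公司 Method for matching and identifying massive keywords in arbitrary text
CN109977422A (en) * 2019-04-18 2019-07-05 中国石油大学(华东) Medical record key information extraction model based on word segmentation
CN110134686A (en) * 2019-05-07 2019-08-16 浪潮软件集团有限公司 Index creation method and system for fuzzy query of Chinese keywords
CN115210708A (en) * 2019-08-07 2022-10-18 齐纳特科技公司 Data entry feature for information tracking system
CN115210708B (en) * 2019-08-07 2023-09-01 齐纳特科技公司 Method and system for processing text data, and non-transitory computer-readable medium
US11783127B2 (en) 2019-08-07 2023-10-10 Zinatt Technologies, Inc. Data entry feature for information tracking system
CN112052413A (en) * 2020-08-28 2020-12-08 上海谋乐网络科技有限公司 URL fuzzy matching method, apparatus, and system
CN112052413B (en) * 2020-08-28 2024-02-13 上海谋乐网络科技有限公司 URL fuzzy matching method, apparatus, and system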

Also Published As

Publication number Publication date
CN105550298B (zh) 2019-12-10
CN105550298A (zh) 2016-05-04

Similar Documents

Publication Publication Date Title
WO2017097075A1 (en) Method and apparatus for fuzzy keyword matching
US10142351B1 (en) Retrieving contact information based on image recognition searches
CN107851092B (en) Personal entity modeling
CN107102746B (en) Candidate word generation method and apparatus, and apparatus for candidate word generation
CN107357779B (en) Method and apparatus for obtaining an organization name
US10223464B2 (en) Suggesting filters for search on online social networks
CN109522419B (en) Session information completion method and apparatus
JP5744892B2 (en) Text filtering method and system
WO2017157040A1 (en) Search method and device, and device used for searching
KR102138184B1 (en) Use of metadata for summarizing social media content
WO2020221162A1 (en) Application program recommendation method and apparatus, electronic device, and medium
WO2017143930A1 (en) Search result sorting method and associated device
RU2673401C2 (en) Method and device for obtaining a certification document
KR102046582B1 (en) Method and apparatus for providing call records
CN108427761B (en) News event processing method, terminal, server, and storage medium
JP2014531664A (en) Method and apparatus for progressive pattern matching in a mobile environment
WO2017016384A1 (en) Short message processing method, information processing method and device, mobile terminal, and storage medium
EP3387556B1 (en) Automated hashtag suggestions for categorizing communications
WO2017107708A1 (en) Method and device for extracting uniform resource locator prefixes for user agent self-adaptation
CN109783244B (en) Processing method and apparatus, and apparatus for processing
CN110928425A (en) Information monitoring method and apparatus
CN107229698B (en) Information processing method and apparatus
CN109189824B (en) Method and apparatus for retrieving similar articles
CN110020082B (en) Search method and apparatus
CN106959970B (en) Lexicon, lexicon processing method and apparatus, and apparatus for processing a lexicon

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application (Ref document number: 16872267; Country of ref document: EP; Kind code of ref document: A1)
NENP Non-entry into the national phase (Ref country code: DE)
122 Ep: pct application non-entry in european phase (Ref document number: 16872267; Country of ref document: EP; Kind code of ref document: A1)